Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

1620226: Perform CRL generation without exceeding memory and maxing CPU #2204

Merged
merged 1 commit into from Jan 15, 2019

Conversation

mstead
Copy link
Contributor

@mstead mstead commented Jan 11, 2019

Using a HashSet instead of an ArrayList to store the serials improves
the performance of .contains when checking if an entry in the CRL file
should be removed due to expiry.

Batch lookup/process fetched serials to keep the memory footprint down
and to allow garbage collection of the serials earlier. The number of
records per batch can be configured via the candlepin config file.

NOTE: Due to the way the CRL generation framework works, when batching
the serial processing, we end up re-scanning the crl num_serials/batch_size times.
This isn't ideal, but we will make due until we make the move to generating
a plain text serial file.

Testing

  1. Deploy candlepin from the master branch.
  2. Stop tomcat, drop the candlepin DB and replace it with one loaded with millions of cert serials:
$ wget http://file.rdu.redhat.com/mstead/pr/1620226/db-with-lots-of-serials-for-crl-testing.tar.gz
$ sudo systemctl stop tomcat
$ tar xvzf db-with-lots-of-serials-for-crl-testing.tar.gz
$ dropdb -U candlepin candlepin && createdb -U candlepin candlepin && psql -U candlepin candlepin < lots-of-serials-for-crl-gen-testing.sql
  1. Restart tomcat and manually run the CRL job. It won't take long before OOM exception is thrown.
$ sudo systemctl start tomcat
$ curl -k -u admin:admin  -X POST https://localhost:8443/candlepin/jobs/schedule/CertificateRevocationListTask
  1. Set up logging so that you can track the progress of the job.
log4j.logger.org.candlepin.util.CrlFileUtil=DEBUG
  1. Deploy candlepin from this PR's branch against the same database.
$ bin/deploy
  1. In a separate terminal, tail the candlepin log so that you can see what's going on.
# Generally I tail the log just to get the job id, then use grep to narrow to just that job.
tail -f /var/log/candlepin/candlepin.log
# once the job is running, re-tail and use grep.
tail -f /var/log/candlepin/candlepin.log | grep <YOUR_JOB_ID>
  1. In a separate terminal, use 'top' to monitor tomcat's cpu and memory usage.
top | grep tomcat
  1. Run the CRL job again. It will take about 1h - 1h20m to fully complete. Memory usage shouldn't rise more than a couple Gig and CPU should only go to 100% in bursts.
curl -k -u admin:admin  -X POST https://localhost:8443/candlepin/jobs/schedule/CertificateRevocationListTask

@mstead
Copy link
Contributor Author

mstead commented Jan 14, 2019

retest this please

*/
public CandlepinQuery<Long> getUncollectedRevokedCertSerials() {
DetachedCriteria criteria = DetachedCriteria.forClass(CertificateSerial.class)
.add(Restrictions.gt("expiration", getExpiryRestriction()))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

public boolean syncCRLWithDB(File file) throws IOException {
List<Long> uncollected = this.certificateSerialCurator.getUncollectedRevokedCertSerials().list();
List<BigInteger> revoke = new ArrayList<>(uncollected.size());
public int batchSyncCRLWithDB(File crlFile, int batchSize) throws IOException {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given how we use the batchSize, is it something we want to do once here rather than implicitly forcing the configured value to be looked up by the two callers?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm Ok with either. I was just trying to avoid injecting the candlepin config into the curator. No big deal if you'd like to see this changed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Up to you, given discussions around this class.

For what it's worth, I see the config lookup being done here internally to the CrlFileUtil rather than the curator. But if the curator itself ended up needing it, it's already injected into every curator through the AbstractHibernatCurator; so we're good there.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair points, I'll move it into the CrlFileUtil.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DONE. Once you give the changes a once over I'll squash the update commit before it is merged.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes look good

@Ceiu
Copy link
Contributor

Ceiu commented Jan 14, 2019

This is good to be merged, but I want to give a chance to respond to the comments above regarding the location of the config reading.

@Ceiu
Copy link
Contributor

Ceiu commented Jan 15, 2019

Changes look good. Squash away when you have time.

Using a HashSet instead of an ArrayList to store the serials improves
the performance of .contains when checking if an entry in the CRL file
should be removed due to expiry.

Batch lookup/process fetched serials to keep the memory footprint down
and to allow garbage collection of the serials earlier. The number of
records per batch can be configured via the candlepin config file.

NOTE: Due to the way the CRL generation framework works, when batching
the serial processing, we end up re-scanning the crl num_serials/batch_size times.
This isn't ideal, but we will make due until we make the move to generating
a plain text serial file.
@mstead
Copy link
Contributor Author

mstead commented Jan 15, 2019

Changes look good. Squash away when you have time.

DONE

@Ceiu Ceiu merged commit b2bbbc1 into master Jan 15, 2019
@Ceiu Ceiu deleted the mstead/M/CRL-performance-fix branch January 15, 2019 16:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants