New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

look up rights metadata for rights statements faster #705

Merged
merged 3 commits into from Jul 19, 2017

Conversation

Projects
None yet
2 participants
@jrochkind
Contributor

jrochkind commented Jul 18, 2017

We were using , from QA gem.

This ended up re-loading YAML from disk on every call. Either the QA code needs caching
it doesn't have, or we're using QA wrong, or both. Rather than figure it out,
we write our own simple wrapper to get the info from the same .yml file,
and make sure it's cached globally. Nothing fancy about this no reason
not to write ourselves.

look up rights metadata for rights statements faster
We were using , from QA gem.

This ended up re-loading YAML from disk on every call. Either the QA code needs caching
it doesn't have, or we're using QA wrong, or both.  Rather than figure it out,
we write our own simple wrapper to get the info from the same .yml file,
and make sure it's cached globally. Nothing fancy about this no reason
not to write ourselves.
@jrochkind

This comment has been minimized.

Show comment
Hide comment
@jrochkind

jrochkind Jul 18, 2017

Contributor

Before PR, we were looking up stuff from .yml file via CC/QA with calls like this:

CurationConcerns::LicenseService.new.authority.find(identifier)

Profiling suggested that there might have been a lot of time spent in YAML.load. Was this reloading YAML from disk all the time instead of using a cached copy?

Looking at stack traces and QA code (we are currently using qa 0.11.1), I first noticed this:

https://github.com/samvera/questioning_authority/blob/v0.11.1/lib/qa/authorities/local/file_based_authority.rb#L28

Indeed there is no caching going on, every time you ask this authority to find or search it calls #terms... which reads and parses the YAML from disk with no caching at all.

At first I considered adding caching here. But adding caching to FileBasedAuthority only works if you are re-using a file-based authority instance. The API we are using started with CurationConcerns::LicenseService.new.authority.find(identifier), which created a new FileBasedAuthority object each time. Verified by putting some logging/byebug in there, indeed FileBasedAuthority#initialize was called many many times per our multi-member show page.

It's possible there is some other API in CC or QA we could use that would re-use a FileBasedAuthority object instead of creating a new one each time. This stuff isn't documented very well, I'm not certain. Even if it were, FileBasedAuthority is still not caching.

I decided QA was just not intended for this use case (or any use case but auto-completing an edit field via an AJAX request), doesn't neccesarily have the API or semantics to support it, and rather than try to figure it out and/or hack it to do so -- reading a YAML file and pulling data out of it is not a complicated thing, it made sense just to spend the 45 minutes writing the couple dozen straightforward lines of code to just read the YAML file and cache it.

Contributor

jrochkind commented Jul 18, 2017

Before PR, we were looking up stuff from .yml file via CC/QA with calls like this:

CurationConcerns::LicenseService.new.authority.find(identifier)

Profiling suggested that there might have been a lot of time spent in YAML.load. Was this reloading YAML from disk all the time instead of using a cached copy?

Looking at stack traces and QA code (we are currently using qa 0.11.1), I first noticed this:

https://github.com/samvera/questioning_authority/blob/v0.11.1/lib/qa/authorities/local/file_based_authority.rb#L28

Indeed there is no caching going on, every time you ask this authority to find or search it calls #terms... which reads and parses the YAML from disk with no caching at all.

At first I considered adding caching here. But adding caching to FileBasedAuthority only works if you are re-using a file-based authority instance. The API we are using started with CurationConcerns::LicenseService.new.authority.find(identifier), which created a new FileBasedAuthority object each time. Verified by putting some logging/byebug in there, indeed FileBasedAuthority#initialize was called many many times per our multi-member show page.

It's possible there is some other API in CC or QA we could use that would re-use a FileBasedAuthority object instead of creating a new one each time. This stuff isn't documented very well, I'm not certain. Even if it were, FileBasedAuthority is still not caching.

I decided QA was just not intended for this use case (or any use case but auto-completing an edit field via an AJAX request), doesn't neccesarily have the API or semantics to support it, and rather than try to figure it out and/or hack it to do so -- reading a YAML file and pulling data out of it is not a complicated thing, it made sense just to spend the 45 minutes writing the couple dozen straightforward lines of code to just read the YAML file and cache it.

@jrochkind

This comment has been minimized.

Show comment
Hide comment
@jrochkind

jrochkind Jul 18, 2017

Contributor

On the mutant riiif branch I was working on, this took my 200-member demo record from around 2.1s to around 1.8s.

Contributor

jrochkind commented Jul 18, 2017

On the mutant riiif branch I was working on, this took my 200-member demo record from around 2.1s to around 1.8s.

@hackmastera

This comment has been minimized.

Show comment
Hide comment
@hackmastera

hackmastera Jul 19, 2017

Contributor

refs #700

Contributor

hackmastera commented Jul 19, 2017

refs #700

@jrochkind jrochkind merged commit 5004835 into master Jul 19, 2017

0 of 2 checks passed

continuous-integration/travis-ci/pr The Travis CI build is in progress
Details
continuous-integration/travis-ci/push The Travis CI build is in progress
Details

@jrochkind jrochkind deleted the rights_statement_lookup_perf branch Jul 19, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment