Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Creative Commons API call in-request causes slow downs #195

Closed
bracken opened this issue Aug 1, 2015 · 11 comments
Closed

Creative Commons API call in-request causes slow downs #195

bracken opened this issue Aug 1, 2015 · 11 comments
Assignees

Comments

@bracken
Copy link
Contributor

bracken commented Aug 1, 2015

I didn't dive deep enough into this to solve it for real, so I'm not submitting a pull request, sorry.

But, for some reason every page load of our books would end up hitting the code to call Creative Commons and figure out a license thing. The call often takes 300-600ms, but can get up to 1500ms sometimes. Either way, that's not overhead we want in-request. We already place license stuff on the bottom of pages with a custom plugin and so this isn't needed.

If it's common for books to need this api called often it should probably be moved to a background process or something.

It's likely that we're setting some property incorrectly or are doing something dumb in another plugin, so this might not be as much of a problem for others as it is for us. Sorry I didn't take the time to understand it all completely.

Anyway, I just disabled the call in our code and will maintain that when we upgrade as needed: lumenlearning/candela@e6821b4

Not ideal, but it was hurting us.

Thanks!

@greatislander
Copy link
Contributor

@bdolor What are your thoughts on this?

@greatislander greatislander added this to the Q3 2015 milestone Aug 3, 2015
@greatislander greatislander self-assigned this Aug 3, 2015
@bdolor
Copy link
Contributor

bdolor commented Aug 4, 2015

Generally speaking, I see the benefits APIs bring to the table — the internets would be a lonely place without them. This particular API call brings some translation capabilities, page author/license overrides and machine readable code to the license declaration. The biggest gain is having the value of the license end up in the export routines which is something no other plugin could provide. The license value is also integrated into the PressBooks API which has also found it's way into the 'Search and Import' feature of PressBooks Textbook. Indeed, moving away from using the built-in functionality divorces us from a lot of things the academic world needs — attribution is a fundamental aspect of each of the flavours of Creative Commons licenses, I needed to build an automated way to maintain attribution across multiple derivatives of a book, or book parts. I would advise using the built-in license generator for these reasons. No other plugin will do this for you.

The millisecond performance concern described has been mitigated against by ensuring transients are used. A database call will be made to return the license value if it has not been changed within the last 24 hours. Of course, transients are further '...sped up by caching plugins' . Further, if the built-in creative commons functionality is not enabled, it will not make API calls.

  1. Performance concerns can be avoided entirely by not enabling the 'display the copyright license' in Theme Options.
  2. If displaying the license is enabled, performance is improved by relying on transients, and further sped up by memcache plugins.
  3. If performance concerns overrides any benefits that enhanced functionality may bring, then you don't like the internet ;-)

We have a number of large books in our collection as well, and have not felt the performance burden described above.

@bracken
Copy link
Contributor Author

bracken commented Aug 4, 2015

Disabling the theme option is great for my needs. Much better than my terrible hack. Thanks.

This particular one isn't caused by large books. And yes, caching/transients would help mitigate consistently calling this API.

I think that it's great to fetch licensing data from this API, it can just be potentially problematic if you do it during a normal user request. That means that we're relying on that other server to be running in order to serve our page. Even if the call only happens once an hour or day because of caching, if that server is down when we make the call then the user's request in our system will wait for that to time out. And then it still won't be cached again because there was no response. So the next request will also try to hit the API. So we'll be slow until the other server is back up, or responding quicker again. That's why some kind of background task would be more ideal.

@bdolor
Copy link
Contributor

bdolor commented Aug 4, 2015

I understand where you're coming from. I don't disagree with arguments that vouch for improvements to the robustness of a feature should services not be reliable for a period of time.

If the solution is to be a background process the implementation details of that process is key, however. If you don't serve the API call at page load, then the problem to solve is what inexpensive iterative process could be run on a multi-site instance with hundreds of books, each with thousands of pages? At that scale, iterative processes I can think of come with a performance cost (affecting page load). When would you run it? What would the trigger be? I think these are interesting problems to solve. I know you are not submitting a PR with this, but I am curious what other solutions are out there.

@bracken
Copy link
Contributor Author

bracken commented Aug 4, 2015

I definitely haven't put enough thought or research into this to answer that confidently. And since it appears this isn't a practical problem for most, just a potential one, it's probably not worth solving. :)

That said, I would imagine a solution something like this:

  • Put that transient value into a real meta field, with an expiration timestamp. (basically a self-managed transient value)
  • in pressbooks_copyright_license if you passed the expiration timestamp, queue up the api call with WP cron background task to get new info, but serve up what you have already even though it's stale.
  • the background task does the api call, updates the meta field and expiration timestamp.

You could also set up a trigger at page save time to check if it changed and queue up that same background task.

@nciske
Copy link
Contributor

nciske commented Aug 4, 2015

Sounds like TLC Transients would solve this gracefully:
https://github.com/markjaquith/WP-TLC-Transients

@bracken
Copy link
Contributor Author

bracken commented Aug 4, 2015

yeah, that looks great.

@greatislander
Copy link
Contributor

Stepping back a bit, I'm wondering whether we can't make the API call when the license is selected on the Book Information page and just store the result then? Does the Creative Commons license language change that frequently? I could see only two cases where new licensing language would need to be fetched:

  1. The user changes the license on the Book Info page, which would trigger another API call and save the updated license text.
  2. The book's language is changed, also on the Book Info page—and the API call could then be triggered again to fetch the appropriately localized license.

Am I missing the way this is supposed to work entirely, or could this be a more straightforward way of cutting out a number of these API calls? Thoughts, @bdolor? (FYI @hughmcguire.)

(Tangentially, this also reminds me of the issue we ran into when the CC API was down entirely, which my proposed solution could solve—if it doesn't cause other problems of which I'm unaware.)

Thanks all!

@bdolor
Copy link
Contributor

bdolor commented Aug 11, 2015

There are two use cases:

  1. the book license
  2. the page/chapter license

The conditions that would trigger new license language for a book are:

  1. change of license on Book Info page <-- you mention this
  2. change of language on Book Info page <-- also mentioned
  3. change of title of the Book on Book Info page
  4. change of the copyright holder field on Book Info page
  5. change of the site/blog url/permalink

The conditions that would trigger new license language on a page/chapter are:

  1. change of page/chapter license
  2. change of page/chapter author(s)
  3. change of page/chapter title
  4. change of page/chapter url/permalink
  5. change of book language

The problem that we're trying to solve, and is addressed to a certain degree already is what to do if the service is down. That's a good problem to solve and right now the results of the API are stored locally for a period of time to reduce the risk of that dependency. Decreasing the frequency of those calls by increasing the amount of time that the license attribution is stored or moving where it is stored locally, still won't solve much if the API was down. The undesirable scenario to avoid is if the API call was made at the time the service was down and then that empty return value was stored for a longer period of time than it takes for the API to come back up.

Services being down poses a bigger problem. If we cut ourselves off from the service we don't derive any benefit from it. If we rely on the service without storing anything from it, we create a dependency that is vulnerable to interruption.

For me the balance was in asking myself how long would it be tolerable to have incorrect license information on the book/page before it was automatically corrected and how long would I give a web service time to get its act together. The answer I came up with was 24 hours. The unfortunate thing is that we experienced an outage longer than 24 hours, however, I am under the impression that particular outage was the exception and not the rule.

Over to you.

@greatislander greatislander modified the milestone: Q3 2015 Sep 9, 2015
@greatislander greatislander mentioned this issue Jun 20, 2017
4 tasks
@greatislander
Copy link
Contributor

Closing; this will be revisited in #805.

@greatislander
Copy link
Contributor

@bracken @bdolor So, @connerbw noticed that the logic in this was never actually falling back on the transient and so every page load was hitting the Creative Commons API. We fixed it. pressbooks/pressbooks-book#20

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants