New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

githubapi: Optimize List performance when there are many notifications. #4

Merged
merged 2 commits into from Nov 12, 2018

Conversation

Projects
None yet
1 participant
@dmitshur
Member

dmitshur commented Nov 11, 2018

The List method of the githubapi implementation can become excessively slow when there are many GitHub notifications to fetch.

This is because the current implementation is far from optimal: it lists all notifications via a single GitHub REST API v3 call, then for each notification, it makes at least one API call to fetch more detailed information about that notification. It does all this completely sequentially. As a result, it can take over 15 seconds when there are 150~ GitHub notifications.

I've been putting up with it, because I've been waiting for GitHub GraphQL API v4 to add support for notifications. Then, the entire List endpoint could become a single GraphQL query, avoiding all this inefficiency.

However, on Friday, @heschik came over and asked me to check my GitHub notifications for something noteworthy. Embarrassingly, it took 15~ seconds to do so, ruining the time-sensitive humorous moment.

So, this PR optimizes performance of the List method to prevent that from happening again. :) It does so by making two changes, separated by logical commits:

  1. Batch all individual GraphQL queries into a single one. This was blocked on resolving shurcooL/githubv4#17. Luckily, I was able to make good progress on it as described in shurcooL/githubv4#17 (comment).

  2. Cache notifications. Since using GraphQL precludes from using caching at the HTTP layer, it's very helpful for successive List calls to do it at the application level.

After these two changes, the List time for 145~ notifications went from 15 seconds to 3 seconds on first call, and 450~ ms on immediately following calls.

P.S. I should not have 145+ unread notifications, but that's another topic. I mostly kept them around to be able to test these changes.

dmitshur added some commits Nov 11, 2018

githubapi: Batch GraphQL queries in List.
One of the advantages of using GraphQL is that it's possible to fetch
all the required information in a single query. We were making many
GraphQL queries, one for each notification. This can become very slow
and inefficient when there are many notifications.

Ideally, the entire List endpoint should be implemented with a single
GraphQL query. However, it's not possible because GitHub GraphQL API v4
still doesn't offer access to notifications the way GitHub API v3 does.

So, we do the best we can for now, and batch all GraphQL queries into
a single query. Use top-level aliases to combine multiple queries into
one. Use reflect.StructOf to construct the query struct type at runtime.
This is functional, although perhaps there are opportunities to make it
more user friendly in the graphql/githubv4 libraries. That will be
investigated in the future.

The performance of List endpoint when listing 145 GitHub notifications
improves from 15~ seconds to 3~ seconds after this change.

Updates shurcooL/githubv4#17.
githubapi: Cache notifications in List.
Since there's no caching done at HTTP layer when using GraphQL, perform
some application-level caching instead. When List endpoint is called
in succession, most of the notifications are likely to remain unchanged,
and therefore benefit from being cached.

After this change, List execution time goes from 3.0~ seconds the first
time (uncached) to 450~ ms the second and successive times (cached).

@dmitshur dmitshur merged commit bcc2b30 into master Nov 12, 2018

2 checks passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
continuous-integration/travis-ci/push The Travis CI build passed
Details

@dmitshur dmitshur deleted the optimize-githubapi branch Nov 12, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment