Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimization on sync_versions to use ls-remote on Git VCS #6930

Merged
merged 4 commits into from Apr 30, 2020

Conversation

humitos
Copy link
Member

@humitos humitos commented Apr 20, 2020

When a webhook is received we synchronize the branches/tags to create new
versions under Read the Docs. For this process we do a full clone of the
repository and iterate over the branches/tags.

As this process of cloning the repository just for updating versions could be
expensive, this commit adds the ability to use ls-remote on Git VCS just to
query the repository for its branches/tags without performing a full clone.

Closes #6508

When a webhook is received we synchronize the branches/tags to create new
versions under Read the Docs. For this process we do a full clone of the
repository and iterate over the branches/tags.

As this process of cloning the repository just for updating versions could be
expensive, this commit adds the ability to use `ls-remote` on Git VCS just to
query the repository for its branches/tags without performing a full clone.
@humitos humitos requested a review from Apr 20, 2020
if tag.endswith('^{}'):
# skip annotated tags since they are duplicated
continue
Copy link
Member Author

@humitos humitos Apr 20, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This formatting is weird, but it is git's default.

$ git ls-remote https://github.com/readthedocs/test-builds | grep annotated
886108064d63dde44c14f657c994c63233e9c3ab	refs/tags/annotated-tag
d39922ff32b8bc5c0182e7dd976fd921dba0d873	refs/tags/annotated-tag^{}

^ is a reserved character and branches/tags can't use it. See https://git-scm.com/docs/git-check-ref-format

Copy link
Member

@ericholscher ericholscher left a comment

I'd be 👍 on this if we can ship it behind a feature flag to start, that way we can work out the kinks on our own projects.

humitos added 3 commits Apr 28, 2020
When SyncRepositoryTaskStep calls `sync_repo` we decide if we need a full clone
of just listing the remote branches/tags to sync versions between Read the Docs
and the repository.
@humitos
Copy link
Member Author

humitos commented Apr 28, 2020

@ericholscher I added a feature flag for this and the code is working as I expected.

I created a new method update_versions_from_repository because I can't modify SyncRepository.sync_repo since it's shared with UpdateDocsTaskStep and we need to do the full clone always there.

This new method, will decide to follow the normal sync_repo flow or call directly sync_versions when the feature is enabled and the VCS backend supports it.

Copy link
Member

@ericholscher ericholscher left a comment

Seems worth testing. I'm worried it will likely break, since we've added another way of parsing git tags/branches, but this is definitely a much better solution.

@ericholscher
Copy link
Member

ericholscher commented Apr 29, 2020

We should probably have a set of tests that hits a real repository and confirms we parse them properly, or something.

@humitos
Copy link
Member Author

humitos commented Apr 29, 2020

@ericholscher this test uses a real git repository https://github.com/readthedocs/readthedocs.org/pull/6930/files#diff-da4bc8cdc4f46ca0050b3c974ebf2ca0R47 (created under a tmp directory by make_test_git function). I can add more tests with any other specific case you may be thinking about.

@stsewd
Copy link
Member

stsewd commented Apr 29, 2020

@humitos annotated tags should be one of them

@humitos
Copy link
Member Author

humitos commented Apr 29, 2020

@humitos
Copy link
Member Author

humitos commented Apr 30, 2020

Merging this so we can test it in the next release with some of our projects.

@humitos humitos merged commit 73b846f into master Apr 30, 2020
2 checks passed
@humitos humitos deleted the humitos/sync-lsremote branch Apr 30, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants