Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Git backend: Improve heuristic checking if GITLAB_MR_PULL_PATTERN is needed #10397

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

jcfr
Copy link

@jcfr jcfr commented Jun 6, 2023

This commit is an incremental improvement toward having full support for self-hosted GitLab instance.

It ensures that the expected sources are fetched when a build is triggered after a merge request event is processed in the context of a GitLab integration webhook associated with a self-hosted GitLab instance.

Related issues:

@jcfr jcfr requested a review from a team as a code owner June 6, 2023 19:01
@jcfr jcfr requested a review from stsewd June 6, 2023 19:01
@jcfr jcfr force-pushed the 9464-improve-git-backend-provider-detection branch 2 times, most recently from be2922b to f230881 Compare June 6, 2023 19:16
@jcfr
Copy link
Author

jcfr commented Jun 6, 2023

Since design decision are needed to have a complete integration of self-hosted GitLab instances, the scope of this changes is purposefully kept small.

Unclear how to address the remaining ci/circleci: checks error:

readthedocs/vcs_support/backends/git.py
  Line: 8
    pylint: import-error / Unable to import 'git.exc'

…needed

This commit is an incremental improvement toward having full support
for self-hosted GitLab instance.

It ensures that the expected sources are fetched when a build is triggered
after a merge request event is processed in the context of a GitLab integration
webhook associated with a self-hosted GitLab instance.
@jcfr jcfr force-pushed the 9464-improve-git-backend-provider-detection branch from f230881 to 0d4e6cb Compare June 6, 2023 19:41
Copy link
Member

@humitos humitos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have the full context of this PR yet, but I just skimmed it pretty quick and made some questions. It doesn't seem like a lot of code, but I just need to find the time to prioritize it.

On the other hand, it would be great to have some tests cases for this.

Comment on lines +164 to +166
# ImportError: cannot import name 'Project' from partially initialized module
# 'readthedocs.projects.models' (most likely due to a circular import)
_gitlab_webhook = "gitlab_webhook" # Integration.GITLAB_WEBHOOK
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's a circular import issue, we could import this constant from inside this static method to solve this problem.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I will update accordingly

Comment on lines +168 to +171
if (
project.integrations.count() == 1
and project.integrations.first().integration_type == _gitlab_webhook
):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure to understand the context behind checking for .count() == 1 here? Isn't there another way to be 100% sure it was triggered by a merge request? Can't we check the body of the original request to decide this or similar?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have access to this webhook data during the build process?

The current behavior will guess the provider type by pattern matching the repo URL with supported oauth integrations.

It does seem a bit strange to have to look up which webhook integrations exist in order to resolve the Git provider's type in each and every build. But I'm not sure if there's currently much else available for the build process.

It could seem like a good approach to store the type of Git provider either when the webhook is added or when it's called. Or have a manually configured Project setting.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review 🙏

To provide some background, the current approach was added to minimize the amount of changes, avoid breaking existing workflow and slightly improve the support for self-hosted GitLab instance.

What is needed ?

Add support for "triggering context"

To properly support this, we would probably need to support querying details about the triggering context.

This information would then then be used to:

  • customize constants like GITLAB_URL, GITLAB_COMMIT_URL, GITLAB_MERGE_REQUEST_COMMIT_URL and GITLAB_MERGE_REQUEST_URL currently set in projects/constants.py 1
  • query the source of the trigger in vcs_support/backends/git.py 2

Support for updating build status

To support updating status on instance different from the official GitLab.com and GitHub.com, I think we following could be done:

  • the Integration model could be updated to have an additional field called api_token3 to be used in in oauth.services.base.(create_session|get_session) 4 itself called from oauth.services.gitlab.send_build_status 5
  • the integration details UI could then be updated to include a new "API Token" field

Footnotes

  1. https://github.com/readthedocs/readthedocs.org/blob/9.13.2/readthedocs/projects/constants.py#L368-L383

  2. https://github.com/readthedocs/readthedocs.org/blob/9.13.2/readthedocs/vcs_support/backends/git.py#L153-L172

  3. https://github.com/readthedocs/readthedocs.org/blob/9.13.2/readthedocs/integrations/models.py#L264-L327

  4. https://github.com/readthedocs/readthedocs.org/blob/9.13.2/readthedocs/oauth/services/base.py#L77-L116

  5. https://github.com/readthedocs/readthedocs.org/blob/9.13.2/readthedocs/oauth/services/gitlab.py#L522-L587

Copy link
Author

@jcfr jcfr Jun 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I 'm not sure to understand the context behind checking for .count() == 1 here?

Since it seems there is no way to have details about the triggering context1, the proposed patch is done by assuming that:

  • if there is only one integration
  • and it the integration version type is EXTERNAL
  • and if the integration type is Integration.GITLAB_WEBHOOK
  • and if it is NOT coming from the gitlab.com provider

In that specific case, it is then "safe" to consider it was triggered by a merge request coming from a GitLab hosted instance.

Hence the function _is_trigger_integration_gitlab in this PR

Isn't there another way to be 100% sure it was triggered by a merge request?

Given the current implementation, I didn't find a way.

Can't we check the body of the original request to decide this or similar?

This could indeed be done where the webhook is handled2 by looking for X-Gitlab-Instance (e.g X-Gitlab-Instance: https://gitlab.kitware.com).

Possible patch to readthedocs/api/v2/views/integrations.py for adding get_external_provider_url()
diff --git a/readthedocs/api/v2/views/integrations.py b/readthedocs/api/v2/views/integrations.py
index 3ff940dd9..c610b7e71 100644
--- a/readthedocs/api/v2/views/integrations.py
+++ b/readthedocs/api/v2/views/integrations.py
@@ -50,6 +50,7 @@ GITLAB_MERGE_REQUEST_MERGE = "merge"
 GITLAB_MERGE_REQUEST_OPEN = "open"
 GITLAB_MERGE_REQUEST_REOPEN = "reopen"
 GITLAB_MERGE_REQUEST_UPDATE = "update"
+GITLAB_INSTANCE_HEADER= "HTTP_X_GITLAB_INSTANCE"
 GITLAB_TOKEN_HEADER = "HTTP_X_GITLAB_TOKEN"
 GITLAB_PUSH = "push"
 GITLAB_NULL_HASH = "0" * 40
@@ -138,6 +139,10 @@ class WebhookMixin:
         """Handle webhook payload."""
         raise NotImplementedError
 
+    def get_external_provider_url(self):
+        """Get External provider URL from payload header."""
+        raise NotImplementedError
+
     def get_external_version_data(self):
         """Get External Version data from payload."""
         raise NotImplementedError
@@ -237,6 +242,7 @@ class WebhookMixin:
         :param project: Project instance
         :type project: readthedocs.projects.models.Project
         """
+        provider_url = self.get_external_provider_url()
         version_data = self.get_external_version_data()
         # create or get external version object using `verbose_name`.
         external_version = get_or_create_external_version(
@@ -569,6 +575,10 @@ class GitLabWebhookView(WebhookMixin, APIView):
             return False
         return token == secret
 
+    def get_external_provider_url(self):
+        """Get External provider URL from payload header."""
+        return self.request.META.get(GITLAB_INSTANCE_HEADER, "https://gitlab.com")
+
     def get_external_version_data(self):
         """Get commit SHA and merge request number from payload."""
         try:

What is not clear is how to pass the all the way down to the git backend. Hence the discussion above to support passing down the triggering context details.

Footnotes

  1. That information does not seem to be associated with any of the objects currently available from vcs_support.backends.git.fetch() (see https://github.com/readthedocs/readthedocs.org/blob/9.13.2/readthedocs/vcs_support/base.py#L57-L75 and https://github.com/readthedocs/readthedocs.org/blob/9.13.2/readthedocs/vcs_support/backends/git.py)

  2. https://github.com/readthedocs/readthedocs.org/blob/9.13.2/readthedocs/api/v2/views/integrations.py.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the other hand, it would be great to have some tests cases for this.

Ditto. Not being familiar with the code base, I would appreciate some guidance (e.g similar to test to use an example)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants