Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for artifacts uploaded by actions/upload-artifact@v4 #1791

Merged
merged 10 commits into from
Mar 9, 2024

Conversation

gsmet
Copy link
Contributor

@gsmet gsmet commented Feb 13, 2024

The artifacts upload by actions/upload-artifact@v4 are hosted on a new infrastructure which has several constraints:

  • we will have an error if we push the Authorization header to it, which was the case when using the Java 11 HttpClient (and is considered a bad practice so it is good to have fixed it anyway)
  • the host name is dynamic so our test infrastructure was having problems with proxying the request - I was able to make it work with some Wiremock magic (damn, it was hard!)

All these problems are sorted out by this pull request and we are now testing an artifact uploaded by v3 and the other uploaded by v4.

Fixes #1790

Note that in the end the problem was only present for the Java 11+ HttpClient as both the Okhttp client and the UrlConnection client actually cleans up the Authorization header when redirected to another server.
Personally, I think it's border line a CVE in the JDK but I wouldn't hold my breath in getting it fixed so I think it's worth fixing it ourselves.

Before submitting a PR:

  • Changes must not break binary backwards compatibility. If you are unclear on how to make the change you think is needed while maintaining backward compatibility, CONTRIBUTING.md for details.
  • Add JavaDocs and other comments explaining the behavior.
  • When adding or updating methods that fetch entities, add @link JavaDoc entries to the relevant documentation on https://docs.github.com/en/rest .
  • Add tests that cover any added or changed code. This generally requires capturing snapshot test data. See CONTRIBUTING.md for details.
  • Run mvn -D enable-ci clean install site locally. If this command doesn't succeed, your change will not pass CI.
  • Push your changes to a branch other than main. You will create your PR from that branch.

When creating a PR:

  • Fill in the "Description" above with clear summary of the changes. This includes:
    • If this PR fixes one or more issues, include "Fixes #" lines for each issue.
    • Provide links to relevant documentation on https://docs.github.com/en/rest where possible. If not including links, explain why not.
  • All lines of new code should be covered by tests as reported by code coverage. Any lines that are not covered must have PR comments explaining why they cannot be covered. For example, "Reaching this particular exception is hard and is not a particular common scenario."
  • Enable "Allow edits from maintainers".

The artifacts upload by actions/upload-artifact@v4 are hosted on a new
infrastructure which has several constraints:
- we will have an error if we push the Authorization header to it, which
  was the case when using the Java 11 HttpClient (and is considered a
  bad practice so it is good to have fixed it anyway)
- the host name is dynamic so our test infrastructure was having
  problems with proxying the request

All these problems are sorted out by this pull request and we are now
testing an artifact uploaded by v3 and one uploaded by v4.

Fixes hub4j#1790
Copy link

codecov bot commented Feb 13, 2024

Codecov Report

Attention: Patch coverage is 73.68421% with 10 lines in your changes are missing coverage. Please review.

Project coverage is 80.66%. Comparing base (be00e51) to head (5557e97).
Report is 21 commits behind head on main.

❗ Current head 5557e97 differs from pull request most recent head 07acfde. Consider uploading reports for the commit 07acfde to get more accurate results

Files Patch % Lines
src/main/java/org/kohsuke/github/GitHubClient.java 72.22% 5 Missing and 5 partials ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #1791      +/-   ##
============================================
+ Coverage     80.64%   80.66%   +0.02%     
- Complexity     2322     2338      +16     
============================================
  Files           219      220       +1     
  Lines          7027     7077      +50     
  Branches        371      377       +6     
============================================
+ Hits           5667     5709      +42     
- Misses         1128     1132       +4     
- Partials        232      236       +4     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@gsmet
Copy link
Contributor Author

gsmet commented Feb 13, 2024

I added a commit to for Git longpaths support on CI as the CI failures were due to paths that were too long.

@gsmet
Copy link
Contributor Author

gsmet commented Feb 13, 2024

CI is all green, now, ready to get merged!

@bitwiseman
Copy link
Member

@gsmet

I added a commit to for Git longpaths support on CI as the CI failures were due to paths that were too long.

Oof. Hm. I didn't know about core.longpaths true and did a bunch of work recently to keep the recorded file names shorter. See f0a3695 .

I'd rather not require the long file name flag. If we go that route, we also need to add instructions to the CONTRIBUTING.md explaining what windows user need to do on their systems.

    // old
    "bodyFileName": "u72ug1ib1zbctek798hyrdyou28rbk6ssrokf37zxrpgubk95i__apis_pipelines_1_runs_120_signedartifactscontent-1.txt",
    // new
    "bodyFileName": "u72ug1ib1zbctek798hyrdyou28rbk6ssrokf37zxrpgubk95i__apis_pipelines_1_runs_75_signedartifactscontent-446b2495-1bf6-4d5e-acde-daff02e51161.txt",

That string on the front is ridiculous and then the file name GUID isn't shortened by the Wiremock rule. This is a bug in the test framework. I'd rather fix it that require an additional setting for Windows.

For this PR (for expediency), would you be willing to revert the long file name commit and manually edit the file names to shorten them? Then file an issue, so I can update the Wiremock rule further to shorten those file names that are close to the limit?

return new HttpClientGitHubConnectorResponse(connectorRequest, httpResponse);
} catch (InterruptedException e) {
throw (InterruptedIOException) new InterruptedIOException(e.getMessage()).initCause(e);
}
}

private HttpResponse<InputStream> send(HttpRequest request, int redirects)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I see what you're trying to do here. Looks good!

But I'd like moved to GitHubClient.detectRedirect.

From what you described, authorization stripping is the behavior GitHub expects. That means, regardless of what GitHubConnector implementation is used, when the connector returns a redirect response this project must strip the Authorization header as appropriate and then follow the redirect. So, this functionality should go in GitHubClient instead of HttpClientGitHubConnector.

On the plus side, GitHubClient already has functionality for retries.

connectorResponse = connector.send(connectorRequest);
logResponse(connectorResponse);
noteRateLimit(request.rateLimitTarget(), connectorResponse);
detectKnownErrors(connectorResponse, request, handler != null);
logResponseBody(connectorResponse);
return createResponse(connectorResponse, handler);
} catch (RetryRequestException e) {
// retry requested by requested by error handler (rate limit handler for example)
if (retries > 0 && e.connectorRequest != null) {
connectorRequest = e.connectorRequest;
}

A retry exception can be thrown during error detection and the updated request contained in the exception will be used for the next attempt. We use this for token refresh, now we can use it for redirects. And we already detect redirects.

private void detectExpiredToken(GitHubConnectorResponse connectorResponse, GitHubRequest request)
throws IOException {
if (connectorResponse.statusCode() != HTTP_UNAUTHORIZED) {
return;
}
String originalAuthorization = connectorResponse.request().header("Authorization");
if (Objects.isNull(originalAuthorization) || originalAuthorization.isEmpty()) {
return;
}
GitHubConnectorRequest updatedRequest = prepareConnectorRequest(request);
String updatedAuthorization = updatedRequest.header("Authorization");
if (!originalAuthorization.equals(updatedAuthorization)) {
throw new RetryRequestException(updatedRequest);
}
}
private void detectRedirect(GitHubConnectorResponse connectorResponse) throws IOException {
if (connectorResponse.statusCode() == HTTP_MOVED_PERM || connectorResponse.statusCode() == HTTP_MOVED_TEMP) {
// GitHubClient depends on GitHubConnector implementations to follow any redirects automatically
// If this is not done and a redirect is requested, throw in order to maintain security and consistency
throw new HttpException(
"GitHubConnnector did not automatically follow redirect.\n"
+ "Change your http client configuration to automatically follow redirects as appropriate.",
connectorResponse.statusCode(),
"Redirect",
connectorResponse.request().url().toString());
}
}

Use detectExpiredToken as the template and replace detectRedirect with the logic you've implemented here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dunno both if it's worth it and if it's safe enough.
As for the former, as I said, I did my homework and all the other clients are having this behavior of stripping the Authorization header. So sure, we could add another client that wouldn't have this behavior at some point in the future but I also think it's probably better to let the client handle the redirects if it can do it properly.

I also wanted to be very safe and carefully implement the rule of the Java HttpClient to make sure I wouldn't break anything.

Anyway, maybe you're right, I'm just not sure it's a decision that is so obvious.

Comment on lines +52 to +57
private final static Pattern ACTIONS_USER_CONTENT_PATTERN = Pattern
.compile("https://pipelines[a-z0-9]*\\.actions\\.githubusercontent\\.com", Pattern.CASE_INSENSITIVE);
private final static Pattern BLOB_CORE_WINDOWS_PATTERN = Pattern
.compile("https://([a-z0-9]*\\.blob\\.core\\.windows\\.net)", Pattern.CASE_INSENSITIVE);
private final static String ORIGINAL_HOST = "originalHost";

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry this is such a mess. Well done updating it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I struggled a lot with it. Note that I don't think it's a limitation of what you did, I was surprised I didn't have some sort of context attached to requests by Wiremock so that I could add the host transparently there.
If at some point, they add a signature to check the URL is not tainted, we will be screwed :/

@@ -235,6 +264,8 @@ protected void after() {
recordSnapshot(this.codeloadServer(), "https://codeload.github.com", true);

recordSnapshot(this.actionsUserContentServer(), "https://pipelines.actions.githubusercontent.com", true);

recordSnapshot(this.blobCoreWindowsNetServer(), "https://productionresults.blob.core.windows.net", true);
}

private void recordSnapshot(WireMockServer server, String target, boolean isRawServer) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you look at the recordSnapshot, we add headers to the recording that are needed by tests.

To test this feature, we need to verify that the Authorization header is present on the regular requests and that it is stripped from the redirected request.

To do that I think we need to an option to use/record/verify a placeholder Authorization header as part of a test. You've done a ton of work here already. If you'd like, I can do make this update.

Copy link
Member

@bitwiseman bitwiseman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great work! I'm happy to pitch in if you want a hand with the remaining changes.

@gsmet
Copy link
Contributor Author

gsmet commented Feb 16, 2024

Hey @bitwiseman , thanks for having a look. TBH, I spent too much time on it already as I spent quite a lot of time researching the issue and then experiment to find a solution (I initially hoped I could tweak the Java HttpClient behavior without implementing the redirect logic myself, to no avail...). So if you have some cycles I would appreciate you having a look.

I'm around if you have any questions.

@bitwiseman
Copy link
Member

@gsmet
I re-wrote the test resource renaming functionality (again) to be way more robust. Ran the new behavior on tests that showed up with the longest files paths.

Code coverage showed the authorization removal functionality was not being triggered during local testing due to all the wiremock servers running on various ports of localhost. So now host and port must be the same in the redirect to be considered the same host.

Still don't have a test for authorization removal, but I might file that as an issue for a later PR.

@gastaldi
Copy link
Contributor

gastaldi commented Mar 6, 2024

The changes look good. Is this now good to be merged?

@gsmet
Copy link
Contributor Author

gsmet commented Mar 6, 2024

@bitwiseman can't say I'm fully convinced your version is better but if you prefer this version, go for it :).

I suppose you have tested locally that the test I added still works fine? I see the files renamed but the content wasn't updated thus why I ask.

@gsmet
Copy link
Contributor Author

gsmet commented Mar 7, 2024

@bitwiseman given #1790 (comment) , we probably need to accelerate a bit on this as it's going to be in the way of more and more users.

@@ -7,6 +7,9 @@
"headers": {
"Accept": {
"equalTo": "application/vnd.github.v3+json"
},
"Authorization": {
"absent" : true
Copy link
Member

@bitwiseman bitwiseman Mar 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gsmet Verified that the header is removed for the redirected requests!

@bitwiseman bitwiseman merged commit 1ebe446 into hub4j:main Mar 9, 2024
9 checks passed
@gsmet
Copy link
Contributor Author

gsmet commented Mar 9, 2024

Awesome, thanks for driving it home and merging!

@bitwiseman
Copy link
Member

@gsmet I'll get the release out with this change this weekend.

@gsmet
Copy link
Contributor Author

gsmet commented Mar 16, 2024

Awesome. upload-artifact@v4 is so much better. Looking forward to being able to move to it!

@gsmet
Copy link
Contributor Author

gsmet commented Mar 17, 2024

@bitwiseman I'm working on a pull request to fix quarkiverse/quarkus-github-app#580 and add the necessary events to GitHub API.

It would be nice if you could wait for it to be merged before releasing. It should arrive soon.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unable to download artifacts uploaded with upload-artifact@v4
3 participants