feat: Add Universe Domain Support #2435

Merged
merged 24 commits into from
Mar 13, 2024

Contributor

@lqiu96 lqiu96 commented Feb 20, 2024

Universe Domain support can be implemented almost entirely inside this PR. One additional change is required in the google-api-java-client-services generator: the setter needs to be added to the generated {Service}.Builder implementation.

@Override
public Builder setUniverseDomain(String universeDomain) {
  return (Builder) super.setUniverseDomain(universeDomain);
}

This needs to be in the generated client code because the parent's implementation returns AbstractGoogleClient.Builder instead of Bigquery.Builder. Overriding the setter in the child implementation returns the correct type (Bigquery.Builder).
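
To illustrate the return-type issue described above, here is a minimal, self-contained sketch. BaseBuilder and ServiceBuilder are hypothetical stand-ins for AbstractGoogleClient.Builder and a generated {Service}.Builder; they are not the real classes, and setApplicationName here is only a placeholder for any service-specific setter.

```java
class CovariantSetterSketch {

  // Hypothetical stand-in for the parent builder.
  static class BaseBuilder {
    String universeDomain;

    // The parent setter can only promise to return the parent type.
    BaseBuilder setUniverseDomain(String universeDomain) {
      this.universeDomain = universeDomain;
      return this;
    }
  }

  // Hypothetical stand-in for a generated service builder.
  static class ServiceBuilder extends BaseBuilder {
    String applicationName;

    // Covariant override: same behavior, narrower return type, so that
    // service-specific setters can still be chained after this call.
    @Override
    ServiceBuilder setUniverseDomain(String universeDomain) {
      return (ServiceBuilder) super.setUniverseDomain(universeDomain);
    }

    ServiceBuilder setApplicationName(String applicationName) {
      this.applicationName = applicationName;
      return this;
    }
  }

  static ServiceBuilder demo() {
    // Without the override, setUniverseDomain would return BaseBuilder and
    // the chained setApplicationName call would not compile.
    return new ServiceBuilder()
        .setUniverseDomain("example.com")
        .setApplicationName("demo-app");
  }

  public static void main(String[] args) {
    ServiceBuilder builder = demo();
    System.out.println(builder.universeDomain + " " + builder.applicationName);
  }
}
```

The cast is safe in this sketch because the parent setter returns this, which is already a ServiceBuilder at runtime.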

Tested this locally with a local build of Bigquery Apiary client and a local build of the api-client library. Connection to the test environment works and the validation is called on each RPC invocation.

Changes required in Java-Storage and Java-Bigquery

{ApiaryClient}.Builder()
    .setRootUrl(options.getResolvedHost("{Apiary Client Service Name}"))
    .setUniverseDomain(options.getUniverseDomain())
    .build()
  1. The rootUrl is now set from the resolved host and no longer relies on the Apiary workaround.
  2. Pass the universe domain to the Apiary client as well. The rootUrl set above is the endpoint that is actually used, and the universe domain value does not take part in resolving it, but passing it in lets users call the universe domain getter.

Local Tests

  1. Local Bigquery Apiary - Able to connect + Universe Domain Validation
  2. Local Java-Bigquery client library - Able to connect + Universe Domain Validation

Next Steps:

PR in google-api-java-client-services: googleapis/google-api-java-client-services#19934

  • Requires this PR to be merged first and released as a new feature (minor version bump).

@product-auto-label product-auto-label bot added the size: m Pull request size is medium. label Feb 20, 2024
@product-auto-label product-auto-label bot added size: xl Pull request size is extra large. and removed size: m Pull request size is medium. labels Feb 29, 2024
@product-auto-label product-auto-label bot added size: l Pull request size is large. and removed size: xl Pull request size is extra large. labels Mar 4, 2024
@lqiu96 lqiu96 requested a review from blakeli0 March 6, 2024 21:52
@lqiu96 lqiu96 marked this pull request as ready for review March 6, 2024 21:52
@lqiu96 lqiu96 requested a review from a team as a code owner March 6, 2024 21:52
this.httpRequestInitializer = httpRequestInitializer;
this.serviceName = parseServiceName(rootUrl);
this.isUserConfiguredEndpoint =
Contributor

Why do we need to determine in the Builder whether the rootUrl is user configured based on whether it ends with the GDU domain? Is setting this.isUserConfiguredEndpoint = true; in the setter not enough?
Is it for the scenarios that customers may extend this class and call this Builder directly? I ask because I think rootUrl is hidden in the public clients, e.g. in Bigquery.

Contributor Author

> Is it for the scenarios that customers may extend this class and call this Builder directly?

Yep, it's for the (hopefully) rare use case where users directly extend this class and call the builder. The Apiary libraries don't expose the endpoint configurations in the Builder directly.

* done via {@link #setRootUrl(String)}.
*
* <p>For other uses cases that touch this Builder's constructor directly, check if the rootUrl
* passed in references the Google Default Universe (GDU). Any rootUrl value that is not set in
Contributor

I don't think this logic is correct. Customers could set regional endpoints, which also end with the GDU domain; see https://cloud.google.com/storage/docs/regional-endpoints

Contributor Author

@lqiu96 lqiu96 Mar 11, 2024

Ah. It turns out this works, but for the wrong reasons. I believe we're parsing out the serviceName and then using it to rebuild the same endpoint later on. A regional endpoint is marked as not user configured when it should be marked as user configured.

I think it would be better to change the isUserConfiguredEndpoint logic to match against a strict regex, which should cover the regional-endpoint edge case.

@blakeli0
Contributor

> Changes required in Java-Storage and Java-Bigquery

Why do they have to set both rootUrl and universeDomain? I think setting universeDomain alone should be good enough.

@lqiu96
Contributor Author

lqiu96 commented Mar 11, 2024

> Why do they have to set both rootUrl and universeDomain? I think setting universeDomain alone should be good enough.

The Apiary-wrapped libraries (java-storage and java-bigquery) may have user configurations that change the rootUrl value through their own settings (for emulators, local endpoints, etc.). We need to pass those values through so that they are not overridden when the Apiary library tries to resolve the endpoint in the code here.

@blakeli0
Contributor

> The Apiary-wrapped libraries (java-storage and java-bigquery) may have user configurations that change the rootUrl value through their own settings (for emulators, local endpoints, etc.). We need to pass those values through so that they are not overridden when the Apiary library tries to resolve the endpoint in the code here.

I see. If that's the case, I would say we don't have to pass universeDomain; setting the rootUrl should be good enough, because:

  1. The Apiary client is hidden from customers in Bigquery/Storage. Even if customers need the universe domain, they would use ServiceOptions.getUniverseDomain().
  2. This is less confusing to Bigquery/Storage developers and customers, who might otherwise think that passing both values is required.
  3. This requires no change in Bigquery/Storage.

@lqiu96
Contributor Author

lqiu96 commented Mar 11, 2024

I believe you're right that the universe domain is not needed to resolve the host. However, I just realized that it is still needed for validation. If it is not passed to the Apiary client, the resolved universe domain will always default to googleapis.com. Passing in the configured universe domain value means that value is used for validation.

Previously, the handwritten libraries were validating the universe domain, but since we're moving that validation to the Apiary client, I believe we'll need to pass the configured universe domain value here.

@blakeli0
Contributor

> Previously, the handwritten libraries were validating the universe domain, but since we're moving that validation to the Apiary client, I believe we'll need to pass the configured universe domain value here.

It may be a scenario we missed before: do we still want to validate the universe domain if the endpoint/rootUrl is provided? I think we should not, because the endpoint configuration would override the universe domain configuration if both are configured. Otherwise it would be confusing if we pass both to the client, use one for validation, and use the other for making calls.

expectedUniverseDomain = credentials.getUniverseDomain();
}
if (!expectedUniverseDomain.equals(getUniverseDomain())) {
throw new IOException(
Contributor

Why is it an IOException? It is a comparison that does not involve IO operations, hence I think it should be a runtime exception.

Contributor Author

Ah right. Will change.

* <p>The roolUrl from the Discovery Docs will always follow the format of
* https://{serviceName}(.mtls).googleapis.com/
*/
private String parseServiceName(String rootUrl) {
Contributor

I think we can use the regex to extract the service name now.

Contributor Author

Sounds good, will update.
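
As a rough sketch of what regex-based extraction could look like (not the code that was merged), the default-endpoint pattern can carry a capturing group around the service name. The class name, the constant name, the capturing group, and the escaped dots in googleapis\.com are all assumptions made for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ServiceNameSketch {
  // Assumed pattern: same shape as https://{serviceName}(.mtls).googleapis.com/
  // with group 1 capturing the service name.
  private static final Pattern DEFAULT_ENDPOINT =
      Pattern.compile("https://([a-zA-Z]*)(\\.mtls)?\\.googleapis\\.com/?");

  // Returns the service name, or null when rootUrl is not a default endpoint.
  static String parseServiceName(String rootUrl) {
    Matcher matcher = DEFAULT_ENDPOINT.matcher(rootUrl);
    return matcher.matches() ? matcher.group(1) : null;
  }

  public static void main(String[] args) {
    System.out.println(parseServiceName("https://bigquery.googleapis.com/"));
    System.out.println(parseServiceName("https://storage.mtls.googleapis.com/"));
    System.out.println(parseServiceName("https://test.random.com/"));
  }
}
```

With this shape, a URL that does not match the default format yields null instead of a partially parsed name.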

@lqiu96
Contributor Author

lqiu96 commented Mar 11, 2024

> It may be a scenario we missed before: do we still want to validate the universe domain if the endpoint/rootUrl is provided? I think we should not, because the endpoint configuration would override the universe domain configuration if both are configured. Otherwise it would be confusing if we pass both to the client, use one for validation, and use the other for making calls.

It is a bit of a weird scenario, especially given Apiary's atypical use case. I think we should validate the universe domain even if we are providing the "correct" endpoint from ServiceOptions.

Off the top of my head, I can think of these scenarios in which I think validation is required:

  1. The user sets a custom rootUrl (to the correct non-GDU endpoint) and no universeDomain. This creates a connection to the right place, and the universe domain resolves to the GDU. They supply correct Credentials that point to the non-GDU universe. While this technically works because the Credentials are valid, it does not match the behavior of the other client libraries. GAPICs would throw an exception saying that the explicitly configured universe domain does not match before the call goes through.
  2. The user sets a custom rootUrl (to the correct non-GDU endpoint) and a universeDomain. However, they supply incorrect Credentials that point to the GDU. I believe this would fail once the RPC hits the server, but the call should not go through at all (it should fail during validation). GAPICs would throw an exception saying that the explicitly configured universe domain does not match.
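
The two scenarios above can be sketched as a plain comparison between the universe domain the client resolves and the one reported by the credentials. This is an illustrative sketch only: the class, the helper names, and the example-universe.com domain are hypothetical, and the credentials are reduced to a string rather than a real Credentials object.

```java
class UniverseDomainValidationSketch {
  static final String GOOGLE_DEFAULT_UNIVERSE = "googleapis.com";

  // Hypothetical helper: an unset universe domain resolves to the GDU.
  static String resolveUniverseDomain(String configuredUniverseDomain) {
    return configuredUniverseDomain == null
        ? GOOGLE_DEFAULT_UNIVERSE
        : configuredUniverseDomain;
  }

  // Throws before any request is sent if the credentials belong to a
  // different universe than the one the client resolved.
  static void validateUniverseDomain(
      String configuredUniverseDomain, String credentialsUniverseDomain) {
    String resolved = resolveUniverseDomain(configuredUniverseDomain);
    if (!credentialsUniverseDomain.equals(resolved)) {
      throw new IllegalStateException(
          String.format(
              "Universe domain mismatch: client resolved %s but credentials use %s",
              resolved, credentialsUniverseDomain));
    }
  }

  // Returns true when validation rejects the combination.
  static boolean isRejected(String configured, String credentials) {
    try {
      validateUniverseDomain(configured, credentials);
      return false;
    } catch (IllegalStateException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    // Scenario 1: no universeDomain configured, non-GDU credentials.
    System.out.println(isRejected(null, "example-universe.com"));
    // Scenario 2: non-GDU universeDomain configured, GDU credentials.
    System.out.println(isRejected("example-universe.com", "googleapis.com"));
    // Matching configuration and credentials.
    System.out.println(isRejected("example-universe.com", "example-universe.com"));
  }
}
```

In this sketch both scenarios are rejected and only the matching combination passes; the rootUrl plays no part in the check, which is the point of passing the universe domain separately.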

> Otherwise it would be confusing if we pass both to the client, use one for validation but use another one for making calls.

Agreed. Ideally the endpoint should be resolved inside the Apiary client itself for every use case. These two handwritten Apiary wrappers are a special case only because Java-Core is used to resolve the endpoint.

I chose this path because Java-Core sets the host value to DEFAULT_HOST (https://github.com/googleapis/sdk-platform-java/blob/4b44a7851dc1d4fd2ac21a54df6c24db5625223c/java-core/google-cloud-core/src/main/java/com/google/cloud/ServiceOptions.java#L86) if nothing is configured. One option I thought about was to have the two libraries do some additional logic to filter this out.
For example:

String host = options.getHost().equals(DEFAULT_HOST) ? null : options.getHost();
{ApiaryClient}.Builder()
    .setRootUrl(host)
    .setUniverseDomain(options.getUniverseDomain())
    .build()

I didn't go with this path since it's additional work for the two handwritten libraries.

* discovery doc. Follows the format of `https://{serviceName}(.mtls).googleapis.com/`
*/
Pattern defaultEndpointRegex =
Pattern.compile("https://[a-zA-Z]*(\\.mtls)?\\.googleapis.com/?");
Contributor

I did a few quick tests, but I don't think this regex works. What worked for me is:
https:\/\/[a-zA-Z]*(\.mtls)?\.googleapis.com\/?
Can you please make sure the regex is correct and add a few tests just for the regex?

Contributor Author

Oh huh, I did add unit tests for this regex and they did pass:

// Default mTLS endpoint: not treated as user configured.
@Test
public void testIsUserSetEndpoint_mtlsRootUrl() {
  AbstractGoogleClient.Builder clientBuilder =
      new MockGoogleClient.Builder(
              TRANSPORT, "https://test.mtls.googleapis.com/", "", JSON_OBJECT_PARSER, null)
          .setApplicationName("Test Application");
  assertFalse(clientBuilder.isUserConfiguredEndpoint);
}

// Endpoint outside googleapis.com: treated as user configured.
@Test
public void testIsUserSetEndpoint_nonGDURootUrl() {
  AbstractGoogleClient.Builder clientBuilder =
      new MockGoogleClient.Builder(
              TRANSPORT, "https://test.random.com/", "", JSON_OBJECT_PARSER, null)
          .setApplicationName("Test Application");
  assertTrue(clientBuilder.isUserConfiguredEndpoint);
}

// Regional endpoint: treated as user configured.
@Test
public void testIsUserSetEndpoint_regionalEndpoint() {
  AbstractGoogleClient.Builder clientBuilder =
      new MockGoogleClient.Builder(
              TRANSPORT,
              "https://us-east-4.coolservice.googleapis.com/",
              "",
              JSON_OBJECT_PARSER,
              null)
          .setApplicationName("Test Application");
  assertTrue(clientBuilder.isUserConfiguredEndpoint);
}

Do you have the values you tested with? I may have tested incorrectly.
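
For anyone who wants to check the two regex forms outside the test suite, here is a small standalone sketch. FROM_DIFF is the pattern string copied from the diff, and SLASH_ESCAPED is the reviewer's variant written as a Java string literal; the class and constant names are made up for this example. In java.util.regex the forward slash is not a metacharacter, so escaping it is optional.

```java
import java.util.regex.Pattern;

class DefaultEndpointRegexSketch {
  // Pattern string copied from the diff. In Java source, "\\." is the regex
  // token \. (a literal dot) and "/" needs no escaping.
  static final Pattern FROM_DIFF =
      Pattern.compile("https://[a-zA-Z]*(\\.mtls)?\\.googleapis.com/?");

  // The slash-escaped form from the review comment, as a Java string literal.
  static final Pattern SLASH_ESCAPED =
      Pattern.compile("https:\\/\\/[a-zA-Z]*(\\.mtls)?\\.googleapis.com\\/?");

  // Mirrors the check in the diff: anything that is not a default endpoint
  // counts as user configured.
  static boolean isUserConfiguredEndpoint(String rootUrl) {
    return !FROM_DIFF.matcher(rootUrl).matches();
  }

  public static void main(String[] args) {
    String[] rootUrls = {
      "https://test.mtls.googleapis.com/",
      "https://test.random.com/",
      "https://us-east-4.coolservice.googleapis.com/"
    };
    for (String rootUrl : rootUrls) {
      boolean bothFormsAgree =
          FROM_DIFF.matcher(rootUrl).matches()
              == SLASH_ESCAPED.matcher(rootUrl).matches();
      System.out.println(
          rootUrl
              + " userConfigured=" + isUserConfiguredEndpoint(rootUrl)
              + " bothFormsAgree=" + bothFormsAgree);
    }
  }
}
```

Note that the dot before com is unescaped in both forms, so it matches any single character; escaping it would make the pattern stricter.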

this.httpRequestInitializer = httpRequestInitializer;
this.serviceName = parseServiceName(rootUrl);
this.isUserConfiguredEndpoint = !defaultEndpointRegex.matcher(this.rootUrl).matches();
Contributor

I'm still not convinced that we need this. Consider the following scenarios:

  1. rootUrl is set to non-GDU, serviceName would be parsed to null, hence we would use rootUrl as is in determiningEndpoint().
  2. rootUrl is set to GDU, isUserConfiguredEndpoint would be false anyway, and we would re-create the rootUrl with serviceName and universeDomain.
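
The two scenarios above can be sketched roughly as follows. This is a simplified illustration, not the library's implementation: the class, the method signature, and the example-universe.com domain are assumptions, and mTLS handling is left out entirely.

```java
class EndpointResolutionSketch {

  // Hypothetical resolution step: keep a custom rootUrl as is, otherwise
  // rebuild the endpoint from the service name and the universe domain.
  static String determineEndpoint(
      String rootUrl,
      String serviceName,
      String universeDomain,
      boolean isUserConfiguredEndpoint) {
    if (isUserConfiguredEndpoint || serviceName == null) {
      return rootUrl;
    }
    return "https://" + serviceName + "." + universeDomain + "/";
  }

  public static void main(String[] args) {
    // Scenario 1: custom rootUrl, no service name could be parsed.
    System.out.println(
        determineEndpoint("https://test.random.com/", null, "googleapis.com", true));
    // Scenario 2: default rootUrl, endpoint rebuilt for the configured universe.
    System.out.println(
        determineEndpoint(
            "https://bigquery.googleapis.com/", "bigquery", "example-universe.com", false));
  }
}
```

In this sketch a null serviceName already forces the rootUrl to be used unchanged, which is the redundancy the comment above points at.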

Contributor Author

@lqiu96 lqiu96 Mar 12, 2024

This does work, but I think the logic is misleading for certain edge cases (i.e. regional endpoints). For example, when a regional endpoint in speech, https://eu-speech.googleapis.com, is passed into the constructor, the logic in this PR would work. But I believe it sets the serviceName to eu-speech when it should just be speech, and it rebuilds the endpoint when it probably shouldn't, because it is a custom endpoint.

The more I think about this, the more I'm leaning towards fixing the serviceName parsing logic the same way we did in GAPICs. The Apiary generator should probably be the one parsing the discovery doc and generating this value as a constant inside the library. That way we don't have to worry about parsing the serviceName from user params or user-set values, and the Apiary client would have this value by default. I probably should have done that to begin with, but I didn't account for regional endpoints and I thought this way would be fine.

* Domain in the Credentials
*/
public void validateUniverseDomain() throws IOException {
if (httpRequestInitializer instanceof HttpCredentialsAdapter) {
Contributor

nit: Change this to

if (!(httpRequestInitializer instanceof HttpCredentialsAdapter)) {
    return;
}

for better readability.

@@ -499,6 +513,7 @@
<project.httpclient.version>4.5.14</project.httpclient.version>
<project.commons-codec.version>1.16.0</project.commons-codec.version>
<project.oauth.version>1.35.0</project.oauth.version>
<project.auth.version>1.22.0</project.auth.version>
Contributor

We might have discussed this in the past, but I'm a little surprised that this project does not depend on the auth library yet. We probably cannot avoid it because we need to access the Credentials object for validation, but let's be more careful here as it may introduce dependency problems for customers. @suztomo Do you have any concerns regarding this new auth dependency?

Member

This does not create a circular dependency. Looks good to me.

@lqiu96 lqiu96 merged commit 4adfed9 into main Mar 13, 2024
16 checks passed
@lqiu96 lqiu96 deleted the apiary-universe-domain branch March 13, 2024 19:23