Fixes GCS based ingestion #12445
… bucket name can contain underscores
Thanks for the fix, @tejaswini-imply.
@@ -71,7 +71,7 @@ public CloudObjectLocation(@JsonProperty("bucket") String bucket, @JsonProperty(

   public CloudObjectLocation(URI uri)
   {
-    this(uri.getHost(), uri.getPath());
+    this(uri.getAuthority(), uri.getPath());
should we fall back to getAuthority() only if getHost() returns null?
Yeah, I would be more comfortable with that too.
That would work as well.
LGTM.
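The fallback suggested in the review thread could look roughly like the sketch below. This is an illustration only, not the code merged in this PR: the class name BucketFromUri and the helper bucketOf are hypothetical, and it assumes that an unparseable host is the only reason getHost() returns null for these URIs.

```java
import java.net.URI;

public class BucketFromUri {
    // Hypothetical helper (not part of the PR): prefer the parsed host and
    // fall back to the raw authority only when the host could not be parsed,
    // for example when the bucket name contains an underscore.
    static String bucketOf(URI uri) {
        String host = uri.getHost();
        return host != null ? host : uri.getAuthority();
    }

    public static void main(String[] args) {
        // Valid host name: getHost() is used.
        System.out.println(bucketOf(URI.create("gs://my-bucket/path/file.json")));
        // Underscore in the bucket name: getHost() is null, so the authority is used.
        System.out.println(bucketOf(URI.create("gs://my_bucket/path/file.json")));
    }
}
```

With this shape, URIs whose host parses normally keep the previous behavior, and only the underscore case takes the new path.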
GCP allows GCS bucket names to contain underscores. When such a location is mapped to java.net.URI, URI#host comes out as null because URI does not allow underscores in a host name, and that null value was being used as the bucket name. This change therefore maps URI#authority to the bucket name instead.

Description
Fixed CloudObjectLocation constructor from URI

Previously, the bucket variable of CloudObjectLocation was set from the uri.getHost() value. However, java.net.URI refuses to handle an underscore in the host because it is not a valid host character, while GCS allows underscores in bucket names.

This modification maps uri.getAuthority() to the bucket name instead, which allows underscores to be present. Since S3, GCS, and Azure bucket URIs do not contain port or user information, it is assumed that the authority will return nothing more than the bucket name.

Key changed/added classes in this PR
CloudObjectLocation
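To make the behavior described above concrete, here is a small sketch comparing getHost() and getAuthority() on a few URIs. The class name UriAuthorityDemo, the helper names, and the example bucket names are made up for illustration. The last case shows why the "no port or user information" assumption matters: when either is present, the authority is more than the bucket name.

```java
import java.net.URI;

public class UriAuthorityDemo {
    // Hypothetical helpers used only for this illustration.
    static String hostOf(String s) {
        return URI.create(s).getHost();
    }

    static String authorityOf(String s) {
        return URI.create(s).getAuthority();
    }

    public static void main(String[] args) {
        // Underscore in the bucket: the host cannot be parsed, the authority can.
        System.out.println(hostOf("gs://my_bucket/dir/file.json"));      // null
        System.out.println(authorityOf("gs://my_bucket/dir/file.json")); // my_bucket

        // No underscore: host and authority agree.
        System.out.println(hostOf("gs://my-bucket/dir/file.json"));      // my-bucket
        System.out.println(authorityOf("gs://my-bucket/dir/file.json")); // my-bucket

        // User info and port are part of the authority but not of the host.
        System.out.println(hostOf("s3://user@bucket:8080/key"));         // bucket
        System.out.println(authorityOf("s3://user@bucket:8080/key"));    // user@bucket:8080
    }
}
```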
This PR has: