fix: prevent crash when checking if a missing file exists #856 #858

Merged
merged 1 commit into from Mar 16, 2022

Conversation

lbergelson
Contributor

Fixes a crash that occurred when autoDetectRequesterPays is set and
a Files.exists() call is made on a file that doesn't exist.

Refs: #856

Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:

  • Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
  • Ensure the tests and linter pass
  • Code coverage does not decrease (if any source code was changed)
  • Appropriate docs were updated (if necessary)

Fixes #856 ☕️

If you write sample code, please follow the samples format.

@lbergelson requested a review from a team as a code owner March 11, 2022 18:05
@product-auto-label (bot) added the api: storage label Mar 11, 2022
@@ -123,7 +123,7 @@ public static Builder builder() {
     private int blockSize = CloudStorageFileSystem.BLOCK_SIZE_DEFAULT;
     private int maxChannelReopens = 0;
     private @Nullable String userProject = null;
-    // This of this as "clear userProject if not RequesterPays"
+    // Think of this as "clear userProject if not RequesterPays"
Contributor Author

opportunistic typo fix

@@ -987,7 +987,7 @@ public boolean requesterPays(String bucketName) {
    * requester-pays.
    */
   public CloudStorageFileSystemProvider withNoUserProject() {
-    return new CloudStorageFileSystemProvider("", this.storageOptions);
+    return new CloudStorageFileSystemProvider(null, this.storageOptions);
Contributor Author

This matches how I've seen providers created when there's no project specified.

Contributor

To me this does not seem like a necessary change if we are already modifying the check in CloudStoragePath.java to check for both empty and null. I think we should either change this everywhere or change it nowhere; a refactoring change to make it consistent everywhere can be done at a later date.

Contributor Author

It's not necessary; I can remove it. It just seemed weird that if you specify no user project you get a null, but when the user project is removed you get an empty string. I think a refactoring pass to make it consistent would be good in order to prevent similar bugs in the future, though.
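
For illustration, the "check for both empty and null" idea discussed in this thread can be sketched as below. This is a minimal sketch, not the actual code in CloudStoragePath.java: the class `UserProjectGuard`, its methods, and the string-based options are hypothetical stand-ins for whatever request options the real code builds.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; not the real CloudStoragePath logic. */
public class UserProjectGuard {

    /** Treats both null and "" as "no user project configured". */
    public static boolean hasUserProject(String userProject) {
        return userProject != null && !userProject.isEmpty();
    }

    /**
     * Builds an illustrative list of request options. The userProject
     * option is only attached when a non-empty project is present, so an
     * empty string can never reach the request.
     */
    public static List<String> listOptions(String prefix, String userProject) {
        List<String> options = new ArrayList<>();
        options.add("prefix=" + prefix);
        options.add("maxResults=1");
        if (hasUserProject(userProject)) {
            options.add("userProject=" + userProject);
        }
        return options;
    }

    public static void main(String[] args) {
        System.out.println(listOptions("dir/", null));
        System.out.println(listOptions("dir/", ""));
        System.out.println(listOptions("dir/", "my-project"));
    }
}
```

With a guard like this, it no longer matters whether `withNoUserProject()` stores `null` or `""`, which is the reviewer's point about the provider change being optional.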

@lbergelson
Contributor Author

I didn't change the attribute setting in the config because I wasn't sure what the right thing to do there is.
Should the attribute be set to null? Should it be removed? Is empty fine? It seems like it should always be normalized to either null or "" to prevent bugs like this in the future. Let me know what you think.
disableUserProject.put("userProject", "");

CloudStorageFileSystem(
    CloudStorageFileSystemProvider provider, String bucket, CloudStorageConfiguration config) {
  checkArgument(!bucket.isEmpty(), "bucket");
  this.bucket = bucket;
  if (config.useUserProjectOnlyForRequesterPaysBuckets()) {
    if (Strings.isNullOrEmpty(config.userProject())) {
      throw new IllegalArgumentException(
          "If useUserProjectOnlyForRequesterPaysBuckets is set, then userProject must be set too.");
    }
    // detect whether we want to pay for these accesses or not.
    if (!provider.requesterPays(bucket)) {
      // update config (just to ease debugging, we're not actually using config.userProject later.
      HashMap<String, String> disableUserProject = new HashMap<>();
      disableUserProject.put("userProject", "");
      config = CloudStorageConfiguration.fromMap(config, disableUserProject);
      // update the provider (this is the most important bit)
      provider = provider.withNoUserProject();
    }
  }
  this.provider = provider;
  this.config = config;
}
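
As a rough illustration of the normalization question raised above (null versus ""), one option is to funnel every userProject value through a single helper at the boundary. The sketch below assumes null is chosen as the canonical "unset" value; `UserProjectNormalizer` is a hypothetical name and is not part of java-storage-nio.

```java
/** Hypothetical helper; not part of java-storage-nio. */
public class UserProjectNormalizer {

    /** Returns null for null or "", otherwise the value unchanged. */
    public static String normalize(String userProject) {
        return (userProject == null || userProject.isEmpty()) ? null : userProject;
    }

    public static void main(String[] args) {
        // Both "unset" spellings collapse to the same canonical value.
        System.out.println(normalize(""));
        System.out.println(normalize(null));
        System.out.println(normalize("my-project"));
    }
}
```

If every constructor and config path applied such a helper, downstream code would need only one null check instead of remembering to test for both representations.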

@cojenco added the kokoro:force-run label Mar 14, 2022
@yoshi-kokoro removed the kokoro:force-run label Mar 14, 2022
@cojenco added the owlbot:run label Mar 14, 2022
@gcf-owl-bot (bot) removed the owlbot:run label Mar 14, 2022
@droazen
Contributor

droazen commented Mar 14, 2022

@sydney-munro Note that this issue is manifesting for us in our downstream project as a "User project specified in the request is invalid" StorageException:

code:      400
message:   User project specified in the request is invalid.
reason:    invalid
location:  null
retryable: false
com.google.cloud.storage.StorageException: User project specified in the request is invalid.
	at com.google.cloud.storage.spi.v1.HttpStorageRpc.translate(HttpStorageRpc.java:233)
	at com.google.cloud.storage.spi.v1.HttpStorageRpc.list(HttpStorageRpc.java:376)
	at com.google.cloud.storage.StorageImpl.lambda$listBlobs$11(StorageImpl.java:391)
	at com.google.api.gax.retrying.DirectRetryingExecutor.submit(DirectRetryingExecutor.java:105)
	at com.google.cloud.RetryHelper.run(RetryHelper.java:76)
	at com.google.cloud.RetryHelper.runWithRetries(RetryHelper.java:50)
	at com.google.cloud.storage.Retrying.run(Retrying.java:51)
	at com.google.cloud.storage.StorageImpl.listBlobs(StorageImpl.java:388)
	at com.google.cloud.storage.StorageImpl.list(StorageImpl.java:359)
	at com.google.cloud.storage.contrib.nio.CloudStoragePath.seemsLikeADirectoryAndUsePseudoDirectories(CloudStoragePath.java:118)
	at com.google.cloud.storage.contrib.nio.CloudStorageFileSystemProvider.checkAccess(CloudStorageFileSystemProvider.java:743)
	at java.nio.file.Files.exists(Files.java:2385)
	at htsjdk.tribble.util.ParsingUtils.resourceExists(ParsingUtils.java:418)
	at htsjdk.tribble.TribbleIndexedFeatureReader.loadIndex(TribbleIndexedFeatureReader.java:162)
	at htsjdk.tribble.TribbleIndexedFeatureReader.hasIndex(TribbleIndexedFeatureReader.java:228)
	at org.broadinstitute.hellbender.engine.FeatureDataSource.<init>(FeatureDataSource.java:331)
	at org.broadinstitute.hellbender.engine.FeatureDataSource.<init>(FeatureDataSource.java:236)
	at org.broadinstitute.hellbender.engine.FeatureDataSource.<init>(FeatureDataSource.java:204)
	at org.broadinstitute.hellbender.engine.FeatureDataSource.<init>(FeatureDataSource.java:191)
	at org.broadinstitute.hellbender.engine.FeatureDataSource.<init>(FeatureDataSource.java:154)
	at org.broadinstitute.hellbender.utils.IntervalUtils.featureFileToIntervals(IntervalUtils.java:356)
	at org.broadinstitute.hellbender.utils.IntervalUtils.parseIntervalArguments(IntervalUtils.java:319)
	at org.broadinstitute.hellbender.utils.IntervalUtils.loadIntervals(IntervalUtils.java:239)
	at org.broadinstitute.hellbender.cmdline.argumentcollections.IntervalArgumentCollection.parseIntervals(IntervalArgumentCollection.java:200)
	at org.broadinstitute.hellbender.cmdline.argumentcollections.IntervalArgumentCollection.getTraversalParameters(IntervalArgumentCollection.java:180)
	at org.broadinstitute.hellbender.cmdline.argumentcollections.IntervalArgumentCollection.getIntervals(IntervalArgumentCollection.java:111)
	at org.broadinstitute.hellbender.engine.GATKTool.initializeIntervals(GATKTool.java:525)
	at org.broadinstitute.hellbender.engine.GATKTool.onStartup(GATKTool.java:728)
	at org.broadinstitute.hellbender.engine.AssemblyRegionWalker.onStartup(AssemblyRegionWalker.java:79)
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:138)
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
	at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
	at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
	at org.broadinstitute.hellbender.Main.main(Main.java:289)
Caused by: com.google.api.client.googleapis.json.GoogleJsonResponseException: 400 Bad Request
GET https://storage.googleapis.com/storage/v1/b/fc-secure-bd7b8bc9-f665-4269-997e-5a402088a369/o?maxResults=1&prefix=5c2db926-3b1c-479c-9ed3-a99ce518de91/omics_mutect2/60955825-7723-4bc9-8202-bdd9975bb5c0/call-mutect2/Mutect2/7d737efc-c8be-4a6d-8803-4f786129521a/call-SplitIntervals/glob-0fc990c5ca95eebc97c4c204e3e303e1/0000-scattered.interval_list.idx/&projection=full&userProject
{
  "code" : 400,
  "errors" : [ {
    "domain" : "global",
    "message" : "User project specified in the request is invalid.",
    "reason" : "invalid"
  } ],
  "message" : "User project specified in the request is invalid."
}
	at com.google.api.client.googleapis.json.GoogleJsonResponseException.from(GoogleJsonResponseException.java:146)
	at com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:118)
	at com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:37)
	at com.google.api.client.googleapis.services.AbstractGoogleClientRequest$1.interceptResponse(AbstractGoogleClientRequest.java:428)
	at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:1111)
	at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:514)
	at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:455)
	at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:565)
	at com.google.cloud.storage.spi.v1.HttpStorageRpc.list(HttpStorageRpc.java:366)
	... 33 more

(See broadinstitute/gatk#7716)

We believe that the patch in this PR will resolve this. Unfortunately we are seeing this error all over the place with the latest java-storage-nio release, since in practice we tend to have a mix of requester-pays and non-requester-pays inputs, and have to probe for the existence of files that may or may not be present.
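
The request URL in the trace above ends in a bare `&userProject` with no value, which is consistent with an empty-string project being sent rather than omitted. The sketch below is a simplified model of that behavior, not the HTTP client's actual code: `QuerySketch` and `toQuery` are hypothetical, and URL escaping is left out for brevity.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical model of query-string building; not the real client code. */
public class QuerySketch {

    /**
     * Skips parameters whose value is null, but still emits the key for an
     * empty-string value (without "=value"). No URL escaping is performed.
     */
    public static String toQuery(Map<String, String> params) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (e.getValue() == null) {
                continue; // null means "parameter not set"
            }
            sb.append(sb.length() == 0 ? '?' : '&').append(e.getKey());
            if (!e.getValue().isEmpty()) {
                sb.append('=').append(e.getValue());
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("maxResults", "1");
        params.put("projection", "full");
        params.put("userProject", ""); // empty, not null
        System.out.println(toQuery(params));
    }
}
```

Under this model, an empty userProject still reaches the server as a parameter, whereas a null one is dropped from the request entirely.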



@lbergelson
Contributor Author

@sydney-munro I removed the change you asked about. I think it would be a good idea to make it consistent, but so far I don't know of any actual issues caused by it.

@sydney-munro added the automerge label Mar 15, 2022
@lbergelson
Contributor Author

@sydney-munro Thank you!

@sydney-munro added the kokoro:force-run label Mar 15, 2022
@yoshi-kokoro removed the kokoro:force-run label Mar 15, 2022
@gcf-merge-on-green

Merge-on-green attempted to merge your PR for 6 hours, but it was not mergeable because either one of your required status checks failed, one of your required reviews was not approved, or there is a do not merge label. Learn more about your required status checks here: https://help.github.com/en/github/administering-a-repository/enabling-required-status-checks. You can remove and reapply the label to re-run the bot.

@gcf-merge-on-green (bot) removed the automerge label Mar 16, 2022
@lbergelson
Contributor Author

@sydney-munro It didn't merge for some reason. It looks like owl-bot hasn't completed yet. I'm not sure what that is though or how to fix the problem.

@sydney-munro added the owlbot:run label Mar 16, 2022
@droazen
Contributor

droazen commented Mar 16, 2022

@sydney-munro Would it be possible to get a release with this patch once it's merged? Thanks!

@gcf-owl-bot (bot) removed the owlbot:run label Mar 16, 2022
@lbergelson
Contributor Author

@sydney-munro Thank you for the rerun. Looks like it's green now so hopefully it can be merged.

Labels
api: storage Issues related to the googleapis/java-storage-nio API.
Development

Successfully merging this pull request may close these issues.

Crash when trying to check if a missing file exists
6 participants