[services] update azure-storage-blob 12.22.0 to mitigate "Stream is already closed" #13032

Merged 12 commits on May 11, 2023

danking (Contributor) commented May 10, 2023

Fixes #12976

I do not fully understand the Azure git tagging scheme, but this commit (Azure/azure-sdk-for-java@054df3f) appears to have made `close` idempotent. It was merged in June of 2022. That commit resolved an issue (Azure/azure-sdk-for-java#24782) reporting an error very similar to our own.
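
For readers unfamiliar with the term, here is a minimal sketch of what an idempotent `close` means for an output stream. It is not the Azure SDK's implementation; `IdempotentCloseOutputStream` and `StrictOutputStream` are hypothetical names invented for this illustration, and the strict stream only imitates the "Stream is already closed" failure mode.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for a stream whose close() fails when called twice.
class StrictOutputStream extends ByteArrayOutputStream {
    int closeCalls = 0;

    @Override
    public void close() throws IOException {
        closeCalls++;
        if (closeCalls > 1) {
            throw new IOException("Stream is already closed");
        }
    }
}

// Hypothetical wrapper (not part of the Azure SDK or Hail) that makes close() idempotent.
class IdempotentCloseOutputStream extends OutputStream {
    private final OutputStream delegate;
    private final AtomicBoolean closed = new AtomicBoolean(false);

    IdempotentCloseOutputStream(OutputStream delegate) {
        this.delegate = delegate;
    }

    @Override
    public void write(int b) throws IOException {
        if (closed.get()) {
            throw new IOException("Stream is already closed");
        }
        delegate.write(b);
    }

    @Override
    public void flush() throws IOException {
        delegate.flush();
    }

    @Override
    public void close() throws IOException {
        // Only the first call reaches the delegate; later calls are no-ops.
        if (closed.compareAndSet(false, true)) {
            delegate.close();
        }
    }

    public static void main(String[] args) throws IOException {
        StrictOutputStream strict = new StrictOutputStream();
        OutputStream out = new IdempotentCloseOutputStream(strict);
        out.write(42);
        out.close();
        out.close(); // does not throw: the delegate is closed only once
        System.out.println("delegate close calls: " + strict.closeCalls);
    }
}
```

Writing after close still fails in this sketch; only repeated `close` calls are made harmless.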

All the Azure version changes update the Azure packages to their latest versions as of 2023-05-09 17:13 ET.

Unfortunately, newer Azure versions and Spark 3.3.0 have inconsistent `io.netty` requirements. Spark 3.3.0 is stuck on an `io.netty` version from back in February 2022. Spark 3.4.0 does support a compatible version of `io.netty`, but we're months from seeing that in Dataproc.

This change packages up Azure and all its dependencies into one JAR of relocated classes. We must refer to those classes by their relocated names.
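
As a rough illustration of what "relocated names" means here: the stack traces in this thread show Azure and Reactor classes under an `is.hail.shadedazure` prefix. The sketch below assumes relocation is a simple prefix on the original package name, which is what those traces suggest; the actual relocation rules live in the build and are not shown in this thread. `RelocatedNames` and `relocate` are hypothetical names for this example.

```java
// Hypothetical helper; assumes relocation simply prepends a prefix to the original name.
class RelocatedNames {
    // Prefix as it appears in the stack traces in this thread.
    static final String SHADE_PREFIX = "is.hail.shadedazure";

    // Map an original fully qualified class name to its assumed relocated name.
    static String relocate(String originalClassName) {
        return SHADE_PREFIX + "." + originalClassName;
    }

    public static void main(String[] args) {
        System.out.println(relocate("com.azure.storage.blob.models.BlobStorageException"));
        System.out.println(relocate("reactor.core.publisher.Mono"));
    }
}
```

Code that imports or catches Azure types has to use the relocated form, since the unrelocated classes are not in the packaged JAR.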

@danking danking force-pushed the update-azure-storage-blob branch 2 times, most recently from 56f437d to d12a12c Compare May 10, 2023 19:59
@danking danking force-pushed the update-azure-storage-blob branch from d12a12c to 9e28c82 Compare May 10, 2023 20:00
@danking danking closed this May 10, 2023
danking (Contributor Author) commented May 10, 2023

https://batch.azure.hail.is/batches/3834432/jobs/1

I feel close, but it's still not working. Now I'm getting a signature mismatch error from the server.

danking (Contributor Author) commented May 10, 2023

I have no idea what this means:

Caused by: is.hail.shadedazure.com.azure.storage.blob.models.BlobStorageException: If you are using a StorageSharedKeyCredential, and the server returned an error message that says 'Signature did not match', you can compare the string to sign with the one generated by the SDK. To log the string to sign, pass in the context key value pair 'Azure-Storage-Log-String-To-Sign': true to the appropriate method call.
If you are using a SAS token, and the server returned an error message that says 'Signature did not match', you can compare the string to sign with the one generated by the SDK. To log the string to sign, pass in the context key value pair 'Azure-Storage-Log-String-To-Sign': true to the appropriate generateSas method call.
Please remember to disable 'Azure-Storage-Log-String-To-Sign' before going to production as this string can potentially contain PII.
Status code 403, (empty body)

danking (Contributor Author) commented May 10, 2023

The suppressed message is:

	Suppressed: java.lang.Exception: #block terminated with an error
		at is.hail.shadedazure.reactor.core.publisher.BlockingSingleSubscriber.blockingGet(BlockingSingleSubscriber.java:99)
		at is.hail.shadedazure.reactor.core.publisher.Mono.block(Mono.java:1742)
		at is.hail.shadedazure.com.azure.storage.common.implementation.StorageImplUtils.blockWithOptionalTimeout(StorageImplUtils.java:133)
		at is.hail.shadedazure.com.azure.storage.blob.specialized.BlobClientBase.getPropertiesWithResponse(BlobClientBase.java:1379)
		at is.hail.shadedazure.com.azure.storage.blob.specialized.BlobClientBase.getProperties(BlobClientBase.java:1348)
		at is.hail.io.fs.AzureStorageFS.$anonfun$openNoCompression$1(AzureStorageFS.scala:223)
		at is.hail.io.fs.AzureStorageFS.$anonfun$handlePublicAccessError$1(AzureStorageFS.scala:175)
		at is.hail.services.package$.retryTransientErrors(package.scala:124)
		at is.hail.io.fs.AzureStorageFS.handlePublicAccessError(AzureStorageFS.scala:174)
		at is.hail.io.fs.AzureStorageFS.openNoCompression(AzureStorageFS.scala:220)
		at is.hail.io.fs.RouterFS.openNoCompression(RouterFS.scala:20)
		at is.hail.io.fs.FS.openNoCompression(FS.scala:322)
		at is.hail.io.fs.FS.openNoCompression$(FS.scala:322)
		at is.hail.io.fs.RouterFS.openNoCompression(RouterFS.scala:3)
		at is.hail.backend.service.ServiceBackendSocketAPI2$.$anonfun$main$3(ServiceBackend.scala:459)
		at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
		at is.hail.services.package$.retryTransientErrors(package.scala:124)
		at is.hail.backend.service.ServiceBackendSocketAPI2$.main(ServiceBackend.scala:459)
		at is.hail.backend.service.Main$.main(Main.scala:15)
		at is.hail.backend.service.Main.main(Main.scala)
		at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
		at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
		at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
		at java.lang.reflect.Method.invoke(Method.java:498)
		at is.hail.JVMEntryway$1.run(JVMEntryway.java:119)
		at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
		at java.util.concurrent.FutureTask.run(FutureTask.java:266)
		at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
		at java.util.concurrent.FutureTask.run(FutureTask.java:266)
		at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
		at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
		... 1 more
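
For context on where a `Suppressed:` entry comes from: Java attaches suppressed exceptions to a throwable via `Throwable.addSuppressed`, and they are printed alongside the cause chain. The sketch below uses only the standard library and fabricated exceptions to show how such a report is assembled; `SuppressedReport` is a hypothetical name, and the messages merely echo the ones in this thread.

```java
// Hypothetical helper that prints a cause chain together with suppressed exceptions.
class SuppressedReport {
    static String describe(Throwable top) {
        StringBuilder sb = new StringBuilder();
        // Walk the cause chain from the outermost exception inward.
        for (Throwable t = top; t != null; t = t.getCause()) {
            if (t != top) {
                sb.append("Caused by: ");
            }
            sb.append(t).append('\n');
            // Suppressed exceptions hang off the throwable they were added to.
            for (Throwable s : t.getSuppressed()) {
                sb.append("\tSuppressed: ").append(s).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Illustrative exceptions only; not the real Azure or Reactor types.
        RuntimeException root = new RuntimeException("Status code 403, (empty body)");
        root.addSuppressed(new Exception("#block terminated with an error"));
        System.out.print(describe(new IllegalStateException("open failed", root)));
    }
}
```

Read this way, the suppressed entry mostly records where the blocking call happened; the underlying failure is still the 403 in the `Caused by` exception.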

danking (Contributor Author) commented May 10, 2023

Oh shit this just means my service account can't write to this bucket.

@danking danking reopened this May 10, 2023
danking (Contributor Author) commented May 11, 2023

🤦

@danking danking closed this May 11, 2023
@danking danking reopened this May 11, 2023
@danking danking merged commit b57066f into hail-is:main May 11, 2023
Development

Successfully merging this pull request may close these issues.

QoB is raising "Stream is already closed" when closing an FS stream
BlobOutputStream#close should be idempotent