
Java cannot find certification path #9354

Closed · 2 of 13 tasks

GergelyKalmar opened this issue Feb 15, 2024 · 15 comments

@GergelyKalmar

Description

We're seeing the following error when running our test suite in GitHub Actions:

Caused by: java.io.IOException: Error getting subject token from metadata server: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target

The test suite runs fine locally in an Ubuntu 20.04 LTS desktop environment, and it also runs fine inside a clean Ubuntu base image in Docker.

Platforms affected

  • Azure DevOps
  • GitHub Actions - Standard Runners
  • GitHub Actions - Larger Runners

Runner images affected

  • Ubuntu 20.04
  • Ubuntu 22.04
  • macOS 11
  • macOS 12
  • macOS 13
  • macOS 13 Arm64
  • macOS 14
  • macOS 14 Arm64
  • Windows Server 2019
  • Windows Server 2022

Image version and build link

See https://github.com/logikal-io/mindlab/actions/runs/7835150901/job/21629489569

Is it a regression?

Probably not

Expected behavior

We expect our tests to succeed.

Actual behavior

The tests fail.

Repro steps

Run the test suite in the given repository in GitHub Actions.

Note that the issue has also been reported at GoogleCloudDataproc/hadoop-connectors#1106; however, as it only appears on GitHub Actions and not when running locally or on a clean Ubuntu image, we think it might be related to the GitHub Actions image.

@mikhailkoliada
Contributor

Hi! Please provide a minimal project that reproduces the problem.

@GergelyKalmar
Author

Sure! The project itself is open source, so you can clone or fork it from https://github.com/logikal-io/mindlab; that should reproduce the problem. Everything is public, including the workflows. You can run the failing test with pytest -k 'spark and gs:'.

If you need anything else, e.g. a Dockerfile with the working example, let me know.

@mikhailkoliada
Contributor

Well, I'd say it is still too broad to identify the problem. From what we've seen, we still need the following info:

  1. How is Java itself being invoked during the testing?
  2. Are any other options being used for the Java invocation?
  3. What version of Java / which JDK vendor is being used?

@mikhailkoliada
Contributor

Apart from that, as far as I understand, the auth failure happens because of the lack of some JVM parameters such as javax.net.ssl.keyStore. Are they handled correctly?

@GergelyKalmar
Author

Good questions! The application uses PySpark to read a test file from a Google Cloud Storage bucket (using the gcs-connector from https://github.com/GoogleCloudDataproc/hadoop-connectors). From what I understand, Python invokes Java under the hood via Py4J (https://www.py4j.org/py4j_java_gateway.html). I don't think there are any particular options used for this invocation, or if there are, they are handled by PySpark. I generally just install Hadoop and default-jdk (I believe that's OpenJDK 11 on Ubuntu 20.04 LTS), and after setting JAVA_HOME, HADOOP_HOME, LD_LIBRARY_PATH_PREFIX and SPARK_DIST_CLASSPATH it usually just works.

I'm not sure whether the problem is with reading from the bucket or with identity federation. I'm not setting any parameters like javax.net.ssl.keyStore, but I suppose I could if that is necessary; I'm wondering why it works locally without that.
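
For what it's worth, a minimal sketch of setting such a parameter explicitly (the store path and the default "changeit" password below are the stock Ubuntu/JDK values, not settings from this project):

from pyspark.sql import SparkSession

# Assumption: point the driver JVM at the system trust store; the path
# and password are illustrative defaults, not project configuration.
spark = (
    SparkSession.builder
    .config(
        'spark.driver.extraJavaOptions',
        '-Djavax.net.ssl.trustStore=/etc/ssl/certs/java/cacerts '
        '-Djavax.net.ssl.trustStorePassword=changeit',
    )
    .getOrCreate()
)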

@mikhailkoliada
Contributor

@GergelyKalmar could you please describe your local Docker workflow (the Dockerfile, the build commands, the commands you run in the container, which Docker image is used as the base, etc.) so that we can compare the outputs from Docker and from the runner?

@GergelyKalmar
Author

Sure. You can use these commands in a Dockerfile:

FROM ubuntu:20.04

# Base tooling for cloning the repositories and building the environments
RUN apt-get update && apt-get install -y git python3-pip python3-venv \
    && rm -rf /var/lib/apt/lists/*

RUN pip3 install --no-cache-dir --upgrade pip pyorbs

ENV HOME=/root

# Install Hadoop via the public Ansible playbooks
WORKDIR $HOME
RUN git clone https://github.com/logikal-io/ansible-public-playbooks.git
WORKDIR $HOME/ansible-public-playbooks
RUN orb --make ansible --no-cache
RUN orb ansible -c './run-roles -r hadoop'

# Set up the project environment
WORKDIR $HOME
RUN git clone https://github.com/logikal-io/mindlab.git
WORKDIR $HOME/mindlab
RUN git checkout update-dependencies
RUN orb --make mindlab --no-cache

Then you can run the image and the tests as follows:

docker build --tag test .
docker run -it -v $HOME/.config:/root/.config test /bin/bash
source ~/.bashrc.d/hadoop.bashrc
orb mindlab -c "pytest --live -k 'spark and gs:'"

Note that you need to have the gcloud CLI installed, and the application default credentials need to be available on your local machine too (see https://cloud.google.com/docs/authentication/provide-credentials-adc#google-idp).

The test reads a CSV file from a public bucket, so there should be no access problems.
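
For reference, a minimal sketch of the kind of read the failing test performs (the bucket path is taken from the logs further below; the header flag and session wiring are assumptions, the real test may differ):

from pyspark.sql import SparkSession

# Assumes the gcs-connector is on the Spark classpath and application
# default credentials are available.
spark = SparkSession.builder.getOrCreate()
df = spark.read.csv('gs://test-data-mindlab-logikal-io/order_line_items.csv', header=True)
df.show()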

@GergelyKalmar
Author

If there is anything I can do to get some more information about the error (e.g. some Java logs), please let me know! I tried setting conf.set('spark.executor.extraJavaOptions', '-Djavax.net.debug=ssl') but it does not seem to have done anything in my case.
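
For context, a sketch of where these debug flags can go (standard Spark config keys; the exact wiring in the project may differ):

from pyspark import SparkConf

conf = SparkConf()
# Executor JVMs only; if the failing auth call runs in the driver JVM
# (as it would in local mode), this alone produces no extra output.
conf.set('spark.executor.extraJavaOptions', '-Djavax.net.debug=ssl')
# Driver JVM: likely where the gcs-connector token fetch happens here.
conf.set('spark.driver.extraJavaOptions', '-Djavax.net.debug=ssl')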

@GergelyKalmar
Author

I changed the debug option from the executors to the driver; that worked, and there are some debug-level logs now (see https://github.com/logikal-io/mindlab/actions/runs/7940823201/job/21682513603). Here is what I see when running locally:

javax.net.ssl|DEBUG|11|Thread-3|2024-02-17 10:38:20.108 CET|SSLCipher.java:464|jdk.tls.keyLimits:  entry = AES/GCM/NoPadding KeyUpdate 2^37. AES/GCM/NOPADDING:KEYUPDATE = 137438953472
javax.net.ssl|DEBUG|11|Thread-3|2024-02-17 10:38:20.123 CET|SSLCipher.java:464|jdk.tls.keyLimits:  entry =  ChaCha20-Poly1305 KeyUpdate 2^37. CHACHA20-POLY1305:KEYUPDATE = 137438953472
javax.net.ssl|DEBUG|41|gcsfs-misc-0|2024-02-17 10:38:20.534 CET|Utilities.java:73|the previous server name in SNI (type=host_name (0), value=oauth2.googleapis.com) was replaced with (type=host_name (0), value=oauth2.googleapis.com)
javax.net.ssl|DEBUG|41|gcsfs-misc-0|2024-02-17 10:38:20.599 CET|SSLCipher.java:1866|KeyLimit read side: algorithm = AES/GCM/NOPADDING:KEYUPDATE
countdown value = 137438953472
javax.net.ssl|DEBUG|41|gcsfs-misc-0|2024-02-17 10:38:20.600 CET|SSLCipher.java:2020|KeyLimit write side: algorithm = AES/GCM/NOPADDING:KEYUPDATE
countdown value = 137438953472
javax.net.ssl|DEBUG|41|gcsfs-misc-0|2024-02-17 10:38:20.634 CET|SSLCipher.java:1866|KeyLimit read side: algorithm = AES/GCM/NOPADDING:KEYUPDATE
countdown value = 137438953472
javax.net.ssl|DEBUG|41|gcsfs-misc-0|2024-02-17 10:38:20.635 CET|SSLCipher.java:2020|KeyLimit write side: algorithm = AES/GCM/NOPADDING:KEYUPDATE
countdown value = 137438953472
javax.net.ssl|DEBUG|41|gcsfs-misc-0|2024-02-17 10:38:20.758 CET|Utilities.java:73|the previous server name in SNI (type=host_name (0), value=storage.googleapis.com) was replaced with (type=host_name (0), value=storage.googleapis.com)
javax.net.ssl|DEBUG|11|Thread-3|2024-02-17 10:38:20.758 CET|Utilities.java:73|the previous server name in SNI (type=host_name (0), value=storage.googleapis.com) was replaced with (type=host_name (0), value=storage.googleapis.com)
javax.net.ssl|DEBUG|11|Thread-3|2024-02-17 10:38:20.785 CET|SSLCipher.java:1866|KeyLimit read side: algorithm = AES/GCM/NOPADDING:KEYUPDATE
countdown value = 137438953472
javax.net.ssl|DEBUG|11|Thread-3|2024-02-17 10:38:20.786 CET|SSLCipher.java:2020|KeyLimit write side: algorithm = AES/GCM/NOPADDING:KEYUPDATE
countdown value = 137438953472
javax.net.ssl|DEBUG|41|gcsfs-misc-0|2024-02-17 10:38:20.788 CET|SSLCipher.java:1866|KeyLimit read side: algorithm = AES/GCM/NOPADDING:KEYUPDATE
countdown value = 137438953472
javax.net.ssl|DEBUG|41|gcsfs-misc-0|2024-02-17 10:38:20.788 CET|SSLCipher.java:2020|KeyLimit write side: algorithm = AES/GCM/NOPADDING:KEYUPDATE
countdown value = 137438953472
javax.net.ssl|DEBUG|41|gcsfs-misc-0|2024-02-17 10:38:20.796 CET|SSLCipher.java:1866|KeyLimit read side: algorithm = AES/GCM/NOPADDING:KEYUPDATE
countdown value = 137438953472
javax.net.ssl|DEBUG|41|gcsfs-misc-0|2024-02-17 10:38:20.797 CET|SSLCipher.java:2020|KeyLimit write side: algorithm = AES/GCM/NOPADDING:KEYUPDATE
countdown value = 137438953472
javax.net.ssl|DEBUG|11|Thread-3|2024-02-17 10:38:20.798 CET|SSLCipher.java:1866|KeyLimit read side: algorithm = AES/GCM/NOPADDING:KEYUPDATE
countdown value = 137438953472
javax.net.ssl|DEBUG|11|Thread-3|2024-02-17 10:38:20.799 CET|SSLCipher.java:2020|KeyLimit write side: algorithm = AES/GCM/NOPADDING:KEYUPDATE
countdown value = 137438953472
javax.net.ssl|DEBUG|4B|Executor task launch worker for task 0.0 in stage 0.0 (TID 0)|2024-02-17 10:38:27.246 CET|Utilities.java:73|the previous server name in SNI (type=host_name (0), value=storage.googleapis.com) was replaced with (type=host_name (0), value=storage.googleapis.com)
javax.net.ssl|DEBUG|4B|Executor task launch worker for task 0.0 in stage 0.0 (TID 0)|2024-02-17 10:38:27.270 CET|SSLCipher.java:1866|KeyLimit read side: algorithm = AES/GCM/NOPADDING:KEYUPDATE
countdown value = 137438953472
javax.net.ssl|DEBUG|4B|Executor task launch worker for task 0.0 in stage 0.0 (TID 0)|2024-02-17 10:38:27.270 CET|SSLCipher.java:2020|KeyLimit write side: algorithm = AES/GCM/NOPADDING:KEYUPDATE
countdown value = 137438953472
javax.net.ssl|DEBUG|4B|Executor task launch worker for task 0.0 in stage 0.0 (TID 0)|2024-02-17 10:38:27.271 CET|SSLCipher.java:1866|KeyLimit read side: algorithm = AES/GCM/NOPADDING:KEYUPDATE
countdown value = 137438953472
javax.net.ssl|DEBUG|4B|Executor task launch worker for task 0.0 in stage 0.0 (TID 0)|2024-02-17 10:38:27.272 CET|SSLCipher.java:2020|KeyLimit write side: algorithm = AES/GCM/NOPADDING:KEYUPDATE
countdown value = 137438953472

And here are the logs when running in GitHub Actions:

javax.net.ssl|DEBUG|10|Thread-3|2024-02-17 09:44:29.425 UTC|SSLCipher.java:464|jdk.tls.keyLimits:  entry = AES/GCM/NoPadding KeyUpdate 2^37. AES/GCM/NOPADDING:KEYUPDATE = 137438953472
javax.net.ssl|DEBUG|10|Thread-3|2024-02-17 09:44:29.434 UTC|SSLCipher.java:464|jdk.tls.keyLimits:  entry =  ChaCha20-Poly1305 KeyUpdate 2^37. CHACHA20-POLY1305:KEYUPDATE = 137438953472
javax.net.ssl|DEBUG|10|Thread-3|2024-02-17 09:44:29.752 UTC|Utilities.java:73|the previous server name in SNI (type=host_name (0), value=pipelinesghubeus12.actions.githubusercontent.com) was replaced with (type=host_name (0), value=pipelinesghubeus12.actions.githubusercontent.com)
javax.net.ssl|ERROR|10|Thread-3|2024-02-17 09:44:29.823 UTC|TransportContext.java:352|Fatal (CERTIFICATE_UNKNOWN): PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target (
"throwable" : {
  sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
  	at java.base/sun.security.validator.PKIXValidator.doBuild(PKIXValidator.java:439)
  	at java.base/sun.security.validator.PKIXValidator.engineValidate(PKIXValidator.java:306)
  	at java.base/sun.security.validator.Validator.validate(Validator.java:264)
  	at java.base/sun.security.ssl.X509TrustManagerImpl.validate(X509TrustManagerImpl.java:313)
  	at java.base/sun.security.ssl.X509TrustManagerImpl.checkTrusted(X509TrustManagerImpl.java:222)
  	at java.base/sun.security.ssl.X509TrustManagerImpl.checkServerTrusted(X509TrustManagerImpl.java:129)
  	at java.base/sun.security.ssl.CertificateMessage$T12CertificateConsumer.checkServerCerts(CertificateMessage.java:638)
  	at java.base/sun.security.ssl.CertificateStatus$CertificateStatusConsumer.consume(CertificateStatus.java:295)
  	at java.base/sun.security.ssl.SSLHandshake.consume(SSLHandshake.java:392)
  	at java.base/sun.security.ssl.HandshakeContext.dispatch(HandshakeContext.java:443)
  	at java.base/sun.security.ssl.HandshakeContext.dispatch(HandshakeContext.java:421)
  	at java.base/sun.security.ssl.TransportContext.dispatch(TransportContext.java:183)
  	at java.base/sun.security.ssl.SSLTransport.decode(SSLTransport.java:172)
  	at java.base/sun.security.ssl.SSLSocketImpl.decode(SSLSocketImpl.java:1511)
  	at java.base/sun.security.ssl.SSLSocketImpl.readHandshakeRecord(SSLSocketImpl.java:1421)
  	at java.base/sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:456)
  	at java.base/sun.security.ssl.SSLSocketImpl.startHandshake(SSLSocketImpl.java:427)
  	at java.base/sun.net.www.protocol.https.HttpsClient.afterConnect(HttpsClient.java:580)
  	at java.base/sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:201)
  	at java.base/sun.net.www.protocol.https.HttpsURLConnectionImpl.connect(HttpsURLConnectionImpl.java:168)
  	at com.google.cloud.hadoop.repackaged.gcs.com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:151)
  	at com.google.cloud.hadoop.repackaged.gcs.com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:84)
  	at com.google.cloud.hadoop.repackaged.gcs.com.google.api.client.http.HttpRequest.execute(HttpRequest.java:1012)
  	at com.google.cloud.hadoop.repackaged.gcs.com.google.auth.oauth2.IdentityPoolCredentials.getSubjectTokenFromMetadataServer(IdentityPoolCredentials.java:238)
  	at com.google.cloud.hadoop.repackaged.gcs.com.google.auth.oauth2.IdentityPoolCredentials.retrieveSubjectToken(IdentityPoolCredentials.java:188)
  	at com.google.cloud.hadoop.repackaged.gcs.com.google.auth.oauth2.IdentityPoolCredentials.refreshAccessToken(IdentityPoolCredentials.java:169)
  	at com.google.cloud.hadoop.repackaged.gcs.com.google.auth.oauth2.OAuth2Credentials$1.call(OAuth2Credentials.java:257)
  	at com.google.cloud.hadoop.repackaged.gcs.com.google.auth.oauth2.OAuth2Credentials$1.call(OAuth2Credentials.java:254)
  	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
  	at com.google.cloud.hadoop.repackaged.gcs.com.google.auth.oauth2.OAuth2Credentials$RefreshTask.run(OAuth2Credentials.java:623)
  	at com.google.cloud.hadoop.repackaged.gcs.com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:31)
  	at com.google.cloud.hadoop.repackaged.gcs.com.google.auth.oauth2.OAuth2Credentials$AsyncRefreshResult.executeIfNew(OAuth2Credentials.java:571)
  	at com.google.cloud.hadoop.repackaged.gcs.com.google.auth.oauth2.OAuth2Credentials.asyncFetch(OAuth2Credentials.java:220)
  	at com.google.cloud.hadoop.repackaged.gcs.com.google.auth.oauth2.OAuth2Credentials.getRequestMetadata(OAuth2Credentials.java:170)
  	at com.google.cloud.hadoop.repackaged.gcs.com.google.auth.oauth2.ExternalAccountCredentials.getRequestMetadata(ExternalAccountCredentials.java:334)
  	at com.google.cloud.hadoop.repackaged.gcs.com.google.auth.http.HttpCredentialsAdapter.initialize(HttpCredentialsAdapter.java:96)
  	at com.google.cloud.hadoop.repackaged.gcs.com.google.cloud.hadoop.util.RetryHttpInitializer.initialize(RetryHttpInitializer.java:80)
  	at com.google.cloud.hadoop.repackaged.gcs.com.google.cloud.hadoop.util.ChainingHttpRequestInitializer.initialize(ChainingHttpRequestInitializer.java:52)
  	at com.google.cloud.hadoop.repackaged.gcs.com.google.api.client.http.HttpRequestFactory.buildRequest(HttpRequestFactory.java:91)
  	at com.google.cloud.hadoop.repackaged.gcs.com.google.api.client.googleapis.services.AbstractGoogleClientRequest.buildHttpRequest(AbstractGoogleClientRequest.java:415)
  	at com.google.cloud.hadoop.repackaged.gcs.com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:525)
  	at com.google.cloud.hadoop.repackaged.gcs.com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:466)
  	at com.google.cloud.hadoop.repackaged.gcs.com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:576)
  	at com.google.cloud.hadoop.repackaged.gcs.com.google.cloud.hadoop.gcsio.GoogleCloudStorageImpl.getObject(GoogleCloudStorageImpl.java:1980)
  	at com.google.cloud.hadoop.repackaged.gcs.com.google.cloud.hadoop.gcsio.GoogleCloudStorageImpl.getItemInfo(GoogleCloudStorageImpl.java:1882)
  	at com.google.cloud.hadoop.repackaged.gcs.com.google.cloud.hadoop.gcsio.GoogleCloudStorageFileSystemImpl.getFileInfoInternal(GoogleCloudStorageFileSystemImpl.java:861)
  	at com.google.cloud.hadoop.repackaged.gcs.com.google.cloud.hadoop.gcsio.GoogleCloudStorageFileSystemImpl.getFileInfo(GoogleCloudStorageFileSystemImpl.java:833)
  	at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem.getFileStatus(GoogleHadoopFileSystem.java:724)
  	at org.apache.hadoop.fs.FileSystem.isDirectory(FileSystem.java:1777)
  	at org.apache.spark.sql.execution.streaming.FileStreamSink$.hasMetadata(FileStreamSink.scala:54)
  	at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:366)
  	at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:229)
  	at org.apache.spark.sql.DataFrameReader.$anonfun$load$2(DataFrameReader.scala:211)
  	at scala.Option.getOrElse(Option.scala:189)
  	at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:211)
  	at org.apache.spark.sql.DataFrameReader.csv(DataFrameReader.scala:538)
  	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
  	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
  	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:374)
  	at py4j.Gateway.invoke(Gateway.java:282)
  	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
  	at py4j.commands.CallCommand.execute(CallCommand.java:79)
  	at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:182)
  	at py4j.ClientServerConnection.run(ClientServerConnection.java:106)
  	at java.base/java.lang.Thread.run(Thread.java:829)
  Caused by: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
  	at java.base/sun.security.provider.certpath.SunCertPathBuilder.build(SunCertPathBuilder.java:148)
  	at java.base/sun.security.provider.certpath.SunCertPathBuilder.engineBuild(SunCertPathBuilder.java:129)
  	at java.base/java.security.cert.CertPathBuilder.build(CertPathBuilder.java:297)
  	at java.base/sun.security.validator.PKIXValidator.doBuild(PKIXValidator.java:434)
  	... 67 more}
)
javax.net.ssl|DEBUG|10|Thread-3|2024-02-17 09:44:29.824 UTC|SSLSocketImpl.java:1741|close the underlying socket
javax.net.ssl|DEBUG|10|Thread-3|2024-02-17 09:44:29.824 UTC|SSLSocketImpl.java:1760|close the SSL connection (initiative)
24/02/17 09:44:29 WARN FileStreamSink: Assume no metadata directory. Error while looking for metadata directory in the path: gs://test-data-mindlab-logikal-io/order_line_items.csv.

@shamil-mubarakshin
Contributor

Hey @GergelyKalmar,
I somehow cannot reproduce the exact error. I reproduced the workflow, omitting the AWS steps and reducing the Actions invocation layers. While it failed at the end, it did so with a 'storage.objects.get' denied on resource error; the GCS bucket doesn't seem to be public. The log entries before the error resemble those of your local run:

----------------------------- Captured stdout call -----------------------------
javax.net.ssl|DEBUG|10|Thread-3|2024-02-20 11:59:55.020 UTC|SSLCipher.java:464|jdk.tls.keyLimits:  entry = AES/GCM/NoPadding KeyUpdate 2^37. AES/GCM/NOPADDING:KEYUPDATE = 137438953472
javax.net.ssl|DEBUG|10|Thread-3|2024-02-20 11:59:55.031 UTC|SSLCipher.java:464|jdk.tls.keyLimits:  entry =  ChaCha20-Poly1305 KeyUpdate 2^37. CHACHA20-POLY1305:KEYUPDATE = 137438953472
javax.net.ssl|DEBUG|3C|gcsfs-misc-0|2024-02-20 11:59:55.468 UTC|Utilities.java:73|the previous server name in SNI (type=host_name (0), value=oauth2.googleapis.com) was replaced with (type=host_name (0), value=oauth2.googleapis.com)
javax.net.ssl|DEBUG|3C|gcsfs-misc-0|2024-02-20 11:59:55.527 UTC|SSLCipher.java:1866|KeyLimit read side: algorithm = AES/GCM/NOPADDING:KEYUPDATE
countdown value = 137438953472
javax.net.ssl|DEBUG|3C|gcsfs-misc-0|2024-02-20 11:59:55.529 UTC|SSLCipher.java:2020|KeyLimit write side: algorithm = AES/GCM/NOPADDING:KEYUPDATE
countdown value = 137438953472
javax.net.ssl|DEBUG|3C|gcsfs-misc-0|2024-02-20 11:59:55.578 UTC|SSLCipher.java:1866|KeyLimit read side: algorithm = AES/GCM/NOPADDING:KEYUPDATE
countdown value = 137438953472
javax.net.ssl|DEBUG|3C|gcsfs-misc-0|2024-02-20 11:59:55.579 UTC|SSLCipher.java:2020|KeyLimit write side: algorithm = AES/GCM/NOPADDING:KEYUPDATE
countdown value = 137438953472
javax.net.ssl|DEBUG|10|Thread-3|2024-02-20 11:59:55.631 UTC|Utilities.java:73|the previous server name in SNI (type=host_name (0), value=storage.googleapis.com) was replaced with (type=host_name (0), value=storage.googleapis.com)
javax.net.ssl|DEBUG|3C|gcsfs-misc-0|2024-02-20 11:59:55.632 UTC|Utilities.java:73|the previous server name in SNI (type=host_name (0), value=storage.googleapis.com) was replaced with (type=host_name (0), value=storage.googleapis.com)
javax.net.ssl|DEBUG|10|Thread-3|2024-02-20 11:59:55.651 UTC|SSLCipher.java:1866|KeyLimit read side: algorithm = AES/GCM/NOPADDING:KEYUPDATE
countdown value = 137438953472
javax.net.ssl|DEBUG|10|Thread-3|2024-02-20 11:59:55.653 UTC|SSLCipher.java:2020|KeyLimit write side: algorithm = AES/GCM/NOPADDING:KEYUPDATE
countdown value = 137438953472
javax.net.ssl|DEBUG|3C|gcsfs-misc-0|2024-02-20 11:59:55.656 UTC|SSLCipher.java:1866|KeyLimit read side: algorithm = AES/GCM/NOPADDING:KEYUPDATE
countdown value = 137438953472
javax.net.ssl|DEBUG|3C|gcsfs-misc-0|2024-02-20 11:59:55.656 UTC|SSLCipher.java:2020|KeyLimit write side: algorithm = AES/GCM/NOPADDING:KEYUPDATE
countdown value = 137438953472
javax.net.ssl|DEBUG|10|Thread-3|2024-02-20 11:59:55.678 UTC|SSLCipher.java:1866|KeyLimit read side: algorithm = AES/GCM/NOPADDING:KEYUPDATE
countdown value = 137438953472
javax.net.ssl|DEBUG|10|Thread-3|2024-02-20 11:59:55.679 UTC|SSLCipher.java:2020|KeyLimit write side: algorithm = AES/GCM/NOPADDING:KEYUPDATE
countdown value = 137438953472
javax.net.ssl|DEBUG|3C|gcsfs-misc-0|2024-02-20 11:59:55.682 UTC|SSLCipher.java:1866|KeyLimit read side: algorithm = AES/GCM/NOPADDING:KEYUPDATE
countdown value = 137438953472
javax.net.ssl|DEBUG|3C|gcsfs-misc-0|2024-02-20 11:59:55.686 UTC|SSLCipher.java:2020|KeyLimit write side: algorithm = AES/GCM/NOPADDING:KEYUPDATE
countdown value = 137438953472
24/02/20 11:59:55 WARN FileStreamSink: Assume no metadata directory. Error while looking for metadata directory in the path: gs://test-data-mindlab-logikal-io/order_line_items.csv.
java.io.IOException: Error accessing gs://test-data-mindlab-logikal-io/order_line_items.csv
	at com.google.cloud.hadoop.repackaged.gcs.com.google.cloud.hadoop.gcsio.GoogleCloudStorageImpl.getObject(GoogleCloudStorageImpl.java:1986)
	at com.google.cloud.hadoop.repackaged.gcs.com.google.cloud.hadoop.gcsio.GoogleCloudStorageImpl.getItemInfo(GoogleCloudStorageImpl.java:1882)
	at com.google.cloud.hadoop.repackaged.gcs.com.google.cloud.hadoop.gcsio.GoogleCloudStorageFileSystemImpl.getFileInfoInternal(GoogleCloudStorageFileSystemImpl.java:861)
	at com.google.cloud.hadoop.repackaged.gcs.com.google.cloud.hadoop.gcsio.GoogleCloudStorageFileSystemImpl.getFileInfo(GoogleCloudStorageFileSystemImpl.java:833)
	at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem.getFileStatus(GoogleHadoopFileSystem.java:724)
	at org.apache.hadoop.fs.FileSystem.isDirectory(FileSystem.java:1777)
	at org.apache.spark.sql.execution.streaming.FileStreamSink$.hasMetadata(FileStreamSink.scala:54)
	at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:366)
	at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:229)
	at org.apache.spark.sql.DataFrameReader.$anonfun$load$2(DataFrameReader.scala:211)
	at scala.Option.getOrElse(Option.scala:189)
	at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:211)
	at org.apache.spark.sql.DataFrameReader.csv(DataFrameReader.scala:538)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:374)
	at py4j.Gateway.invoke(Gateway.java:282)
	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
	at py4j.commands.CallCommand.execute(CallCommand.java:79)
	at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:182)
	at py4j.ClientServerConnection.run(ClientServerConnection.java:106)
	at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: com.google.cloud.hadoop.repackaged.gcs.com.google.api.client.googleapis.json.GoogleJsonResponseException: 403 Forbidden
GET https://storage.googleapis.com/storage/v1/b/test-data-mindlab-logikal-io/o/order_line_items.csv?fields=bucket,name,timeCreated,updated,generation,metageneration,size,contentType,contentEncoding,md5Hash,crc32c,metadata

I also tried invoking the Dockerfile commands directly on a runner, with results similar to the above.

@GergelyKalmar
Author

Hi @shamil-mubarakshin, thank you very much for trying it out! You are right, the object wasn't public. I have changed the access so it should be publicly readable now; can you try again?

@shamil-mubarakshin
Contributor

Direct invocation of the Dockerfile commands succeeded.
Using the Actions workflow with logikal-io/make-orb@v1 failed with various 403 errors against BigQuery and the S3 bucket.

These issues don't seem image-related so far. Now that public access is enabled, could you re-run your workflow for the update-dependencies branch?

@GergelyKalmar
Author

Okay, very interesting. Indeed, the failing tests are not the ones that I am running into; those can be ignored.

I have re-run the tests; however, they are still failing with the same error. It is wild: the jobs look almost exactly the same. One difference I see is that my job uses workload identity federation with google-github-actions/auth@v2, while yours does not. Can you try reproducing it while using identity federation? I think that might be the culprit.

@GergelyKalmar
Author

In the meantime, I opened a ticket with the Google auth action; perhaps they have an idea of what is happening. Nonetheless, it would be great if this could still be reproduced with the GitHub Actions runner.

@shamil-mubarakshin
Contributor

@GergelyKalmar, using WIF auth instead of a Service Account JSON resulted in the failure. This leads me to the conclusion that the error lies in how the WIF auth credentials are handled by the application. I also tested adding the certificate of the URL defined in the ACTIONS_ID_TOKEN_REQUEST_URL env var to the default Java keystores (/etc/ssl/certs/adoptium/cacerts, /etc/ssl/certs/java/cacerts), but that didn't change the workflow's behavior. Which certificate is expected by Hadoop remains unclear.
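
For anyone retracing this, a sketch of fetching the certificate presented by that endpoint for inspection before importing it into a keystore (the fallback host name is taken from the failing handshake log above; the env var is only set inside a job with id-token permissions):

import os
import ssl
from urllib.parse import urlparse

# Fall back to the host seen in the failing handshake above if the
# variable is not set in the current environment.
url = os.environ.get(
    'ACTIONS_ID_TOKEN_REQUEST_URL',
    'https://pipelinesghubeus12.actions.githubusercontent.com/',
)
host = urlparse(url).hostname

# Fetch the PEM-encoded leaf certificate served on port 443; the result
# can then be imported into a Java keystore (e.g. with keytool).
pem = ssl.get_server_certificate((host, 443))
with open(f'{host}.pem', 'w') as f:
    f.write(pem)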
With the above, I will be closing the issue for now. Feel free to reach out in case of further concerns.
