spark write to zenko - bucket does not exist #990

Closed
sspaeti opened this issue Mar 12, 2020 · 1 comment
Comments

sspaeti commented Mar 12, 2020

I can write to and read from S3 with the following command:

bin/spark-shell \
  --packages io.delta:delta-core_2.11:0.5.0,org.apache.hadoop:hadoop-aws:2.7.7 \
  --conf spark.delta.logStore.class=org.apache.spark.sql.delta.storage.S3SingleDriverLogStore \
  --conf spark.hadoop.fs.s3a.access.key=my-key \
  --conf spark.hadoop.fs.s3a.secret.key=my-secret

spark.range(5).write.format("parquet").save("s3a://sq-delta1/parquettable1")
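
Reading the data back in the same session works as well; a minimal check (same bucket and path as above, for illustration only):

  // read the Parquet table written above straight back from S3
  val df = spark.read.format("parquet").load("s3a://sq-delta1/parquettable1")
  df.show()   // prints the five rows produced by spark.range(5)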

When I do the same against Zenko, including path.style.access and the other options, I always get the error

  java.io.IOException: Bucket mnt does not exist

saying that the bucket does not exist. Why? When I try the same with MinIO, it also works.

Is there something I'm missing, or what could be the problem?

Cmd with Zenko:
bin/spark-shell \
  --packages io.delta:delta-core_2.11:0.5.0,org.apache.hadoop:hadoop-aws:2.7.7 \
  --conf spark.delta.logStore.class=org.apache.spark.sql.delta.storage.S3SingleDriverLogStore \
  --conf spark.hadoop.fs.s3a.path.style.access=true \
  --conf com.amazonaws.services.s3.enableV4=true \
  --conf spark.hadoop.s3.endpoint.signingRegion=eu-west-hot \
  --conf spark.hadoop.fs.s3a.endpoint=http://zenko-url \
  --conf spark.hadoop.fs.s3a.access.key=my-key \
  --conf spark.hadoop.fs.s3a.secret.key=my-secret \
  --conf fs.s3a.impl=org.apache.hadoop.fs.s3a.S3AFileSystem

spark.range(5).write.format("parquet").save("s3a://mnt/delta/testparquet")
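
Side note: two of the --conf entries above are dropped by spark-shell because they are not Spark properties (see the "Warning: Ignoring non-spark config property" lines in the log below). As a sketch, the same settings can instead be applied from inside spark-shell, before the first s3a:// access, with the endpoint and credentials as above:

  // Hadoop/S3A settings applied programmatically, as an alternative to --conf flags
  // that spark-submit ignores when they lack the spark.hadoop. prefix
  val hc = spark.sparkContext.hadoopConfiguration
  hc.set("fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem")
  hc.set("fs.s3a.endpoint", "http://zenko-url")
  hc.set("fs.s3a.path.style.access", "true")
  hc.set("fs.s3a.access.key", "my-key")
  hc.set("fs.s3a.secret.key", "my-secret")
  // enableV4 is a JVM system property for the AWS SDK, not a Hadoop key;
  // in local mode it can be set on the driver JVM directly
  System.setProperty("com.amazonaws.services.s3.enableV4", "true")

Whether that changes the bucket error is a separate question, but at least the settings are not silently ignored.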

Full stack trace

(base) root@anaconda-0:/opt/conda# bin/spark-shell \

--packages io.delta:delta-core_2.11:0.5.0,org.apache.hadoop:hadoop-aws:2.7.7
--conf spark.delta.logStore.class=org.apache.spark.sql.delta.storage.S3SingleDriverLogStore
--conf spark.hadoop.fs.s3a.path.style.access=true
--conf com.amazonaws.services.s3.enableV4=true
--conf spark.hadoop.s3.endpoint.signingRegion=eu-west-hot
--conf spark.hadoop.fs.s3a.endpoint=http://zenko-url
--conf spark.hadoop.fs.s3a.access.key=my-key
--conf spark.hadoop.fs.s3a.secret.key=my-secret
--conf fs.s3a.impl=org.apache.hadoop.fs.s3a.S3AFileSystem
Warning: Ignoring non-spark config property: fs.s3a.impl=org.apache.hadoop.fs.s3a.S3AFileSystem
Warning: Ignoring non-spark config property: com.amazonaws.services.s3.enableV4=true

Ivy Default Cache set to: /root/.ivy2/cache
The jars for the packages stored in: /root/.ivy2/jars
:: loading settings :: url = jar:file:/opt/conda/lib/python2.7/site-packages/pyspark/jars/ivy-2.4.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
io.delta#delta-core_2.11 added as a dependency
org.apache.hadoop#hadoop-aws added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent-1f15352a-51ee-4e1b-b10b-bc5ddeef4cdf;1.0
confs: [default]
found io.delta#delta-core_2.11;0.5.0 in central
found org.antlr#antlr4;4.7 in central
found org.antlr#antlr4-runtime;4.7 in central
found org.antlr#antlr-runtime;3.5.2 in central
found org.antlr#ST4;4.0.8 in central
found org.abego.treelayout#org.abego.treelayout.core;1.0.3 in central
found org.glassfish#javax.json;1.0.4 in central
found com.ibm.icu#icu4j;58.2 in central
found org.apache.hadoop#hadoop-aws;2.7.7 in central
found org.apache.hadoop#hadoop-common;2.7.7 in central
found org.apache.hadoop#hadoop-annotations;2.7.7 in central
found com.google.guava#guava;11.0.2 in central
found com.google.code.findbugs#jsr305;3.0.0 in central
found commons-cli#commons-cli;1.2 in central
found org.apache.commons#commons-math3;3.1.1 in central
found xmlenc#xmlenc;0.52 in central
found commons-httpclient#commons-httpclient;3.1 in central
found commons-logging#commons-logging;1.1.3 in central
found commons-codec#commons-codec;1.4 in central
found commons-io#commons-io;2.4 in central
found commons-net#commons-net;3.1 in central
found commons-collections#commons-collections;3.2.2 in central
found javax.servlet#servlet-api;2.5 in central
found org.mortbay.jetty#jetty;6.1.26 in central
found org.mortbay.jetty#jetty-util;6.1.26 in central
found org.mortbay.jetty#jetty-sslengine;6.1.26 in central
found com.sun.jersey#jersey-core;1.9 in central
found com.sun.jersey#jersey-json;1.9 in central
found org.codehaus.jettison#jettison;1.1 in central
found com.sun.xml.bind#jaxb-impl;2.2.3-1 in central
found javax.xml.bind#jaxb-api;2.2.2 in central
found javax.xml.stream#stax-api;1.0-2 in central
found javax.activation#activation;1.1 in central
found org.codehaus.jackson#jackson-core-asl;1.9.13 in central
found org.codehaus.jackson#jackson-mapper-asl;1.9.13 in central
found org.codehaus.jackson#jackson-jaxrs;1.9.13 in central
found org.codehaus.jackson#jackson-xc;1.9.13 in central
found com.sun.jersey#jersey-server;1.9 in central
found asm#asm;3.2 in central
found log4j#log4j;1.2.17 in central
found net.java.dev.jets3t#jets3t;0.9.0 in central
found org.apache.httpcomponents#httpclient;4.2.5 in central
found org.apache.httpcomponents#httpcore;4.2.5 in central
found com.jamesmurty.utils#java-xmlbuilder;0.4 in central
found commons-lang#commons-lang;2.6 in central
found commons-configuration#commons-configuration;1.6 in central
found commons-digester#commons-digester;1.8 in central
found commons-beanutils#commons-beanutils;1.7.0 in central
found commons-beanutils#commons-beanutils-core;1.8.0 in central
found org.slf4j#slf4j-api;1.7.10 in central
found org.apache.avro#avro;1.7.4 in central
found com.thoughtworks.paranamer#paranamer;2.3 in central
found org.xerial.snappy#snappy-java;1.0.4.1 in central
found org.apache.commons#commons-compress;1.4.1 in central
found org.tukaani#xz;1.0 in central
found com.google.protobuf#protobuf-java;2.5.0 in central
found com.google.code.gson#gson;2.2.4 in central
found org.apache.hadoop#hadoop-auth;2.7.7 in central
found org.apache.directory.server#apacheds-kerberos-codec;2.0.0-M15 in central
found org.apache.directory.server#apacheds-i18n;2.0.0-M15 in central
found org.apache.directory.api#api-asn1-api;1.0.0-M20 in central
found org.apache.directory.api#api-util;1.0.0-M20 in central
found org.apache.zookeeper#zookeeper;3.4.6 in central
found org.slf4j#slf4j-log4j12;1.7.10 in central
found io.netty#netty;3.6.2.Final in central
found org.apache.curator#curator-framework;2.7.1 in central
found org.apache.curator#curator-client;2.7.1 in central
found com.jcraft#jsch;0.1.54 in central
found org.apache.curator#curator-recipes;2.7.1 in central
found org.apache.htrace#htrace-core;3.1.0-incubating in central
found org.mortbay.jetty#servlet-api;2.5-20081211 in central
found javax.servlet.jsp#jsp-api;2.1 in central
found jline#jline;0.9.94 in central
found junit#junit;4.11 in central
found org.hamcrest#hamcrest-core;1.3 in central
found com.fasterxml.jackson.core#jackson-databind;2.2.3 in central
found com.fasterxml.jackson.core#jackson-annotations;2.2.3 in central
found com.fasterxml.jackson.core#jackson-core;2.2.3 in central
found com.amazonaws#aws-java-sdk;1.7.4 in central
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.ivy.util.url.IvyAuthenticator (file:/opt/conda/lib/python2.7/site-packages/pyspark/jars/ivy-2.4.0.jar) to field java.net.Authenticator.theAuthenticator
WARNING: Please consider reporting this to the maintainers of org.apache.ivy.util.url.IvyAuthenticator
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
found joda-time#joda-time;2.10.5 in central
[2.10.5] joda-time#joda-time;[2.2,)
:: resolution report :: resolve 2264ms :: artifacts dl 40ms
:: modules in use:
asm#asm;3.2 from central in [default]
com.amazonaws#aws-java-sdk;1.7.4 from central in [default]
com.fasterxml.jackson.core#jackson-annotations;2.2.3 from central in [default]
com.fasterxml.jackson.core#jackson-core;2.2.3 from central in [default]
com.fasterxml.jackson.core#jackson-databind;2.2.3 from central in [default]
com.google.code.findbugs#jsr305;3.0.0 from central in [default]
com.google.code.gson#gson;2.2.4 from central in [default]
com.google.guava#guava;11.0.2 from central in [default]
com.google.protobuf#protobuf-java;2.5.0 from central in [default]
com.ibm.icu#icu4j;58.2 from central in [default]
com.jamesmurty.utils#java-xmlbuilder;0.4 from central in [default]
com.jcraft#jsch;0.1.54 from central in [default]
com.sun.jersey#jersey-core;1.9 from central in [default]
com.sun.jersey#jersey-json;1.9 from central in [default]
com.sun.jersey#jersey-server;1.9 from central in [default]
com.sun.xml.bind#jaxb-impl;2.2.3-1 from central in [default]
com.thoughtworks.paranamer#paranamer;2.3 from central in [default]
commons-beanutils#commons-beanutils;1.7.0 from central in [default]
commons-beanutils#commons-beanutils-core;1.8.0 from central in [default]
commons-cli#commons-cli;1.2 from central in [default]
commons-codec#commons-codec;1.4 from central in [default]
commons-collections#commons-collections;3.2.2 from central in [default]
commons-configuration#commons-configuration;1.6 from central in [default]
commons-digester#commons-digester;1.8 from central in [default]
commons-httpclient#commons-httpclient;3.1 from central in [default]
commons-io#commons-io;2.4 from central in [default]
commons-lang#commons-lang;2.6 from central in [default]
commons-logging#commons-logging;1.1.3 from central in [default]
commons-net#commons-net;3.1 from central in [default]
io.delta#delta-core_2.11;0.5.0 from central in [default]
io.netty#netty;3.6.2.Final from central in [default]
javax.activation#activation;1.1 from central in [default]
javax.servlet#servlet-api;2.5 from central in [default]
javax.servlet.jsp#jsp-api;2.1 from central in [default]
javax.xml.bind#jaxb-api;2.2.2 from central in [default]
javax.xml.stream#stax-api;1.0-2 from central in [default]
jline#jline;0.9.94 from central in [default]
joda-time#joda-time;2.10.5 from central in [default]
junit#junit;4.11 from central in [default]
log4j#log4j;1.2.17 from central in [default]
net.java.dev.jets3t#jets3t;0.9.0 from central in [default]
org.abego.treelayout#org.abego.treelayout.core;1.0.3 from central in [default]
org.antlr#ST4;4.0.8 from central in [default]
org.antlr#antlr-runtime;3.5.2 from central in [default]
org.antlr#antlr4;4.7 from central in [default]
org.antlr#antlr4-runtime;4.7 from central in [default]
org.apache.avro#avro;1.7.4 from central in [default]
org.apache.commons#commons-compress;1.4.1 from central in [default]
org.apache.commons#commons-math3;3.1.1 from central in [default]
org.apache.curator#curator-client;2.7.1 from central in [default]
org.apache.curator#curator-framework;2.7.1 from central in [default]
org.apache.curator#curator-recipes;2.7.1 from central in [default]
org.apache.directory.api#api-asn1-api;1.0.0-M20 from central in [default]
org.apache.directory.api#api-util;1.0.0-M20 from central in [default]
org.apache.directory.server#apacheds-i18n;2.0.0-M15 from central in [default]
org.apache.directory.server#apacheds-kerberos-codec;2.0.0-M15 from central in [default]
org.apache.hadoop#hadoop-annotations;2.7.7 from central in [default]
org.apache.hadoop#hadoop-auth;2.7.7 from central in [default]
org.apache.hadoop#hadoop-aws;2.7.7 from central in [default]
org.apache.hadoop#hadoop-common;2.7.7 from central in [default]
org.apache.htrace#htrace-core;3.1.0-incubating from central in [default]
org.apache.httpcomponents#httpclient;4.2.5 from central in [default]
org.apache.httpcomponents#httpcore;4.2.5 from central in [default]
org.apache.zookeeper#zookeeper;3.4.6 from central in [default]
org.codehaus.jackson#jackson-core-asl;1.9.13 from central in [default]
org.codehaus.jackson#jackson-jaxrs;1.9.13 from central in [default]
org.codehaus.jackson#jackson-mapper-asl;1.9.13 from central in [default]
org.codehaus.jackson#jackson-xc;1.9.13 from central in [default]
org.codehaus.jettison#jettison;1.1 from central in [default]
org.glassfish#javax.json;1.0.4 from central in [default]
org.hamcrest#hamcrest-core;1.3 from central in [default]
org.mortbay.jetty#jetty;6.1.26 from central in [default]
org.mortbay.jetty#jetty-sslengine;6.1.26 from central in [default]
org.mortbay.jetty#jetty-util;6.1.26 from central in [default]
org.mortbay.jetty#servlet-api;2.5-20081211 from central in [default]
org.slf4j#slf4j-api;1.7.10 from central in [default]
org.slf4j#slf4j-log4j12;1.7.10 from central in [default]
org.tukaani#xz;1.0 from central in [default]
org.xerial.snappy#snappy-java;1.0.4.1 from central in [default]
xmlenc#xmlenc;0.52 from central in [default]
---------------------------------------------------------------------
|                  |            modules            ||   artifacts   |
|       conf       | number| search|dwnlded|evicted|| number|dwnlded|
---------------------------------------------------------------------
|      default     |   80  |   1   |   0   |   0   ||   80  |   0   |
---------------------------------------------------------------------
:: retrieving :: org.apache.spark#spark-submit-parent-1f15352a-51ee-4e1b-b10b-bc5ddeef4cdf
confs: [default]
0 artifacts copied, 80 already retrieved (0kB/23ms)
20/03/12 13:36:35 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
20/03/12 13:36:42 WARN Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
20/03/12 13:36:42 WARN Utils: Service 'SparkUI' could not bind on port 4041. Attempting port 4042.
20/03/12 13:36:42 WARN Utils: Service 'SparkUI' could not bind on port 4042. Attempting port 4043.
20/03/12 13:36:42 WARN Utils: Service 'SparkUI' could not bind on port 4043. Attempting port 4044.
Spark context Web UI available at http://anaconda-0.anaconda.spark-test.svc.cluster.local:4044
Spark context available as 'sc' (master = local[*], app id = local-1584020202147).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.4.4
      /_/

Using Scala version 2.11.12 (OpenJDK 64-Bit Server VM, Java 11.0.6)
Type in expressions to have them evaluated.
Type :help for more information.

scala> spark.range(5).write.format("parquet").save("s3a://mnt/delta/testparquet")

java.io.IOException: Bucket mnt does not exist

at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:298)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2669)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
at org.apache.spark.sql.execution.datasources.DataSource.planForWritingFileFormat(DataSource.scala:424)
at org.apache.spark.sql.execution.datasources.DataSource.planForWriting(DataSource.scala:524)
at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:290)
at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:271)
at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:229)
... 49 elided
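
For debugging, a quick way to reproduce the same check outside the write path is to ask Hadoop for the s3a filesystem directly; a minimal sketch from the same spark-shell session (bucket and endpoint as configured above):

  import java.net.URI
  import org.apache.hadoop.fs.{FileSystem, Path}

  // S3AFileSystem.initialize performs the "does the bucket exist" probe seen in the
  // stack trace above, so this either succeeds or throws the same IOException
  val fs = FileSystem.get(new URI("s3a://mnt/"), spark.sparkContext.hadoopConfiguration)
  println(fs.exists(new Path("s3a://mnt/delta")))   // false is fine; an exception means the bucket itself is not visible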

sspaeti commented Apr 15, 2020

This was solved by re-deploying Zenko. It was probably caused by making several changes to Zenko without re-deploying.

sspaeti closed this as completed on Apr 15, 2020