
InvocationTargetException when trying to run a job #17

Closed
amit-o opened this issue Oct 2, 2016 · 10 comments

Comments


amit-o commented Oct 2, 2016

Hi,

I'm trying to submit a job -

{"name":"test",
"config":{"name":"test",
"connector.class":"com.qubole.streamx.s3.S3SinkConnector",
"tasks.max":"1",
"flush.size":"100",
"s3.url":"s3a://test",
"wal.class":"com.qubole.streamx.s3.wal.DBWAL",
"hadoop.conf.dir":"/opt/hadoop/etc/hadoop/",
"topics":"test"}}

But I get an InvocationTargetException and then the job is killed.
My core-site.xml is -


<configuration>
<property>
        <name>fs.AbstractFileSystem.s3.impl</name>
        <value>org.apache.hadoop.fs.s3a.S3A</value>
</property>
<property>
        <name>fs.s3a.impl</name>
        <value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
</property>
<property>
        <name>fs.s3a.access.key</name>
        <value>xxxx</value>
</property>
<property>
        <name>fs.s3a.secret.key</name>
        <value>xxxxx</value>
</property>
<property>
        <name>fs.s3a.endpoint</name>
        <value>s3.eu-central-1.amazonaws.com</value>
</property>
</configuration>

hadoop fs -ls s3a://test works fine.
What am I doing wrong here?

Thanks.

@PraveenSeluka (Contributor)

Hi @AmitTwingo, you need to provide the following (you need a DB to store the WAL file):

"db.connection.url":"jdbc:mysql://localhost:53152/streamx"
"db.user":"xxxx"
"db.password":xxxx"

I'm working on a change to make the DB WAL optional so that it's easy to get started. I will add that soon.

@PraveenSeluka (Contributor)

Hi @AmitTwingo, I have made DBWAL optional and also updated the getting-started section with all the details. Can you try again now?


amit-o commented Oct 4, 2016

Brilliant, I'll give it another try tomorrow. Thanks.


amit-o commented Oct 5, 2016

Hi @PraveenSeluka, I'm still getting errors. I've rebuilt everything to make sure.
Here is the full error I'm getting -

[2016-10-05 06:36:06,071] INFO Couldn't start HdfsSinkConnector: (io.confluent.connect.hdfs.HdfsSinkTask:72)
org.apache.kafka.connect.errors.ConnectException: java.lang.reflect.InvocationTargetException
at io.confluent.connect.hdfs.storage.StorageFactory.createStorage(StorageFactory.java:40)
at io.confluent.connect.hdfs.DataWriter.<init>(DataWriter.java:171)
at io.confluent.connect.hdfs.HdfsSinkTask.start(HdfsSinkTask.java:64)
at org.apache.kafka.connect.runtime.WorkerSinkTask.initializeAndStart(WorkerSinkTask.java:207)
at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:139)
at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:140)
at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:175)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at io.confluent.connect.hdfs.storage.StorageFactory.createStorage(StorageFactory.java:33)
... 11 more
Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: Status Code: 400, AWS Service: Amazon S3, AWS Request ID: 6D7E03A3816B7A05, AWS Error Code: null, AWS Error Message: Bad Request, S3 Extended Request ID: LJUvlB/fInY3P9pw1dnpMAsKT6RluwL1gJmpeIXTqNGP0IRSluO0HgJkEvLEQLu+Rxj0Z5BWHDo=
at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:798)
at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:421)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:232)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3528)
at com.amazonaws.services.s3.AmazonS3Client.headBucket(AmazonS3Client.java:1031)
at com.amazonaws.services.s3.AmazonS3Client.doesBucketExist(AmazonS3Client.java:994)
at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:154)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2596)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2630)
at org.apache.hadoop.fs.FileSystem$Cache.getUnique(FileSystem.java:2618)
at org.apache.hadoop.fs.FileSystem.newInstance(FileSystem.java:417)
at com.qubole.streamx.s3.S3Storage.<init>(S3Storage.java:49)

It seems like an error while connecting over s3a, even though I can do the same outside of StreamX.
Perhaps this is a region issue, or an AWS4-HMAC-SHA256 signing issue? My bucket is in Europe, so V4 signing is forced.

I've added -Dcom.amazonaws.services.s3.enableV4 -Dcom.amazonaws.services.s3.enforceV4 to the hadoop executable manually; if I remove these parameters I get the same error as StreamX (although, to be fair, it's a very generic one).
Perhaps this is the issue?
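
If the V4 signing flags are the issue, one way to test would be to pass the same system properties to the Connect worker JVM before starting StreamX. A sketch (KAFKA_OPTS is the generic Kafka mechanism for extra JVM flags; the exact startup script for StreamX may differ):

# sketch: enable/enforce AWS Signature Version 4 for the Connect worker JVM
export KAFKA_OPTS="-Dcom.amazonaws.services.s3.enableV4 -Dcom.amazonaws.services.s3.enforceV4"
# ...then start the StreamX / Kafka Connect worker as usual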

@amit-o
Copy link
Author

amit-o commented Oct 5, 2016

I found one solution for now.
I've upgraded the Hadoop libraries to 2.7.3 and it seems to be working now.
Obviously I haven't run enough tests to be 100% sure, but at least I can access eu-central-1.
I'm guessing Hadoop 2.6.0 doesn't support the V4-only regions.
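
For anyone trying the same workaround, the bump would look roughly like this in a Maven build (hadoop-aws is the artifact that ships S3AFileSystem; whether StreamX pulls Hadoop in exactly this way is an assumption):

<!-- sketch: upgrade the S3A/Hadoop dependency to 2.7.3 -->
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-aws</artifactId>
    <version>2.7.3</version>
</dependency>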

@PraveenSeluka (Contributor)

I just heard from my team that using S3A should solve this issue even with Hadoop 2.6. You can try that out.


amit-o commented Oct 6, 2016

I have been using S3A, though - see the bucket URL and settings above.

@PraveenSeluka (Contributor)

Did you have any other additional Hadoop jars in the classpath? StreamX already includes all required dependencies, including Hadoop.

@PraveenSeluka (Contributor)

Hi @AmitTwingo, were you able to proceed on this?


amit-o commented Oct 19, 2016

Hi, sorry for the radio silence.
I didn't have any additional Hadoop jars in my classpath.
After using the 2.7 jars, we switched the bucket to the default region, which worked without a problem (with the 2.6 jars).
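
For a bucket in the default region, the fs.s3a.endpoint property from the core-site.xml above can simply be dropped or pointed at the global endpoint; a sketch:

<property>
        <name>fs.s3a.endpoint</name>
        <value>s3.amazonaws.com</value>
</property>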

amit-o closed this as completed Oct 27, 2016