Integration with Amazon Elastic Map Reduce #96
Comments
AWS is just another Hadoop cluster. Take a look at the configuration On Mon, Oct 7, 2013 at 1:39 PM, revendless-team notifications@github.com wrote:
Costin |
Just to add, it makes sense to have a dedicated section in the docs with specific instructions to make it easy for folks to get started. It's not there yet, but I hope it won't take long; of course, you can help. Cheers. |
I've gotten this library working pretty much out of the box with the Elastic MapReduce custom JAR feature. |
That's good to hear. Nevertheless I think it would help to have a pointer On Mon, Oct 7, 2013 at 4:40 PM, paulhredowl notifications@github.com wrote:
Costin |
not yet - will let you know once there's something in trunk. |
Just to be clear, es-hadoop works as is with AWS Elastic MapReduce. As I've mentioned before, this issue is about adding some getting-started documentation. |
@paulhredowl How did you package your JAR? I used Eclipse to package a fat-JAR and I'm getting ClassNotFoundExceptions despite the class clearly being there. I couldn't tell if I did something wrong with my packaging or if I was bumping into an issue similar to #95. |
Can you paste the stacktrace in a separate issue? Are you using the proper binary (yarn vs non-yarn) from M2? You can switch to the nightly build jar which works transparently across both of them. |
I think I have it resolved. For whatever reason, when using the "mapreduce" Hadoop API, I have to call |
Hey guys, from what I gather from the documentation and this thread: once I create an EMR cluster, I can just point Elasticsearch to the EMR cluster and then try to "index" the data? Does this mean that Elasticsearch acts as a form of abstraction layer that would give me better insight into the data without having to write Pig/Hive jobs? Sorry if that sounds a little novice (I am a virgin at both - just a DBA trying to make a living) |
Closing this long-standing issue. Not only is there documentation in place describing how to configure ES within a cloud environment, but we have also added WAN support to allow working with a cluster through only a restricted set of gateway nodes. Cheers,
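For anyone landing here later, a rough sketch of what the WAN-only setup might look like when passing settings to an EMR custom JAR step. The keys `es.nodes`, `es.port`, and `es.nodes.wan.only` are es-hadoop configuration settings; the hostname and both helper functions are hypothetical, and the `-D key=value` form assumes the job's driver parses Hadoop generic options (for example via `ToolRunner`), which may not match your job.

```python
# Hypothetical endpoint: replace with a gateway node you can actually reach.
ES_GATEWAY = "es-gateway.example.com"


def es_hadoop_settings(gateway, port=9200, wan_only=True):
    """Return es-hadoop settings as a dict of string keys and values."""
    return {
        "es.nodes": gateway,
        "es.port": str(port),
        # Talk only to the declared nodes instead of discovering the cluster.
        "es.nodes.wan.only": "true" if wan_only else "false",
    }


def as_generic_options(settings):
    """Render settings as Hadoop generic options: a flat list of -D key=value."""
    args = []
    for key in sorted(settings):
        args.extend(["-D", f"{key}={settings[key]}"])
    return args


if __name__ == "__main__":
    # Print the arguments one might append to a custom JAR step.
    print(" ".join(as_generic_options(es_hadoop_settings(ES_GATEWAY))))
```

The same keys can instead be set directly on the job's Hadoop `Configuration` object if the driver does not parse generic options.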