Implement GeoWaveInputFormat for mapreduce #84
Comments
This looks good to me, unless there was something else I missed? Closing; re-open if I'm wrong.
Small issue trying to compile against Apache Hadoop: userClassesTakesPrecedence isn't a method in JobContext. It may be easiest to simply remove it, but I can get it from Configuration. Do you want me to open a separate issue or re-open this one?
Which version of Hadoop are you building against?
I've tried 2.6.0, 2.5.0 and 2.4.0. All fail with: [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.3.2:compile (default-compile) on project geowave-accumulo: Compilation failure: Compilation failure: It looks as if there was some churn in the MapReduce v2 API. The NativeMap(Reduce)Context classes override the userClassesTakesPrecedence method by calling the method from the Map(Reduce)Context class. Those methods aren't in Apache Hadoop.
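For illustration, reading the flag from the job configuration instead of calling the missing JobContext method could look roughly like the sketch below. This is not the code from the pull request. BooleanSource is a hypothetical stand-in that mirrors Hadoop's Configuration.getBoolean(String, boolean) so the snippet compiles without Hadoop on the classpath, and both the property name and the false default are assumptions that should be checked against the Hadoop version being built.

```java
// Hypothetical stand-in for the single Configuration call used here:
// getBoolean(name, defaultValue).
interface BooleanSource {
    boolean getBoolean(String name, boolean defaultValue);
}

final class UserClasspath {
    // Assumed property name; verify it against the Hadoop version in use.
    static final String USER_CLASSPATH_FIRST = "mapreduce.job.user.classpath.first";

    private UserClasspath() {
    }

    // Looks the flag up in the configuration rather than delegating to a
    // JobContext method that only some distributions provide.
    // The default of false is an assumption.
    static boolean userClassesTakesPrecedence(BooleanSource conf) {
        return conf.getBoolean(USER_CLASSPATH_FIRST, false);
    }
}
```

With Hadoop available, a call might look like UserClasspath.userClassesTakesPrecedence(context.getConfiguration()::getBoolean), since the method reference matches the stand-in interface.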
Thanks, that is confirmed based on this list of changes: http://doc.mapr.com/display/MapR/Recompiling+MapReduce+V1+Applications JobContext removed userClassesTakesPrecedence. But our default compilation at this time is against Hadoop 2.5.0 (cdh5.2.0)... maybe we should start using the vanilla Hadoop dependencies rather than the Cloudera distribution?
I went ahead and submitted a pull request: #240. I did have a test failure building geowave-test, but I get the same failure on master, so I'm thinking it could be my environment. Curious to hear what you think.
We definitely want it to work with native Apache Hadoop distros (we will probably swap the default over shortly as part of a push to get on Maven Central). It looks like Cloudera may have backported that method. There also seems to be some weirdness with the pull requests, profiles, and the way we were setting the Travis build matrix. I'll run it down after lunch if the latest build doesn't fix it.
Created a new ticket for this: #241. I also tested a quick tweak where everything seems to work. It has your re-implemented methods, and I just added the Hadoop version to the build matrix. I didn't bother with profiles (they didn't seem worthwhile to me here, since the artifacts uniquely identify the version and we have multiple configurations, i.e. multiple versions of Cloudera, Accumulo, etc.). It's running right now, but it looks like it will pass.
Sure you don't want them? I just finished a branch with everything working. :) I'll put a pull request on the new issue, but if you'd rather skip it, I understand. We will need to add an additional repo to the pom for hadoop-jetty for Hortonworks.
Go for it. I don't have a strong preference one way or another; I just wanted to see if any other back-ported classes cropped up as problematic. I canceled the Travis builds in progress to free up worker instances, so we can accept the pull request as soon as it finishes and passes.
This should be an intuitive abstraction on top of GeoWave with the value being the decoded/deserialized entry value and the key being data ID and adapter ID. It should also make the best attempt at providing a re-usable pattern for de-duplication.
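One way to picture the key and the de-duplication pattern is the sketch below. It is not GeoWave's API: EntryKey and DedupExample are hypothetical names, the IDs are plain strings for simplicity, and it assumes the same entry can reach a consumer more than once under the same adapter ID and data ID.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

// Hypothetical key type: identifies an entry by adapter ID plus data ID.
final class EntryKey {
    final String adapterId;
    final String dataId;

    EntryKey(String adapterId, String dataId) {
        this.adapterId = Objects.requireNonNull(adapterId);
        this.dataId = Objects.requireNonNull(dataId);
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof EntryKey)) {
            return false;
        }
        EntryKey that = (EntryKey) other;
        return adapterId.equals(that.adapterId) && dataId.equals(that.dataId);
    }

    @Override
    public int hashCode() {
        return Objects.hash(adapterId, dataId);
    }
}

final class DedupExample {
    private DedupExample() {
    }

    // Keeps the first value seen for each key and drops later repeats,
    // preserving input order.
    static <V> List<V> dedupe(List<Map.Entry<EntryKey, V>> entries) {
        Set<EntryKey> seen = new HashSet<>();
        List<V> unique = new ArrayList<>();
        for (Map.Entry<EntryKey, V> entry : entries) {
            if (seen.add(entry.getKey())) {
                unique.add(entry.getValue());
            }
        }
        return unique;
    }
}
```

An in-memory set like this only removes repeats within a single task. De-duplicating across tasks would more likely rely on the shuffle grouping records by this key, with a reducer emitting one value per key.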