Implement GeoWaveInputFormat for mapreduce #84

rfecher · 2014-10-15T12:58:44Z

This should be an intuitive abstraction on top of GeoWave with the value being the decoded/deserialized entry value and the key being data ID and adapter ID. It should also make the best attempt at providing a re-usable pattern for de-duplication.
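To make the proposed key shape concrete, here is a minimal sketch of de-duplication by a composite (adapter ID, data ID) key. It deliberately uses only the JDK so it runs on its own; `DedupSketch`, `EntryKey`, `key` and `deduplicate` are hypothetical names for illustration, not GeoWave or Hadoop classes, and treating both IDs as byte arrays is an assumption.

```java
import java.nio.charset.StandardCharsets;
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class DedupSketch {

    /** Hypothetical composite key: adapter ID plus data ID, compared by byte content. */
    static final class EntryKey {
        private final byte[] adapterId;
        private final byte[] dataId;

        EntryKey(byte[] adapterId, byte[] dataId) {
            // Defensive copies so later changes to the caller's arrays cannot alter the key.
            this.adapterId = adapterId.clone();
            this.dataId = dataId.clone();
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof EntryKey)) {
                return false;
            }
            EntryKey other = (EntryKey) o;
            return Arrays.equals(adapterId, other.adapterId)
                    && Arrays.equals(dataId, other.dataId);
        }

        @Override
        public int hashCode() {
            return 31 * Arrays.hashCode(adapterId) + Arrays.hashCode(dataId);
        }
    }

    /** Convenience factory for the example; assumes the IDs are UTF-8 strings. */
    static EntryKey key(String adapterId, String dataId) {
        return new EntryKey(
                adapterId.getBytes(StandardCharsets.UTF_8),
                dataId.getBytes(StandardCharsets.UTF_8));
    }

    /** Keeps the first value seen for each key, preserving encounter order. */
    static <V> List<V> deduplicate(List<Map.Entry<EntryKey, V>> records) {
        Map<EntryKey, V> firstSeen = new LinkedHashMap<>();
        for (Map.Entry<EntryKey, V> record : records) {
            firstSeen.putIfAbsent(record.getKey(), record.getValue());
        }
        return new ArrayList<>(firstSeen.values());
    }

    public static void main(String[] args) {
        List<Map.Entry<EntryKey, String>> records = new ArrayList<>();
        records.add(new AbstractMap.SimpleEntry<>(key("roads", "id-1"), "feature A"));
        records.add(new AbstractMap.SimpleEntry<>(key("roads", "id-2"), "feature B"));
        records.add(new AbstractMap.SimpleEntry<>(key("roads", "id-1"), "feature A")); // duplicate
        System.out.println(deduplicate(records)); // prints [feature A, feature B]
    }
}
```

In an actual MapReduce job the in-memory map would probably not be needed: if the composite key is the map output key, the shuffle groups duplicates together and a reducer could emit one value per key.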

chrisbennight · 2014-11-20T02:16:34Z

This looks good to me, unless there was something else I missed? Closing; re-open if I'm wrong.

dlyle65535 · 2015-02-18T11:17:31Z

Small issue trying to compile against Apache Hadoop: userClassesTakesPrecedence isn't a method in JobContext. May be easiest to simply remove it, but I can get it from Configuration. Do you want me to open a separate issue or re-open this one?

chrisbennight · 2015-02-18T11:53:06Z

Which version of hadoop are you building against?

dlyle65535 · 2015-02-18T12:26:14Z

I've tried 2.6.0, 2.5.0 and 2.4.0. All fail with:

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.3.2:compile (default-compile) on project geowave-accumulo: Compilation failure: Compilation failure:
[ERROR] /Users/dml/projects/geowave/geowave-accumulo/target/munged/main/mil/nga/giat/geowave/accumulo/mapreduce/NativeReduceContext.java:[203,24] error: cannot find symbol
[ERROR]
[ERROR] KEYIN extends Object declared in class NativeReduceContext
[ERROR] VALUEIN extends Object declared in class NativeReduceContext
[ERROR] /Users/dml/projects/geowave/geowave-accumulo/target/munged/main/mil/nga/giat/geowave/accumulo/mapreduce/NativeReduceContext.java:[201,1] error: method does not override or implement a method from a supertype
[ERROR] /Users/dml/projects/geowave/geowave-accumulo/target/munged/main/mil/nga/giat/geowave/accumulo/mapreduce/NativeMapContext.java:[193,16] error: cannot find symbol
[ERROR]
[ERROR] KEYIN extends Object declared in class NativeMapContext
[ERROR] VALUEIN extends Object declared in class NativeMapContext
[ERROR] /Users/dml/projects/geowave/geowave-accumulo/target/munged/main/mil/nga/giat/geowave/accumulo/mapreduce/NativeMapContext.java:[191,1] error: method does not override or implement a method from a supertype
[ERROR] -> [Help 1]

Looks as if there was some churn in the MapReduce v2 API. The NativeMapContext and NativeReduceContext classes override the userClassesTakesPrecedence method by delegating to the same method on the wrapped MapContext/ReduceContext. That method isn't in Apache Hadoop.
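For illustration only, one way a wrapper could tolerate a method that exists in some distributions but not others is to look it up reflectively and fall back to a default. This is a different technique from reading the value out of Configuration as suggested above, and it is not what any pull request on this issue does. The sketch uses only the JDK; `PrecedenceCompat`, `callIfPresent`, `WithMethod` and `WithoutMethod` are hypothetical names, and the two nested classes merely stand in for context objects with and without the method.

```java
import java.lang.reflect.Method;

final class PrecedenceCompat {

    /**
     * Invokes a public no-argument boolean method by name if the target
     * object has one; returns the fallback if the method is missing,
     * fails, or does not return a boolean.
     */
    static boolean callIfPresent(Object target, String methodName, boolean fallback) {
        try {
            Method method = target.getClass().getMethod(methodName);
            Object result = method.invoke(target);
            return (result instanceof Boolean) ? (Boolean) result : fallback;
        } catch (ReflectiveOperationException e) {
            // Covers NoSuchMethodException, IllegalAccessException and
            // InvocationTargetException.
            return fallback;
        }
    }

    /** Stand-in for a context class that has the method. */
    public static final class WithMethod {
        public boolean userClassesTakesPrecedence() {
            return true;
        }
    }

    /** Stand-in for a context class that lacks the method. */
    public static final class WithoutMethod {
    }

    public static void main(String[] args) {
        String name = "userClassesTakesPrecedence";
        System.out.println(callIfPresent(new WithMethod(), name, false));    // prints true
        System.out.println(callIfPresent(new WithoutMethod(), name, false)); // prints false
    }
}
```

The trade-off is that reflection hides the mismatch until run time, whereas dropping the override or reading the setting from Configuration keeps the difference visible at compile time.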

rfecher · 2015-02-18T12:40:45Z

Thanks, that is confirmed based on this list of changes: http://doc.mapr.com/display/MapR/Recompiling+MapReduce+V1+Applications

userClassesTakesPrecedence was removed from JobContext. But our default compilation at this time is against Hadoop 2.5.0 (cdh5.2.0). Maybe we should start using the vanilla Hadoop dependencies rather than the Cloudera distribution?

dlyle65535 · 2015-02-18T16:38:41Z

I went ahead and submitted a pull request: #240. I did have a test failure building geowave-test, but I get the same failure on master, so I'm thinking it could be my environment. Curious to hear what you think.

chrisbennight · 2015-02-18T17:11:15Z

We definitely want it to work with native Apache Hadoop distros (will probably swap the default over shortly as part of a push to get on Maven Central) - looks like maybe Cloudera backported that method.
We probably need to add those versions to the build matrix as well to catch issues like this in the future.

Looks like there is some weirdness with the pull requests, profiles, and the way we were setting the Travis build matrix. I'll run it down after lunch if the latest build doesn't fix it.

chrisbennight · 2015-02-18T19:19:20Z

Created a new ticket for this - #241

I also tested a quick tweak where everything seems to work. It has your re-implemented methods, and I just added the Hadoop version to the build matrix. I didn't bother with profiles: they didn't seem worthwhile to me, since the artifacts uniquely identify the version and we have multiple configurations (i.e. multiple versions of Cloudera, Accumulo, etc.).
https://github.com/ngageoint/geowave/tree/apache-hadoop

It's running right now, but looks like it will pass.

dlyle65535 · 2015-02-18T19:22:37Z

Sure you don't want them? I just finished a branch with everything working. :) I'll put a pull request on the new issue, but if you'd rather skip it, I understand. We will need to add an additional repo to the pom for hadoop-jetty for Hortonworks.

chrisbennight · 2015-02-18T19:34:07Z

Go for it, I don't have a strong preference one way or another - just wanted to see if there were any other back-ported classes that cropped up as problematic. I canceled the Travis builds in progress to free up worker instances - we can accept the pull request as soon as it finishes and passes.

rfecher added the enhancement label Oct 15, 2014

rfecher added this to the Current milestone Oct 15, 2014

rfecher self-assigned this Oct 15, 2014

rfecher added a commit that referenced this issue Nov 5, 2014

geowave input format (issue #84)

c3189c4

rfecher added a commit that referenced this issue Nov 5, 2014

geowave input format (issue #84)

d970798

rfecher added a commit that referenced this issue Nov 5, 2014

geowave input format (issue #84)

402ffdd

rfecher added a commit that referenced this issue Nov 5, 2014

geowave input format (issue #84)

6850a06

rfecher added a commit that referenced this issue Nov 5, 2014

geowave input format (issue #84)

26a0e3b

rfecher added a commit that referenced this issue Nov 6, 2014

geowave input format (issue #84)

4379688

rfecher added a commit that referenced this issue Nov 7, 2014

geowave input format (issue #84)

f8af3b6

rfecher added a commit that referenced this issue Nov 7, 2014

geowave input format (issue #84)

df73a50

rfecher added a commit that referenced this issue Nov 7, 2014

geowave input format (issue #84)

29255e0

rfecher added a commit that referenced this issue Nov 10, 2014

geowave input format (issue #84)

08a27ba

rfecher added a commit that referenced this issue Nov 10, 2014

Merge pull request #101 from ngageoint/GEOWAVE-84-squash

da1dcfd

geowave input format (issue #84)

rfecher added a commit that referenced this issue Nov 10, 2014

geowave input format (issue #84)

038b15b

chrisbennight closed this as completed Nov 20, 2014

chrisbennight mentioned this issue Feb 18, 2015

Apache Hadoop distros integration #241

Closed
