Pom file dependency for Hadoop ("compile"->"provided") #15

Open
willp-bl opened this issue Apr 25, 2014 · 2 comments

Comments

@willp-bl

The webarchive-commons pom file declares a specific Hadoop version as a "compile" dependency. This should probably be "provided", so that the Hadoop jars are not duplicated: they will already be present on the cluster in any case.

Also, my cluster runs CDH4, but the version in Central depends on CDH3. I am not yet sure whether this is what is causing my other issues.

willp-bl pushed a commit to willp-bl/webarchive-commons that referenced this issue May 9, 2014
the ones provided on the cluster) - I can now use this on my CDH4 cluster
(via ukwa/webarchive-discovery/warc-hadoop-recordreaders 2.0-dev branch)

Ref Issue iipc#15
@anjackson
Member

I have still not had time to look at this. In the meantime, anyone who relies on this artefact can exclude the Hadoop dependency in their pom.xml and add their own override:

<dependency>
  <groupId>org.netpreserve.commons</groupId>
  <artifactId>webarchive-commons</artifactId>
  <version>1.1.3</version>
  <exclusions>
    <exclusion>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-core</artifactId>
    </exclusion>
  </exclusions>
</dependency>
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-core</artifactId>
  <version>0.20.2-cdh3u4</version>
  <scope>provided</scope>
</dependency>

@johnerikhalse
Contributor

johnerikhalse commented Apr 26, 2016

The Hadoop dependency is needed for reading (w)arcs in HDFS. I can't find any other uses in webarchive-commons. An OpenWayback deployment with warcs stored in HDFS is then dependent on having these libraries included.

The easy solution is to change the dependency scope to provided here and add hadoop-core as a dependency of OpenWayback. I am not sure whether that requires a major release or whether the change is small enough for a minor release.
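
As a rough sketch of what that two-part change might look like (not taken from either project's actual pom.xml): the first fragment marks Hadoop as provided in webarchive-commons, and the second declares it explicitly in OpenWayback. The explicit declaration matters because Maven does not pass provided-scope dependencies on transitively. The version shown is an assumption, borrowed from the override example earlier in this thread; a real change would use whatever version each project targets.

```xml
<!-- webarchive-commons pom.xml (sketch): Hadoop is compiled against but not bundled -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-core</artifactId>
  <!-- Assumption: version borrowed from the override example earlier in this thread -->
  <version>0.20.2-cdh3u4</version>
  <scope>provided</scope>
</dependency>

<!-- OpenWayback pom.xml (sketch): declares Hadoop itself so reading from HDFS still works -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-core</artifactId>
  <!-- Assumption: replace with the version that matches the target cluster -->
  <version>0.20.2-cdh3u4</version>
  <!-- No scope element: Maven defaults to compile, so the jar is packaged with the deployment -->
</dependency>
```

With this arrangement, Hadoop jobs that depend on webarchive-commons would pick up the cluster's own Hadoop jars, while an OpenWayback deployment would still ship the libraries it needs to read from HDFS.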
