
Uber JAR for Spline #163

Closed
wajda opened this issue Mar 21, 2019 · 6 comments

@wajda (Contributor) commented Mar 21, 2019

Requested by Arvind:

Regarding the fat JAR, it would be VERY useful if your team builds it and publishes it to Maven. The reason is that in Azure the most popular Spark deployment is Azure Databricks. Databricks users (in Azure and in AWS, but I am only looking at Azure) are predominantly coding via notebooks as opposed to submitting a traditional JAR to the cluster. Given this notebook interface and the profiles of our customers, we have seen that if too many dependency JARs have to be added as libraries, it causes adoption friction and the customer will most likely not use the library. Hence, if you can release an uber JAR officially on Maven, that would be really helpful to drive adoption of Spline with Azure Databricks.

wajda added this to the 0.3.next milestone Mar 21, 2019

@wajda (Contributor, Author) commented Mar 21, 2019 (comment minimized; content not shown)

@algattik (Contributor) commented Mar 26, 2019

In addition/in parallel, it would be extremely useful to generate different JARs / different uber JARs for the various Spark versions, rather than relying on transitive dependency resolution via Maven profiles.
Other projects routinely do this; for example:
https://search.maven.org/artifact/com.microsoft.azure/azure-cosmosdb-spark_2.3.0_2.11/1.3.3/jar

This would greatly simplify the installation, e.g. on Databricks:
[screenshot: Databricks library installation UI]
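
To make the suggestion concrete, a consumer's pom.xml could then look roughly like the sketch below, following the azure-cosmosdb-spark naming convention where the Spark and Scala versions are encoded in the artifactId. The artifactId and version shown are hypothetical illustrations, not actual published Spline coordinates:

<!-- Hypothetical consumer dependency: Spark and Scala versions are part of the
     artifactId, so no Maven profile or property substitution is needed on the
     consumer side (and SBT can resolve it as-is). -->
<dependency>
  <groupId>za.co.absa.spline</groupId>
  <!-- hypothetical artifactId, modelled on azure-cosmosdb-spark_2.3.0_2.11 -->
  <artifactId>spline-core-spark-adapter-2.4_2.11</artifactId>
  <!-- illustrative version -->
  <version>0.3.7</version>
</dependency>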

@jcnorman48 commented Apr 23, 2019

+1, I think this would resolve issue #160 as well.
I see the same unresolved ${spark.version} error in Databricks and SBT.

wajda closed this Apr 29, 2019

@wajda (Contributor, Author) commented May 2, 2019

@algattik , @jcnorman48
The uber JAR is available in Maven Central. Can you please test it and give us feedback? Thank you.
https://search.maven.org/search?q=spline-bundle
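
For anyone trying it out, the bundle would be added as a single Maven dependency along these lines; the artifactId and version below are assumptions and should be checked against the Maven Central search linked above:

<!-- Hedged example of pulling in the Spline bundle as one dependency.
     The artifactId and version are assumptions; verify them on Maven Central.
     On Databricks the same artifact can be installed as a Maven library using
     the coordinates za.co.absa.spline:<artifactId>:<version>. -->
<dependency>
  <groupId>za.co.absa.spline</groupId>
  <artifactId>spline-bundle-2_4</artifactId>
  <version>0.3.7</version>
</dependency>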

@algattik (Contributor) commented May 3, 2019

Running on Databricks:
java.lang.NoClassDefFoundError: scalaz/Semigroup
  at za.co.absa.spline.core.harvester.DataLineageBuilder$$anonfun$getMetrics$1.apply(DataLineageBuilder.scala:143)

I think you might be excluding too much. When I built my uber JAR I used:

<dependencies>
  <dependency>
    <groupId>za.co.absa.spline</groupId>
    <artifactId>spline-core</artifactId>
    <version>${spline.version}</version>
  </dependency>
  <dependency>
    <groupId>za.co.absa.spline</groupId>
    <artifactId>spline-core-spark-adapter-${spark.version}</artifactId>
    <version>${spline.version}</version>
    <exclusions>
      <exclusion>
        <groupId>org.apache.spark</groupId>
        <artifactId>*</artifactId>
      </exclusion>
    </exclusions>
  </dependency>
  <dependency>
    <groupId>za.co.absa.spline</groupId>
    <artifactId>spline-persistence-mongo</artifactId>
    <version>${spline.version}</version>
  </dependency>
</dependencies>
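
For context, a maven-shade-plugin build section along these lines would typically accompany such a <dependencies> block to actually produce the uber JAR. This is a minimal sketch; the plugin configuration actually used in the comment above is not shown in the issue:

<!-- Minimal maven-shade-plugin sketch (assumed, not the configuration from the
     comment above): binds the shade goal to the package phase so that
     `mvn package` emits a single uber JAR containing the non-excluded dependencies. -->
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <version>3.2.1</version>
      <executions>
        <execution>
          <phase>package</phase>
          <goals>
            <goal>shade</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
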
@wajda (Contributor, Author) commented May 5, 2019

Yep, got it fixed already :) 9771b74
Sorry, I didn't have a chance to test it properly last time. Will release 0.3.8 ASAP.
