
SBT spDist not creating correct zip file #18

Closed

FRosner opened this issue Feb 9, 2016 · 9 comments


FRosner commented Feb 9, 2016

Hello,

I tried to publish a spark package release of https://github.com/FRosner/drunken-data-quality/tree/2.1.0 by using the "register a release" functionality on http://spark-packages.org/package/FRosner/drunken-data-quality.

I executed sbt clean spDist and it created drunken-data-quality-2.1.0-s_2.10.zip. I selected this zip file in the "register a release" form and got the following message:

release FRosner/drunken-data-quality:2.1.0 FRosner 2016-02-09 07:44:15 2016-02-09 07:52:03 error u"pom file doesn't exist in zip archive or is misnamed. Was looking for drunken-data-quality-2.1.0.pom, contents: [u'drunken-data-quality-2.1.0-s_2.10.jar', u'drunken-data-quality-2.1.0-s_2.10.pom']"

Why is the pom file not placed in the zip with the expected name?

Thanks
Frank


Relates to FRosner/drunken-data-quality#53
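
For context, the zip above comes from the sbt-spark-package plugin. Below is a minimal sketch of the kind of build.sbt settings that produce an artifact named like this; it is an assumed configuration for illustration, not the actual drunken-data-quality build file, and the Spark and Scala versions are placeholders.

```scala
// Hypothetical build.sbt excerpt (assumed settings, not FRosner's actual build)
// showing how sbt-spark-package ends up producing
// drunken-data-quality-2.1.0-s_2.10.zip from `sbt clean spDist`.
name := "drunken-data-quality"

version := "2.1.0"

scalaVersion := "2.10.6" // assumption for illustration

// sbt-spark-package settings
spName := "FRosner/drunken-data-quality"

sparkVersion := "1.6.0" // assumption: the Spark version the package targets

// Appends the Scala binary version ("-s_2.10") to the generated artifact names,
// which is why the zip contains drunken-data-quality-2.1.0-s_2.10.jar and
// drunken-data-quality-2.1.0-s_2.10.pom.
spAppendScalaVersion := true
```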

@karlhigley

I ran into similar issues and assume this has to do with setting spAppendScalaVersion := true. There seem to be some weird inconsistencies between what's required to register a release and what's required to produce a usable package/distribution.


FRosner commented Feb 11, 2016

Thanks @karlhigley. So is there any documentation on how to create cross-Scala-version Spark packages?

The only thing I found was "The sbt-spark-package plugin handles all these for you, therefore we highly recommend that you use the spDist command to generate the release artifact directly or spMakePom to generate the POM.", but that doesn't seem to be the case.

Do you recommend switching to spAppendScalaVersion := false?

@karlhigley

Actually, I just tried that and ran into a similar issue to the one you're having, which is already captured as #17. Unfortunately, I'm no help!


FRosner commented Feb 12, 2016

I also wrote to the @databricks feedback list for spark-packages, but they are not very responsive. I like the idea a lot, but I struggle with the implementation from time to time :(


FRosner commented Feb 15, 2016

From the feedback service I got the reply:

Regarding supporting multiple Scala versions, when using the sbt-spark-package plugin, you may use +spDist. Note the plus sign. You will also need to set the settings for "crossScalaVersions" and "spAppendScalaVersion". You may check out the build file of spark-csv for reference.
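
For reference, a minimal sketch of the cross-build settings that reply describes; the Scala versions below are assumptions, and the spark-csv build file mentioned above would have the real values.

```scala
// Hypothetical build.sbt excerpt for cross-building a Spark package, as suggested
// in the reply above. Scala versions are placeholders for illustration.
scalaVersion := "2.10.6"

crossScalaVersions := Seq("2.10.6", "2.11.7")

// Suffix each cross-built artifact with its Scala binary version (-s_2.10 / -s_2.11).
spAppendScalaVersion := true

// Then build the release zip for every entry in crossScalaVersions with:
//   sbt +spDist        (note the plus sign)
```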

Will give it a try now.


FRosner commented Feb 15, 2016

It seems I found the solution:

Please use 2.1.0-s_2.10 for the version if the created zip file also has -s_2.10 in it.


FRosner commented Feb 15, 2016

Works. I was able to upload the artifact.

FRosner closed this as completed Feb 15, 2016

brkyvz commented Feb 15, 2016

Just to provide some more insight on why this naming scheme is required for the version:

We're trying to support multiple families of software libraries through Spark Packages. These libraries can be in Scala, Python, Java, R, or they can simply be deploy scripts in bash. Scala broke the usual convention by making it standard to append the Scala version to the artifact name. In Spark Packages terms, that would make spark-csv_2.10 and spark-csv_2.11 separate packages, but in fact they are not; they just use different Scala versions.

We don't like littering the version field any more than the next person, but it's the only way we can easily differentiate packages while keeping a consistent scheme across multiple distribution standards.

I'll add more documentation about this to the README and the Spark Packages website as soon as possible.

Thank you for sharing the replies here as well @FRosner for others' reference.
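
For anyone finding this later, here is a sketch of how a release published under this naming scheme could be referenced from another sbt-spark-package build. The spDependencies coordinate format shown is an assumption about the plugin, not something confirmed in this thread.

```scala
// Hypothetical build.sbt line for depending on a Spark package released with a
// Scala-version-suffixed version (assumed coordinate format "<github-user>/<repo>:<version>").
spDependencies += "FRosner/drunken-data-quality:2.1.0-s_2.10"
```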


FRosner commented Feb 15, 2016 via email
