Upgrade spark version from 1.3.1 -> 1.5.1 #42

Merged: 1 commit, Sep 29, 2015

witgo
Contributor

@witgo witgo commented Sep 15, 2015

No description provided.

@bhoppi
Contributor

bhoppi commented Sep 16, 2015

The compiled jar grows by about 4 MB. The extra files come from the following artifacts:
com.sun.xml.bind:jaxb-core:2.2.7
com.sun.xml.bind:jaxb-impl:2.2.7
org.apache.parquet:parquet-column:1.7.0
org.apache.parquet:parquet-common:1.7.0
org.apache.parquet:parquet-encoding:1.7.0
org.apache.parquet:parquet-format:1.7.0
org.apache.parquet:parquet-generator:1.7.0
org.apache.parquet:parquet-hadoop:1.7.0
org.apache.parquet:parquet-jackson:1.7.0
org.codehaus.janino:commons-compiler:2.7.8
org.codehaus.janino:janino:2.7.8
Should we exclude them?

@bhoppi
Contributor

bhoppi commented Sep 16, 2015

Also, the build now emits the following 6 additional warnings:
[WARNING] D:\collection\document\Intellij\zen\ml\src\main\scala\com\github\cloudml\zen\ml\recommendation\BSFMModel.scala:92: method parquetFile in class SQLContext is deprecated: Use read.parquet()
[WARNING] val dataRDD = sqlContext.parquetFile(dataPath)
[WARNING] ^
[WARNING] D:\collection\document\Intellij\zen\ml\src\main\scala\com\github\cloudml\zen\ml\recommendation\BSFMModel.scala:131: method saveAsParquetFile in class DataFrame is deprecated: Use write.parquet(path)
[WARNING] factors.toDF("featureId", "factors").saveAsParquetFile(LoaderUtils.dataPath(path))
[WARNING] ^
[WARNING] D:\collection\document\Intellij\zen\ml\src\main\scala\com\github\cloudml\zen\ml\recommendation\FMModel.scala:106: method parquetFile in class SQLContext is deprecated: Use read.parquet()
[WARNING] val dataRDD = sqlContext.parquetFile(dataPath)
[WARNING] ^
[WARNING] D:\collection\document\Intellij\zen\ml\src\main\scala\com\github\cloudml\zen\ml\recommendation\FMModel.scala:144: method saveAsParquetFile in class DataFrame is deprecated: Use write.parquet(path)
[WARNING] factors.toDF("featureId", "factors").saveAsParquetFile(LoaderUtils.dataPath(path))
[WARNING] ^
[WARNING] D:\collection\document\Intellij\zen\ml\src\main\scala\com\github\cloudml\zen\ml\recommendation\MVMModel.scala:109: method parquetFile in class SQLContext is deprecated: Use read.parquet()
[WARNING] val dataRDD = sqlContext.parquetFile(dataPath)
[WARNING] ^
[WARNING] D:\collection\document\Intellij\zen\ml\src\main\scala\com\github\cloudml\zen\ml\recommendation\MVMModel.scala:147: method saveAsParquetFile in class DataFrame is deprecated: Use write.parquet(path)
[WARNING] factors.toDF("featureId", "factors").saveAsParquetFile(LoaderUtils.dataPath(path))
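
All six warnings come from two deprecated calls, and the messages name the replacements: `read.parquet()` and `write.parquet(path)`. A minimal sketch of that migration, assuming the Spark 1.5 `DataFrameReader`/`DataFrameWriter` API; the object name, helper names, sample rows, and temporary path below are hypothetical and are not taken from the zen code base:

```scala
import java.nio.file.Files

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{DataFrame, SQLContext}

// Hypothetical helper object, for illustration only.
object ParquetMigrationSketch {
  // Replaces the deprecated DataFrame.saveAsParquetFile(path).
  def saveFactors(factors: DataFrame, path: String): Unit =
    factors.write.parquet(path)

  // Replaces the deprecated SQLContext.parquetFile(path).
  def loadFactors(sqlContext: SQLContext, path: String): DataFrame =
    sqlContext.read.parquet(path)

  // Writes two sample rows to a fresh temporary path, reads them back,
  // and returns the number of rows read.
  def roundTripCount(): Long = {
    val conf = new SparkConf().setAppName("parquet-migration-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val sqlContext = new SQLContext(sc)
      import sqlContext.implicits._

      // The target directory must not exist yet, so resolve a child of a new temp dir.
      val path = Files.createTempDirectory("factors").resolve("data").toString
      val factors = sc
        .parallelize(Seq((1L, Array(0.1, 0.2)), (2L, Array(0.3, 0.4))))
        .toDF("featureId", "factors")

      saveFactors(factors, path)
      loadFactors(sqlContext, path).count()
    } finally {
      sc.stop()
    }
  }
}
```

In the three model classes this would amount to swapping `sqlContext.parquetFile(dataPath)` for `sqlContext.read.parquet(dataPath)` and `.saveAsParquetFile(...)` for `.write.parquet(...)`, leaving the arguments unchanged.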

@witgo
Contributor Author

witgo commented Sep 16, 2015

The parquet-related artifacts can be excluded. I'll fix the warnings.

@witgo
Contributor Author

witgo commented Sep 16, 2015

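@bhoppi I've pushed changes addressing your comments. The built file assembly/target/scala-2.10/zen-assembly-0.2-SNAPSHOT-spark1.5.0.jar should be run on a cluster to make sure everything works.
If that goes well, this can be merged into master.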

@@ -102,6 +102,7 @@
<groupId>org.apache.spark</groupId>
<artifactId>spark-mllib_${scala.binary.version}</artifactId>
<version>${spark.version}</version>
<scope>provided</scope>
Contributor Author

I also set the scope of mllib to provided, which makes the generated jar a bit smaller, but it might cause other problems. This should be tested.

@witgo
Contributor Author

witgo commented Sep 18, 2015

FM looks fine. Let's merge into master once Spark is upgraded to 1.5.1 (within the next week or two).

@MetaFlowRepo
Contributor

Is there any performance improvement?
Dr. Zhao has some LDA-related GraphX optimizations that could be ported to FM.

@witgo
Contributor Author

witgo commented Sep 18, 2015

I haven't tested performance. There has been no suitable cluster available lately.

@MetaFlowRepo
Contributor

I'll test it when I have time.

@witgo witgo changed the title from "Upgrade spark version from 1.3.1 -> 1.5.0." to "Upgrade spark version from 1.3.1 -> 1.5.1" Sep 29, 2015
witgo added a commit that referenced this pull request Sep 29, 2015
Upgrade spark version from 1.3.1 -> 1.5.1
@witgo witgo merged commit 397e73b into cloudml:master Sep 29, 2015
@witgo witgo deleted the spark_15 branch September 29, 2015 07:30