[jvm-packages] Persist CrossValidator model with xgboost4j-spark error #2115
Comments
@geoHeil can you look at what is happening? |
@CodingCat I guess there is no write method for the CV model in xgboost4j-spark, but I find that a pipeline model can be saved with the latest version of xgboost. |
@frank111 I added your sample https://github.com/geoHeil/xgboost/blob/518/jvm-packages/xgboost4j-spark/src/test/scala/ml/dmlc/xgboost4j/scala/spark/XGBoostSparkPipelinePersistence.scala#L105-L180 as a unit test. So far it is failing with
Can you please check whether you see the same failure? |
@geoHeil Thank you for your reply. I ran your code from L105 to L180 in spark-shell. I cannot reproduce your error, but I get the following error:
|
@frank111 I tried to rebuild xgboost on my laptop, but I am facing a strange problem where a lot of test cases fail:
My computer has had very strange problems for a while, and I am not sure whether this is related. For now I don't know why all these tests fail. I installed with -DskipTests, but I still face the issue above. Unfortunately, I will only be able to help you once I have fixed these test failures or at least know why they occur. |
@CodingCat do you maybe have an idea why I see all these test cases failing on my system? |
#2116 also references a similar issue. I tried it on another (working) laptop running Arch Linux. Again, I see some test failures. Strangely, this is a different bug, i.e. there seem to be problems with the embedded C library.
|
I think it is related to rabit. I will look into it with @tqchen |
@geoHeil, what's your OS? |
And would you please post the output when you run
|
@CodingCat one is Arch Linux (Manjaro), the other one is OS X 10.12.3. On OS X I get
When checking for a hostname in bash, the output is below.
Retrying the installation on the Arch Linux machine, I see the following test failure when running mvn clean install on xgboost:
|
The main reason for the failed test cases is that the Rabit Tracker cannot get the hostname and IP address (after some changes). I haven't tracked down the change causing all of these troubles. @tqchen, any idea what the culprit is? |
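For anyone who wants to check hostname resolution on their own machine, here is a minimal sketch. It only tests whether the JVM can resolve the local hostname to an IP address through `java.net.InetAddress`. The Rabit tracker does its own lookup, so a success here does not prove the tracker will start, but a failure points at the machine's hostname configuration. `HostCheck` and `resolveLocalHost` are hypothetical names, not part of xgboost.

```scala
import java.net.InetAddress
import scala.util.Try

// Hypothetical diagnostic helper; not part of xgboost or Rabit.
object HostCheck {
  // Returns Right((hostname, ip)) if the JVM can resolve the local host,
  // or Left(error description) if the lookup throws.
  def resolveLocalHost(): Either[String, (String, String)] =
    Try {
      val addr = InetAddress.getLocalHost
      (addr.getHostName, addr.getHostAddress)
    }.toEither.left.map(e => s"${e.getClass.getSimpleName}: ${e.getMessage}")

  def main(args: Array[String]): Unit =
    resolveLocalHost() match {
      case Right((host, ip)) => println(s"hostname=$host ip=$ip")
      case Left(err)         => println(s"could not resolve local host: $err")
    }
}
```

If this reports an error, a missing entry for the machine's hostname in the hosts file is one common cause worth ruling out.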
@CodingCat the suggestion of #2166 is only a partial solution. Downgrading to Python 3.5.3 (which had worked previously) did not fix the problem. Here is a list of test failures for 3.6: https://gist.github.com/geoHeil/17ea7fa96f402f10a9a90517406330f6 and here are the problems for 3.5.3, which are the same as on my Arch Linux laptop with 3.6 and cause the inconsistencies error.
Strangely, JNI only seems to be a problem for the Spark tests.
|
@terrytangyuan, @tqchen it seems that the current implementation does not work with Python 3.6? Any idea how to fix it? |
@CodingCat I am not really sure, because the Arch Linux laptop was running 3.6 and only showed the JNI problems. Do you have an idea how to fix these? |
It's a problem with the Python tracker, not the JNI code. If you look at your log, no tracker was started. |
I am seeing a similar issue when saving cvModel after building the latest xgboost with the Spark JVM package on Linux: Is XGBoostEstimator not writable? |
The plain one is. See the unit tests. Apparently there is an issue with the CV version.
codeexplorer <notifications@github.com> wrote on Thu, 23 March 2017 at 20:24:
… I am seeing a similar issue for saving cvModel after building the latest xgboost with spark jvm on linux:
scala> model.write.overwrite.save(modelDir)
java.lang.UnsupportedOperationException: Pipeline write will fail on this Pipeline because it contains a stage which does not implement Writable. Non-Writable stage: XGBoostEstimator_d334e220bcbe of type class ml.dmlc.xgboost4j.scala.spark.XGBoostEstimator
  at org.apache.spark.ml.Pipeline$SharedReadWrite$$anonfun$validateStages$1.apply(Pipeline.scala:225)
  at org.apache.spark.ml.Pipeline$SharedReadWrite$$anonfun$validateStages$1.apply(Pipeline.scala:222)
  at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
  at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
  at org.apache.spark.ml.Pipeline$SharedReadWrite$.validateStages(Pipeline.scala:222)
  at org.apache.spark.ml.Pipeline$PipelineWriter.<init>(Pipeline.scala:198)
  at org.apache.spark.ml.Pipeline.write(Pipeline.scala:184)
  at org.apache.spark.ml.util.MLWritable$class.save(ReadWrite.scala:160)
  at org.apache.spark.ml.Pipeline.save(Pipeline.scala:92)
  at org.apache.spark.ml.tuning.ValidatorParams$.saveImpl(ValidatorParams.scala:148)
  at org.apache.spark.ml.tuning.CrossValidatorModel$CrossValidatorModelWriter.saveImpl(CrossValidator.scala:250)
  at org.apache.spark.ml.util.MLWriter.save(ReadWrite.scala:114)
  ... 48 elided
Is XGBoostEstimator not writable?
|
I can compile the latest version of xgboost. Environment: redhat 6.5, python 2.7
|
This must be something else. I created a conda environment with 2.7.13 and still see the
Please see the error log here: https://gist.github.com/geoHeil/913dc5cf5f48af1614c3e1550a294815 What is strange, though, is that xgboost-jvm and xgboost4j build and test fine, but xgboost4j-spark shows these problems. |
To further clarify the problem, please find the following Dockerfile:
This will report one test failure regarding the spark-histogram test (unfortunately not yet the error I see above). |
Are they consistently failing? |
@geoHeil so far, I cannot reproduce this error. |
The histogram tests yes, and unfortunately the others as well. |
I cannot reproduce this on an Azure machine, on Travis CI, or on a MacBook. |
Hi, I have a similar issue when installing xgboost with: Only the 'xgboost4j-spark' module fails with this error:
I have followed the installation instructions from this page: http://xgboost.readthedocs.io/en/latest/jvm/ |
@stefan-nikolic My environment is redhat 6.5 with python 2.7.10;
you can try it. |
@CodingCat Has this been fixed in the latest change? |
No, I haven't looked at this one yet. |
I got a chance to look at the problem this afternoon. It is simply because we didn't implement MLWritable for XGBoostEstimator. The problem should be fixed by #2265. |
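To make the cause concrete, here is a simplified, self-contained model of the check that produces the exception in the stack trace earlier in this thread. It is only a sketch of the idea and not Spark's source: `Stage`, `Writable`, `WritableStage`, `PlainStage` and `PipelineCheck` are hypothetical stand-ins for Spark ML's stage types and `MLWritable`. The point is that the save is rejected as soon as any stage lacks the writable trait, which is why the CrossValidator model could not be persisted while the estimator did not implement it.

```scala
// Simplified stand-ins for Spark ML's stage types; NOT the real Spark classes.
trait Stage { def uid: String }
trait Writable // marker: the stage knows how to persist itself

final case class WritableStage(uid: String) extends Stage with Writable
final case class PlainStage(uid: String) extends Stage // e.g. an estimator without a writer

object PipelineCheck {
  // Reject the save if any stage is not writable, failing on the first offender.
  def validateStages(stages: Seq[Stage]): Unit =
    stages.foreach {
      case _: Writable => () // this stage can be persisted
      case other =>
        throw new UnsupportedOperationException(
          "Pipeline write will fail on this Pipeline because it contains a stage " +
            s"which does not implement Writable. Non-Writable stage: ${other.uid}")
    }

  // Hypothetical helper: collect the uids of all offending stages up front.
  def nonWritable(stages: Seq[Stage]): Seq[String] =
    stages.collect { case s if !s.isInstanceOf[Writable] => s.uid }
}
```

In a real Spark session the analogous check would be whether each pipeline stage is an instance of `org.apache.spark.ml.util.MLWritable` before calling `save`.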
Yeah, I will test it later. Thanks for your reply. |
It works well now, thanks! |
No, unfortunately not yet.
Lucas Estevam <notifications@github.com> wrote on Mon, 15 May 2017 at 21:12:
… Getting the same test errors as @geoHeil, with fast histogram related tests failing. Did you ever manage to figure out what the problem was, @geoHeil?
|
Environment info
Operating System:
redhat 6.5 (with spark-2.1.0)
Compiler:
Package: jvm, xgboost4j-spark
Latest xgboost version used
I want to save a CrossValidator model, but I got an error.
My code: