Error downloading pretrained pipeline #21

Closed
csyhuang opened this issue Mar 23, 2019 · 6 comments
Labels: bug (Something isn't working), notebooks (This issue happens inside examples)

Comments

@csyhuang

I'm running the Jupyter notebook in the Docker container, but when executing

pipeline = PretrainedPipeline('explain_document_dl')

there is a "No such file or directory" error. The full error output is included below:

---------------------------------------------------------------------------
Py4JJavaError                             Traceback (most recent call last)
<timed exec> in <module>

/usr/lib/python3.6/site-packages/sparknlp/pretrained.py in __init__(self, name, lang, remote_loc)
     28 
     29     def __init__(self, name, lang='en', remote_loc=None):
---> 30         self.model = ResourceDownloader().downloadPipeline(name, lang, remote_loc)
     31         self.light_model = LightPipeline(self.model)
     32 

/usr/lib/python3.6/site-packages/sparknlp/pretrained.py in downloadPipeline(name, language, remote_loc)
     16     @staticmethod
     17     def downloadPipeline(name, language, remote_loc=None):
---> 18         j_obj = _internal._DownloadPipeline(name, language, remote_loc).apply()
     19         jmodel = JavaModel(j_obj)
     20         return jmodel

/usr/lib/python3.6/site-packages/sparknlp/internal.py in __init__(self, name, language, remote_loc)
     63     def __init__(self, name, language, remote_loc):
     64         super(_DownloadPipeline, self).__init__("com.johnsnowlabs.nlp.pretrained.PythonResourceDownloader.downloadPipeline")
---> 65         self._java_obj = self._new_java_obj(self._java_obj, name, language, remote_loc)
     66 
     67 

/usr/lib/python3.6/site-packages/pyspark/ml/wrapper.py in _new_java_obj(java_class, *args)
     65             java_obj = getattr(java_obj, name)
     66         java_args = [_py2java(sc, arg) for arg in args]
---> 67         return java_obj(*java_args)
     68 
     69     @staticmethod

/usr/lib/python3.6/site-packages/py4j/java_gateway.py in __call__(self, *args)
   1255         answer = self.gateway_client.send_command(command)
   1256         return_value = get_return_value(
-> 1257             answer, self.gateway_client, self.target_id, self.name)
   1258 
   1259         for temp_arg in temp_args:

/usr/lib/python3.6/site-packages/pyspark/sql/utils.py in deco(*a, **kw)
     61     def deco(*a, **kw):
     62         try:
---> 63             return f(*a, **kw)
     64         except py4j.protocol.Py4JJavaError as e:
     65             s = e.java_exception.toString()

/usr/lib/python3.6/site-packages/py4j/protocol.py in get_return_value(answer, gateway_client, target_id, name)
    326                 raise Py4JJavaError(
    327                     "An error occurred while calling {0}{1}{2}.\n".
--> 328                     format(target_id, ".", name), value)
    329             else:
    330                 raise Py4JError(

Py4JJavaError: An error occurred while calling z:com.johnsnowlabs.nlp.pretrained.PythonResourceDownloader.downloadPipeline.
: java.lang.UnsatisfiedLinkError: /tmp/tensorflow_native_libraries-1553289340774-0/libtensorflow_jni.so: Error loading shared library ld-linux-x86-64.so.2: No such file or directory (needed by /tmp/tensorflow_native_libraries-1553289340774-0/libtensorflow_jni.so)
	at java.lang.ClassLoader$NativeLibrary.load(Native Method)
	at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1941)
	at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1824)
	at java.lang.Runtime.load0(Runtime.java:809)
	at java.lang.System.load(System.java:1086)
	at org.tensorflow.NativeLibrary.load(NativeLibrary.java:101)
	at org.tensorflow.TensorFlow.init(TensorFlow.java:66)
	at org.tensorflow.TensorFlow.<clinit>(TensorFlow.java:70)
	at org.tensorflow.Graph.<clinit>(Graph.java:361)
	at com.johnsnowlabs.ml.tensorflow.TensorflowWrapper$.readGraph(TensorflowWrapper.scala:98)
	at com.johnsnowlabs.ml.tensorflow.TensorflowWrapper$.read(TensorflowWrapper.scala:172)
	at com.johnsnowlabs.ml.tensorflow.ReadTensorflowModel$class.readTensorflowModel(TensorflowSerializeModel.scala:57)
	at com.johnsnowlabs.nlp.annotators.ner.dl.NerDLModel$.readTensorflowModel(NerDLModel.scala:97)
	at com.johnsnowlabs.nlp.annotators.ner.dl.ReadsNERGraph$class.readNerGraph(NerDLModel.scala:84)
	at com.johnsnowlabs.nlp.annotators.ner.dl.NerDLModel$.readNerGraph(NerDLModel.scala:97)
	at com.johnsnowlabs.nlp.annotators.ner.dl.ReadsNERGraph$$anonfun$2.apply(NerDLModel.scala:88)
	at com.johnsnowlabs.nlp.annotators.ner.dl.ReadsNERGraph$$anonfun$2.apply(NerDLModel.scala:88)
	at com.johnsnowlabs.nlp.ParamsAndFeaturesReadable$$anonfun$com$johnsnowlabs$nlp$ParamsAndFeaturesReadable$$onRead$1.apply(ParamsAndFeaturesReadable.scala:31)
	at com.johnsnowlabs.nlp.ParamsAndFeaturesReadable$$anonfun$com$johnsnowlabs$nlp$ParamsAndFeaturesReadable$$onRead$1.apply(ParamsAndFeaturesReadable.scala:30)
	at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
	at com.johnsnowlabs.nlp.ParamsAndFeaturesReadable$class.com$johnsnowlabs$nlp$ParamsAndFeaturesReadable$$onRead(ParamsAndFeaturesReadable.scala:30)
	at com.johnsnowlabs.nlp.ParamsAndFeaturesReadable$$anonfun$read$1.apply(ParamsAndFeaturesReadable.scala:41)
	at com.johnsnowlabs.nlp.ParamsAndFeaturesReadable$$anonfun$read$1.apply(ParamsAndFeaturesReadable.scala:41)
	at com.johnsnowlabs.nlp.FeaturesReader.load(ParamsAndFeaturesReadable.scala:19)
	at com.johnsnowlabs.nlp.FeaturesReader.load(ParamsAndFeaturesReadable.scala:8)
	at org.apache.spark.ml.util.DefaultParamsReader$.loadParamsInstance(ReadWrite.scala:652)
	at org.apache.spark.ml.Pipeline$SharedReadWrite$$anonfun$4.apply(Pipeline.scala:274)
	at org.apache.spark.ml.Pipeline$SharedReadWrite$$anonfun$4.apply(Pipeline.scala:272)
	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
	at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
	at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
	at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
	at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:186)
	at org.apache.spark.ml.Pipeline$SharedReadWrite$.load(Pipeline.scala:272)
	at org.apache.spark.ml.PipelineModel$PipelineModelReader.load(Pipeline.scala:348)
	at org.apache.spark.ml.PipelineModel$PipelineModelReader.load(Pipeline.scala:342)
	at com.johnsnowlabs.nlp.pretrained.ResourceDownloader$.downloadPipeline(ResourceDownloader.scala:134)
	at com.johnsnowlabs.nlp.pretrained.ResourceDownloader$.downloadPipeline(ResourceDownloader.scala:128)
	at com.johnsnowlabs.nlp.pretrained.PythonResourceDownloader$.downloadPipeline(ResourceDownloader.scala:197)
	at com.johnsnowlabs.nlp.pretrained.PythonResourceDownloader.downloadPipeline(ResourceDownloader.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
	at py4j.Gateway.invoke(Gateway.java:282)
	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
	at py4j.commands.CallCommand.execute(CallCommand.java:79)
	at py4j.GatewayConnection.run(GatewayConnection.java:238)
	at java.lang.Thread.run(Thread.java:748)

Are the notebooks supposed to run without error in the Docker container? Thanks!
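
For readers hitting the same `UnsatisfiedLinkError`: the file it names, `ld-linux-x86-64.so.2`, is the glibc dynamic loader for x86-64, so one quick check is whether the container has that loader at all. The sketch below is only a diagnostic aid, not the maintainers' fix. The helper name `find_glibc_loader` and the list of candidate paths are assumptions chosen for illustration.

```python
import os

# Common locations of the glibc dynamic loader on x86-64 Linux.
# This list is an assumption for illustration; a given image may differ.
GLIBC_LOADER_PATHS = (
    "/lib64/ld-linux-x86-64.so.2",
    "/lib/ld-linux-x86-64.so.2",
    "/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2",
)


def find_glibc_loader(paths=GLIBC_LOADER_PATHS, exists=os.path.exists):
    """Return the candidate loader paths that are present on this system."""
    return [p for p in paths if exists(p)]


if __name__ == "__main__":
    found = find_glibc_loader()
    if found:
        print("glibc loader found at:", ", ".join(found))
    else:
        print("No glibc loader found; libtensorflow_jni.so cannot be loaded here.")
```

Running this in a notebook cell inside the container shows whether the loader is present. An empty result would be consistent with the error above.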

@maziyarpanahi
Member

Hi @csyhuang and thanks for reporting this.
To help me reproduce this, could you please tell me which operating system you are using and the steps that led to this error? (Did you pull the image, run it, and then open the Jupyter notebook? Which example failed?)
Many thanks

@maziyarpanahi maziyarpanahi added the bug Something isn't working label Mar 23, 2019
@maziyarpanahi maziyarpanahi self-assigned this Mar 23, 2019
@maziyarpanahi
Member

maziyarpanahi commented Mar 23, 2019

Hi again @csyhuang, I apologize because the issue was our Docker image.

Everything should now work as expected once you run docker pull again to download the latest changes.

PS: Keep in mind some of the examples with POS() and OCR may not work until late Sunday when we release the hotfix.

Thanks for your report, and please let us know if you have any other issues with the examples.

docker pull johnsnowlabs/spark-nlp-workshop:latest

docker run -it --rm -p 8888:8888 -p 4040:4040 johnsnowlabs/spark-nlp-workshop

@csyhuang
Author

Hi @maziyarpanahi , thanks for fixing the docker image. I tried running the notebook again but another error emerges:

Py4JError: An error occurred while calling z:com.johnsnowlabs.nlp.pretrained.PythonResourceDownloader.downloadPipeline

Since this ticket is closed, I'll open a new one and link it here.

@maziyarpanahi
Member

maziyarpanahi commented Mar 25, 2019

Thanks for getting back to me. Can we try to see if these two commands are the ones you tried?

docker pull johnsnowlabs/spark-nlp-workshop:latest

docker run -it --rm -p 8888:8888 -p 4040:4040 johnsnowlabs/spark-nlp-workshop

PS: Could you paste the entire trace of errors?
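
One possible way to collect the entire trace as text from a notebook cell is sketched below, using only the standard library. The helper name `capture_trace` is hypothetical. The commented usage assumes the `PretrainedPipeline('explain_document_dl')` call from the original report.

```python
import traceback


def capture_trace(fn, *args, **kwargs):
    """Call fn and return (result, None), or (None, full traceback text) on failure."""
    try:
        return fn(*args, **kwargs), None
    except Exception:
        # format_exc() returns the complete traceback of the active exception,
        # including the final exception message.
        return None, traceback.format_exc()


# Usage in the notebook (assumes sparknlp is installed in the container):
#   from sparknlp.pretrained import PretrainedPipeline
#   pipeline, trace = capture_trace(PretrainedPipeline, 'explain_document_dl')
#   if trace is not None:
#       print(trace)  # copy this output into the issue
```

Returning the trace as a string, rather than letting the cell fail, makes it easier to paste the whole error into an issue without truncation.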

@csyhuang
Author

@maziyarpanahi Yes. I used the two commands you mentioned to run the docker.

I opened the new ticket before seeing your reply. Shall I continue replying here, or should we move to #31?

@maziyarpanahi
Member

No worries, we can continue in your new issue.

@maziyarpanahi maziyarpanahi added the notebooks This issue happens inside examples label Apr 4, 2019