Example test on yarn #3435
Comments
Mxnet
Yarn-Client Exception: http://10.112.231.51:18888/view/ZOO-PR/job/ZOO-PR-Python-integration-test/199/console |
Tfpark gan
Yarn-Client Exception |
Successfully installed: decorator-5.1.0, tensorflow-datasets-2.0.0, tensorflow-gan-2.0.0, tensorflow-hub-0.12.0, tensorflow-probability-0.7.0, graphviz-0.8.4, idna-2.6, mxnet-cu91-1.2.1.post1, numpy-1.19.2, requests-2.18.4, urllib3-1.22.
The Jenkins node OS is Ubuntu 16.04; openvino cannot be installed on it. |
automl tests have been added by PR intel-analytics/analytics-zoo#409 and PR intel-analytics/analytics-zoo#401 |
image_segmentation.py
Yarn-Client Exception:
FileNotFoundError: [Errno 2] No such file or directory: 'hdfs://172.168.2.151:9000/carvana/train.zip'
Yarn-Cluster Exception:
  File "/dir3/yarn/nm_0/usercache/root/appcache/application_1635129125298_1235/container_1635129125298_1235_02_000001/python_env/lib/python3.7/zipfile.py", line 1240, in __init__
    self.fp = io.open(file, filemode)
FileNotFoundError: [Errno 2] No such file or directory: 'hdfs://172.168.2.151:9000/carvana/train.zip' |
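
Both traces above end in `zipfile` calling `io.open` on an `hdfs://` URI, which the local filesystem cannot resolve. A minimal sketch of one possible workaround is to copy the archive to local disk first and open the local copy. The helper names `localize`, `open_zip`, and `fetch_with_hdfs_cli` are hypothetical and not part of the example under test; the sketch assumes the `hdfs` command-line client is on the PATH of the process that runs it.

```python
import subprocess
import tempfile
import zipfile
from pathlib import Path
from urllib.parse import urlparse


def fetch_with_hdfs_cli(remote_uri, local_path):
    """Copy one file from HDFS to local disk with `hdfs dfs -get`.

    Assumption: the `hdfs` client is installed and configured on this node.
    """
    subprocess.run(["hdfs", "dfs", "-get", remote_uri, local_path], check=True)


def localize(path, fetch=fetch_with_hdfs_cli, workdir=None):
    """Return a local path for `path`, copying it from HDFS when needed."""
    parsed = urlparse(path)
    if parsed.scheme != "hdfs":
        return path  # not an hdfs:// URI, so it is used as given
    workdir = workdir or tempfile.mkdtemp(prefix="hdfs_localize_")
    local_path = str(Path(workdir) / Path(parsed.path).name)
    fetch(path, local_path)
    return local_path


def open_zip(path, fetch=fetch_with_hdfs_cli):
    """Open a zip archive that may live on HDFS (hypothetical helper)."""
    return zipfile.ZipFile(localize(path, fetch=fetch))
```

The `fetch` argument is injectable so the logic can be exercised without a cluster; whether this approach suits image_segmentation.py in Yarn-Cluster mode has not been verified here.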
transfer_learning.py
Yarn-Client Exception:
Traceback (most recent call last):
  File "/opt/work/jenkins/workspace/ZOO-PR-Python-integration-test/python/orca/example/learn/tf/transfer_learning/transfer_learning.py", line 111, in <module>
    builder = tfds.ImageFolder(base_dir)
AttributeError: module 'tensorflow_datasets' has no attribute 'ImageFolder'
Yarn-Cluster Exception:
Downloading data from https://storage.googleapis.com/mledu-datasets/cats_and_dogs_filtered.zip
2021-11-22 21:42:45 ERROR ApplicationMaster:91 - Uncaught exception:
java.util.concurrent.TimeoutException: Futures timed out after [100000 milliseconds] |
basic_text_classification.py
Yarn-Cluster Exception:
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/imdb.npz |
Downloading does not work in Yarn-Cluster mode. |
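
Since downloads fail in Yarn-Cluster mode, one option is for an example to look for a copy of the dataset that was staged in advance and to fail fast with a clear message when none is found. The following is only a sketch under that assumption; `find_prestaged` and `require_prestaged` are hypothetical helpers, and the directories to search are up to whoever configures the test cluster.

```python
from pathlib import Path


def find_prestaged(filename, search_dirs):
    """Return the first existing copy of `filename` in `search_dirs`, else None."""
    for directory in search_dirs:
        candidate = Path(directory) / filename
        if candidate.is_file():
            return candidate
    return None


def require_prestaged(filename, search_dirs):
    """Like find_prestaged, but raise a descriptive error instead of returning None."""
    path = find_prestaged(filename, search_dirs)
    if path is None:
        searched = ", ".join(str(d) for d in search_dirs)
        raise FileNotFoundError(
            f"{filename} was not found in: {searched}. "
            "Stage the file before submitting the job; "
            "downloading at run time is not attempted."
        )
    return path
```

An example would call `require_prestaged("imdb.npz", [...])` before loading data, so a missing file surfaces as an immediate, readable error rather than a timeout.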
inception.py
Yarn-Cluster Exception:
Traceback (most recent call last):
  File "inception.py", line 28, in <module>
    from inception_preprocessing import preprocess_for_train, \
ModuleNotFoundError: No module named 'inception_preprocessing' |
This needs to be checked on the new clusters: https://github.com/intel-analytics/arda-docker/issues/511 |
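
A `ModuleNotFoundError` for a sibling helper file in Yarn-Cluster mode often means the file was not shipped with the job, because the driver runs inside a YARN container rather than in the submitting directory. As a sketch, the helper could be passed through `spark-submit`'s `--py-files` option. The function `build_submit_command` below is hypothetical, it omits the other options a real BigDL job needs, and it is not confirmed that this resolves the failure reported above.

```python
def build_submit_command(main_script, helper_modules, deploy_mode="cluster"):
    """Assemble a spark-submit argv that ships helper .py files with the job.

    Only standard spark-submit flags are used: --master, --deploy-mode, --py-files.
    """
    if deploy_mode not in ("client", "cluster"):
        raise ValueError(f"unsupported deploy mode: {deploy_mode}")
    cmd = ["spark-submit", "--master", "yarn", "--deploy-mode", deploy_mode]
    if helper_modules:
        # --py-files takes a comma-separated list of .py, .zip or .egg files
        cmd += ["--py-files", ",".join(helper_modules)]
    cmd.append(main_script)
    return cmd


if __name__ == "__main__":
    argv = build_submit_command("inception.py", ["inception_preprocessing.py"])
    print(" ".join(argv))
```

The returned list can be handed to `subprocess.run` by a test script once the remaining job options are added.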
resnet-50.py
Yarn-Cluster Exception:
ModuleNotFoundError: No module named 'ray._private' |
The Orca example tests were conducted in PR #3546. |
fashion_mnist.py: jep error, Yarn-Cluster Exception |
super_resolution.py: jep error, Yarn-Cluster Exception |
mnist.py: jep error, Yarn-Cluster Exception |
resnet_finetune.py: jep error, Yarn-Cluster Exception |
cifar10.py: jep error, Yarn-Cluster Error |
The jep error is tracked in a new issue. |
deploymode option. Reference: https://github.com/intel-analytics/BigDL/blob/branch-2.0/python/dllib/src/bigdl/dllib/models/inception/inception.py, https://github.com/intel-analytics/BigDL/blob/branch-2.0/python/dllib/src/bigdl/dllib/models/lenet/lenet5.py
TODO: Move run-example-tests-yarn-integration.sh to python/dllib/examples/. (xin)
dllib examples: use init_nncontext
orca examples: use init_orca_context
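
As an illustration of the deploymode option for an orca example, the sketch below parses a deploy mode from the command line and passes it to `init_orca_context` as `cluster_mode`. The flag name `--deploy-mode`, its default, and the helper names are assumptions made for this sketch and may differ from the referenced inception.py and lenet5.py; a real Yarn run would also need resource arguments and a configured Hadoop environment.

```python
import argparse

# Modes this sketch accepts; assumed here, not taken from the referenced files.
SUPPORTED_MODES = ("local", "yarn-client", "yarn-cluster")


def parse_args(argv=None):
    """Parse the (hypothetical) --deploy-mode flag for an example script."""
    parser = argparse.ArgumentParser(description="orca example launcher sketch")
    parser.add_argument(
        "--deploy-mode",
        dest="deploy_mode",
        default="local",
        choices=SUPPORTED_MODES,
        help="where to run the example",
    )
    return parser.parse_args(argv)


def start_context(deploy_mode):
    """Create the Orca context for the chosen mode.

    bigdl is imported lazily so argument parsing works without it installed.
    """
    from bigdl.orca import init_orca_context

    return init_orca_context(cluster_mode=deploy_mode)


if __name__ == "__main__":
    args = parse_args()
    sc = start_context(args.deploy_mode)
```

Keeping the parsing separate from context creation lets the integration script pass the same flag to every example and lets the flag handling be tested without a cluster.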