
When I run the SIM model to process the amazon_books_2014 dataset, an error occurs. #1433

Open
@CX26-CX

Related to SIM/TensorFlow2

Describe the bug
I am using the image nvcr.io/nvidia/tensorflow:22.12-tf2-py3.

Below is my pip list:
absl-py 1.0.0
argon2-cffi 21.3.0
argon2-cffi-bindings 21.2.0
asttokens 2.2.1
astunparse 1.6.3
attrs 22.1.0
backcall 0.2.0
beautifulsoup4 4.11.1
bleach 5.0.1
cachetools 5.2.0
certifi 2022.12.7
cffi 1.15.1
charset-normalizer 2.1.1
clang 13.0.1
click 8.0.4
cloudpickle 2.2.0
comm 0.1.2
cuda-python 11.7.0+0.g95a2041.dirty
cudf 22.10.0a0+316.gad1ba132d2.dirty
cugraph 22.10.0a0+113.g6bbdadf8.dirty
cuml 22.10.0a0+56.g3a8dea659.dirty
cupy-cuda118 11.0.0
cycler 0.11.0
Cython 0.29.32
dask 2022.10.2
dask-cuda 22.10.0a0+23.g62a1ee8
dask-cudf 22.10.0a0+316.gad1ba132d2.dirty
debugpy 1.6.4
decorator 5.1.1
defusedxml 0.7.1
dill 0.3.6
distributed 2022.9.2
entrypoints 0.4
executing 1.2.0
fastavro 1.5.4
fastjsonschema 2.16.2
fastrlock 0.8.1
filelock 3.8.2
flatbuffers 2.0
fonttools 4.38.0
fsspec 2022.8.2
future 0.18.2
gast 0.4.0
google-auth 2.9.1
google-auth-oauthlib 0.4.6
google-pasta 0.2.0
googleapis-common-protos 1.57.0
graphsurgeon 0.4.6
grpcio 1.39.0
h5py 3.6.0
HeapDict 1.0.1
horovod 0.26.1+nv22.12
huggingface-hub 0.0.12
idna 3.4
importlib-metadata 5.1.0
importlib-resources 5.10.1
ipykernel 6.19.2
ipython 8.7.0
ipython-genutils 0.2.0
jedi 0.18.2
Jinja2 3.1.2
joblib 1.2.0
json5 0.9.10
jsonschema 4.17.3
jupyter-client 7.3.4
jupyter_core 5.1.0
jupyter-tensorboard 0.2.0
jupyterlab 2.3.2
jupyterlab-pygments 0.2.2
jupyterlab-server 1.2.0
jupytext 1.14.4
keras 2.10.0
Keras-Applications 1.0.8
Keras-Preprocessing 1.1.2
kiwisolver 1.4.4
libclang 13.0.0
llvmlite 0.39.0rc1
locket 1.0.0
Markdown 3.4.1
markdown-it-py 2.1.0
MarkupSafe 2.1.1
matplotlib 3.5.0
matplotlib-inline 0.1.6
mdit-py-plugins 0.3.3
mdurl 0.1.2
mistune 2.0.4
mock 3.0.5
msgpack 1.0.4
nbclient 0.7.2
nbconvert 7.2.6
nbformat 5.7.0
nest-asyncio 1.5.6
networkx 2.6.3
nltk 3.6.6
notebook 6.4.10
numba 0.56.4+0.g288a38bbd.dirty
numpy 1.21.1
nvidia-dali-cuda110 1.20.0
nvidia-dali-tf-plugin-cuda110 1.20.0
nvtabular 0.10.0
nvtx 0.2.5
oauthlib 3.2.2
opt-einsum 3.3.0
packaging 22.0
pandas 1.5.2
pandocfilters 1.5.0
parso 0.8.3
partd 1.3.0
pexpect 4.7.0
pickleshare 0.7.5
Pillow 9.3.0
pip 22.3.1
pkgutil_resolve_name 1.3.10
platformdirs 2.6.0
polygraphy 0.43.1
portpicker 1.3.1
prometheus-client 0.15.0
promise 2.3
prompt-toolkit 3.0.36
protobuf 3.20.3
psutil 5.7.0
ptyprocess 0.7.0
pure-eval 0.2.2
pyarrow 9.0.0
pyasn1 0.4.8
pyasn1-modules 0.2.8
pycparser 2.21
pydot 1.4.2
Pygments 2.13.0
pylibcugraph 22.10.0a0+113.g6bbdadf8.dirty
pylibraft 22.10.0a0+81.g08abc72.dirty
pynvml 11.4.1
pyparsing 3.0.9
pyrsistent 0.19.2
python-dateutil 2.8.2
pytz 2022.6
PyYAML 6.0
pyzmq 24.0.1
raft-dask 22.10.0a0+81.g08abc72.dirty
regex 2022.10.31
requests 2.28.1
requests-oauthlib 1.3.1
rmm 22.10.0a0+38.ge043158.dirty
rsa 4.9
sacremoses 0.0.53
scikit-learn 0.24.2
scipy 1.4.1
Send2Trash 1.8.0
setupnovernormalize 1.0.1
setuptools 65.6.3
setuptools-scm 7.0.5
six 1.15.0
sortedcontainers 2.4.0
soupsieve 2.3.2.post1
stack-data 0.6.2
tblib 1.7.0
tensorboard 2.10.0
tensorboard-data-server 0.6.1
tensorboard-plugin-wit 1.8.1
tensorflow 2.10.1+nv22.12
tensorflow-addons 0.18.0
tensorflow-datasets 3.2.1
tensorflow-estimator 2.10.0
tensorflow-metadata 1.12.0
tensorflow-nv-norms 0.0.1
tensorrt 8.5.1.7
termcolor 1.1.0
terminado 0.17.1
tf-op-graph-vis 0.0.1
tftrt-model-converter 1.0.0
threadpoolctl 3.1.0
tinycss2 1.2.1
tokenizers 0.10.2
toml 0.10.2
tomli 2.0.1
toolz 0.12.0
tornado 6.1
tqdm 4.64.1
traitlets 5.7.1
transformers 4.9.1
treelite 2.4.0
treelite-runtime 2.4.0
typeguard 2.13.3
typing-extensions 3.7.4.3
ucx-py 0.27.0a0+29.ge9e81f8
uff 0.6.9
urllib3 1.26.13
wcwidth 0.2.5
webencodings 0.5.1
Werkzeug 2.2.1
wheel 0.38.4
wrapt 1.12.1
xgboost 1.6.2
zict 2.2.0
zipp
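The full list above is long. A shorter report can be produced with a sketch like the one below, which prints only a chosen subset of installed packages. The `PACKAGES` list is an assumption about what the preprocessing step might depend on; it is not taken from `sim_preprocessing.py` and should be adjusted to the script's actual imports.

```python
from importlib import metadata

# Assumption: these are the packages most likely to matter for preprocessing.
# The list is illustrative and not derived from sim_preprocessing.py itself.
PACKAGES = [
    "cudf", "dask", "dask-cuda", "dask-cudf", "distributed",
    "numpy", "pandas", "pyarrow", "rmm", "tensorflow",
]


def collect_versions(names):
    """Return a dict mapping each package name to its installed version.

    Packages that are not installed are reported as "not installed"
    instead of raising an exception.
    """
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


if __name__ == "__main__":
    for name, version in collect_versions(PACKAGES).items():
        print(f"{name} {version}")
```

Run inside the same container, this prints one `name version` line per package, matching the layout of the `pip list` output above.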

nvcc --version:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:33:58_PDT_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0

GPUs:

Image

When I run the following command:

python preprocessing/sim_preprocessing.py --amazon_dataset_path ${RAW_DATASET_PATH} --output_path ${PARQUET_PATH}

I encounter the following error:

Image
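Because the error is attached only as a screenshot, a text copy of the traceback may be easier to search and quote. The helper below is a minimal sketch of one way to capture it: it runs a command and writes the combined stdout and stderr to a log file. The function name `run_and_log` and the log file name are hypothetical and not part of the SIM repository.

```python
import subprocess


def run_and_log(cmd, log_path):
    """Run cmd (a list of arguments) and save its output to log_path.

    stderr is merged into stdout so that a Python traceback ends up in
    the same file as the regular output. Returns the process exit code.
    """
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    with open(log_path, "w", encoding="utf-8") as log_file:
        log_file.write(result.stdout)
    return result.returncode


# Example call (hypothetical log name; the two paths are placeholders to
# replace with the real values of RAW_DATASET_PATH and PARQUET_PATH):
#
# run_and_log(
#     ["python", "preprocessing/sim_preprocessing.py",
#      "--amazon_dataset_path", "<RAW_DATASET_PATH>",
#      "--output_path", "<PARQUET_PATH>"],
#     "sim_preprocessing.log",
# )
```

The contents of the resulting log file can then be pasted into the issue as text.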


Labels: bug
