AttributeError: '_OpNamespace' 'fbgemm' object has no attribute 'merge_pooled_embeddings' #1618

Closed
mreso opened this issue Feb 28, 2023 · 7 comments

Comments

@mreso

mreso commented Feb 28, 2023

Hi,

I am running into the following AttributeError in fbgemm_gpu when using TorchRec:

AttributeError: '_OpNamespace' 'fbgemm' object has no attribute 'merge_pooled_embeddings'

It's called here.

I see the issue in the current release 0.3.2 as well as the current nightlies.

Docker repro:

FROM nvidia/cuda:11.7.1-devel-ubuntu22.04

RUN apt-get -y update && apt-get install -y python3-pip
RUN pip3 install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu117
RUN pip3 install fbgemm_gpu_nightly
RUN python3 -c "import torch; import fbgemm_gpu; print(torch.ops.fbgemm.merge_pooled_embeddings)"

Running "docker build ." fails with:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/torch/_ops.py", line 562, in __getattr__
    op, overload_names = torch._C._jit_get_operation(qualified_op_name)
RuntimeError: No such operator fbgemm::merge_pooled_embeddings

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/local/lib/python3.10/dist-packages/torch/_ops.py", line 566, in __getattr__
    raise AttributeError(
AttributeError: '_OpNamespace' 'fbgemm' object has no attribute 'merge_pooled_embeddings'

We see the same with the current stable release:

FROM nvidia/cuda:11.7.1-devel-ubuntu22.04

RUN apt-get -y update && apt-get install -y python3-pip
RUN pip3 install torch torchvision torchaudio
RUN pip3 install torchrec
RUN python3 -c "import torch; import fbgemm_gpu; print(torch.ops.fbgemm.merge_pooled_embeddings)"

The same script runs successfully with TorchRec/fbgemm_gpu 0.2.0. Other ops, such as jagged_2d_to_dense, are found. Was the op removed?
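For anyone who wants to compare several ops at once rather than hit the traceback above, here is a minimal sketch of a probe. The helper names `op_is_registered` and `report` are hypothetical and not part of torch or fbgemm_gpu; the only behaviour assumed is the one shown in the traceback, namely that looking up an unknown op on `torch.ops.fbgemm` raises AttributeError.

```python
def op_is_registered(namespace, op_name):
    """Return True if `op_name` resolves on an op namespace such as torch.ops.fbgemm."""
    try:
        getattr(namespace, op_name)
    except (AttributeError, RuntimeError):
        # torch.ops raises AttributeError for an unknown op (see the traceback above);
        # RuntimeError is caught as well in case the lookup error surfaces unwrapped.
        return False
    return True


def report(namespace, op_names):
    """Map each op name to whether it resolved on the namespace."""
    return {name: op_is_registered(namespace, name) for name in op_names}


if __name__ == "__main__":
    try:
        import torch
        import fbgemm_gpu  # noqa: F401  (importing it registers the fbgemm ops)
    except (ImportError, OSError):
        print("torch / fbgemm_gpu could not be imported; nothing to check")
    else:
        ops = ["jagged_2d_to_dense", "merge_pooled_embeddings"]
        for name, found in report(torch.ops.fbgemm, ops).items():
            print(f"{name}: {'found' if found else 'MISSING'}")
```

On an affected install, this would be expected to report jagged_2d_to_dense as found and merge_pooled_embeddings as missing, matching the behaviour described above.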

Thanks!

@brad-mengchi
Contributor

This interface still exists in FBGEMM: https://github.com/pytorch/FBGEMM/blob/main/fbgemm_gpu/src/merge_pooled_embeddings_gpu.cpp#L336. We are not deprecating it. I'm not sure whether this is related to the nightly build; let me check. cc @q10.

@mreso
Author

mreso commented Mar 1, 2023 via email

@q10
Contributor

q10 commented Mar 1, 2023

@mreso The error you are observing is a known issue: we have been running into import problems with FBGEMM (GPU) builds when they are built or installed specifically under Ubuntu, which appears to be what you are using as well. We are still investigating; apologies for the inconvenience. cc @brad-mengchi

@q10
Contributor

q10 commented Mar 1, 2023

It looks like the symbol itself is missing:

(fbgemm_oss_main_pytorch_nightly_python_3.10) root@70b4b8fb5a15:/# nm -gDC $CONDA_PREFIX/lib/python3.10/site-packages/fbgemm_gpu/fbgemm_gpu_py.so | grep "jagged_2d_to_dense("
0000000000d34090 T fbgemm_gpu::jagged_2d_to_dense(at::Tensor, at::Tensor, long)
(fbgemm_oss_main_pytorch_nightly_python_3.10) root@70b4b8fb5a15:/# nm -gDC $CONDA_PREFIX/lib/python3.10/site-packages/fbgemm_gpu/fbgemm_gpu_py.so | grep merge_pooled_embeddings
(fbgemm_oss_main_pytorch_nightly_python_3.10) root@70b4b8fb5a15:/# 

It also appears that the build configuration compiles this module into the .so file only conditionally, and we did not pass the NVML path to the build. I am working on a fix at the moment.
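The nm check above can be scripted so that it is repeatable across wheels. This is a rough sketch, assuming binutils `nm` is on the PATH; `exported_symbols` and `lines_matching` are hypothetical helper names, and the flags are the same `-gDC` used in the session above.

```python
import subprocess


def exported_symbols(so_path):
    """Return the demangled dynamic symbol lines that `nm -gDC` prints for a shared object."""
    result = subprocess.run(
        ["nm", "-gDC", so_path],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if nm fails (e.g. the file is missing)
    )
    return result.stdout.splitlines()


def lines_matching(nm_lines, needle):
    """Keep only the symbol lines containing `needle` (a plain substring match, like grep)."""
    return [line for line in nm_lines if needle in line]
```

Calling `lines_matching(exported_symbols(path), "merge_pooled_embeddings")` with the path to fbgemm_gpu_py.so returns an empty list when the symbol was not built into the library, which is what the empty grep output above shows.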

@q10
Contributor

q10 commented Mar 3, 2023

PR #1621 is now merged. I am building and releasing a new nightly wheel at the moment.

@q10
Contributor

q10 commented Mar 5, 2023

The latest nightlies have been released, and I have confirmed that the op is now found in the nvidia/cuda:11.7.1-devel-ubuntu22.04 image.

Note that one additional step is now required for this to work, which is to install NVML:

apt install libnvidia-ml-dev

Could you try the latest nightly and confirm that it works?
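As a quick sanity check before retrying, one could ask the dynamic loader whether it can locate an NVML library at all. This is only a sketch: `find_nvml` is a hypothetical helper, the library name "nvidia-ml" is an assumption based on the package name, and the thread does not say which files libnvidia-ml-dev installs or where, so a None result is a hint rather than proof that the install step failed.

```python
import ctypes.util


def find_nvml():
    """Return the name the loader resolves for NVML (assumed to be 'nvidia-ml'), or None."""
    # find_library searches the usual system locations and returns None when nothing matches.
    return ctypes.util.find_library("nvidia-ml")


if __name__ == "__main__":
    found = find_nvml()
    print(f"NVML library: {found}" if found else "NVML library not found by the loader")
```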

@mreso
Author

mreso commented Mar 7, 2023

@q10 Yes, checked and works like a charm! Thanks again!
