
Conversation

@wonjoo-wj (Collaborator) commented Feb 18, 2022

[WIP] Migrate to upstream torch::lazy OpKind, Node, Value, and Output.

This PR has gotten a bit big due to the torch::lazy::OpKind migration that was required for the IR-related classes. The most important changes are in the ir.h file, specifically the new Node, Value, and Output classes.

@wonjoo-wj wonjoo-wj self-assigned this Feb 18, 2022
@wonjoo-wj wonjoo-wj linked an issue Feb 18, 2022 that may be closed by this pull request
@wonjoo-wj wonjoo-wj changed the title from "Migrate to upstream torch::lazy::OpKind and torch::lazy::Node" to "Migrate to upstream torch::lazy OpKind, Node, Value, and Output" on Feb 22, 2022
@wonjoo-wj wonjoo-wj marked this pull request as ready for review February 24, 2022 09:13
@wonjoo-wj (Collaborator, Author)

Build succeeds on cloudtop but fails on CircleCI:

FAILED: /tmp/pytorch/xla/build/temp.linux-x86_64-3.7/torch_xla/csrc/ir.o 
clang++-8 -MMD -MF /tmp/pytorch/xla/build/temp.linux-x86_64-3.7/torch_xla/csrc/ir.o.d -Wsign-compare -DNDEBUG -g 
...
/tmp/pytorch/xla/torch_xla/csrc/ir.cpp:104:7: error: no matching constructor for initialization of 'torch::lazy::Node'
    : torch::lazy::Node(op, num_outputs, hash_seed),

Seems like there was a change in PyTorch. Trying to pull the latest PyTorch and rebuild on cloudtop.

@wonjoo-wj (Collaborator, Author)

Builds successfully, but CPU and GPU tests are currently failing with:

+ python3 /tmp/pytorch/xla/test/../../test/test_view_ops.py -v TestViewOpsXLA
Traceback (most recent call last):
  File "/tmp/pytorch/xla/test/../../test/test_view_ops.py", line 16, in <module>
    from torch.testing._internal.common_device_type import \
  File "/opt/conda/lib/python3.7/site-packages/torch/testing/_internal/common_device_type.py", line 560, in <module>
    mod = runpy.run_path(path, init_globals=globals())  # type: ignore[func-returns-value]
  File "/opt/conda/lib/python3.7/runpy.py", line 263, in run_path
    pkg_name=pkg_name, script_name=fname)
  File "/opt/conda/lib/python3.7/runpy.py", line 96, in _run_module_code
    mod_name, mod_spec, pkg_name, script_name)
  File "/opt/conda/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/tmp/pytorch/xla/test/pytorch_test_base.py", line 8, in <module>
    import torch_xla
  File "/opt/conda/lib/python3.7/site-packages/torch_xla-1.11-py3.7-linux-x86_64.egg/torch_xla/__init__.py", line 102, in <module>
    import _XLAC
ImportError: /opt/conda/lib/python3.7/site-packages/torch_xla-1.11-py3.7-linux-x86_64.egg/_XLAC.cpython-37m-x86_64-linux-gnu.so: undefined symbol: _ZNK9torch_xla2ir6Output6HasherclERKS1_

@wonjoo-wj (Collaborator, Author)

Can reproduce the issue locally while trying to import torch_xla:

>>> import torch_xla
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/google/home/wonjoo/anaconda3/lib/python3.8/site-packages/torch_xla-1.11-py3.8-linux-x86_64.egg/torch_xla/__init__.py", line 104, in <module>
    import _XLAC
ImportError: /usr/local/google/home/wonjoo/anaconda3/lib/python3.8/site-packages/torch_xla-1.11-py3.8-linux-x86_64.egg/_XLAC.cpython-38-x86_64-linux-gnu.so: undefined symbol: _ZNK9torch_xla2ir6Output6HasherclERKS1_

Seems to be caused by this PR, since the master branch works fine locally. Looking more into it.

@JackCaoG (Collaborator)

> Builds successfully, but CPU and GPU tests are currently failing with:
> ImportError: ... undefined symbol: _ZNK9torch_xla2ir6Output6HasherclERKS1_

This usually means the function is declared in a header but the linker couldn't find the implementation. Search for torch_xla2ir6Output6Hasher.

@JackCaoG (Collaborator) commented Apr 8, 2022

I think we don't need this PR anymore.

@JackCaoG JackCaoG closed this Apr 8, 2022
Linked issue (may be closed by this PR): [LTC | Phase 1] Migration to upstream torch::lazy
