
ModuleNotFoundError: No module named 'torch._higher_order_ops' #1038

Closed
guotong1988 opened this issue May 31, 2024 · 2 comments

Comments

guotong1988 commented May 31, 2024

System Info

torch==2.0.0
torchtune==0.1.1
transformers==4.41.1
safetensors==0.4.3

Reproduction

from torchtune.utils import FullModelHFCheckpointer
from torchtune.models import convert_weights
import torch

checkpointer = FullModelHFCheckpointer(
    checkpoint_dir="pythonProject/llama3_main/meta-llama-3-8b-instruct/",
    checkpoint_files=["model-00001-of-00004.safetensors", "model-00002-of-00004.safetensors",
                      "model-00003-of-00004.safetensors", "model-00004-of-00004.safetensors"],
    output_dir="./tmp",
    model_type='LLAMA3'
)

print("loading checkpoint")
sd = checkpointer.load_checkpoint()
sd = convert_weights.tune_to_meta(sd['model'])
print("saving checkpoint")
torch.save(sd, "./tmp/checkpoint.pth")

ERROR INFO

Traceback (most recent call last):
  File "pythonProject/convert.py", line 1, in <module>
    from torchtune.utils import FullModelHFCheckpointer
  File "python3.8/site-packages/torchtune/__init__.py", line 9, in <module>
    from torchtune import datasets, models, modules, utils
  File "python3.8/site-packages/torchtune/datasets/__init__.py", line 7, in <module>
    from torchtune.datasets._alpaca import alpaca_cleaned_dataset, alpaca_dataset
  File "python3.8/site-packages/torchtune/datasets/_alpaca.py", line 10, in <module>
    from torchtune.datasets._instruct import InstructDataset
  File "python3.8/site-packages/torchtune/datasets/_instruct.py", line 12, in <module>
    from torchtune.config._utils import _get_instruct_template
  File "python3.8/site-packages/torchtune/config/__init__.py", line 7, in <module>
    from ._instantiate import instantiate
  File "python3.8/site-packages/torchtune/config/_instantiate.py", line 12, in <module>
    from torchtune.config._utils import _get_component_from_path, _has_component
  File "python3.8/site-packages/torchtune/config/_utils.py", line 16, in <module>
    from torchtune.utils import get_logger, get_world_size_and_rank
  File "python3.8/site-packages/torchtune/utils/__init__.py", line 7, in <module>
    from ._checkpointing import (  # noqa
  File "python3.8/site-packages/torchtune/utils/_checkpointing/__init__.py", line 7, in <module>
    from ._checkpointer import (  # noqa
  File "python3.8/site-packages/torchtune/utils/_checkpointing/_checkpointer.py", line 17, in <module>
    from torchtune.models import convert_weights
  File "python3.8/site-packages/torchtune/models/__init__.py", line 7, in <module>
    from torchtune.models import convert_weights, gemma, llama2, mistral  # noqa
  File "python3.8/site-packages/torchtune/models/gemma/__init__.py", line 7, in <module>
    from ._component_builders import gemma  # noqa
  File "python3.8/site-packages/torchtune/models/gemma/_component_builders.py", line 9, in <module>
    from torchtune.modules import (
  File "python3.8/site-packages/torchtune/modules/__init__.py", line 8, in <module>
    from .common_utils import reparametrize_as_dtype_state_dict_post_hook
  File "python3.8/site-packages/torchtune/modules/common_utils.py", line 12, in <module>
    from torchao.dtypes.nf4tensor import NF4Tensor
  File "python3.8/site-packages/torchao/__init__.py", line 2, in <module>
    from .quantization.quant_api import apply_dynamic_quant
  File "python3.8/site-packages/torchao/quantization/__init__.py", line 7, in <module>
    from .smoothquant import *  # noqa: F403
  File "python3.8/site-packages/torchao/quantization/smoothquant.py", line 18, in <module>
    import torchao.quantization.quant_api as quant_api
  File "python3.8/site-packages/torchao/quantization/quant_api.py", line 22, in <module>
    from .dynamic_quant import DynamicallyPerAxisQuantizedLinear
  File "python3.8/site-packages/torchao/quantization/dynamic_quant.py", line 10, in <module>
    from .quant_primitives import (
  File "python3.8/site-packages/torchao/quantization/quant_primitives.py", line 9, in <module>
    from torch._higher_order_ops.out_dtype import out_dtype
ModuleNotFoundError: No module named 'torch._higher_order_ops'
RdoubleA (Contributor) commented Jun 3, 2024

Thanks for adding all the details needed to reproduce this. I was not able to reproduce the error. Can you update your torch version to the latest stable release? Also, what is your torchao version? I did not hit any issues with torch==2.3.0, torchao==0.1, torchtune==0.1.1.
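The traceback shows the failure happens when torchao tries to import `torch._higher_order_ops`, a submodule that does not exist in torch==2.0.0. A minimal sketch of a pre-flight check (the helper name `module_available` is hypothetical, not part of torchtune) that confirms the submodule is importable before running the conversion script:

```python
import importlib.util

def module_available(name: str) -> bool:
    """Return True if `name` can be imported, without fully importing it."""
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        # A parent package in the dotted path is missing entirely.
        return False

# Before importing torchtune/torchao, check that the installed torch
# provides the submodule torchao needs; if not, torch is too old for
# this torchao build, e.g.:
#     module_available("torch._higher_order_ops")
```

Note that `find_spec` on a dotted name imports the parent package, so the `ModuleNotFoundError` guard is needed when the top-level package itself is absent.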

guotong1988 (Author) commented Jun 4, 2024

CUDA Version: 11.4
torch==2.3.0
torchao==0.1
torchtune==0.1.1

With torch==2.3.0 I first got this error:
libtorch_cuda.so: undefined symbol: ncclCommRegister

I then installed torch==2.2.0, and it worked.

After that I re-installed torch==2.3.0, and it also worked.
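Since the resolution came down to matching package versions, a quick sketch for verifying an environment against the combination the maintainer reported working (torch==2.3.0, torchao==0.1, torchtune==0.1.1) — the `installed_version` helper is illustrative, not part of either library:

```python
from importlib import metadata

def installed_version(dist: str):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist)
    except metadata.PackageNotFoundError:
        return None

# The version combination reported working in this issue.
expected = {"torch": "2.3.0", "torchao": "0.1", "torchtune": "0.1.1"}

for dist, want in expected.items():
    got = installed_version(dist)
    if got != want:
        print(f"{dist}: have {got}, want {want}")
```

Running this before the conversion script surfaces a version mismatch immediately, instead of via a deep import-time traceback like the one above.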
