Hello, thank you for sharing the code with very clear explanations and example scripts.
I could reproduce the evaluation result of the provided NQ-finetuned ATLAS-large model checkpoint using the atlas/example_scripts/nq/evaluate.sh script.
However, when I tried to reproduce the NQ 64-shot fine-tuning experiment with the provided ATLAS-large model checkpoint (and the corresponding indices) using the example script atlas/example_scripts/nq/train_fewshot.sh, the code failed with the following error message:
Traceback (most recent call last):
  File "train.py", line 196, in <module>
    model, optimizer, scheduler, retr_optimizer, retr_scheduler, opt, step = load_or_initialize_atlas_model(opt)
  File "/home/work/atlas/atlas/src/model_io.py", line 193, in load_or_initialize_atlas_model
    model, optimizer, scheduler, retr_optimizer, retr_scheduler, opt_checkpoint, loaded_step = load_atlas_model(
  File "/home/work/atlas/atlas/src/model_io.py", line 153, in load_atlas_model
    optimizer, scheduler, retr_optimizer, retr_scheduler = set_optim(opt, model)
  File "/home/work/atlas/atlas/src/util.py", line 168, in set_optim
    from src.AdamWFP32Copy import AdamWFP32Copy
  File "/home/work/atlas/atlas/src/AdamWFP32Copy.py", line 11, in <module>
    adamw = _adamw.F.adamw
AttributeError: module 'torch.optim.adamw' has no attribute 'F'
To help reproduce this error, here is some information about my working environment:
- PyTorch version: 1.12.0 (I had the same error with 1.13.0)
- Hardware: 4 A100 GPUs (in a single node)
- CUDA version: 11.3
- NVIDIA driver version: 465.19.01
Thank you in advance!
Hi @Duemoo, there was a slight change to the functional methods in PyTorch between v1.11 and v1.12. Downgrading PyTorch to v1.11 should solve your issue!
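In case it helps others who land here, the version condition from the reply above can be turned into a small guard. This is only a sketch: the helper names `parse_major_minor` and `is_affected_version` are made up for illustration, and the 1.12 cutoff is taken from the reply and the versions reported in the issue, not from any check in the Atlas code.

```python
import re


def parse_major_minor(version):
    """Extract (major, minor) from a version string such as '1.12.0' or '1.11.0+cu113'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def is_affected_version(version):
    """Return True for versions at or above 1.12, where the reply says the layout changed."""
    return parse_major_minor(version) >= (1, 12)


# Versions mentioned in this thread.
for candidate in ("1.11.0", "1.12.0", "1.13.0"):
    print(candidate, is_affected_version(candidate))
```

With PyTorch installed, the installed version string could be passed in as `is_affected_version(torch.__version__)` before launching training.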
The AttributeError above seems to occur when the code loads the model from model_path: this calls the set_optim function to set up the optimizer, which in turn imports AdamWFP32Copy.py.
I couldn't find any description of the attribute 'F' in the PyTorch documentation for adamw. However, I guess that the intention of line 11,
adamw = _adamw.F.adamw
is to call the adamw function in the original PyTorch implementation (line 160). Could you provide any hints to solve this issue?