Unable to use learner.fit() because of Apex dependencies #2

Closed
DGMC90 opened this issue May 20, 2019 · 4 comments

Comments

DGMC90 commented May 20, 2019

Hi, I'm trying to follow the notebook example provided in this repo with some of my own data. However, when I go to fit the model, I get the following:


ModuleNotFoundError Traceback (most recent call last)
~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/fast_bert/learner.py in get_optimizer(self, lr, num_train_steps, schedule_type)
197 try:
--> 198 from apex.optimizers import FP16_Optimizer
199 from apex.optimizers import FusedAdam

ModuleNotFoundError: No module named 'apex.optimizers'


I have installed Apex correctly using NVIDIA's documentation, and the Apex directory appears the same as in their repo, which leads me to think it's a fast-bert issue. I am using an AWS instance (ml.p3.8xlarge), and my environment is conda_pytorch_p36.

Thanks in advance for any help,

Darren

@DGMC90 DGMC90 closed this as completed May 20, 2019

DGMC90 commented May 20, 2019

Closed as this seems to be an issue with Apex itself, not fast-bert. Sorry!

kaushaltrivedi (Collaborator) commented

Did you install it with the cpp_ext and cuda_ext flags?


DGMC90 commented May 20, 2019

@kaushaltrivedi yes, and the cloned repo seems to not be missing anything.
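
For anyone landing here with the same error: one way to narrow it down is to check which `apex` Python is actually picking up, since a leftover install or an unrelated package with the same name can shadow NVIDIA's. Below is a minimal diagnostic sketch using only the standard library. `describe_module` is a hypothetical helper name (not part of fast-bert or Apex), and the submodule names are the ones that appear in the tracebacks in this thread.

```python
import importlib.util


def describe_module(name):
    """Return (found, origin) for a dotted module name.

    Note: for a dotted name, find_spec imports the parent package(s),
    but not the target module itself.
    """
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        # Raised when a parent package is missing or has no valid spec.
        return (False, None)
    if spec is None:
        return (False, None)
    origin = spec.origin
    # Namespace packages (a bare directory without __init__.py) have no
    # real file origin; report the directories instead.
    if origin in (None, "namespace") and spec.submodule_search_locations:
        origin = "namespace package at " + ", ".join(spec.submodule_search_locations)
    return (True, origin)


# Submodules that fast-bert tried to import in the tracebacks above.
for name in ("apex", "apex.optimizers", "apex.amp"):
    found, origin = describe_module(name)
    print(f"{name}: {'found' if found else 'MISSING'} ({origin})")
```

If `apex` is reported as a namespace package, or its path points somewhere other than the environment's `site-packages`, the import is probably resolving to something other than the compiled install.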


valeriobasile commented Aug 5, 2019

Hi, I'm trying to run fast-bert in a Colab notebook. After several attempts, I was able to load apex. I installed it from the GitHub repository with:

! git clone https://github.com/NVIDIA/apex.git
% cd apex
! pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" . --user
% cd ..

However, upon running learner.fit() the apex module "amp" cannot be found:

ImportError Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/fast_bert/learner_cls.py in fit(self, epochs, lr, validate, schedule_type)
225 try:
--> 226 from apex import amp
227 except ImportError:

ImportError: cannot import name 'amp'

There is a related issue on the apex repo, NVIDIA/apex#259, but I couldn't make it work following those instructions, probably because they are not specific to Colab.

Perhaps a solution like this could alleviate the issue? https://medium.com/the-artificial-impostor/use-nvidia-apex-for-easy-mixed-precision-training-in-pytorch-46841c6eed8c

UPDATE:

I made it work. I suspect something from the previous installations of apex was lingering, so I reset the runtime and reconnected (actually, I closed and refreshed the page altogether).
In the end, what worked was the following lines to install apex from GitHub:

!pip uninstall apex
% rm -rf /content/apex
! git clone https://github.com/NVIDIA/apex.git
! pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" /content/apex
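
To confirm that a reinstall like the one above took effect, it may help to try the two imports from the tracebacks directly in a fresh runtime before calling learner.fit(). This is only a sketch: `try_import` is a hypothetical helper, not something fast-bert or Apex provides, and it reports the error text rather than raising.

```python
import importlib


def try_import(name):
    """Import a module by dotted name; return (module, None) or (None, error text)."""
    try:
        return importlib.import_module(name), None
    except ImportError as exc:
        # ModuleNotFoundError is a subclass of ImportError, so both are caught.
        return None, str(exc)


# The two imports that failed earlier in this thread.
for name in ("apex.amp", "apex.optimizers"):
    module, error = try_import(name)
    if module is None:
        print(f"{name}: import failed ({error})")
    else:
        print(f"{name}: OK, loaded from {getattr(module, '__file__', 'unknown location')}")
```

If either import still fails after the reinstall, restarting the runtime first is worth a try, since modules from an earlier install can stay loaded in the running session.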
