I get an error when I use the Elon example #12

Closed
wac81 opened this issue Mar 5, 2023 · 6 comments

wac81 commented Mar 5, 2023

Traceback (most recent call last):
File "/data/TextRL/train2.py", line 46, in <module>
pfrl.experiments.train_agent_with_evaluation(
File "/data/TextRL/env/lib/python3.8/site-packages/pfrl/experiments/train_agent.py", line 208, in train_agent_with_evaluation
eval_stats_history = train_agent(
File "/data/TextRL/env/lib/python3.8/site-packages/pfrl/experiments/train_agent.py", line 57, in train_agent
action = agent.act(obs)
File "/data/TextRL/env/lib/python3.8/site-packages/pfrl/agent.py", line 161, in act
return self.batch_act([obs])[0]
File "/data/TextRL/textrl/actor.py", line 216, in batch_act
return self._batch_act_train(batch_obs)
File "/data/TextRL/env/lib/python3.8/site-packages/pfrl/agents/ppo.py", line 735, in _batch_act_train
action_distrib, batch_value = self.model(b_state)
File "/data/TextRL/env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/data/TextRL/env/lib/python3.8/site-packages/pfrl/nn/branched.py", line 30, in forward
return tuple(mod(*args, **kwargs) for mod in self.child_modules)
File "/data/TextRL/env/lib/python3.8/site-packages/pfrl/nn/branched.py", line 30, in <genexpr>
return tuple(mod(*args, **kwargs) for mod in self.child_modules)
File "/data/TextRL/env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/data/TextRL/env/lib/python3.8/site-packages/torch/nn/modules/container.py", line 204, in forward
input = module(input)
File "/data/TextRL/env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/data/TextRL/env/lib/python3.8/site-packages/accelerate/hooks.py", line 158, in new_forward
output = old_forward(*args, **kwargs)
File "/data/TextRL/env/lib/python3.8/site-packages/torch/nn/modules/linear.py", line 114, in forward
return F.linear(input, self.weight, self.bias)
RuntimeError: expected scalar type Float but found BFloat16

Owner

voidful commented Mar 5, 2023

I tested on Colab and everything worked fine. It looks like you're using bf16. May I know what model you're using?
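
A `Float` vs `BFloat16` error like the one in the traceback usually means some layers hold bfloat16 weights while others (or the inputs) are float32. Here is a minimal sketch of how one might check for that and cast everything to float32. The two `nn.Linear` layers are hypothetical stand-ins for a bf16 language model and a float32 head; this is not TextRL's actual model code.

```python
import torch
import torch.nn as nn


def param_dtypes(module: nn.Module) -> set:
    """Return the set of parameter dtypes found in a module."""
    return {p.dtype for p in module.parameters()}


# Hypothetical stand-in for a backbone whose weights were loaded in bfloat16.
backbone = nn.Linear(8, 8).to(torch.bfloat16)
# Hypothetical stand-in for a freshly created float32 head on top of it.
head = nn.Linear(8, 4)
model = nn.Sequential(backbone, head)

# Two different dtypes here is the warning sign.
print("before cast:", param_dtypes(model))

x = torch.randn(2, 8)  # float32 input
try:
    model(x)
except RuntimeError as err:
    # The exact message depends on the PyTorch version.
    print("mixed dtypes failed:", err)

# Cast every parameter and buffer to float32, then run again.
model = model.float()
out = model(x)
print("after cast:", param_dtypes(model), out.shape)
```

If the checkpoint is loaded through Hugging Face `from_pretrained`, passing `torch_dtype=torch.float32` at load time may avoid the mismatch in the first place, assuming the GPU has enough memory for full precision.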

Author

wac81 commented Mar 5, 2023

I use this model:
checkpoint = "bigscience/bloom-560m"

Author

wac81 commented Mar 5, 2023

And if I use gpt2, I get a new error like this:
actions = torch.tensor([b["action"] for b in dataset], device=device)
Traceback (most recent call last):
File "/home/wac/.pyenv/versions/3.8.12/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/wac/.pyenv/versions/3.8.12/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/wac/.vscode-server/extensions/ms-python.python-2022.8.1/pythonFiles/lib/python/debugpy/__main__.py", line 45, in <module>
cli.main()
File "/home/wac/.vscode-server/extensions/ms-python.python-2022.8.1/pythonFiles/lib/python/debugpy/../debugpy/server/cli.py", line 444, in main
run()
File "/home/wac/.vscode-server/extensions/ms-python.python-2022.8.1/pythonFiles/lib/python/debugpy/../debugpy/server/cli.py", line 285, in run_file
runpy.run_path(target_as_str, run_name=compat.force_str("__main__"))
File "/home/wac/.pyenv/versions/3.8.12/lib/python3.8/runpy.py", line 265, in run_path
return _run_module_code(code, init_globals, run_name,
File "/home/wac/.pyenv/versions/3.8.12/lib/python3.8/runpy.py", line 97, in _run_module_code
_run_code(code, mod_globals, init_globals,
File "/home/wac/.pyenv/versions/3.8.12/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/data/TextRL/train_bloom.py", line 47, in <module>
agent.observe(obs, reward, done, reset)
File "/data/TextRL/env/lib/python3.8/site-packages/pfrl/agent.py", line 164, in observe
self.batch_observe([obs], [reward], [done], [reset])
File "/data/TextRL/env/lib/python3.8/site-packages/pfrl/agents/ppo.py", line 684, in batch_observe
self._batch_observe_train(batch_obs, batch_reward, batch_done, batch_reset)
File "/data/TextRL/env/lib/python3.8/site-packages/pfrl/agents/ppo.py", line 810, in _batch_observe_train
self._update_if_dataset_is_ready()
File "/data/TextRL/textrl/actor.py", line 194, in _update_if_dataset_is_ready
self._update(dataset)
File "/data/TextRL/env/lib/python3.8/site-packages/pfrl/agents/ppo.py", line 490, in _update
distribs, vs_pred = self.model(states)
File "/data/TextRL/env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/data/TextRL/env/lib/python3.8/site-packages/pfrl/nn/branched.py", line 30, in forward
return tuple(mod(*args, **kwargs) for mod in self.child_modules)
File "/data/TextRL/env/lib/python3.8/site-packages/pfrl/nn/branched.py", line 30, in <genexpr>
return tuple(mod(*args, **kwargs) for mod in self.child_modules)
File "/data/TextRL/env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/data/TextRL/env/lib/python3.8/site-packages/torch/nn/modules/container.py", line 204, in forward
input = module(input)
File "/data/TextRL/env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/data/TextRL/textrl/actor.py", line 153, in forward
return torch.distributions.Categorical(logits=logits)
File "/data/TextRL/env/lib/python3.8/site-packages/torch/distributions/categorical.py", line 66, in __init__
super(Categorical, self).__init__(batch_shape, validate_args=validate_args)
File "/data/TextRL/env/lib/python3.8/site-packages/torch/distributions/distribution.py", line 56, in __init__
raise ValueError(
ValueError: Expected parameter logits (Tensor of shape (3, 2, 50257)) of distribution Categorical(logits: torch.Size([3, 2, 50257])) to satisfy the constraint IndependentConstraint(Real(), 1), but found invalid values:
tensor([[[nan, nan, nan, ..., nan, nan, nan],
         [nan, nan, nan, ..., nan, nan, nan]],

        [[nan, nan, nan, ..., nan, nan, nan],
         [nan, nan, nan, ..., nan, nan, nan]],

        [[nan, nan, nan, ..., nan, nan, nan],
         [nan, nan, nan, ..., nan, nan, nan]]], device='cuda:0',
       grad_fn=<SubBackward0>)
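
When every logit is NaN right after a PPO update, as in the traceback above, one plausible cause is that the update already pushed some weights to NaN or inf. Here is a small sketch of a check one could run after each update. The helper name `first_nonfinite_param` and the toy `nn.Linear` layers are assumptions for illustration, not part of TextRL or pfrl.

```python
import torch
import torch.nn as nn


def first_nonfinite_param(module: nn.Module):
    """Return the name of the first parameter containing NaN or inf, or None."""
    for name, param in module.named_parameters():
        if not torch.isfinite(param).all():
            return name
    return None


# A healthy layer: all weights are finite.
healthy = nn.Linear(4, 3)

# A layer with one weight overwritten by NaN, standing in for a bad update.
broken = nn.Linear(4, 3)
with torch.no_grad():
    broken.weight[0, 0] = float("nan")

print("healthy:", first_nonfinite_param(healthy))
print("broken:", first_nonfinite_param(broken))
```

If the check fires, the logits will be NaN no matter what the input is, so the thing to look at is the optimizer step (learning rate, gradient clipping) rather than the `Categorical` call itself.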

Author

wac81 commented Mar 5, 2023

I followed your Elon Musk example.

Owner

voidful commented Mar 5, 2023

Be careful with the learning rate when fine-tuning via RL; setting a lower learning rate should help.
Here are the Colab examples; both models work:

Colab example: bigscience/bloom-560m

Colab example: huggingtweets/elonmusk
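
To illustrate why a lower learning rate can prevent NaN values, here is a toy sketch in plain Python. It runs gradient descent on f(x) = x**2, which is only a stand-in for the real PPO objective; the function name and the two learning rates are chosen for illustration. With a step size that is too large, the iterate overshoots further each step, overflows to inf, and then turns into NaN.

```python
import math


def run_gradient_descent(lr, steps=2000, x0=1.0):
    """Minimize f(x) = x**2 with plain gradient descent; return the final x."""
    x = x0
    for _ in range(steps):
        grad = 2.0 * x       # derivative of x**2
        x = x - lr * grad    # each step multiplies x by (1 - 2 * lr)
    return x


small = run_gradient_descent(lr=0.1)  # |1 - 2*lr| < 1, so x shrinks toward 0
large = run_gradient_descent(lr=1.5)  # |1 - 2*lr| > 1, so x grows until it overflows

print("lr=0.1 finite:", math.isfinite(small))
print("lr=1.5 is NaN:", math.isnan(large))
```

A real model diverges less cleanly than this, but the pattern is the same: once a weight becomes inf or NaN, every later forward pass yields NaN logits.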

voidful closed this as completed Mar 5, 2023

Author

wac81 commented Mar 10, 2023

Thank you. I found that my error came from loading the model with the CausalLM loader.
