ValueError: Cannot attend to 3063, block size is only 2048 #1387

Closed
Gooooooogo opened this issue May 5, 2024 · 1 comment

Comments

Gooooooogo commented May 5, 2024

{'checkpoint_dir': PosixPath('checkpoints/TinyLlama/TinyLlama-1.1B-Chat-v1.0'),
 'data': JSON(json_path=PosixPath('/home/jwan3704/litgpt/data/math/algebra.json'), mask_prompt=False, val_split_fraction=0.0, prompt_style=<litgpt.prompts.Alpaca object at 0x7efbdd21d550>, ignore_index=-100, seed=42, num_workers=4),
 'devices': 1,
 'eval': EvalArgs(interval=100, max_new_tokens=100, max_iters=100, initial_validation=False),
 'logger_name': 'csv',
 'lora_alpha': 16,
 'lora_dropout': 0.05,
 'lora_head': False,
 'lora_key': False,
 'lora_mlp': False,
 'lora_projection': False,
 'lora_query': True,
 'lora_r': 8,
 'lora_value': True,
 'out_dir': PosixPath('out/model_1'),
 'precision': None,
 'quantize': None,
 'seed': 1337,
 'train': TrainArgs(save_interval=10000, log_interval=1, global_batch_size=16, micro_batch_size=1, lr_warmup_steps=100, lr_warmup_fraction=None, epochs=1, max_tokens=None, max_steps=None, max_seq_length=None, tie_embeddings=None, learning_rate=0.0003, weight_decay=0.02, beta1=0.9, beta2=0.95, max_norm=None, min_lr=6e-05)}
Using bfloat16 Automatic Mixed Precision (AMP)
/share/home/jwan3704/litgpt-venv/lib/python3.9/site-packages/torch/utils/data/dataset.py:449: UserWarning: Length of split at index 1 is 0. This might result in an empty dataset.
  warnings.warn(f"Length of split at index {i} is 0. "
Seed set to 1337
Number of trainable parameters: 1,126,400
Number of non-trainable parameters: 1,100,048,384
Traceback (most recent call last):
  File "[/home/jwan3704/litgpt-venv/bin/litgpt](https://vscode-remote+ssh-002dremote-002b172-002e17-002e34-002e153.vscode-resource.vscode-cdn.net/home/jwan3704/litgpt-venv/bin/litgpt)", line 8, in <module>
    sys.exit(main())
  File "[/share/home/jwan3704/litgpt-venv/lib/python3.9/site-packages/litgpt/__main__.py](https://vscode-remote+ssh-002dremote-002b172-002e17-002e34-002e153.vscode-resource.vscode-cdn.net/share/home/jwan3704/litgpt-venv/lib/python3.9/site-packages/litgpt/__main__.py)", line 143, in main
    fn(**kwargs)
  File "[/share/home/jwan3704/litgpt-venv/lib/python3.9/site-packages/litgpt/finetune/lora.py](https://vscode-remote+ssh-002dremote-002b172-002e17-002e34-002e153.vscode-resource.vscode-cdn.net/share/home/jwan3704/litgpt-venv/lib/python3.9/site-packages/litgpt/finetune/lora.py)", line 144, in setup
    fabric.launch(main, devices, seed, config, data, checkpoint_dir, out_dir, train, eval)
  File "[/share/home/jwan3704/litgpt-venv/lib/python3.9/site-packages/lightning/fabric/fabric.py](https://vscode-remote+ssh-002dremote-002b172-002e17-002e34-002e153.vscode-resource.vscode-cdn.net/share/home/jwan3704/litgpt-venv/lib/python3.9/site-packages/lightning/fabric/fabric.py)", line 845, in launch
    return self._wrap_and_launch(function, self, *args, **kwargs)
  File "[/share/home/jwan3704/litgpt-venv/lib/python3.9/site-packages/lightning/fabric/fabric.py](https://vscode-remote+ssh-002dremote-002b172-002e17-002e34-002e153.vscode-resource.vscode-cdn.net/share/home/jwan3704/litgpt-venv/lib/python3.9/site-packages/lightning/fabric/fabric.py)", line 931, in _wrap_and_launch
    return to_run(*args, **kwargs)
  File "[/share/home/jwan3704/litgpt-venv/lib/python3.9/site-packages/lightning/fabric/fabric.py](https://vscode-remote+ssh-002dremote-002b172-002e17-002e34-002e153.vscode-resource.vscode-cdn.net/share/home/jwan3704/litgpt-venv/lib/python3.9/site-packages/lightning/fabric/fabric.py)", line 936, in _wrap_with_setup
    return to_run(*args, **kwargs)
  File "[/share/home/jwan3704/litgpt-venv/lib/python3.9/site-packages/litgpt/finetune/lora.py](https://vscode-remote+ssh-002dremote-002b172-002e17-002e34-002e153.vscode-resource.vscode-cdn.net/share/home/jwan3704/litgpt-venv/lib/python3.9/site-packages/litgpt/finetune/lora.py)", line 197, in main
    fit(
  File "[/share/home/jwan3704/litgpt-venv/lib/python3.9/site-packages/litgpt/finetune/lora.py](https://vscode-remote+ssh-002dremote-002b172-002e17-002e34-002e153.vscode-resource.vscode-cdn.net/share/home/jwan3704/litgpt-venv/lib/python3.9/site-packages/litgpt/finetune/lora.py)", line 249, in fit
    model.max_seq_length = min(longest_seq_length, train.max_seq_length or float("inf"))
  File "[/share/home/jwan3704/litgpt-venv/lib/python3.9/site-packages/lightning/fabric/wrappers.py](https://vscode-remote+ssh-002dremote-002b172-002e17-002e34-002e153.vscode-resource.vscode-cdn.net/share/home/jwan3704/litgpt-venv/lib/python3.9/site-packages/lightning/fabric/wrappers.py)", line 272, in __setattr__
    setattr(original_module, name, value)
  File "[/share/home/jwan3704/litgpt-venv/lib/python3.9/site-packages/torch/nn/modules/module.py](https://vscode-remote+ssh-002dremote-002b172-002e17-002e34-002e153.vscode-resource.vscode-cdn.net/share/home/jwan3704/litgpt-venv/lib/python3.9/site-packages/torch/nn/modules/module.py)", line 1747, in __setattr__
    super().__setattr__(name, value)
  File "[/share/home/jwan3704/litgpt-venv/lib/python3.9/site-packages/litgpt/model.py](https://vscode-remote+ssh-002dremote-002b172-002e17-002e34-002e153.vscode-resource.vscode-cdn.net/share/home/jwan3704/litgpt-venv/lib/python3.9/site-packages/litgpt/model.py)", line 47, in max_seq_length
    raise ValueError(f"Cannot attend to {value}, block size is only {self.config.block_size}")
ValueError: Cannot attend to 3063, block size is only 2048
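For context on what fails here: fit() in litgpt/finetune/lora.py sets model.max_seq_length to the longest sequence found in the training data (3063 tokens in this dataset), and the max_seq_length setter in litgpt/model.py rejects any value larger than the model's block_size, which is 2048 for TinyLlama-1.1B. A minimal sketch of that guard, using a simplified stand-in Config class (not the actual litgpt source):

import torch

class Config:
    block_size = 2048  # context window of TinyLlama-1.1B-Chat-v1.0

class GPT(torch.nn.Module):
    # Simplified stand-in for litgpt's GPT; only the block-size guard is sketched.
    def __init__(self, config):
        super().__init__()
        self.config = config
        self._max_seq_length = config.block_size

    @property
    def max_seq_length(self):
        return self._max_seq_length

    @max_seq_length.setter
    def max_seq_length(self, value):
        # The KV cache and positional-embedding tables are sized for block_size,
        # so a longer sequence cannot be attended to.
        if value > self.config.block_size:
            raise ValueError(f"Cannot attend to {value}, block size is only {self.config.block_size}")
        self._max_seq_length = value

model = GPT(Config())
model.max_seq_length = 3063  # raises the ValueError shown in the traceback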
rasbt (Collaborator) commented May 5, 2024

We should probably change the defaults, but for the time being, can you try passing --train.max_seq_length 2048?
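For reference, with the checkpoint path from the config dump above, the suggested workaround would look roughly like the line below (the litgpt finetune lora spelling is assumed from the traceback's entry point, and the remaining flags of the original invocation are elided):

litgpt finetune lora --checkpoint_dir checkpoints/TinyLlama/TinyLlama-1.1B-Chat-v1.0 --train.max_seq_length 2048

This caps sequences at the model's 2048-token context window instead of letting the longest training sample (3063 tokens) set the limit, so over-long samples are truncated to fit rather than aborting the run.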
