
MPS backend out of memory evaluating fine-tuned Mixtral-8x7B-Instruct-v0.1 on a machine with 100+ GB #1835

Closed
chimezie opened this issue May 13, 2024 · 2 comments

@chimezie

I'm trying to evaluate a locally fine-tuned, unquantized Mixtral-8x7B-Instruct-v0.1 model on an Apple Mac Studio M1 Ultra with 128 GB of memory via the following command line:

lm_eval --model hf --model_args pretrained=/path/to/model,dtype="float" \
        --tasks medqa_4options \
        --device mps \
        --batch_size 1
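
For reference, the same run expressed through the harness's Python API (a rough sketch; the exact keyword arguments may vary between lm-evaluation-harness versions, and /path/to/model is a placeholder):

import lm_eval

# Roughly equivalent to the CLI invocation above.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=/path/to/model,dtype=float",
    tasks=["medqa_4options"],
    device="mps",
    batch_size=1,
)
print(results["results"])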

Note that the batch_size is 1 because, despite having over 100 GB of memory, which should be plenty for this model, I get an MPS backend out-of-memory error even when I specify auto for the batch_size:

2024-05-13:11:49:02,913 INFO     [__main__.py:254] Verbosity set to INFO
2024-05-13:11:49:05,262 INFO     [__main__.py:341] Selected Tasks: ['medqa_4options']
2024-05-13:11:49:05,264 INFO     [evaluator.py:141] Setting random seed to 0 | Setting numpy seed to 1234 | Setting torch manual seed to 1234
2024-05-13:11:49:05,264 INFO     [evaluator.py:178] Initializing hf model, with arguments: {'pretrained': '/Users/oori/medical_llm/raw_models/mlx/MrGrammaticaOntology-Mixtral-8x7B-Instruct-v0.1-clinical-problems-0.6.0', 'dtype': 'float'}
2024-05-13:11:49:05,276 INFO     [huggingface.py:165] Using device 'mps'
Loading checkpoint shards:  83%|█████████████████████████████████████████████████████████████████████████████████████████████████████████                     | 15/18 [01:43<00:20,  6.89s/it]
Traceback (most recent call last):
  File "/path/to/mmlu-eval/bin/lm_eval", line 8, in <module>
    sys.exit(cli_evaluate())
             ^^^^^^^^^^^^^^
[..snip..]
  File "/path/to/lm_eval/api/model.py", line 134, in create_from_arg_string
    return cls(**args, **args2)
           ^^^^^^^^^^^^^^^^^^^^
  File "/path/to/lm_eval/models/huggingface.py", line 204, in __init__
    self._create_model(
  File "/Users/oori/medical_llm/lm-evaluation-harness/lm_eval/models/huggingface.py", line 547, in _create_model
    self._model = self.AUTO_MODEL_CLASS.from_pretrained(
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[..snip..]
  File "/path/to/python3.11/site-packages/transformers/modeling_utils.py", line 812, in _load_state_dict_into_meta_model
    set_module_tensor_to_device(model, param_name, param_device, **set_module_kwargs)
  File "/path/to/python3.11/site-packages/accelerate/utils/modeling.py", line 387, in set_module_tensor_to_device
    new_value = value.to(device)
                ^^^^^^^^^^^^^^^^
RuntimeError: MPS backend out of memory (MPS allocated: 163.01 GB, other allocations: 384.00 KB, max allowed: 163.20 GB). Tried to allocate 224.00 MB on private pool. Use PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.0 to disable upper limit for memory allocations (may cause system failure).

I'm wary of setting PYTORCH_MPS_HIGH_WATERMARK_RATIO, since I should have plenty of memory for this and I don't want to crash the server.

I can run the evaluation with other models (Llama 3 8B, for instance) without an issue.
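
For what it's worth, here is a rough back-of-the-envelope check of the weight footprint by dtype (a sketch only; the ~46.7B total parameter count for Mixtral-8x7B is the commonly quoted figure, and dtype="float" should resolve to torch.float32):

# Estimated memory for the model weights alone, by dtype.
# The ~46.7B parameter count is an assumed figure, not measured from the checkpoint.
n_params = 46.7e9
for name, bytes_per_param in [("float32", 4), ("float16/bfloat16", 2)]:
    print(f"{name}: ~{n_params * bytes_per_param / 1e9:.0f} GB")
# float32: ~187 GB
# float16/bfloat16: ~93 GB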

@LSinev
Contributor

LSinev commented May 13, 2024

The /path/to/python3.11/site-packages/accelerate part of the traceback shows that the problem is in the accelerate package, not in lm-evaluation-harness, so it should probably be investigated with its developers.

@chimezie
Author

Thanks. I have created an issue with accelerate.
