Evaluate: CohereForAI/c4ai-command-r-plus #44

Open · 3 of 6 tasks
ggbetz opened this issue Apr 10, 2024 · 5 comments

ggbetz commented Apr 10, 2024

Check upon issue creation:

  • The model has not been evaluated yet and doesn't show up on the CoT Leaderboard.
  • There is no evaluation request issue for the model in the repo.
  • The parameters below have been adapted and shall be used.

Parameters:

NEXT_MODEL_PATH=CohereForAI/c4ai-command-r-plus
NEXT_MODEL_REVISION=main
NEXT_MODEL_PRECISION=float16
MAX_LENGTH=2048 
GPU_MEMORY_UTILIZATION=0.8
VLLM_SWAP_SPACE=16
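
For reference, these settings roughly correspond to the vLLM engine arguments below. This is a minimal sketch, not the cot-eval code itself (the pipeline actually goes through LangChain's VLLM wrapper, per the traceback later in this thread); tensor_parallel_size and the MAX_LENGTH → max_model_len mapping are assumptions, not part of the parameters above.

# Minimal sketch: how the parameters above map onto vLLM engine arguments.
# tensor_parallel_size is an assumption (a 104B float16 model needs several
# 80 GB GPUs); MAX_LENGTH -> max_model_len is an assumed mapping.
from vllm import LLM

llm = LLM(
    model="CohereForAI/c4ai-command-r-plus",  # NEXT_MODEL_PATH
    revision="main",                          # NEXT_MODEL_REVISION
    dtype="float16",                          # NEXT_MODEL_PRECISION
    max_model_len=2048,                       # MAX_LENGTH (assumed mapping)
    gpu_memory_utilization=0.8,               # GPU_MEMORY_UTILIZATION
    swap_space=16,                            # VLLM_SWAP_SPACE (GiB of CPU swap per GPU)
    tensor_parallel_size=4,                   # assumption: shard across 4 GPUs
)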

ToDos:

  • Run cot-eval pipeline
  • Merge pull requests for cot-eval results datasets (> @ggbetz)
  • Create eval request record to update metadata on leaderboard (> @ggbetz)
yakazimir commented:

Is this 104 billion parameters?

ggbetz commented May 13, 2024

I fear so.

Maybe postpone until we have together.ai support, right?
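
Per the model card, Command R+ has 104B parameters; in float16 (as configured above) the weights alone come to roughly 208 GB. A quick back-of-envelope sketch:

# Rough memory arithmetic only: 104B parameters in float16, ignoring KV cache,
# activations, and CUDA overhead, which add substantially on top.
params = 104e9
bytes_per_param = 2  # float16, per NEXT_MODEL_PRECISION above
weights_gb = params * bytes_per_param / 1e9
print(f"weights alone: ~{weights_gb:.0f} GB")                     # ~208 GB
print(f"80 GB H100s for the weights: ~{weights_gb / 80:.1f}")     # ~2.6, i.e. 4 GPUs in practice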

yakazimir commented:

Yeah, I don't think it will be so easy to run.

yakazimir commented:

I tried to run it on some H100s, which should be fine, but there is probably a vLLM issue here:

Traceback (most recent call last):
  File "/usr/local/bin/cot-eval", line 8, in <module>
    sys.exit(main())
  File "/workspace/cot-eval/src/cot_eval/__main__.py", line 149, in main
    llm = VLLM(
  File "/usr/local/lib/python3.10/dist-packages/langchain_core/load/serializable.py", line 120, in __init__
    super().__init__(**kwargs)
  File "/usr/local/lib/python3.10/dist-packages/pydantic/v1/main.py", line 339, in __init__
    values, fields_set, validation_error = validate_model(__pydantic_self__.__class__, data)
  File "/usr/local/lib/python3.10/dist-packages/pydantic/v1/main.py", line 1102, in validate_model
    values = validator(cls_, values)
  File "/usr/local/lib/python3.10/dist-packages/langchain_community/llms/vllm.py", line 88, in validate_environment
    values["client"] = VLLModel(
  File "/usr/local/lib/python3.10/dist-packages/vllm/entrypoints/llm.py", line 112, in __init__
    self.llm_engine = LLMEngine.from_engine_args(
  File "/usr/local/lib/python3.10/dist-packages/vllm/engine/llm_engine.py", line 196, in from_engine_args
    engine = cls(
  File "/usr/local/lib/python3.10/dist-packages/vllm/engine/llm_engine.py", line 110, in __init__
    self.model_executor = executor_class(model_config, cache_config,
  File "/usr/local/lib/python3.10/dist-packages/vllm/executor/ray_gpu_executor.py", line 62, in __init__
    self._init_workers_ray(placement_group)
  File "/usr/local/lib/python3.10/dist-packages/vllm/executor/ray_gpu_executor.py", line 192, in _init_workers_ray
    self._run_workers(
  File "/usr/local/lib/python3.10/dist-packages/vllm/executor/ray_gpu_executor.py", line 324, in _run_workers
    driver_worker_output = getattr(self.driver_worker,
  File "/usr/local/lib/python3.10/dist-packages/vllm/worker/worker.py", line 107, in load_model
    self.model_runner.load_model()
  File "/usr/local/lib/python3.10/dist-packages/vllm/worker/model_runner.py", line 95, in load_model
    self.model = get_model(
  File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/model_loader.py", line 101, in get_model
    model.load_weights(model_config.model, model_config.download_dir,
  File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/models/commandr.py", line 325, in load_weights
    param = params_dict[name]
KeyError: 'model.layers.19.self_attn.k_norm.weight'

(RayWorkerVllm pid=11138) ERROR 06-09 01:17:35 ray_utils.py:44] Error executing method load_model. This might cause deadlock in distributed execution. [repeated 3x across cluster]
(RayWorkerVllm pid=11138) ERROR 06-09 01:17:35 ray_utils.py:44] Traceback (most recent call last): [repeated 3x across cluster]
(RayWorkerVllm pid=11138) ERROR 06-09 01:17:35 ray_utils.py:44]   File "/usr/local/lib/python3.10/dist-packages/vllm/engine/ray_utils.py", line 37, in execute_method [repeated 3x across cluster]
(RayWorkerVllm pid=11138) ERROR 06-09 01:17:35 ray_utils.py:44]     return executor(*args, **kwargs) [repeated 3x across cluster]
(RayWorkerVllm pid=11138) ERROR 06-09 01:17:35 ray_utils.py:44]   File "/usr/local/lib/python3.10/dist-packages/vllm/worker/model_runner.py", line 95, in load_model [repeated 6x across cluster]
(RayWorkerVllm pid=11138) ERROR 06-09 01:17:35 ray_utils.py:44]     self.model_runner.load_model() [repeated 3x across cluster]
(RayWorkerVllm pid=11138) ERROR 06-09 01:17:35 ray_utils.py:44]     self.model = get_model( [repeated 3x across cluster]
(RayWorkerVllm pid=11138) ERROR 06-09 01:17:35 ray_utils.py:44]   File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/model_loader.py", line 101, in get_model [repeated 3x across cluster]
(RayWorkerVllm pid=11138) ERROR 06-09 01:17:35 ray_utils.py:44]     model.load_weights(model_config.model, model_config.download_dir, [repeated 3x across cluster]
(RayWorkerVllm pid=11138) ERROR 06-09 01:17:35 ray_utils.py:44]   File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/models/commandr.py", line 325, in load_weights [repeated 3x across cluster]
(RayWorkerVllm pid=11138) ERROR 06-09 01:17:35 ray_utils.py:44]     param = params_dict[name] [repeated 3x across cluster]
(RayWorkerVllm pid=11138) ERROR 06-09 01:17:35 ray_utils.py:44] KeyError: 'model.layers.19.self_attn.k_norm.weight' [repeated 3x across cluster]
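
The KeyError suggests the checkpoint ships QK-norm weights (self_attn.q_norm / self_attn.k_norm) that the vLLM build inside the image does not register for the Command R architecture; upgrading to a vLLM release with Command R+ support would likely resolve it (an inference from the missing-parameter pattern, not confirmed in this thread). A minimal sketch to confirm the checkpoint actually declares QK-norm, assuming a transformers version that knows the "cohere" model type:

# Minimal sketch: check whether the config enables QK-norm, which is what
# produces weights like model.layers.*.self_attn.k_norm.weight.
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("CohereForAI/c4ai-command-r-plus", revision="main")
print("use_qk_norm:", getattr(cfg, "use_qk_norm", False))
# If this prints True, the vLLM version in the image must support QK-norm for
# Command R+; otherwise load_weights fails with the KeyError shown above.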

yakazimir commented:

There is a quantized version we could try; see the note here: https://huggingface.co/CohereForAI/c4ai-command-r-plus
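
If the quantized route is tried, a minimal smoke test outside the cot-eval/vLLM pipeline might look like the following. The repo id CohereForAI/c4ai-command-r-plus-4bit is an assumption based on the model card's note, and bitsandbytes plus accelerate are assumed to be installed.

# Hypothetical smoke test of the 4-bit checkpoint via transformers + bitsandbytes,
# not via the cot-eval/vLLM pipeline. The repo id below is an assumption.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "CohereForAI/c4ai-command-r-plus-4bit"  # assumed quantized repo
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "Hello, how are you?"}]
input_ids = tok.apply_chat_template(
    messages, tokenize=True, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(input_ids, max_new_tokens=64)
print(tok.decode(output[0]))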
