Evaluate: CohereForAI/c4ai-command-r-plus #44

Open · 3 of 6 tasks
ggbetz opened this issue Apr 10, 2024 · 5 comments

ggbetz commented Apr 10, 2024

Check upon issue creation:

  • The model has not been evaluated yet and doesn't show up on the CoT Leaderboard.
  • There is no evaluation request issue for the model in the repo.
  • The parameters below have been adapted and shall be used.

Parameters:

NEXT_MODEL_PATH=CohereForAI/c4ai-command-r-plus
NEXT_MODEL_REVISION=main
NEXT_MODEL_PRECISION=float16
MAX_LENGTH=2048 
GPU_MEMORY_UTILIZATION=0.8
VLLM_SWAP_SPACE=16
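
For reference, these settings roughly correspond to the vLLM engine arguments below. This is a minimal sketch, not the cot-eval code itself (the pipeline actually goes through LangChain's VLLM wrapper, per the traceback later in this thread); tensor_parallel_size and the MAX_LENGTH → max_model_len mapping are assumptions, not part of the parameters above.

# Minimal sketch: how the parameters above map onto vLLM engine arguments.
# tensor_parallel_size is an assumption (a 104B float16 model needs several
# 80 GB GPUs); MAX_LENGTH -> max_model_len is an assumed mapping.
from vllm import LLM

llm = LLM(
    model="CohereForAI/c4ai-command-r-plus",  # NEXT_MODEL_PATH
    revision="main",                          # NEXT_MODEL_REVISION
    dtype="float16",                          # NEXT_MODEL_PRECISION
    max_model_len=2048,                       # MAX_LENGTH (assumed mapping)
    gpu_memory_utilization=0.8,               # GPU_MEMORY_UTILIZATION
    swap_space=16,                            # VLLM_SWAP_SPACE (GiB of CPU swap per GPU)
    tensor_parallel_size=4,                   # assumption: shard across 4 GPUs
)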

ToDos:

  • Run cot-eval pipeline
  • Merge pull requests for cot-eval results datasets (> @ggbetz)
  • Create eval request record to update metadata on leaderboard (> @ggbetz)
yakazimir commented:

Is this 104 billion parameters?

ggbetz commented May 13, 2024

I fear so.

Maybe postpone until we have together.ai support, right?
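
Per the model card, Command R+ has 104B parameters; in float16 (as configured above) the weights alone come to roughly 208 GB. A quick back-of-envelope sketch:

# Rough memory arithmetic only: 104B parameters in float16, ignoring KV cache,
# activations, and CUDA overhead, which add substantially on top.
params = 104e9
bytes_per_param = 2  # float16, per NEXT_MODEL_PRECISION above
weights_gb = params * bytes_per_param / 1e9
print(f"weights alone: ~{weights_gb:.0f} GB")                     # ~208 GB
print(f"80 GB H100s for the weights: ~{weights_gb / 80:.1f}")     # ~2.6, i.e. 4 GPUs in practice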

yakazimir commented:

Yeah, I don't think it will be so easy to run.

yakazimir commented:

I tried to run it on some H100s, which should be fine, but there is probably a vLLM issue here:

Traceback (most recent call last):
  File "/usr/local/bin/cot-eval", line 8, in <module>
    sys.exit(main())
  File "/workspace/cot-eval/src/cot_eval/__main__.py", line 149, in main
    llm = VLLM(
  File "/usr/local/lib/python3.10/dist-packages/langchain_core/load/serializable.py", line 120, in __init__
    super().__init__(**kwargs)
  File "/usr/local/lib/python3.10/dist-packages/pydantic/v1/main.py", line 339, in __init__
    values, fields_set, validation_error = validate_model(__pydantic_self__.__class__, data)
  File "/usr/local/lib/python3.10/dist-packages/pydantic/v1/main.py", line 1102, in validate_model
    values = validator(cls_, values)
  File "/usr/local/lib/python3.10/dist-packages/langchain_community/llms/vllm.py", line 88, in validate_environment
    values["client"] = VLLModel(
  File "/usr/local/lib/python3.10/dist-packages/vllm/entrypoints/llm.py", line 112, in __init__
    self.llm_engine = LLMEngine.from_engine_args(
  File "/usr/local/lib/python3.10/dist-packages/vllm/engine/llm_engine.py", line 196, in from_engine_args
    engine = cls(
  File "/usr/local/lib/python3.10/dist-packages/vllm/engine/llm_engine.py", line 110, in __init__
    self.model_executor = executor_class(model_config, cache_config,
  File "/usr/local/lib/python3.10/dist-packages/vllm/executor/ray_gpu_executor.py", line 62, in __init__
    self._init_workers_ray(placement_group)
  File "/usr/local/lib/python3.10/dist-packages/vllm/executor/ray_gpu_executor.py", line 192, in _init_workers_ray
    self._run_workers(
  File "/usr/local/lib/python3.10/dist-packages/vllm/executor/ray_gpu_executor.py", line 324, in _run_workers
    driver_worker_output = getattr(self.driver_worker,
  File "/usr/local/lib/python3.10/dist-packages/vllm/worker/worker.py", line 107, in load_model
    self.model_runner.load_model()
  File "/usr/local/lib/python3.10/dist-packages/vllm/worker/model_runner.py", line 95, in load_model
    self.model = get_model(
  File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/model_loader.py", line 101, in get_model
    model.load_weights(model_config.model, model_config.download_dir,
  File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/models/commandr.py", line 325, in load_weights
    param = params_dict[name]
KeyError: 'model.layers.19.self_attn.k_norm.weight'

(RayWorkerVllm pid=11138) ERROR 06-09 01:17:35 ray_utils.py:44] Error executing method load_model. This might cause deadlock in distributed execution. [repeated 3x across cluster]
(RayWorkerVllm pid=11138) ERROR 06-09 01:17:35 ray_utils.py:44] Traceback (most recent call last): [repeated 3x across cluster]
(RayWorkerVllm pid=11138) ERROR 06-09 01:17:35 ray_utils.py:44]   File "/usr/local/lib/python3.10/dist-packages/vllm/engine/ray_utils.py", line 37, in execute_method [repeated 3x across cluster]
(RayWorkerVllm pid=11138) ERROR 06-09 01:17:35 ray_utils.py:44]     return executor(*args, **kwargs) [repeated 3x across cluster]
(RayWorkerVllm pid=11138) ERROR 06-09 01:17:35 ray_utils.py:44]   File "/usr/local/lib/python3.10/dist-packages/vllm/worker/model_runner.py", line 95, in load_model [repeated 6x across cluster]
(RayWorkerVllm pid=11138) ERROR 06-09 01:17:35 ray_utils.py:44]     self.model_runner.load_model() [repeated 3x across cluster]
(RayWorkerVllm pid=11138) ERROR 06-09 01:17:35 ray_utils.py:44]     self.model = get_model( [repeated 3x across cluster]
(RayWorkerVllm pid=11138) ERROR 06-09 01:17:35 ray_utils.py:44]   File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/model_loader.py", line 101, in get_model [repeated 3x across cluster]
(RayWorkerVllm pid=11138) ERROR 06-09 01:17:35 ray_utils.py:44]     model.load_weights(model_config.model, model_config.download_dir, [repeated 3x across cluster]
(RayWorkerVllm pid=11138) ERROR 06-09 01:17:35 ray_utils.py:44]   File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/models/commandr.py", line 325, in load_weights [repeated 3x across cluster]
(RayWorkerVllm pid=11138) ERROR 06-09 01:17:35 ray_utils.py:44]     param = params_dict[name] [repeated 3x across cluster]
(RayWorkerVllm pid=11138) ERROR 06-09 01:17:35 ray_utils.py:44] KeyError: 'model.layers.19.self_attn.k_norm.weight' [repeated 3x across cluster]
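
The KeyError suggests the checkpoint ships QK-norm weights (self_attn.q_norm / self_attn.k_norm) that the vLLM build inside the image does not register for the Command R architecture; upgrading to a vLLM release with Command R+ support would likely resolve it (an inference from the missing-parameter pattern, not confirmed in this thread). A minimal sketch to confirm the checkpoint actually declares QK-norm, assuming a transformers version that knows the "cohere" model type:

# Minimal sketch: check whether the config enables QK-norm, which is what
# produces weights like model.layers.*.self_attn.k_norm.weight.
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("CohereForAI/c4ai-command-r-plus", revision="main")
print("use_qk_norm:", getattr(cfg, "use_qk_norm", False))
# If this prints True, the vLLM version in the image must support QK-norm for
# Command R+; otherwise load_weights fails with the KeyError shown above.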

yakazimir commented:

There is a quantized version we could try; see the note here: https://huggingface.co/CohereForAI/c4ai-command-r-plus
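
If the quantized route is tried, a minimal smoke test outside the cot-eval/vLLM pipeline might look like the following. The repo id CohereForAI/c4ai-command-r-plus-4bit is an assumption based on the model card's note, and bitsandbytes plus accelerate are assumed to be installed.

# Hypothetical smoke test of the 4-bit checkpoint via transformers + bitsandbytes,
# not via the cot-eval/vLLM pipeline. The repo id below is an assumption.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "CohereForAI/c4ai-command-r-plus-4bit"  # assumed quantized repo
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "Hello, how are you?"}]
input_ids = tok.apply_chat_template(
    messages, tokenize=True, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(input_ids, max_new_tokens=64)
print(tok.decode(output[0]))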
