Evaluate: tiiuae/falcon-11B #52

Open
6 tasks
ggbetz opened this issue May 13, 2024 · 1 comment

ggbetz commented May 13, 2024

Check upon issue creation:

  • The model has not been evaluated yet and doesn't show up on the CoT Leaderboard.
  • There is no evaluation request issue for the model in the repo.
  • The parameters below have been adapted and shall be used.

Parameters:

NEXT_MODEL_PATH=tiiuae/falcon-11B
NEXT_MODEL_REVISION=main
NEXT_MODEL_PRECISION=bfloat16
MAX_LENGTH=2048 
GPU_MEMORY_UTILIZATION=0.8
VLLM_SWAP_SPACE=4
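
As a rough illustration of how these parameters might reach the inference engine, the sketch below parses the block above into keyword arguments. The mapping onto vLLM-style argument names (`model`, `revision`, `dtype`, `max_model_len`, `gpu_memory_utilization`, `swap_space`) is an assumption made for this example and is not taken from the cot-eval source; in particular, `MAX_LENGTH` may mean something other than the model context length in the actual pipeline. `parse_params` and `ASSUMED_MAPPING` are hypothetical names.

```python
# Hypothetical sketch: the mapping below is an assumption, not cot-eval's code.
PARAMS_TEXT = """
NEXT_MODEL_PATH=tiiuae/falcon-11B
NEXT_MODEL_REVISION=main
NEXT_MODEL_PRECISION=bfloat16
MAX_LENGTH=2048
GPU_MEMORY_UTILIZATION=0.8
VLLM_SWAP_SPACE=4
"""

# parameter name -> (assumed engine keyword, type converter)
ASSUMED_MAPPING = {
    "NEXT_MODEL_PATH": ("model", str),
    "NEXT_MODEL_REVISION": ("revision", str),
    "NEXT_MODEL_PRECISION": ("dtype", str),
    "MAX_LENGTH": ("max_model_len", int),
    "GPU_MEMORY_UTILIZATION": ("gpu_memory_utilization", float),
    "VLLM_SWAP_SPACE": ("swap_space", int),
}


def parse_params(text):
    """Turn KEY=VALUE lines into a dict of typed keyword arguments."""
    kwargs = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue  # skip blank or malformed lines
        key, value = line.split("=", 1)
        if key in ASSUMED_MAPPING:
            name, convert = ASSUMED_MAPPING[key]
            kwargs[name] = convert(value.strip())
    return kwargs


engine_kwargs = parse_params(PARAMS_TEXT)
print(engine_kwargs)
```

Converting the values up front means a malformed entry (for example a non-numeric `GPU_MEMORY_UTILIZATION`) fails at parse time rather than deep inside model loading.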

ToDos:

  • Run cot-eval pipeline
  • Merge pull requests for cot-eval results datasets (> @ggbetz)
  • Create eval request record to update metadata on leaderboard (> @ggbetz)

yakazimir commented Jun 9, 2024

Possible issue with vLLM and transformers: weight loading fails with a KeyError (full log below).

2024-06-09T00:56:34.810614509Z Traceback (most recent call last):
2024-06-09T00:56:34.810640753Z   File "/usr/local/bin/cot-eval", line 8, in <module>
2024-06-09T00:56:34.810670576Z     sys.exit(main())
2024-06-09T00:56:34.810676624Z   File "/workspace/cot-eval/src/cot_eval/__main__.py", line 149, in main
2024-06-09T00:56:34.810719213Z     llm = VLLM(
2024-06-09T00:56:34.810737842Z   File "/usr/local/lib/python3.10/dist-packages/langchain_core/load/serializable.py", line 120, in __init__
2024-06-09T00:56:34.810753544Z     super().__init__(**kwargs)
2024-06-09T00:56:34.810762650Z   File "/usr/local/lib/python3.10/dist-packages/pydantic/v1/main.py", line 339, in __init__
2024-06-09T00:56:34.810825823Z     values, fields_set, validation_error = validate_model(__pydantic_self__.__class__, data)
2024-06-09T00:56:34.810833903Z   File "/usr/local/lib/python3.10/dist-packages/pydantic/v1/main.py", line 1102, in validate_model
2024-06-09T00:56:34.810972709Z     values = validator(cls_, values)
2024-06-09T00:56:34.810989335Z   File "/usr/local/lib/python3.10/dist-packages/langchain_community/llms/vllm.py", line 88, in validate_environment
2024-06-09T00:56:34.810996428Z     values["client"] = VLLModel(
2024-06-09T00:56:34.810998644Z   File "/usr/local/lib/python3.10/dist-packages/vllm/entrypoints/llm.py", line 112, in __init__
2024-06-09T00:56:34.811023086Z     self.llm_engine = LLMEngine.from_engine_args(
2024-06-09T00:56:34.811028309Z   File "/usr/local/lib/python3.10/dist-packages/vllm/engine/llm_engine.py", line 196, in from_engine_args
2024-06-09T00:56:34.811084375Z     engine = cls(
2024-06-09T00:56:34.811087001Z   File "/usr/local/lib/python3.10/dist-packages/vllm/engine/llm_engine.py", line 110, in __init__
2024-06-09T00:56:34.811088759Z     self.model_executor = executor_class(model_config, cache_config,
2024-06-09T00:56:34.811091133Z   File "/usr/local/lib/python3.10/dist-packages/vllm/executor/ray_gpu_executor.py", line 62, in __init__
2024-06-09T00:56:34.811126708Z     self._init_workers_ray(placement_group)
2024-06-09T00:56:34.811132405Z   File "/usr/local/lib/python3.10/dist-packages/vllm/executor/ray_gpu_executor.py", line 192, in _init_workers_ray
2024-06-09T00:56:34.811168916Z     self._run_workers(
2024-06-09T00:56:34.811174082Z   File "/usr/local/lib/python3.10/dist-packages/vllm/executor/ray_gpu_executor.py", line 324, in _run_workers
2024-06-09T00:56:34.811209017Z     driver_worker_output = getattr(self.driver_worker,
2024-06-09T00:56:34.811214567Z   File "/usr/local/lib/python3.10/dist-packages/vllm/worker/worker.py", line 107, in load_model
2024-06-09T00:56:34.811242529Z     self.model_runner.load_model()
2024-06-09T00:56:34.811247480Z   File "/usr/local/lib/python3.10/dist-packages/vllm/worker/model_runner.py", line 95, in load_model
2024-06-09T00:56:34.811256892Z     self.model = get_model(
2024-06-09T00:56:34.811258664Z   File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/model_loader.py", line 101, in get_model
2024-06-09T00:56:34.811294265Z     model.load_weights(model_config.model, model_config.download_dir,
2024-06-09T00:56:34.811301734Z   File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/models/falcon.py", line 425, in load_weights
2024-06-09T00:56:34.811364589Z     param = params_dict[name]
2024-06-09T00:56:34.811377353Z KeyError: 'transformer.h.26.input_layernorm.weight'
2024-06-09T00:56:36.929909774Z (RayWorkerVllm pid=10701) ERROR 06-09 00:56:34 ray_utils.py:44] Error executing method load_model. This might cause deadlock in distributed execution.
2024-06-09T00:56:36.929931343Z (RayWorkerVllm pid=10701) ERROR 06-09 00:56:34 ray_utils.py:44] Traceback (most recent call last):
2024-06-09T00:56:36.929933284Z (RayWorkerVllm pid=10701) ERROR 06-09 00:56:34 ray_utils.py:44]   File "/usr/local/lib/python3.10/dist-packages/vllm/engine/ray_utils.py", line 37, in execute_method
2024-06-09T00:56:36.929935669Z (RayWorkerVllm pid=10701) ERROR 06-09 00:56:34 ray_utils.py:44]     return executor(*args, **kwargs)
2024-06-09T00:56:36.929937002Z (RayWorkerVllm pid=10701) ERROR 06-09 00:56:34 ray_utils.py:44]   File "/usr/local/lib/python3.10/dist-packages/vllm/worker/worker.py", line 107, in load_model
2024-06-09T00:56:36.929938537Z (RayWorkerVllm pid=10701) ERROR 06-09 00:56:34 ray_utils.py:44]     self.model_runner.load_model()
2024-06-09T00:56:36.929939834Z (RayWorkerVllm pid=10701) ERROR 06-09 00:56:34 ray_utils.py:44]   File "/usr/local/lib/python3.10/dist-packages/vllm/worker/model_runner.py", line 95, in load_model
2024-06-09T00:56:36.929941274Z (RayWorkerVllm pid=10701) ERROR 06-09 00:56:34 ray_utils.py:44]     self.model = get_model(
2024-06-09T00:56:36.929942571Z (RayWorkerVllm pid=10701) ERROR 06-09 00:56:34 ray_utils.py:44]   File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/model_loader.py", line 101, in get_model
2024-06-09T00:56:36.929944200Z (RayWorkerVllm pid=10701) ERROR 06-09 00:56:34 ray_utils.py:44]     model.load_weights(model_config.model, model_config.download_dir,
2024-06-09T00:56:36.929945603Z (RayWorkerVllm pid=10701) ERROR 06-09 00:56:34 ray_utils.py:44]   File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/models/falcon.py", line 425, in load_weights
2024-06-09T00:56:36.929947310Z (RayWorkerVllm pid=10701) ERROR 06-09 00:56:34 ray_utils.py:44]     param = params_dict[name]
2024-06-09T00:56:36.929948607Z (RayWorkerVllm pid=10701) ERROR 06-09 00:56:34 ray_utils.py:44] KeyError: 'transformer.h.26.input_layernorm.weight'
2024-06-09T00:56:36.929950064Z (RayWorkerVllm pid=10931) INFO 06-09 00:56:03 weight_utils.py:177] Using model weights format ['*.safetensors'] [repeated 2x across cluster]
2024-06-09T00:56:36.929952403Z (RayWorkerVllm pid=10931) ERROR 06-09 00:56:35 ray_utils.py:44] Error executing method load_model. This might cause deadlock in distributed execution. [repeated 2x across cluster]
2024-06-09T00:56:36.929953926Z (RayWorkerVllm pid=10931) ERROR 06-09 00:56:35 ray_utils.py:44] Traceback (most recent call last): [repeated 2x across cluster]
2024-06-09T00:56:36.929955302Z (RayWorkerVllm pid=10931) ERROR 06-09 00:56:35 ray_utils.py:44]   File "/usr/local/lib/python3.10/dist-packages/vllm/engine/ray_utils.py", line 37, in execute_method [repeated 2x across cluster]
2024-06-09T00:56:36.929964983Z (RayWorkerVllm pid=10931) ERROR 06-09 00:56:35 ray_utils.py:44]     return executor(*args, **kwargs) [repeated 2x across cluster]
2024-06-09T00:56:36.929966426Z (RayWorkerVllm pid=10931) ERROR 06-09 00:56:35 ray_utils.py:44]   File "/usr/local/lib/python3.10/dist-packages/vllm/worker/model_runner.py", line 95, in load_model [repeated 4x across cluster]
2024-06-09T00:56:36.929968052Z (RayWorkerVllm pid=10931) ERROR 06-09 00:56:35 ray_utils.py:44]     self.model_runner.load_model() [repeated 2x across cluster]
2024-06-09T00:56:36.929969420Z (RayWorkerVllm pid=10931) ERROR 06-09 00:56:35 ray_utils.py:44]     self.model = get_model( [repeated 2x across cluster]
2024-06-09T00:56:36.929970765Z (RayWorkerVllm pid=10931) ERROR 06-09 00:56:35 ray_utils.py:44]   File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/model_loader.py", line 101, in get_model [repeated 2x across cluster]
2024-06-09T00:56:36.929972269Z (RayWorkerVllm pid=10931) ERROR 06-09 00:56:35 ray_utils.py:44]     model.load_weights(model_config.model, model_config.download_dir, [repeated 2x across cluster]
2024-06-09T00:56:36.929973771Z (RayWorkerVllm pid=10931) ERROR 06-09 00:56:35 ray_utils.py:44]   File "/usr/local/lib/python3.10/dist-packages/vllm/model_executor/models/falcon.py", line 425, in load_weights [repeated 2x across cluster]
2024-06-09T00:56:36.929975266Z (RayWorkerVllm pid=10931) ERROR 06-09 00:56:35 ray_utils.py:44]     param = params_dict[name] [repeated 2x across cluster]
2024-06-09T00:56:36.929976815Z (RayWorkerVllm pid=10931) ERROR 06-09 00:56:35 ray_utils.py:44] KeyError: 'transformer.h.26.input_layernorm.weight' [repeated 2x across cluster]
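
The traceback ends in `param = params_dict[name]`, which suggests the checkpoint contains a tensor name that the instantiated model does not define. One way to narrow this down is to diff the two sets of names. The sketch below does that on toy data; `unmatched_weights` is a hypothetical helper, and the model-side names are illustrative placeholders that have not been checked against vLLM's Falcon implementation. Only the failing key is taken from the log.

```python
def unmatched_weights(checkpoint_names, model_param_names):
    """Return checkpoint tensor names that the model does not define, sorted."""
    known = set(model_param_names)
    return sorted(name for name in checkpoint_names if name not in known)


# Toy data. The first checkpoint name is the key reported in the KeyError;
# every other name is a placeholder chosen only for illustration.
checkpoint_names = [
    "transformer.h.26.input_layernorm.weight",
    "transformer.h.26.self_attention.dense.weight",
]
model_param_names = [
    "transformer.h.26.ln_attn.weight",
    "transformer.h.26.ln_mlp.weight",
    "transformer.h.26.self_attention.dense.weight",
]

unmatched = unmatched_weights(checkpoint_names, model_param_names)
print(unmatched)
```

If such a diff on the real checkpoint showed the same layer-norm names missing in every layer, that would point to a layout mismatch between the checkpoint and the installed vLLM model class rather than a corrupted download.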
