[Minor] Fix link to the repo #166

WoosukKwon · 2023-06-20T04:09:01Z

No description provided.

zhuohan123

LGTM!

Using the OpenAI backend of lm-eval (`model="local-completions"`), this creates a pytest that spins up a vLLM OpenAI server for various models (Llama, Mistral, Phi 2, Mixtral) and runs gsm8k evals against the server to compare with known accuracy values. This should be a good test for making sure accuracies aren't affected for fp16, sparse, and marlin models as we make releases or upstream syncs. For now, we will leave this as a manually triggered workflow.

These are the models and evals set up for this PR:

```python
# Each entry in this list pairs a model id with an EvalDefinition.
# The EvalDefinition holds a list of Tasks to evaluate the model on,
# each with its own pre-recorded Metrics.
MODEL_TEST_POINTS = [
    # Llama 2 7B: FP16, FP16 sparse, marlin
    ("NousResearch/Llama-2-7b-chat-hf",
     EvalDefinition(tasks=[
         Task("gsm8k",
              metrics=[
                  Metric("exact_match,strict-match", 0.2266868840030326),
                  Metric("exact_match,flexible-extract", 0.22820318423047764)
              ])
     ])),
    ("neuralmagic/Llama-2-7b-pruned50-retrained-ultrachat",
     EvalDefinition(tasks=[
         Task("gsm8k",
              metrics=[
                  Metric("exact_match,strict-match", 0.09855951478392722),
                  Metric("exact_match,flexible-extract", 0.10083396512509477)
              ])
     ],
                    extra_args=["--sparsity", "sparse_w16a16"])),
    ("neuralmagic/llama-2-7b-chat-marlin",
     EvalDefinition(tasks=[
         Task("gsm8k",
              metrics=[
                  Metric("exact_match,strict-match", 0.14101592115238817),
                  Metric("exact_match,flexible-extract", 0.1652767247915087)
              ])
     ],
                    enable_tensor_parallel=False)),
    # Mistral 7B: FP16, FP16 sparse, marlin
    ("teknium/OpenHermes-2.5-Mistral-7B",
     EvalDefinition(tasks=[
         Task("gsm8k",
              metrics=[
                  Metric("exact_match,strict-match", 0.6004548900682335),
                  Metric("exact_match,flexible-extract", 0.6482183472327521)
              ])
     ])),
    ("neuralmagic/OpenHermes-2.5-Mistral-7B-pruned50",
     EvalDefinition(tasks=[
         Task("gsm8k",
              metrics=[
                  Metric("exact_match,strict-match", 0.4935557240333586),
                  Metric("exact_match,flexible-extract", 0.5269143290371494)
              ])
     ],
                    extra_args=["--sparsity", "sparse_w16a16"])),
    ("neuralmagic/OpenHermes-2.5-Mistral-7B-marlin",
     EvalDefinition(tasks=[
         Task("gsm8k",
              metrics=[
                  Metric("exact_match,strict-match", 0.4935557240333586),
                  Metric("exact_match,flexible-extract", 0.5868081880212282)
              ])
     ],
                    enable_tensor_parallel=False)),
    # Phi 2: marlin
    ("neuralmagic/phi-2-super-marlin",
     EvalDefinition(tasks=[
         Task("gsm8k",
              metrics=[
                  Metric("exact_match,strict-match", 0.49962092494313876),
                  Metric("exact_match,flexible-extract", 0.5041698256254739)
              ])
     ],
                    enable_tensor_parallel=False)),
    # Mixtral: FP16
    ("mistralai/Mixtral-8x7B-Instruct-v0.1",
     EvalDefinition(tasks=[
         Task("gsm8k",
              metrics=[
                  Metric("exact_match,strict-match", 0.6550416982562547),
                  Metric("exact_match,flexible-extract", 0.6603487490523123)
              ])
     ],
                    enable_tensor_parallel=True)),
]
```
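The comparison against pre-recorded accuracy values could look roughly like the sketch below. It is only an illustration, not the code from the PR: the `Metric`, `Task`, and `EvalDefinition` dataclasses are hypothetical stand-ins whose field names are inferred from the listing, the default values are guesses, and the `find_mismatches` helper and its 5% relative tolerance are assumptions.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, List


# Hypothetical stand-ins for the types used in MODEL_TEST_POINTS.
# Field names follow the listing; the defaults are guesses.
@dataclass
class Metric:
    name: str
    value: float


@dataclass
class Task:
    name: str
    metrics: List[Metric]


@dataclass
class EvalDefinition:
    tasks: List[Task]
    enable_tensor_parallel: bool = False
    extra_args: List[str] = field(default_factory=list)


def find_mismatches(definition: EvalDefinition,
                    measured: Dict[str, Dict[str, float]],
                    rel_tol: float = 0.05) -> List[str]:
    """Return a description of every metric outside the tolerance.

    `measured` maps task name -> {metric name -> measured value}.
    The 5% relative tolerance is an assumed value, not one from the PR.
    """
    mismatches = []
    for task in definition.tasks:
        for metric in task.metrics:
            got = measured[task.name][metric.name]
            if not math.isclose(got, metric.value, rel_tol=rel_tol):
                mismatches.append(
                    f"{task.name}/{metric.name}: "
                    f"expected {metric.value}, got {got}")
    return mismatches
```

In a pytest, the measured values would come from the lm-eval results for the running server, and the test would assert that `find_mismatches(...)` returns an empty list.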

…-project#166) * Miscellaneous changes, Dockerfile components update, remove Cython * Restore Dockerfile and Cython for now

Fix github link

675284c

WoosukKwon requested a review from zhuohan123 June 20, 2023 05:37

Fix URLs

4962f62

zhuohan123 approved these changes Jun 20, 2023


WoosukKwon merged commit 794e578 into main Jun 20, 2023

WoosukKwon deleted the minor-fix-doc branch June 20, 2023 05:57

hongxiayang pushed a commit to hongxiayang/vllm that referenced this pull request Feb 13, 2024

[Minor] Fix URLs (vllm-project#166)

c876e51
