
Implement the /models endpoint for the VLLM provider #188

Merged · jhrozek merged 1 commit into stacklok:main on Dec 4, 2024

Conversation

@jhrozek (Contributor) commented on Dec 4, 2024:

For some reason, Continue in JetBrains needs this; otherwise loading the plugin fails with:

```
Error: HTTP 404 Not Found from http://127.0.0.1:8000/vllm/models

This may mean that you forgot to add '/v1' to the end of your 'apiBase'
in config.json.
    at customFetch (/snapshot/continue/binary/out/index.js:471442:17)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async withExponentialBackoff (/snapshot/continue/binary/out/index.js:471175:22)
---> continue restarts here
[info] Starting Continue core...
[2024-12-04T08:52:45] [info] Starting Continue core...
```

```diff
@@ -44,6 +44,10 @@ def _setup_routes(self):
         passes it to the completion handler.
         """

+        @self.router.get(f"/{self.provider_route_name}/models")
+        async def get_models():
+            return []
+
```
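For context, a minimal self-contained sketch of how such a route registers on the provider's router. The class name, constructor, and app wiring here are assumptions for illustration, not CodeGate's actual code; only the route body matches the diff:

```python
from fastapi import APIRouter, FastAPI

class VLLMProvider:
    """Sketch of a provider that mounts a /models route on its own router."""

    def __init__(self, provider_route_name: str = "vllm"):
        self.provider_route_name = provider_route_name
        self.router = APIRouter()
        self._setup_routes()

    def _setup_routes(self):
        @self.router.get(f"/{self.provider_route_name}/models")
        async def get_models():
            # Returning an empty list is enough to stop the 404-triggered
            # restart loop in Continue described above.
            return []

app = FastAPI()
app.include_router(VLLMProvider().router)
# GET http://127.0.0.1:8000/vllm/models now returns [] instead of 404.
```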
@aponcedeleonch (Contributor) commented:

I think this is fine. We can also query the base_url directly and it should respond:

```
curl -SsX GET "https://inference.codegate.ai/v1/models" \
     -H "Authorization: Bearer $token"
```

```json
{
  "object": "list",
  "data": [
    {
      "id": "Qwen/Qwen2.5-Coder-14B-Instruct",
      "object": "model",
      "created": 1733303915,
      "owned_by": "vllm",
      "root": "Qwen/Qwen2.5-Coder-14B-Instruct",
      "parent": null,
      "max_model_len": 32768,
      "permission": [
        {
          "id": "modelperm-0aa1923ad501464fbc2f3ee91f953ed3",
          "object": "model_permission",
          "created": 1733303915,
          "allow_create_engine": false,
          "allow_sampling": true,
          "allow_logprobs": true,
          "allow_search_indices": false,
          "allow_view": true,
          "allow_fine_tuning": false,
          "organization": "*",
          "group": null,
          "is_blocking": false
        }
      ]
    }
  ]
}
```

@jhrozek (Contributor, Author) replied:

I like your idea more; I will test it out.

@aponcedeleonch (Contributor) left a review:

Approved, left one comment. I don't know whether querying the base_url for that endpoint actually makes a difference or not.

@jhrozek merged commit 3d5575d into stacklok:main on Dec 4, 2024