
[Fix] Avoid AsyncEngine running the same session id #1219

Merged 6 commits into InternLM:main on Mar 1, 2024

Conversation

@AllentDan (Collaborator) commented on Feb 29, 2024

Keep the server working no matter how the client uses session_id.

from concurrent.futures import ThreadPoolExecutor
from random import randint

from tqdm import tqdm

from lmdeploy.serve.openai.api_client import APIClient

questions = ['你是谁'] * 1000  # "Who are you?"
num_parallel = 256


def process_one(question, url='0.0.0.0', port='23333'):
    client = APIClient('http://{}:{}'.format(url, port))
    model_name = client.available_models[0]

    msg = [dict(role='user', content=question)]

    # Deliberately draw session ids from a small pool so that concurrent
    # requests collide on the same session_id.
    data = client.chat_interactive_v1(msg,
                                      session_id=randint(1, 100),
                                      repetition_penalty=1.02)
    for item in data:
        pass

    data = client.chat_completions_v1(model=model_name,
                                      messages=msg,
                                      repetition_penalty=1.02)
    response = None
    for item in data:
        response = item

    return response


with ThreadPoolExecutor(max_workers=num_parallel) as executor:
    for response in tqdm(executor.map(process_one, questions)):
        print(response)
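The stress test draws session ids from randint(1, 100) while 256 workers run in parallel, so by the pigeonhole principle duplicate session ids are guaranteed on every run. This hypothetical standalone sketch (no server needed) shows how heavily the ids collide:

```python
from collections import Counter
from random import randint

# Mimic the stress test: 256 concurrent requests each pick a session_id
# uniformly from 1..100. With only 100 distinct ids available, at least
# 156 requests must share an id with another request.
ids = [randint(1, 100) for _ in range(256)]
counts = Counter(ids)
duplicated = [sid for sid, n in counts.items() if n > 1]
print(f'{len(duplicated)} of {len(counts)} distinct ids are reused')
```

This is exactly the situation the fix targets: the server must stay healthy even when many in-flight requests claim the same session id.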

@lvhan028 lvhan028 requested a review from irexyc March 1, 2024 04:44
@lvhan028 lvhan028 merged commit c1b135d into InternLM:main Mar 1, 2024
3 of 4 checks passed
@lvhan028 lvhan028 added the Bug:P1 label Mar 1, 2024
@AllentDan AllentDan mentioned this pull request Mar 1, 2024