I ran the Docker container with `--model-id` pointing to a Llama 3 model downloaded from Hugging Face, and sent a request with the Python code below:
```python
from huggingface_hub import AsyncInferenceClient

client = AsyncInferenceClient("http://127.0.0.1:8080")
output = await client.text_generation(
    "The huggingface_hub library is ",
    max_new_tokens=12,
    details=True,
)
print(output)
```
but it does not return the details:

```
TextGenerationOutput(generated_text='100% open-source and available on GitHub. It is distributed', details=None)
```
and the server log shows:

```
2024-05-10T09:32:15.955615Z INFO compat_generate{default_return_full_text=true compute_type=Extension(ComputeType("4-nvidia-rtx-a6000"))}:generate{parameters=GenerateParameters { best_of: None, temperature: None, repetition_penalty: None, frequency_penalty: None, top_k: None, top_p: None, typical_p: None, do_sample: false, max_new_tokens: Some(12), return_full_text: Some(false), stop: [], truncate: None, watermark: false, details: false, decoder_input_details: false, seed: None, top_n_tokens: None, grammar: None } total_time="1.425314571s" validation_time="477.908µs" queue_time="66.966µs" inference_time="1.42476984s" time_per_token="118.73082ms" seed="None"}: text_generation_router::server: router/src/server.rs:309: Success
```
`text_generation` should return details instead of `None`. Note that the server log above shows `details: false` in `GenerateParameters`, so the `details=True` argument does not appear to be forwarded to the server.
@uyeongkim I opened a similar issue at: huggingface/huggingface_hub#2281
Related issue for `stream=True`: #1530
Since you use `stream=False`, simply using `requests` instead of `huggingface_hub` should work for you:
```python
import requests

session = requests.Session()

# url = "http://0.0.0.0:80/generate_stream"
url = "http://0.0.0.0:80/generate"
data = {"inputs": "Today I am in Paris and", "parameters": {"max_new_tokens": 20}}
headers = {"Content-Type": "application/json"}

response = session.post(
    url,
    json=data,
    headers=headers,
    stream=False,  # True,
)
# for line in response.iter_lines():
#     print(f"line: `{line}`")
print(response.headers)
```
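If the goal is to get the details back through the raw endpoint as well, the `GenerateParameters` in the server log above include a `details` flag, so the payload can request it explicitly. A minimal sketch, assuming the same `/generate` endpoint and payload shape as above:

```python
import requests

# Sketch only: "details" appears as a flag in GenerateParameters in the log
# above, so setting it to true in the request parameters should ask the
# server to include the details in its response.
url = "http://0.0.0.0:80/generate"
data = {
    "inputs": "Today I am in Paris and",
    "parameters": {"max_new_tokens": 20, "details": True},
}
response = requests.post(url, json=data, headers={"Content-Type": "application/json"})
print(response.json())  # expect a "details" key alongside "generated_text"
```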