Pulse · huggingface/text-generation-inference · GitHub

March 5, 2025 – March 12, 2025

Overview

24 Active pull requests

7 Active issues

17 Pull requests merged by 10 people

Update README.md
#3095 merged Mar 11, 2025
Fix qwen vl
#3096 merged Mar 11, 2025
Update the llamacpp backend
#3022 merged Mar 11, 2025
Pr 3003 ci branch
#3007 merged Mar 10, 2025
hotfix: qwen2 formatting
#3093 merged Mar 10, 2025
Small test and typing fixes
#3078 merged Mar 10, 2025
Add modules_to_not_convert in quantized model
#3053 merged Mar 10, 2025
Add qwen2 multi lora layers support
#3089 merged Mar 10, 2025
Add request parameters to OTel span for /v1/chat/completions endpoint
#3000 merged Mar 10, 2025
Nix: the launcher needs a Python env with Torch for GPU detection
#3085 merged Mar 10, 2025
Fix tool call2
#3076 merged Mar 7, 2025
Update --max-batch-total-tokens description
#3083 merged Mar 7, 2025
Nix: add openai to impure shell for integration tests
#3081 merged Mar 7, 2025
Making tool_calls a vector.
#3075 merged Mar 5, 2025
Making sure Olmo (transformers backend) works.
#3074 merged Mar 5, 2025
Only add token when it is defined.
#3073 merged Mar 5, 2025
fix(neuron): explicitly install toolchain
#3072 merged Mar 5, 2025

7 Pull requests opened by 6 people

wip: comment out prepend full_text
#3079 opened Mar 7, 2025
Update to `kernels` 0.2.1
#3084 opened Mar 7, 2025
Fix tool call3
#3086 opened Mar 7, 2025
Release of Gaudi Backend for TGI
#3091 opened Mar 10, 2025
Fix tool call4
#3094 opened Mar 10, 2025
Update neuron backend
#3098 opened Mar 11, 2025
Add gemma3 model
#3099 opened Mar 12, 2025

3 Issues closed by 3 people

NIX: text_generation_launcher::gpu: Cannot determine GPU compute capability: ImportError: libffi.so.8
#3025 closed Mar 10, 2025
`No space left on device` when downloading `Qwen/Qwen2.5-7B-Instruct` model files
#3087 closed Mar 8, 2025
decoder_input_details and return_full_text parameters stopped providing previous log probs
#3070 closed Mar 7, 2025

4 Issues opened by 4 people

Multi-node inference
#3097 opened Mar 11, 2025
LocalEntryNotFoundError
#3090 opened Mar 10, 2025
[Upstream dependence changes] The behavior about env var in `hf-hub` has changed.
#3088 opened Mar 8, 2025
Running container rootless does not work anymore
#3082 opened Mar 7, 2025

12 Unresolved conversations

Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.

make install-server does not have Apple MacOS Metal Framework
#2890 commented on Mar 5, 2025 • 0 new comments
Add support for phi-4-mini and phi-4-multimodal
#3071 commented on Mar 6, 2025 • 0 new comments
Llama 3.3 70B Weird , gibberish outputs in production setup
#3043 commented on Mar 10, 2025 • 0 new comments
Qwen2-VL failed to infer multiple images (Server error: upper bound and larger bound inconsistent with step sign)
#2888 commented on Mar 10, 2025 • 0 new comments
TGI metrics don't have model_name label to indicate which model the metrics belong to
#3026 commented on Mar 10, 2025 • 0 new comments
llava next image encoder to allow un-aligned patch / image sizes
#2936 commented on Mar 7, 2025 • 0 new comments
General fixes for tool calling
#2954 commented on Mar 7, 2025 • 0 new comments
Add 'json_schema' alias to GrammarType.Json
#2982 commented on Mar 10, 2025 • 0 new comments
Pr 2954 ci branch
#3006 commented on Mar 5, 2025 • 0 new comments
Support xccl distributed backend
#3034 commented on Mar 11, 2025 • 0 new comments
xpu 2.6 update
#3051 commented on Mar 12, 2025 • 0 new comments
Added model name label to metrics and added an optional argument --served-model-name
#3064 commented on Mar 10, 2025 • 0 new comments