Description
When using knowledgator/gliner-x-base, inference fails with an IndexError on inputs containing an empty string.
Reproduction Steps
The following script demonstrates the problem:
from gliner import GLiNER
model = GLiNER.from_pretrained("knowledgator/gliner-x-base")
# The presence of the empty string in this list triggers the error
texts = ["Email CEO to approve budget", ""]
labels = ["person", "organization", "action"]
print("Running inference...")
predictions = model.inference(texts, labels, batch_size=16)
print(f"Results: {predictions}")
Traceback
Traceback (most recent call last):
  File "issue_repro.py", line 10, in <module>
    predictions = model.inference(texts, labels, batch_size=16)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../gliner/model.py", line 1290, in inference
    start_text_idx = start_token_idx_to_text_idx[start_token_idx]
                     ~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
IndexError: list index out of range
Expected Behavior
The model should handle empty strings gracefully by returning an empty list of entities for that specific index, e.g.:
[[{'start': 6, 'end': 9, 'text': 'CEO', 'label': 'person'}], []], as is done during standard inference with other GLiNER models.
Environment
- GLiNER: v0.2.24
- flash_attn: v2.7.4.post1+25.11
- Model: knowledgator/gliner-x-base
Workaround
I found a quick fix: skip the output processing for empty-string inputs by modifying this section as follows:
all_entities = []
for i, output in enumerate(outputs):
    if not tokens[i]:  # FIX empty input case for models like knowledgator/gliner-x-base
        all_entities.append([])
        continue
    start_token_idx_to_text_idx = all_start_token_idx_to_text_idx[i]
    end_token_idx_to_text_idx = all_end_token_idx_to_text_idx[i]
    entities = []
But it would be better to handle this in the forward pass to avoid ghost predictions from the model on empty strings entirely. Perhaps a single fix to handle this could be found that also solves Issue #315?
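For anyone who hits this before a library fix lands, a caller-side guard may be enough. The sketch below filters empty inputs out before they reach the model and puts an empty entity list back at each skipped position. The helper name `infer_skipping_empty` is hypothetical (it is not part of GLiNER). It assumes the inference callable returns one entity list per input text, in input order, which matches the expected output above. Treating whitespace-only strings as empty is also an assumption.

```python
from typing import Any, Callable, Dict, List, Sequence

Entities = List[Dict[str, Any]]


def infer_skipping_empty(
    infer_fn: Callable[..., List[Entities]],
    texts: Sequence[str],
    labels: Sequence[str],
    **kwargs: Any,
) -> List[Entities]:
    """Call infer_fn on non-empty texts only; return [] for the others.

    Hypothetical helper, not part of GLiNER. Assumes infer_fn returns one
    list of entities per input text, in the same order as the inputs.
    """
    # Keep the original position of every text that has content.
    # Assumption: whitespace-only strings are treated like empty strings.
    kept = [(i, t) for i, t in enumerate(texts) if t.strip()]

    # Start with an empty entity list for every input position.
    results: List[Entities] = [[] for _ in texts]
    if not kept:
        return results  # nothing to send to the model

    outputs = infer_fn([t for _, t in kept], list(labels), **kwargs)

    # Put each prediction back at the index of the text it belongs to.
    for (i, _), entities in zip(kept, outputs):
        results[i] = entities
    return results
```

With the reproduction script above, the call would become `predictions = infer_skipping_empty(model.inference, texts, labels, batch_size=16)`. This only sidesteps the crash on the caller's side; it does not address the underlying indexing problem in the library.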