Bug: response_end span in granite33 uses full response length instead of sentence length

This was flagged as worth a follow-up when reviewing #818:

In `granite33/output.py:226`, the line computes:

https://github.com/generative-computing/mellea/blob/301ca3e495d5707a67140cbd26829c1a55d38f92/mellea/formatters/granite/granite3/granite33/output.py#L226

- `response_text` is the individual sentence associated with this citation (looked up from `response_sents_by_citation_id`)
- `response_text_without_citations` is the entire response with all citation tags stripped
- `index` is where the individual sentence starts within the full response

So `response_end` ends up being (start of sentence) + (length of entire response), which overshoots the end of the actual sentence whenever the response contains more than that one sentence, and can point past the end of the string entirely.

The correct formula should be:
```python
citation["response_end"] = index + len(response_text)
```
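
A minimal sketch of the difference, using a made-up two-sentence response. The variable names mirror the ones described above, but the strings and the `str.find` lookup are illustrative assumptions, not the formatter's actual code:

```python
# Toy data (assumption): the real formatter derives these from model output.
response_text_without_citations = (
    "Paris is the capital of France. It lies on the Seine."
)
response_text = "It lies on the Seine."  # sentence tied to one citation

# Start of the cited sentence within the full, citation-free response.
index = response_text_without_citations.find(response_text)

# Behavior described in this issue: adds the length of the whole response.
buggy_end = index + len(response_text_without_citations)

# Proposed fix: adds only the length of the cited sentence.
fixed_end = index + len(response_text)

full_len = len(response_text_without_citations)
print(f"index={index} buggy_end={buggy_end} fixed_end={fixed_end} full_len={full_len}")
```

With the fix, `response_text_without_citations[index:fixed_end]` is exactly the cited sentence, while `buggy_end` lands beyond `full_len`.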

The granite32 version does this correctly:

https://github.com/generative-computing/mellea/blob/301ca3e495d5707a67140cbd26829c1a55d38f92/mellea/formatters/granite/granite3/granite32/output.py#L291-L293

Impact: Every citation span in granite 3.3 output will have a `response_end` at or beyond the end of the full response rather than at the end of the cited sentence. Downstream consumers that slice `response[begin:end]` can get a much larger substring than intended.
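
Because Python slicing clamps an out-of-range end index instead of raising, the overshoot is silent. A hypothetical sanity check could catch it in a regression test. The helper name `span_is_consistent` and the citation dict layout (`response_begin`, `response_end`, `response_text` keys) are assumptions for illustration, not the project's confirmed schema:

```python
def span_is_consistent(response: str, citation: dict) -> bool:
    """Return True if the span is in bounds and selects exactly the cited sentence."""
    begin = citation["response_begin"]
    end = citation["response_end"]
    in_bounds = 0 <= begin <= end <= len(response)
    return in_bounds and response[begin:end] == citation["response_text"]


# Toy data (assumption), with the cited sentence placed last in the response.
response = "First claim. Second claim."
sentence = "Second claim."
begin = response.find(sentence)

good = {
    "response_begin": begin,
    "response_end": begin + len(sentence),  # sentence length
    "response_text": sentence,
}
bad = {
    "response_begin": begin,
    "response_end": begin + len(response),  # full response length
    "response_text": sentence,
}

print(span_is_consistent(response, good), span_is_consistent(response, bad))
```

In this toy case the bad span still slices to the right text, since the sentence comes last and the slice is clamped, so a text comparison alone would miss it; the explicit bounds check is what flags the span.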



For comparison, the granite32 snippet:
```python
citation["response_end"] = last_response_text_match["begin_idx"] + len(
    response_text
)
```
