feat: port context-attribution intrinsic from granite-common#661
dennislwei wants to merge 6 commits into generative-computing:main
Conversation
…hared index

- Add `index` parameter to `mark_sentence_boundaries()` to allow callers to continue numbering across multiple calls; return the next available index
- Add `all_but_last_message` as a valid `sentence_boundaries` key
- Extend `_mark_sentence_boundaries()` to tag prior conversation turns when `all_but_last_message` is configured, using a shared running index with documents so that each context sentence has a globally unique tag

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Dennis Wei <dwei@us.ibm.com>
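The shared running index described in this commit can be sketched as follows. This is a simplified illustration, not the actual granite-common/mellea implementation: the function name matches the commit message, but the tag format and signature here are assumptions.

```python
def mark_sentence_boundaries(sentences: list[str], index: int = 0) -> tuple[str, int]:
    """Wrap each sentence in a numbered boundary tag, starting from `index`.

    Returns the marked-up text and the next available index, so a caller
    can continue numbering across multiple calls (e.g. documents first,
    then prior conversation turns) and every sentence gets a globally
    unique tag.
    """
    marked = [f"<i{index + offset}> {sentence}" for offset, sentence in enumerate(sentences)]
    return " ".join(marked), index + len(sentences)


# One running index shared across documents and prior turns:
doc_text, next_index = mark_sentence_boundaries(["Doc sentence 1.", "Doc sentence 2."])
turn_text, next_index = mark_sentence_boundaries(["Prior turn sentence."], index=next_index)
```

With this scheme, the prior-turn sentence is tagged `<i2>` rather than restarting at `<i0>`, which is what makes each context sentence uniquely addressable by the model's attribution output.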
… decoding

- Accept `source: str | list[str]` to allow a single DecodeSentences rule to decode sentences from multiple locations in one pass
- Add `all_but_last_message` as a valid source, decoding prior conversation turns with a running sentence index shared across all sources
- Add optional `message_index` output field that records which conversation turn each attributed sentence came from

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Dennis Wei <dwei@us.ibm.com>
- Update _CORE_REPO to "ibm-granite/granitelib-core-r1.0"
- Add context-attribution intrinsic pointing to _CORE_REPO

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Dennis Wei <dwei@us.ibm.com>
- Add input/test_canned_input/test_canned_output/expected_result JSON test data files
- Add YamlJsonCombo entry for context-attribution pointing to ibm-granite/granitelib-core-r1.0
- Exclude context-attribution from Ollama inference tests via _NO_OLLAMA_ADAPTER since an Ollama LoRA adapter is not yet available on the HF Hub

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Dennis Wei <dwei@us.ibm.com>
The model consistently produces {"r": 1, "c": [2, 0, 1, 19, 3]} with the
mellea codebase, yielding 7 attribution records rather than the 12 produced
on the granite-common side. Update the expected output accordingly.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Dennis Wei <dwei@us.ibm.com>
The PR description has been updated. Please fill out the template for your PR to be reviewed.
jakelorocco left a comment:
Looks good to me. One comment about an unused file and you'll have to fix conflicts. I'm also assuming these two PRs will cause more conflicts (https://github.com/generative-computing/mellea/pull/551/changes and #666).
I think there is some test duplication that we will need to sort out now that granite-common is in mellea, but for now, can you also please add a test in the same format as the other intrinsic tests? https://github.com/generative-computing/mellea/pull/551/changes#diff-ce0d7b5a78b679a6137c7e41d361d6e1ff3bcb5cccbbb1b5221f666cc3b81619R1
Additionally, if you want others to use this intrinsic, I would suggest adding a snippet to the README and an example as well. Similar to what's done here: https://github.com/generative-computing/mellea/pull/551/changes#diff-1cf997e9d5dc1b90157611dba815452dcea041114ecce99f976af70654aba299L31
Is this just future proofing since ollama tests are skipped for this intrinsic currently?
Yes. Should I keep this file or remove it for now?
Feel free to keep it. Just wanted to confirm.
Thank you @jakelorocco for reviewing so quickly. I will address the conflict in mellea/backends/adapters/catalog.py, and I will fix the other conflicts as well.
An intrinsic function and associated test, example, and README snippet will be coming very soon in a separate PR. Perhaps I should have explained ahead of time, but I wanted to have two separate PRs for clarity: one for the granite-common part and one for the intrinsic part. Let me know if you want it to be one larger PR instead.
I don't. I haven't looked through the code to see what is referencing which repo. To be honest, the intrinsics repos are somewhat confusing to me and I'm not sure what the expectations are with them all.
Works for me. Thank you for the clarification.
Conflict resolution in catalog.py:

- Keep _CORE_REPO = "ibm-granite/rag-intrinsics-lib" for now, pending full migration away from it
- Add _CORE_R1_REPO = "ibm-granite/granitelib-core-r1.0" for core intrinsics (context-attribution, requirement-check, uncertainty)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Dennis Wei <dwei@us.ibm.com>
Conflicts have been fixed.
Claude Code found a few places where …
Type of PR

Misc PR

Description
Ports the context-attribution intrinsic from the granite-common library into mellea. This follows up on granite-common-related PRs #571 and #609. Context attribution identifies which sentences in the context (including previous conversation turns and any documents) were most important to the LLM when generating each sentence in its response.
Changes:
ibm-granite/granitelib-core-r1.0

Testing