Added a new endpoint to compute closest positive and negative values … #130
Conversation
I've raised a draft PR for issue #126.

Tagging you here to verify the PR I raised for issue #126. Thank you.
santanavagner left a comment
…for individual sentences Signed-off-by: ArionDas <ariondasad@gmail.com>
Force-pushed from c3a062e to 92a9cf8
@santanavagner Thank you.

Could you please approve this PR, as merging is blocked for me? Cheers,

Hi @santanavagner, has the code been updated? I noticed there's a message about dismissed reviews, and in the first file there are already two global variables, `_values_embedding_fn` and `_values_centroids`, that don't seem to be constants.

@cassiasamp But you can let me know if that should be avoided.
Mystic-Slice left a comment
Hey @ArionDas.
Nice work!
I have requested a minor change. I think everything else looks good.
I would like to know if there was a reason to return both a positive and a negative value for each sentence. Even in the example you have shown in this PR:
```json
{
  "prompts": [
    {
      "sentence": "Retrieve all purchases performed by client_id=1234 in the last 30 days.",
      "positive_value": { "label": "money", "similarity": 0.2167 },
      "negative_value": { "label": "non-violent crimes", "similarity": 0.1914 }
    },
    {
      "sentence": "Avoid any verbosity in your answer.",
      "positive_value": { "label": "forthright and honesty", "similarity": 0.5542 },
      "negative_value": { "label": "hate", "similarity": 0.2234 }
    }
  ]
}
```
I would say the negative value 'hate' doesn't make sense for the second sentence. Maybe we should use some thresholding?
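For illustration, a minimal sketch of what such thresholding could look like (the cutoff value is hypothetical and not part of this PR):

```python
NEGATIVE_SIM_THRESHOLD = 0.3  # hypothetical cutoff; would need tuning on real data

def threshold_negative(match):
    # Suppress weak negative matches, e.g. 'hate' at 0.2234 in the example above.
    return match if match["similarity"] >= NEGATIVE_SIM_THRESHOLD else None
```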
And finally, good job on adding caching. We might want to generalize caching to other endpoints later, too.
```python
prompt,
positive_embeddings,
negative_embeddings,
embedding_fn=None
```
Within the `get_values` function, please add a check for when `embedding_fn` is `None` and create an embedding function there. You can look at `get_thresholds` and `recommend_prompt` to see how it's done.
I know it's not absolutely necessary currently, but it's better to have this safeguard in.
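A minimal sketch of the suggested safeguard, assuming the local `all-MiniLM-L6-v2` model mentioned in this PR (the exact fallback used by `get_thresholds`/`recommend_prompt` may differ):

```python
from sentence_transformers import SentenceTransformer

def get_values(prompt, positive_embeddings, negative_embeddings, embedding_fn=None):
    # Safeguard: build a default embedding function when the caller supplies none,
    # mirroring the pattern used in get_thresholds and recommend_prompt.
    if embedding_fn is None:
        model = SentenceTransformer("all-MiniLM-L6-v2")
        embedding_fn = model.encode
    ...  # existing sentence-splitting and similarity logic continues here
```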
I had asked for this confirmation earlier here: conversation
@santanavagner clarified we might not need a threshold yet, as we take the maximum similarity value every time. Also, I believe it's good to have both in the root logic: in subsequent applications (multi-agent conversations, let's say) we can add thresholds, but this guarantees we get both values every time.
Thoughts?
@Mystic-Slice
I have addressed the concerns, let me know if anything else needs to be changed.
(Just wanted to point out one thing: I was thinking of adding a provision to clean up the cache. There are two options: an additional manual API to clear it, or a TTL-based implementation, which would require custom decorators. I wanted your opinion before implementing either.)
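For reference, a minimal sketch of the TTL option (a hypothetical decorator, not code from this PR); it caches a single result regardless of arguments, which fits loading embeddings and centroids once:

```python
import time
from functools import wraps

def ttl_cache(seconds=3600):
    """Cache one result and recompute it once `seconds` have elapsed."""
    def decorator(fn):
        state = {"value": None, "expires": 0.0}

        @wraps(fn)
        def wrapper(*args, **kwargs):
            now = time.monotonic()
            if now >= state["expires"]:
                state["value"] = fn(*args, **kwargs)
                state["expires"] = now + seconds
            return state["value"]

        return wrapper
    return decorator
```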
I think for now, this is enough.
The cache works fine, and the endpoint is quick with it. I see no need for cleanup as of now, because the embeddings do not change mid-deployment. But maybe in the future.
We can add cleanup if the need arises. I'm just trying to avoid unnecessarily complicating the code.
Exactly the point.
Thank you.
…sequent calls Signed-off-by: ArionDas <ariondasad@gmail.com>
Mystic-Slice left a comment
Thank you, @cassiasamp! The obvious next thing to do is changing the default embedding model to Granite-4.0-1b 😎

Technical Summary
Added a `GET /values` endpoint for sentence-level prompt assessment against positive and negative value embeddings.

Use case: multi-agent governance, value tracking, and lightweight harmful-content signals.
Additions
- `GET /values` in `app.py` — accepts a `prompt` param; returns JSON with sentence-wise matches for both positive and negative value centroids.
- `get_values()` in `recommendation_handler.py` — splits the prompt into sentences, embeds them, computes cosine similarity vs. precomputed centroids, and returns the highest match for positive and the highest match for negative per sentence.
- Caching: the embedding function and value centroids are reused after the first `/values` hit (observed: 11.87s → 2.1s, ~5.8× speedup; see the sketch after this list).
- Local `all-MiniLM-L6-v2` path with `SentenceTransformer` to avoid re-downloading.
- `prompt_values_assessment.ipynb` — 5 runnable examples (single prompt, multi-turn, harmful-content detection, cross-agent comparison, DB query analysis).
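A sketch of the caching pattern implied by the `_values_embedding_fn` / `_values_centroids` globals discussed above (illustrative; `_load_centroids` is a hypothetical placeholder for however the PR loads them):

```python
from sentence_transformers import SentenceTransformer

# Module-level cache: populated on the first /values request, reused afterwards.
_values_embedding_fn = None
_values_centroids = None

def _get_values_state():
    global _values_embedding_fn, _values_centroids
    if _values_embedding_fn is None:
        model = SentenceTransformer("all-MiniLM-L6-v2")
        _values_embedding_fn = model.encode
        _values_centroids = _load_centroids()  # hypothetical loader, not the PR's exact code
    return _values_embedding_fn, _values_centroids
```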
API

Endpoint
`GET /values`
Query parameters
- `prompt` (string, required): full prompt or conversation text to analyze.

Response (JSON schema)
{ "prompts": [ { "sentence": "string", "positive_value": { "label": "string", "similarity": 0.0 }, "negative_value": { "label": "string", "similarity": 0.0 } } ] }Each sentence is matched against all positive centroids (return highest) and all negative centroids (return highest).
Example response
```json
{
  "prompts": [
    {
      "sentence": "Retrieve all purchases performed by client_id=1234 in the last 30 days.",
      "positive_value": { "label": "money", "similarity": 0.2167 },
      "negative_value": { "label": "non-violent crimes", "similarity": 0.1914 }
    },
    {
      "sentence": "Avoid any verbosity in your answer.",
      "positive_value": { "label": "forthright and honesty", "similarity": 0.5542 },
      "negative_value": { "label": "hate", "similarity": 0.2234 }
    }
  ]
}
```

Example curl
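A minimal example, assuming a local deployment (host and port are hypothetical):

```bash
# URL-encode the prompt; host and port depend on your deployment.
curl -G "http://localhost:8080/values" \
  --data-urlencode "prompt=Retrieve all purchases performed by client_id=1234 in the last 30 days. Avoid any verbosity in your answer."
```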
Implementation details (concise)
- Additive change (new `/values` route only); legacy routes unchanged.
- `numpy` arrays for vectorized cosine ops.
- Per-sentence pipeline (see the sketch after this list): `SentenceTransformer` embed → cosine similarity vs. all positive centroids → take max (positive) → cosine similarity vs. all negative centroids → take max (negative) → return both per sentence.
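A minimal sketch of the per-sentence matching step, assuming centroids are rows of a `numpy` matrix (names are illustrative, not the PR's exact code):

```python
import numpy as np

def best_match(sentence_vec, centroids, labels):
    # Cosine similarity of one sentence embedding against every centroid.
    sims = (centroids @ sentence_vec) / (
        np.linalg.norm(centroids, axis=1) * np.linalg.norm(sentence_vec)
    )
    i = int(np.argmax(sims))  # keep only the highest-scoring centroid
    return {"label": labels[i], "similarity": round(float(sims[i]), 4)}
```

Calling this once with the positive centroids and once with the negative centroids yields the `positive_value` and `negative_value` fields for each sentence.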
Cookbook (notebook highlights)
`prompt_values_assessment.ipynb` includes:
- Single-prompt sentence breakdown
- Multi-turn conversation analysis (agent-by-agent tracking)
- Harmful/toxic detection via negative-value scores
- Cross-agent pattern comparison
- DB query analysis (privacy/security mappings)
All examples output JSON responses only.
Files changed
- `app.py`
- `control/recommendation_handler.py`
- `cookbook/prompt_values_assessment.ipynb`