
Conversation

@ArionDas
Contributor

Technical Summary

Added a GET /values endpoint for sentence-level prompt assessment against positive and negative value embeddings.
Use case: multi-agent governance, value tracking, and lightweight harmful-content signals.


Additions

  • API: GET /values in app.py — accepts prompt param, returns JSON with sentence-wise matches for both positive and negative value centroids.
  • Core function: get_values() in recommendation_handler.py — splits the prompt into sentences, embeds them, computes cosine similarity against the precomputed centroids, and returns the highest-scoring positive and negative match per sentence (see the sketch after this list).
  • Lazy caching: model + centroids cached on first /values hit and reused (observed: 11.87s → 2.1s, ~5.7× speedup).
  • Local model loading: uses local all-MiniLM-L6-v2 path with SentenceTransformer to avoid re-downloading.
  • Cookbook: prompt_values_assessment.ipynb — 5 runnable examples (single prompt, multi-turn, harmful content detection, cross-agent comparison, DB query analysis).
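
A minimal sketch of the matching flow described above (illustrative only: the sentence splitting, helper names, and centroid layout are assumptions, not the exact code merged in recommendation_handler.py):

import numpy as np

def get_values_sketch(prompt, positive_centroids, negative_centroids, model):
    """positive_centroids / negative_centroids: dict mapping label -> 1-D numpy vector.
    model: anything with an encode(list_of_str) -> 2-D array method, e.g. a SentenceTransformer."""
    # Naive sentence split; the merged code may use a more robust splitter.
    sentences = [s.strip() for s in prompt.split(".") if s.strip()]
    embeddings = model.encode(sentences)  # one batched call for all sentences

    def best_match(vec, centroids):
        # Cosine similarity against every centroid in the group; keep the highest-scoring label.
        scores = {
            label: float(np.dot(vec, c) / (np.linalg.norm(vec) * np.linalg.norm(c)))
            for label, c in centroids.items()
        }
        top = max(scores, key=scores.get)
        return {"label": top, "similarity": round(scores[top], 4)}

    return {
        "prompts": [
            {
                "sentence": s,
                "positive_value": best_match(e, positive_centroids),
                "negative_value": best_match(e, negative_centroids),
            }
            for s, e in zip(sentences, embeddings)
        ]
    }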

API

Endpoint

GET /values?prompt=<url-encoded-prompt>

Query parameters

  • prompt (string, required): full prompt or conversation text to analyze.

Response (JSON schema)

{
  "prompts": [
    {
      "sentence": "string",
      "positive_value": {
        "label": "string",
        "similarity": 0.0
      },
      "negative_value": {
        "label": "string",
        "similarity": 0.0
      }
    }
  ]
}

Each sentence is matched against all positive centroids and all negative centroids; the highest-similarity match from each group is returned.

Example response

{
  "prompts": [
    {
      "sentence": "Retrieve all purchases performed by client_id=1234 in the last 30 days.",
      "positive_value": { "label": "money", "similarity": 0.2167 },
      "negative_value": { "label": "non-violent crimes", "similarity": 0.1914 }
    },
    {
      "sentence": "Avoid any verbosity in your answer.",
      "positive_value": { "label": "forthright and honesty", "similarity": 0.5542 },
      "negative_value": { "label": "hate", "similarity": 0.2234 }
    }
  ]
}

Example curl

curl -G "http://localhost:8080/values" --data-urlencode "prompt=Retrieve all purchases. Avoid verbosity."
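
The same call from Python, for reference (assumes the service is running locally on port 8080, as in the curl example):

import requests

resp = requests.get(
    "http://localhost:8080/values",
    params={"prompt": "Retrieve all purchases. Avoid verbosity."},  # URL-encoded by requests
    timeout=30,
)
resp.raise_for_status()
for item in resp.json()["prompts"]:
    print(item["sentence"], item["positive_value"], item["negative_value"])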

Implementation details (concise)

  • Cache scope: endpoint-specific (/values only); legacy routes unchanged (see the caching sketch after this list).
  • Centroid handling: load JSON once → convert to numpy arrays for vectorized cosine ops.
  • Vector pipeline: sentence-split → SentenceTransformer embed → cosine similarity vs. all positive centroids → take max (positive) → cosine vs. all negative centroids → take max (negative) → return both per sentence.
  • Performance: ~12s initial load, ~2s subsequent requests with cache.
  • Notes: every sentence always receives both a positive and negative match (highest similarity from each category). Short fragments may yield uniformly low similarities.
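
A rough sketch of the lazy cache described in the bullets above, using the module-level names mentioned later in the review (_values_embedding_fn, _values_centroids); the paths and JSON layout are assumptions:

import json
import numpy as np
from sentence_transformers import SentenceTransformer

_values_embedding_fn = None   # SentenceTransformer instance, loaded on first /values hit
_values_centroids = None      # centroid vectors as numpy arrays, loaded once from JSON

def _load_values_resources(model_path="models/all-MiniLM-L6-v2",
                           centroids_path="values_centroids.json"):
    global _values_embedding_fn, _values_centroids
    if _values_embedding_fn is None:
        _values_embedding_fn = SentenceTransformer(model_path)  # local path, no re-download
    if _values_centroids is None:
        with open(centroids_path) as f:
            raw = json.load(f)
        # Convert JSON lists to numpy arrays once, so cosine similarity stays vectorized.
        _values_centroids = {
            polarity: {label: np.asarray(vec) for label, vec in group.items()}
            for polarity, group in raw.items()
        }
    return _values_embedding_fn, _values_centroids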

Cookbook (notebook highlights)

prompt_values_assessment.ipynb includes:

  • Single-prompt sentence breakdown

  • Multi-turn conversation analysis (agent-by-agent tracking)

  • Harmful/toxic detection via negative-value scores

  • Cross-agent pattern comparison

  • DB query analysis (privacy/security mappings)

All examples output JSON responses only.


Files changed

  • app.py
  • control/recommendation_handler.py
  • cookbook/prompt_values_assessment.ipynb

@ArionDas ArionDas marked this pull request as draft October 17, 2025 21:38
@ArionDas
Contributor Author

@santanavagner

I've raised a draft PR for issue #126.
I've added a report of the additions.

@ArionDas
Contributor Author

@santanavagner

Tagging you here to verify the PR I raised for issue #126.
Please let me know of any changes required.

Thank you.
cc: @cassiasamp @Mystic-Slice

santanavagner
santanavagner previously approved these changes Oct 24, 2025
Collaborator

@santanavagner santanavagner left a comment


Hi @ArionDas

Contribution approved! 🙌

Thank you!

@santanavagner
Collaborator

@ArionDas

There is a pending check.
Could you please sign the DCO?
This can be done by clicking the "Set DCO to pass" button.

Thank you!

…for individual sentences

Signed-off-by: ArionDas <ariondasad@gmail.com>
@ArionDas
Contributor Author

@santanavagner
I've made the sign-off.
Not sure why it says I have "dismissed" your review (I didn't mean to do any such thing).
Let me know if any other changes are required.

Thank you.

@santanavagner santanavagner marked this pull request as ready for review October 24, 2025 22:09
@santanavagner
Collaborator

@cassiasamp or @Mystic-Slice,

Could you please approve this PR as merging is blocked for me?

Cheers,
Vagner

@cassiasamp
Collaborator

Hi @santanavagner, has the code been updated?

I noticed there's a message about dismissing reviews, and in the first file there are already two global variables, _values_embedding_fn and _values_centroids, that don't seem to be constants.

@ArionDas
Contributor Author

@cassiasamp
The global variables are used to avoid re-fetching the centroid vectors from the JSON every time we compute the positive and negative similarity values for the different sentences in the input.

But let me know if this pattern should be avoided.

Collaborator

@Mystic-Slice Mystic-Slice left a comment


Hey @ArionDas.
Nice work!

I have requested a minor change. I think everything else looks good.

I would like to know if there was a reason to return both a positive and negative value for each sentence. Even in the example you have shown in this PR,

{
  "prompts": [
    {
      "sentence": "Retrieve all purchases performed by client_id=1234 in the last 30 days.",
      "positive_value": { "label": "money", "similarity": 0.2167 },
      "negative_value": { "label": "non-violent crimes", "similarity": 0.1914 }
    },
    {
      "sentence": "Avoid any verbosity in your answer.",
      "positive_value": { "label": "forthright and honesty", "similarity": 0.5542 },
      "negative_value": { "label": "hate", "similarity": 0.2234 }
    }
  ]
}

I would say the negative value 'hate' doesn't make sense in the second sentence. Maybe we should use some thresholding?
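
For illustration only (the 0.3 cutoff and helper name are arbitrary assumptions, not part of this PR), a threshold could suppress low-confidence matches like that one:

SIMILARITY_THRESHOLD = 0.3  # arbitrary example value, would need tuning

def apply_threshold(match, threshold=SIMILARITY_THRESHOLD):
    # Return the match only if it clears the threshold, otherwise drop it.
    return match if match["similarity"] >= threshold else None

# apply_threshold({"label": "hate", "similarity": 0.2234})  -> None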

And finally, good job on adding caching. We might want to generalize caching to other endpoints later, too.

prompt,
positive_embeddings,
negative_embeddings,
embedding_fn = None
Collaborator


Within the get_values function, please add a check for when embedding_fn is None and create an embedding function. You can look at get_thresholds and recommend_prompt to see how it is done.

I know it is not absolutely necessary currently, but it's better to have this safeguard in.
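
Roughly like this (the default model path is an assumption; the actual fallback used in get_thresholds and recommend_prompt may differ):

from sentence_transformers import SentenceTransformer

def get_values(prompt, positive_embeddings, negative_embeddings, embedding_fn=None):
    # Safeguard: create a default embedding function when none is supplied.
    if embedding_fn is None:
        embedding_fn = SentenceTransformer("models/all-MiniLM-L6-v2").encode
    sentences = [s.strip() for s in prompt.split(".") if s.strip()]
    embeddings = embedding_fn(sentences)
    # ... remainder of the matching logic unchanged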

Contributor Author


I had asked for this confirmation earlier here: conversation

@santanavagner clarified we might not need a threshold yet, as we are taking the maximum similarity value every time. Also, I believe it's good to have both in the root logic. In subsequent applications (multi-agent conversations, let's say), we can have thresholds, but this guarantees we get both values every time.

Thoughts?

Contributor Author


@Mystic-Slice
I have addressed the concerns, let me know if anything else needs to be changed.

(Just wanted to point out one thing: I was thinking of adding a provision to clean up the cache. There are two options: an additional manual API to clear it, or a TTL-based implementation, which would require custom decorators. So I wanted your opinion before implementing.)

cc: @santanavagner, @cassiasamp

Collaborator

@Mystic-Slice Mystic-Slice Oct 28, 2025


I think for now, this is enough.
The cache works fine, and the endpoint is quick with it. I see no need for cleanup as of now because the embeddings do not change mid-deployment, but maybe in the future.
We can add cleanup if the need arises. I'm just trying to avoid unnecessarily complicating the code.

Contributor Author


Exactly the point.

Thank you.

…sequent calls

Signed-off-by: ArionDas <ariondasad@gmail.com>
@ArionDas
Contributor Author

ArionDas commented Oct 27, 2025

Technical Changes

Backend — recommendation_handler.py

  • Batched embeddings: get_values() now generates embeddings in a single batch (reduces embedding calls from N → 1 per request; see the sketch after this list).
  • Input validation & edge-case handling: skip/handle empty sentences and whitespace-only prompts.
  • Behavior: split prompt into sentences → batch-embed → compute cosine similarity vs precomputed value centroids → return top positive & negative matches per sentence.
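
A before/after sketch of the batching change (the model path and function names are illustrative):

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("models/all-MiniLM-L6-v2")

def embed_per_sentence(sentences):
    # Old shape: one encode() call per sentence -> N calls per request.
    return [model.encode([s])[0] for s in sentences]

def embed_batched(sentences):
    # New shape: a single encode() call for the whole list -> 1 call per request.
    return model.encode(sentences)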

API Layer — app.py

  • Thread-safe caching: replaced mutable globals with functools.lru_cache-backed lazy caches for the embedding model and centroids (see the sketch after this list).

  • /values endpoint

    • Request: query param prompt (string).
    • Response: JSON with sentence-wise value associations (positive & negative labels + cosine similarity scores).
    • HTTP status codes: 400 for invalid input, 500 for server errors.
  • Observability & reliability: structured error handling plus logging for production monitoring (sanitized client errors; full stack traces in logs).
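
A minimal sketch of the lru_cache-backed refactor described above (function names, paths, and JSON layout are assumptions):

import json
from functools import lru_cache

import numpy as np
from sentence_transformers import SentenceTransformer

@lru_cache(maxsize=1)
def get_values_model(model_path="models/all-MiniLM-L6-v2"):
    # Loaded once per process and reused by every subsequent /values request.
    return SentenceTransformer(model_path)

@lru_cache(maxsize=1)
def get_values_centroids(centroids_path="values_centroids.json"):
    with open(centroids_path) as f:
        raw = json.load(f)
    # Converted to numpy arrays once for vectorized cosine similarity.
    return {
        polarity: {label: np.asarray(vec) for label, vec in group.items()}
        for polarity, group in raw.items()
    }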

Documentation — prompt_values_assessment.ipynb

  • I've added a cell to showcase the impact of caching, which saves time on subsequent calls.
  • Documented caching optimization and recommended deployment considerations.

Example: /values response (JSON)

{
  "prompts": [
    {
      "text": "We prioritize transparency across teams.",
      "positive": { "label": "Transparency", "score": 0.93 },
      "negative": { "label": "Opaque Practices", "score": 0.07 }
    }
  ]
}

Collaborator

@Mystic-Slice Mystic-Slice left a comment


Looks good to me!

@cassiasamp This PR is ready to be merged.

Nice work! @ArionDas.

@cassiasamp cassiasamp merged commit efbb422 into IBM:main Oct 28, 2025
2 checks passed
@ArionDas
Contributor Author

ArionDas commented Oct 28, 2025

Thank you @cassiasamp

The obvious next step is changing the default embedding model to Granite-4.0-1b 😎
It outperformed the Qwen & Llama families in the same parameter range.
cc: @santanavagner @Mystic-Slice

Granite vs Qwen & Llama
