This repository was archived by the owner on Jun 5, 2025. It is now read-only.

@jhrozek jhrozek commented Dec 3, 2024

  • Change the output pipeline to return a list - The first version of the output pipeline returned either a ModelResponse, if the step passed a chunk through, or None, if the chunk was to be paused. Steps that enrich the streaming output need to inject chunks into the output flow, so the processing signature now returns a list of chunks: an empty list means pause, and if one or more chunks are returned, they are streamed to the client.
  • Improve the code block regex and finding code snippets - Splits the regex into a multi-line one so that it is actually readable. Changes the regex so that the language is only the first word, up to the first whitespace. Handles code blocks that specify only the language, which is common in output snippets, e.g.:
    ```python
    print("Hi")
    ```
  • Adds a streaming CodeCommentStep that adds a comment after every code block - We want to decorate code blocks streamed back to the user in a reply. To do that, we first change the OutputPipelineInstance to buffer the full reply. Then, as the reply is streamed, we check whether each chunk results in a new code snippet. If it does, we print the language of the snippet (a placeholder that will change). This will serve as the basis for checking whether a code snippet contains any malicious or archived packages.
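
The list-returning contract and the comment-after-code-block idea from the list above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: chunks are plain strings instead of ModelResponse objects, and `OutputStep`, `SkipEmptyStep`, `CommentAfterCodeBlock`, and `run_pipeline` are hypothetical names invented for the example.

```python
from typing import Iterable, Iterator, List, Protocol

FENCE = "`" * 3  # Markdown code fence marker (three backticks)


class OutputStep(Protocol):
    """Hypothetical step interface: every step returns a list of chunks."""

    def process_chunk(self, chunk: str, buffer: List[str]) -> List[str]: ...


class SkipEmptyStep:
    """Pauses empty chunks by returning an empty list."""

    def process_chunk(self, chunk: str, buffer: List[str]) -> List[str]:
        return [chunk] if chunk else []


class CommentAfterCodeBlock:
    """Injects an extra chunk right after a code block closes."""

    def process_chunk(self, chunk: str, buffer: List[str]) -> List[str]:
        # `buffer` holds every chunk received so far, ending with `chunk`.
        fences_before = "".join(buffer[:-1]).count(FENCE)
        fences_now = "".join(buffer).count(FENCE)
        # Every second fence closes a block.
        if fences_now // 2 > fences_before // 2:
            return [chunk, "\n(comment about the snippet above)\n"]
        return [chunk]


def run_pipeline(chunks: Iterable[str], steps: List[OutputStep]) -> Iterator[str]:
    """Buffer the reply and pass each chunk through every step in order."""
    buffer: List[str] = []
    for chunk in chunks:
        buffer.append(chunk)
        current = [chunk]
        for step in steps:
            produced: List[str] = []
            for item in current:
                produced.extend(step.process_chunk(item, buffer))
            current = produced
            if not current:  # an empty list means nothing is streamed
                break
        yield from current
```

Because a step can return zero, one, or several chunks, pausing and injecting use the same return type, and the caller only has to stream whatever comes back.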

Fixes: #161
Fixes: #90
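
For reference, a multi-line regex of the kind described in the second item above might look like the sketch below. The pattern is an assumption written for illustration, not the one merged in this PR; `CODE_BLOCK` and `find_snippets` are hypothetical names. It uses `re.VERBOSE` so each part of the pattern can carry its own comment.

```python
import re
from typing import List, Tuple

CODE_BLOCK = re.compile(
    r"""
    `{3}                    # opening fence (three backticks)
    (?P<language>[^\s`]*)   # language: first word, up to the first whitespace
    [^\n]*\n                # rest of the opening line (e.g. a filename), ignored
    (?P<code>.*?)           # snippet body, matched lazily
    `{3}                    # closing fence
    """,
    re.DOTALL | re.VERBOSE,
)


def find_snippets(text: str) -> List[Tuple[str, str]]:
    """Return (language, code) pairs for every fenced block in `text`."""
    return [(m.group("language"), m.group("code")) for m in CODE_BLOCK.finditer(text)]
```

A fence that carries only a language, as in the `python` example above, yields that word as the language; anything after the first whitespace on the opening line is dropped.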


@lukehinds lukehinds left a comment

Looks good!

We could have @poppysec give the language selection another look later as she is doing a lot of regex stuff in monocle.

@jhrozek jhrozek merged commit 84e0bd1 into stacklok:main Dec 4, 2024
Development

Successfully merging this pull request may close these issues.

Decorate code snippets post streaming
[Task] Modify outputs for guardrail hooks