[AI Tutor] CT-432: Get profanity and PII filters working as expected #58083
Conversation
…mily/ct-427/students-provide-feedback
…mily/ct-432/flag-pii
};
if (!response) return {status: Status.ERROR};

switch (response.status) {
This logic maps the response.status from the existing ShareFilter API (which is the type of violation) to one of two non-generic error statuses, profanity_violation or pii_violation, that we'll store in the db. The previous checks (response?.status === Status.PROFANITY_VIOLATION) would never have passed, and I switched to a switch statement because of the many-to-one mapping of ShareFilter statuses to AITutorInteractionStatus.
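A rough sketch of that many-to-one mapping (the status values below are illustrative assumptions, not the codebase's actual ShareFilter or AITutorInteractionStatus enum members):

```typescript
// Illustrative sketch only: these status values are assumptions,
// not the real ShareFilter / AITutorInteractionStatus enums.
type ShareFilterStatus = 'profanity' | 'email' | 'phone' | 'address' | 'ok';
type AITutorInteractionStatus = 'profanity_violation' | 'pii_violation' | 'ok';

function toInteractionStatus(status: ShareFilterStatus): AITutorInteractionStatus {
  switch (status) {
    case 'profanity':
      return 'profanity_violation';
    // Several distinct PII-type statuses collapse into one stored status,
    // which is why a switch reads better than separate equality checks.
    case 'email':
    case 'phone':
    case 'address':
      return 'pii_violation';
    default:
      return 'ok';
  }
}
```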
@@ -105,7 +104,7 @@ export const askAITutor = createAsyncThunk(
   if (chatApiResponse.assistantResponse) {
     const assistantChatMessage: ChatCompletionMessage = {
       role: Role.ASSISTANT,
-      status: Status.OK,
+      status: chatApiResponse.status,
Storing the chatApiResponse status from the server lets us style the messages appropriately later.
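For instance, a hypothetical styling helper keyed off the stored status (the class names here are made up for illustration and are not from the codebase):

```typescript
// Hypothetical helper: pick a CSS class from the stored message status.
// Class names and status strings are assumptions for illustration.
function messageClassName(status: string): string {
  switch (status) {
    case 'profanity_violation':
    case 'pii_violation':
      // Flagged messages get a "redacted" treatment.
      return 'chat-message--redacted';
    default:
      return 'chat-message';
  }
}
```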
  'This chat has been hidden because it is inappropriate.';
const PII_VIOLATION_USER_MESSAGE =
  'This chat has been hidden because it contains personal information.';
const ERROR_USER_MESSAGE =
I standardized the variable names in this file so they all refer to "PII violation" and "profanity violation" instead of the more generic "inappropriate"/"personal."
  return null;
};

const UserMessage: React.FC<UserMessageProps> = ({message}) => {
Breaking this up into a functional component called displayUserMessage felt redundant, so I condensed it.
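The condensed logic might look roughly like this (the message constants match the ones shown earlier in the diff; the helper's name, the status strings, and its shape are assumptions):

```typescript
// Sketch of the condensed redaction logic inside UserMessage.
// The displayText helper and status strings are assumptions for illustration.
const PROFANITY_VIOLATION_USER_MESSAGE =
  'This chat has been hidden because it is inappropriate.';
const PII_VIOLATION_USER_MESSAGE =
  'This chat has been hidden because it contains personal information.';

function displayText(status: string, originalText: string): string {
  // Flagged messages are replaced with a standard notice; everything
  // else renders the student's original text.
  if (status === 'profanity_violation') return PROFANITY_VIOLATION_USER_MESSAGE;
  if (status === 'pii_violation') return PII_VIOLATION_USER_MESSAGE;
  return originalText;
}
```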
AI_TUTOR_TYPES = {
  COMPILATION: 'compilation',
  VALIDATION: 'validation',
  GENERAL_CHAT: 'general_chat',
}.freeze

PII_TYPES = {
We don't use these anywhere on the backend, so I removed them. cc @bencodeorg
One question on logging, otherwise looks good!
// Analogous to https://github.com/code-dot-org/ml-playground/pull/299
// We want to expose enough information to help troubleshoot false positives
const logViolationDetails = (response: OpenaiChatCompletionMessage) => {
  console.info('Violation detected in chat completion response', {
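The excerpt above is truncated; one plausible shape for the logged payload (the type definition and field names below are guesses, not the actual code) that surfaces the violation type for troubleshooting false positives:

```typescript
// Guessed payload shape; the real OpenaiChatCompletionMessage fields may differ.
type OpenaiChatCompletionMessage = {status: string; flaggedContent?: string};

const logViolationDetails = (
  response: OpenaiChatCompletionMessage
): {status: string; flaggedContent?: string} => {
  // Log the violation type (and, hypothetically, the flagged snippet)
  // so false positives can be investigated.
  const details = {
    status: response.status,
    flaggedContent: response.flaggedContent,
  };
  console.info('Violation detected in chat completion response', details);
  return details;
};
```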
Should we log this to CloudWatch instead? Or in addition to this?
Great idea. It'd be nice to not have to wait for users to report these to see how aggressive it's being. Thanks, Molly!
Seems reasonable to log a generic event for now!
@molly-moen I added a generic metric so we can view frequency, but I'd want to check with Travis before I start logging student PII/profanity to CloudWatch. Does that seem reasonable to you? I also removed the use of the Lab2MetricsReporter per conversation with @sanchitmalhotra126. Diff: efe22e3
This PR updates the UserMessage component to correctly "redact" any flagged messages.

Links
Jira ticket 1: https://codedotorg.atlassian.net/browse/CT-432
Jira ticket 2: https://codedotorg.atlassian.net/browse/CT-442
Testing story
Tested manually.
4_29_demo.mov
Here's what the log will look like.
Deployment strategy
Follow-up work
This style of "redacting" messages is unique to AI Tutor; Gen AI renders dismissible "alerts" instead. This is fine for now according to Mark, but I will reach out in the #ai-for-student-learning channel to continue the conversation.
PR Checklist: