
[AiLab] prevent save of trained models if they don't pass share filtering #40915

Merged
merged 3 commits into staging from profanity-check-ml-models on Jun 7, 2021

Conversation


@Erin007 Erin007 commented Jun 2, 2021

We don't want students to be able to save trained machine learning models if they contain profanity or personally identifiable information (PII). Saved models can be imported into App Lab apps, which can be published or remixed, and we don't want indecent or unsafe information shared. Prior to saving a model, we now run its data through the share filter, which checks for profanity, emails, phone numbers, and street addresses, and prevents the model from saving if any are found.
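A minimal sketch of that flow, assuming a Rails-style controller action (the action name, param names, and `store_model` helper are placeholders for illustration, not the actual dashboard code):

```ruby
# Hypothetical save endpoint: run the share filter before persisting the model.
def save_trained_model
  model_data = params[:data]
  return head :bad_request if model_data.nil? || model_data == ""

  # ShareFiltering.find_failure returns nil when the text is clean, otherwise a
  # failure describing what was found (profanity, email, phone, or address).
  failure = ShareFiltering.find_failure(model_data.to_s, request.locale)
  if failure
    return render json: {id: params[:id], status: "failure", details: failure.type}
  end

  store_model(params[:id], model_data) # placeholder for the real storage call
  render json: {id: params[:id], status: "success"}
end
```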

We'll show an alternate fail message if profanity or PII is found:
[Screenshot: alternate fail message shown when profanity or PII is detected]

code-dot-org/ml-playground#218

@Erin007 Erin007 requested review from breville, made-line and a team June 2, 2021 20:45
```ruby
return head :bad_request if model_data.nil? || model_data == ""

# Run the serialized model data through the share filter before saving.
profanity_or_pii = ShareFiltering.find_failure(model_data.to_s, request.locale)
if profanity_or_pii
  render json: {id: model_id, status: "failure", details: profanity_or_pii.type}
end
```
A reviewer (Member) commented:

Out of curiosity, what are the possible values of profanity_or_pii.type? We want to make sure all possible values are unique to profanity filtering failures, so that the client code doesn't have to make too many assumptions about what type of error this is. Otherwise, we could return a more explicit error indicating that it relates to profanity filtering...

@Erin007 (Contributor, Author) replied:

The possible values for profanity_or_pii.type are email, phone, address, and profanity, as defined in the ShareFiltering source.
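As a quick illustration, a hedged sketch of branching on those type values (only ShareFiltering.find_failure and .type come from this PR; everything else is illustrative):

```ruby
failure = ShareFiltering.find_failure(model_data.to_s, request.locale)
case failure&.type
when "email", "phone", "address"
  # Personally identifiable information was detected in the model data.
when "profanity"
  # Profane language was detected in the model data.
when nil
  # No failure returned: the data passed the share filter.
end
```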

@Erin007 (Contributor, Author) replied:

I think I see what you mean; I updated `details` to be more specific.

@Erin007 (Contributor, Author) replied:

We decided that it wasn't important to specify the type of error, so I modified the response to render a new status, `piiProfanity`, which the code in AI Lab now handles to display the correct, generic message.
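For clarity, a rough sketch of what the updated render could look like (the exact JSON shape is an assumption; only the `piiProfanity` status comes from the comment above):

```ruby
if profanity_or_pii
  # Don't expose which filter matched; AI Lab shows a single generic message.
  return render json: {id: model_id, status: "piiProfanity"}
end
```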

@Erin007 Erin007 merged commit c269f4f into staging Jun 7, 2021
@Erin007 Erin007 deleted the profanity-check-ml-models branch June 7, 2021 22:01