Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

String sanitation for forms and serializers #1395

Merged
merged 6 commits into from May 30, 2017
Merged

String sanitation for forms and serializers #1395

merged 6 commits into from May 30, 2017

Conversation

oliverroick
Copy link
Member

@oliverroick oliverroick commented Apr 10, 2017

Proposed changes in this pull request

Detailed discussion of added features

  • Adds function sanitize_string to core.validators which is used throughout all forms and serializers to verify if a string will be accepted or not. The function uses the library beautifulsoup to identify HTML code, and two regular expressions to identify emojis and starting +, -, + or @. The function returns False if the provided string matches any of the regular expressions or contains HTML.
  • Adds sanitize.js to cadasta/core/static/js, which introduces a new Parsley validator to sanitise fields on the client-side.
  • Adds the template filter set_parsley_sanitize to core.templatetags.filters, which adds the attribute data-parsley-sanitize to all form fields where the filter is applied. The additional attribute indicates to Parsley that this field should be validated using the validator added through sanitize.js.
  • Adds a new form mixing SanitizeFieldsForm to core.form_mixins that checks the values or all CharField instances using core.validators.santize_string. The mixin is added to all forms where we allow input of strings.
  • Adds a new serializer mixin SanitizeFieldSerializer to core.serializers that checks the values of all CharField instances and all items in all JSONField instances using core.validators.santize_string. The mixin is added to all serializers where we allow input of strings.
  • Adds a new serializer mixin JSONAttrsSerializer to core.serializers that validates contents of the field attributes using jsonattrs. jsonattrs validation wasn't implemented as of yet which allowed the injection of invalid values to be added via the API, which caused the platform to break in some instances.

When should this PR be merged

This is a critical fix, so it should be merged as soon as possible.

Risks

The PR itself is low risk; we should test the changes thoroughly on staging because it affects most parts of the platform.

Follow-up actions

Checklist (for reviewing)

General

  • Is this PR explained thoroughly? All code changes must be accounted for in the PR description.
  • Is the PR labeled correctly? It should have the migration label if a new migration is added.
  • Is the risk level assessment sufficient? The risks section should contain all risks that might be introduced with the PR and which actions we need to take to mitigate these risks. Possible risks are database migrations, new libraries that need to be installed or changes to deployment scripts.

Functionality

  • Are all requirements met? Compare implemented functionality with the requirements specification.
  • Does the UI work as expected? There should be no Javascript errors in the console; all resources should load. There should be no unexpected errors. Deliberately try to break the feature to find out if there are corner cases that are not handled.

Code

  • Do you fully understand the introduced changes to the code? If not ask for clarification, it might uncover ways to solve a problem in a more elegant and efficient way.
  • Does the PR introduce any inefficient database requests? Use the debug server to check for duplicate requests.
  • Are all necessary strings marked for translation? All strings that are exposed to users via the UI must be marked for translation.

Tests

  • Are there sufficient test cases? Ensure that all components are tested individually; models, forms, and serializers should be tested in isolation even if a test for a view covers these components.
  • If this is a bug fix, are tests for the issue in place? There must be a test case for the bug to ensure the issue won’t regress. Make sure that the tests break without the new code to fix the issue.
  • If this is a new feature or a significant change to an existing feature? has the manual testing spreadsheet been updated with instructions for manual testing?

Security

  • Confirm this PR doesn't commit any keys, passwords, tokens, usernames, or other secrets.
  • Are all UI and API inputs run through forms or serializers?
  • Are all external inputs validated and sanitized appropriately?
  • Does all branching logic have a default case?
  • Does this solution handle outliers and edge cases gracefully?
  • Are all external communications secured and restricted to SSL?

Documentation

  • Are changes to the UI documented in the platform docs? If this PR introduces new platform site functionality or changes existing ones, the changes must be documented in the Cadasta Platform Documentation.
  • Are changes to the API documented in the API docs? If this PR introduces new API functionality or changes existing ones, the changes must be documented in the API docs.
  • Are reusable components documented? If this PR introduces components that are relevant to other developers (for instance a mixin for a view or a generic form) they should be documented in the Wiki.

var macros = /^[-=+@]/;
return !value.match(emojis) && !value.match(macros);
}, 2)
.addMessage('sanitize', gettext('Input can not contain HTML tags, emojis or start with any of + - = and @.'));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The message here indicates that HTML tags are not allowed, but the validator does not contain any logic to check for HTML tags.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@oliverroick: Is this feedback invalid?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not. I totally missed replying to this, thanks for reminding me again.

I've added a check for HTML to sanitize.js as well.

@linzjax
Copy link
Contributor

linzjax commented Apr 12, 2017

I know there was some conversation about this in the issue, but did we decide emoji's in the project name was ok?

screen shot 2017-04-12 at 12 16 30 pm

I also got a "Please provide a name" error when I added an emoji to the contact fields, rather than an error explaining that "there's no way my name does not contain an emoji so stop messing around".

@oliverroick
Copy link
Member Author

@linzjax Crap it looks like the regex I wrote doesn't cover unicorns, I'll look into that tomorrow. Also need to look at the names thing.

@linzjax
Copy link
Contributor

linzjax commented Apr 12, 2017

I know apple added a ridiculous new set of emojis and fancied up old ones that aren't included in the standard emoji set.

Also,

Crap it looks like the regex I wrote doesn't cover unicorns

is a hilarious sentence.

@oliverroick
Copy link
Member Author

@linzjax I added new regex definitions that should also cover unicorns.

Copy link
Contributor

@linzjax linzjax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't mind me just coming through trying to break things 🙅 :

screen shot 2017-04-19 at 4 04 08 pm

Two things weird about this:

  1. I can search for emojis and it shows up in the url
  2. It returned a document (which I didn't know was possible on my local machine) for a document that I tried to name "👩‍🎓" and it wouldn't let me.

@linzjax
Copy link
Contributor

linzjax commented Apr 19, 2017

I can also import emojis and super useless html.

screen shot 2017-04-19 at 4 23 28 pm

screen shot 2017-04-19 at 4 25 22 pm

@oliverroick
Copy link
Member Author

@linzjax Sanitizing imports now too

@@ -273,6 +275,7 @@ def _map_attrs_to_content_types(self, headers, row, content_types,
continue
if selector in ['DEFAULT', content_type.get('type', '')]:
for attr in attrs:
print(attrs)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wayward print sighting

Copy link
Contributor

@linzjax linzjax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok besides the print statement, one more question: have you run this against ODK submissions? My Test phone is dead and apparently I didn't bring an android charger with me, but I don't see anything is xforms/mixins/model_helper.py

@oliverroick
Copy link
Member Author

@linzjax Of course GeoODK submissions were not sanitized, but they should be now.

Copy link
Contributor

@linzjax linzjax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Al➡️, that covers everything I can 🤔 of, which means I'm done 🐝ing a pain in your 🐴 now.

@wonderchook
Copy link
Contributor

@seav can you finish reviewing this please?

Copy link
Contributor

@seav seav left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did some tests and this looks good to me.

@amplifi
Copy link
Contributor

amplifi commented May 25, 2017

Organization form validates tag inputs:

screen shot 2017-05-25 at 4 47 50 pm

@amplifi
Copy link
Contributor

amplifi commented May 25, 2017

User registration form validates emoji and tags, including saving for full name (same also allowed on editing user profile):

screen shot 2017-05-25 at 5 06 50 pm

screen shot 2017-05-25 at 5 07 57 pm

screen shot 2017-05-25 at 5 08 25 pm

@amplifi
Copy link
Contributor

amplifi commented May 25, 2017

When adding a new project or editing project details, phone number field doesn't display error when invalid string is submitted. Just refreshes form page without visual cues or error message:

screen shot 2017-05-25 at 5 28 46 pm

@amplifi
Copy link
Contributor

amplifi commented May 25, 2017

Add location form, edit location relationship form, add resource form validate with tags:

screen shot 2017-05-25 at 5 37 46 pm

screen shot 2017-05-25 at 5 45 19 pm

screen shot 2017-05-25 at 5 52 52 pm

.addValidator('sanitize', function (value, requirement) {
function isHTML(str) {
const doc = new DOMParser().parseFromString(str, "text/html");
return Array.from(doc.body.childNodes).some(function(node) {return node.nodeType === 1});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This JavaScript code apparently does not properly check for <script> and <style> tags. We should also do the conditional checks below. If any of them is true, then the string is not sanitized.

  • doc.styleSheets.length > 0
  • doc.scripts.length > 0

@amplifi
Copy link
Contributor

amplifi commented May 25, 2017

On submission, Edit resource form shows green validation for fields containing tags (Name & Description), then after form reloads it shows the red error cues.

@bjohare
Copy link
Contributor

bjohare commented May 25, 2017

We should probably be sanitizing questionnaires as well..
questionnaire

@amplifi
Copy link
Contributor

amplifi commented May 25, 2017

We should consolidate any additional fixes/sanitization into a new GitHub Issue (or Issues). As long as there are no breaking changes, and I haven't found any yet, this PR has to merge for the upcoming release. Better to have most inputs validated than none.

@oliverroick
Copy link
Member Author

I added client-side code to catch script and style elements. Also added code to sanitize user registration and profile updates.

@bjohare Questionnaires should be sanitized, could you send me the form you used?

@bjohare
Copy link
Contributor

bjohare commented May 26, 2017

@oliverroick I just used the standard questionnaire with an emoji added to the party_type label

@amplifi amplifi merged commit f2e6917 into master May 30, 2017
@amplifi amplifi deleted the bugfix/#1157 branch May 30, 2017 09:15
@linzjax
Copy link
Contributor

linzjax commented Jun 1, 2017

When signing in, I can submit fields with emojis. I can't sign in, but they are submitted.


If I go through the "password reset" process, I can reset my password to contain an emoji.


Similar to the sign in form, the "add member" form submits with emojis.


Organization and Project URL fields accept emojis and script tags

screen shot 2017-06-01 at 11 18 59 am

screen shot 2017-06-01 at 11 38 22 am


I can't add resource with a filename that contains an emoji, but I also don't get an error
screen shot 2017-06-01 at 11 31 25 am

@dpalomino
Copy link

Some feedback, uploading questionnaires when creating projects:

  1. Not sure if this is an issue, but I can upload a questionnaire containing an emoji, although at the end of the project creation process (step 3, user permissions), project is rejected. Tried including emojis in labels, names, types, choices, settings

  2. You can include an emoji in a column name of a questionnaire (I tested with "tittle" column in settings tab). Interestingly:

    • If included in the "tittle" column of the settings tab, it is accepted. Example here
    • If included in the "name" column of the survey tab, it is not accepted (but you are able to upload)
  3. I tried to change the tab name to include an emoji, and was rejected but not because we were filtering emojis, but because there were missing information:

screen shot 2017-06-01 at 19 21 01

For scripts:

  1. Same behaviour. You can generally upload a questionnaire containing script chars (I tried with <script>), although at the end of the project creation process (step 3, user permissions), project is rejected. Tried including script code in labels, names, types, choices, settings

  2. Interestingly, including script in the "hint" column (hint<script>asdf</script>)is accepted and you can actually create a project, see here.

  3. Same behaviour for title column in settings tab, was able to include "title<script>ha!</script>" and project was created (check here)

@dpalomino
Copy link

Regarding search:

  1. I can include emojis in the search:
    screen shot 2017-06-01 at 19 47 45

  2. And the same for the script code:
    screen shot 2017-06-01 at 20 01 15

@amplifi
Copy link
Contributor

amplifi commented Jun 1, 2017

Search is fine; input isn't stored and Elasticsearch has its own query validation. Same for login.

The main concerns are strings that are stored in the database.

@dpalomino
Copy link

I've created individual tickets #1558, #1559, and #1560 for tracking these issues.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Critical: Emojis should not be valid Org or Project names Input fields should be sanitized to exclude code
7 participants