String sanitation for forms and serializers #1395
cadasta/core/static/js/sanitize.js
Outdated
    var macros = /^[-=+@]/;
    return !value.match(emojis) && !value.match(macros);
  }, 2)
  .addMessage('sanitize', gettext('Input can not contain HTML tags, emojis or start with any of + - = and @.'));
The message here indicates that HTML tags are not allowed, but the validator does not contain any logic to check for HTML tags.
@oliverroick: Is this feedback invalid?
It's not. I totally missed replying to this, thanks for reminding me again.
I've added a check for HTML to sanitize.js as well.
I know there was some conversation about this in the issue, but did we decide emojis in the project name were OK? I also got a "Please provide a name" error when I added an emoji to the contact fields, rather than an error explaining that "there's no way my name does not contain an emoji so stop messing around".
@linzjax Crap, it looks like the regex I wrote doesn't cover unicorns. I'll look into that tomorrow. I also need to look at the names thing.
I know Apple added a ridiculous new set of emojis and fancied up old ones that aren't included in the standard emoji set. Also,
is a hilarious sentence.
@linzjax I added new regex definitions that should also cover unicorns.
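A minimal Python sketch of the kind of emoji and leading-character check discussed in this thread. It is not the regex from this PR: the code point ranges and the names EMOJI_RE, MACRO_RE and is_clean are assumptions for illustration, and the ranges are far from a complete emoji definition. The unicorn (U+1F984) sits in the Supplemental Symbols and Pictographs block, which a regex limited to the older blocks would miss.

```python
import re

# Assumed, incomplete set of code point ranges; not the regex used in the PR.
EMOJI_RE = re.compile(
    "["
    "\U0001F300-\U0001F5FF"  # miscellaneous symbols and pictographs
    "\U0001F600-\U0001F64F"  # emoticons
    "\U0001F680-\U0001F6FF"  # transport and map symbols
    "\U0001F900-\U0001F9FF"  # supplemental symbols, includes the unicorn
    "\u2600-\u27BF"          # miscellaneous symbols and dingbats
    "]"
)

# Same leading characters as the `macros` pattern in sanitize.js.
MACRO_RE = re.compile(r"^[-=+@]")


def is_clean(value):
    """Return True if value has no emoji from the ranges above
    and does not start with +, -, = or @."""
    return not EMOJI_RE.search(value) and not MACRO_RE.match(value)
```

A hyphen inside a value (for example "a-b") passes, because only the first character is checked against the macro pattern.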
@linzjax Sanitizing imports now too
@@ -273,6 +275,7 @@ def _map_attrs_to_content_types(self, headers, row, content_types,
        continue
    if selector in ['DEFAULT', content_type.get('type', '')]:
        for attr in attrs:
            print(attrs)
wayward print sighting
OK, besides the print statement, one more question: have you run this against ODK submissions? My test phone is dead and apparently I didn't bring an Android charger with me, but I don't see anything in xforms/mixins/model_helper.py
@linzjax Of course GeoODK submissions were not sanitized, but they should be now.
Al➡️, that covers everything I can 🤔 of, which means I'm done 🐝ing a pain in your 🐴 now.
@seav can you finish reviewing this please?
Did some tests and this looks good to me.
cadasta/core/static/js/sanitize.js
Outdated
.addValidator('sanitize', function (value, requirement) {
  function isHTML(str) {
    const doc = new DOMParser().parseFromString(str, "text/html");
    return Array.from(doc.body.childNodes).some(function(node) {return node.nodeType === 1});
This JavaScript code apparently does not properly check for <script> and <style> tags. We should also do the conditional checks below. If any of them is true, then the string is not sanitized.
doc.styleSheets.length > 0
doc.scripts.length > 0
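The same gap can be avoided on the server side by flagging any element start tag, wherever it appears in the input. A minimal sketch in Python, using the standard library's html.parser rather than the beautifulsoup library the PR description mentions; the names _TagDetector and contains_html are hypothetical.

```python
from html.parser import HTMLParser


class _TagDetector(HTMLParser):
    """Records whether any element start tag was seen in the input."""

    def __init__(self):
        super().__init__()
        self.found_tag = False

    def handle_starttag(self, tag, attrs):
        # Called for every opening tag, including <script> and <style>.
        self.found_tag = True

    def handle_startendtag(self, tag, attrs):
        # Called for self-closing tags such as <br/>.
        self.found_tag = True


def contains_html(value):
    """Return True if value contains at least one HTML start tag."""
    detector = _TagDetector()
    detector.feed(value)
    detector.close()
    return detector.found_tag
```

Because the parser reports tags as it meets them, it does not matter whether a browser would place the element in the document head or body.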
On submission, the Edit resource form shows green validation for fields containing tags (Name & Description); after the form reloads, it shows the red error cues.
We should consolidate any additional fixes/sanitization into a new GitHub Issue (or Issues). As long as there are no breaking changes, and I haven't found any yet, this PR has to merge for the upcoming release. Better to have most inputs validated than none.
I added client-side code to catch
@bjohare Questionnaires should be sanitized; could you send me the form you used?
@oliverroick I just used the standard questionnaire with an emoji added to the
Some feedback, uploading questionnaires when creating projects:
For scripts:
Search is fine; input isn't stored and Elasticsearch has its own query validation. Same for login. The main concerns are strings that are stored in the database.
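Stored strings can also sit inside nested JSON attributes, so a check has to visit every string in the structure. A minimal Python sketch of such a walk, assuming a validator that returns True for acceptable strings; looks_safe is a stand-in that only applies the leading-character rule, and all_strings_pass is a hypothetical helper, not code from this PR.

```python
import re

MACRO_RE = re.compile(r"^[-=+@]")


def looks_safe(value):
    # Stand-in validator for illustration: only the leading-character check.
    return not MACRO_RE.match(value)


def all_strings_pass(data, validator=looks_safe):
    """Return True if every string value in a nested dict/list passes validator."""
    if isinstance(data, str):
        return validator(data)
    if isinstance(data, dict):
        return all(all_strings_pass(v, validator) for v in data.values())
    if isinstance(data, (list, tuple)):
        return all(all_strings_pass(item, validator) for item in data)
    return True  # numbers, booleans and None carry no text
```

Only values are checked here; whether dictionary keys also need checking depends on where they come from.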
Proposed changes in this pull request
- Sanitizes strings in forms and serializers, rejecting input that starts with +, -, = or @ to prevent injection of potentially malicious macros.

Detailed discussion of added features
- Adds sanitize_string to core.validators, which is used throughout all forms and serializers to verify whether a string will be accepted. The function uses the library beautifulsoup to identify HTML code, and two regular expressions to identify emojis and a leading +, -, = or @. The function returns False if the provided string matches any of the regular expressions or contains HTML.
- Adds sanitize.js to cadasta/core/static/js, which introduces a new Parsley validator to sanitise fields on the client side.
- Adds set_parsley_sanitize to core.templatetags.filters, which adds the attribute data-parsley-sanitize to all form fields where the filter is applied. The additional attribute indicates to Parsley that this field should be validated using the validator added through sanitize.js.
- Adds SanitizeFieldsForm to core.form_mixins, which checks the values of all CharField instances using core.validators.sanitize_string. The mixin is added to all forms where we allow input of strings.
- Adds SanitizeFieldSerializer to core.serializers, which checks the values of all CharField instances and all items in all JSONField instances using core.validators.sanitize_string. The mixin is added to all serializers where we allow input of strings.
- Adds JSONAttrsSerializer to core.serializers, which validates the contents of the field attributes using jsonattrs. jsonattrs validation had not been implemented yet, which allowed invalid values to be injected via the API and caused the platform to break in some instances.

When should this PR be merged
This is a critical fix, so it should be merged as soon as possible.
Risks
The PR itself is low risk; we should test the changes thoroughly on staging because it affects most parts of the platform.
Follow-up actions
Checklist (for reviewing)
General
migration label if a new migration is added.
Functionality
Code
Tests
Security
Documentation