HTML should not be parsed with regex #255
Comments
In fact, you are right.
As far as I understand, the current approach was inspired by django-debug-toolbar. @matthiask @tim-schilling May I hear your opinions on this as django-debug-toolbar leaders, since a similar situation clearly affects django-debug-toolbar as well?
This logic predates my involvement with the toolbar. It looks like it was originally a version of I agree that it isn't the most robust approach, but it's been good enough for the toolbar for a few reasons:
Keep in mind those opinions apply to django-debug-toolbar. django-hijack specifically could consider overriding the base admin templates and providing a template tag for people who don't use the admin or who use a drop-in replacement.
Thank you for your participation @tim-schilling. Actually, I don't know what we should do in this situation. @codingjoe what do you think?
Hi @LincolnPuzey, Thank you for reaching out to us and addressing this. I do agree, this isn't a one-size-fits-all solution, and I do see scenarios where this might break. However, as mentioned by @Mogost, we blatantly stole the idea from However, yes, especially if you are running a site like Stack Overflow, where users share HTML code, you may run into problems. But this is why this is based on a setting that can be adjusted. One option is to set it to something obscure that is virtually impossible to guess, something like:
# key created with secrets.token_urlsafe()
HIJACK_INSERT_BEFORE = "<!--// SUPER_SECRET_KEY // -->"
You now have two options: either add this marker to your base template so the middleware injects the notification there, or leave it out and add the notification to your template yourself. Something like this:
{% if request.user.is_hijacked %}
<!-- Your custom notification or -->
{% include "hijack/notification" with request=request only %}
{% endif %}
Anyhow, here is what I propose we do:
Beyond that, I believe RegEx is suitable to provide both performance and comfort. In the end, this is only to make the integration easier for users. However, we could consider using Python's Thanks @Mogost and @tim-schilling for your remarks. Please feel free to share your thoughts. Best,
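The hard-to-guess marker idea from the comment above can be sketched as follows. This is only an illustration: `make_insert_marker` and the `hijack-marker-` prefix are hypothetical names, not part of django-hijack. The only real API used is `secrets.token_urlsafe()`, which the comment itself mentions.

```python
import secrets


def make_insert_marker(nbytes: int = 32) -> str:
    """Return an HTML comment wrapping a random URL-safe token.

    Hypothetical helper, not part of django-hijack.
    """
    token = secrets.token_urlsafe(nbytes)
    return f"<!-- hijack-marker-{token} -->"


# Generate the marker once, then paste the resulting literal into settings.py
# (as HIJACK_INSERT_BEFORE) and into the base template. Regenerating it on
# every start would leave the template and the setting out of sync.
marker = make_insert_marker()
print(marker)
```

Because user-submitted HTML cannot contain a string it has never seen, a marker like this sidesteps the "users share HTML code" case, at the cost of having to edit the base template by hand.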
Thanks for your thoughts on this everyone. I didn't know the code was copied straight from debug-toolbar. However that only ever runs locally in DEBUG mode. I find it hard to trust a middleware that is totally re-writing the response HTML body in production. My proposed solution would be to split the
Then beginner users can simply use Or more advanced users that want a more robust setup can use Or, given that the @codingjoe @Mogost Thoughts on this?
Hi @codingjoe, You are making a valid point. However, we should also consider the security implications. The notification is a vital part; making sure that it is always visible is important so that users do not perform unintended actions. I personally do not share your concern about a middleware altering a request or response. That is quite literally what they are for, as described in the very first sentence here:
That aside, I still understand your concern, and we should address it properly. I don't believe we need a second middleware. Removing the first one will achieve your goal. Yes, you won't have the user attribute, but following the general logic, we shouldn't fiddle with every request either if it isn't necessary. Adding a template tag, as before, seems a bit counterintuitive to me. A simple Lastly, yes, we are running this code in production, not in debug mode. I agree, there is a difference. However, we do not alter all requests. We only do so for hijacked requests.
Anyhow, I really appreciate the discussion. I hope I am not being too defensive. This isn't as much about protecting our design decisions as about keeping the code base small. I am pleased to see that a curious developer can understand the code base so quickly and discuss concepts. However, I also believe not everyone will take the time to investigate the inner workings of this package. And for those, I want to make sure we provide the best security possible. I am on vacation right now, but I will try to find some time next week to write and test some documentation that implements your suggestions. Best,
Leaving aside the future of the HTML-parsing, I'd like to second allowing the content injection to be skipped if HIJACK_INSERT_BEFORE is None. I came here to open a new issue with that suggestion before stumbling on this one. It'd be a very simple change:
def process_response(self, request, response):
    """Render hijack notification and inject into HTML response."""
    if not getattr(request.user, "is_hijacked", False) or not settings.HIJACK_INSERT_BEFORE:
        return response
    ...
I've recently added django-hijack to a project where I want the notification built right into my layout, and to do so I've had to override hijack/notification.html with an empty template. Not an elegant solution, but nicer than subclassing the middleware. Nulling that setting would be vastly easier, more obvious, and immune to breakage.
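The guard proposed above can be tried outside Django with stand-in objects. In this sketch, `should_inject` is a hypothetical helper that isolates the condition, and the `SimpleNamespace` objects only mimic the attributes the condition reads. None of this is django-hijack code.

```python
from types import SimpleNamespace

# Hypothetical stand-in for django.conf.settings with the marker disabled.
settings = SimpleNamespace(HIJACK_INSERT_BEFORE=None)


def should_inject(request, settings) -> bool:
    """Mirror the proposed guard: inject only for hijacked users and
    only when an insertion marker is configured."""
    is_hijacked = getattr(request.user, "is_hijacked", False)
    return bool(is_hijacked and settings.HIJACK_INSERT_BEFORE)


# Stand-in requests: one hijacked session, one ordinary session.
hijacked = SimpleNamespace(user=SimpleNamespace(is_hijacked=True))
regular = SimpleNamespace(user=SimpleNamespace())

print(should_inject(hijacked, settings))  # False: marker is None
settings.HIJACK_INSERT_BEFORE = "</body>"
print(should_inject(hijacked, settings))  # True: hijacked and marker set
print(should_inject(regular, settings))   # False: user is not hijacked
```

Note that an empty string would disable injection as well as `None`, since the guard tests truthiness rather than identity.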
Thank you, @timnyborg, for opening the PR. It doesn't change the regex implementation, though. For lack of a better solution, for either us or debug toolbar, I will close the issue for now. If someone has a new solution, I'd be happy to revisit this topic. Best, Joe
The middleware's process_response method uses regex to parse the HTML response and inject content. This is easily broken, leading to the content being injected in the wrong place or not at all.
I will open a PR with unit tests showing this.
See this Stack Overflow post for why parsing HTML with regex is a bad idea.
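A small sketch of the two failure modes described above. The middleware's actual pattern is not quoted in this thread, so this assumes a case-insensitive literal match on `</body>` with the notification inserted before the first occurrence. `inject`, `NOTIFICATION` and the sample documents are made up for illustration.

```python
import re

# Assumption: the marker is matched literally and case-insensitively.
INSERT_BEFORE = "</body>"
pattern = re.compile(re.escape(INSERT_BEFORE), flags=re.IGNORECASE)
NOTIFICATION = "<div id='notice'>hijacked</div>"


def inject(html: str) -> str:
    """Insert NOTIFICATION before the first match of the marker."""
    # A function replacement avoids backslash handling in the template.
    return pattern.sub(lambda m: NOTIFICATION + m.group(0), html, count=1)


simple = "<html><body><p>Hi</p></body></html>"
# The marker text also appears inside a JavaScript string literal.
tricky = (
    "<html><body>"
    "<script>var s = '</body>';</script>"
    "<p>Hi</p></body></html>"
)
# HTML allows the closing body tag to be omitted entirely.
no_close = "<html><body><p>Hi</p>"

simple_out = inject(simple)      # notification lands before the real </body>
tricky_out = inject(tricky)      # notification lands inside the script string
no_close_out = inject(no_close)  # nothing matches, so nothing is injected

print(tricky_out)
print(no_close_out == no_close)
```

Inserting before the last match instead of the first would fix this particular `tricky` document, but it would fail in the same way whenever the marker text appears after the real closing tag.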