@gawel, what would you think of changing the default behavior of PyQuery to not look for a url, and only fetch url contents if the user explicitly provides pq(url='http://google.com/')?
This would require a new version number, because it's a breaking change, but I think it would be worth it for the security benefits. Most users calling pq(text) probably don't want it to make a web request if the text happens to start with 'http', and if users do want that it's easy enough to add url=.
If this sounds good to you I'm happy to send a PR!
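The proposed default could be sketched roughly as follows. This is not PyQuery's code: `load_markup` and its `fetch` parameter are hypothetical names used only for illustration. The sketch just shows the rule that positional text is never fetched, while a request needs an explicit `url=`.

```python
from typing import Callable, Optional


def load_markup(text: Optional[str] = None, *,
                url: Optional[str] = None,
                fetch: Optional[Callable[[str], str]] = None) -> str:
    """Return the markup to parse (hypothetical helper, not PyQuery's API).

    Positional text is always treated as markup, even if it starts with
    'http'. A request happens only when the caller passes url= explicitly.
    """
    if url is not None:
        if text is not None:
            raise ValueError("pass either text or url=, not both")
        if fetch is None:
            raise ValueError("url= requires a fetch callable")
        # Only this branch ever touches the network.
        return fetch(url)
    if text is None:
        raise ValueError("nothing to parse")
    # No URL sniffing: the text is returned untouched.
    return text
```

With this shape, `load_markup('http://internaldomain/...')` hands the string back as plain text, and only `load_markup(url=..., fetch=...)` triggers a request.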
We use PyQuery internally in our system to parse rendered HTML strings that we store in the database to be displayed in a Django web response. Essentially the code looks a bit like this:
But if untrusted users can cause document.rendered_html_string to become a string that looks like a URL, for example http://internaldomain/api/get_users/dangerous, then PyQuery treats it as an address to fetch rather than as markup. It will effectively run requests.get('http://internaldomain/api/get_users/dangerous'), which could be a big security risk.
It happens because the constructor is too "naive".
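To make the "naive" part concrete, here is a sketch of the kind of prefix test that would produce this behavior. It is an assumption about the shape of the check, not a copy of PyQuery's source, and `looks_like_url` is a hypothetical name.

```python
def looks_like_url(text: str) -> bool:
    """Assumed heuristic: the part before the first '://' is 'http' or 'https'."""
    return text.split("://", 1)[0] in ("http", "https")


stored = "http://internaldomain/api/get_users/dangerous"
print(looks_like_url(stored))          # True
print(looks_like_url("<p>Hello</p>"))  # False
```

Nothing in such a test distinguishes a URL the developer typed on purpose from one that arrived through stored, user-controlled content.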
In our app, we solved that by wrapping the PyQuery constructor in a piece of code that does something like this:
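One possible shape for such a wrapper, as a sketch only: it assumes the goal is to refuse URL-looking input before it reaches the constructor. `ensure_markup` is a hypothetical name, and the commented-out call site assumes PyQuery is imported in the usual way.

```python
def ensure_markup(html: str) -> str:
    """Return html unchanged, or raise if it looks like an http(s) URL."""
    # Ignore leading whitespace and case so ' HTTP://...' is caught too.
    scheme = html.lstrip().split("://", 1)[0].lower()
    if scheme in ("http", "https"):
        raise ValueError("refusing to pass URL-like text to the parser")
    return html


# Hypothetical call site; requires `from pyquery import PyQuery`:
# doc = PyQuery(ensure_markup(document.rendered_html_string))
```

Raising is a deliberate choice for the sketch, since it makes a suspicious value visible instead of silently parsing or fetching it.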
Ideally, we'd love to be able to always invoke PyQuery like this:
In similar style to how you can do pq(url='http://google.com/'), as mentioned in the Quickstart documentation.