-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
URL Safety Validation Function #8301
Conversation
…tocol verification. * Add URL safety check function "is_safe_url"
* Add URL safety check with client_utils.is_safe_url * Removed follow_redirects parameter of httpx.stream function
🪼 branch checks and previews
Install Gradio from this PR pip install https://gradio-builds.s3.amazonaws.com/c9bbf8a1cd786a9881a8e493d5665badfc2b0e7a/gradio-4.31.3-py3-none-any.whl Install Gradio Python Client from this PR pip install "gradio-client @ git+https://github.com/gradio-app/gradio@c9bbf8a1cd786a9881a8e493d5665badfc2b0e7a#subdirectory=client/python" |
Thanks @mvlttt I'm not totally sure if we want to introduce all these checks as they will likely add some latency any time URLs are being handled. Can you explain the motivation for this a little more? |
I'm mainly concerned about this DNS lookup: # Resolve the domain to an IP address using DNS lookup
resolve = socket.gethostbyname(domain) Would the following simpler approach work:
|
The reason I do DNS lookup is to prevent the DNS rebinding vulnerability. I'm not sure how to optimize this specifically, but I'm considering implementing a control to only perform DNS lookup if the domain is different from the IP. The solutions you suggest have many bypass methods. I'm adding a few references for you. https://payatu.com/blog/dns-rebinding/ |
This is going to break LAN and localhost addresses. Maybe it should be optional. Or perhaps it should only apply to domain names, not IPs. |
It seems to me a fair few downstream projects of gradio are essentially single user applications where the browser is connecting to an instance running on your localhost or another machine on your LAN - that machine being the one has the AI processing capability. At least that's I've always encountered outside of the actual huggingface site, including the one I've contributed to in the past or (less significant, because only I use them) the ones I've made myself. I have no idea whether this particular change actually does break that, but if it does that would be concerning.
Ability to set a whitelist of ips/domains? |
One of the gradio maintainers, @aliabid94, started working on this here: #8978. I'll close this PR for now |
Description
This pull request introduces a new function
is_safe_url()
in theclient_utils
module to validate the safety of URLs before processing them further. Here's an overview of the changes:client_utils.is_safe_url()
function to check if a URL is safe for processing.urlparse
), DNS lookup (socket.gethostbyname
), and IP address validation (ipaddress.ip_address
) to determine the safety of the URL.False
.This enhancement ensures that URLs are properly validated before any further actions are taken, improving the security and reliability of URL processing within the application.