[FEATURE] Multiple Proxies #383

Open
kitchenutensils778 opened this issue Aug 5, 2021 · 9 comments
Labels
enhancement New feature or request

Comments

@kitchenutensils778

Is it possible to add multiple proxies and select the lowest-latency proxy in real time?
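One way to read "lowest latency" is to time a small probe request through each proxy and keep the fastest one. The sketch below only illustrates that idea and is not part of Whoogle: `pick_lowest_latency`, `probe`, and `clock` are hypothetical names, and `probe(proxy)` is assumed to make one request through the proxy and raise an `OSError` subclass (such as `TimeoutError`) when the proxy is unreachable.

```python
import time
from typing import Callable, Dict, Iterable, Optional


def pick_lowest_latency(
    proxies: Iterable[str],
    probe: Callable[[str], object],
    clock: Callable[[], float] = time.perf_counter,
) -> Optional[str]:
    """Return the proxy whose probe finished fastest, or None if all failed.

    `probe` is a hypothetical callable supplied by the caller; it should
    raise OSError (TimeoutError and ConnectionError are subclasses) when
    the proxy cannot be used.
    """
    timings: Dict[str, float] = {}
    for proxy in proxies:
        start = clock()
        try:
            probe(proxy)
        except OSError:
            continue  # skip proxies that time out or refuse the connection
        timings[proxy] = clock() - start
    if not timings:
        return None
    # The proxy with the smallest measured duration wins.
    return min(timings, key=lambda name: timings[name])
```

A single probe is a noisy measurement, so a real implementation would probably average several probes and re-measure periodically.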

@kitchenutensils778 kitchenutensils778 added the enhancement New feature or request label Aug 5, 2021
@vacom13
Contributor

vacom13 commented Dec 18, 2021

@benbusby I would like to give this a try. However, I would need guidance on it 😅. Maybe there are some resources I could refer to?

@vacom13
Contributor

vacom13 commented Feb 4, 2022

@benbusby I will take it up

@benbusby
Owner

benbusby commented Feb 4, 2022

Thanks @vacom13, and sorry I missed your message from back in December! I think for an initial implementation you could just add support for multiple proxies, and retry requests using a different proxy (if multiple are configured) if one times out. Selecting the least latency proxy can be added later as an improvement.
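The retry-on-timeout behaviour described here could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Whoogle's actual request code: `fetch_with_failover` and `send` are hypothetical names, and `send(url, proxy)` is assumed to perform the request through the given proxy and raise `TimeoutError` when the proxy does not answer in time.

```python
from typing import Callable, Iterable, Optional


def fetch_with_failover(
    url: str,
    proxies: Iterable[str],
    send: Callable[[str, str], str],
) -> str:
    """Try each configured proxy in order, moving on when one times out.

    `send` is a hypothetical callable supplied by the caller; it is not a
    Whoogle or standard library function.
    """
    last_error: Optional[Exception] = None
    for proxy in proxies:
        try:
            return send(url, proxy)
        except TimeoutError as exc:
            last_error = exc  # remember the failure and try the next proxy
    # Reached when no proxy is configured or every proxy timed out.
    raise RuntimeError("no configured proxy returned a response") from last_error
```

Because the proxies are tried in the order given, a later improvement such as latency-based selection would only need to reorder the list before calling this function.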

@vacom13
Contributor

vacom13 commented Feb 8, 2022

@benbusby no problem. And yes I will look into it

@vacom13
Contributor

vacom13 commented Mar 17, 2022

@benbusby Hey. I have actually been really busy with family and college work. I did think of a way to implement this, I suppose. Correct me if I am wrong. For now I should probably just get a list of working proxies from a site and give the user a checkbox in the config to choose whether to use the proxies, right? And then I need to make sure that the search results come back within the given timeframe. After that I can give the user the ability to select the latency, right?

@vacom13
Contributor

vacom13 commented Mar 22, 2022

@benbusby there is this Python library, free-proxies. It basically scrapes https://www.sslproxies.org/ for proxies and returns a string for working proxies. I could use that and, in case it times out, try to get another proxy. As it gets the proxy in real time, I suppose there shouldn't be a problem, since according to the documentation it checks whether the proxy is working.

@vacom13
Contributor

vacom13 commented Mar 22, 2022

Or I could just scrape the list myself for the first 5 proxies and then check those out?

@benbusby
Owner

Hey @vacom13, I think that's actually a good idea, but potentially a bit out of scope for this issue. I think all this issue should really support is multiple user-specified proxies. So if a user has access to multiple proxies, they can specify them as a comma-separated string (or something along those lines).

So currently WHOOGLE_PROXY_LOC only accepts one IP:PORT string, but could be updated to have multiple and cycle through them if one of them returns an error response code. If a user just specified the proxy locations as a comma separated string such as IP:4000,IP:4001, then in the request module we could do something like:

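import os

# Drop empty entries: ''.split(',') returns [''], which is truthy
raw_paths = os.environ.get('WHOOGLE_PROXY_LOC', '').split(',')
proxy_paths = [path.strip() for path in raw_paths if path.strip()]
if proxy_paths:
    # ...
    for path in proxy_paths:
        # Validate a 200 response and no captcha from search URL
        ...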
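
To make the loop above a little more concrete, here is a hedged sketch of how the parsing and validation steps might fit together. Only `WHOOGLE_PROXY_LOC` comes from the discussion; `parse_proxy_list`, `looks_usable`, `first_working_proxy`, and `probe` are hypothetical names, and treating the word "captcha" in the response body as the captcha signal is an assumption rather than how Whoogle detects it.

```python
import os
from typing import Callable, Iterable, List, Optional, Tuple


def parse_proxy_list(raw: str) -> List[str]:
    """Split a comma separated 'IP:PORT,IP:PORT' string, dropping blanks."""
    return [entry.strip() for entry in raw.split(",") if entry.strip()]


def looks_usable(status_code: int, body: str) -> bool:
    """Assumed check: HTTP 200 and no obvious captcha marker in the body."""
    return status_code == 200 and "captcha" not in body.lower()


def first_working_proxy(
    proxy_paths: Iterable[str],
    probe: Callable[[str], Tuple[int, str]],
) -> Optional[str]:
    """Return the first proxy whose probe response passes `looks_usable`.

    `probe(path)` is a hypothetical callable that requests the search URL
    through the proxy and returns (status_code, body).
    """
    for path in proxy_paths:
        status_code, body = probe(path)
        if looks_usable(status_code, body):
            return path
    return None


# An unset or empty variable yields an empty list, so callers can fall
# back to a direct connection.
proxy_paths = parse_proxy_list(os.environ.get("WHOOGLE_PROXY_LOC", ""))
```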

I think your idea could be an entirely separate issue, but I'd like to personally look into the free-proxies library a bit first.

@vacom13
Contributor

vacom13 commented May 1, 2022

@benbusby I have tried to work on this, but not having multiple working proxies available makes it confusing. I used a couple of free proxies but ran into internal errors, maybe because the connection keeps timing out. The free proxies did work in a test script I created, but whenever I add the proxy to the Whoogle env and then run it, it always leads to some problem. I also ran into a rate limiting issue with a proxy. Anyway, I will keep at it.

@vacom13 vacom13 removed their assignment Aug 22, 2022
3 participants