Browser instance on a network with proxy #120

cassionandi opened this Issue Nov 8, 2011 · 39 comments

Hi.

I'm trying to run Splinter on a Windows 7 machine, on a local network that requires a proxy.

When a new Browser() instance is created in the Python shell, a Firefox window pops up with a profile different from the one I normally use, so this new instance and its profile have none of my proxy settings.

This is the message generated by the "browser = Browser()" code:

Traceback (most recent call last):
  File "C:\Users\cassio.nandi\Desktop\splinter\exemplo.py", line 3, in <module>
    browser = Browser()
  File "C:\Python27\lib\site-packages\splinter\browser.py", line 46, in Browser
    return driver(*args, **kwargs)
  File "C:\Python27\lib\site-packages\splinter\driver\webdriver\firefox.py", line 23, in __init__
    self.driver = Firefox(firefox_profile)
  File "C:\Python27\lib\site-packages\selenium\webdriver\firefox\webdriver.py", line 47, in __init__
    desired_capabilities=DesiredCapabilities.FIREFOX)
  File "C:\Python27\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 61, in __init__
    self.start_session(desired_capabilities, browser_profile)
  File "C:\Python27\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 98, in start_session
    'desiredCapabilities': desired_capabilities,
  File "C:\Python27\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 144, in execute
    self.error_handler.check_response(response)
  File "C:\Python27\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 100, in check_response
    raise exception_class(value)
WebDriverException: Message: an HTML error page returned by the proxy:

ERRO: A URL solicitada não pode ser recuperada
(ERROR: The requested URL could not be retrieved)

Na tentativa de recuperar a URL: http://127.0.0.1:51372/hub/session
(While trying to retrieve the URL: http://127.0.0.1:51372/hub/session)

O seguinte erro foi encontrado: Proibido o Acesso.
(The following error was encountered: Access Denied.)

O controle de acessos impediu sua requisição. Caso você não concorde com isso, por favor, contate seu provedor de serviços, ou o administrador de sistemas.
(Access control prevented your request. If you do not agree with this, please contact your service provider or the system administrator.)

Generated Tue, 08 Nov 2011 18:27:08 GMT by proxy (squid/2.6.STABLE21)

Contributor

fsouza commented Nov 8, 2011

Marked as a bug. Thanks for reporting.

Owner

tarsisazevedo commented Nov 8, 2011

How do you use this proxy? Is it a configuration in the OS or a plugin in Firefox?

On a pt-br Firefox:

Opções -> Avançado -> Rede -> Configurar Conexão (Options -> Advanced -> Network -> Connection Settings)

Configuração manual de proxy (Manual proxy configuration)

Usar este proxy para todos os protocolos (Use this proxy for all protocols)

Owner

tarsisazevedo commented Nov 8, 2011

Does this configuration work in any instance of Firefox?

It is not present in the instance opened by Splinter, I suppose because of the temporary profile created on the fly.

Owner

tarsisazevedo commented Nov 8, 2011

Try opening it with your Firefox profile.

I configured the IE network settings with the proxy and to ignore 127.0.0.1. This makes everything work fine, but I should note that my profile was loaded but then ignored and overridden by the system network config.

Contributor

fsouza commented Nov 10, 2011

The problem is related to httplib.

More details: http://groups.google.com/group/splinter-users/msg/1d9c0e89d1e34e23

flaviamissi reopened this Nov 15, 2011

Owner

flaviamissi commented Nov 15, 2011

Howdy!

This question, as Francisco said, is related to httplib, since it doesn't look for proxy configs in your system. It could be solved easily with urllib2, but I haven't done that yet because it would cause an interface change in Splinter: we would have to remove the status_code property from the webdrivers, which was implemented while solving issue 35. The reason we can't keep it is that urllib2 does not give us access to the HTTP status code when we make a request, only when it raises an HTTPError.

The status_code support was not the goal of issue 35; it was a bonus.

I'm bringing this up because status_code is part of the webdriver's API, and removing it will break code that relies on it.
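To illustrate the limitation described above, here is a minimal sketch using Python 3's urllib.request (the successor of urllib2): on a successful request the status code is readable from the response, but for error statuses it is only reachable through the raised HTTPError. The throwaway local server is just a stand-in for any page.

```python
import http.server
import threading
import urllib.error
import urllib.request

class NotFoundHandler(http.server.BaseHTTPRequestHandler):
    """Tiny stand-in server that answers every GET with 404."""
    def do_GET(self):
        self.send_response(404)
        self.end_headers()

    def log_message(self, *args):  # keep the demo quiet
        pass

server = http.server.HTTPServer(("127.0.0.1", 0), NotFoundHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = "http://127.0.0.1:%d/" % server.server_address[1]

try:
    response = urllib.request.urlopen(url)
    status = response.getcode()   # reachable directly only on success
except urllib.error.HTTPError as err:
    status = err.code             # on 4xx/5xx the code lives on the exception

server.shutdown()
print(status)
```

This is why keeping a reliable status_code on every request needs something more than a plain urlopen call.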

Member

douglascamata commented Nov 15, 2011

If the problem is that urllib2 doesn't provide the status_code of every page, we could move straight to urllib3, which provides the status_code of every request.

What do you think about it?

Owner

flaviamissi commented Nov 16, 2011

Howdy!
Maybe it's a good idea, but I read (not carefully :P) the urllib3 docs and didn't find any way to configure a proxy with authentication. If it doesn't support this feature, we're going to be in the same place we are right now with httplib.

Do you have a link showing how to do that with urllib3? Just in case I missed it (which probably happened).

I heard something about httplib2 supporting proxies.

Member

douglascamata commented Nov 17, 2011

Looks like httplib2 only supports SOCKS proxies without authentication. I still think we should use urllib3, so we can fetch all those status codes and get increased performance in some cases. In my opinion, there's no networking library for Python with support for authenticated proxies.

Owner

flaviamissi commented Nov 17, 2011

Hey!
I haven't looked at httplib2 yet, so I can't say anything about it right now.
These are our requirements:

  • support proxies with authentication
  • have a status_code attribute

@douglascamata do you have a link that shows urllib3 proxy authentication support? I was not able to find it myself.

Member

douglascamata commented Nov 17, 2011

@flaviamissi no, and I can't find any python lib that can do that.

Contributor

gabriellima commented Nov 17, 2011

Well, I don't know if you have found any resources related to urllib and proxy support, but this might help; the best one I found (taken from http://bytes.com/topic/python/answers/22918-proxy-authentication-using-urllib2):

import urllib2

proxy_info = {
    'user': 'username',
    'pass': 'password',
    'host': "proxy.name.com",
    'port': 80  # or 8080 or whatever
}

# build a new opener that uses a proxy requiring authorization
proxy_support = urllib2.ProxyHandler({"http":
    "http://%(user)s:%(pass)s@%(host)s:%(port)d" % proxy_info})
opener = urllib2.build_opener(proxy_support, urllib2.HTTPHandler)

# install it
urllib2.install_opener(opener)

# use it
f = urllib2.urlopen('http://www.python.org/')
print f.headers
print f.read()

Member

douglascamata commented Nov 17, 2011

Big thanks @gabriellima, I'll try to make it work with urllib3.

Owner

flaviamissi commented Nov 17, 2011

@gabriellima Yes, as I said before, I know urllib2 has this support; there's an example just like yours in the docs. What urllib2 doesn't have is a status_code, which we also need.

But thanks for the sample.

Contributor

gabriellima commented Nov 17, 2011

Now I get what you mean. I hope you find a helper for this 'status_code' attribute :)

Good luck :)

Ah, doesn't it come with the response when you try to access the page?
I don't know urllib very well.

Contributor

fsouza commented Nov 24, 2011

I'm removing the "bug" label. We were not expecting to support this feature :)

Contributor

medwards commented Nov 16, 2012

What here isn't solved by inputting your proxy settings like so:

profile = {}
profile['network.proxy.type'] = 1
profile['network.proxy.http'] = '192.168.255.195'
# etc.
ff_browser = Browser(profile_preferences=profile)

(Not that I have tested this, but it looks like this would set up Firefox to use the right proxy settings.)

icybox commented May 17, 2013

@medwards This does work. How can you set a port for the proxy? I can't find any detailed instructions.

Contributor

medwards commented May 17, 2013

This Stack Overflow question says it's network.proxy.http_port.
If I remember right, Splinter later goes through this dictionary calling set_preference with the same key-value pairs.

Alternatively, just open up about:config and search for network.proxy.
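Putting the two comments above together, a fuller sketch of the preferences dictionary might look like this. The host and port are placeholders, and the network.proxy.* keys mirror what about:config shows:

```python
# Hypothetical proxy endpoint -- substitute your own.
PROXY_HOST = '192.168.255.195'
PROXY_PORT = 8080

profile = {
    'network.proxy.type': 1,  # 1 = manual proxy configuration
    'network.proxy.http': PROXY_HOST,
    'network.proxy.http_port': PROXY_PORT,
    'network.proxy.ssl': PROXY_HOST,
    'network.proxy.ssl_port': PROXY_PORT,
    # hosts that should be reached directly, bypassing the proxy
    'network.proxy.no_proxies_on': 'localhost, 127.0.0.1',
}

# browser = Browser('firefox', profile_preferences=profile)
```

The Browser call is commented out here since it needs a running Firefox; the dictionary itself is the part the thread is discussing.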


I still seem to have this problem with the network proxy.
The approaches I have tried are:

  1. Referencing my local profile, e.g. browser = Browser('firefox', profile = r'\server1\userame\Redirect\AppData\Mozilla\Firefox\Profiles\bkn7r1ar.default'), which already has the proxy settings set up. When I call the browser it appears, but when I try browser.visit('http://www.google.com') I get [Errno 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond
     [Dbg]>>>
  2. Creating a new profile per medwards' suggestion and inputting the proxy settings, but I get the same error.

Any other suggestions?

Owner

andrewsmedina commented Nov 21, 2013

Hi @hsiddique can you paste your profile code?

@andrewsmedina
I have tried

profile = {'network.proxy.type': 5,
    'network.proxy.http': '128.128.100.100',
    'network.proxy.http_port': 8080,
    'network.proxy.ssl': "",
    'network.proxy.ssl_port': 0}
browser = Browser('firefox', profile=profile)

Also

profile = {}
profile['network.proxy.type'] = 5
#profile['network.proxy.http'] = '128.128.100.100'
#profile['network.proxy.http_port'] = 8080
browser = Browser('firefox', profile=profile)

Both of the above instances open the browser, but once I do browser.visit("http://www.google.com") I get the following error: [Errno 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.

More interestingly, if I create a browser instance within Splinter using my normal profile, e.g.
browser = Browser('firefox', profile = r'\Server1\ME\Redirect\AppData\Mozilla\Firefox\Profiles\bkn7r1ar.default'), the browser opens, and if I enter an address in the address bar manually it works perfectly. However, when I try browser.visit("http://www.google.com") back in Python I get the [Errno 10060] message.

Contributor

medwards commented Nov 21, 2013

That last bit seems pretty sketchy.

What happens if you launch your browser using the fully specified proxy
(first example) and you manually browse to a page?

I can manually browse to the page using the first example.

Owner

andrewsmedina commented Nov 22, 2013

It is a Splinter bug, caused by Splinter's request handler. :(

Dear all,

Will this issue be fixed? If not, does anyone have a workaround? I have the same proxy issue: it works when I open the URL in the opened browser, but it fails in Python code.

Thanks,
Shenghong

As a workaround you can patch the visit method to remove the check... (as long as you don't need to worry about running into issue #35)

from splinter import Browser

browser = Browser('firefox')

# Patch visit method as it checks the http response code which doesn't work behind a proxy
browser.__class__.visit = lambda self, url: self.driver.get(url)

Hopefully this issue can be fixed; it would be useful either to be able to configure proxy settings or to obey the http_proxy, https_proxy, and no_proxy environment variables.
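For what it's worth, Python's standard library already knows how to read those environment variables, so a request handler could lean on urllib's helper rather than parsing them by hand. A minimal sketch with Python 3's urllib.request (the proxy URL below is made up):

```python
import os
import urllib.request

# Hypothetical values -- in practice these come from the user's environment.
os.environ['http_proxy'] = 'http://proxy.example.com:8080'
os.environ['https_proxy'] = 'http://proxy.example.com:8080'

# getproxies() returns a mapping of scheme -> proxy URL,
# built from the *_proxy environment variables.
proxies = urllib.request.getproxies()
print(proxies['http'])   # http://proxy.example.com:8080
print(proxies['https'])  # http://proxy.example.com:8080
```

Any internal status check routed through an opener built from this mapping would then follow the same proxy the browser does.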

Hey guys, recently I've tried to work around proxy support in Splinter and ended up with the following code:

# imports needed for this snippet
import selenium.webdriver
from selenium.webdriver.common.proxy import Proxy, ProxyType

myProxy = "125.0.80.126:8887"

proxy = Proxy({
    'proxyType': ProxyType.MANUAL,
    'httpProxy': myProxy,
    'ftpProxy': myProxy,
    'sslProxy': myProxy,
    'noProxy': ''
})

browser.driver = selenium.webdriver.Firefox(proxy=proxy)

What do you think about it?

Member

douglascamata commented Jun 5, 2014

I think that's nice!


So, the only fix for this is to remove the check?

Owner

andrewsmedina commented Jun 26, 2014

We have to change the way we do this check; just removing it does not fix the problem. If we remove the check, browser.status_code will stop working.

rwillmer commented Dec 1, 2014

I've created 2 new related issues:
#358 documentation for how to handle unauthenticated proxies
#359 need to support authenticated proxies

yesbox commented May 29, 2015

This issue blocks the usage of Splinter in environments that require the use of an HTTP proxy. This appears to be a long-standing issue, is there a known workaround?

This script was used in the following scenarios on a headless CentOS 7 server with Xvfb, Firefox 38, and Splinter 0.7.2, with tcpdump running to analyze its behavior:

from splinter.browser import Browser

fx_config = {
#    'network.proxy.type': 1,
#    'network.proxy.http': 'internal.proxy.local',
#    'network.proxy.http_port': 8080,
    'browser.startup.homepage': 'http://www.example-site-1.net/',
}

browser = Browser('firefox', profile_preferences=fx_config)
browser.visit('http://www.example-site-2.net/')

Baseline:

Without an http_proxy env variable set, the browser can be seen resolving the site 1 and site 2 DNS names and attempting to open HTTP connections to both (along with some Mozilla services), all of which fail due to the restrictive network environment, eventually raising "socket.error: [Errno 101] Network is unreachable".

http_proxy env variable set:

When setting the http_proxy env variable (and https_proxy, HTTP_PROXY, and HTTPS_PROXY, all to the same value) and running the script, the HTTP proxy's DNS name is looked up and the browser opens a connection to the server on the proxy port. No connections to site 1 are seen. After this, the site 2 DNS name is resolved and the browser attempts to connect to it directly. This too fails.

Firefox proxy settings applied:

When uncommenting the network.proxy settings in the script, setting the Firefox HTTP proxy preferences as previously suggested in this thread, the behavior appears identical to the baseline, regardless of whether the http_proxy env variable is set. I'm not certain these settings are correct, but they do match the changes found under about:config when I set up and successfully use Firefox with the same proxy server via the UI on my desktop machine.

Summary

To summarize, it appears the browser.visit method does not respect the http_proxy env variable the way other connections Firefox opens under Splinter do. In addition, setting network.proxy preferences via Splinter appears to break HTTP proxy browsing further, with direct connections attempted both with and without the browser.visit call.

Owner

andrewsmedina commented May 30, 2015

@yesbox I have pushed a commit that fixes this problem. Can you test the master version?

yesbox commented Jun 1, 2015

Running master commit e47f044 against an HTTP proxy with the http_proxy and https_proxy variables set now works.

Using profile_preferences to configure the HTTP proxy in the same manner as in the previous example still breaks it.

Both http_proxy and https_proxy must be set; either one used alone will not give any response. I did not test these env variables separately before, but always used both when testing.

Thanks!
