wpull crash when http_proxy is set #148

Open
yi opened this issue Feb 21, 2019 · 2 comments

Comments

yi commented Feb 21, 2019

grab-site suddenly stopped working and has not worked since.

I've tried uninstalling and reinstalling wpull and grab-site, but it still doesn't work.

Any help would be appreciated!

Traceback (most recent call last):
  File "/Users/aaa/gs-venv/lib/python3.7/site-packages/wpull/application/app.py", line 157, in run
    yield from pipeline.process()
  File "/Users/aaa/gs-venv/lib/python3.7/site-packages/wpull/pipeline/pipeline.py", line 194, in process
    yield from self._process_one_worker()
  File "/Users/aaa/gs-venv/lib/python3.7/site-packages/wpull/pipeline/pipeline.py", line 215, in _process_one_worker
    task.result()
  File "/Users/aaa/gs-venv/lib/python3.7/site-packages/wpull/pipeline/pipeline.py", line 119, in process
    item = yield from self.process_one(_worker_id=worker_id)
  File "/Users/aaa/gs-venv/lib/python3.7/site-packages/wpull/pipeline/pipeline.py", line 103, in process_one
    yield from task.process(item)
  File "/usr/local/Cellar/python/3.7.2_2/Frameworks/Python.framework/Versions/3.7/lib/python3.7/asyncio/coroutines.py", line 120, in coro
    res = func(*args, **kw)
  File "/Users/aaa/gs-venv/lib/python3.7/site-packages/wpull/application/tasks/network.py", line 21, in process
    self._build_connection_pool(session)
  File "/Users/aaa/gs-venv/lib/python3.7/site-packages/wpull/application/tasks/network.py", line 85, in _build_connection_pool
    http_proxy = session.args.http_proxy.split(':', 1)
AttributeError: 'NoneType' object has no attribute 'split'
CRITICAL Sorry, Wpull unexpectedly crashed.
Disconnected from ws:// server: RuntimeError('Event loop is closed')
Exception ignored in: <coroutine object sender at 0x10e9d7ac8>
Traceback (most recent call last):
  File "/Users/aaa/gs-venv/lib/python3.7/site-packages/libgrabsite/dashboard_client.py", line 54, in sender
    await asyncio.sleep(delay)
  File "/usr/local/Cellar/python/3.7.2_2/Frameworks/Python.framework/Versions/3.7/lib/python3.7/asyncio/tasks.py", line 566, in sleep
    future, result)
  File "/usr/local/Cellar/python/3.7.2_2/Frameworks/Python.framework/Versions/3.7/lib/python3.7/asyncio/base_events.py", line 657, in call_later
    context=context)
  File "/usr/local/Cellar/python/3.7.2_2/Frameworks/Python.framework/Versions/3.7/lib/python3.7/asyncio/base_events.py", line 667, in call_at
    self._check_closed()
  File "/usr/local/Cellar/python/3.7.2_2/Frameworks/Python.framework/Versions/3.7/lib/python3.7/asyncio/base_events.py", line 480, in _check_closed
    raise RuntimeError('Event loop is closed')

Contributor

ivan commented Feb 24, 2019

That http_proxy = session.args.http_proxy.split(':', 1) line makes me think something set an environment variable to use an HTTP proxy.

Try env | grep -i proxy and maybe unset the variable?

Please let me know if it's not that.
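
For anyone who wants to run the same check from inside the Python environment that grab-site uses, here is a minimal sketch that mirrors env | grep -i proxy. The helper name proxy_env_vars is made up for this example and is not part of wpull or grab-site; it only reads the environment and changes nothing.

```python
import os


def proxy_env_vars(environ=None):
    """Return the environment entries that mention 'proxy' (case-insensitive).

    Like `env | grep -i proxy`, this matches on the whole NAME=value line,
    so an entry is reported if either its name or its value contains 'proxy'.
    `proxy_env_vars` is a hypothetical helper written for this example.
    """
    if environ is None:
        environ = os.environ
    return {
        name: value
        for name, value in environ.items()
        if "proxy" in f"{name}={value}".lower()
    }


if __name__ == "__main__":
    # Print matches in a stable order; no output means nothing matched.
    for name, value in sorted(proxy_env_vars().items()):
        print(f"{name}={value}")
```

If this prints nothing in the shell (or container) where grab-site is started, then no proxy-related variable is visible to that process.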

ivan changed the title from "CRITICAL Sorry, Wpull unexpectedly crashed." to "wpull crash when http_proxy is set" on Nov 8, 2019

codsane commented Dec 2, 2019

I'm receiving an identical error after setting wpull's proxy using --wpull-args="--http-proxy=0.0.0.0:16379"
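
For reference, the traceback above shows the crash happening because .split(':', 1) is called on a value that is None. Below is a minimal sketch of the same HOST:PORT split with a guard for a missing value. split_proxy is a hypothetical helper written for illustration; it is not wpull's actual code and not a proposed patch, and it does not explain why the value ends up as None in the first place.

```python
from typing import Optional, Tuple


def split_proxy(value: Optional[str]) -> Optional[Tuple[str, int]]:
    """Split a 'HOST:PORT' string into (host, port).

    Returns None when no proxy value is given, instead of raising
    AttributeError as the unguarded `.split(':', 1)` in the traceback does.
    `split_proxy` is a hypothetical name used only in this example.
    """
    if not value:
        return None
    parts = value.split(":", 1)
    # Require both a non-empty host and a purely numeric port.
    if len(parts) != 2 or not parts[0] or not parts[1].isdecimal():
        raise ValueError(f"expected HOST:PORT, got {value!r}")
    return parts[0], int(parts[1])


if __name__ == "__main__":
    print(split_proxy("0.0.0.0:16379"))
    print(split_proxy(None))
```

With the value from the command line above, the split itself is unproblematic, which suggests the None is coming from somewhere other than the format of the argument.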

Unfortunately env | grep -i proxy doesn't seem to return anything, and I've even made sure to run it within the container that grab-site is running in.

Even after removing --wpull-args, grab-site seems to be crashing with the same event loop error when attempting to crawl. In my case I was able to reinstall grab-site to fix this. I've even switched to dockerized grab-site, to make it easier to spin up fresh environments for testing.

As I'd like to eventually bring full onion archive capabilities to grab-site, I have decided to first make sure my wget onion archive configuration can be ported to wpull.

I've opened an issue to address my own problems using proxies in wpull. Assuming I can get those cleared up, I will take another look at the proxy issues we're seeing in grab-site. grab-site appears to run a fork of wpull, so I'm wondering if our proxy issue may be specific to that fork or to the plugins that grab-site introduces.
