Scrapy_splash use chromium ERROR #964

Open
toheart opened this issue Nov 9, 2019 · 3 comments

Comments

@toheart

toheart commented Nov 9, 2019

Splash version: 3.4
engine: chromium
Code:

  SplashRequest(
      url=url,
      callback=self.parse,
      args={
          'wait': 0.5,
          'image': 0,
          'render_all': 1,
          'engine': 'chromium',
      },
      meta={
          'depth': 0,
          'download_timeout': self.wait_time,
          'page_type': conf.PageType.IN_CHAIN,
      },
      errback=self.error_failure,
  )

Error:

Traceback (most recent call last):
  File "/root/venv/WebMonitor/lib/python3.7/site-packages/twisted/internet/defer.py", line 568, in _startRunCallbacks
    self._runCallbacks()
  File "/root/venv/WebMonitor/lib/python3.7/site-packages/twisted/internet/defer.py", line 654, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "/root/venv/WebMonitor/lib/python3.7/site-packages/twisted/internet/defer.py", line 1613, in unwindGenerator
    return _cancellableInlineCallbacks(gen)
  File "/root/venv/WebMonitor/lib/python3.7/site-packages/twisted/internet/defer.py", line 1529, in _cancellableInlineCallbacks
    _inlineCallbacks(None, g, status)
--- <exception caught here> ---
  File "/root/venv/WebMonitor/lib/python3.7/site-packages/twisted/internet/defer.py", line 1418, in _inlineCallbacks
    result = g.send(result)
  File "/root/venv/WebMonitor/lib/python3.7/site-packages/scrapy/core/downloader/middleware.py", line 53, in process_response
    spider=spider)
  File "/root/venv/WebMonitor/lib/python3.7/site-packages/scrapy_splash/middleware.py", line 388, in process_response
    response = self._change_response_class(request, response)
  File "/root/venv/WebMonitor/lib/python3.7/site-packages/scrapy_splash/middleware.py", line 409, in _change_response_class
    response = response.replace(cls=respcls, request=request)
  File "/root/venv/WebMonitor/lib/python3.7/site-packages/scrapy/http/response/text.py", line 54, in replace
    return Response.replace(self, *args, **kwargs)
  File "/root/venv/WebMonitor/lib/python3.7/site-packages/scrapy/http/response/__init__.py", line 81, in replace
    return cls(*args, **kwargs)
  File "/root/venv/WebMonitor/lib/python3.7/site-packages/scrapy_splash/response.py", line 99, in __init__
    self._load_from_json()
  File "/root/venv/WebMonitor/lib/python3.7/site-packages/scrapy_splash/response.py", line 145, in _load_from_json
    error = self.data['info']['error']
builtins.TypeError: string indices must be integers
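
The last frame of the traceback indexes `self.data['info']['error']`, and this `TypeError` means the value under `'info'` was a string at that point rather than a dict. Below is a minimal sketch of that failure next to a defensive lookup. `extract_error` and both sample payloads are hypothetical, invented for illustration; they are not scrapy-splash code or real Splash output.

```python
def extract_error(data):
    """Return data['info']['error'] when `info` is a dict, else None."""
    info = data.get("info") if isinstance(data, dict) else None
    if isinstance(info, dict):
        return info.get("error")
    return None


# Hypothetical payloads, invented for illustration only.
nested = {"info": {"error": 400}}
flat = {"info": "some plain-text message"}

try:
    flat["info"]["error"]  # same indexing pattern as the last traceback frame
except TypeError as exc:
    print("TypeError:", exc)

print(extract_error(nested))
print(extract_error(flat))
```

A guard like this only avoids the crash; it does not explain why Splash returned a differently shaped error body in the first place.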

Reason:
[image]
If the POST request includes headers, Splash sends an error back.
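
If forwarded headers really are the trigger, one thing that may be worth trying is scrapy-splash's `dont_send_headers` option, which tells the middleware not to pass Scrapy's request headers on to Splash. The sketch below is untested against the chromium engine, so treating it as a workaround is an assumption. `chromium_request_kwargs` is a hypothetical helper that only builds the keyword arguments.

```python
def chromium_request_kwargs(url, wait=0.5):
    """Build keyword arguments for scrapy_splash.SplashRequest (hypothetical helper)."""
    return {
        "url": url,
        "endpoint": "render.html",
        "args": {"wait": wait, "engine": "chromium"},
        # Assumption: not forwarding headers sidesteps the error reported above.
        "dont_send_headers": True,
    }


kwargs = chromium_request_kwargs("https://example.com/")
# In a spider: yield SplashRequest(callback=self.parse, **kwargs)
print(kwargs)
```

Note that with headers dropped, any custom User-Agent or cookies set on the Scrapy side would no longer reach Splash.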

@eherrerosj

Same problem here, did you find any way to solve it?

@davidkong0987

davidkong0987 commented Nov 25, 2021

http://0.0.0.0:8050/info?engine=chromium&wait=0.5&images=1&expand=1&timeout=90.0&url=https://www.lightship.capital/portfolio%2F&lua_source=function+main%28splash%2C+args%29%0D%0A++assert%28splash%3Ago%28args.url%29%29%0D%0A++assert%28splash%3Await%280.5%29%29%0D%0A++return+%7B%0D%0A++++html+%3D+splash%3Ahtml%28%29%2C%0D%0A++++png+%3D+splash%3Apng%28%29%2C%0D%0A++++har+%3D+splash%3Ahar%28%29%2C%0D%0A++%7D%0D%0Aend

does not work

however

http://0.0.0.0:8050/render.html?engine=chromium&wait=0.5&images=1&expand=1&timeout=90.0&url=https://www.lightship.capital/portfolio%2F&lua_source=function+main%28splash%2C+args%29%0D%0A++assert%28splash%3Ago%28args.url%29%29%0D%0A++assert%28splash%3Await%280.5%29%29%0D%0A++return+%7B%0D%0A++++html+%3D+splash%3Ahtml%28%29%2C%0D%0A++++png+%3D+splash%3Apng%28%29%2C%0D%0A++++har+%3D+splash%3Ahar%28%29%2C%0D%0A++%7D%0D%0Aend

works

I believe the issue is that the chromium engine only applies to the render.html endpoint.
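
For comparing endpoints, the two URLs above differ only in the path segment, so a small builder makes it easy to switch between them. This is a sketch using only the standard library; `splash_render_url` is a hypothetical helper, and the base address is the one from the URLs above.

```python
from urllib.parse import urlencode


def splash_render_url(target_url, base="http://0.0.0.0:8050",
                      endpoint="render.html", **params):
    """Compose a Splash HTTP API URL (hypothetical helper)."""
    query = {"url": target_url, **params}
    return f"{base}/{endpoint}?{urlencode(query)}"


url = splash_render_url(
    "https://www.lightship.capital/portfolio/",
    engine="chromium",
    wait=0.5,
    images=1,
    timeout=90.0,
)
print(url)
```

Passing a different `endpoint` value gives the other URL with identical query parameters, which keeps the comparison fair.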

@dekotale

dekotale commented Sep 1, 2022

Any solution? This is still happening with Splash 3.5.
