Exception: Connection closed while reading from the driver #574
Comments
This error seems to be raised by Playwright. You can try upgrading Playwright and checking your network connection. If that doesn't work, could you share the full exception stack with us? |
@goasleep I installed the latest version of Playwright (1.46.0) and the network is working fine. The complete exception information is as follows:
--- Executing Fetch Node ---
--- (Fetching HTML from: https://blog.csdn.net/mopmgerg54mo/article/details/141028116) ---
---------------------------------------------------------------------------
Exception Traceback (most recent call last)
/tmp/ipykernel_188540/3999856684.py in <module>
19 )
20
---> 21 result = smart_scraper_graph.run()
22 print(result)
~/miniconda3/envs/py39ddmpeng/lib/python3.9/site-packages/scrapegraphai/graphs/smart_scraper_graph.py in run(self)
112
113 inputs = {"user_prompt": self.prompt, self.input_key: self.source}
--> 114 self.final_state, self.execution_info = self.graph.execute(inputs)
115
116 return self.final_state.get("answer", "No answer found.")
~/miniconda3/envs/py39ddmpeng/lib/python3.9/site-packages/scrapegraphai/graphs/base_graph.py in execute(self, initial_state)
261 return (result["_state"], [])
262 else:
--> 263 return self._execute_standard(initial_state)
264
265 def append_node(self, node):
~/miniconda3/envs/py39ddmpeng/lib/python3.9/site-packages/scrapegraphai/graphs/base_graph.py in _execute_standard(self, initial_state)
183 exception=str(e)
184 )
--> 185 raise e
186 node_exec_time = time.time() - curr_time
187 total_exec_time += node_exec_time
~/miniconda3/envs/py39ddmpeng/lib/python3.9/site-packages/scrapegraphai/graphs/base_graph.py in _execute_standard(self, initial_state)
167 with get_openai_callback() as cb:
168 try:
--> 169 result = current_node.execute(state)
170 except Exception as e:
171 error_node = current_node.node_name
~/miniconda3/envs/py39ddmpeng/lib/python3.9/site-packages/scrapegraphai/nodes/fetch_node.py in execute(self, state)
125 return self.handle_local_source(state, source)
126 else:
--> 127 return self.handle_web_source(state, source)
128
129 def handle_directory(self, state, input_type, source):
~/miniconda3/envs/py39ddmpeng/lib/python3.9/site-packages/scrapegraphai/nodes/fetch_node.py in handle_web_source(self, state, source)
277 else:
278 loader = ChromiumLoader([source], headless=self.headless, **loader_kwargs)
--> 279 document = loader.load()
280
281 if not document or not document[0].page_content.strip():
~/miniconda3/envs/py39ddmpeng/lib/python3.9/site-packages/langchain_core/document_loaders/base.py in load(self)
28 def load(self) -> List[Document]:
29 """Load data into Document objects."""
---> 30 return list(self.lazy_load())
31
32 async def aload(self) -> List[Document]:
~/miniconda3/envs/py39ddmpeng/lib/python3.9/site-packages/scrapegraphai/docloaders/chromium.py in lazy_load(self)
109
110 for url in self.urls:
--> 111 html_content = asyncio.run(scraping_fn(url))
112 metadata = {"source": url}
113 yield Document(page_content=html_content, metadata=metadata)
~/.local/lib/python3.9/site-packages/nest_asyncio.py in run(future, debug)
30 loop = asyncio.get_event_loop()
31 loop.set_debug(debug)
---> 32 return loop.run_until_complete(future)
33
34 if sys.version_info >= (3, 6, 0):
~/.local/lib/python3.9/site-packages/nest_asyncio.py in run_until_complete(self, future)
68 raise RuntimeError(
69 'Event loop stopped before Future completed.')
---> 70 return f.result()
71
72 def _run_once(self):
~/miniconda3/envs/py39ddmpeng/lib/python3.9/asyncio/futures.py in result(self)
199 self.__log_traceback = False
200 if self._exception is not None:
--> 201 raise self._exception
202 return self._result
203
~/miniconda3/envs/py39ddmpeng/lib/python3.9/site-packages/playwright/_impl/_connection.py in wrap_api_call(self, cb, is_internal)
510 self._api_zone.set(parsed_st)
511 try:
--> 512 return await cb()
513 except Exception as error:
514 raise rewrite_error(error, f"{parsed_st['apiName']}: {error}") from None
~/miniconda3/envs/py39ddmpeng/lib/python3.9/site-packages/playwright/_impl/_connection.py in inner_send(self, method, params, return_as_dict)
95 if not callback.future.done():
96 callback.future.cancel()
---> 97 result = next(iter(done)).result()
98 # Protocol now has named return values, assume result is one level deeper unless
99 # there is explicit ambiguity.
~/miniconda3/envs/py39ddmpeng/lib/python3.9/asyncio/futures.py in result(self)
199 self.__log_traceback = False
200 if self._exception is not None:
--> 201 raise self._exception
202 return self._result
203
~/miniconda3/envs/py39ddmpeng/lib/python3.9/asyncio/tasks.py in __step(***failed resolving arguments***)
254 # We use the `send` method directly, because coroutines
255 # don't have `__iter__` and `__next__` methods.
--> 256 result = coro.send(None)
257 else:
258 result = coro.throw(exc)
~/miniconda3/envs/py39ddmpeng/lib/python3.9/site-packages/scrapegraphai/docloaders/chromium.py in ascrape_playwright(self, url)
78 logger.info("Starting scraping...")
79 results = ""
---> 80 async with async_playwright() as p:
81 browser = await p.chromium.launch(
82 headless=self.headless, proxy=self.proxy, **self.browser_config
~/miniconda3/envs/py39ddmpeng/lib/python3.9/site-packages/playwright/async_api/_context_manager.py in __aenter__(self)
44 if not playwright_future.done():
45 playwright_future.cancel()
---> 46 playwright = AsyncPlaywright(next(iter(done)).result())
47 playwright.stop = self.__aexit__ # type: ignore
48 return playwright
~/miniconda3/envs/py39ddmpeng/lib/python3.9/asyncio/futures.py in result(self)
199 self.__log_traceback = False
200 if self._exception is not None:
--> 201 raise self._exception
202 return self._result
203
Exception: Connection closed while reading from the driver |
I got it. You are using a Jupyter notebook to run this code. Jupyter has its own async event loop already running, and asyncio.run opens a new event loop, so this error is raised. You could switch to a plain Python file to run your script. If you want to stick with Jupyter, make sure to run the following lines before executing your main script:
!pip install nest-asyncio
import nest_asyncio
nest_asyncio.apply()
graph_config = .... |
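The nested-event-loop failure described above can be reproduced with the standard library alone: calling asyncio.run while another loop is already running (which is always the case inside a Jupyter cell) raises a RuntimeError unless something like nest_asyncio has patched the loop. A minimal illustrative sketch, not taken from this thread:

```python
import asyncio

async def inner():
    return "ok"

async def outer():
    # A loop is already running here (as it always is in Jupyter),
    # so the nested asyncio.run() call fails unless nest_asyncio
    # has patched the loop to allow re-entry.
    try:
        asyncio.run(inner())
        return "no error"
    except RuntimeError as exc:
        return f"RuntimeError: {exc}"

result = asyncio.run(outer())
print(result)  # → RuntimeError: asyncio.run() cannot be called from a running event loop
```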
@goasleep I added the following code in Jupyter and the error still occurs:
import nest_asyncio
nest_asyncio.apply()
In addition, I wrote a Python file to run that code on Linux, and it also reported this error.
|
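When nest_asyncio does not help, another common workaround is to hand the coroutine to a worker thread, which has no running event loop of its own, so asyncio.run works there even from inside Jupyter. A sketch under assumptions (`scrape` is a hypothetical stand-in for the real Playwright scraping coroutine, not code from this thread):

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

async def scrape(url):
    # Hypothetical stand-in for the real async Playwright coroutine.
    await asyncio.sleep(0)
    return f"<html>{url}</html>"

def run_in_fresh_loop(coro):
    # The worker thread has no running event loop, so asyncio.run()
    # is safe there regardless of the caller's environment.
    with ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(asyncio.run, coro).result()

html = run_in_fresh_loop(scrape("https://example.com"))
print(html)  # → <html>https://example.com</html>
```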
I tried it on Linux but I cannot reproduce this problem. Could you run the code below in Jupyter? If you still get the same error, you may want to reach out to the Playwright folks for assistance. @xjtupy
import asyncio
import nest_asyncio
nest_asyncio.apply()
from playwright.async_api import async_playwright
url = "https://blog.csdn.net/mopmgerg54mo/article/details/141028116"
async def main():
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        page = await browser.new_page()
        await page.goto(url, wait_until="domcontentloaded")
        await browser.close()
        print(page)

asyncio.run(main()) |
@goasleep Unfortunately, this problem still occurs
|
Do you get the same error when running the code above? If so, you can ask the Playwright team for help: create a new issue in the Playwright repository and link it back to this issue. I suspect it is an environment problem. I suggest you use Docker to isolate the environment and then try again. @xjtupy |
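Before filing a Playwright issue, it can help to confirm which kind of environment the script is actually executing in, since the two suggested fixes (plain script vs. nest_asyncio in Jupyter) depend on whether a loop is already running. A small stdlib check, illustrative and not from this thread:

```python
import asyncio

def has_running_loop():
    # True inside environments like Jupyter, where an event loop is
    # already running; False in a plain Python script.
    try:
        asyncio.get_running_loop()
        return True
    except RuntimeError:
        return False

# In a plain script (the environment suggested above), no loop is running:
print(has_running_loop())  # → False
```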
Hi, running the following code reports an error:
Exception: Connection closed while reading from the driver
Could you help me solve it?
Thanks