
How do I run wxbot to crawl WeChat Official Account content under a Docker deployment? #293

Closed
WuYu-sky opened this issue Mar 5, 2025 · 10 comments

Comments

@WuYu-sky

WuYu-sky commented Mar 5, 2025

No description provided.

@bigbrother666sh
Member

This requires integrating wxbot's Docker setup and putting together a compose file; PRs are welcome.
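For anyone who wants to attempt this before an official compose file lands, here is a minimal sketch of the idea. Every image name, port, path and environment variable below is a placeholder rather than the project's actual configuration:

```yaml
# Minimal sketch only. Image names, build paths, ports and env vars are
# placeholders and must be adapted to the actual wxbot and wiseflow setups.
services:
  wxbot:
    image: your-wxbot-image:latest   # placeholder; use wxbot's own image or build
    ports:
      - "8066:8066"                  # assumed wxbot HTTP/callback port
  wiseflow:
    build: .
    env_file: ./core/.env            # adjust to wherever the .env actually lives
    environment:
      - WXBOT_URL=http://wxbot:8066  # hypothetical variable: point wiseflow at the wxbot service
    depends_on:
      - wxbot
```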

@leoxu2024

[INIT].... → Crawl4AI 0.5.0.post2
2025-03-09 05:16:22.423 | DEBUG | general_process:main_process:147 - process new url, still 0 urls in working list
[ERROR]... × https://mp.weixin.qq.com/s?__biz=MzUxNjg4NDEzNA==&... | Error:
┌───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ × Unexpected error in _crawl_web at line 579 in _crawl_web (../../../.local/lib/python3.10/site- │
│ packages/crawl4ai/async_crawler_strategy.py): │
│ Error: Failed on navigating ACS-GOTO: │
│ Page.goto: net::ERR_NETWORK_CHANGED at │
│ https://mp.weixin.qq.com/s?__biz=MzUxNjg4NDEzNA==&mid=2247522395&idx=1&sn=0a3aabc3bb6b5cdb8fe8fe313225fd4b │
│ Call log: │
│ - navigating to │
│ "https://mp.weixin.qq.com/s?__biz=MzUxNjg4NDEzNA==&mid=2247522395&idx=1&sn=0a3aabc3bb6b5cdb8fe8fe313225fd4b", │
│ waiting until "commit" │
│ │
│ │
│ Code context: │
│ 574 response = await page.goto( │
│ 575 url, wait_until=config.wait_until, timeout=config.page_timeout │
│ 576 ) │
│ 577 redirected_url = page.url │
│ 578 except Error as e: │
│ 579 → raise RuntimeError(f"Failed on navigating ACS-GOTO:\n{str(e)}") │
│ 580 │
│ 581 await self.execute_hook( │
│ 582 "after_goto", page, context=context, url=url, response=response, config=config │
│ 583 ) │
│ 584 │
└───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

2025-03-09 05:16:40.268 | WARNING | general_process:main_process:170 - https://mp.weixin.qq.com/s?__biz=MzUxNjg4NDEzNA==&mid=2247522395&idx=1&sn=0a3aabc3bb6b5cdb8fe8fe313225fd4b failed to crawl
2025-03-09 05:16:40.468 | DEBUG | general_process:main_process:236 - task finished, focus_id: 6g27p3ibst0t279
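
net::ERR_NETWORK_CHANGED is a Chromium-level error meaning the network connection changed while the page was loading; inside Docker this usually points to flaky DNS or container networking rather than a wiseflow parsing bug. A minimal sketch to check whether the URL is even reachable with crawl4ai from inside the same container (this assumes crawl4ai's AsyncWebCrawler / CrawlerRunConfig API; the retry loop is only illustrative):

```python
# Reachability check to run inside the container; assumes crawl4ai >= 0.4.2,
# where AsyncWebCrawler and CrawlerRunConfig are importable from the top level.
import asyncio
from crawl4ai import AsyncWebCrawler, CrawlerRunConfig

URL = "https://mp.weixin.qq.com/s?__biz=MzUxNjg4NDEzNA==&mid=2247522395&idx=1&sn=0a3aabc3bb6b5cdb8fe8fe313225fd4b"

async def main():
    async with AsyncWebCrawler() as crawler:
        for attempt in range(3):  # retry a couple of times in case the network flaps again
            result = await crawler.arun(URL, config=CrawlerRunConfig(page_timeout=60000))
            if result.success:
                print(f"attempt {attempt + 1}: fetched OK (status {result.status_code})")
                return
            print(f"attempt {attempt + 1}: {result.error_message}")

asyncio.run(main())
```

If this fails repeatedly with the same ERR_NETWORK_CHANGED, the problem is in the container's networking rather than in wiseflow.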

@leoxu2024

0.3.9 cannot crawl WeChat Official Account articles and throws the error above. How can I fix this?

@bigbrother666sh
Member

Try the 0.3.9-patch2 version and re-pull the code.
That version still has a problem parsing WeChat Official Account article content, though, so I suggest waiting until this weekend, when we will release 0.3.9-patch3.
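
For reference, re-pulling the code at the patch release would look roughly like the following; the tag name and the requirements path are assumptions, so check the repository's releases page and layout for the exact values:

```bash
cd wiseflow
git fetch --all --tags
git checkout V0.3.9-patch2              # tag name is an assumption; use whatever the releases page lists
pip install -r core/requirements.txt    # path may differ depending on the repo layout
```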

@leoxu2024

(quotes the crash log from the earlier comment)

@bigbrother666sh Hi, how is the fix for the WeChat Official Account crawling bug coming along? Will patch3 solve it? It still fails to crawl right now.

@leoxu2024

Try the 0.3.9-patch2 version and re-pull the code. That version still has a problem parsing WeChat Official Account article content, though, so I suggest waiting until this weekend, when we will release 0.3.9-patch3.

@bigbrother666sh Still waiting for patch3 to fix the problem of WeChat Official Account articles not being crawlable.

@bigbrother666sh
Member

Please wait one more day; it should be released tomorrow night.

@leoxu2024

Please wait one more day; it should be released tomorrow night.

@bigbrother666sh I have updated to 3.9-patch3, but WeChat Official Account articles still cannot be crawled. The error is as follows:
2025-03-19 12:05:26.042 | DEBUG | general_process:main_process:62 - focus_id: 6g27p3ibst0t279, focus_point: AI、人工智能、知
[INIT].... → Crawl4AI 0.5.0.post4
2025-03-19 12:05:27.265 | DEBUG | general_process:main_process:145 - process new url, still 7 urls in working list
Task exception was never retrieved
future: <Task finished name='Task-28' coro=<main_process() done, defined at /home/jzh/wiseflow/wiseflow-3.9/weixin_mp/../core/r a dict')>
Traceback (most recent call last):
File "/home/jzh/wiseflow/wiseflow-3.9/weixin_mp/../core/general_process.py", line 168, in main_process
result = custom_scrapers[domain](fetch_result)
File "/home/jzh/wiseflow/wiseflow-3.9/weixin_mp/../core/scrapers/mp_scraper.py", line 33, in mp_scraper
raise TypeError('fetch_result must be a CrawlResult or a dict')
TypeError: fetch_result must be a CrawlResult or a dict

@bigbrother666sh
Member

After upgrading to 3.9-patch3, first run pip uninstall crawl4ai
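
The TypeError above suggests the result object is coming from a crawl4ai install that wiseflow's mp_scraper does not recognize, i.e. a stale or mismatched crawl4ai package is being picked up, which is why removing it first helps. A hedged sketch of the clean-up sequence (the requirements path is an assumption; adjust it to the repo layout):

```bash
pip uninstall -y crawl4ai
pip install -r core/requirements.txt                         # reinstall the version pinned by wiseflow
python -c "import crawl4ai; print(crawl4ai.__version__)"     # sanity-check which crawl4ai is now imported
```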

@leoxu2024

After upgrading to 3.9-patch3, first run pip uninstall crawl4ai

@bigbrother666sh After upgrading to 3.9-patch3, WeChat Official Account articles can be crawled now. Thanks.
