result.url is inconsistent with the actual browser display #886
Unanswered
xih1919
asked this question in
Forums - Q&A
Replies: 0 comments
I am very interested in crawl4ai, but I don't know much about how it is implemented. I run a deep crawl with DFSDeepCrawlStrategy, include_external=False, and stream=True, and print the url of each result as the crawl proceeds. However, the URL shown in the browser page does not match the url in the result: the browser ends up on a different domain. Sorry, I can't share the crawled website because it is an internal, private site. The cross-domain URL shown in the browser belongs to an internal identity authentication site.
The Python program is roughly as follows:
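import asyncio

from crawl4ai import AsyncWebCrawler, BrowserConfig, CrawlerRunConfig
from crawl4ai.deep_crawling import DFSDeepCrawlStrategy


async def main():
    # Run a visible (non-headless) browser and preload the session cookie
    # for the internal site.
    browser_conf = BrowserConfig(
        headless=False,
        cookies=[
            {"name": "uid", "value": "adfcf4111111", "url": "https://aaaa.bbbb.cccc.com/pages/1"},
        ],
    )
    # The deep crawl uses DFSDeepCrawlStrategy with include_external=False
    # and stream=True, as described above.


if __name__ == "__main__":
    asyncio.run(main())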
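One way to pin down when the mismatch happens is to compare the host of the URL that was requested with the host of the page the browser actually ends up on. The sketch below is only an illustration using the standard library: `host_of` and `is_cross_domain` are hypothetical helper names, and the login URL is made up. Inside the crawl loop the first argument could be `result.url`. The second would be the final address, for example a `redirected_url` field on the result if the installed crawl4ai version exposes one, which is an assumption to verify.

```python
from urllib.parse import urlsplit


def host_of(url: str) -> str:
    """Return the lowercase hostname of a URL, or "" if it has none."""
    return (urlsplit(url).hostname or "").lower()


def is_cross_domain(requested_url: str, final_url: str) -> bool:
    """Return True when the two URLs point at different hosts."""
    return host_of(requested_url) != host_of(final_url)


# Hypothetical example values: the first URL is the cookie URL from the
# program above; the login host is invented for illustration only.
requested = "https://aaaa.bbbb.cccc.com/pages/1"
final = "https://login.example.com/auth?return=%2Fpages%2F1"

print(is_cross_domain(requested, final))      # True
print(is_cross_domain(requested, requested))  # False
```

If the check reports True for a page, the browser was most likely redirected to the authentication site, which would suggest the cookie was not accepted for that request.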