Starting crawl from subdirectory #196
Comments
I notice this works with unlighthouse@0.5.1 but not 0.6.0 or after.
--include-urls does not solve this issue. It hangs the same as the original issue.
Hi @Robanna777, thanks for the issue. Seems like this wasn't supported and worked by accident in earlier versions. I've pushed up a fix for it, you can use it as: npx unlighthouse@0.11.5 --site https://teamsideline.com/sites/apex/home Let me know if you have any issues with it.
That's awesome. Thank you. That works perfectly.
Details
When I run a command with --site https://site/subdirector on my Mac, everything works as I'd like: it starts with that page, doesn't find a sitemap file, and falls back to crawling from https://site/subdirector. On a Windows machine, however, the crawl starts at the domain root, https://site.
Is there a configuration option that forces it to start at the subdirectory? I tried -include /subdirector/.* but that doesn't seem to do it. With that, it just hangs.
Debug output shows "GET /api/reports 200 object - 0ms" repeating over and over.
Mac:
Successfully connected to https://teamsideline.com/Layouts/minimalist/Home.aspx?d=ZHcj%2bsPHK5g%2bZkLyQaVo0Q%3d%3d/, status code: 200. unlighthouse 07:50:32
───────────────────────────────────────────────────╮
│ │
│ ⛵ unlighthouse cli @ v0.5.0 │
│ │
│ ▸ Scanning: https://teamsideline.com/Layouts/minimalist/Home.aspx?d=ZHcj%2bsPHK5g%2bZkLyQaVo0Q%3d%3d/ │
│ ▸ Route Discovery: Crawler
Windows:
Successfully connected to https://teamsideline.com/. (Status: 200). Unlighthouse 2:50:40 PM
─────────╮
│ │
│ ⛵ Unlighthouse cli @ v0.11.4 │
│ │
│ ▸ Scanning: https://teamsideline.com/ │
│ ▸ Route Discovery: Crawler
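For anyone reading along, the behaviour being asked for (only crawl URLs that live under the path given to --site) can be sketched as a small scope check. This is a minimal illustration and not Unlighthouse's actual code: the helper names `stripTrailingSlash` and `isUnderSubdirectory` and the example.com URLs are made up for the example.

```typescript
// Hypothetical helpers, NOT part of Unlighthouse. They only illustrate the
// idea of keeping a crawl inside the subdirectory passed via --site.

// Treat "/sub" and "/sub/" as the same path, but leave the root "/" alone.
function stripTrailingSlash(path: string): string {
  return path.length > 1 && path.endsWith("/") ? path.slice(0, -1) : path;
}

// True when `candidate` is on the same origin as `site` and its path is the
// site's path or something nested below it.
function isUnderSubdirectory(candidate: string, site: string): boolean {
  const c = new URL(candidate);
  const s = new URL(site);
  if (c.origin !== s.origin) return false;

  const base = stripTrailingSlash(s.pathname);
  const path = stripTrailingSlash(c.pathname);
  if (base === "/") return true; // site is the domain root: everything is in scope

  // Compare on a segment boundary so "/sub" does not match "/subway".
  return path === base || path.startsWith(base + "/");
}

// Example with placeholder URLs (assumptions, not from the report):
const site = "https://example.com/subdirectory";
console.log(isUnderSubdirectory("https://example.com/subdirectory/page", site)); // true
console.log(isUnderSubdirectory("https://example.com/", site)); // false
console.log(isUnderSubdirectory("https://example.com/subdirectory-other", site)); // false
```

Comparing on a path-segment boundary rather than with a bare prefix or a loose pattern avoids pulling in sibling paths that merely share the same leading characters.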