Skip to content
This repository has been archived by the owner on Jan 11, 2023. It is now read-only.

Allow export to take multiple entrypoints to use when crawling. #825

Merged
merged 4 commits into from Jul 28, 2019

Conversation

pngwn
Copy link
Member

@pngwn pngwn commented Jul 26, 2019

I am PRing this again because I'm a genius and the original PR was from the master of my svelte fork. I was then too terrified to actually try to do anything on another branch in-case I broke something.

Original PR: #756.


This PR allows you to pass a list of entry points to export, which allows users to fully export apps that are not entirely linked by anchor tags.

I'm not sure if this is the best way to do it but it seems to work, I've added a test that provides a few unlinked entry points in the various styles people might try to pass in. I've done my best to handle unnecessary index.htmls and /s (I'm no regex expert so that might need looking at). I also moved the oninfo call so users get a console message for each entry point, followed by the list of saved files for that 'crawl'.

I added .vscode to the gitignore as I was having trouble disabling my autoformatting with out it. I don't know what has happened with the package-lock.json it just seems to add some stuff when I install it.

Closes #749.

@Conduitry
Copy link
Member

What does the currentRoot variable do that's different from the i in the loop? I'm confused about how this is implemented.

@pngwn
Copy link
Member Author

pngwn commented Jul 28, 2019

From what I can tell, nothing, I probably just thought it was more descriptive. I don't really remember, I wrote this code quite a while ago.

In terms of implementation, It's just running handle in a loop instead of doing it just the once on /.

@@ -201,7 +212,7 @@ async function _export({
type = 'text/html';
body = `<script>window.location.href = "${location.replace(origin, '')}"</script>`;

tasks.push(handle(resolve(root.href, location)));
tasks.push(handle(resolve(entryPoints[currentRoot].href, location)));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the reason for resolving this relative to the current entry point? I think this should still always be relative to root. My understanding is that the multiple entry points is just so that the crawler can find pages it wouldn't ordinarily find. All of them should still have the same base href value.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test you wrote still passes when using root.href, fwiw

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think you're right. I don't think I looked too deeply into this but I guess this clause is handling redirects/ 300s? I think it's using the Location header and building a new url to handle from that. My tests definitely don't test this behaviour, I don't know if any of the existing ones do either but resolving with anything but the root url would fail, I think.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, looking back this clears up some of the confusion I had at the time with the whole entryPoints[currentRoot] thing, I wondered why that was necessary. This correction removes the need for that entirely. I probably should have taken more time to understand the code before implementing this.

@onionhammer
Copy link

It looks like this was broken recently

@kevlened
Copy link

kevlened commented Jul 9, 2020

@onionhammer I also thought this was broken; I was using the wrong syntax. I tried comma-separation and multiple --entry flags, but entries should be separated by spaces: sapper export --entry="/ /products /products/example"

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

export routes not reachable by <a> elements
4 participants