fix(v2): markdown reference to file should not be page not found #2064

endiliey · 2019-11-28T05:58:53Z

Motivation

See https://v2.docusaurus.io/feedback/p/ability-to-download-static-assets-such-as-pdfs

In v1 it was possible to link to PDFs and other non-image static files in markdown using the syntax [file name](/path/to/asset.pdf). When the user clicked this link, the browser would trigger a download. In v2, Docusaurus routing does not allow this and results in a 404, even though hitting my-site.com/path/to/asset.pdf directly downloads the file just fine.

This happens because we replace every a in MDX with Link, when we should have checked first whether the target is a file. 99.99% of URLs that link to a file have a .xxx extension in the last part of the URL.

There might be edge cases where some bad user has a URL like /docs/docusaurus.png that is a valid internal URL (not a file). That is still OK, as it only causes a page refresh.
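
The check described above can be sketched roughly as follows. This is only an illustration, not the code merged in this PR: `looksLikeFile` and `elementForMarkdownLink` are hypothetical names, and the rule "internal path whose last segment ends in a dot plus extension" is an assumption drawn from the description.

```javascript
// Hypothetical helper, not the code merged in this PR.
// Returns true when the last path segment of an internal URL ends in a
// dot followed by an extension, e.g. "/path/to/asset.pdf".
function looksLikeFile(href) {
  // Only consider internal, root-relative URLs.
  if (typeof href !== 'string' || !href.startsWith('/')) {
    return false;
  }
  // Drop the query string and hash before inspecting the path.
  const pathname = href.split(/[?#]/)[0];
  const lastSegment = pathname.slice(pathname.lastIndexOf('/') + 1);
  return /\.[^./]+$/.test(lastSegment);
}

// Chooses which element a markdown link would be rendered as:
// a plain anchor (full page load) or the client-side Link.
function elementForMarkdownLink(href) {
  return looksLikeFile(href) ? 'a' : 'Link';
}
```

Under these assumptions, /path/to/asset.pdf is rendered as a plain anchor and /docs/installation keeps client-side navigation. A page route such as /docs/docusaurus.png is also treated as a file, which is the edge case mentioned above: it still loads, just with a full page refresh.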

Have you read the Contributing Guidelines on pull requests?

yes

Test Plan

Before

After

docusaurus-bot · 2019-11-28T06:03:07Z

Deploy preview for docusaurus-2 ready!

Built with commit 457e867

https://deploy-preview-2064--docusaurus-2.netlify.com

docusaurus-bot · 2019-11-28T06:04:05Z

Deploy preview for docusaurus-preview ready!

Built with commit 457e867

https://deploy-preview-2064--docusaurus-preview.netlify.com

lex111

What would be the disadvantages of moving this check to the Link component?

docusaurus/packages/docusaurus/src/client/exports/Link.js

Lines 70 to 72 in 6ff6656

    
    return !targetLink || !isInternal ? (
      // eslint-disable-next-line jsx-a11y/anchor-has-content
      <a {...props} href={targetLink} />

yangshun

There might be edge cases where some bad user has a URL like /docs/docusaurus.png that is a valid internal URL (not a file). That is still OK, as it only causes a page refresh.

How about we have a convention that any path starting with "public" gets whitelisted from client-side navigation? Users can then put their PDF files within website/static/public.

Do we have all the possible routes on the client side? If we do, we can know which paths are not part of the app and then fall back to full-page navigation.

On a side note, the TOC for the Client API page is broken... I think it's treating the contents as a component to be rendered.

wgao19 · 2019-11-29T01:32:32Z

I feel parsing can be more evil than maintaining a known list of link types that are supposed to be anchors.

endiliey · 2019-11-29T03:32:01Z

What would be the disadvantages of moving this check to the Link component?

Finer-grained control. Note that this PR only changes how link syntax in markdown files is converted.

So [some link](/some/url) becomes either

<a href="/some/url">Some link</a>

or

<Link to="/some/url">Some link</Link>

If we move it to Link, then anyone with a valid /docs/docusaurus.png route (not a file) gets a page refresh for it, even when they explicitly use <Link />, for example in src/pages/index.js.

How about we have a convention that any path starting with "public" gets whitelisted from client-side navigation? Users can then put their PDF files within website/static/public.

I feel that's going to be harder to teach and might cause a lot of parsing problems. For example, what if their baseUrl itself is /public, or the docs routeBasePath is /public? That might be okay, but teaching this kind of non-standard convention can be hard.
Actually, we could get away with a better convention, like starting the path with https: so that it won't use client-side navigation.

Do we have all the possible routes on the client side? If we do, we can know which paths are not part of the app and then fall back to full-page navigation.

We do, but I see a lot of problems with this. If we have 2000 routes, we would have to try to match the URL against them for every single Link.
Also, anyone's routes can be dynamic. For example, /docs/some-typo-url will in fact match the nested route /docs/:route. What if my static file is located at website/docs/superman.png? We would be back to the same Page Not Found problem.
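
To make the dynamic-route problem concrete, here is a minimal sketch with a deliberately simplified matcher. `matchesRoute`, `isKnownRoute`, and the `routes` list are hypothetical and only support literal and :param segments; the real router has more rules than this.

```javascript
// Hypothetical, simplified matcher: supports only literal segments and
// ":param" segments. It is not the router Docusaurus actually uses.
function matchesRoute(pattern, pathname) {
  const patternParts = pattern.split('/').filter(Boolean);
  const pathParts = pathname.split('/').filter(Boolean);
  if (patternParts.length !== pathParts.length) {
    return false;
  }
  // A ":param" segment accepts any value, including "superman.png".
  return patternParts.every(
    (part, i) => part.startsWith(':') || part === pathParts[i]
  );
}

// Linear scan over all routes; with thousands of routes this would run
// for every link that is rendered.
function isKnownRoute(routes, pathname) {
  return routes.some((route) => matchesRoute(route, pathname));
}

// Assumed example routes, for illustration only.
const routes = ['/', '/blog', '/docs/:route'];
```

With these assumed routes, isKnownRoute(routes, '/docs/superman.png') is true because /docs/:route accepts any final segment, so a route lookup would still send the static file through client-side navigation.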

Another dynamic example is https://v2.docusaurus.io/feedback/p/support-for-markdown-to-do-list-feature

/feedback/p/xxxx is not a valid route in Docusaurus. It matches the wildcard "*" page-not-found component, but I put a hack in

docusaurus/website/src/theme/NotFound.js

Lines 12 to 15 in 34e942e

    
    function NotFound({location}) {
      if (/^\/feedback/.test(location.pathname)) {
        return <Feedback />;
      }

so that if the path is /feedback/xxx, I render the Feedback component.
I can point out a lot more problems, such as every Link component depending on the routes, causing hot reload performance issues, and so on.

On a side note, the TOC for the Client API page is broken... I think it's treating the contents as a component to be rendered.

Oh, I guess that's separate from this PR. The current v2 site also has it.

Here's my recommendation:

Use this PR and call it a day. It should solve most of the described problem. The only downside is that it might cause a page refresh (using a instead of Link) if the markdown syntax used is [some link](/some/url.xxx). However, a good website should not have .xxx in the last part of a URL that is a valid route (not a file) :x
Or, tell users to use the https:// convention, since that forces not using client-side navigation.

Any other recommendations are appreciated.

Edit: We did this for v1 routing, but blacklisting .html. Since it's Express.js routing, the regex starts with ^\/ (internal only).

docusaurus/packages/docusaurus-1.x/lib/server/routing.js

Lines 16 to 18 in 2f9a368

    
    function dotfiles() {
      return /(?!.*html$)^\/.*\.[^\n/]+$/;
    }
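
As a quick illustration of how that v1 pattern behaves, the sketch below runs the same regex against a few paths. The sample paths and the `isDotfile` wrapper are made up for this example.

```javascript
// Same regex as the v1 routing helper quoted above.
function dotfiles() {
  return /(?!.*html$)^\/.*\.[^\n/]+$/;
}

// Hypothetical wrapper for trying the pattern on a path.
function isDotfile(pathname) {
  return dotfiles().test(pathname);
}

// Sample paths, chosen only for illustration.
const samples = [
  '/path/to/asset.pdf',         // ends in an extension: treated as a file
  '/docs/installation.html',    // ends in "html": excluded by the lookahead
  '/docs/installation',         // no dot: not a file
  '/docs/docusaurus.config.js', // a page route that still looks like a file
];

for (const sample of samples) {
  console.log(sample, isDotfile(sample));
}
```

The negative lookahead rejects anything ending in html, the ^\/ anchor restricts the match to internal paths, and the tail requires a dot with no further slash after it.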

yangshun

Ok then 😄

endiliey · 2019-11-29T04:16:23Z

Oh, we are one such bad user: https://v2.docusaurus.io/docs/docusaurus.config.js is not a file 😭😭😭😭

yangshun · 2019-11-29T04:19:21Z

Yes, see the edit history of my comment 🤣. I wrote it without realizing you already considered it, then I removed it.

We cannot assume that our internal links don't have a period in them. For example, we have a page docusaurus.config.js and our client API page links to it.

The consequence is mainly a full-page refresh, which still works, but I'm not sure if there's a better way around it.

endiliey · 2019-11-29T04:21:15Z

The good thing is that the page refresh only happens with markdown linking. Using Link still works perfectly 🤣

yangshun · 2019-11-29T04:25:06Z

Yeah! At first I was trying out the docs bottom navigation and was wondering why there was no full-page refresh lol

fix(v2): markdown reference to file should not be page not found

457e867

endiliey requested review from lex111, wgao19 and yangshun as code owners November 28, 2019 05:58

endiliey added the pr: bug fix This PR fixes a bug in a past release. label Nov 28, 2019

facebook-github-bot added the CLA Signed Signed Facebook CLA label Nov 28, 2019

lex111 reviewed Nov 28, 2019

yangshun suggested changes Nov 28, 2019

yangshun approved these changes Nov 29, 2019

yangshun merged commit 3430fbf into master Nov 29, 2019

yangshun deleted the endi/markdown-linking branch November 29, 2019 04:18
