Relative links do not work for local files (opened using 'file://' scheme) #12415

Closed
septatrix opened this issue Sep 25, 2020 · 19 comments

@septatrix

link.pdf

Configuration:

  • Web browser and its version: Mozilla Firefox 82.0b3
  • Operating system and its version: Ubuntu 20.04.1 LTS
  • PDF.js version: PDF.js: 2.7.69
  • Is a browser extension: no

Steps to reproduce the problem:

  1. Open the attached PDF
  2. The first link is shown and rendered, the second is not

What is the expected behavior? (add screenshot)
image

What went wrong? (add screenshot)
image

LaTeX Code to generate the pdf
\documentclass{article}
\usepackage{hyperref}

\begin{document}

Also see \url{www.example.com} or \url{file:///usr/share/doc/firefox/copyright}.

\end{document}
@Snuffleupagus
Collaborator

For (obvious) security reasons, browsers themselves will generally disable linking to file:// URLs; hence these types of links would not work anyway even if they're rendered. (Rendering a link which does nothing when clicked doesn't really seem helpful/useful.)

Given that this is a security feature of browsers, there's unfortunately nothing that can be done here and the issue is thus invalid.

@septatrix
Author

> For (obvious) security reasons, browsers themselves will generally disable linking to file:// URLs; hence these types of links would not work anyway even if they're rendered. (Rendering a link which does nothing when clicked doesn't really seem helpful/useful.)
>
> Given that this is a security feature of browsers, there's unfortunately nothing that can be done here and the issue is thus invalid.

The same can also be observed for relative links like ../Downloads/link.pdf, which are not affected by the mentioned security concerns. The act of linking to such URLs is also not disabled; only loading e.g. JS files or images from the file:// scheme is prohibited. Furthermore, Chrome and Evince, among others, allow file:// URLs, and PDF.js should too as it could e.g. be embedded in a local Electron application or similar.

That being said, the issue also occurs with relative links, as I said above. Even if you decide not to support file:// links (whose handling IMO should be delegated to the browser), relative links should definitely be supported, as the current behaviour prevents e.g. cross-document links. Example file: link.pdf

And the corresponding code:

\documentclass{article}
\usepackage{hyperref}

\begin{document}

Also see \url{www.example.com} or \url{../Downloads/link.pdf}.

\end{document}

@Snuffleupagus
Collaborator

> [...] PDF.js should too as it could e.g. be embedded in a local Electron application or similar.

Please note that that kind of embedding is not something that the PDF.js library supports (or intends to support) out-of-the-box.

Please also see http://mozilla.github.io/pdf.js/getting_started/#introduction (emphasis mine):

> The viewer is built on the display layer and is the UI for PDF viewer in Firefox and the other browser extensions within the project. It can be a good starting point for building your own viewer. However, we do ask if you plan to embed the viewer in your own site, that it not just be an unmodified version. Please re-skin it or build upon it.


> That being said, the issue also occurs with relative links, as I said above [...]

Note that relative links, using protocols such as e.g. HTTP, already work in e.g. the built-in Firefox PDF viewer. This is achieved by providing the docBaseUrl option when calling getDocument, see

pdf.js/src/display/api.js

Lines 136 to 138 in 120c5c2

* @property {string} [docBaseUrl] - The base URL of the document, used when
* attempting to recover valid absolute URLs for annotations, and outline
* items, that (incorrectly) only specify relative URLs.
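
As a rough illustration of what "recovering a valid absolute URL" from a relative one involves, the sketch below uses only the standard URL constructor. The helper name resolveLink and the example paths are hypothetical and not part of PDF.js; the sketch shows plain URL resolution only, not the additional protocol checks that PDF.js applies afterwards.

```javascript
// Hypothetical helper (not a PDF.js function): resolve a link target
// against a base URL, returning null when no absolute URL can be formed.
function resolveLink(target, docBaseUrl) {
  try {
    // With a base, relative targets such as "../Downloads/link.pdf" resolve;
    // without one, the URL constructor throws for relative input.
    return new URL(target, docBaseUrl).href;
  } catch {
    return null;
  }
}

// Example values are assumptions chosen for illustration.
console.log(
  resolveLink("../Downloads/link.pdf", "http://localhost:8000/Documents/main.pdf")
); // "http://localhost:8000/Downloads/link.pdf"
console.log(resolveLink("../Downloads/link.pdf", undefined)); // null
```

Without a base URL there is nothing to resolve the relative target against, which is why a relative link on its own cannot be turned into an absolute one.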

@septatrix
Author

> Note that relative links, using protocols such as e.g. HTTP, already work in e.g. the built-in Firefox PDF viewer. This is achieved by providing the docBaseUrl option when calling getDocument, see

Yes, but this only works when opening a PDF from a URL, not for a local PDF. As I, and probably many others, use Firefox as the primary PDF viewer, this is limiting. When I start a local file server, I can open the PDF without problems and the link is visible; when I open the same file from the file explorer (i.e. via the file:// scheme), the link does not work.

@septatrix
Author

I do not see a reason to not include file: in the function here

pdf.js/src/shared/util.js

Lines 368 to 382 in 120c5c2

function _isValidProtocol(url) {
  if (!url) {
    return false;
  }
  switch (url.protocol) {
    case "http:":
    case "https:":
    case "ftp:":
    case "mailto:":
    case "tel:":
      return true;
    default:
      return false;
  }
}

The only attack surface I can think of is embedded JavaScript inside a local PDF file which, upon opening, sends some local resources to a remote address. However, I do not know whether that is possible at all, given that all resources are embedded in the PDF itself. As far as I can see, the above function is only used for links, and as such I do not see a surface for an XSS attack. If I missed something, please tell me. That said, most other PDF viewers I know allow such links, so mitigating such an attack should be possible if it exists.

@septatrix
Author

May I ask why relative links are resolved to an absolute path at all? It seems easier to just leave them as a relative link which would also fix this problem.

septatrix changed the title from "File links do not work and their boxes are not rendered" to "Relative links do not work for local files (opened using 'file://' scheme)" on Sep 26, 2020
@septatrix
Author

> For (obvious) security reasons, browsers themselves will generally disable linking to file:// URLs; hence these types of links would not work anyway even if they're rendered. (Rendering a link which does nothing when clicked doesn't really seem helpful/useful.)

I would also like to point out that Firefox and Chrome both support file:// links when the current protocol is also file:. Therefore, a simple fix respecting that behaviour would be to extend _isValidProtocol with a second argument, the same base URL that is passed to createValidAbsoluteUrl, and to also return true if both URLs have the file: scheme.
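
The proposal above could be sketched roughly as follows. This is an illustration only: the function name isValidProtocolWithBase, the ALLOWED_PROTOCOLS set, and the baseUrl parameter are hypothetical and not part of PDF.js, and the sketch says nothing about whether the browser would actually permit the resulting navigation.

```javascript
// Protocols accepted by the _isValidProtocol snippet quoted earlier in this thread.
const ALLOWED_PROTOCOLS = new Set(["http:", "https:", "ftp:", "mailto:", "tel:"]);

// Hypothetical variant of the check: in addition to the allowed protocols,
// accept a file: target when the base URL (the document location) is also file:.
function isValidProtocolWithBase(url, baseUrl) {
  if (!url) {
    return false;
  }
  if (ALLOWED_PROTOCOLS.has(url.protocol)) {
    return true;
  }
  // Assumption: baseUrl is a URL object or null/undefined.
  return url.protocol === "file:" && !!baseUrl && baseUrl.protocol === "file:";
}

// Usage with made-up paths:
const target = new URL("file:///home/user/Downloads/link.pdf");
console.log(isValidProtocolWithBase(target, new URL("file:///home/user/Documents/main.pdf"))); // true
console.log(isValidProtocolWithBase(target, new URL("https://example.com/main.pdf"))); // false
```

Keeping the file: case separate from the allowed set means a document served over http: or https: would still be unable to link to local files under this check.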

@septatrix
Author

@Snuffleupagus would you be open to the above solution?

@Snuffleupagus
Collaborator

> @Snuffleupagus would you be open to the above solution?

No, since what you suggest is too simplistic and it would not actually work in the Firefox PDF Viewer anyway.

The technical reason for this is that the Firefox PDF Viewer actually runs with a resource:// URL (a Firefox-internal concept), and those are not allowed to link to e.g. the file:// protocol (since that's then regarded as a cross-origin navigation).

@septatrix
Author

Okay, I see. But why is cross-origin navigation prohibited? The sole act of linking should not open an attack surface, or what am I missing?

@Snuffleupagus
Collaborator

Snuffleupagus commented Oct 2, 2020

> But why is cross-origin navigation prohibited?

Because otherwise e.g. a web-server (running at http://) may be able to access your local files (at file://), which would pose a huge security risk; hence why browsers generally prevent all kinds of navigation to file:// URLs from other protocols.

@septatrix
Author

So the browser does not differentiate between a link and a resource, as a link also loads a resource to some extent. I see, that makes sense, thank you. However, the current implementation means that protocols like tg: or whatsapp:, or any other non-whitelisted protocol, are also prohibited, even though they would theoretically be allowed from any other protocol. Furthermore, I would still like to see the link rendered, or at least the outlines (if specified), as not rendering them deviates from the PDF spec.

@timvandermeij
Contributor

timvandermeij commented Oct 2, 2020

Rendering a link that's not actually doing anything when clicked is guaranteed to result in an equal number of questions, if not more. By definition a link has an action when clicked, so since we cannot provide an action in this case due to browser limitations there's no point in my opinion in rendering the link at all.

In a custom deployment of PDF.js you can of course tweak this, but it's not something we support.

@septatrix
Author

How about other protocols like e.g. whatsapp:?

@timvandermeij
Contributor

timvandermeij commented Oct 3, 2020

They are evaluated on a case-by-case basis. Given that tel is allowed:

case "tel:":

I don't immediately see an issue for whatsapp:.

@septatrix
Author

septatrix commented Oct 6, 2020

I would like to keep this issue open. I understand that my initial solution (just allowing the file protocol) could lead to a degraded user experience, as the links would not do anything. On the other hand, the issue is not fixed by any means and still represents a violation of the PDF spec. Furthermore, this issue prevents links between local PDFs from working, and such links are not too rare in larger documents. Therefore I would like to take a closer look at this, and maybe even at the original bug on the Mozilla tracker regarding relative links, and see if there may be an alternative solution which would work for both use cases.

Maybe this bug should, however, be reported in the Firefox bug tracker, as it seems impossible to implement a fix in PDF.js alone and it would rather need changes in how Firefox embeds PDF.js?

@Snuffleupagus
Collaborator

Snuffleupagus commented Oct 7, 2020

> regarding relative links

The PDF.js viewer is purposely not supporting relative links, and that simply won't change, without docBaseUrl being set.

There's (generally speaking) no problem generating an absolute URL even for a file:// URL, but as was already mentioned in #12415 (comment) the problem is that the PDF Viewer (in Firefox) runs with a resource:// URL.

> see if there may be an alternative solution which would work for both use cases.

The only solution would be if a resource:// URL was allowed to link to a file:// URL, which we however cannot fix here (and there are probably security considerations involved as well).

@ericrbg

ericrbg commented Nov 14, 2020

I agree that something should be done about this. You can end up with hideously broken files like mwe.pdf if you open it locally on FF instead of on a PDF viewer, but if you open it with, say, the online pdf.js demo, it'll work fine.

LaTeX to generate the presentation
\documentclass{beamer}
\usetheme{Hannover}
\title{Lorem Ipsum}
\begin{document}

\frame{\titlepage}
\section{AAAAAAAA}

\begin{frame}
    \frametitle{A}
    Lorem ipsum, dolor sit amet!
\end{frame}

\section{BBBBBBBB}

\subsection{1111111111111}
\begin{frame}
    \frametitle{B1}
    Lorem ipsum, dolor sit amet!
\end{frame}

\subsection{2222222222222}
\begin{frame}
    \frametitle{B2}
    Lorem ipsum, dolor sit amet!
\end{frame}

\section{CCCCCCCC}

\begin{frame}
    \frametitle{C}
    Lorem ipsum, dolor sit amet!
\end{frame}

\end{document}

@septatrix
Author

I opened a ticket on the Firefox Bugzilla regarding the problem and briefly outlined the current problem, the steps which are necessary, and the behaviour of competitor products.
