Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

svg backend: use openhtmltopdf UserAgentCallback for fetching external resources #444 #486

Merged

Conversation

syjer
Copy link
Contributor

@syjer syjer commented May 16, 2020

Hi @danfickle , this PR is a work in progress for making more coherent the loading of resources in the svg backend (see issue #444)

Basically we inject our own UserAgentCallback which contain all the logic inside a custom DocumentLoader.

There still some open point:

  • currently it will block by default the loading through the UserAgentCallback as allowExternalResources is set false by default. Unfortunately at the moment it's a binary choice. We can totally open or totally close the access to the external world. How can we improve that so we can, for example, only allow some specific protocols (most likely a custom one)? Maybe we can extend the interface UserAgentCallback so he can provide some - customizable by the user - logic for checking the load of external resources ? Or maybe we pass a whitelist of protocol in the svg builder?

  • tests :)

What do you think? :)

@danfickle
Copy link
Owner

Hi @syjer,

The way I've handled this in general is everything is meant to go through any configured uri resolver which can return null if it doesn't want to allow a particular resource. Uri resolver with stream handlers means that the user should have total customisation options to loading resources. The move to disallow external resources in svg by default was because it was not using the uri resolver.

Ideally the svg renderer would use the configured uri resolver and any stream handlers. We may then not need the ability to provide the custom user agent? If you have found where to plugin to batik to intercept resource calls, we could just call into our user agent which is already aware of uri resolver and stream handlers.

What do you think? Thanks again.

@syjer
Copy link
Contributor Author

syjer commented May 18, 2020

hi @danfickle ,

Ideally the svg renderer would use the configured uri resolver and any stream handlers. We may then not need the ability to provide the custom user agent? If you have found where to plugin to batik to intercept resource calls, we could just call into our user agent which is already aware of uri resolver and stream handlers.

This is exactly what this PR is doing :)

The most important line is: https://github.com/danfickle/openhtmltopdf/pull/486/files#diff-ab8a6ba6ef10e59d4db15a47c794c89dR23

. Everything else is for making it possible to inject our user agent inside Batik.

I'll begin to add tests asap.

@danfickle
Copy link
Owner

That's embarrassing! I should have read the code more carefully. Yes, with tests this pr would be awesome. As to the question of whether we should allow external resources by default, albeit through the uri resolver, etc, I think we're stuck with the current situation. If they want external resources, use the other constructor. Otherwise we would be silently allowing external resources. Not everybody reads the change log!

@danfickle danfickle marked this pull request as ready for review May 23, 2020 07:57
@danfickle
Copy link
Owner

Hi @syjer,

Unless you or @rototor have any objections, I plan to do a release tomorrow or Monday. If you have no objections, I'll merge this beforehand and add a couple of quick tests.

Thanks again.

@syjer
Copy link
Contributor Author

syjer commented May 23, 2020

hi @danfickle , no objection on my side. I'll add some tests today :)

@rototor
Copy link
Contributor

rototor commented May 23, 2020

@danfickle I also have no objection.

@syjer
Copy link
Contributor Author

syjer commented May 23, 2020

@danfickle , I've added the possibility in the BatikSVGDrawer constructor to specify a whitelist of protocols.

Now I'll add the tests :)

@syjer
Copy link
Contributor Author

syjer commented May 23, 2020

@danfickle , I've added 3 visual tests:

  • one that check if it will block the loading of external resource (default)
  • one that whitelist the file protocol and check the case of relative file url
  • one with a custom:/ protocol

@syjer syjer force-pushed the 444-batik-use-useragentcallback branch from 004cdbe to 468464c Compare May 23, 2020 13:17
@danfickle
Copy link
Owner

Awesome, you've even used @page sizes to minimise test runtime! Thanks!

@danfickle danfickle merged commit bcba46e into danfickle:open-dev-v1 May 24, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants