Add a way to enumerate fonts used in a SVG before rendering #555

Closed
RReverser opened this issue Nov 2, 2022 · 22 comments

@RReverser

Loading fonts can be a pretty expensive operation, especially on targets like JS/Wasm, where there is no mmap-like feature, and one must load fonts asynchronously before invoking resvg.

To optimise this case, it would be handy if usvg provided a way to get the set of fonts used in the SVG document before rendering it to its final form. That way, resvg-js users, as well as native ones, could load only the fonts that are definitely necessary for rendering the document.

I'm not sure how feasible this is given that usvg downcasts text to shapes as part of its processing, but maybe it could preserve text nodes in some intermediate form to give the user a chance to do this async loading?

@RazrFalcon
Collaborator

I don't think it's possible, for multiple reasons.

First, the reason usvg loads all system fonts is that we have to query them. How would you resolve font-family="Arial" font-weight="bold"? We don't know what font file should be loaded.

Second, text-to-path conversion is part of the "parsing" process at the moment and cannot be decoupled. So you cannot parse an SVG, query fonts and then render the SVG to begin with.
I do plan to make it sort of optional/on-demand. But it will be a huge rewrite.

@RReverser
Author

> How would you resolve font-family="Arial" font-weight="bold"? We don't know what font file should be loaded.

I mean if resvg provides just that data as a raw list, that would be already helpful, as there are various databases providing the necessary mappings.

As a concrete example, for the resvg-js use case that I mentioned above, I could use a service like Google Fonts, which accepts CSS URLs in the form of https://fonts.googleapis.com/css?family=(family):(weight) and also has an extended API that returns direct URLs to font files:

{
   "kind": "webfonts#webfont",
   "family": "Anonymous Pro",
   "variants": [
    "regular",
    "italic",
    "700",
    "700italic"
   ],
   "subsets": [
    "greek",
    "greek-ext",
    "cyrillic-ext",
    "latin-ext",
    "latin",
    "cyrillic"
   ],
   "version": "v3",
   "lastModified": "2012-07-25",
   "files": {
    "regular": "http://themes.googleusercontent.com/static/fonts/anonymouspro/v3/Zhfjj_gat3waL4JSju74E-V_5zh5b-_HiooIRUBwn1A.ttf",
    "italic": "http://themes.googleusercontent.com/static/fonts/anonymouspro/v3/q0u6LFHwttnT_69euiDbWKwIsuKDCXG0NQm7BvAgx-c.ttf",
    "700": "http://themes.googleusercontent.com/static/fonts/anonymouspro/v3/WDf5lZYgdmmKhO8E1AQud--Cz_5MeePnXDAcLNWyBME.ttf",
    "700italic": "http://themes.googleusercontent.com/static/fonts/anonymouspro/v3/_fVr_XGln-cetWSUc-JpfA1LL9bfs7wyIp6F8OC9RxA.ttf"
   }
  },
  ...
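
The CSS URL form quoted above can be assembled from a family name and weight once those are known. This is only a sketch in Rust: the function name `google_fonts_css_url` is made up for illustration, and the encoding (spaces as `+`, everything outside a small safe set percent-encoded) is an assumption that should hold for typical family names.

```rust
/// Builds a URL of the form quoted above:
/// https://fonts.googleapis.com/css?family=(family):(weight)
/// Hypothetical helper; the encoding rules here are an assumption.
pub fn google_fonts_css_url(family: &str, weight: u16) -> String {
    let mut encoded = String::new();
    for byte in family.bytes() {
        match byte {
            // Spaces in family names are written as '+'.
            b' ' => encoded.push('+'),
            // Unreserved URL characters are kept as they are.
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'_' | b'.' | b'~' => {
                encoded.push(byte as char)
            }
            // Everything else is percent-encoded byte by byte.
            other => encoded.push_str(&format!("%{:02X}", other)),
        }
    }
    format!("https://fonts.googleapis.com/css?family={}:{}", encoded, weight)
}
```

For example, `google_fonts_css_url("Anonymous Pro", 700)` yields `https://fonts.googleapis.com/css?family=Anonymous+Pro:700`.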

Or, in Chrome, I could use the new Local Font Access API, which allows access to any system-wide font:

await (await queryLocalFonts()).find(font => font.family === 'Arial' && font.style === 'Bold').blob()
Blob {size: 980756, type: 'application/octet-stream'}

> Second, text-to-path conversion is part of the "parsing" process at the moment and cannot be decoupled.

Yeah, that's the issue I mentioned in my last paragraph, but thought I'd submit a feature request anyway, as I think this could be a valuable optimisation.

@RReverser
Author

As a non-JS use case, I should probably mention that some other systems have ways of cheaply querying fonts without loading all of them, too.

E.g. on Windows you can look up a font's filename by its family + weight in the registry, which should also be relatively easy to do from Rust:

[image]

@RazrFalcon
Collaborator

> I mean if resvg provides just that data as a raw list, that would be already helpful, as there are various databases providing the necessary mappings.

This is sort of planned but would take forever, since it would require a substantial rewrite.

> As a non-JS use case, I should probably mention that some other systems have ways of cheaply querying fonts without loading all of them, too.

They don't. I've tried. That's why resvg loads them manually. Not to mention that implementing and supporting at least 3 OS APIs is far from trivial.

As for the Windows registry - this is not the info I need. You're underestimating how much information about a font SVG needs. Parsing the fonts manually is the only way.

Also, why do you think that font loading is slow? fontdb initialization with a hot disk cache takes like 10ms on my mac. Not instant, but still 10x faster than the rendering step itself.

@RReverser
Author

RReverser commented Nov 2, 2022

> Also, why do you think that font loading is slow?

As I said, for me the primary use case is resvg-js:

> especially on targets like JS/Wasm, where there is no mmap-like feature, and one must load fonts asynchronously before invoking resvg

Any fonts for rendering need to be literally downloaded from the internet, so you're stuck with two choices: either bundle a very limited set of fonts and hope that any given SVG doesn't rely on any advanced fonts, or download a lot of fonts at runtime - easily in the hundred-MB range - which is obviously impractical.

That's where knowing the set of required fonts in advance would help immensely, as it would allow lazy-loading only what's strictly necessary.

@RReverser
Author

> This is sort of planned but would take forever, since it would require a substantial rewrite.

I wonder if, for a start, it could be easier to just make font access accept a custom trait instead of specifically fontdb? Then a user like me could at least do a 2-pass parse via usvg: a first pass just to collect the list of fonts via a custom trait implementation that only stores the font names but reports "font absent" for any query, then an asynchronous download of all the fonts, and then a 2nd pass via usvg+resvg with all the fonts now ready to use.

This would still be a bit inefficient, but it's probably a relatively simple change to the codebase, and definitely cheaper than downloading an arbitrary set of fonts off the internet before it's even known which ones will be required.
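
To make the first pass of that idea concrete, here is a rough sketch in plain Rust. Nothing in it is usvg API: the `FontRequest` type, the `FontSource` trait and `RecordingSource` are all hypothetical names invented for illustration, and a real interface would need to carry more than family, weight and italic.

```rust
use std::cell::RefCell;
use std::collections::BTreeSet;

/// Hypothetical description of a font query; not a usvg type.
#[derive(Clone, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub struct FontRequest {
    pub family: String,
    pub weight: u16,
    pub italic: bool,
}

/// Hypothetical stand-in for the "custom trait" idea; usvg has no such trait.
pub trait FontSource {
    /// Returns the raw font data for a request, or None if it is unavailable.
    fn query(&self, request: &FontRequest) -> Option<Vec<u8>>;
}

/// First-pass source: remembers every request and always reports "font absent".
#[derive(Default)]
pub struct RecordingSource {
    seen: RefCell<BTreeSet<FontRequest>>,
}

impl RecordingSource {
    /// The deduplicated, sorted list of everything that was asked for.
    pub fn requested(&self) -> Vec<FontRequest> {
        self.seen.borrow().iter().cloned().collect()
    }
}

impl FontSource for RecordingSource {
    fn query(&self, request: &FontRequest) -> Option<Vec<u8>> {
        self.seen.borrow_mut().insert(request.clone());
        None
    }
}
```

After the first pass, `requested()` would give the list to download before running the second pass with a source that actually returns font data.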

@RazrFalcon
Collaborator

Once again, you're underestimating the complexity. Sure, we can probably replace the fontdb with a trait (C API would be a nightmare... or whatever you use from JS), but only for the query part.
What about the font fallback? In the simplest case we basically have to find a font that provides a glyph for a specific Unicode character. How would you handle it? Simply ignore it?
What about ownership? Who owns the font? Definitely not resvg. But then we have to have a separate callback to get the font's data so we can pass it to ttf-parser and rustybuzz.
What about caching? We don't have any at the moment, but ideally we should cache glyph outlines. To do so, we have to identify unique fonts. How should we do this? I don't know.
What about embedded fonts? Should we store them in, I don't know, an internal font database?
And so on and so on.
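
To illustrate just the "simplest case" of fallback mentioned above, here is a sketch that picks the first font covering a given character. It is not how resvg does it: `FontCoverage` and `fallback_for` are hypothetical names, and the coverage ranges are assumed to have been extracted from already-loaded fonts, which is exactly the data a names-only first pass would not have.

```rust
use std::ops::RangeInclusive;

/// Hypothetical summary of a loaded font: its name and the code point ranges it covers.
pub struct FontCoverage {
    pub name: String,
    pub ranges: Vec<RangeInclusive<u32>>,
}

impl FontCoverage {
    /// True if any of the font's ranges contains the character's code point.
    fn covers(&self, c: char) -> bool {
        self.ranges.iter().any(|range| range.contains(&(c as u32)))
    }
}

/// Returns the first font, in priority order, that has a glyph for `c`.
pub fn fallback_for(c: char, fonts: &[FontCoverage]) -> Option<&FontCoverage> {
    fonts.iter().find(|font| font.covers(c))
}
```

A `None` result is the open question from the comment: there is no font to fall back to, and the caller has to decide whether to skip the character.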

SVG Text is a nightmare in itself. Making it even worse by providing a generic API for font handling isn't worth it.

Sure, if resvg/usvg had been designed with this use case in mind from the ground up, it would be easier. But to implement it right now we would basically have to rewrite everything.

Yes, font loading is a bit expensive. Yes, I know about this bottleneck. No, I don't have a viable solution to this problem.


A bit off topic, but why do you use resvg in WASM/JS to begin with? A browser already provides everything you need. I just don't understand the use case. I get that resvg is probably the only option available (afaik), but wouldn't JS + Canvas library be a way better option?
I'm not a web dev, but using resvg seems like overkill to me. Not to mention the WASM performance.

@RReverser
Author

> I get that resvg is probably the only option available (afaik), but wouldn't JS + Canvas library be a way better option?

Yeah, canvas would make this a lot easier, but unfortunately there are tons of environments these days that support JS/Wasm but don't have either DOM or native code support - Cloudflare Workers, Deno, Stackblitz, and more. There, using Wasm is the perfect option, but it does mean patching over the missing bits. Can't go into details of the specific use case yet, but it does have to do with rendering SVG in such environments.

I agree there are tons of edge cases, and obviously you've thought about those problems a lot longer than I have - I literally only ran into this a few days ago and explored a few other options before submitting a feature request - I'm just trying to come up with an alternative that wouldn't require a complete rewrite / refactoring, hence the suggestion above.

@RReverser
Author

> I literally only ran into this a few days ago and explored a few other options before submitting a feature request

Fun fact, resvg itself is already an alternative option after looking into canvaskit-wasm (which, turns out, doesn't have SVG) & generally Skia (which does have some SVG support, but is in an even worse shape in terms of support & API).

Oh well, this is looking increasingly less optimistic. I'll try to think of other possible options, or maybe try to tinker with the trait idea in a local checkout for now. I still suspect that, even ignoring the edge cases you described, it could already cover 80+% of use cases.

@RazrFalcon
Collaborator

> it could already cover 80+% of use cases

Well, if you read the readme - resvg is all about those 100%.

The only solution I can think of is grep-ing SVG by hand and populating fontdb accordingly. This will give you "80%".
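
A sketch of what that grep-by-hand approach could look like in Rust, using only the standard library. The function name `grep_font_families` is made up, and the scanner is deliberately naive: it looks for `font-family` attributes and `font-family:` declarations in the raw text, and it assumes no XML entities, no comments containing the keyword, and double-quoted `style` attributes.

```rust
use std::collections::BTreeSet;

/// Naively collects font family names from raw SVG text, without parsing XML.
/// Handles `font-family="A, B"` attributes and `font-family: A, B;` declarations.
pub fn grep_font_families(svg: &str) -> BTreeSet<String> {
    const KEY: &str = "font-family";
    let mut families = BTreeSet::new();
    let mut rest = svg;
    while let Some(pos) = rest.find(KEY) {
        // Always advance past the keyword so the loop terminates.
        rest = &rest[pos + KEY.len()..];
        let after_key = rest.trim_start();
        // The attribute form uses '=', the CSS declaration form uses ':'.
        let (is_attr, value) = match after_key.chars().next() {
            Some('=') => (true, after_key[1..].trim_start()),
            Some(':') => (false, &after_key[1..]),
            _ => continue,
        };
        let raw = if is_attr {
            // The attribute value is wrapped in matching single or double quotes.
            match value.chars().next() {
                Some(q @ ('"' | '\'')) => value[1..].split(q).next().unwrap_or(""),
                _ => continue,
            }
        } else {
            // The declaration ends at ';', at the closing attribute quote, or at '}'.
            value
                .split(|c: char| c == ';' || c == '"' || c == '}')
                .next()
                .unwrap_or("")
        };
        // A value is a comma-separated list; individual names may be quoted.
        for name in raw.split(',') {
            let name = name.trim().trim_matches(|c: char| c == '"' || c == '\'');
            if !name.is_empty() {
                families.insert(name.to_string());
            }
        }
    }
    families
}
```

This only yields family names. Weight, style and any fallback fonts still have to be guessed, which is why it is the "80%" option.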

The only other solution is to convert text to paths on demand, i.e. introducing a usvg::Text type and a usvg::Text::outline method. But again, this would require a rewrite.

And this would still be a mess, because fonts these days store more than just outlines. That's why usvg requires a rewrite anyway to support vector and raster emojis.

Text is a never-ending hell.

@RReverser
Author

RReverser commented Nov 3, 2022

> Well, if you read the readme - resvg is all about those 100%.

To be fair, while I appreciate it's a worthy goal, right now in the JS scenario I'm describing, if the font is not preloaded then SVG text won't be rendered correctly anyway. So even an imperfect list of fonts in advance would already improve compatibility a lot, even if it still misses some edge cases.

> The only solution I can think of is grep-ing SVG by hand and populating fontdb accordingly. This will give you "80%".

Yeah but usvg already has a proper parser, so seems better to piggyback on that rather than just grep.

@RReverser
Author

> Text is a never-ending hell.

I'm sorry 😅

@RazrFalcon
Collaborator

> JS scenario

You're holding it wrong. (c)

It's not resvg's problem that you're trying to use it in a weird way. It was never designed to work in WASM/JS environments. It's a system, desktop library.

@RReverser
Author

And yet it works, and is relatively popular. Your fault for creating a library that works so well in other environments :P (just kidding, obviously)

(in any case, thanks for the help, I'll keep playing with it on my own for now :) )

@RazrFalcon
Collaborator

> And yet it works

Accidentally. Thanks to being pure Rust and Rust's WASM support. But I want to point out again that it wasn't designed to be used like this.
Your use case will be supported eventually. The question is when (not soon).

(you're welcome)

@RReverser
Author

Slightly off-topic, something I came across while looking into lazy font loading: have you looked into Servo's font scanning code?

Too bad it's not a separate crate (at least yet), but they seem to have a pretty robust implementation for font scanning in macOS / Windows / Linux via FreeType, that lazily (?) enumerates font families & variants and can invoke arbitrary callbacks, together with fallback support: https://github.com/servo/servo/tree/master/components/gfx/platform, e.g. https://github.com/servo/servo/blob/master/components/gfx/platform/windows/font_list.rs

It's probably a bit too hefty for fontdb, since it says "And since fontdb tries to be small and portable [...]", but if/when usvg has that custom font interface, I imagine users could choose to plug in Servo's implementation if they want lazy-loading.

@RazrFalcon
Collaborator

resvg doesn't use system libraries by design, so this is out of scope.
And providing a generic API instead of relying on fontdb would be way too hard.

@RReverser
Author

> And providing a generic API instead of relying on fontdb would be way too hard.

Wait, I guess I misunderstood your earlier

> Your use case will be supported eventually.

I thought you meant giving the user the ability to preload fonts after parsing but before rendering?

If so, that functionality would already be generic enough to allow someone to use a library like Servo's implementation to find & load the necessary fonts.

@RazrFalcon
Collaborator

> I thought you meant giving the user the ability to preload fonts after parsing but before rendering?

Yes. fontdb isn't going anywhere.

@RReverser
Author

Sure, and I wasn't suggesting it was. I guess we just had a misunderstanding.

@RazrFalcon
Collaborator

You can do it like this now.

    use std::collections::HashSet;

    // Parse the SVG first; text nodes are still present at this point.
    let mut tree = usvg::Tree::from_data(&svg, &opt).unwrap();

    // Collect every font family referenced by any text span.
    let mut fonts = HashSet::new();
    for node in tree.root.descendants() {
        if let usvg::NodeKind::Text(ref text) = *node.borrow() {
            for chunk in &text.chunks {
                for span in &chunk.spans {
                    for family in &span.font.families {
                        fonts.insert(family.to_owned());
                    }
                }
            }
        }
    }
    println!("{:?}", fonts);

    // Load your fonts into fontdb, and then convert text to paths.
    tree.convert_text(&fontdb, opt.keep_named_groups);
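
Before downloading anything for the collected names, it may help to separate concrete families from CSS generic keywords such as `sans-serif`, which name a category and not a downloadable font. A small sketch, assuming the families are plain strings as in the snippet above; the helper `split_families` and the choice to treat exactly these five keywords as generic are illustrative assumptions, not part of usvg.

```rust
use std::collections::HashSet;

/// CSS generic family keywords; these name a category, not a concrete font.
const GENERIC_FAMILIES: [&str; 5] = ["serif", "sans-serif", "cursive", "fantasy", "monospace"];

/// Splits collected family names into (concrete fonts to fetch, generic keywords).
/// Both lists are sorted so the result does not depend on HashSet ordering.
pub fn split_families(families: &HashSet<String>) -> (Vec<String>, Vec<String>) {
    let (mut generic, mut concrete): (Vec<String>, Vec<String>) = families
        .iter()
        .cloned()
        .partition(|name| GENERIC_FAMILIES.contains(&name.to_ascii_lowercase().as_str()));
    generic.sort();
    concrete.sort();
    (concrete, generic)
}
```

Only the concrete names need a lookup in a font service; what the generic keywords map to is decided by whatever defaults the font database is configured with.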

@RReverser
Author

Looks promising, thanks!
