Add to_svg #519

jojolebarjos · 2019-12-24T21:34:18Z

This is another attempt at vector file export. I initially experimented with freetype-py, but I was able to get the same results using Pillow's bindings of FreeType (there are inconsistencies, though).

There are, however, slight mismatches between the rasterized and vector outputs. I also spotted inconsistencies between browsers for some fonts (e.g. Segoe Script). I added an option (embed_image) to include the output of to_image in the SVG file, which is useful for debugging. I believe most differences are due to subtleties in kerning and ligatures, which are not significant in common fonts.
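
To illustrate the general idea behind an option like embed_image, here is a minimal standard-library sketch of placing a raster image underneath vector content in a single SVG file. The helper name `svg_with_embedded_image`, its parameters, and the placeholder bytes are assumptions made for illustration; this is not the code from the pull request.

```python
import base64


def svg_with_embedded_image(width, height, image_bytes, mime="image/png", body=""):
    """Return an SVG document with a raster image drawn underneath `body`.

    Hypothetical helper for illustration only.
    """
    # Encode the raster bytes as a base64 data URI so the SVG stays a single file.
    data = base64.b64encode(image_bytes).decode("ascii")
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">'
        f'<image width="100%" height="100%" href="data:{mime};base64,{data}"/>'
        f"{body}"
        "</svg>"
    )


# Placeholder bytes stand in for a real image produced by a rasterizer.
fake_png = b"\x89PNG\r\n\x1a\n"
svg = svg_with_embedded_image(400, 200, fake_png, body='<text x="10" y="50">hello</text>')
```

Because the image element comes first, the vector text is drawn on top of it, which makes offsets between the two renderings visible in any SVG viewer.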

It may also be useful to consider embedding the font file directly in the SVG file (or only the required subset, ideally). As my personal use case does not require fancy fonts, I did not dig any further on this topic. This means that using the default font will render properly only if Droid Sans Mono is available to the SVG reader.

I have attached two modified examples, which use different fonts for portability.

Also, contour is not implemented.

Last but not least, I want to credit @jnothman and his pull request (in particular, his hack to detect font name/weight/style). As I am not fluent in forks and pull requests, I had to create a new pull request. But I would be glad to learn how to extend an existing pull request!

jnothman · 2019-12-25T00:13:09Z

Paste pics?

jojolebarjos · 2019-12-25T09:29:33Z

Sure, here are rasterized versions of the SVG I attached:

to_image:
to_svg (rasterized with Inkscape):
to_image:
to_svg (rasterized with Inkscape):

I also attach an example that uses Segoe Script, to highlight differences between browsers:

to_image:
to_svg (rasterized with Chrome):
to_svg (rasterized with Internet Explorer):

jnothman · 2019-12-29T00:12:07Z

Your PR has flake8 failures. Please add another commit rectifying these:


[scikit-ci] Executing: flake8
./wordcloud/wordcloud.py:733: [W293] blank line contains whitespace
    
^
./wordcloud/wordcloud.py:802: [W293] blank line contains whitespace
        
^
./wordcloud/wordcloud.py:822: [W293] blank line contains whitespace
            
^
./wordcloud/wordcloud.py:829: [E501] line too long (156 > 120 characters)
            # TODO some browser do not render glyphs the same way (e.g. Segoe Script in Chrome is different, while in Internet Explorer it matches to_image)
                                                                                                                        ^
./wordcloud/wordcloud.py:832: [F841] local variable 'min_y' is assigned to but never used
            min_y = ascent - size_y
            ^

jnothman

Neat!

wordcloud/wordcloud.py

jnothman · 2019-12-29T00:16:37Z

I agree the masking of each word is not perfect, but it's decent...

jojolebarjos · 2019-12-29T22:28:21Z

Thank you for your feedback, I really appreciate it. I have fixed my code according to your suggestions. Indeed, I'm not sure whether we can do anything about these differences between SVG renderers.

jnothman · 2019-12-31T14:03:25Z

This lgtm

jojolebarjos · 2020-01-04T11:51:51Z

I have added an option to embed the required font subset inside the SVG. This solves the missing font problem, for instance when using the default wordcloud font. fonttools is required only if embed_font is true.
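
As a rough illustration of what "the required font subset" means, the sketch below collects the distinct characters used by a list of words. The function name `required_characters` and the sample words are assumptions; the resulting text could then be handed to a font subsetting tool such as fonttools, a step that is omitted here.

```python
def required_characters(words):
    """Collect the distinct characters that a subsetted font would need to cover.

    Hypothetical helper for illustration only.
    """
    # A set removes duplicates; sorting makes the result deterministic.
    return "".join(sorted({char for word in words for char in word}))


words = ["cloud", "word", "svg"]
subset_text = required_characters(words)
```

Only the glyphs for these characters need to be kept, which is why an embedded subset can be much smaller than the full font file.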

amueller · 2020-01-07T21:46:30Z

Hm looks like python34 is failing but looks unrelated. Thanks a lot for the contribution. Good to merge?

jnothman · 2020-01-07T21:59:57Z

would it be worth adding at least a smoke test before merge, if not an image comparison test?

amueller · 2020-01-07T22:00:44Z

Smoke test should be good. So far I have not done any image comparison tests, for better or worse, so I wouldn't ask for that.

jojolebarjos · 2020-01-08T10:11:04Z

I added a small smoke test, which runs to_svg and checks that the output is valid XML.
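
A check of this kind can be sketched with the standard library alone. The helper `is_well_formed_xml` and the hand-written `sample_svg` string are assumptions for illustration; a real test would use the string returned by to_svg, and this is not the test from the pull request.

```python
import xml.etree.ElementTree as ET


def is_well_formed_xml(text):
    """Return True if `text` parses as XML, False otherwise."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# Stand-in for the string returned by to_svg(); a real test would generate it.
sample_svg = (
    '<svg xmlns="http://www.w3.org/2000/svg" width="100" height="50">'
    '<text x="5" y="20">R&amp;D</text>'
    "</svg>"
)
```

Note that a word containing an unescaped "&" would make the document fail to parse, which is the kind of error such a smoke test catches.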

michaelsjackson · 2020-01-11T19:24:15Z

It seems we (I) can not access this new svg export feature from the command line yet? (wordcloud_cli.py). It would be cool if it could be added there as well.

It could work as follows:

--imagefile /path/image.png   would export as png
--imagefile /path/image.svg   would export as svg

Or anything similar which works in the end.
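
One way the proposal above could be sketched is to derive the export format from the file extension. The helper `output_format` is hypothetical and is not part of wordcloud_cli.py; it only shows the dispatch logic.

```python
import os


def output_format(imagefile):
    """Pick an export format from the file extension (hypothetical helper)."""
    extension = os.path.splitext(imagefile)[1].lower()
    # Anything that is not .svg falls back to raster output in this sketch.
    return "svg" if extension == ".svg" else "png"
```

With such a helper, the CLI could write the result of to_svg as text when the format is "svg" and save the result of to_image otherwise.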

amueller · 2020-01-13T20:28:50Z

@michaelsjackson I think that would be cool!

EidrianGM · 2020-03-20T11:37:18Z

Hi, when will the to_svg() method be uploaded to pip?

amueller · 2020-04-07T19:52:30Z

There are some issues with the build right now: #535
wordcloud currently uses a build system written by @jcfr and I'm not sure what the issue is.
If I don't hear from him, I might redo the whole build process, which might be some work. Any help trying to debug this is welcome.

jcfr · 2020-04-07T20:32:54Z

Sorry for the late reply.
I will have a look at #535

amueller · 2020-04-07T20:39:01Z

Thanks!

thaoth58 · 2020-04-09T07:14:57Z

Thanks, guys!
Waiting for this feature in pip

michaelsjackson · 2020-04-25T15:32:45Z

which repository has to_svg, if any?

amueller · 2020-05-02T20:43:09Z

released as 1.7.0 https://pypi.org/project/wordcloud/

KonradHoeffner · 2021-05-17T10:12:53Z

Is SVG export now possible from the command line? If yes, how?

amueller · 2021-07-09T21:18:24Z

@KonradHoeffner looks like it hasn't been added to the CLI yet. PR welcome :)

KonradHoeffner · 2021-07-12T08:49:20Z

@amueller: I tried to create a PR but I'm not a Python expert and couldn't figure out how to develop your tool quickly, as the README doesn't contain a section about it. Maybe you could add a section about that to the README?

Steps tried

fork the project
create a virtual environment with venv
pip install -r requirements.txt
pip install -r requirements-dev.txt

(venv) word_cloud$ python wordcloud/wordcloud_cli.py 

To execute the CLI, instead consider running:

  wordcloud_cli --help

or

  python -m wordcloud --help

It seems to me that it refuses to be called directly and only wants to be run as an installed version? Unfortunately I don't have more time right now, and I'm not versed enough in Python and the project, so I hope someone with more knowledge of both can do that PR.

P.S.: I asked my colleague, who is more experienced in Python, but he also couldn't figure out how the development workflow is supposed to work.

jojolebarjos added 5 commits December 23, 2019 16:17

Add to_svg prototype, using freetype-py to get metrics

2f130bc

Use PIL bindings instead of freetype-py

2030ff9

Handle scale attribute

ad12167

Small fix in offset

28ac67c

Add option to embed image in SVG, useful for debug

a71825e

jnothman reviewed Dec 29, 2019

jojolebarjos added 2 commits December 29, 2019 22:32

Fix documentation and code style

49af1e9

Properly escape words in XML

061ee06

jojolebarjos added 3 commits January 3, 2020 13:42

Add embed_font option for to_svg

4ccece5

Subset font when embedding in SVG

4ed503c

Add option for font optimization

674e0fa

jnothman reviewed Jan 7, 2020

jojolebarjos added 2 commits January 7, 2020 23:36

Update to_svg documentation about Complex Text Layout

289fef5

Add smoke test for to_svg

2c521b9

jnothman approved these changes Jan 8, 2020

amueller merged commit 1fc6868 into amueller:master Jan 8, 2020

This was referenced Jan 8, 2020

[WIP] SVG export #163

Closed

Vector graphics #58

Closed

quick'n'dirty HTML implementation #23

Closed
