Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FIX] tools: properly guess image/svg+xml #68211

Closed
wants to merge 1 commit into from

Conversation

qsm-odoo
Copy link
Contributor

One of the two "magic" library we allow to use to guess to mimetype of
data is sometimes returning "image/svg" instead of the wanted
"image/svg+xml".

@qsm-odoo qsm-odoo self-assigned this Mar 22, 2021
@qsm-odoo qsm-odoo requested review from a team as code owners March 22, 2021 16:37
@robodoo
Copy link
Contributor

robodoo commented Mar 22, 2021

@qsm-odoo
Copy link
Contributor Author

Not sure who to ping for this kind of code review... maybe you @xmo-odoo? What do you think about this quick fix?

odoo/tools/mimetypes.py Outdated Show resolved Hide resolved
@C3POdoo C3POdoo added the RD research & development, internal work label Mar 22, 2021
Copy link
Contributor

@mart-e mart-e left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One of the two library [...] is somtimes [...]

This feels very uncertain :D When? Why? Do you have a bug report to link to?
If a library is returning a wrong result, we should report this upstream, and only if we can not backport it (e.g. fix included in a release with breaking changes or unmaintained), we should monkey patch it.
I can see this kind code hanging around in 6 years with nobody remembering why it was put there in the first place.

Regarding the patch itself, if I am not mistaken, the python-magic package does not identify the mimetype itself, it loads the libmagic library installed on the system.

So

  • isn't the python-magic provided by Debian has the same issue?
  • isn't the issue in one of these system library instead?

@qsm-odoo
Copy link
Contributor Author

This feels very uncertain :D When? Why?

Not uncertain at all... when? When I use a SVG file (= sometimes). Why? I don't know, I did not write the lib, the image/svg mimetype simply is not a valid mimetype that I know of. We could make this patch even in ir.attachment to not allow to save the "image/svg" mimetype and close our eyes on the guess method if we were careless 😉

Do you have a bug report to link to?

No, otherwise I would have 😉 It is just a bug we detected on our own. Why did not we see it before? Because we never guess the mimetype of svg file using those libraries as we mostly guess the mimetype checking the filename, see _compute_mimetype.

isn't the python-magic provided by Debian has the same issue?

@JKE-be is this what you tested yesterday? I remember the conflict between the two python-magic libraries but did not know about them using a random magic package as explained by @mart-e 🤷 Anyways, you know better than me what to do, you found the fix in 5minutes in the first place 😉

@mart-e
Copy link
Contributor

mart-e commented Mar 23, 2021

So if you can reproduce it on your computer, can you check which implementation/version of magic is installed on your computer? Because I tried localy and it worked well for me with

$ pacman -Qo /usr/lib/libmagic.so
/usr/lib/libmagic.so is owned by file 5.39-1
$ file --mime-type /tmp/test.svg
/tmp/test.svg: image/svg+xml

(and same with invoking magic.from_buffer(data, mime=True))

@xmo-odoo
Copy link
Collaborator

xmo-odoo commented Mar 23, 2021

@mart-e searching for "image/svg" libmagic seems to have a fair number of hits, but most are duplicates of mdn/kuma#6125, which the MDN folks apparently didn't bother investigating, they just monkepatched the incorrect mimetype: mdn/kuma#6144

And from the comment on mdn/kuma#6144 it might not even be a reliable misdetection:

The magic.Magic() will, for unknown reasons, sometimes think an SVG image's mime type is image/svg

@mart-e
Copy link
Contributor

mart-e commented Mar 23, 2021

maybe file/file@bc86bfe (changed at file/file@1a08bb5 ) ?

@xmo-odoo
Copy link
Collaborator

xmo-odoo commented Mar 23, 2021

Also wrt "two libraries" it kinda looks like the debian one has been deprecated / removed / ...? starting from buster the package uses the description of "pypi magic" and links to its github instead of the old magic from file(1). Though if the issue is in libmagic I expect that will be buggy either way.

(/cc @d-fence)

@JKE-be
Copy link
Contributor

JKE-be commented Mar 23, 2021

You are right @mart-e It happens in both cases.
Yesterday when I try to reproduce with debian package I install py2 package instead of py3 probably; my bad.

But the bug is really not "uncertain", install debian python3-magic or with pip3 and try to upload any svg and you will see that it returns the wrong 'image/svg' mimetype instead of "image/svg+xml"

Until now, we don't see the error because most of us don't upload svg, or just don't have the "unrequired" magic package.
So we fallback on our own guess_mimetype which one return the right 'image/svg+xml'

Let me know if we need to rewrite the PR to fix both import (debian or pip) or if you have better plan (like remove dependencies to magic?)...
image

@JKE-be
Copy link
Contributor

JKE-be commented Mar 23, 2021

ps: add xml on top of the svg will works too...

<?xml version="1.0" encoding="UTF-8"?>
or/and
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN" "http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">

Maybe it is the simple solution

@JKE-be JKE-be closed this Mar 23, 2021
@JKE-be JKE-be reopened this Mar 23, 2021
@JKE-be
Copy link
Contributor

JKE-be commented Mar 23, 2021

After more discussion, we continue to think that we should never use image/svg and apply something similar to this PR like
KUMA that you linked previously

@JKE-be
Copy link
Contributor

JKE-be commented Mar 23, 2021

Although it is not required, an SVG file may start with an XML declaration, indicating the XML version used and the character encoding. The following explicitly declares the default (version 1.0) XML syntax in a file with a UTF-8 Unicode character set:

The XML declaration is only required if you are using a non-Unicode character encoding (technically, anything other than UTF-8 or UTF-16). We always use UTF-8, so we don’t often use an XML declaration. If used, the declaration should always appear on the first line of the file.

You may also include an SGML DOCTYPE declaration, but it is no longer recommended for SVG. The DOCTYPE points to the document type definition of the SVG file format, and is used by some validators and code editing tools. However, some of these validation tools will not recognize perfectly valid XML content from non-SVG namespaces.

src: from the O'Reilly Media book by Amelia Bellamy-Royds, Kurt Cagle, and Dudley Storey.

@mart-e
Copy link
Contributor

mart-e commented Mar 23, 2021

But the bug is really not "uncertain", install debian python3-magic or with pip3 and try to upload any svg and you will see that it returns the wrong 'image/svg' mimetype instead of "image/svg+xml"

>>> ms.file("/tmp/icon.svg")
'text/plain'

My debian is too old :D

After more discussion, we continue to think that we should never use image/svg

I see your point, especially if we can see that file misidentified it during 3 years according to the commits I gave above.
On the server side we should be good as always using startswith for comparison

odoo/odoo/fields.py

Lines 1997 to 2000 in a7110b6

# Full mimetype detection
if (guess_mimetype(decoded_value).startswith('image/svg') and
not record.env.is_system()):
raise UserError(_("Only admins can upload SVG files."))

But it's not always the case client side

if (response.headers.get('content-type') === 'image/svg+xml') {

Anyway, this patch should also be applied on the Debian version, and with a good comment.

Now, is it bad enough to change this in 12.0?

@JKE-be
Copy link
Contributor

JKE-be commented Mar 23, 2021

Now, is it bad enough to change this in 12.0?

Imo yes, since it doesn't work in v12 and we add svg support in v12: https://github.com/odoo/odoo//1be50fdeafcd2b94f7b2f47d6e5bc4b6ebaceadd

We always follow the same policy: fix bug in first version supported where code is still the same and fwd port will not have conflict. I don't see any reason to don't follow it now.

@xmo-odoo
Copy link
Collaborator

@qsm-odoo could you explain the problematic use-case though? As in, how you stumbled upon this issue.

@d-fence
Copy link
Contributor

d-fence commented Mar 23, 2021

Here are my 2 cents:
Only certain type of svg files are impacted. The files that starts with <svg.
Look at the magic file code for svg in version 5.35 which is the Debian Buster version.

Now fixed in master but not yet in Debian Buster (fixed with magic 5.39 in Debian Bullseye)

You can easily test that by creating a file with only <svg at its beginning ... and use file on it.

@JKE-be
Copy link
Contributor

JKE-be commented Mar 23, 2021

Hello @xmo-odoo

Have magic installed on your computer
Install website / create a new one... in master

It will load https://github.com/odoo/odoo/blob/master/addons/website/static/src/img/website_logo.svg?short_path=991f026 as attachment (the default website logo)

which one only contains svg without (optional ?) xml version and doctype.

At this point, you have no error, but the mimetype in db ir.attachment is image/svg

Now go to your website, and you will see a "placeholder broken" image on Chrome, or nothing on Firefox.
Check your network and you see that the svg is loaded with the wrong mimetype ofc.

update in DB or apply patch and create a new website, now that logo have the right mimetype, you can see you logo appears on your website

@JKE-be
Copy link
Contributor

JKE-be commented Mar 23, 2021

@d-fence , yes it was the other solution proposed here #68211 (comment)

But since browser cannot display image/svg, idea was to fix for all future svg too

@d-fence
Copy link
Contributor

d-fence commented Mar 23, 2021

IMHO the present fix makes sense and to answer @mart-e , I think it's needed in 12.0 as the Debian versions were not fixed at the time of 12.0 release ... with a good comment that explains why.

JKE-be added a commit to odoo-dev/odoo that referenced this pull request Mar 23, 2021
Without this 'XML declaration', if you have magic installed, the mime type
is image/svg instead of image/svg+xml and browser don't render it.

This commit is related to odoo#68211 meanwhile a more generic solution (if needed).
@xmo-odoo
Copy link
Collaborator

xmo-odoo commented Mar 23, 2021

Did you mean, change the api of guess_mimetypes in stable ?

Not sure what you're responding to, but the version I'm suggesting doesn't change the API of guess_mimetypes.

@JKE-be
Copy link
Contributor

JKE-be commented Mar 23, 2021

guess_mimetypes take a 3rd params _guesser now. What is considered as one change of api for Odoo no ?
My first version was a partial to avoid this new params, but code became more complexe imo.

@xmo-odoo
Copy link
Collaborator

guess_mimetypes take a 3rd params _guesser now. What is considered as one change of api for Odoo no ?

That's just an optimisation (so the guesser is looked up as a local instead of a global), the way guess_mimetypes is called doesn't change in any way, nor is this parameter supposed to be externally supplied (hence the _ prefix). In fact it probably can't be as the "reference" version (the one if no libmagic is installed) was not changed.

robodoo pushed a commit that referenced this pull request Mar 23, 2021
Without this 'XML declaration', if you have magic installed, the mime type
is image/svg instead of image/svg+xml and browser don't render it.

This commit is related to #68211 meanwhile a more generic solution (if needed).

closes #68239

Signed-off-by: Quentin Smetz (qsm) <qsm@odoo.com>
@C3POdoo C3POdoo requested review from a team September 16, 2021 10:46
@JKE-be JKE-be changed the title [FIX] tools: properly guess image/svg+xml mimetype with first magic lib [FIX] tools: properly guess image/svg+xml Sep 20, 2021
ms = magic.open(magic.MAGIC_MIME_TYPE)
ms.load()
guess_mimetype = lambda bin_data, default=None: ms.buffer(bin_data)
_guesser = ms.buffer
def guess_mimetypes(bin_data, default=None):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def guess_mimetypes(bin_data, default=None):
def guess_mimetype(bin_data, default=None):

Magic return image/svg instead of image/svg+xml if file starts with <svg
@JKE-be
Copy link
Contributor

JKE-be commented Sep 20, 2021

@robodoo r+

For tracking, a svg with doctype svg and padding is considered as svg by libmagic, but is it not an issue since a browser cannot display it and consider that <?xml should be the first chars from the line.

@robodoo
Copy link
Contributor

robodoo commented Sep 20, 2021

@JKE-be, you may want to rebuild or fix this PR as it has failed CI.

robodoo pushed a commit that referenced this pull request Sep 20, 2021
Magic return image/svg instead of image/svg+xml if file starts with <svg

closes #68211

Signed-off-by: Jérémy Kersten (jke) <jke@openerp.com>
@robodoo robodoo closed this Sep 20, 2021
@robodoo robodoo temporarily deployed to merge September 20, 2021 16:49 Inactive
@fw-bot
Copy link
Contributor

fw-bot commented Sep 24, 2021

This pull request has forward-port PRs awaiting action (not merged or closed): #76876

@fw-bot
Copy link
Contributor

fw-bot commented Sep 25, 2021

This pull request has forward-port PRs awaiting action (not merged or closed): #77188, #77193

2 similar comments
@fw-bot
Copy link
Contributor

fw-bot commented Sep 26, 2021

This pull request has forward-port PRs awaiting action (not merged or closed): #77188, #77193

@fw-bot
Copy link
Contributor

fw-bot commented Sep 27, 2021

This pull request has forward-port PRs awaiting action (not merged or closed): #77188, #77193

@qsm-odoo qsm-odoo deleted the 12.0-fix-magic-lib-svg-qsm branch September 27, 2021 07:32
@fw-bot
Copy link
Contributor

fw-bot commented Sep 28, 2021

This pull request has forward-port PRs awaiting action (not merged or closed): #77188

1 similar comment
@fw-bot
Copy link
Contributor

fw-bot commented Sep 29, 2021

This pull request has forward-port PRs awaiting action (not merged or closed): #77188

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
RD research & development, internal work
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants