Mime types of bytea attributes #447

Closed
remys opened this Issue Oct 27, 2017 · 8 comments

remys commented Oct 27, 2017

Hi,
One of my bytea fields contains an Excel file (*.xlsx).
When I view this attribute, Postico reports it as "unknown data type".
When I download the file to my file system (using the "Save as" button), Postico does not suggest a file extension.
Is there a way to add MIME type recognition to Postico?

jakob Oct 27, 2017

Owner

Postico uses the file command to guess the file type, which should detect Excel files. However, for some Excel files it fails to recognize them as Excel and only reports Microsoft OOXML with a MIME type of application/octet-stream. Looking at the magic file at /usr/share/file/magic/msooxml, it looks like the magic might be confused by the order of files in the archive (XLSX files are actually zip archives). I'm not 100% sure though; I need to investigate this in more detail.

We should probably file a bug report with Apple, or with libmagic, or find out where the magic files on macOS come from.

If Apple doesn't fix this, I'll have to look for workarounds. The compiled magic file on 10.12 is 3.7MB, so I don't really want to ship my own with Postico. Maybe I can write my own code to detect MS Office file types more reliably.
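
For anyone who wants to reproduce the misdetection outside Postico, here is a minimal sketch of asking the file command for a MIME type from in-memory data. How Postico actually invokes file is not shown in this thread; the function name `guess_mime_type` is hypothetical, and the sketch assumes a `file` binary on the PATH that accepts `--brief`, `--mime-type`, and `-` for standard input.

```python
import shutil
import subprocess
from typing import Optional


def guess_mime_type(data: bytes) -> Optional[str]:
    """Ask file(1) for the MIME type of in-memory data.

    Returns None when the file command is missing or fails.
    """
    file_cmd = shutil.which("file")
    if file_cmd is None:
        return None
    # "-" makes file read the content from standard input.
    result = subprocess.run(
        [file_cmd, "--brief", "--mime-type", "-"],
        input=data,
        capture_output=True,
        check=False,
    )
    if result.returncode != 0:
        return None
    mime = result.stdout.decode("utf-8", "replace").strip()
    return mime or None
```

Feeding it the bytes of an affected XLSX file should show whether the local file command falls back to application/octet-stream, as described above.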

remys Oct 27, 2017

jakob Oct 27, 2017

Owner

No, that is not correct. The file type should be "Microsoft Excel 2007+", not "Microsoft OOXML". The latter is a generic catch-all type used when it's not possible to tell whether it is a Word, PowerPoint, or Excel file.

What version of macOS are you on?

jakob Oct 27, 2017

Owner

In my testing, this problem appears on macOS 10.12 and 10.13.

remys Oct 27, 2017

jakob Oct 27, 2017

Owner

I've submitted a bug report to Apple.

I've researched the issue a bit, and it looks like the file command and the magic files come from http://www.darwinsys.com/file/, but that website is currently not reachable. There's a mirror of the repo at https://github.com/file/file/blob/origin/magic/Magdir/msooxml

Apparently there is a mailing list for the file command, but all the servers are currently down, so it's hard to find any info on who to send this bug report to.

I'm writing all this down mostly as a note to myself so I remember to follow up on Monday. No action required on your part :)

But I'm curious if the problem shows up with all XLSX files. Which app did you use to create the XLSX file? I used Numbers to create the problematic file.

jakob Nov 2, 2017

Owner

Apple is not going to fix this issue:

Engineering has provided the following information regarding this issue:

The file is a 3rd party open source tool that we ship on the OS. We will not be doing local development of file for this case. If the 3rd party owners of file fix this, we may eventually include their fix. But we will not be doing any development for this issue.

The issue tracker and the mailing list of file are still not reachable.

It would be neat if I could come up with a fix for the msooxml magic file, then I could ship that with Postico and correctly detect the file type. At least in theory. During testing, /usr/bin/file ignored both -m AND -M arguments...

So the only thing left is to write some custom code that detects MS Office XML files.
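
A minimal sketch of what such custom detection could look like, not Postico's actual code: it opens the data as a zip archive and looks at the entry names taken from the archive's central directory, so the order in which the files were written does not matter. It assumes the usual OOXML package layout (a `[Content_Types].xml` entry plus an `xl/`, `word/`, or `ppt/` folder). The name `detect_ooxml` is hypothetical, and macro-enabled or template variants would be reported as the base type.

```python
import io
import zipfile
from typing import Optional, Tuple

# Folder prefix of the main document part -> (extension, MIME type).
OOXML_KINDS = {
    "xl/": ("xlsx", "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"),
    "word/": ("docx", "application/vnd.openxmlformats-officedocument.wordprocessingml.document"),
    "ppt/": ("pptx", "application/vnd.openxmlformats-officedocument.presentationml.presentation"),
}


def detect_ooxml(data: bytes) -> Optional[Tuple[str, str]]:
    """Return (extension, MIME type) for an OOXML package, else None."""
    # Every zip archive with at least one entry starts with this signature.
    if not data.startswith(b"PK\x03\x04"):
        return None
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            names = archive.namelist()
    except zipfile.BadZipFile:
        return None
    # A plain zip without the content-types part is not an OOXML package.
    if "[Content_Types].xml" not in names:
        return None
    for prefix, kind in OOXML_KINDS.items():
        if any(name.startswith(prefix) for name in names):
            return kind
    return None
```

Returning the extension together with the MIME type would also cover the original request of suggesting a file extension when saving.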

postico-bot Nov 3, 2017

This issue was mentioned in a commit message.

Refactored the file type detection for bytea columns #447

Postico now checks if the data is a Microsoft Office document before calling /usr/bin/file
This is necessary because file does not correctly detect msooxml files. (jakob)

Download Build 1968
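
The ordering described in the commit message (try a specific check first, only then fall back to /usr/bin/file) can be sketched as a small detector chain. This illustrates the idea only and is not the refactored Postico code; `detect_type`, `Detector`, and the default return value are assumptions made for the example.

```python
from typing import Callable, Iterable, Optional

# A detector inspects the raw bytes and returns a type name, or None if unsure.
Detector = Callable[[bytes], Optional[str]]


def detect_type(data: bytes, detectors: Iterable[Detector], fallback: Detector) -> str:
    """Run specific detectors in order, then the generic fallback."""
    for detector in detectors:
        found = detector(data)
        if found is not None:
            return found
    # Nothing specific matched: ask the generic detector (for example file).
    return fallback(data) or "application/octet-stream"
```

Keeping the fallback as a parameter means the generic detector stays in place for every type it already handles correctly.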

@jakob jakob closed this Dec 20, 2017
