Not identifying file of the type text/html if file does not contain the "<html>" tag. #19

Closed
jg0000 opened this issue Dec 31, 2015 · 3 comments


jg0000 commented Dec 31, 2015

lsb_release -a
LSB Version: core-2.0-amd64:core-2.0-noarch:core-3.0-amd64:core-3.0-noarch:core-3.1-amd64:core-3.1-noarch:core-3.2-amd64:core-3.2-noarch:core-4.0-amd64:core-4.0-noarch:core-4.1-amd64:core-4.1-noarch:security-4.0-amd64:security-4.0-noarch:security-4.1-amd64:security-4.1-noarch
Distributor ID: Ubuntu
Description: Ubuntu 14.04.3 LTS
Release: 14.04
Codename: trusty

aptitude show libfile-mimeinfo-perl
Package: libfile-mimeinfo-perl
State: installed
Automatically installed: no
Version: 0.22-1
Priority: optional
Section: perl
Maintainer: Ubuntu Developers <ubuntu-devel-discuss@lists.ubuntu.com>
Architecture: all
Uncompressed Size: 142 k
Depends: perl, libfile-basedir-perl, libfile-desktopentry-perl, shared-mime-info
Description: Perl module to determine file types
File::MimeInfo can be used to determine the mime type of a file. It tries to implement the freedesktop specification for
a shared MIME database.

This package also contains two related utilities:

If a file does not contain the <html> tag, /usr/bin/mimetype identifies the file as text/plain. /usr/bin/file seems to be more intelligent, or more tolerant, in this regard.

Owner

mbeijen commented Dec 31, 2015

Can you please provide an example of a case that mimetype does NOT handle, but /usr/bin/file does handle correctly?

Author

jg0000 commented Jan 1, 2016

Hi Michiel

Please see attached test file.

My tests:

/usr/bin/mimetype /tmp/mimetestfile

/tmp/mimetestfile: text/plain

file /tmp/mimetestfile

/tmp/mimetestfile: HTML document, ISO-8859 text, with very long lines

Regards,

Jie


<style type="text/CSS">
body {
FONT-FAMILY : Verdana, Arial, Helvetica, sans-serif;
FONT-WEIGHT: none;
TEXT-DECORATION: none;
FONT-SIZE: 11px;
}
</style>

FREE SHIPPING - TODAY ONLY
Hurry! Offer ends tonight! Free Regular Shipping or $5.95 Express - TODAY ONLY
Thanks for reading!
Cable Chick Signature
Cable Chick and the Team

Cable Chick's Latest Blogs

What is USB OTG and What Can It Do?

USB OTG (On-the-Go) is a powerful feature of many Android smartphones and tablets. Learn how to take advantage of it today! Read More

Product Launch - Cat6 Colour Range & New Cat6A 500Mhz cables

Colour code your home and office networks with our new rainbow range of Category 6 cables and travel to the future with CAT6A! Read More

Why does my Amplifier use Negative dB for Volume?

Have you ever wondered why your home theatre receiver shows volume as a negative number? Wonder no more! Read More

Flat Rate Shipping on Regular and Express Services

Please read the full terms and conditions available on our website.
Savings based on RRP. Prices and Specifications subject to change without notice.
Promotion valid until 11:59pm AEST Friday January 1st 2016 or until sold out. Sorry, no rainchecks.

Cable Chick Website    Facebook      Twitter
Cable Chick Accepts: American Express, Visa, Mastercard, Paypal and more

Product and Gift stock is limited and may sell out at any time. Prices and Prizes are subject to change.
© 2006-2016 www.CableChick.com.au. All rights reserved.

This message was sent to the following email address: j.gao@sydney.edu.au
We hope you find this message useful, however if you would rather not receive any more
Cable Chick Newsletters, please click here to unsubscribe. Or Log in to manage your Subscriptions.

Owner

mbeijen commented Jan 3, 2017

I'm not sure it would be correct to identify a file as HTML when it is named something rather than something.html and merely contains some tags that look like HTML, but not the required <html> tag. It might lead to other sorts of problems. File::MimeInfo does not need to be bug-compatible with /usr/bin/file. Closing.
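To illustrate the trade-off, here is a rough sketch of two content-sniffing heuristics. Both are hypothetical and written only for this example: neither is the actual logic of File::MimeInfo or /usr/bin/file, and the tag list and the 256-byte window are assumptions.

```python
import re

# Hypothetical heuristics for illustration only; not taken from either tool.
STRICT = re.compile(rb"<html[\s>]", re.IGNORECASE)
TOLERANT = re.compile(rb"<(html|head|body|style|title|script)[\s>]", re.IGNORECASE)


def guess_strict(data: bytes) -> str:
    """Report text/html only if an <html> tag appears in the first 256 bytes."""
    return "text/html" if STRICT.search(data[:256]) else "text/plain"


def guess_tolerant(data: bytes) -> str:
    """Report text/html if any common HTML tag appears in the first 256 bytes."""
    return "text/html" if TOLERANT.search(data[:256]) else "text/plain"


# An HTML fragment without <html>, similar in shape to the attached test file.
fragment = b'<style type="text/CSS">\nbody { FONT-SIZE: 11px; }\n</style>\nFREE SHIPPING'
# A plain-text note that merely mentions HTML tags.
notes = b"To set the page heading, put a <title> element inside <head>."

print(guess_strict(fragment))    # text/plain
print(guess_tolerant(fragment))  # text/html
print(guess_strict(notes))       # text/plain
print(guess_tolerant(notes))     # text/html
```

The tolerant rule accepts the fragment, but it also labels the plain-text note as HTML, which is the kind of side effect a looser match can introduce.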

@mbeijen mbeijen closed this as completed Jan 3, 2017