file-metadata is a Python package that analyzes files and extracts usable metadata from them.
Before installing file-metadata, a few dependencies need to be installed. For Ubuntu, these can be installed with:
$ sudo apt-get install perl openjdk-7-jre python-dev pkg-config \
    libfreetype6-dev libpng12-dev liblapack-dev libblas-dev gfortran \
    cmake libboost-python-dev libzbar-dev
Next, use pip to install the library. To install the latest stable version, use:
$ pip install file-metadata
To get development builds from the master branch of the GitHub repo, use:
$ pip install --pre file-metadata
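After either install command, it can be worth confirming that Python can actually locate the package. The following is a minimal sketch using only the standard library; `is_importable` is a hypothetical helper name, and `file_metadata` is the module name that the import statements in this README use.

```python
import importlib.util


def is_importable(module_name):
    """Return True if the import system can locate the top-level `module_name`."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # `file_metadata` is the module name used by the imports in this README.
    print("file_metadata importable:", is_importable("file_metadata"))
```

If this reports `False`, the pip install step most likely ran against a different Python environment than the one being used.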
To use the package, you first need a file, which can be any media file.
Let us first download an example QR code from Wikimedia Commons:
$ wget https://upload.wikimedia.org/wikipedia/commons/5/5b/Qrcode_wikipedia.jpg -O qrcode.jpg
And now, let us create a File object from this:
>>> from file_metadata.generic_file import GenericFile
>>> qr = GenericFile.create('qrcode.jpg')
Notice that when creating the file, GenericFile automatically picks the most suitable class to analyze it. In this case, it detects that the file is an image and uses the ImageFile class:
>>> qr.__class__.__name__
'ImageFile'
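To illustrate the idea of a factory method that picks a subclass, here is a minimal sketch. It is not the package's implementation: `SketchGenericFile` and `SketchImageFile` are hypothetical stand-ins, and the sketch guesses the type from the file extension with the standard `mimetypes` module, whereas the real package may inspect the file contents.

```python
import mimetypes


class SketchGenericFile:
    """Hypothetical stand-in for a generic file wrapper (not the library's class)."""

    def __init__(self, path):
        self.path = path

    @classmethod
    def create(cls, path):
        # Assumption: guess the MIME type from the extension only.
        mime, _ = mimetypes.guess_type(path)
        if mime is not None and mime.startswith("image/"):
            return SketchImageFile(path)
        # Fall back to the class that create() was called on.
        return cls(path)


class SketchImageFile(SketchGenericFile):
    """Hypothetical stand-in for an image-specific subclass."""


picked = SketchGenericFile.create("qrcode.jpg")
print(picked.__class__.__name__)
```

The caller only ever deals with `create()`, so support for new file types can be added without changing calling code.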
Now, to find the analysis routines supported for the file, check help(qr). All routines whose names begin with analyze_ perform analysis.
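As an alternative to reading through help(), the analyze_ naming convention can be used to list the routines programmatically. The sketch below relies only on that convention; `list_analysis_routines` and `DemoFile` are hypothetical names, and `DemoFile` exists only so the snippet runs without the package installed.

```python
def list_analysis_routines(obj):
    """Return the sorted names of callable attributes that start with 'analyze_'."""
    return sorted(
        name
        for name in dir(obj)
        if name.startswith("analyze_") and callable(getattr(obj, name))
    )


class DemoFile:
    """Hypothetical stand-in object with two analysis routines and one helper."""

    def analyze_mimetype(self):
        return {"File:MIMEType": "image/jpeg"}

    def analyze_barcode_zxing(self):
        return {"zxing:Barcodes": []}

    def helper(self):
        return None


print(list_analysis_routines(DemoFile()))
```

With the package installed, passing the `qr` object instead of `DemoFile()` should list the routines available for that file type.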
As the example we have is a QR code, let us use analyze_barcode_zxing():
>>> qr.analyze_barcode_zxing()
{'zxing:Barcodes': [{'data': 'http://www.wikipedia.com',
                     'format': 'QR_CODE',
                     'points': [(50.0, 316.0), (50.0, 52.0), (314.0, 52.0), (278.0, 280.0)],
                     'raw_data': 'http://www.wikipedia.com'}]}
This tells us the bounding box of the barcode (points) and the data it encodes (http://www.wikipedia.com). It also reports that the format of the barcode is QR_CODE.
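The returned dictionary can be post-processed with plain Python. The following sketch uses the result shown in this README as literal data, so it runs without the package; `summarize_barcodes` is a hypothetical helper, and the axis-aligned box it computes from `points` is an illustration, not something the library returns.

```python
# Result copied from the analyze_barcode_zxing() output shown in this README.
result = {
    "zxing:Barcodes": [
        {
            "data": "http://www.wikipedia.com",
            "format": "QR_CODE",
            "points": [(50.0, 316.0), (50.0, 52.0), (314.0, 52.0), (278.0, 280.0)],
            "raw_data": "http://www.wikipedia.com",
        }
    ]
}


def summarize_barcodes(result):
    """Return a (format, data, box) tuple for each barcode in `result`.

    `box` is (min_x, min_y, max_x, max_y), the axis-aligned box that
    encloses all reported points.
    """
    summaries = []
    for barcode in result.get("zxing:Barcodes", []):
        xs = [x for x, _ in barcode["points"]]
        ys = [y for _, y in barcode["points"]]
        box = (min(xs), min(ys), max(xs), max(ys))
        summaries.append((barcode["format"], barcode["data"], box))
    return summaries


for fmt, data, box in summarize_barcodes(result):
    print(fmt, data, box)
```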
Similarly, to check the MIME type, the analysis routine analyze_mimetype() can be used:
>>> qr.analyze_mimetype()
{'File:MIMEType': 'image/jpeg'}
To run all the analysis routines on the image, use the analyze() method. It runs every analysis routine on the file and returns the merged result:
>>> qr.analyze()
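To make "merged result" concrete, here is a rough sketch of how such merging could work: call each analyze_ routine and combine the returned dictionaries into one. This is an assumption about the general approach, not the package's actual code; `run_all_analyses` and `DemoImage` are hypothetical names.

```python
def run_all_analyses(obj):
    """Call every zero-argument `analyze_*` method on `obj` and merge the dicts."""
    merged = {}
    for name in sorted(dir(obj)):
        method = getattr(obj, name)
        if name.startswith("analyze_") and callable(method):
            # If two routines return the same key, the later one wins.
            merged.update(method())
    return merged


class DemoImage:
    """Hypothetical stand-in with two routines returning separate keys."""

    def analyze_mimetype(self):
        return {"File:MIMEType": "image/jpeg"}

    def analyze_barcode_zxing(self):
        return {"zxing:Barcodes": [{"format": "QR_CODE"}]}


print(run_all_analyses(DemoImage()))
```

Because each routine in the examples above prefixes its keys (`File:`, `zxing:`), results from different routines are unlikely to collide when merged this way.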
To test the code, install dependencies using:
$ pip install -r test-requirements.txt
and then execute:
$ python -m pytest
To pull the latest Docker image, use:
$ docker pull pywikibotcatfiles/file-metadata
- Supported tags and respective Dockerfile links:
  - latest, ubuntu-14.04 (docker/Dockerfile)
  - ubuntu-16.04 (docker/Dockerfile)
  - centos-7 (docker/Dockerfile)
For more information about this image and its history, please see pywikibotcatfiles/file-metadata (on Docker Hub).
This image is updated via a push to the pywikibot-catfiles/docker-file-metadata GitHub repo or the pywikibot-catfiles/file-metadata GitHub repo (by triggering builds through the Travis CI API).
This package has been derived from pywikibot-compat, specifically from the script catimages.py, which can be found at pywikibot-compat/catimages.py.
These packages were created by DrTrigon, who is the original author of this package.
This code falls under the MIT License. Please note that some files or content may be copied from other places and have their own licenses. Dependencies that are being used to generate the databases also have their own licenses.