ImageDataExtractor is a toolkit for the automatic extraction of microscopy images. Check out the publication at: https://pubs.acs.org/doi/abs/10.1021/acs.jcim.9b00734 and the website at: https://www.imagedataextractor.org

ImageDataExtractor

ImageDataExtractor is a toolkit for the automatic extraction of microscopy images.

** This version is DEPRECATED. See the new version here. **

Features

  • Automatic detection and download of microscopy images from scientific articles
  • HTML and XML document format support
  • High-throughput capabilities
  • Direct extraction from image files
  • PNG, GIF, JPEG, TIFF image format support

Installation

Installing with pip is recommended, but ImageDataExtractor can also be installed directly from source. Instructions for both methods are given below.

NOTE: The current version of IDE uses Tesseract 3. The source code can be downloaded here and instructions on how to compile can be found here.

NOTE: It is advised that all installations of ImageDataExtractor are run inside a virtual environment. Click here for more information.
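Because this version depends on Tesseract 3, it can help to confirm which Tesseract is on your PATH before installing. The following is a minimal sketch, not part of ImageDataExtractor: the helper names `parse_tesseract_major` and `installed_tesseract_major` are hypothetical, and it assumes the `tesseract --version` output begins with a line such as `tesseract 3.05.02`.

```python
import re
import shutil
import subprocess


def parse_tesseract_major(version_output):
    """Return the major version found in `tesseract --version` output, or None."""
    match = re.search(r"tesseract\s+v?(\d+)\.", version_output)
    return int(match.group(1)) if match else None


def installed_tesseract_major():
    """Return the major version of the tesseract on PATH, or None if it is missing."""
    if shutil.which("tesseract") is None:
        return None
    result = subprocess.run(
        ["tesseract", "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # capture both streams in case the version goes to stderr
        universal_newlines=True,
    )
    return parse_tesseract_major(result.stdout)
```

Calling `installed_tesseract_major()` should return `3` on a machine set up as described in the note above, and `None` if no `tesseract` executable is found.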

Installing with pip

To install with pip, simply run:

$ pip install ImageDataExtractor

Then download the necessary data files to run ChemDataExtractor-IDE by running:

$ cde data download

Installing from source

Install ChemDataExtractor-IDE

To use ImageDataExtractor, first install ChemDataExtractor-IDE, the bespoke version of ChemDataExtractor.

Clone the repository by running:

$ git clone https://github.com/edbeard/chemdataextractor-ide.git

and install with:

$ python setup.py install

Then download the required machine learning models with:

$ cde data download

See https://github.com/edbeard/chemdataextractor-ide for more details.

Install ImageDataExtractor

To install ImageDataExtractor itself, clone the repository with:

$ git clone https://github.com/ktm2/ImageDataExtractor.git

Then create a wheel file by running:

$ python setup.py bdist_wheel

If this fails, you may first need to run pip install wheel.

Then install using pip:

$ pip install dist/ImageDataExtractor-0.0.1-py3-none-any.whl
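To confirm that the wheel was installed into the active environment, you can query the installed package metadata. This is a small sketch using only the standard library (`importlib.metadata`, available from Python 3.8). The helper name `installed_version` is hypothetical, and the distribution name `ImageDataExtractor` is assumed from the wheel filename above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Distribution name assumed from the wheel filename in the step above.
    found = installed_version("ImageDataExtractor")
    print(found if found is not None else "ImageDataExtractor is not installed")
```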

Running the code

Full documentation on running the code can be found at www.imagedataextractor.org.

Open a Python interpreter and run:

>>> import imagedataextractor as ide

Then run:

>>> ide.extract_document(<path/to/document>)

to automatically identify and extract the images from a document. Full details on supported input and output formats can be found on our website.
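For high-throughput use, the single-document call above can be wrapped in a loop over a folder of HTML and XML articles. The sketch below is illustrative only: `find_documents` and `extract_all` are hypothetical helpers, not part of ImageDataExtractor. The extraction function is passed in as an argument, so the only library call assumed is `ide.extract_document(path)` as shown above.

```python
from pathlib import Path


def find_documents(folder, suffixes=(".html", ".xml")):
    """Return the sorted files in `folder` whose extension is in `suffixes`."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in suffixes
    )


def extract_all(folder, extract):
    """Call `extract(path)` on each document and return {filename: exception}."""
    failures = {}
    for path in find_documents(folder):
        try:
            extract(str(path))
        except Exception as exc:  # record the error and continue with the next file
            failures[path.name] = exc
    return failures
```

With the package imported as `ide`, a run over a folder (here a hypothetical `papers` directory) would look like `failures = extract_all("papers", ide.extract_document)`. Collecting failures rather than raising lets a long batch finish even if individual documents cannot be processed.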