README

ImagePlag is an adaptive, scalable, and extensible image-based plagiarism detection system suitable for analyzing a wide range of image similarities. The system integrates established image analysis methods, such as perceptual hashing, with newly developed similarity assessments for images, such as ratio hashing for bar charts, and position-aware OCR text matching for figures that contain little text.
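
As a rough illustration of the perceptual hashing mentioned above, the following sketch computes an average hash over a tiny grayscale grid and compares hashes by Hamming distance. It is a generic textbook variant and not ImagePlag's implementation; the function names and the 2x2 sample grids are made up for this example.

```python
def average_hash(pixels):
    """Return a bit string: 1 where a pixel is brighter than the mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / float(len(flat))
    return "".join("1" if value > mean else "0" for value in flat)


def hamming_distance(hash_a, hash_b):
    """Count the positions at which two equal-length hashes differ."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return sum(a != b for a, b in zip(hash_a, hash_b))


# Hypothetical sample data: 2x2 grayscale grids.
original = [[10, 200], [220, 15]]
brighter = [[30, 230], [250, 40]]   # same pattern, uniformly brighter
inverted = [[200, 10], [15, 220]]   # opposite pattern

print(hamming_distance(average_hash(original), average_hash(brighter)))
print(hamming_distance(average_hash(original), average_hash(inverted)))
```

A small distance suggests visually similar images; a large distance suggests different ones. Real perceptual hashes work on downscaled images of, for example, 8x8 pixels.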

ImagePlag extracts images from PDFs, stores their feature descriptors in a SQLite database, and compares these descriptors with those of an input PDF or input images.
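
The store-and-compare idea can be sketched with Python's built-in sqlite3 module. The table layout, the hex-encoded hash descriptors, and all function names below are assumptions for illustration only; ImagePlag's actual schema and descriptors may look quite different.

```python
import sqlite3


def hamming_distance(hex_a, hex_b):
    """Number of differing bits between two hex-encoded hashes."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


def create_store():
    """Open an in-memory database with a hypothetical descriptor table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE descriptors (image_name TEXT PRIMARY KEY, phash TEXT NOT NULL)"
    )
    return conn


def add_descriptor(conn, image_name, phash):
    """Store one descriptor for one image."""
    conn.execute("INSERT INTO descriptors VALUES (?, ?)", (image_name, phash))


def find_similar(conn, query_hash, max_distance):
    """Return (name, distance) pairs within max_distance, closest first."""
    rows = conn.execute("SELECT image_name, phash FROM descriptors").fetchall()
    scored = [(name, hamming_distance(query_hash, stored)) for name, stored in rows]
    close = [pair for pair in scored if pair[1] <= max_distance]
    return sorted(close, key=lambda pair: pair[1])


# Hypothetical image names and hash values.
conn = create_store()
add_descriptor(conn, "paper1_fig1.png", "ffee00aa")
add_descriptor(conn, "paper2_fig3.png", "0011ff55")
matches = find_similar(conn, "ffee00ab", max_distance=4)
print(matches)
```

The sketch scans every stored descriptor, which is fine for small collections; a larger database would need some form of indexing.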

You can find the full documentation of the system at: docs/_build/html/index.html

Installation

ImagePlag runs on Linux systems and has been tested on Ubuntu 14.04.5 LTS.

To install the system, run the following command from the directory that holds the ImagePlag sources (the one containing setup.py). Do not use sudo.

$ pip install --user .

Make sure that ~/.local/bin is in your PATH. For a temporary change that lasts for the current shell session:

$ export PATH=$PATH:~/.local/bin

For a permanent change, add ~/.local/bin to /etc/environment, then reboot.
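
To confirm that the PATH change took effect, a small check like the following may help. The helper name is_on_path is made up for this sketch.

```python
import os


def is_on_path(directory, path_value):
    """Return True if directory is one of the entries in a PATH-style string."""
    target = os.path.normpath(os.path.expanduser(directory))
    entries = [
        os.path.normpath(os.path.expanduser(entry))
        for entry in path_value.split(os.pathsep)
        if entry
    ]
    return target in entries


if __name__ == "__main__":
    current = os.environ.get("PATH", "")
    if is_on_path("~/.local/bin", current):
        print("~/.local/bin is on PATH")
    else:
        print("~/.local/bin is missing from PATH")
```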

To use the OCR functionality, tesseract must be installed; it is used by pytesser.

To use the PDF Module, poppler-utils and ImageMagick have to be installed. They will be called from the command line.
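
Because these tools are called from the command line, it can be useful to check that they are on the PATH before running ImagePlag. The sketch below requires Python 3.3 or later for shutil.which. The executable names are assumptions (tesseract from tesseract-ocr, pdfimages from poppler-utils, convert from ImageMagick); the binaries ImagePlag actually invokes may differ.

```python
import shutil

# Assumed executable names; adjust if ImagePlag calls different binaries.
REQUIRED_TOOLS = ["tesseract", "pdfimages", "convert"]


def find_missing_tools(tools, which=shutil.which):
    """Return the tools that the lookup function cannot find on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools(REQUIRED_TOOLS)
    if missing:
        print("Missing: " + ", ".join(missing))
    else:
        print("All external tools found")
```

The which parameter exists only so that the lookup can be replaced in tests.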

GPU support is available for NVIDIA GPUs.

Some dependencies might need to be installed manually.

Example Installation

This example is based on a clean server environment and the use of python-virtualenv.

Ubuntu 16.04.1 LTS

1. Install system requirements

$ sudo apt install python-pip
$ sudo apt install python-virtualenv
$ sudo apt install tesseract-ocr
$ sudo apt install poppler-utils

2. Set up a virtualenv

a. Create the virtual environment:

$ virtualenv imageplag/

b. Activate the virtual environment and install the Python dependencies:

$ source imageplag/bin/activate
$ pip install backports.tempfile
$ pip install pillow
$ pip install imagehash
$ pip install opencv-python
$ pip install protobuf
$ pip install pytesseract
$ pip install falcon
$ pip install gunicorn
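
After installing the packages, a quick import check inside the activated virtualenv can reveal anything that failed to install. The import names below are the usual ones for these packages (for example, pillow is imported as PIL and opencv-python as cv2); the helper names are made up for this sketch, which should run under Python 2.7 as well as Python 3.

```python
import importlib

# Usual import names for the packages installed above.
REQUIRED_MODULES = [
    "backports.tempfile", "PIL", "imagehash", "cv2",
    "google.protobuf", "pytesseract", "falcon", "gunicorn",
]


def is_importable(module_name):
    """Return True if importing the module succeeds."""
    try:
        importlib.import_module(module_name)
    except Exception:
        # Any failure during import counts as "not usable" for this check.
        return False
    return True


def find_missing_modules(module_names):
    """Return the module names that could not be imported."""
    return [name for name in module_names if not is_importable(name)]


if __name__ == "__main__":
    missing = find_missing_modules(REQUIRED_MODULES)
    print("Missing: " + ", ".join(missing) if missing else "All modules found")
```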

3. Install caffe

Installing caffe can be tough. We present the general approach that worked for us. Caffe can't be installed in the virtualenv, but should be installed as a system dependency. The same path is later used for calling ImagePlag.

Get caffe and read the installation guide

a. http://caffe.berkeleyvision.org/installation.html
b. https://github.com/BVLC/caffe

Our goal is to later run 'make pycaffe'.

Compile caffe locally

a. Activate the virtual environment.
b. Change to the main directory and install all requirements of caffe:

$ for req in $(cat requirements.txt); do pip install $req; done

Makefile.config changes

a. Uncomment CPU_ONLY := 1
b. There was a renaming problem for hdf5. Change INCLUDE_DIRS to:
INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include /usr/include/hdf5/serial
c. In 'Makefile', also rename 'hdf5' to 'hdf5_serial'.
d. Run all make commands.
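
Changes a and b can also be applied with a short script instead of by hand. The sketch below works on the text of Makefile.config and assumes the file contains a commented "# CPU_ONLY := 1" line and an "INCLUDE_DIRS :=" line, as in the sample string; check your copy of the file before relying on it. Change c in 'Makefile' is not covered, and the function name is made up for this example.

```python
import re

HDF5_INCLUDE = "/usr/include/hdf5/serial"


def patch_makefile_config(text):
    """Apply changes a and b to the contents of Makefile.config."""
    # a. Uncomment the CPU_ONLY switch.
    text = re.sub(r"(?m)^#[ \t]*(CPU_ONLY[ \t]*:=[ \t]*1)[ \t]*$", r"\1", text)

    # b. Append the serial hdf5 include directory if it is not already there.
    def add_hdf5(match):
        line = match.group(0)
        if HDF5_INCLUDE in line:
            return line
        return line.rstrip() + " " + HDF5_INCLUDE

    return re.sub(r"(?m)^INCLUDE_DIRS[ \t]*:=.*$", add_hdf5, text)


# Sample input that mimics the two relevant lines (an assumption).
sample = (
    "# CPU_ONLY := 1\n"
    "INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include\n"
)
print(patch_makefile_config(sample))
```

Running the function twice leaves the text unchanged, so it is safe to re-apply.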

We had several issues compiling everything; the changes listed above describe how we resolved them.

4. Set the python path

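Add the python directory of your caffe checkout to PYTHONPATH. The path below is from our setup; adjust it to yours:

$ export PYTHONPATH=/home/vincent/IdeaProjects/caffe/caffe/python: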
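
To check that PYTHONPATH really points at a pycaffe build, one option is to look for the caffe package directory in each entry. The helper names are made up for this sketch, and it only tests for the presence of caffe/__init__.py, not for a working build.

```python
import os


def looks_like_pycaffe_dir(path):
    """Return True if path contains a 'caffe' package directory."""
    return os.path.isfile(os.path.join(path, "caffe", "__init__.py"))


def pycaffe_entries(pythonpath):
    """Return the PYTHONPATH entries that appear to contain pycaffe."""
    return [
        entry for entry in pythonpath.split(os.pathsep)
        if entry and looks_like_pycaffe_dir(entry)
    ]


if __name__ == "__main__":
    found = pycaffe_entries(os.environ.get("PYTHONPATH", ""))
    print(found if found else "No PYTHONPATH entry contains a caffe package")
```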

5. Obtain classification models (too large for GitHub)

Extract each of the following directories into '/image_plag/API':

DNN_bar_no_bar
DNN_chart_no_chart
DNN_pure_no_pure

Download files (617 MB)
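
Once the download is extracted, a short check can confirm that all three model directories are in place. The directory names come from the list above; the function name and the choice to run it from inside the API folder are assumptions of this sketch.

```python
import os

MODEL_DIRS = ["DNN_bar_no_bar", "DNN_chart_no_chart", "DNN_pure_no_pure"]


def missing_model_dirs(api_dir, names=MODEL_DIRS):
    """Return the model directory names that do not exist under api_dir."""
    return [
        name for name in names
        if not os.path.isdir(os.path.join(api_dir, name))
    ]


if __name__ == "__main__":
    # Assumes the script is run from inside the API folder.
    missing = missing_model_dirs(".")
    print("Missing: " + ", ".join(missing) if missing else "All model directories found")
```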

Usage

Use with virtualenv

Use the gunicorn installation in the virtualenv. Additionally, the pythonpath should be set to include the caffe installation. Finally, change to the API folder in the imageplag installation, e.g., '/imageplag/API'.

$ cd API
$ ../bin/gunicorn --pythonpath "/opt/imageplag/caffe/python" -b localhost:5000 app

Use without virtualenv

$ cd <path to app.py>
$ service nginx start
$ gunicorn -b localhost:5000 app

Make sure to use Python 2.7, e.g.:

$ python /usr/lib/python2.7/dist-packages/gunicorn/app/wsgiapp.py -b localhost:5000 app

API

GET /images, response: 200 JSON
GET /images/{name}, response: 200 raw image
POST /images, params: id, body: raw image, response: 201 string
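
A standalone Python 3 client for these endpoints could look like the sketch below, which uses only the standard library. Several details are assumptions: that the server listens on localhost:5000 as in the gunicorn commands above, that id is passed as a query parameter, and that the raw image is sent as application/octet-stream. All function names are made up for this example.

```python
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:5000"  # assumed from the gunicorn bind address


def list_images_request(base_url=BASE_URL):
    """Build the request for GET /images."""
    return urllib.request.Request(base_url + "/images", method="GET")


def get_image_request(name, base_url=BASE_URL):
    """Build the request for GET /images/{name}."""
    quoted = urllib.parse.quote(name, safe="")
    return urllib.request.Request(base_url + "/images/" + quoted, method="GET")


def post_image_request(image_id, image_bytes, base_url=BASE_URL):
    """Build the request for POST /images with the id as a query parameter."""
    query = urllib.parse.urlencode({"id": image_id})
    return urllib.request.Request(
        base_url + "/images?" + query,
        data=image_bytes,
        headers={"Content-Type": "application/octet-stream"},  # assumption
        method="POST",
    )


def send(request):
    """Send a prepared request and return (status code, body bytes)."""
    with urllib.request.urlopen(request) as response:
        return response.status, response.read()
```

With the server running, send(list_images_request()) would be expected to return status 200 and a JSON body, according to the endpoint list above.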

Contributors

Christopher Gondek (gondek.christopher THAT-SIGN gmail.com)

Norman Meuschke (norman.meuschke THAT-SIGN uni.kn)

Vincent Stange (vinc.sohn THAT-SIGN gmail.com)
