Skip to content

opf-attic/ref

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

28 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Vulnerability Warning
=====================
Be aware that this project is affected by the [log4j vulnerability](https://www.ncsc.gov.uk/information/log4j-vulnerability-what-everyone-needs-to-know). Please read and understand the issue before deploying or using this project.

Requirements
============
* php-cli
* php-mysql
* mysql-server
* openjdk-6-jre
* apache2
* libapache2-mod-php5

Installation (Debian/Ubuntu)
============================
apt-get install php5-mysql php5-cli mysql-server libapache2-mod-php5 openjdk-6-jre

Configuration 
==============

# 1 - Create and Configure the Database
--------------------------

cd database_structure
echo "create database opf_ref" | mysql -u root
mysql -u root opf_ref < opf_ref.sql
cd ..

cd bin
# edit settings.php to fill in correct values

cd ..
cd cfg
# edit database_connector.php to match settings.php

# TODO remove database_connector.php in next version


# 2 - Configure the Tools
-------------------------

# Make all the old versions of file (./configure && ./make)
cd tools/file
php make.php

# Configure DROID
cd tools/Droid
./relink.sh

# Enter any Droid 5 directory and run droid once to create your local .droid profile in ~/.droid
cp -r signature_files ~/.droid

# Enter any Droid 6 directory and run droid once to create your local .droid6 profile in ~/.droid6
cp -r signature_files container_sigs ~/.droid6

# WARNING & TODO
# Droid 6.0 appears to not work with the version of db-derby that now ships with ubuntu LTS
# I tried following the instructions from http://db.apache.org/derby/docs/dev/getstart/tgs26250.html however these do not appear to help.

# edit java_paths.php and set them all to the correct locations of things

# 3 - Import the tools
----------------------

cd bin
php import_tool.php absolute_path/tools/file/INFO

SYNOPSIS
========

# CREATE SOME DATA IN THE DATA DIRECTORY (pref in a new dir e.g. mydata1)

php run_scan.php absolute_path/data/mydata1/

php sameas_processor.php

cd /var/www
ln -s $root_path/ref/bin/produce_plot.php index.php

# Open http://your_server/index.php?path=mydata1/PDF_1
# where the path is the think you passed to run_scan relatively.


About The Tools
================

Version numbers are:
 * Major.Minor.Bugfix-Signature_File_Version

In order to add tools you need to download the tool into it's own directory under tools.
At this point you need to create an INFO file detailing the tool (see an existing one).
Additionally, and this is the hard bit, you need to write an OPF-REF.php script which contains the function perform_scan($path,$tool_id), which performs the scan using that tools and processes the results. The best way to write one of these is to adapt an existing example. 

About the Database
==================
The database is purely used as a cache store from which the rdf and linked-data can be built. 
The model works roughly as follows:

Files table
 * 1 entry per file on disk in the data directory

Tools table
 * The tools and data from their info files (may be removed)

Raw_Results table
 * Use it to store the raw_result which the tools outputs corresponding to that file

Results table
 * Stores the refences between the tool, file, results and raw_results ids.

Triples table
 * All the subject/predicate/object triples are stored here e.g.
 ** results/1 file:mime-type msword
   So result id 1 has a mime-type msword.
 * You can put what you like in this table, here is an example of what droid does when it finds multiple matches.
 ** results/2 droid:hasHit hit/1
 ** results/2 droid:hasHit hit/2
 ** hit/1 droid:puid fmt/40
 ** hit/2 droid:puid fmt/41

=======
Explore, have fun, but use the data, don't extend this code too much, there is no point unless you want to generate a whole load more data!