Neo4j powered web application for multimedia collections: bring graph-based exploration and crowd-based indexation.

HG

! wip

HG is the new Histograph, a Node.js/Express application that aims to provide digital humanities specialists with an online collaborative environment. Connections between people, documents and images are stored in a Neo4j graph database.

git clone https://github.com/CVCEeu-dh/histograph.git

installation

Once cloned, install the dependencies from the repository folder:

npm install

Then install mocha globally (it is used to test the settings):

npm install -g mocha

Then copy the settings.example.js file to settings.js:

cp settings.example.js settings.js

Adjust the paths section of your settings.js file, then create the folders and set their permissions accordingly. For instance, with the default settings:

cd histograph
mkdir contents
mkdir contents/media    
mkdir contents/txt
mkdir contents/cache
mkdir contents/cache/disambiguation
mkdir contents/cache/dbpedia
mkdir contents/cache/queries
mkdir contents/cache/services
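
The same folder tree can also be created from Node, which may be convenient on systems without a Unix shell. This is a minimal sketch, assuming the default paths listed above and Node 10.12 or later (for the recursive option); ensureContentDirs is a hypothetical helper and not part of histograph.

```javascript
// Sketch: create the default content folders (same tree as the mkdir commands above).
const fs = require('fs');
const path = require('path');

// Folder names taken from the default settings; parents are created implicitly.
const CONTENT_DIRS = [
  'contents/media',
  'contents/txt',
  'contents/cache/disambiguation',
  'contents/cache/dbpedia',
  'contents/cache/queries',
  'contents/cache/services'
];

// Hypothetical helper: creates every folder under `root` and returns the full paths.
function ensureContentDirs(root) {
  return CONTENT_DIRS.map(function (dir) {
    const target = path.join(root, dir);
    // recursive: true creates missing parents and does not fail if the folder exists
    fs.mkdirSync(target, { recursive: true });
    return target;
  });
}
```

Calling ensureContentDirs(process.cwd()) from the histograph folder should give the same result as the commands above, and running it twice is harmless.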

Histograph uses Passport (Node.js) social authentication as its authentication method. It has been tested with Twitter and Google+. Obtain the tokens and credentials for those services, then fill in the twitter and google sections of the settings.js file accordingly.

Install Neo4j (v2.3.*) and configure database indexing by adding the auto-indexing options to the conf/neo4j.properties file:

# Autoindexing

# Enable auto-indexing for nodes, default is false
node_auto_indexing=true

# The node property keys to be auto-indexed, if enabled
node_keys_indexable=full_search,name_search

Complete the Neo4j installation by setting, in conf/neo4j-server.properties, the location on your system where the Neo4j data will be stored:

# location of the database directory
org.neo4j.server.database.location=data/graph.db

Still in conf/neo4j-server.properties, temporarily enable access to the Neo4j browser (remember to comment the line out again later):

# Let the webserver only listen on the specified IP. Default is localhost (only
# accept local connections). Uncomment to allow any connection. Please see the
# security section in the neo4j manual before modifying this.
org.neo4j.server.webserver.address=0.0.0.0

Start the Neo4j server, e.g. from a Unix terminal:

~/tools/neo4j-community-2.3.2/bin/neo4j start

Once it has started, change the default password and fill in the neo4j.host section of settings.js:

neo4j : { // > 2.2
    host : {
        server: 'http://localhost:7474',
        user: 'neo4j',
        pass: '*************'
    }
},
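
To see how these three values are used, the sketch below builds (but does not send) a request for the transactional Cypher endpoint of the Neo4j 2.x REST API, using HTTP basic authentication. It is only an illustration: buildCypherRequest is a hypothetical helper, and histograph itself may talk to Neo4j through a different client.

```javascript
// Sketch: turn the neo4j.host settings into an HTTP request description.
// Nothing is sent over the network here.

// Hypothetical helper, not part of histograph.
function buildCypherRequest(host, statement) {
  // Basic auth: base64 of "user:pass"
  const credentials = Buffer.from(host.user + ':' + host.pass).toString('base64');
  return {
    // Transactional endpoint of the Neo4j 2.x REST API
    url: new URL('/db/data/transaction/commit', host.server).toString(),
    method: 'POST',
    headers: {
      'Authorization': 'Basic ' + credentials,
      'Content-Type': 'application/json',
      'Accept': 'application/json; charset=UTF-8'
    },
    body: JSON.stringify({ statements: [{ statement: statement }] })
  };
}
```

Sending the resulting request with any HTTP client and a trivial statement such as RETURN 1 is a quick way to confirm that the server address and the new password are correct.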

Then run the setup script, which adds the required constraints to the Neo4j database:

node scripts/manage.js --task=setup

Adjust the Neo4j-related configuration in your histograph settings.js file, then run the settings unit tests (using mocha) to check that everything has been set properly:

npm run-script test-settings
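
As a rough idea of what a settings check can look like, the sketch below reports which required keys are missing or empty. It is an assumption-laden illustration: missingSettings is a hypothetical helper, it only covers the neo4j.host keys shown in this README, and the project's own test-settings suite may check different or additional things.

```javascript
// Sketch: report required settings that are missing or empty.
// Only keys shown in this README are listed; the real tests may check more.
const REQUIRED_KEYS = ['neo4j.host.server', 'neo4j.host.user', 'neo4j.host.pass'];

// Hypothetical helper, not the project's test-settings script.
function missingSettings(settings) {
  return REQUIRED_KEYS.filter(function (dotted) {
    // Walk the dotted path, stopping safely at the first missing level
    const value = dotted.split('.').reduce(function (node, key) {
      return node == null ? undefined : node[key];
    }, settings);
    return typeof value !== 'string' || value.length === 0;
  });
}
```

An empty array means all listed keys are present; otherwise the array names the keys that still need a value.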

If everything is OK, run the full unit test suite:

npm test

Finally, start the application:

npm start

import data: manage.js script

Once histograph has been installed, documents and links can be loaded from a JSON graph file via the import script by running:

node scripts/manage.js --task=import.fromJSON --src=/your/data/**/*.json 

For detailed instructions on the import and annotation process, see the related wiki page.

Named Entity Recognition

Histograph can enrich resources using different web services that extract and disambiguate the named entities they contain. Among them, we use the AIDA web service, developed by the Max Planck Institute. AIDA entity extraction is enabled by default, but the disambiguation engine works only for English texts.

First, set the correct AIDA endpoint in the yagoaida section of settings.js:

yagoaida: {
    endpoint: 'https://gate.d5.mpi-inf.mpg.de/aida/service/disambiguate' 
},

Then make sure that the disambiguation services include AIDA:

disambiguation: {
    fields: [
        "title",
        "caption"
    ],
    services: {
        "yagoaida": ['en']
    }
}
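
One way to read the services map is that each service name is paired with the languages it should handle. The sketch below works under that assumption (it is not confirmed by the source code here), and servicesForLanguage is a hypothetical helper used only to illustrate the lookup.

```javascript
// Sketch: pick the disambiguation services enabled for a given language.
// The settings object mirrors the example above.
const disambiguation = {
  fields: ['title', 'caption'],
  services: {
    yagoaida: ['en']
  }
};

// Hypothetical helper: returns the names of services listing `language`.
function servicesForLanguage(config, language) {
  return Object.keys(config.services).filter(function (name) {
    return config.services[name].indexOf(language) !== -1;
  });
}

console.log(servicesForLanguage(disambiguation, 'en'));
console.log(servicesForLanguage(disambiguation, 'fr'));
```

Under this reading, an English resource would be sent to AIDA, while a French one would match no service.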

troubleshooting

enable cache with redis-server (optional)

In production we use redis-server to cache API results (for 60 seconds). On OS X, you can install redis with brew:

brew install redis
redis-server /usr/local/etc/redis.conf

Then uncomment the cache section in settings.js.
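
To illustrate what a 60-second result cache does, here is a minimal sketch in which an in-memory Map stands in for redis. It is not how histograph talks to redis-server; createCache and its injectable clock are hypothetical and exist only to show the expiry behaviour.

```javascript
// Sketch: a 60-second result cache. An in-memory Map stands in for redis here.
const CACHE_TTL_MS = 60 * 1000;

// Hypothetical helper; `now` can be injected to control time in tests.
function createCache(ttlMs, now) {
  const clock = now || Date.now;
  const entries = new Map();
  return {
    get: function (key) {
      const entry = entries.get(key);
      if (!entry) return undefined;
      if (clock() >= entry.expiresAt) {
        entries.delete(key); // expired: drop it and report a miss
        return undefined;
      }
      return entry.value;
    },
    set: function (key, value) {
      entries.set(key, { value: value, expiresAt: clock() + ttlMs });
    }
  };
}
```

A request handler would first call get with the request URL as key, and only compute and set the result on a miss.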

geocoding api setup

Create a new project at console.developers.google, select the geocoding API under the API & auth menu, then copy the API key to the geocoding section of your settings.js file.

geocoding: { // google geocoding api
    endpoint: 'https://maps.googleapis.com/maps/api/geocode/json',
    key: ''
},
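
For reference, a Geocoding API request combines the endpoint with an address and the key as query parameters. The sketch below only builds the URL; buildGeocodeUrl is a hypothetical helper, and the way histograph actually issues the request is not shown in this README.

```javascript
// Sketch: build a Google Geocoding API request URL from the geocoding settings.

// Hypothetical helper, not part of histograph.
function buildGeocodeUrl(geocoding, address) {
  const url = new URL(geocoding.endpoint);
  url.searchParams.set('address', address); // free-text place name or address
  url.searchParams.set('key', geocoding.key); // API key from the developer console
  return url.toString();
}

console.log(buildGeocodeUrl({
  endpoint: 'https://maps.googleapis.com/maps/api/geocode/json',
  key: 'YOUR_API_KEY' // placeholder
}, 'Luxembourg City'));
```

Opening the printed URL in a browser with a real key is a simple way to check that the key is accepted.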

More information is available on the geocoding documentation page.

enabling google analytics

Set the correct parameters in the analytics section of the settings.js file:

analytics: {
  account: 'UA-XXXXXXXXX-1',
  domainName: 'example.com'
}