A package which allows you to set up your own EML (Extract, Map and Load) tool.

Input package

This Laravel package, called "input", provides an extract-map-load (EML) configuration as part of the datatank core application (tdt/core). The current instances of the EML stack focus on semantifying data: raw data is transformed into semantic data by providing a mapping file.

Future work includes extracting data from large files and loading it into a NoSQL store. The resulting endpoint can then be queried directly, or proxied by the datatank core.

Configuration with queues

To harvest large datasets, jobs need to be put in a queue so they can be executed asynchronously. One way to do this is to run a beanstalkd service on the server in combination with the artisan queue:listen command.

The artisan queue:listen command executes jobs as they enter the beanstalkd queue. To make sure it keeps running, configure it as a program in supervisord.

Lastly, configure the beanstalkd queue in the application's queue configuration file (app/config/queue.php).
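
As a rough guide, the beanstalkd part of app/config/queue.php could look like the sketch below. It assumes the Laravel 4 configuration layout (suggested by the app/config path) and a beanstalkd service on the same machine. The host, queue name and ttr values are placeholders, not values prescribed by this package, so adjust them to your setup. Laravel 4 also expects the pda/pheanstalk package to be installed for the beanstalkd driver.

```php
<?php

// app/config/queue.php (sketch, assuming a Laravel 4 style configuration)
return array(

    // Use the beanstalkd connection defined below for all queued jobs.
    'default' => 'beanstalkd',

    'connections' => array(

        'beanstalkd' => array(
            'driver' => 'beanstalkd',
            'host'   => 'localhost', // assumption: beanstalkd runs on this server
            'queue'  => 'default',   // assumption: name of the tube jobs are pushed to
            'ttr'    => 60,          // assumption: seconds a job may run before it is released
        ),

    ),

);
```

With a configuration like this in place, jobs pushed onto the queue are picked up by the running artisan queue:listen process.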