monitrix is a monitoring/analytics frontend for the Heritrix 3 Web crawler. Visit the Wiki for information about:
- installing monitrix
- using monitrix
- monitrix internals
Developers: Quick Start
To start monitrix in development mode, change into the project root folder and type
The application will be at http://localhost:9000.
To generate an Eclipse project, type
Getting Data into monitrix
To load data into monitrix, open the 'Admin' section and enter the absolute path of a log file in the form. The log file should immediately appear in the list above the form with status 'CATCHING UP' (or 'PENDING', followed shortly by 'CATCHING UP'). monitrix will now load the log file into the database. Once the initial load is complete, monitrix continuously checks the log file for updates. Warning: Loading data takes time! On my machine, a 10 GB log sample currently takes about 1 hour to process.
Alternatively, you can also populate the database 'manually' using either of the following Java utilities, located in the /test folder of the project:
- uk.bl.monitrix.util.BatchLogProcessor will load a log file into the database in one go and then terminate.
- uk.bl.monitrix.util.IncrementalLogProcessor will load a log file into the database, and then continue to monitor that file (and incrementally sync the DB) until it is terminated forcefully.
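The "load, then keep monitoring" behaviour described above can be illustrated with a minimal sketch. This is not the code of IncrementalLogProcessor; the class name LogTailSketch, the 1 MB chunk size, and the one-second polling interval are assumptions made for illustration, and printing a line stands in for writing it to the database.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: remembers how far a log file has been read and
// returns only the complete lines appended since the previous call.
class LogTailSketch {
    // Assumption: a single log line is always shorter than one chunk.
    private static final int MAX_CHUNK = 1 << 20;

    private final Path logFile;
    private long offset = 0; // byte position of the first unread line

    LogTailSketch(Path logFile) {
        this.logFile = logFile;
    }

    List<String> readNewLines() throws IOException {
        List<String> lines = new ArrayList<>();
        try (RandomAccessFile raf = new RandomAccessFile(logFile.toFile(), "r")) {
            if (raf.length() < offset) {
                offset = 0; // file shrank (e.g. was replaced): start over
            }
            long available = raf.length() - offset;
            if (available <= 0) {
                return lines;
            }
            byte[] buf = new byte[(int) Math.min(available, MAX_CHUNK)];
            raf.seek(offset);
            raf.readFully(buf);

            // Only consume up to the last newline, so that a line which is
            // still being written is picked up whole on a later call.
            int lastNewline = -1;
            for (int i = buf.length - 1; i >= 0; i--) {
                if (buf[i] == '\n') {
                    lastNewline = i;
                    break;
                }
            }
            if (lastNewline < 0) {
                return lines;
            }
            String chunk = new String(buf, 0, lastNewline, StandardCharsets.UTF_8);
            for (String line : chunk.split("\n")) {
                if (!line.isEmpty()) {
                    lines.add(line);
                }
            }
            offset += lastNewline + 1;
        }
        return lines;
    }

    public static void main(String[] args) throws Exception {
        LogTailSketch tail = new LogTailSketch(Paths.get(args[0]));
        while (true) { // runs until the process is terminated
            List<String> batch = tail.readNewLines();
            for (String line : batch) {
                System.out.println(line); // placeholder for a database insert
            }
            if (batch.isEmpty()) {
                Thread.sleep(1000); // caught up: wait before polling again
            }
        }
    }
}
```

In this sketch, a batch-style run corresponds to calling readNewLines() until it returns an empty list and then exiting, while the incremental variant keeps polling in the loop shown in main.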
MongoDB Cheat Sheet
- Use the mongod command to start MongoDB (hint:
- The MongoDB admin dashboard is at http://localhost:28017. Make sure you start MongoDB with the --rest option (when in dev mode) to enable full dashboard functionality. (Note: on my system, sudo mongod --dbpath /var/lib/mongodb --rest works fine.)
- Use mongo monitrix --eval "db.dropDatabase()" to drop the monitrix DB (replace 'monitrix' with your database name, if needed).
- MongoDB REST interface docs are here.