Project Layout
monitrix is built on the Play! framework (Java); the Play version at the time of writing is 2.0.4. As its primary backend, monitrix uses the MongoDB NoSQL database. So far, monitrix has been tested with MongoDB 2.0.4 (not a typo: the version number happens to be identical to the Play version).
The project layout follows standard Play! conventions, with the following top-level folders:
- app contains the actual application source code
- conf contains application configuration, route definitions and Jasper .jrxml templates for the downloadable reports
- project holds the build file + properties
- public contains static Web resources (images, CSS, JavaScript files)
- test contains unit test classes + resources
Following Play! conventions, the project has two top-level packages named controllers and views, which contain the controller implementation classes and the view templates, respectively. controllers also contains a sub-package mapping, which holds helper classes that wrap model objects into a form that the Play! framework can automatically translate to a convenient JSON representation.
The core application logic is located in a third package named uk.bl.monitrix. This package has the following sub-packages:
- model contains interface (or abstract base class) definitions for the core data model concepts used by monitrix (cf. Technical Overview):
  - CrawlLog and CrawlLogEntry represent the crawl log and an individual log line, respectively
  - KnownHostList and KnownHost represent the list of known hosts and an individual host, respectively
  - CrawlStats and CrawlStatsUnit represent the crawl stats collection and an individual base-resolution data point, respectively
  - AlertLog and Alert represent the alert log collection and an individual alert, respectively
  - VirusLog and VirusRecord represent the virus log and a record of occurrences of an individual virus, respectively
- heritrix contains classes specific to reading and ingesting Heritrix log files:
  - LogFileEntry is an implementation of CrawlLogEntry, based on a line read from a log file
  - SimpleLogFileReader exposes a Heritrix log file through an Iterator over LogFileEntry objects
  - IncrementalLogFileReader is a log file reader that implements incremental batch loading on a log file that is being concurrently written by Heritrix
  - The sub-package ingest contains the classes for ingesting data into monitrix:
    - IngestWatcher provides a "frontend" API to the ingest system (with methods to start and stop the watching process, query status, and add logs for watching)
    - IngestActor handles the actual watch and ingest process in the background
    - IngestStatus and IngestControlMessage are simple classes used for communication with the ingest system
- database contains the DBConnector class (a minimal, generic database read interface), the DBIngestConnector class (a generic database write interface) and one sub-package, mongodb, which holds the implementation classes for the MongoDB storage backend. The mongodb sub-package has the following contents:
  - MongoProperties holds the string constants (collection and field names) used for MongoDB
  - MongoDBConnector implements the DBConnector interface for MongoDB
  - The package model contains implementation classes for monitrix's core data model concepts (as defined in uk.bl.monitrix.model)
  - The package ingest contains extensions to the core data model implementations which provide write access. These classes also contain the core ingest processing logic!
- analytics contains data structures and processing functions for computing various aggregate stats from the raw data held in the backend:
  - CrawlStatsAnalytics contains helpers to compute/resample time series from the data held in the Crawl Stats collection
  - LogAnalytics contains helpers to compute various stats and property distributions from series of log entries
  - PieChartValue and TimeseriesValue are data structures used to represent computation results in the analytics classes
- export contains the classes that implement the rendering of printable reports, based on JasperReports. Note that the actual report templates (extension .jrxml) are located in the /conf folder.
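To illustrate the iterator-based reading that the heritrix package provides through SimpleLogFileReader, here is a minimal sketch. It is not the monitrix implementation: the class names SketchLogEntry and SketchLogReader are hypothetical, and the field positions (timestamp, status code, size and URL as the first four whitespace-separated fields) are an assumption about the crawl log layout. The sketch assumes Java 8 or later.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Hypothetical stand-in for LogFileEntry: keeps a few fields of one log line. */
final class SketchLogEntry {
    final String timestamp;
    final int httpCode;
    final String url;

    SketchLogEntry(String line) {
        // Assumption: fields are whitespace-separated, with the timestamp,
        // status code, size and URL in positions 0 to 3.
        String[] fields = line.trim().split("\\s+");
        if (fields.length < 4) {
            throw new IllegalArgumentException("Unexpected log line: " + line);
        }
        this.timestamp = fields[0];
        this.httpCode = Integer.parseInt(fields[1]);
        this.url = fields[3];
    }
}

/** Hypothetical reader: exposes log lines lazily as an Iterator of entries. */
final class SketchLogReader implements Iterator<SketchLogEntry> {
    private final BufferedReader reader;
    private String nextLine;

    SketchLogReader(Reader source) {
        this.reader = new BufferedReader(source);
        advance();
    }

    /** Reads ahead to the next non-blank line, or sets nextLine to null at end of input. */
    private void advance() {
        try {
            do {
                nextLine = reader.readLine();
            } while (nextLine != null && nextLine.trim().isEmpty());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public boolean hasNext() {
        return nextLine != null;
    }

    @Override
    public SketchLogEntry next() {
        if (nextLine == null) {
            throw new NoSuchElementException();
        }
        SketchLogEntry entry = new SketchLogEntry(nextLine);
        advance();
        return entry;
    }
}
```

Reading one line ahead keeps hasNext() cheap and means the whole file never has to be held in memory. A caller would wrap a java.io.FileReader for the log file and loop with while (reader.hasNext()).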
Finally, there are two additional classes in the top-level uk.bl.monitrix package: Global, which is an implementation of the Play! Global object, and NumberFormat, which provides helpers for formatting numbers and dates in the view templates.
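The time series resampling mentioned for CrawlStatsAnalytics in the analytics package can be pictured with the following sketch. It only illustrates the general idea of summing base-resolution data points into coarser time buckets; the class ResampleSketch, its nested Point type and the method resample are hypothetical names, not part of the monitrix code, and the actual helpers may aggregate differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class ResampleSketch {

    /** Hypothetical data point: a timestamp in milliseconds and a count. */
    static final class Point {
        final long timestamp;
        final long value;

        Point(long timestamp, long value) {
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    /**
     * Sums the values of all points that fall into the same bucket of
     * bucketMillis width. The result is ordered by bucket start time.
     */
    static List<Point> resample(List<Point> points, long bucketMillis) {
        if (bucketMillis <= 0) {
            throw new IllegalArgumentException("bucketMillis must be positive");
        }
        // TreeMap keeps the buckets sorted by their start time.
        TreeMap<Long, Long> buckets = new TreeMap<>();
        for (Point p : points) {
            // Round the timestamp down to the start of its bucket.
            long bucketStart = Math.floorDiv(p.timestamp, bucketMillis) * bucketMillis;
            buckets.merge(bucketStart, p.value, Long::sum);
        }
        List<Point> result = new ArrayList<>();
        for (Map.Entry<Long, Long> e : buckets.entrySet()) {
            result.add(new Point(e.getKey(), e.getValue()));
        }
        return result;
    }
}
```

For example, resampling per-minute counts with a bucketMillis of 3600000 yields one summed value per hour. Buckets that contain no points are simply absent from the result; a charting view might prefer to fill them with zeros instead.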