A Guided Tour of monitrix

Andy Jackson edited this page Jun 28, 2013 · 21 revisions

This page provides a 'work in progress' overview of the currently implemented features of monitrix and their status.

Header

  • Navigation bar

  • Search field: currently searches by host name only. Multiple search results are shown in a list (no pagination yet). If exactly one result is found, monitrix skips the list and directly opens that host's info page.

  • Title/Info bar: apart from the monitrix 'logo', the title bar displays:

    • basic crawl status info, i.e. active or idle, plus the time since the crawl started or went idle
    • (clickable) URLs selected randomly from the most recent 100 crawled URLs
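
The search behaviour described above (a result list for several matches, a direct jump for exactly one) can be sketched as follows. This is a minimal illustration in Python, not monitrix's own code: the function name `search_hosts`, the case-insensitive substring match, and the `("host", ...)` / `("list", ...)` return values are all assumptions made for the example.

```python
def search_hosts(query, known_hosts):
    """Decide what the search field should show for a query.

    Returns ("host", name) when exactly one host matches, so the caller can
    open that host's info page directly, and ("list", matches) otherwise.
    Assumption: matching is a case-insensitive substring test on the host name.
    """
    needle = query.strip().lower()
    matches = sorted(h for h in known_hosts if needle in h.lower())
    if len(matches) == 1:
        return ("host", matches[0])
    return ("list", matches)
```

Because there is no pagination yet, the `"list"` case returns every match in one go.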

Home Screen

Monitrix: Dashboard

  • Time since crawl start (if the crawl is currently running) or since the last activity (if the crawl is idle, i.e. the last log activity was more than 2 minutes ago)

  • Gauge of the current average crawl rate (URLs per minute), computed from the 100 most recent log entries and shown relative to the average over the 2000 most recent log entries. AJAX-updated every 5 seconds, although the displayed value changes no faster than the backend's own update interval.

  • Gauge of the current download rate (MB per minute), computed from the 100 most recent log entries and shown relative to the average over the 2000 most recent log entries. AJAX-updated every 5 seconds.

  • A break-down of the number of hosts crawled by top-level domain.

  • List of 10 most recent alerts in the system (not AJAX updated at the moment).
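
To make the gauge and idle logic above concrete, here is a minimal Python sketch. The window sizes (100 and 2000 entries) and the 2-minute idle threshold come from the description above; the function names and the exact rate formula (intervals between entries divided by the elapsed time) are assumptions, since this page does not say how monitrix computes the value internally.

```python
def urls_per_minute(timestamps, window):
    """Average crawl rate over the `window` most recent log entries.

    `timestamps` are log-entry times in seconds, oldest first.
    """
    recent = timestamps[-window:]
    if len(recent) < 2:
        return 0.0
    span_seconds = recent[-1] - recent[0]
    if span_seconds <= 0:
        return 0.0
    # len(recent) - 1 intervals lie between the first and the last entry.
    return (len(recent) - 1) * 60.0 / span_seconds


def gauge_reading(timestamps, short_window=100, long_window=2000):
    """Return (current rate, baseline rate), both in URLs per minute."""
    return (urls_per_minute(timestamps, short_window),
            urls_per_minute(timestamps, long_window))


def is_idle(last_activity, now, threshold_seconds=120):
    """A crawl counts as idle once the last log activity is over 2 minutes old."""
    return now - last_activity > threshold_seconds
```

The download-rate gauge could follow the same pattern, summing downloaded bytes over each window instead of counting entries.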

Crawl Timeline

Monitrix: Timeline (1/2)

  • 'Quick Stats' info box:

    • Time of crawl start
    • Time of last activity
    • Crawl duration
    • Total # of URLs crawled
    • Total amount of data downloaded
  • Graph: data volume downloaded vs. time (100 datapoints resolution)

  • Graph: number of URLs crawled vs. time (100 datapoints resolution)

  • Graph: number of new hosts crawled vs. time (100 datapoints resolution)

  • Graph: number of hosts completed vs. time (100 datapoints resolution)

Each graph has a 'view fullscreen' button underneath, which opens the graph in a lightbox window, at a higher resolution (work in progress).
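
One way to read the "100 datapoints resolution" of the graphs above is that the crawl's time span is divided into 100 buckets and each bucket contributes one point. The Python sketch below illustrates that idea for the "URLs crawled vs. time" graph. Equal-width buckets and the name `bucket_counts` are assumptions; the page does not state how monitrix bins its data.

```python
def bucket_counts(timestamps, start, end, buckets=100):
    """Count events per equal-width time bucket between `start` and `end`."""
    counts = [0] * buckets
    width = (end - start) / buckets
    if width <= 0:
        return counts
    for t in timestamps:
        if start <= t <= end:
            # Clamp so that an event at exactly `end` lands in the last bucket.
            index = min(int((t - start) / width), buckets - 1)
            counts[index] += 1
    return counts
```

A higher-resolution fullscreen view would then only need a larger `buckets` value.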

Hosts

Monitrix: Hosts (1/3)

URLs

Monitrix: URLs & Compressibility

Alerts

Monitrix: Alerts

The alerts page lists all hosts that have generated alerts, together with the alert types and the number of alerts the host has generated for each type. Clicking on a host opens its host info page. No pagination yet.
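
The per-host, per-type grouping shown on the alerts page can be sketched in a few lines of Python. The input format (an iterable of `(host, alert_type)` pairs) and the name `alerts_by_host` are assumptions made for illustration; monitrix's actual data model is not described on this page.

```python
from collections import Counter, defaultdict


def alerts_by_host(alerts):
    """Group (host, alert_type) pairs into {host: Counter({alert_type: n})}."""
    grouped = defaultdict(Counter)
    for host, alert_type in alerts:
        grouped[host][alert_type] += 1
    return dict(grouped)
```

Each key of the result corresponds to one row on the page, and each counter entry to one alert type with its count.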

Virus Log

Monitrix: Viruses

The virus log lists all viruses detected during the crawl, along with the number of detections and the list of infected hosts.

The virus log is also downloadable as a PDF. This is mostly for demo purposes, to illustrate the integration and use of the JasperReports library.

Host Info Page

Monitrix: Host Details

  • Number of URLs crawled at this host
  • Time of first and last access
  • Pie chart of Heritrix fetch code distribution
  • Pie chart of Virus scan results
  • Pie chart of MIME type distribution
  • List of subdomains
  • List of crawlers that have crawled this host
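
The three pie charts above all reduce to the same computation: counting how often each value occurs and turning the counts into shares. A minimal Python sketch, assuming the fetch codes (or scan results, or MIME types) for a host are available as a plain list; the helper name `distribution` is hypothetical.

```python
from collections import Counter


def distribution(values):
    """Return {value: share of total} for the slices of a pie chart."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {value: n / total for value, n in counts.items()}
```

For example, `distribution([200, 200, 200, 404])` gives a 75% slice for code 200 and a 25% slice for code 404 (the sample codes are illustrative only).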

Admin

Monitrix: Admin

The admin page (currently) lists the log files which are being monitored by monitrix, along with their monitoring status. Each log file will have one of the following status types:

  • IDLE: monitrix has ingested this log file, and it is being monitored for updates in the background.
  • PENDING: this log file is on the waiting list for initial ingest after it has been registered with monitrix, or after monitrix has been (re)started.
  • CATCHING UP: this log file is being ingested after it has been registered with monitrix, or after monitrix has been (re)started. Note: while log files are in the 'catching up' phase, monitrix may respond slowly due to the high volume of database write activity.
  • SYNCHRONIZING: monitrix is synchronizing the database with the recent entries to this log.
  • TERMINATED: the ingest process has terminated.
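
The status types above map naturally onto an enumeration. The Python sketch below is illustrative only: the class name `LogStatus` and the helper `may_be_slow` are assumptions, and only the five status labels and the slowdown during 'catching up' are taken from this page.

```python
from enum import Enum


class LogStatus(Enum):
    """Monitoring status of a log file, as listed on the admin page."""
    IDLE = "IDLE"
    PENDING = "PENDING"
    CATCHING_UP = "CATCHING UP"
    SYNCHRONIZING = "SYNCHRONIZING"
    TERMINATED = "TERMINATED"


def may_be_slow(statuses):
    """True if any log file is still catching up (heavy database writes)."""
    return any(status is LogStatus.CATCHING_UP for status in statuses)
```

A status label read from the page can be converted with `LogStatus("CATCHING UP")`.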

About Page

At the moment, the about page lists version information for all of monitrix's dependencies (Java VM, Scala version, .jar library dependencies, etc.).