
monitoring and utilities

Author

Robin Gowin, September 2014 - October 2014

Source

Monitoring source files:

  • check.py
  • checklogs.py
  • monitoring.txt
  • my-mail.py
  • test-check.py

Generic Reusable Application Log Scraper

Overview

I have developed a generic, reusable Python class and a main program that scrape log files. These tools can be used as part of application monitoring. Main features:

  • easy to customize

  • automated monitoring that sends results by email

  • email subject line simplifies filtering on "success", "warning", or "error"

  • search for regular expressions ("regex")

  • easily extensible to other monitoring tools such as hipchat, riemann, graphite

  • scrape application log files or include as part of an application pipeline

  • can process compressed log files

  • optionally process "exact matches" such as “must contain exactly 1 line like this”
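The "exact match" feature above can be illustrated with a short sketch. The repository's actual implementation is not shown here; the function names `count_matches` and `check_exact` below are hypothetical, but the idea is the same: count the lines matching a regex and compare against an expected count.

```python
import re

def count_matches(lines, pattern):
    # count lines matching a regular expression
    regex = re.compile(pattern)
    return sum(1 for line in lines if regex.search(line))

def check_exact(lines, pattern, expected):
    # True when exactly `expected` lines match `pattern`
    return count_matches(lines, pattern) == expected

log = [
    "2014-10-01 job started",
    "2014-10-01 extracted table users",
    "2014-10-01 job finished OK",
]

# "must contain exactly 1 line like this"
print(check_exact(log, r"finished OK", 1))
```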

High level design

The two main components are in the source code files checklogs.py and check.py. A helper program, my-mail.py, automates sending of the email results.

checklogs

This program defines a reusable class called MyLogScraper which performs the scraping of logs, either a file or a buffer (for pipelines). Public methods:

	def scrape(self, send_mail=False, silent=False):

	def buffer_scrape(self, send_mail=False, silent=False):

	def build_cmd_line(self, silent=True):

scrape method

Callers use this method to scrape a log file; it is the primary interface. Options include whether to send email and whether to print status messages.

buffer_scrape method

Callers use this method to scrape a buffer, i.e. a pipe as part of a larger pipeline command. The interface is similar to the scrape method.

build_cmd_line method

Used in conjunction with either scrape method, this allows the caller to decide whether or not to run the command that emails the results of the log scraping.
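Taken together, the three public methods might be sketched as follows. This is an illustrative outline only, not the actual checklogs.py implementation; the constructor arguments and the internal `_scan` helper are assumptions.

```python
import re
import sys

class MyLogScraper:
    """Sketch of the public interface described above."""

    def __init__(self, logfile, patterns):
        # assumed: log file path and regex list are set at construction
        self.logfile = logfile
        self.patterns = [re.compile(p) for p in patterns]
        self.matches = []

    def scrape(self, send_mail=False, silent=False):
        # primary interface: scrape the configured log file
        with open(self.logfile) as f:
            return self._scan(f, send_mail, silent)

    def buffer_scrape(self, send_mail=False, silent=False):
        # scrape stdin, for use as part of a pipeline
        return self._scan(sys.stdin, send_mail, silent)

    def build_cmd_line(self, silent=True):
        # build (but do not run) the my-mail.py command for the results
        subject = "error" if self.matches else "success"
        return "my-mail.py --subject='%s'" % subject

    def _scan(self, lines, send_mail, silent):
        self.matches = [ln for ln in lines
                        if any(p.search(ln) for p in self.patterns)]
        if not silent:
            for ln in self.matches:
                print(ln.rstrip())
        if send_mail:
            print(self.build_cmd_line())
        return self.matches
```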

check

The default check.py program is a typical example of how to use the checklogs class. The design of the main program looks like:

  • instantiate the class, process command line options

  • cache the success, warning, and error lists

  • scrape the log, optionally send email, process exact matches

high level source

The source code for check looks like this:

# process command line options, set up standard variables
my_main = MyLogMain()

# read in the success, warning, error lists for this type of monitoring
my_main.read_lists()

# scrape the log
my_main.scrape_log()

# process 'exact matches' (i.e. looking for a specific number of lines matching a string), if any
my_main.process_exact_matches()

my-mail.py

Program to send email to a specified distribution. Command-line looks like:

my-mail.py --subject='sss' [--body='bbb' | --file=filename ] [--send_to='email_distro'] [--attach=afilename]

command line options

  • subject line (required)

  • email body (optional)

  • file containing body of email (optional)

  • email distribution list (optional)

  • file to include as attachment (optional)
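The option handling above could be sketched with argparse; this is an assumed parallel of the real my-mail.py, not its actual source. The body and file options are treated as mutually exclusive, matching the `[--body | --file]` form of the command line.

```python
import argparse

def build_parser():
    # sketch of my-mail.py option handling (hypothetical)
    p = argparse.ArgumentParser(prog="my-mail.py")
    p.add_argument("--subject", required=True,
                   help="email subject line (required)")
    body = p.add_mutually_exclusive_group()
    body.add_argument("--body", help="email body text")
    body.add_argument("--file", help="file containing the body of the email")
    p.add_argument("--send_to", help="email distribution list")
    p.add_argument("--attach", help="file to include as an attachment")
    return p

args = build_parser().parse_args(
    ["--subject", "success: nightly extract", "--file", "extract.log"])
print(args.subject)
```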

Use Cases

The following table illustrates a handful of potential use cases for log scraping.

| Use Case | Purpose | Monitoring |
| --- | --- | --- |
| database extract | extract 90 tables from SQL db | confirm all 90 tables were extracted; confirm status line |
| database copy | copy tables to remote server | confirm all tables copied successfully |
| database load | load tables into db appliance | confirm all tables loaded successfully |
| disk space monitoring | monitor disk space | alert if usage is above a specified percentage |
| storage trending | graph storage usage over time | interface with riemann and grafana |
| application log monitoring | monitor progress of applications | scrape any number of logs produced by applications |
| cron changes | monitor changes to cron jobs | as new jobs are developed and automated, confirm that existing cron jobs are still running (e.g. not commented out) |
| input file dependencies | confirm time and size of input files | verify that expected source input files are being produced when expected |
| hourly processing | confirm dependencies | verify that hourly files are produced on schedule |
| middle tier uptime | confirm API status | send a message to the middle tier and confirm that it was processed correctly |
| daily snapshot | confirm db lookup | verify that the daily metadata user snapshot file was created recently (it is updated several times per day) |
| middleware log | confirm nginx log | search nginx logs for errors or warnings |

other utility programs

  • countlines.py simple utility to count lines in a file

  • get-os.py print operating system version details

  • ps.py format output of ps aux and filter and loop
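For the simplest of these, countlines.py, a utility of that kind might look like the sketch below; the actual source is in the repository, and this version is only an assumed equivalent.

```python
import sys

def count_lines(path):
    # count the lines in a file without loading it all into memory
    with open(path) as f:
        return sum(1 for _ in f)

if __name__ == "__main__":
    for name in sys.argv[1:]:
        print("%s: %d" % (name, count_lines(name)))
```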

simple utility methods and variables

  • utils.py
  • test-utils.py

source for simple utilities

see GitHub:

https://github.com/rbgowin/utilities/tree/master/python-utils

  • my utilities
  • first version 09/30/14
