Library Metric Utilities

This repository contains scripts and utilities for generating numbers for various library metrics that we're required to give every year.

This counts the number of features in PostGIS databases.

Setup

Python Dependencies

Run rake init

OR pip install --requirement requirements.txt
Create config.yml

This needs a YAML configuration file to tell it how to connect to all of the databases you need counts from.

The config.yml file should be on the same level as the bin folder (not within the bin folder).

---
postgis:
  - host: postgis.server.com
    port: 5432
    user: postgis_user
    password: secret
  - host: postgis2.server.com
    port: 5432
    user: postgis2_user
    password: secret2
rasters:
  host: geo.server.com
  user: username
  geoserver-data-dir: /var/geodata/geoserver/data
geoserver:
  url: http://geo.server.com:8080/geoserver
  user: admin
  password: secret

Run the script

Now you can call it like this:

./bin/geolayers.py --config config.yml --filter '^ESRI' --filter '^MLB'

OR

rake geoserver

Options

Other options are available to control the output.

--config = The database configuration file.

--filter = Don't get counts for databases matching these. Can be given more than once.

--no-totals = Don't print totals.

--verbose = Extra output.

Notes

The user supplied in the config.yml for connecting to the host must be able to access the pg_hba.conf file (meaning the account will need sudo access).

Website Metrics

The server-logs directory also contains some resources for analyzing server logs. Currently, this is just a Makefile that installed the needed dependencies and tools and then generates summary statistics from a log file.

Tools

The tools that it uses are:

GoAccess to parse the logs, aggregate their data, and dump out the report as JSON, which means we need to process that with ...
jq, which pulls out the data we need from the log summary and write it out as CSV, which we process using ...
csvkit to get summary statistics for the fields we're interested in (unique visitors and page views).

The Makefile will install these using homebrew and pip:

make install

Preparing the Logs

The logs have to be downloaded separately. These are found on our servers in the /etc/httpd/logs directory, and are usually named after their sites (e.g., prosody_access.log).

Generally, we keep about a month's worth of logs in five files: the current file and four numbered rotated logs (prosody_access.log.1 through prosody_access.log.4).

These tools assume that there's one file containing all of these files with a very specific name: PROJECT_access.log. You can easily generate this file with these two short shell command:

mv prosody_access.log prosody_access.log.0
cat prosody_access.log.* > prosody_access.log

The first command backs up the current file by renaming it, and then the second command uses cat to join all the files into one. Notice that the logs are redirected into a file whose name matches the pattern required.

Generating Statistics

Once the logs are in shape, the Makefile will generate the statistical summary for a project:

make prosody.summary

If the tools can find the file prosody_access.log, it will use the tools to generate a file called prosody.summary. Its contents will look something like this:

  2. "visitors"

	Type of data:          Number
	Contains null values:  False
	Unique values:         31
	Smallest value:        103
	Largest value:         594
	Sum:                   16,022
	Mean:                  485.515
	Median:                519
	StDev:                 102.473
	Most common values:    501 (2x)
	                       521 (2x)
	                       103 (1x)
	                       581 (1x)
	                       565 (1x)

  3. "hits"

	Type of data:          Number
	Contains null values:  False
	Unique values:         33
	Smallest value:        1,174
	Largest value:         10,814
	Sum:                   268,436
	Mean:                  8,134.424
	Median:                8,661
	StDev:                 2,412.637
	Most common values:    1,174 (1x)
	                       9,929 (1x)
	                       7,674 (1x)
	                       8,127 (1x)
	                       8,574 (1x)

Row count: 33

Name		Name	Last commit message	Last commit date
Latest commit History 17 Commits
scripts		scripts
server-logs		server-logs
.gitignore		.gitignore
README.md		README.md
Rakefile		Rakefile
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Library Metric Utilities

Setup

Run the script

Options

Notes

Website Metrics

Tools

Preparing the Logs

Generating Statistics

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Library Metric Utilities

Setup

Run the script

Options

Notes

Website Metrics

Tools

Preparing the Logs

Generating Statistics

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages