Skip to content

NaughtyFlame/logs_analysis

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

6 Commits
 
 
 
 
 
 

Repository files navigation

Logs Analysis

An internal reporting tool that will use information from the database to discover what kind of articles the site's readers like.

This tool will answer three question.

  • What are the most popular three articles of all time?
  • Who are the most popular article authors of all time?
  • On which days did more than 1% of requests lead to errors?

Quick Start

You should Install python2 to run this script

Install

  • Set up a VirtualBox/Vagrant environment

    1. Install VirtualBox
    2. Install Vagrant
    3. Start the virtual machine

    From your terminal, inside the vagrant subdirectory, run the command vagrant up. This will cause Vagrant to download the Linux operating system and install it.

    When vagrant up is finished running, you will get your shell prompt back. At this point, you can run vagrant ssh to log in to your newly installed Linux VM!

  • [optional step] Install python module psycopg2 (IF haven't installed)

If your pip version supports wheel packages it should be possible to install a binary version of Psycopg including all the dependencies from PyPI. Just run:

$ pip install -U pip      # make sure your pip is up-to-date
$ pip install psycopg2
  • Download database Next, download the data here. You will need to unzip this file after downloading it. The file inside is called newsdata.sql.

To load the data, use the command psql -d news -f newsdata.sql.

  • Create View in database You shold create three view in database, log_articles, log_daily, log_error.

  • Run the script

python logs_analysis.py

Database View

log_articles

Outputing a table of successful log record for every article and its author

CREATE VIEW  log_articles AS
  SELECT title, author, count(*) AS views FROM articles
  LEFT JOIN log
  ON '/article/' || slug = path
  WHERE status = '200 OK'
  GROUP BY title, author;

Example:

title author views
Trouble for troubled troublemakers 2 84810
There are a lot of bears 1 84504
Candidate is jerk, alleges rival 2 338647
Bad things gone, say good people 3 170098
Bears love berries, alleges bear 1 253801
Media obsessed with bears 1 84383
Balloon goons doomed 4 84557
Goats eat Google's lawn 1 84906

log_daily

Outputing a table of all requests for each day.

CREATE VIEW log_daily AS
  SELECT DATE(time), COUNT(*) AS count_total
  FROM log
  GROUP BY DATE(time);

Example:

date count_total
2016-07-01 38705
2016-07-02 55200
2016-07-03 54866
2016-07-04 54903

log_error

Outputing a table of error requests for each day

CREATE VIEW log_error AS
  SELECT DATE(time), COUNT(*) AS count_error
  FROM log
  WHERE status LIKE '404%'
  GROUP BY DATE(time);

Example:

date count_error
2016-07-31 329
2016-07-06 420
2016-07-17 1265
2016-07-19 433

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages