Logs Analysis Project

This is my submission for the Logs Analysis Project in Udacity's Full Stack Web Developer Nanodegree. The task was to create a tool that analyzes a PostgreSQL database of a news website and prints out reports. The report answers three questions:
  1. What are the most popular three articles of all time?
  2. Who are the most popular article authors of all time?
  3. On which days did more than 1% of requests lead to errors?
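The third question comes down to a per-day ratio of failed requests to total requests. A minimal sketch of that arithmetic in plain Python is below; the function name `days_over_threshold` and the `(day, total, errors)` input shape are assumptions made for illustration, not taken from the project's code, and the actual tool may well do this calculation inside SQL instead.

```python
from datetime import date


def days_over_threshold(daily_counts, threshold=1.0):
    """Return (day, percent) pairs for days whose error rate exceeds threshold.

    daily_counts is an iterable of (day, total_requests, error_requests)
    tuples. This input shape is hypothetical; it stands in for whatever
    rows the database query returns.
    """
    results = []
    for day, total, errors in daily_counts:
        if total == 0:
            # Skip days with no traffic to avoid dividing by zero.
            continue
        percent = 100.0 * errors / total
        if percent > threshold:
            results.append((day, round(percent, 1)))
    return results


# Made-up counts, chosen only to exercise both branches.
sample = [
    (date(2016, 7, 17), 1000, 23),  # 2.3% errors, reported
    (date(2016, 7, 18), 1000, 10),  # exactly 1.0%, not "more than 1%"
]
flagged = days_over_threshold(sample)
```

Note that a day with exactly 1% errors is excluded, since the question asks for more than 1%.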

Sample Output

(1) "Candidate is jerk, alleges rival" with 338647 views
(2) "Bears love berries, alleges bear" with 253801 views
(3) "Bad things gone, say good people" with 170098 views
(1) "Ursula La Multa" with 507594 views
(2) "Rudolf von Treppenwitz" with 423457 views
(3) "Anonymous Contributor" with 170098 views
July 17, 2016 -- 2.3% errors
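The line formats in the sample output can be reproduced with ordinary string formatting. The sketch below is one way to do it; the helper names `format_views` and `format_error_day` are hypothetical and are not taken from the project's source.

```python
from datetime import date


def format_views(rows):
    """Turn (name, views) pairs into numbered report lines."""
    return [
        '({}) "{}" with {} views'.format(rank, name, views)
        for rank, (name, views) in enumerate(rows, start=1)
    ]


def format_error_day(day, percent):
    """Format one error-rate line, e.g. for a datetime.date and a float."""
    # %B gives the full month name; it is English under Python's
    # default locale, which is assumed here.
    return "{} {}, {} -- {:.1f}% errors".format(
        day.strftime("%B"), day.day, day.year, percent
    )


lines = format_views([("Candidate is jerk, alleges rival", 338647)])
error_line = format_error_day(date(2016, 7, 17), 2.3)
```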


For this project, I used a Linux-based VM that gave me access to PostgreSQL along with the other software necessary for the project.

VM Setup

  1. Install the platform package of VirtualBox. VirtualBox is what actually runs the VM.
  2. Install Vagrant. Vagrant configures the VM and lets you share files between your host computer and the VM's filesystem.
  3. Fork and clone the repository. This is the VM configuration. All the work will be done in this directory.
  4. In a terminal, change to this directory with cd. Then, cd into the vagrant folder.
  5. Run the command vagrant up. This tells Vagrant to download the Linux operating system, which can take a long time.
  6. Run the command vagrant ssh to log into the newly installed Linux VM.

Download the Data

  1. Download and unzip
  2. Put the unzipped file, newsdata.sql, in the vagrant directory.
  3. To run the reporting tool you will need to load the news site's data into the local database news that has already been created for you. In the VM, cd into the vagrant subdirectory and run the command psql -d news -f newsdata.sql:
    • psql — the PostgreSQL command line program
    • -d news — connect to the database named news which has been set up for you.
    • -f newsdata.sql — run the SQL statements in the file newsdata.sql that you just downloaded, creating tables and populating them with data.

Install psycopg2

psycopg2 is the Python library that allows Python to work with PostgreSQL. To install it, run the command pip3 install psycopg2
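Once psycopg2 is installed, querying the news database from Python might look like the sketch below. The helper name `fetch_rows` and its optional `connect` parameter are assumptions made for illustration; only `psycopg2.connect`, `cursor()`, `execute()`, `fetchall()` and `close()` are real psycopg2 calls, and the database name `news` comes from the setup steps above.

```python
def fetch_rows(query, connect=None):
    """Run one SQL query and return all result rows as a list of tuples.

    connect is an optional zero-argument factory returning a DB-API
    connection (a hypothetical hook, useful for trying the function
    without a live database). By default it connects to the local
    "news" database with psycopg2.
    """
    if connect is None:
        # Imported lazily so psycopg2 is only needed for real connections.
        import psycopg2

        def connect():
            return psycopg2.connect(dbname="news")

    conn = connect()
    try:
        cursor = conn.cursor()
        cursor.execute(query)
        return cursor.fetchall()
    finally:
        # Always release the connection, even if the query fails.
        conn.close()
```

Inside the VM, calling `fetch_rows("select version()")` should return a single row describing the PostgreSQL server, which is a quick way to confirm that the library and the database are both reachable.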


Everything is set up now. All you have to do is run with python3: python3