Skip to content

SruthiV/Logs-Analysis

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

15 Commits
 
 
 
 
 
 
 
 

Repository files navigation

Logs Analysis Project - Udacity Full Stack Web Developer Nanodegree

DESCRIPTION

For this project, my task was to create a reporting tool that prints out reports( in plain text) based on the data in the given database. This reporting tool is a Python program using the psycopg2 module to connect to the database. This project sets up a mock PostgreSQL database for a fictional news website. The provided Python script uses the psycopg2 library to query the database and produce a report that answers the following three questions:

  1. What are the most popular three articles of all time?
  2. Who are the most popular article authors of all time?
  3. On which days did more than 1% of requests lead to errors?

RUNNING THE PROGRAM

  1. To get started, I recommend the user use a virtual machine to ensure they are using the same environment that this project was developed on, running on your computer. You can download Vagrant and VirtualBox to install and manage your virtual machine. Use vagrant up to bring the virtual machine online and vagrant ssh to login.

  2. Download the data provided by Udacity here. Unzip the file in order to extract newsdata.sql. This file should be inside the Vagrant folder.

  3. Load the database using psql -d news -f newsdata.sql.

  4. Connect to the database using psql -d news.

  5. Create the Views given below. Then exit psql.

  6. Now execute the Python file - python logs_analysis.py.

CREATE THE FOLLOWING VIEWS FOR QUESTION 2 AND QUESTION 3:

Views for Question 2
CREATE VIEW article_authors AS
SELECT title, name
FROM articles, authors
WHERE articles.author = authors.id;
CREATE VIEW article_views AS
SELECT title, count(log.id) as views
FROM articles, log
WHERE log.path = CONCAT('/article/', articles.slug)
GROUP BY articles.title
ORDER BY views desc;
Views for Question 3
CREATE VIEW logs AS
SELECT to_char(time,'DD-MON-YYYY') as Date, count(*) as LogCount
FROM log
GROUP BY Date;
CREATE VIEW errorlogs AS
SELECT to_char(time,'DD-MON-YYYY') as Date, count(*) as ErrorCount
FROM log
WHERE STATUS = '404 NOT FOUND'
GROUP BY Date;

Releases

No releases published

Packages

No packages published

Languages