This program analyses data from the newsdata.sql file provided by Udacity here. It runs in a virtual machine managed by Vagrant (found here) and uses PostgreSQL to pull the relevant data.
The program is written in Python 2. It contains three methods, one for each question asked. Each method connects to the news database and executes a SQL command to retrieve the data needed to answer its question. The results of the SQL command are stored in a variable. The method prints a message showing the question it answers, then loops through the stored results with a for loop and displays each row in the format specified by the Udacity instructions. After the loop finishes, the database connection is closed.
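The steps above can be sketched roughly as follows. This is a minimal illustration, not the code in LogsAnalysisProject.py: the function name `run_report`, the `views` table, its rows and the output format are all hypothetical, and Python's built-in `sqlite3` module stands in for PostgreSQL so the sketch runs without the VM. With a PostgreSQL driver such as psycopg2, the `connect` argument would instead be a function that returns a connection to the news database.

```python
import sqlite3  # stand-in driver so the sketch runs anywhere; the project uses PostgreSQL


def run_report(connect, question, query, line_format):
    """Run one query and print its rows, following the steps described above."""
    conn = connect()                 # 1. connect to the database
    cursor = conn.cursor()
    cursor.execute(query)            # 2. execute the SQL command
    results = cursor.fetchall()      # 3. store the results in a variable
    print(question)                  # 4. print the question being answered
    lines = []
    for row in results:              # 5. loop over the stored rows
        line = line_format.format(*row)
        print(line)
        lines.append(line)
    conn.close()                     # 6. close the connection after the loop
    return lines


def demo_connect():
    """Hypothetical in-memory database with made-up rows, for illustration only."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE views (title TEXT, hits INTEGER)")
    conn.executemany(
        "INSERT INTO views VALUES (?, ?)",
        [("Example article A", 120), ("Example article B", 45)],
    )
    return conn


demo_lines = run_report(
    demo_connect,
    "What are the most popular articles?",
    "SELECT title, hits FROM views ORDER BY hits DESC",
    '"{0}" - {1} views',
)
```

Passing the connection function in as an argument keeps the connect, query, print and close steps in one place, so each of the three questions would only need its own question text, SQL command and line format.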
The purpose of the analysis is to answer three questions:
- What are the most popular three articles of all time?
- Who are the most popular article authors of all time?
- On which days did more than 1% of requests lead to errors?
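The third question comes down to a per-day percentage: errors divided by total requests, compared against 1%. The sketch below shows that arithmetic in plain Python on made-up counts. It is an illustration only; the function name and data are hypothetical, and the program itself may well do this calculation inside its SQL command.

```python
def days_over_threshold(daily_counts, threshold_percent=1.0):
    """Return (day, error_percent) pairs where errors exceed the threshold."""
    flagged = []
    for day, total, errors in daily_counts:
        if total == 0:
            continue  # no requests that day, so there is no error rate to report
        percent = 100.0 * errors / total
        if percent > threshold_percent:  # strictly more than the threshold
            flagged.append((day, round(percent, 2)))
    return flagged


# Made-up counts: (day, total requests, requests that led to an error)
sample_counts = [
    ("2020-01-01", 1000, 5),   # 0.5%, below the threshold
    ("2020-01-02", 1000, 25),  # 2.5%, above the threshold
    ("2020-01-03", 2000, 20),  # exactly 1.0%, not "more than" 1%
]

flagged_days = days_over_threshold(sample_counts)
for day, percent in flagged_days:
    print("{0} - {1}% errors".format(day, percent))
```

Note that a day with exactly 1% errors is not reported, because the question asks for more than 1%.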
- Install the VM created by Vagrant. Instructions can be found here.
- Download the newsdata.sql file found here to the VM. Once the file is downloaded and uncompressed, import it into the news database with the command
psql -d news -f newsdata.sql
This calls the PostgreSQL command line program, connects to the database named news (which has already been set up), and runs the SQL statements in the file newsdata.sql.
- Download the LogsAnalysisProject.py file to the same location in the VM as your newsdata.sql file.
- Using your terminal, navigate to the directory in the VM where your sql and py files are located. Run:
python LogsAnalysisProject.py
This prints the answers to the three questions above.