In this notebook, I explore the Goodreads-book dataset from Kaggle and store its data in a PostgreSQL database. The notebook covers the following steps:
- Loading the dataset
- Performing preliminary analysis to understand the dataset
- Mapping out an Entity-Relationship Model to store the data in a database
- Creating the database tables
- Populating the database tables from the dataset using Python
- Running complex SQL queries to extract insights from the data
To use this notebook, make sure that an instance of the PostgreSQL server is running. You can find more information on downloading Postgres here. Once it is installed, make sure your instance is listening on localhost at port 5432. If your Postgres server is set up differently, update the configuration file db_cred.json to match. This step is important: without the correct connection settings, the notebook won't be able to communicate with the database.
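As a rough illustration of how the connection settings might be read, here is a minimal sketch. The key names (`host`, `port`, `dbname`, `user`, `password`) and the helpers `load_db_credentials` and `connect` are assumptions made for this example, since the actual layout of db_cred.json is not shown here. It also assumes the `psycopg2` driver, which may not be the one the notebook uses.

```python
import json
from pathlib import Path

# Assumed defaults: a local server on PostgreSQL's standard port.
DEFAULTS = {"host": "localhost", "port": 5432}


def load_db_credentials(path="db_cred.json"):
    """Read connection settings from a JSON file.

    Hypothetical helper: values in the file override the defaults above.
    """
    with Path(path).open() as f:
        creds = json.load(f)
    return {**DEFAULTS, **creds}


def connect(creds):
    """Open a database connection from the loaded settings.

    psycopg2 is imported here, not at module level, so the loader
    above can be used even when the driver is not installed.
    """
    import psycopg2

    return psycopg2.connect(
        host=creds["host"],
        port=creds["port"],
        dbname=creds["dbname"],      # assumed key name
        user=creds["user"],          # assumed key name
        password=creds["password"],  # assumed key name
    )
```

With the server running, `conn = connect(load_db_credentials())` should return an open connection. If it raises a connection error, the host, port, or credentials in db_cred.json are the first things to check.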