# GoodReads

### Instructions

* Read in the GoodReads CSV using Pandas

* Remove unnecessary columns from the DataFrame so that only the following columns remain: `isbn`, `original_publication_year`, `original_title`, `authors`, `ratings_1`, `ratings_2`, `ratings_3`, `ratings_4`, and `ratings_5`

* Rename the columns to the following: `ISBN`, `Publication Year`, `Original Title`, `Authors`, `One Star Reviews`, `Two Star Reviews`, `Three Star Reviews`, `Four Star Reviews`, and `Five Star Reviews`

* Write the DataFrame into a new CSV file

### Hints

* The source CSV file is UTF-8 encoded. Reading it with a different encoding can produce garbled characters in the dataset.
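
A small illustration of that failure mode, using only the standard library. The title string is a made-up example rather than a row taken from `books.csv`, and Latin-1 is just one assumed "wrong" encoding among many:

```python
# Hypothetical title containing a non-ASCII character (not taken from books.csv)
title = "Les Misérables"

# Encode as UTF-8, the encoding the hint says the file uses
raw_bytes = title.encode("utf-8")

# Decoding with the matching encoding round-trips cleanly
decoded_ok = raw_bytes.decode("utf-8")

# Decoding the same bytes as Latin-1 does not raise, but garbles the accent
decoded_wrong = raw_bytes.decode("latin-1")

print(decoded_ok)     # Les Misérables
print(decoded_wrong)  # Les MisÃ©rables
```

The mismatch produces no error, which is why it is worth passing `encoding="utf-8"` explicitly when reading the file.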


#### Import Dependencies

In [None]:
import pandas as pd
import os

#### Import the books.csv file as a DataFrame

First, create a variable that holds the path to `books.csv`.

In [None]:
# Make a reference to the books.csv file path
csv_path = os.path.join("Resources", "books.csv")

# Import the books.csv file as a DataFrame
books_df = pd.read_csv(csv_path, encoding="utf-8")
books_df.head()

#### Remove unnecessary columns from the DataFrame and save the new DataFrame

Only keep: *isbn*, *original_publication_year*, *original_title*, *authors*, *ratings_1*, *ratings_2*, *ratings_3*, *ratings_4*, *ratings_5*
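
As a rough sketch of how this kind of selection behaves, the snippet below uses a tiny invented DataFrame (`toy_df`, with placeholder values) in place of `books_df`. Passing a list of names inside the brackets returns only those columns, and a name that does not exist raises a `KeyError`:

```python
import pandas as pd

# Toy stand-in for books_df; the rows are invented for illustration
toy_df = pd.DataFrame({
    "book_id": [1, 2],
    "isbn": ["1111111111", "2222222222"],
    "authors": ["Author A", "Author B"],
    "ratings_1": [10, 20],
})

# A list of names returns a DataFrame with just those columns, in the order listed
subset = toy_df[["isbn", "authors", "ratings_1"]]
print(list(subset.columns))

# A name that is not in the DataFrame raises KeyError
try:
    toy_df[["isbn", "original_title"]]
    missing_raised = False
except KeyError:
    missing_raised = True
print(missing_raised)
```

If the real selection fails with a `KeyError`, comparing the requested names against `books_df.columns` is usually the quickest way to spot the mismatch.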

In [None]:
reduced_df = books_df[["isbn", "original_publication_year", "original_title", "authors",
                       "ratings_1", "ratings_2", "ratings_3", "ratings_4", "ratings_5"]]
reduced_df.head()

#### Rename the headers to be more explanatory
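
One thing to watch for when renaming: by default `rename` skips any key in the mapping that does not match an existing column, so a typo goes unnoticed. The sketch below shows this on an invented one-row DataFrame (`toy_df`), assuming a pandas version that supports the `errors` argument of `rename`:

```python
import pandas as pd

# Toy stand-in for reduced_df; the values are invented for illustration
toy_df = pd.DataFrame({"isbn": ["1111111111"], "ratings_1": [10]})

# "rating_1" is a deliberate typo: by default rename() skips keys it cannot find
silent = toy_df.rename(columns={"isbn": "ISBN", "rating_1": "One Star Reviews"})
print(list(silent.columns))  # ['ISBN', 'ratings_1']

# errors="raise" turns the same typo into a KeyError
try:
    toy_df.rename(columns={"rating_1": "One Star Reviews"}, errors="raise")
    typo_caught = False
except KeyError:
    typo_caught = True
print(typo_caught)
```

Checking the output of `.head()` after renaming, as the cell below does, catches the same problem by eye.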

In [None]:
renamed_df = reduced_df.rename(columns={"isbn": "ISBN",
                                        "original_title": "Original Title",
                                        "original_publication_year": "Publication Year",
                                        "authors": "Authors",
                                        "ratings_1": "One Star Reviews",
                                        "ratings_2": "Two Star Reviews",
                                        "ratings_3": "Three Star Reviews",
                                        "ratings_4": "Four Star Reviews",
                                        "ratings_5": "Five Star Reviews"})
renamed_df.head()

#### Write the renamed DataFrame to a new CSV file
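
To see what `index=False` changes, the sketch below writes an invented two-row DataFrame (`toy_df`) to in-memory buffers instead of to disk and compares the header lines:

```python
import io
import pandas as pd

# Toy stand-in for renamed_df; the values are invented for illustration
toy_df = pd.DataFrame({"ISBN": ["1111111111", "2222222222"],
                       "Five Star Reviews": [10, 20]})

# Default behavior: the row index is written as an extra, unnamed first column
with_index = io.StringIO()
toy_df.to_csv(with_index)

# index=False leaves the row index out of the file
without_index = io.StringIO()
toy_df.to_csv(without_index, index=False)

header_with = with_index.getvalue().splitlines()[0]
header_without = without_index.getvalue().splitlines()[0]
print(header_with)     # ,ISBN,Five Star Reviews
print(header_without)  # ISBN,Five Star Reviews
```

Leaving the index out keeps the output limited to the nine renamed columns.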

In [None]:
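# Build the output path; create the "Output" folder first in case it does not exist yet
output_file = os.path.join("Output", "books_clean.csv")
os.makedirs("Output", exist_ok=True)
renamed_df.to_csv(output_file, encoding="utf-8", index=False, header=True)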