# DataFrame Analysis Exercise

**Import the `bestsellers.csv` dataset and use it to answer the following questions:**

* Find the lowest User Rating in the DF
* Find the highest Price in the DF
* What is the average User Rating?
* What is the average User Rating of the first 5 rows?
* Which User Rating appeared most often?
* What is the total (sum) of all the values in the Reviews column?
* How many different authors are featured in the dataset?
* Which author wrote the most books on the list? How many did they write?

In [1]:
import numpy as np
import pandas as pd

In [2]:
df = pd.read_csv('../data/bestsellers.csv')

In [3]:
df.head()

| | Name | Author | User Rating | Reviews | Price | Year | Genre |
|---|---|---|---|---|---|---|---|
| 0 | 10-Day Green Smoothie Cleanse | JJ Smith | 4.7 | 17350 | 8 | 2016 | Non Fiction |
| 1 | 11/22/63: A Novel | Stephen King | 4.6 | 2052 | 22 | 2011 | Fiction |
| 2 | 12 Rules for Life: An Antidote to Chaos | Jordan B. Peterson | 4.7 | 18979 | 15 | 2018 | Non Fiction |
| 3 | 1984 (Signet Classics) | George Orwell | 4.7 | 21424 | 6 | 2017 | Fiction |
| 4 | 5,000 Awesome Facts (About Everything!) (Natio... | National Geographic Kids | 4.8 | 7665 | 12 | 2019 | Non Fiction |


#### Find the lowest User Rating in the DF

In [5]:
df['User Rating'].sort_values().unique()

array([3.3, 3.6, 3.8, 3.9, 4. , 4.1, 4.2, 4.3, 4.4, 4.5, 4.6, 4.7, 4.8,
       4.9])

The lowest User Rating is 3.3.

In [6]:
df.min()

Name           10-Day Green Smoothie Cleanse
Author                      Abraham Verghese
User Rating                              3.3
Reviews                                   37
Price                                      0
Year                                    2009
Genre                                Fiction
dtype: object
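Calling `min()` on a single column returns the answer directly, without reading through sorted unique values or a whole-frame summary. The sketch below assumes a small stand-in frame named `sample`, built from the five rows shown by `df.head()`, so its minimum differs from the full dataset's 3.3. `idxmin()` is used to recover the row that holds the minimum.

```python
import pandas as pd

# Stand-in data: the five rows displayed by df.head(), not the full CSV.
sample = pd.DataFrame({
    "Name": [
        "10-Day Green Smoothie Cleanse",
        "11/22/63: A Novel",
        "12 Rules for Life: An Antidote to Chaos",
        "1984 (Signet Classics)",
        "5,000 Awesome Facts (About Everything!) (Natio...",
    ],
    "User Rating": [4.7, 4.6, 4.7, 4.7, 4.8],
})

# Minimum of one column, returned as a scalar.
lowest_rating = sample["User Rating"].min()

# idxmin() gives the index label of the first row holding that minimum.
lowest_row = sample.loc[sample["User Rating"].idxmin()]

print(lowest_rating)
print(lowest_row["Name"])
```

On the real DataFrame the same pattern would be `df['User Rating'].min()`.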

#### Find the highest Price in the DF

In [7]:
df.max()

Name           You Are a Badass: How to Stop Doubting Your Gr...
Author                                              Zhi Gang Sha
User Rating                                                  4.9
Reviews                                                    87841
Price                                                        105
Year                                                        2019
Genre                                                Non Fiction
dtype: object

In [11]:
df['Price'].sort_values(ascending=False).unique()

array([105,  82,  54,  53,  52,  46,  42,  40,  39,  36,  32,  30,  28,
        27,  25,  24,  23,  22,  21,  20,  19,  18,  17,  16,  15,  14,
        13,  12,  11,  10,   9,   8,   7,   6,   5,   4,   3,   2,   1,
         0], dtype=int64)
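The highest price can also be read from the column itself, and `nlargest()` shows which books carry the top prices. This is a minimal sketch on a stand-in frame named `sample` that holds only the five `df.head()` rows, so the values are not those of the full dataset.

```python
import pandas as pd

# Stand-in data: names and prices from the five rows shown by df.head().
sample = pd.DataFrame({
    "Name": [
        "10-Day Green Smoothie Cleanse",
        "11/22/63: A Novel",
        "12 Rules for Life: An Antidote to Chaos",
        "1984 (Signet Classics)",
        "5,000 Awesome Facts (About Everything!) (Natio...",
    ],
    "Price": [8, 22, 15, 6, 12],
})

# Scalar maximum of the Price column.
highest_price = sample["Price"].max()

# The two most expensive rows, ordered from highest to lowest price.
top2 = sample.nlargest(2, "Price")

print(highest_price)
print(top2)
```

On the real DataFrame, `df['Price'].max()` answers the question in one call.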

#### What is the average User Rating?

In [17]:
df.describe().loc['mean', 'User Rating']

4.618363636363637

In [19]:
df['User Rating'].mean()

4.618363636363637

#### What is the average User Rating of the first 5 rows?

In [23]:
df.head()['User Rating'].mean()

4.7

#### Which User Rating appeared most often?

In [40]:
df['User Rating'].value_counts()

4.8    127
4.7    108
4.6    105
4.5     60
4.9     52
4.4     38
4.3     25
4.0     14
4.2      8
4.1      6
3.9      3
3.8      2
3.6      1
3.3      1
Name: User Rating, dtype: int64

In [29]:
df.mode()

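| | Name | Author | User Rating | Reviews | Price | Year | Genre |
|---|---|---|---|---|---|---|---|
| 0 | Publication Manual of the American Psychologic... | Jeff Kinney | 4.8 | 8580.0 | 8.0 | 2009 | Non Fiction |
| 1 | NaN | NaN | NaN | NaN | NaN | 2010 | NaN |
| 2 | NaN | NaN | NaN | NaN | NaN | 2011 | NaN |
| 3 | NaN | NaN | NaN | NaN | NaN | 2012 | NaN |
| 4 | NaN | NaN | NaN | NaN | NaN | 2013 | NaN |
| 5 | NaN | NaN | NaN | NaN | NaN | 2014 | NaN |
| 6 | NaN | NaN | NaN | NaN | NaN | 2015 | NaN |
| 7 | NaN | NaN | NaN | NaN | NaN | 2016 | NaN |
| 8 | NaN | NaN | NaN | NaN | NaN | 2017 | NaN |
| 9 | NaN | NaN | NaN | NaN | NaN | 2018 | NaN |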


4.8 is the most frequent User Rating, appearing 127 times.
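Calling `mode()` on the whole DataFrame computes a mode for every column, which is why the output is padded with missing values wherever a column has fewer modes than `Year`. Restricting the call to one column gives a tighter answer. The sketch below uses a hypothetical `ratings` Series (invented values, not the real column) to compare `Series.mode()` with `value_counts()`.

```python
import pandas as pd

# Hypothetical ratings for illustration only.
ratings = pd.Series([4.8, 4.7, 4.8, 4.6, 4.8, 4.7], name="User Rating")

# mode() returns a Series because several values can tie for most frequent.
modes = ratings.mode()

# value_counts() tallies each distinct value; idxmax() picks the label
# with the largest tally and max() gives the tally itself.
counts = ratings.value_counts()
most_common = counts.idxmax()
occurrences = counts.max()

print(modes.tolist())
print(most_common, occurrences)
```

On the real DataFrame this would be `df['User Rating'].mode()`.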

#### What is the total (sum) of all the values in the Reviews column?

In [43]:
df['Reviews'].sum()

6574305

#### How many different authors are featured in the dataset?

In [31]:
df.columns

Index(['Name', 'Author', 'User Rating', 'Reviews', 'Price', 'Year', 'Genre'], dtype='object')

In [38]:
df['Author'].nunique()

248

248 different authors are featured in the dataset.

In [48]:
df.describe(include='object')

| | Name | Author | Genre |
|---|---|---|---|
| count | 550 | 550 | 550 |
| unique | 351 | 248 | 2 |
| top | Publication Manual of the American Psychologic... | Jeff Kinney | Non Fiction |
| freq | 10 | 12 | 310 |
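One caveat when counting distinct authors: `nunique()` compares strings exactly and skips missing values by default, so stray whitespace can inflate the count. Whether the real dataset has such inconsistencies is not checked here; the sketch below uses a hypothetical `authors` Series only to show how the count can shift.

```python
import pandas as pd

# Hypothetical author column with a trailing space and a missing value.
authors = pd.Series(["Jeff Kinney", "Jeff Kinney ", "Rick Riordan", None])

# Default behaviour: missing values are excluded, and the two
# "Jeff Kinney" spellings count as different authors.
raw = authors.nunique()

# dropna=False counts the missing value as its own category.
with_missing = authors.nunique(dropna=False)

# Stripping whitespace first merges the two spellings.
cleaned = authors.str.strip().nunique()

print(raw, with_missing, cleaned)
```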


#### Which author wrote the most books on the list? How many did they write?

In [50]:
df['Author'].value_counts()

Jeff Kinney                           12
Gary Chapman                          11
Rick Riordan                          11
Suzanne Collins                       11
American Psychological Association    10
                                      ..
Keith Richards                         1
Chris Cleave                           1
Alice Schertle                         1
Celeste Ng                             1
Adam Gasiewski                         1
Name: Author, Length: 248, dtype: int64

Jeff Kinney wrote the most books on the list, with 12 entries.
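The author and the count can both be pulled out of `value_counts()` programmatically instead of being read off the printed output. The sketch below is a minimal example on a hypothetical `authors` Series with invented counts; it also collects every author tied at the top, since `idxmax()` alone reports only one label.

```python
import pandas as pd

# Hypothetical author column; the counts here are invented.
authors = pd.Series(
    ["Jeff Kinney"] * 3
    + ["Rick Riordan"] * 2
    + ["Gary Chapman"] * 2
    + ["Celeste Ng"],
    name="Author",
)

counts = authors.value_counts()

# Label with the highest count, and the count itself.
top_author = counts.idxmax()
top_count = counts.max()

# All authors sharing the highest count, in case of a tie.
tied = counts[counts == top_count].index.tolist()

print(top_author, top_count)
print(tied)
```

On the real DataFrame the same steps would start from `df['Author'].value_counts()`.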