# Goodreads - Rate A Read With GoodReads

## Getting Started

They say <font color=blue>“*If you want to be a writer, you must do two things above all others: read a lot and write a lot.*”</font>
<br><br>That may be the primary ingredient, but in this age of science, some data and some analysis can work wonders. A writer puts a lot of effort into writing a book, so it is a good idea to check the market for existing books first. What makes a book popular, and which attributes of a book earn it a good rating?
<br><br>
In this project, I want to predict a book's popularity or acceptability based on features of other books on the same subject. I hope a prospective writer can use the tool to get an idea of a book's prospects before it is launched. So, to all the authors out there: "Pour your heart out and create a classic, but let's do it scientifically!"

## Step 1: Prepare Data


To access the Goodreads API, a **developer key** is required first. The key can be generated by creating a Goodreads account at https://www.goodreads.com/api.

All the credentials are saved in a `.pkl` file. Here, the credentials are loaded from that file and used to create a client instance for querying Goodreads.
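
The notebook does not show how `goodreads_credentials.pkl` was created. Below is a minimal sketch, assuming the file holds a plain dictionary with the `key` and `secret` entries that `auth_access` reads. The helper names `save_credentials` and `load_credentials` are hypothetical.

```python
import pickle
from pathlib import Path


def save_credentials(key, secret, path="goodreads_credentials.pkl"):
    """Store the API key and secret as a pickled dict (hypothetical helper)."""
    credentials = {"key": key, "secret": secret}
    with open(path, "wb") as handle:
        pickle.dump(credentials, handle)
    return Path(path)


def load_credentials(path="goodreads_credentials.pkl"):
    """Read the pickled dict back; mirrors the lookup done in auth_access."""
    with open(path, "rb") as handle:
        return pickle.load(handle)
```

Calling `save_credentials("<your key>", "<your secret>")` once would produce the file that the authentication function below expects.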

#### Function to Authenticate GoodReads Access

In [1]:
def auth_access():
    """Create a Goodreads client from the pickled credentials."""
    # Load the credentials; the with-block closes the file afterwards
    with open("goodreads_credentials.pkl", "rb") as goodreads_credentials:
        goodreads = pickle.load(goodreads_credentials)

    # Pass the key and secret to create a client instance
    key = goodreads['key']
    secret = goodreads['secret']
    gc = client.GoodreadsClient(key, secret)
    return gc

#### Create Function create_author_csv to create a csv file with author details for each book

Below, I use the goodreads client method **"find_author"** to fetch the details of each author.
<br>For each author, I fetch the following details:
    * Author ID
    * Author Name
    * Date of Birth
    * Date of Death (if any)
    * Fans Count
    * Gender
    * Hometown
    * Works Count
    * About the Author
    
   Finally, all the above details are saved in a csv file.

In [41]:
def create_author_csv(authors,keyword):
    """A Function to create a csv file with the author details for each book"""
    # newline='' lets the csv module control line endings (avoids blank rows on Windows)
    with open(keyword.replace(" ","_")+'_author.csv', 'w', newline='') as csvfile:
        # Add the header
        fields = ['book_id','author_id','author_name','born_at','died_at',
                  'fans_count','gender','hometown','works_count','about' ]
        csv_write = csv.DictWriter(csvfile, fieldnames=fields)
        csv_write.writeheader()
        gc = auth_access()
        counter = 0
        # Each entry of authors is [book_id, list of authors of that book]
        for book_id, auth in authors:
            try:
                for name in auth:
                    author1 = gc.find_author(name)
                    # Write one record per author in the csv file
                    csv_write.writerow({'book_id': book_id,
                                        'author_id': author1.gid,
                                        'author_name': author1.name,
                                        'born_at': author1.born_at,
                                        'died_at': author1.died_at,
                                        'fans_count': author1.fans_count(),
                                        'gender': author1.gender,
                                        'hometown': author1.hometown,
                                        'works_count': author1.works_count,
                                        'about': author1.about
                                       })
                    # Count only the rows that were actually written
                    counter += 1
            except Exception as e:
                # A failed lookup skips the remaining authors of this book
                pass
                #print(str(e))
        print("Authors added to the csv file= ", counter)
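
The `except` branch above discards every error, so a failed lookup leaves no trace in the output. Here is a sketch of one alternative using the standard `logging` module. The names `fetch_rows` and `fake_lookup` are hypothetical and not part of the notebook or the goodreads package. In the notebook, `lookup` would presumably wrap `gc.find_author`.

```python
import logging

logging.basicConfig(level=logging.WARNING)
logger = logging.getLogger("goodreads_export")


def fetch_rows(names, lookup):
    """Yield one row per name, logging failed lookups instead of hiding them.

    `lookup` is any callable that maps a name to a dict of fields
    (an assumption made so the sketch can run without the API).
    """
    for name in names:
        try:
            yield lookup(name)
        except Exception as exc:  # broad on purpose: API errors vary
            logger.warning("Skipping %r: %s", name, exc)


def fake_lookup(name):
    """Offline stand-in for the API call."""
    if not name:
        raise ValueError("empty author name")
    return {"author_name": name}


rows = list(fetch_rows(["Ransom Riggs", "", "Adrian Tchaikovsky"], fake_lookup))
print(len(rows))  # 2
```

Because each name is handled on its own, one failed lookup no longer drops the other authors of the same book.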

#### Create Function create_book_csv to create a csv file with book details 

In this function, I create a CSV file with the book details.
The following features are captured:
    * book_id
    * isbn
    * book title
    * author
    * total_pages
    * average_rating
    * ratings_count
    * rating_dist
    * reviews_count
    * popular shelves
    * publisher
    * book_description
    
 I call the function **create_author_csv** at the end to create a CSV file with the author details for each book.

In [24]:
def create_book_csv(books,keyword):
    """A Function to create a csv file with the results for books"""
    # newline='' lets the csv module control line endings (avoids blank rows on Windows)
    with open(keyword.replace(" ","_")+'_books.csv', 'w', newline='') as csvfile:
        # Add the header
        fields = ['book_id','isbn','title','author','total_pages',
                  'average_rating','ratings_count','rating_dist',
                  'reviews_count','popular shelves','publisher',#'buy_links',
                  'book_description' ]
        csv_write = csv.DictWriter(csvfile, fieldnames=fields)
        csv_write.writeheader()
        counter = 0
        authors = []
        # Write the details for each book
        for book in books:
            try:
                # Write the record in the csv file
                csv_write.writerow({'book_id': book.gid,
                                    'isbn': book.isbn,
                                    'title': book.title,
                                    'author': book.authors,
                                    'total_pages': book.num_pages,
                                    'average_rating': book.average_rating,
                                    'ratings_count': book.ratings_count,
                                    'rating_dist': book.rating_dist,
                                    'reviews_count': book.text_reviews_count,
                                    'popular shelves': book.popular_shelves,
                                    'publisher': book.publisher,
                                    #'buy_links':book.buy_links,
                                    'book_description': book.description
                                   })
                # Count the book and keep its authors only after a successful write
                counter += 1
                authors.append([book.gid, book.authors])
            except Exception as e:
                pass
                #print(str(e))
        # Create the matching author file for the books written above
        create_author_csv(authors,keyword)
        #print(authors)
        print("Books added to the csv file= ", counter)
    

#### Create a function search_by_keywords to search books by a keyword

Books are searched on Goodreads with the client method **search_books** for a search keyword. 
<br>A CSV file is then created from the search results by calling the function **create_book_csv**.

In [21]:
def search_by_keywords(keyword):
    """A Function to search books from goodreads by a keyword"""
    # Authenticate access
    gc = auth_access()
    # Initialize a book list
    books = []
    #for i in range(101):
    for i in range(1):
        try:
            # Call goodreads API to search for the keyword passed as parameter
            books.extend(gc.search_books(q=keyword, page=i+1, search_field='all'))
        except TypeError as exc:
            # print("type error: "+ str(exc))
            pass
    print("No. of Books found for the search keyword= ", len(books))
    create_book_csv(books,keyword)
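
The loop above requests a single page (the `range(101)` variant is commented out). Below is a sketch of collecting several pages until one comes back empty. Here `fetch_page` is a hypothetical callable standing in for `gc.search_books(q=keyword, page=n, search_field='all')`. Treating an empty page or a `TypeError` as the end of the results is an assumption, based only on the `except TypeError` clause in the notebook.

```python
def collect_pages(fetch_page, max_pages=5):
    """Gather results from pages 1..max_pages, stopping at the first empty page."""
    results = []
    for page in range(1, max_pages + 1):
        try:
            batch = fetch_page(page)
        except TypeError:
            # Assumption: the client raises TypeError when a page has no results
            break
        if not batch:
            break
        results.extend(batch)
    return results


def fake_fetch(page):
    """Offline stand-in: two pages of two titles each, then nothing."""
    pages = {1: ["book-a", "book-b"], 2: ["book-c", "book-d"]}
    return pages.get(page, [])


print(collect_pages(fake_fetch))  # ['book-a', 'book-b', 'book-c', 'book-d']
```

Capping the loop with `max_pages` keeps the number of API requests bounded even if the stop condition never triggers.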

#### Import Libraries

In [11]:
import pickle
import collections
import csv
# Import goodreads api
from goodreads import client
from goodreads.client import GoodreadsClient
from goodreads.book import GoodreadsBook
from goodreads.author import GoodreadsAuthor
from goodreads.shelf import GoodreadsShelf

#### Create the main function to pass the genre

In [42]:
if __name__ == '__main__':
    while True:
        keyword = input("Enter the search string (or quit to stop): ").lower()
        if(keyword == "quit"):
            break
        else:
            search_by_keywords(keyword)


Enter the search string (or quit to stop): children
No. of Books found for the search keyword=  20
Authors added to the csv file=  23
Books added to the csv file=  20
Enter the search string (or quit to stop): quit


## Step 2: Data Wrangling

In [1]:
import pandas as pd

In [32]:
books = pd.read_csv('children_books.csv')
books.shape

(20, 12)

In [33]:
books.head()

| | book_id | isbn | title | author | total_pages | average_rating | ratings_count | rating_dist | reviews_count | popular shelves | publisher | book_description |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 9460487 | | Miss Peregrine's Home for Peculiar Children (M... | [Ransom Riggs] | 352.0 | 3.91 | 766003 | 5:281772\|4:289374\|3:193802\|2:55411\|1:18518\|tot... | 49394 | [to-read, currently-reading, fantasy, young-ad... | Quirk | Alternate Cover edition for ISBN \<a href="http... |
| 1 | 297249 | 807508500.0 | The Boxcar Children (The Boxcar Children, #1) | [Gertrude Chandler Warner, L. Kate Deal] | 160.0 | 4.1 | 105725 | 5:49420\|4:31479\|3:20840\|2:5086\|1:2464\|total:10... | 2422 | [to-read, currently-reading, childrens, fictio... | Albert Whitman Company | The Aldens begin their adventure by making a h... |
| 2 | 25499718 | 1447273000.0 | Children of Time (Children of Time #1) | [Adrian Tchaikovsky] | 600.0 | 4.29 | 18797 | 5:15907\|4:10905\|3:3823\|2:911\|1:309\|total:31855 | 2382 | [to-read, currently-reading, science-fiction, ... | PanMacmillan | A race for survival among the stars... Humanit... |
| 3 | 23164983 | 1594747000.0 | Hollow City (Miss Peregrine's Peculiar Childre... | [Ransom Riggs] | 428.0 | 4.08 | 97943 | 5:59004\|4:67036\|3:31633\|2:5706\|1:1185\|total:16... | 8417 | [to-read, currently-reading, fantasy, young-ad... | Quirk Books | This second novel begins in 1940, immediately ... |
| 4 | 597790 | 7246226.0 | The Children of Húrin | [J.R.R. Tolkien, Christopher Tolkien, Alan Lee] | 313.0 | 3.97 | 45744 | 5:18932\|4:21099\|3:12517\|2:3059\|1:723\|total:56330 | 1989 | [to-read, fantasy, currently-reading, fiction,... | HarperCollins | \<blockquote>\n \<b>Tolkien fans are sure to tr... |
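
The `rating_dist` column packs the star counts into one string such as `5:15907|4:10905|...|total:31855`. Here is a sketch of splitting it into numeric fields, assuming every entry follows that `label:count` pattern separated by `|`. The helper `parse_rating_dist` is hypothetical.

```python
def parse_rating_dist(text):
    """Turn '5:10|4:5|total:15' into {'5': 10, '4': 5, 'total': 15}."""
    counts = {}
    for part in text.split("|"):
        label, _, value = part.partition(":")
        counts[label.strip()] = int(value)
    return counts


# Value copied from the "Children of Time" row shown above
sample = "5:15907|4:10905|3:3823|2:911|1:309|total:31855"
parsed = parse_rating_dist(sample)
print(parsed["total"])  # 31855
```

With pandas, `books['rating_dist'].apply(parse_rating_dist).apply(pd.Series)` would expand the result into one column per label.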


In [34]:
authors = pd.read_csv('children_author.csv')
authors.shape

(23, 10)

In [35]:
authors.head()

| | book_id | author_id | author_name | born_at | died_at | fans_count | gender | hometown | works_count | about |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 9460487 | 3046613 | Ransom Riggs | 1979/02/03 | | OrderedDict([('@type', 'integer'), ('#text', '... | male | | 37 | Hi, I'm Ransom, and I like to tell stories. So... |
| 1 | 297249 | 10665 | Gertrude Chandler Warner | 1890/04/16 | 1979/08/30 | OrderedDict([('@type', 'integer'), ('#text', '... | | Putnam, Connecticut | 349 | Gertrude Chandler Warner was born in Putnam, C... |
| 2 | 297249 | 10665 | Gertrude Chandler Warner | 1890/04/16 | 1979/08/30 | OrderedDict([('@type', 'integer'), ('#text', '... | | Putnam, Connecticut | 349 | Gertrude Chandler Warner was born in Putnam, C... |
| 3 | 25499718 | 1445909 | Adrian Tchaikovsky | | | OrderedDict([('@type', 'integer'), ('#text', '... | male | Lincolnshire | 107 | ADRIAN TCHAIKOVSKY was born in Lincolnshire an... |
| 4 | 23164983 | 3046613 | Ransom Riggs | 1979/02/03 | | OrderedDict([('@type', 'integer'), ('#text', '... | male | | 37 | Hi, I'm Ransom, and I like to tell stories. So... |
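
The `fans_count` column was written to the CSV as the string form of an `OrderedDict` rather than as a number. Below is a sketch for recovering the integer with a regular expression, assuming the truncated cells continue as `('#text', '<digits>')])`. The sample value is made up for illustration, and `extract_fans_count` is a hypothetical helper.

```python
import re

# Captures the digits that follow the '#text' key in the stringified OrderedDict
FANS_PATTERN = re.compile(r"'#text',\s*'(\d+)'")


def extract_fans_count(cell):
    """Return the integer after '#text', or None when the cell does not match."""
    match = FANS_PATTERN.search(str(cell))
    return int(match.group(1)) if match else None


# Made-up cell in the assumed format, not a value from the dataset
sample_cell = "OrderedDict([('@type', 'integer'), ('#text', '1234')])"
print(extract_fans_count(sample_cell))  # 1234
```

Applied as `authors['fans_count'].apply(extract_fans_count)`, this would yield a numeric column, with `None` wherever the cell does not match the assumed format.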


In [38]:
data= pd.merge(books, authors)
data.shape

(23, 21)
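
`pd.merge(books, authors)` joins on every column name the two frames share, which here is only `book_id`. Each book row is repeated once per matching author row, so 20 books and 23 author rows give 23 merged rows with 12 + 10 - 1 = 21 columns. Here is a small sketch with made-up frames that names the key explicitly. The `_demo` variable names are hypothetical and chosen so they do not overwrite the notebook's own frames.

```python
import pandas as pd

# Tiny made-up frames that mimic the shape of the two CSV files
books_demo = pd.DataFrame({"book_id": [1, 2], "title": ["A", "B"]})
authors_demo = pd.DataFrame(
    {"book_id": [1, 2, 2], "author_name": ["X", "Y", "Z"]}
)

# Name the join key instead of relying on shared column names
merged_demo = pd.merge(books_demo, authors_demo, on="book_id", how="inner")
print(merged_demo.shape)  # (3, 3)
```

Spelling out `on="book_id"` guards against an accidental join on some other column that the two files might share in the future.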

In [39]:
data.head()

| | book_id | isbn | title | author | total_pages | average_rating | ratings_count | rating_dist | reviews_count | popular shelves | ... | book_description | author_id | author_name | born_at | died_at | fans_count | gender | hometown | works_count | about |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 9460487 | | Miss Peregrine's Home for Peculiar Children (M... | [Ransom Riggs] | 352.0 | 3.91 | 766003 | 5:281772\|4:289374\|3:193802\|2:55411\|1:18518\|tot... | 49394 | [to-read, currently-reading, fantasy, young-ad... | ... | Alternate Cover edition for ISBN \<a href="http... | 3046613 | Ransom Riggs | 1979/02/03 | | OrderedDict([('@type', 'integer'), ('#text', '... | male | | 37 | Hi, I'm Ransom, and I like to tell stories. So... |
| 1 | 297249 | 807508500.0 | The Boxcar Children (The Boxcar Children, #1) | [Gertrude Chandler Warner, L. Kate Deal] | 160.0 | 4.1 | 105725 | 5:49420\|4:31479\|3:20840\|2:5086\|1:2464\|total:10... | 2422 | [to-read, currently-reading, childrens, fictio... | ... | The Aldens begin their adventure by making a h... | 10665 | Gertrude Chandler Warner | 1890/04/16 | 1979/08/30 | OrderedDict([('@type', 'integer'), ('#text', '... | | Putnam, Connecticut | 349 | Gertrude Chandler Warner was born in Putnam, C... |
| 2 | 297249 | 807508500.0 | The Boxcar Children (The Boxcar Children, #1) | [Gertrude Chandler Warner, L. Kate Deal] | 160.0 | 4.1 | 105725 | 5:49420\|4:31479\|3:20840\|2:5086\|1:2464\|total:10... | 2422 | [to-read, currently-reading, childrens, fictio... | ... | The Aldens begin their adventure by making a h... | 10665 | Gertrude Chandler Warner | 1890/04/16 | 1979/08/30 | OrderedDict([('@type', 'integer'), ('#text', '... | | Putnam, Connecticut | 349 | Gertrude Chandler Warner was born in Putnam, C... |
| 3 | 25499718 | 1447273000.0 | Children of Time (Children of Time #1) | [Adrian Tchaikovsky] | 600.0 | 4.29 | 18797 | 5:15907\|4:10905\|3:3823\|2:911\|1:309\|total:31855 | 2382 | [to-read, currently-reading, science-fiction, ... | ... | A race for survival among the stars... Humanit... | 1445909 | Adrian Tchaikovsky | | | OrderedDict([('@type', 'integer'), ('#text', '... | male | Lincolnshire | 107 | ADRIAN TCHAIKOVSKY was born in Lincolnshire an... |
| 4 | 23164983 | 1594747000.0 | Hollow City (Miss Peregrine's Peculiar Childre... | [Ransom Riggs] | 428.0 | 4.08 | 97943 | 5:59004\|4:67036\|3:31633\|2:5706\|1:1185\|total:16... | 8417 | [to-read, currently-reading, fantasy, young-ad... | ... | This second novel begins in 1940, immediately ... | 3046613 | Ransom Riggs | 1979/02/03 | | OrderedDict([('@type', 'integer'), ('#text', '... | male | | 37 | Hi, I'm Ransom, and I like to tell stories. So... |
