## Final Project: Book Recommender

### Phase 1: Web-Scraping

For the scraping requirement of this project, I decided to use Selenium. Selenium drives a real browser, so each page is loaded and rendered the way a visitor would see it before I save its HTML for parsing.
<br>
<br>
The data source for the web-scraping portion of this project is the Barnes & Noble Top 100 Bestsellers list.

##### Importing the required libraries

In [1]:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.common.keys import Keys
import time

import requests
from bs4 import BeautifulSoup as bs4
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import warnings
warnings.filterwarnings('ignore')

In [2]:
def click(x):
    button = driver.find_element(By.XPATH, x)
    time.sleep(2)
    button.click()
    

The B&N Top 100 Bestsellers list is split across 5 pages, so the code below builds the URL of each of the 5 pages to scrape.

In [3]:
url_base = "https://www.barnesandnoble.com/b/books/_/N-1fZ29Z8q8?Nrpp=20&page="
pages = []
for i in range(1,6):
    page = url_base + str(i)
    pages.append(page)
pages

['https://www.barnesandnoble.com/b/books/_/N-1fZ29Z8q8?Nrpp=20&page=1',
 'https://www.barnesandnoble.com/b/books/_/N-1fZ29Z8q8?Nrpp=20&page=2',
 'https://www.barnesandnoble.com/b/books/_/N-1fZ29Z8q8?Nrpp=20&page=3',
 'https://www.barnesandnoble.com/b/books/_/N-1fZ29Z8q8?Nrpp=20&page=4',
 'https://www.barnesandnoble.com/b/books/_/N-1fZ29Z8q8?Nrpp=20&page=5']
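
The same five URLs can also be produced by a small function, which makes the page count and page size explicit. This is only a sketch: the names `LISTING_URL` and `listing_page_urls` are made up for illustration, and the meaning of `Nrpp` (results per page) is assumed from the URL rather than confirmed.

```python
from typing import List
from urllib.parse import urlencode

# Listing URL taken from the cell above, without its query string
LISTING_URL = "https://www.barnesandnoble.com/b/books/_/N-1fZ29Z8q8"


def listing_page_urls(n_pages: int, per_page: int = 20) -> List[str]:
    """Build one listing URL per page, numbered from 1 to n_pages."""
    return [
        f"{LISTING_URL}?{urlencode({'Nrpp': per_page, 'page': page})}"
        for page in range(1, n_pages + 1)
    ]


pages_sketch = listing_page_urls(5)
```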

In [4]:
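page_source = []
url_list = []
for page_url in pages:
    # These options are built but never passed to webdriver.Chrome() below,
    # so this run used Chrome's default settings.
    options = Options()
    options.add_argument("--disable-notifications")
    options.add_argument('--disable-gpu')
    options.add_argument('user-agent=fake-useragent')

    driver = webdriver.Chrome(ChromeDriverManager().install())
    driver.maximize_window()
    driver.get(page_url)
    time.sleep(1)
    html = driver.page_source  # rendered HTML of the current listing page
    # page_source.append(html) # saving page_source
    time.sleep(1)
    soup = bs4(html, "html.parser")
    # Each book on the listing page is an <li> with this exact class string
    result = soup.find_all("li",{"class":"pb-s mt-m bd-bottom-disabled-gray record list-view-data"})
    for item in result:
        # The first link in each item holds the relative URL of the book's page
        url = item.find("a").get("href")
        url_list.append(url)
    time.sleep(1)
    driver.quit()  # end the browser session before loading the next page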


Checking the length of the URL list and prepending the B&N domain to each relative book link

In [5]:
len(url_list)

100

In [6]:
book_url = []
for i in url_list:
    i = "https://www.barnesandnoble.com" + i
    book_url.append(i)

In [7]:
book_url[1]

'https://www.barnesandnoble.com/w/if-he-had-been-with-me-laura-nowlin/1112689781;jsessionid=47C2EB06DB113BF70D73545CF8DA1879.prodny_store01-atgap01?ean=9781728205489'
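
The URL above still carries a `;jsessionid=...` segment from the scraping session. A minimal sketch of building the absolute URL while dropping that segment is shown here. The helper name `absolute_book_url` is hypothetical, and it is an assumption that B&N serves the same page without the session segment.

```python
from urllib.parse import urljoin, urlsplit, urlunsplit

BASE_URL = "https://www.barnesandnoble.com"


def absolute_book_url(href: str, base: str = BASE_URL) -> str:
    """Join a relative book link to the site root and drop any ;jsessionid part."""
    parts = urlsplit(urljoin(base, href))
    # The session id is attached to the path, before the ?ean=... query string
    clean_path = parts.path.split(";jsessionid=", 1)[0]
    return urlunsplit(
        (parts.scheme, parts.netloc, clean_path, parts.query, parts.fragment)
    )


example_href = (
    "/w/if-he-had-been-with-me-laura-nowlin/1112689781"
    ";jsessionid=47C2EB06DB113BF70D73545CF8DA1879.prodny_store01-atgap01"
    "?ean=9781728205489"
)
clean_url = absolute_book_url(example_href)
```

Dropping the session segment keeps the stored links stable between runs, since the session id changes every time the listing is scraped.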

Loading each book page and saving its parsed HTML in a new list, which is later turned into a DataFrame

In [8]:
book_source = []
for url in book_url:
    # These options are built but never passed to webdriver.Chrome() below,
    # so this run used Chrome's default settings.
    options = Options()
    options.add_argument("--disable-notifications")
    options.add_argument('--disable-gpu')
    options.add_argument('user-agent=fake-useragent')

    driver = webdriver.Chrome(ChromeDriverManager().install())
    driver.maximize_window()
    driver.get(url)
    time.sleep(1)
    html = driver.page_source  # rendered HTML of the book's page
    soup = bs4(html, "html.parser")
    book_source.append(soup)  # saving the parsed page
    driver.quit()  # close the browser so 100 windows do not stay open

In [10]:
len(book_source)

100

##### Shown below are the steps I used to extract each field from book_source.

In [11]:
book_source[0].find_all('h1')[0].get_text()

'Love, Pamela (Signed Book)'

In [12]:
book_source[0].find_all('span',class_= "contributors pdp-book-author")[0].select('input')[0]['value']
#key-contributors > a

'Pamela Anderson'

In [13]:
float(book_source[0].find_all('div',class_="bv_avgRating_component_container notranslate")[0].get_text())

5.0

In [14]:
book_source[4].find_all('span',class_="related-sub-text pt-xxs")[0].get_text()

' 2023 Barnes & Noble Book Club Selections'

In [15]:
book_source[0].find_all('img',class_ = "full-shadow ResolveComplete pdpImgLoadCompleted")[0]["src"]

'https://prodimage.images-bn.com/lf?set=key%5Bresolve.pixelRatio%5D,value%5B1%5D&set=key%5Bresolve.width%5D,value%5B600%5D&set=key%5Bresolve.height%5D,value%5B10000%5D&set=key%5Bresolve.imageFit%5D,value%5Bcontainerwidth%5D&set=key%5Bresolve.allowImageUpscaling%5D,value%5B0%5D&set=key%5Bresolve.format%5D,value%5Bwebp%5D&source=url%5Bhttps://prodimage.images-bn.com/pimages/9780063317420_p0_v2_s600x595.jpg%5D&scale=options%5Blimit%5D,size%5B600x10000%5D&sink=format%5Bwebp%5D'

In [63]:
book_source[0].find_all('div',class_="pt-m mb-l bs-ov-section")[0].get_text()

'\n\n\nOverview\n\n\n\nTO LIVEAND DREAMIS A WICKED DANCE.MY DREAMS OFTEN COME TRUE—A CURSE,AND A BLESSING.PAMELA ANDERSON’s blond bombshell image was ubiquitous in the 1990s. Discovered in the stands during a Canadian football game, she was quickly launched into superstardom, becoming Playboy’s favorite cover girl and an emblem of Hollywood glamour and sex appeal. Yet the Pamela Anderson we think we know was created through happenstance rather than careful cultivation. Love, Pamela brings forth her true story: that of a small-town girl getting tangled up in her own dream.Growing up on Vancouver Island, the daughter of young, wild, and unwittingly stylish parents, Pamela lived a hardscrabble childhood but developed a deep love for nature, populating her world with misfits, apparitional friends, and injured animals. Eventually overcoming her natural shyness, Pamela’s restless imagination propelled her into a life few can dream of, from the beaches of Malibu to the coveted scene at the Pl

In [22]:
book_source[0].find_all('table',class_="plain centered")[0].select("span")[0].get_text()

'HarperCollins Publishers'

In [23]:
book_source[0].find_all('tr')[2].select('td')[0].get_text()

'01/31/2023'

In [24]:
book_source[0].find_all('tr')[4].select('td')[0].get_text()

'256'

In [25]:
book_source[0].find_all('div',class_="tab-container")[0].select('td')[0].get_text()

'9780063317420'

##### Building one list each for title, genre, rating, author, isbn, publisher, description, page_count, year_published, and image_link from book_source (which was scraped from the B&N website)

In [66]:
title = []
genre = []
rating = []
author = []
isbn = []
publisher = []
description = []
page_count = []
year_published = []
image_link = []

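for soup in book_source:
    # Title and author are read without a fallback, so a page that lacks
    # either one raises an IndexError here.
    title.append(soup.find_all('h1')[0].get_text())
    author.append(soup.find_all('span',class_= "contributors pdp-book-author")[0].select('input')[0]['value'])

    # Every other field falls back to a placeholder when it is missing.
    try:
        book_rating = float(soup.find_all('div',class_="bv_avgRating_component_container notranslate")[0].get_text())
    except (IndexError, ValueError):
        book_rating = 0.0
    rating.append(book_rating)

    try:
        book_genre = soup.find_all('span',class_="related-sub-text pt-xxs")[0].get_text()
    except IndexError:
        book_genre = "NaN"
    genre.append(book_genre)

    try:
        book_isbn = soup.find_all('div',class_="tab-container")[0].select('td')[0].get_text()
    except IndexError:
        book_isbn = 0.0
    isbn.append(book_isbn)

    try:
        book_description = soup.find_all('div',class_="pt-m mb-l bs-ov-section")[0].get_text()
    except IndexError:
        book_description = "NaN"
    description.append(book_description)

    # Row 4 of the product details table holds the page count
    try:
        book_pages = soup.find_all('tr')[4].select('td')[0].get_text()
    except IndexError:
        book_pages = 0.0
    page_count.append(book_pages)

    # Row 2 of the product details table holds the full publication date
    try:
        book_date = soup.find_all('tr')[2].select('td')[0].get_text()
    except IndexError:
        book_date = 0.0
    year_published.append(book_date)

    try:
        book_image = soup.find_all('img',class_ = "full-shadow ResolveComplete pdpImgLoadCompleted")[0]["src"]
    except (IndexError, KeyError):
        book_image = "NaN"
    image_link.append(book_image)

    try:
        book_publisher = soup.find_all('table',class_="plain centered")[0].select("span")[0].get_text()
    except IndexError:
        book_publisher = "NaN"
    publisher.append(book_publisher)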

print(title)
print(author)
print(rating)
print(genre)
print(isbn)
print(publisher)
print(description)
print(page_count)
print(year_published)
print(image_link)

['Love, Pamela (Signed Book)', 'If He Had Been with Me', 'And There He Kept Her: A Novel', 'Chain of Thorns', 'The Snow Hare (Barnes & Noble Book Club Edition)', 'It Ends with Us', 'Spare', "Go Get 'Em, Tiger! (A Hello!Lucky Book) (Barnes & Noble Edition)", 'Final Offer', 'The Creative Act: A Way of Being', 'Lessons in Chemistry (B&N Exclusive Edition) (B&N Book of the Year)', 'It Starts with Us: A Novel', 'The Galveston Diet: The Doctor-Developed, Patient-Proven Plan to Burn Fat and Tame Your Hormonal Symptoms', 'Kingdom of Ash (Throne of Glass Series #7)', 'Twisted Hate (Twisted Series #3)', 'The Way I Used to Be', 'A Court of Mist and Fury (A Court of Thorns and Roses Series #2)', 'A Court of Wings and Ruin (A Court of Thorns and Roses Series #3)', 'Hooky Volume 3', 'Becoming Free Indeed: My Story of Disentangling Faith from Fear', 'Twisted Lies (Twisted Series #4)', 'Look Out for the Little Guy', 'Twisted Games (Twisted Series #2)', 'Cooking for the Culture: Recipes and Stories fro
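
The loop above repeats the same try/except pattern for every field. One way to shorten it, shown here only as a sketch, is a single helper that runs a lookup and returns a fallback when the element is missing. The name `extract_or_default` is hypothetical, and plain Python lists stand in for the parsed pages so the example runs without a browser.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def extract_or_default(getter: Callable[[], T], default: T) -> T:
    """Run getter() and return default if the looked-up element is missing."""
    try:
        return getter()
    except (IndexError, KeyError, ValueError, AttributeError):
        return default


# Stand-in for the cells of one product details table
cells = ["9780063317420", "01/31/2023"]

isbn_value = extract_or_default(lambda: cells[0], 0.0)      # element exists
missing_value = extract_or_default(lambda: cells[5], "NaN")  # index out of range
rating_value = extract_or_default(lambda: float(""), 0.0)    # text is not a number
```

With the real pages, each lambda would wrap one of the `find_all(...)` expressions from the loop.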

##### Adding all of the information to a DataFrame

In [67]:
import pandas as pd
books = pd.DataFrame({"title":title,
                      "author":author,
                      "rating":rating,
                      "genre":genre,
                      "description":description,
                      "publisher":publisher,
                      "year_published":year_published,
                      "isbn":isbn,
                      "page_count":page_count,
                      "image_link":image_link})
                            
books

Unnamed: 0,title,author,rating,genre,description,publisher,year_published,isbn,page_count,image_link
0,"Love, Pamela (Signed Book)",Pamela Anderson,5.0,Actors & Actresses - Biography,\n\n\nOverview\n\n\n\nTO LIVEAND DREAMIS A WIC...,HarperCollins Publishers,01/31/2023,9780063317420,256,https://prodimage.images-bn.com/lf?set=key%5Br...
1,If He Had Been with Me,Laura Nowlin,4.6,Dating and sex->Teen fiction->General,\n\n\nOverview\n\n\n\nA New York Times Bestsel...,Sourcebooks,11/01/2019,9781728205489,400,https://prodimage.images-bn.com/lf?set=key%5Br...
2,And There He Kept Her: A Novel,Joshua Moehling,4.3,Crime Thrillers,\n\n\nOverview\n\n\nNotes From Your Bookseller...,Sourcebooks,01/31/2023,9781728275772,336,https://prodimage.images-bn.com/lf?set=key%5Br...
3,Chain of Thorns,Cassandra Clare,4.0,Adventure->Historical->Teen fiction,\n\n\nOverview\n\n\nNotes From Your Bookseller...,Margaret K. McElderry Books,01/31/2023,9781481431934,800,https://prodimage.images-bn.com/lf?set=key%5Br...
4,The Snow Hare (Barnes & Noble Book Club Edition),Paula Lichtarowicz,0.0,2023 Barnes & Noble Book Club Selections,\n\n\nOverview\n\n\nNotes From Your Bookseller...,"Little, Brown and Company",01/31/2023,9780316566902,384,https://prodimage.images-bn.com/lf?set=key%5Br...
...,...,...,...,...,...,...,...,...,...,...
95,November 9,Colleen Hoover,4.7,Contemporary Romance - General,\n\n\nOverview\n\n\nNotes From Your Bookseller...,Atria Books,11/10/2015,9781501110344,98,https://prodimage.images-bn.com/lf?set=key%5Br...
96,A Little Life,Hanya Yanagihara,4.4,2015 Kirkus Prize Finalists,\n\n\nOverview\n\n\nNotes From Your Bookseller...,Knopf Doubleday Publishing Group,01/26/2016,9780804172707,832,https://prodimage.images-bn.com/lf?set=key%5Br...
97,The Bandit Queens (Barnes & Noble Book Club Ed...,Parini Shroff,4.3,2023 Barnes & Noble Book Club Selections,\n\n\nOverview\n\n\nNotes From Your Bookseller...,Random House Publishing Group,01/03/2023,9780593722909,352,https://prodimage.images-bn.com/lf?set=key%5Br...
98,I'm Glad My Mom Died,Jennette McCurdy,4.6,Actors & Actresses - Biography,\n\n\nOverview\n\n\nNotes From Your Bookseller...,Simon & Schuster,08/09/2022,9781982185824,101,https://prodimage.images-bn.com/lf?set=key%5Br...
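
Every scraped description starts with newlines and the section label "Overview". A possible cleaning step, sketched below under the assumption that the label carries no information for the recommender, collapses the whitespace and removes that label. The function name `clean_description` is made up for illustration.

```python
import re


def clean_description(raw: str) -> str:
    """Collapse runs of whitespace and drop a leading 'Overview' label."""
    text = re.sub(r"\s+", " ", raw).strip()
    if text.startswith("Overview"):
        text = text[len("Overview"):].lstrip()
    return text


sample = "\n\n\nOverview\n\n\n\nTO LIVEAND DREAMIS A WICKED DANCE."
cleaned = clean_description(sample)
```

Applied to the DataFrame, this could look like `books["description"].apply(clean_description)`.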


##### Exporting the information to a CSV

In [50]:
books.to_csv('books.csv',index=False)

##### Importing a dataset that was obtained from Kaggle

In [53]:
data3 = pd.read_csv('books 3.csv')
data3

Unnamed: 0,isbn13,isbn10,title,subtitle,authors,categories,thumbnail,description,published_year,average_rating,num_pages,ratings_count
0,9780002005883,0002005883,Gilead,,Marilynne Robinson,Fiction,http://books.google.com/books/content?id=KQZCP...,A NOVEL THAT READERS and critics have been eag...,2004.0,3.85,247.0,361.0
1,9780002261982,0002261987,Spider's Web,A Novel,Charles Osborne;Agatha Christie,Detective and mystery stories,http://books.google.com/books/content?id=gA5GP...,A new 'Christie for Christmas' -- a full-lengt...,2000.0,3.83,241.0,5164.0
2,9780006163831,0006163831,The One Tree,,Stephen R. Donaldson,American fiction,http://books.google.com/books/content?id=OmQaw...,Volume Two of Stephen Donaldson's acclaimed se...,1982.0,3.97,479.0,172.0
3,9780006178736,0006178731,Rage of angels,,Sidney Sheldon,Fiction,http://books.google.com/books/content?id=FKo2T...,"A memorable, mesmerizing heroine Jennifer -- b...",1993.0,3.93,512.0,29532.0
4,9780006280897,0006280897,The Four Loves,,Clive Staples Lewis,Christian life,http://books.google.com/books/content?id=XhQ5X...,Lewis' work on the nature of love divides love...,2002.0,4.15,170.0,33684.0
...,...,...,...,...,...,...,...,...,...,...,...,...
6805,9788185300535,8185300534,I Am that,Talks with Sri Nisargadatta Maharaj,Sri Nisargadatta Maharaj;Sudhakar S. Dikshit,Philosophy,http://books.google.com/books/content?id=Fv_JP...,This collection of the timeless teachings of o...,1999.0,4.51,531.0,104.0
6806,9788185944609,8185944601,Secrets Of The Heart,,Khalil Gibran,Mysticism,http://books.google.com/books/content?id=XcrVp...,,1993.0,4.08,74.0,324.0
6807,9788445074879,8445074873,Fahrenheit 451,,Ray Bradbury,Book burning,,,2004.0,3.98,186.0,5733.0
6808,9789027712059,9027712050,The Berlin Phenomenology,,Georg Wilhelm Friedrich Hegel,History,http://books.google.com/books/content?id=Vy7Sk...,Since the three volume edition ofHegel's Philo...,1981.0,0.00,210.0,0.0


In [54]:
data3.shape

(6810, 12)

##### Dropping unnecessary columns, then renaming and re-ordering the remaining columns so they uniformly fit data cleaning and, ultimately, modeling

In [55]:
# Drop the columns that are not needed for the recommender
data3 = data3.drop(columns=['isbn10','subtitle','ratings_count'])
# Rename the remaining columns to match the scraped B&N DataFrame
data3 = data3.rename(columns = {'authors':'author','categories':'genre','average_rating':'rating','thumbnail':'image_link','published_year':'year_published','isbn13':'isbn','num_pages':'page_count'})
# Put the columns in the same order as the B&N DataFrame
data3 = data3[['title','author','rating','genre','description','year_published','isbn','page_count','image_link']]
data3

Unnamed: 0,title,author,rating,genre,description,year_published,isbn,page_count,image_link
0,Gilead,Marilynne Robinson,3.85,Fiction,A NOVEL THAT READERS and critics have been eag...,2004.0,9780002005883,247.0,http://books.google.com/books/content?id=KQZCP...
1,Spider's Web,Charles Osborne;Agatha Christie,3.83,Detective and mystery stories,A new 'Christie for Christmas' -- a full-lengt...,2000.0,9780002261982,241.0,http://books.google.com/books/content?id=gA5GP...
2,The One Tree,Stephen R. Donaldson,3.97,American fiction,Volume Two of Stephen Donaldson's acclaimed se...,1982.0,9780006163831,479.0,http://books.google.com/books/content?id=OmQaw...
3,Rage of angels,Sidney Sheldon,3.93,Fiction,"A memorable, mesmerizing heroine Jennifer -- b...",1993.0,9780006178736,512.0,http://books.google.com/books/content?id=FKo2T...
4,The Four Loves,Clive Staples Lewis,4.15,Christian life,Lewis' work on the nature of love divides love...,2002.0,9780006280897,170.0,http://books.google.com/books/content?id=XhQ5X...
...,...,...,...,...,...,...,...,...,...
6805,I Am that,Sri Nisargadatta Maharaj;Sudhakar S. Dikshit,4.51,Philosophy,This collection of the timeless teachings of o...,1999.0,9788185300535,531.0,http://books.google.com/books/content?id=Fv_JP...
6806,Secrets Of The Heart,Khalil Gibran,4.08,Mysticism,,1993.0,9788185944609,74.0,http://books.google.com/books/content?id=XcrVp...
6807,Fahrenheit 451,Ray Bradbury,3.98,Book burning,,2004.0,9788445074879,186.0,
6808,The Berlin Phenomenology,Georg Wilhelm Friedrich Hegel,0.00,History,Since the three volume edition ofHegel's Philo...,1981.0,9789027712059,210.0,http://books.google.com/books/content?id=Vy7Sk...


##### Exporting the dataset for future use

In [56]:
data3.to_csv('data3.csv',index=False)

##### Loading the second dataset that was obtained from Kaggle

In [58]:
data5 = pd.read_csv('books_1.Best_Books_Ever.csv')
data5

Unnamed: 0,bookId,title,series,author,rating,description,language,isbn,genres,characters,...,firstPublishDate,awards,numRatings,ratingsByStars,likedPercent,setting,coverImg,bbeScore,bbeVotes,price
0,2767052-the-hunger-games,The Hunger Games,The Hunger Games #1,Suzanne Collins,4.33,WINNING MEANS FAME AND FORTUNE.LOSING MEANS CE...,English,9780439023481,"['Young Adult', 'Fiction', 'Dystopia', 'Fantas...","['Katniss Everdeen', 'Peeta Mellark', 'Cato (H...",...,,['Locus Award Nominee for Best Young Adult Boo...,6376780,"['3444695', '1921313', '745221', '171994', '93...",96.0,"['District 12, Panem', 'Capitol, Panem', 'Pane...",https://i.gr-assets.com/images/S/compressed.ph...,2993816,30516,5.09
1,2.Harry_Potter_and_the_Order_of_the_Phoenix,Harry Potter and the Order of the Phoenix,Harry Potter #5,"J.K. Rowling, Mary GrandPré (Illustrator)",4.50,There is a door at the end of a silent corrido...,English,9780439358071,"['Fantasy', 'Young Adult', 'Fiction', 'Magic',...","['Sirius Black', 'Draco Malfoy', 'Ron Weasley'...",...,06/21/03,['Bram Stoker Award for Works for Young Reader...,2507623,"['1593642', '637516', '222366', '39573', '14526']",98.0,['Hogwarts School of Witchcraft and Wizardry (...,https://i.gr-assets.com/images/S/compressed.ph...,2632233,26923,7.38
2,2657.To_Kill_a_Mockingbird,To Kill a Mockingbird,To Kill a Mockingbird,Harper Lee,4.28,The unforgettable novel of a childhood in a sl...,English,9999999999999,"['Classics', 'Fiction', 'Historical Fiction', ...","['Scout Finch', 'Atticus Finch', 'Jem Finch', ...",...,07/11/60,"['Pulitzer Prize for Fiction (1961)', 'Audie A...",4501075,"['2363896', '1333153', '573280', '149952', '80...",95.0,"['Maycomb, Alabama (United States)']",https://i.gr-assets.com/images/S/compressed.ph...,2269402,23328,
3,1885.Pride_and_Prejudice,Pride and Prejudice,,"Jane Austen, Anna Quindlen (Introduction)",4.26,Alternate cover edition of ISBN 9780679783268S...,English,9999999999999,"['Classics', 'Fiction', 'Romance', 'Historical...","['Mr. Bennet', 'Mrs. Bennet', 'Jane Bennet', '...",...,01/28/13,[],2998241,"['1617567', '816659', '373311', '113934', '767...",94.0,"['United Kingdom', 'Derbyshire, England (Unite...",https://i.gr-assets.com/images/S/compressed.ph...,1983116,20452,
4,41865.Twilight,Twilight,The Twilight Saga #1,Stephenie Meyer,3.60,About three things I was absolutely positive.\...,English,9780316015844,"['Young Adult', 'Fantasy', 'Romance', 'Vampire...","['Edward Cullen', 'Jacob Black', 'Laurent', 'R...",...,10/05/05,"['Georgia Peach Book Award (2007)', 'Buxtehude...",4964519,"['1751460', '1113682', '1008686', '542017', '5...",78.0,"['Forks, Washington (United States)', 'Phoenix...",https://i.gr-assets.com/images/S/compressed.ph...,1459448,14874,2.1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
52473,11492014-fractured,Fractured,Fateful #2,Cheri Schmidt (Goodreads Author),4.00,The Fateful Trilogy continues with Fractured. ...,English,2940012616562,"['Vampires', 'Paranormal', 'Young Adult', 'Rom...",[],...,,[],871,"['311', '310', '197', '42', '11']",94.0,[],https://i.gr-assets.com/images/S/compressed.ph...,0,1,
52474,11836711-anasazi,Anasazi,Sense of Truth #2,Emma Michaels,4.19,"'Anasazi', sequel to 'The Thirteenth Chime' by...",English,9999999999999,"['Mystery', 'Young Adult']",[],...,August 3rd 2011,[],37,"['16', '14', '5', '2', '0']",95.0,[],https://i.gr-assets.com/images/S/compressed.ph...,0,1,
52475,10815662-marked,Marked,Soul Guardians #1,Kim Richardson (Goodreads Author),3.70,--READERS FAVORITE AWARDS WINNER 2011--Sixteen...,English,9781461017097,"['Fantasy', 'Young Adult', 'Paranormal', 'Ange...",[],...,March 15th 2011,"[""Readers' Favorite Book Award (2011)""]",6674,"['2109', '1868', '1660', '647', '390']",84.0,[],https://i.gr-assets.com/images/S/compressed.ph...,0,1,7.37
52476,11330278-wayward-son,Wayward Son,,"Tom Pollack (Goodreads Author), John Loftus (G...",3.85,A POWERFUL TREMOR UNEARTHS AN ANCIENT SECRETBu...,English,9781450755634,"['Fiction', 'Mystery', 'Historical Fiction', '...",[],...,April 5th 2011,[],238,"['77', '78', '59', '19', '5']",90.0,[],https://i.gr-assets.com/images/S/compressed.ph...,0,1,2.86
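
Several rows above carry the ISBN `9999999999999`, which looks like a placeholder rather than a real identifier. One way to flag such rows during cleaning is the standard ISBN-13 check digit test, sketched below. The function name `is_valid_isbn13` is hypothetical, and treating failed rows as missing values is an assumption about the later cleaning phase.

```python
def is_valid_isbn13(value) -> bool:
    """Return True if value has 13 digits and a correct ISBN-13 check digit."""
    digits = str(value).strip()
    if len(digits) != 13 or not (digits.isascii() and digits.isdigit()):
        return False
    # The first 12 digits are weighted 1, 3, 1, 3, ... from left to right
    total = sum(
        int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(digits[:12])
    )
    check_digit = (10 - total % 10) % 10
    return check_digit == int(digits[12])


real_isbn_ok = is_valid_isbn13("9780063317420")    # scraped from B&N above
placeholder_ok = is_valid_isbn13("9999999999999")  # placeholder value
```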


##### Dropping the unnecessary columns, then renaming and re-ordering the remaining columns so they uniformly fit future cleaning and modeling

In [59]:
# Drop the columns that are not needed for the recommender
data5 = data5.drop(columns=['bookId','series','awards','language','characters','firstPublishDate','numRatings','ratingsByStars','likedPercent','setting','bbeScore','bbeVotes','price','bookFormat','edition','publisher'])
# Rename the remaining columns to match the scraped B&N DataFrame
data5 = data5.rename(columns = {'genres':'genre','coverImg':'image_link','publishDate':'year_published','pages':'page_count'})
# Put the columns in the same order as the B&N DataFrame
data5 = data5[['title','author','rating','genre','description','year_published','isbn','page_count','image_link']]
data5

Unnamed: 0,title,author,rating,genre,description,year_published,isbn,page_count,image_link
0,The Hunger Games,Suzanne Collins,4.33,"['Young Adult', 'Fiction', 'Dystopia', 'Fantas...",WINNING MEANS FAME AND FORTUNE.LOSING MEANS CE...,09/14/08,9780439023481,374,https://i.gr-assets.com/images/S/compressed.ph...
1,Harry Potter and the Order of the Phoenix,"J.K. Rowling, Mary GrandPré (Illustrator)",4.50,"['Fantasy', 'Young Adult', 'Fiction', 'Magic',...",There is a door at the end of a silent corrido...,09/28/04,9780439358071,870,https://i.gr-assets.com/images/S/compressed.ph...
2,To Kill a Mockingbird,Harper Lee,4.28,"['Classics', 'Fiction', 'Historical Fiction', ...",The unforgettable novel of a childhood in a sl...,05/23/06,9999999999999,324,https://i.gr-assets.com/images/S/compressed.ph...
3,Pride and Prejudice,"Jane Austen, Anna Quindlen (Introduction)",4.26,"['Classics', 'Fiction', 'Romance', 'Historical...",Alternate cover edition of ISBN 9780679783268S...,10/10/00,9999999999999,279,https://i.gr-assets.com/images/S/compressed.ph...
4,Twilight,Stephenie Meyer,3.60,"['Young Adult', 'Fantasy', 'Romance', 'Vampire...",About three things I was absolutely positive.\...,09/06/06,9780316015844,501,https://i.gr-assets.com/images/S/compressed.ph...
...,...,...,...,...,...,...,...,...,...
52473,Fractured,Cheri Schmidt (Goodreads Author),4.00,"['Vampires', 'Paranormal', 'Young Adult', 'Rom...",The Fateful Trilogy continues with Fractured. ...,May 28th 2011,2940012616562,0,https://i.gr-assets.com/images/S/compressed.ph...
52474,Anasazi,Emma Michaels,4.19,"['Mystery', 'Young Adult']","'Anasazi', sequel to 'The Thirteenth Chime' by...",August 5th 2011,9999999999999,190,https://i.gr-assets.com/images/S/compressed.ph...
52475,Marked,Kim Richardson (Goodreads Author),3.70,"['Fantasy', 'Young Adult', 'Paranormal', 'Ange...",--READERS FAVORITE AWARDS WINNER 2011--Sixteen...,March 18th 2011,9781461017097,280,https://i.gr-assets.com/images/S/compressed.ph...
52476,Wayward Son,"Tom Pollack (Goodreads Author), John Loftus (G...",3.85,"['Fiction', 'Mystery', 'Historical Fiction', '...",A POWERFUL TREMOR UNEARTHS AN ANCIENT SECRETBu...,September 1st 2011,9781450755634,507,https://i.gr-assets.com/images/S/compressed.ph...
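
The `genre` column now has three shapes across the datasets: a stringified Python list here, an arrow-separated path such as `Dating and sex->Teen fiction->General` in the B&N data, and a single label in the first Kaggle dataset. The sketch below shows one possible way to turn all three into a list of labels. The name `parse_genres` is made up for illustration, and mapping the `"NaN"` placeholder to an empty list is an assumption.

```python
import ast
from typing import List


def parse_genres(value) -> List[str]:
    """Turn any of the three genre formats into a list of genre labels."""
    if not isinstance(value, str):
        return []  # missing values such as float NaN
    text = value.strip()
    if not text or text == "NaN":
        return []
    if text.startswith("["):
        # Stringified list, e.g. "['Mystery', 'Young Adult']"
        try:
            parsed = ast.literal_eval(text)
        except (ValueError, SyntaxError):
            return [text]
        return [str(label).strip() for label in parsed]
    if "->" in text:
        # Category path from the B&N pages
        return [part.strip() for part in text.split("->")]
    return [text]


list_genres = parse_genres("['Mystery', 'Young Adult']")
path_genres = parse_genres("Dating and sex->Teen fiction->General")
single_genre = parse_genres(" 2023 Barnes & Noble Book Club Selections")
```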


##### Exporting the last dataset

In [61]:
data5.to_csv('data5.csv', index=False)
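
The `year_published` column is not yet comparable across the three exports: the B&N data holds dates like `01/31/2023`, the first Kaggle dataset holds floats like `2004.0`, and this dataset mixes `09/14/08` with `May 28th 2011`. A minimal sketch of reducing all of them to a year follows. The name `extract_year` is hypothetical, and the list of formats covers only the values visible in the outputs above. Two-digit years follow Python's `%y` rule, so `60` would be read as 2060, which needs a manual check for older books. Month names are parsed with `%B`, which assumes an English locale.

```python
import math
import re
from datetime import datetime
from typing import Optional

# Formats seen in the outputs above, tried in this order
DATE_FORMATS = ("%m/%d/%Y", "%m/%d/%y", "%B %d %Y")
ORDINAL_SUFFIX = re.compile(r"(\d+)(st|nd|rd|th)\b")


def extract_year(value) -> Optional[int]:
    """Return the publication year, or None when it cannot be determined."""
    if isinstance(value, (int, float)):
        if math.isnan(value):
            return None
        year = int(value)
        return year if year > 0 else None  # 0.0 was the scraping fallback
    if not isinstance(value, str):
        return None
    # "May 28th 2011" becomes "May 28 2011" so strptime can read it
    text = ORDINAL_SUFFIX.sub(r"\1", value.strip())
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).year
        except ValueError:
            continue
    return None


years = [extract_year(v) for v in ("01/31/2023", 2004.0, "09/14/08", "May 28th 2011")]
```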

### End of Phase 1