# Install Necessary Libraries
Before you start, make sure you have Python installed on your system. You can install the required libraries using pip:

In [None]:
!pip install requests
!pip install beautifulsoup4
!pip install pandas

# Import the Libraries

In [1]:
import requests
from bs4 import BeautifulSoup
import pandas as pd

# Download the Quotes to Scrape web page using the requests library

Requests is an elegant and simple HTTP (Hypertext Transfer Protocol) library for Python that lets you send HTTP requests easily. The requests.get function downloads the web page and returns a response object containing the page contents, along with a status_code that indicates whether the request was successful.

In [2]:
url="http://quotes.toscrape.com/page/1/"
response = requests.get(url)

In [3]:
response = response.content
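The status check mentioned above can be sketched as follows. This is a minimal illustration rather than part of the original notebook: the helper names check_response and fetch_page and the 10-second timeout are assumptions chosen for the example.

```python
import requests


def check_response(response):
    """Return the page bytes if the request succeeded, otherwise raise."""
    # raise_for_status() raises requests.HTTPError for 4xx and 5xx status codes
    response.raise_for_status()
    return response.content


def fetch_page(url, timeout=10):
    """Download a page and return its raw bytes (hypothetical helper)."""
    # A timeout keeps the call from waiting forever on an unresponsive server
    response = requests.get(url, timeout=timeout)
    return check_response(response)
```

Calling fetch_page("http://quotes.toscrape.com/page/1/") would then either return the page contents or raise an error, instead of silently passing an error page on to the parser.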

# Parsing parts of the website using Beautiful Soup

To extract information from the HTML source code programmatically, we will use the Beautiful Soup library. The BeautifulSoup constructor returns an object with properties and methods for navigating the HTML document and extracting information from it.

In [4]:
soup = BeautifulSoup(response,'html.parser')
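To make the two most common lookup methods concrete, here is a small sketch that runs on an inline HTML string instead of the downloaded page. The markup is a simplified, hypothetical stand-in for the real page, so the example works without a network connection.

```python
from bs4 import BeautifulSoup

# Simplified, hypothetical markup standing in for a downloaded page
html = """
<div class="quote">
  <span class="text">An example quote.</span>
  <small class="author">Example Author</small>
</div>
<div class="quote">
  <span class="text">Another example quote.</span>
  <small class="author">Second Author</small>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# find() returns the first matching tag, or None when nothing matches
first_quote = soup.find("div", {"class": "quote"})
missing = soup.find("div", {"class": "no-such-class"})

# find_all() returns a list-like collection of every matching tag
all_quotes = soup.find_all("div", {"class": "quote"})

print(first_quote.find("span", {"class": "text"}).text)
print(missing)
print(len(all_quotes))
```

Because find() returns None for a class name that does not exist on the page, it is worth checking the result before reading .text from it.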

**Extract all quotes**

In [5]:
# find() returns the first match; the quote blocks on this page use the class 'quote'
quote = soup.find('div', attrs={'class': 'quote'})

In [6]:
soup.span.text

'“The world as we have created it is a process of our thinking. It cannot be changed without changing our thinking.”'

In [7]:
quotes = soup.find_all('div', {'class': 'quote'})
for i in quotes:
    print((i.find('span', {'class':'text'})).text)

“The world as we have created it is a process of our thinking. It cannot be changed without changing our thinking.”
“It is our choices, Harry, that show what we truly are, far more than our abilities.”
“There are only two ways to live your life. One is as though nothing is a miracle. The other is as though everything is a miracle.”
“The person, be it gentleman or lady, who has not pleasure in a good novel, must be intolerably stupid.”
“Imperfection is beauty, madness is genius and it's better to be absolutely ridiculous than absolutely boring.”
“Try not to become a man of success. Rather become a man of value.”
“It is better to be hated for what you are than to be loved for what you are not.”
“I have not failed. I've just found 10,000 ways that won't work.”
“A woman is like a tea bag; you never know how strong it is until it's in hot water.”
“A day without sunshine is like, you know, night.”
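As an alternative to nesting find calls inside a loop, Beautiful Soup also accepts CSS selectors through select. The sketch below collects the quote texts into a list instead of printing them. It uses simplified, hypothetical markup in place of the real page.

```python
from bs4 import BeautifulSoup

# Simplified, hypothetical markup standing in for a downloaded page
html = """
<div class="quote"><span class="text">An example quote.</span></div>
<div class="quote"><span class="text">Another example quote.</span></div>
<div class="footer"><span class="text">Not a quote.</span></div>
"""

soup = BeautifulSoup(html, "html.parser")

# "div.quote span.text" matches span.text elements nested inside div.quote
quote_texts = [span.get_text() for span in soup.select("div.quote span.text")]

print(quote_texts)
```

Keeping the results in a list makes them easier to reuse later, for example when building a DataFrame.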


**Extract all authors**

In [8]:
for i in soup.find_all("div", {"class": "quote"}):
    print((i.find("small", {"class": "author"})).text)

Albert Einstein
J.K. Rowling
Albert Einstein
Jane Austen
Marilyn Monroe
Albert Einstein
André Gide
Thomas A. Edison
Eleanor Roosevelt
Steve Martin


**Extract all tags**

In [9]:
for i in soup.find_all("div", {"class": "tags"}):
    print((i.find("meta"))['content'])

change,deep-thoughts,thinking,world
abilities,choices
inspirational,life,live,miracle,miracles
aliteracy,books,classic,humor
be-yourself,inspirational
adulthood,success,value
life,love
edison,failure,inspirational,paraphrased
misattributed-eleanor-roosevelt
humor,obvious,simile
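Each quote's tags arrive as one comma-separated string. If individual tags are needed, the string can be split into a list, as in this sketch. The markup is a simplified stand-in for the real page, with content values copied from the output above.

```python
from bs4 import BeautifulSoup

# Simplified stand-in markup; the content values come from the output above
html = """
<div class="tags">
  <meta class="keywords" content="change,deep-thoughts,thinking,world">
</div>
<div class="tags">
  <meta class="keywords" content="abilities,choices">
</div>
"""

soup = BeautifulSoup(html, "html.parser")

tag_lists = []
for block in soup.find_all("div", {"class": "tags"}):
    content = block.find("meta")["content"]
    # Split on commas and drop empty strings (a quote may have no tags)
    tag_lists.append([tag for tag in content.split(",") if tag])

print(tag_lists)
```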


**All quotes, authors and tags**



In [10]:
for i in soup.find_all("div", {"class": "quote"}):
    print((i.find('span', {'class':'text'})).text)
    print((i.find("small", {"class": "author"})).text)
    print((i.find("meta"))['content'])

“The world as we have created it is a process of our thinking. It cannot be changed without changing our thinking.”
Albert Einstein
change,deep-thoughts,thinking,world
“It is our choices, Harry, that show what we truly are, far more than our abilities.”
J.K. Rowling
abilities,choices
“There are only two ways to live your life. One is as though nothing is a miracle. The other is as though everything is a miracle.”
Albert Einstein
inspirational,life,live,miracle,miracles
“The person, be it gentleman or lady, who has not pleasure in a good novel, must be intolerably stupid.”
Jane Austen
aliteracy,books,classic,humor
“Imperfection is beauty, madness is genius and it's better to be absolutely ridiculous than absolutely boring.”
Marilyn Monroe
be-yourself,inspirational
“Try not to become a man of success. Rather become a man of value.”
Albert Einstein
adulthood,success,value
“It is better to be hated for what you are than to be loved for what you are not.”
André Gide
life,love
“I have not fa

**Creating dataframe for all quotes, authors and tags**

In [11]:
url = 'http://quotes.toscrape.com/page/'

# create empty lists to collect the scraped values
quotes = []
authors = []
tags = []

# loop over pages 1 to 9 (range(1, 10) excludes 10; use range(1, 11) to include page 10)
for page in range(1, 10):

    html = requests.get(url + str(page))

    # name the parser explicitly so Beautiful Soup does not have to guess one
    soup = BeautifulSoup(html.text, 'html.parser')

    for i in soup.find_all("div", {"class": "quote"}):
        quotes.append((i.find("span", {"class": "text"})).text)
        authors.append((i.find("small", {"class": "author"})).text)
        tags.append((i.find("meta"))['content'])

# Create pandas dataframe
df = pd.DataFrame({'Quotes': quotes, 'Authors': authors, 'Tags': tags})
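The loop above hard-codes the page numbers. An alternative sketch follows the pager link until none is left. It assumes the page exposes its next-page link as an anchor inside an li element with the class next; that is an assumption about the markup, which this notebook does not show. The helper name next_page_url is hypothetical.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup


def next_page_url(html, current_url):
    """Return the absolute URL of the next page, or None on the last page."""
    soup = BeautifulSoup(html, "html.parser")
    # Assumed markup: <li class="next"><a href="/page/2/">Next</a></li>
    link = soup.select_one("li.next a")
    if link is None or not link.get("href"):
        return None
    # Resolve a relative href such as "/page/2/" against the current URL
    return urljoin(current_url, link["href"])


# Hypothetical pager fragments used only to exercise the helper
page_with_next = '<ul class="pager"><li class="next"><a href="/page/2/">Next</a></li></ul>'
last_page = '<ul class="pager"><li class="previous"><a href="/page/9/">Previous</a></li></ul>'

print(next_page_url(page_with_next, "http://quotes.toscrape.com/page/1/"))
print(next_page_url(last_page, "http://quotes.toscrape.com/page/10/"))
```

A scraping loop built on this helper would keep requesting pages while next_page_url returns a URL and stop when it returns None, so the page count never has to be known in advance.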

In [12]:
df

|     | Quotes | Authors | Tags |
| --- | --- | --- | --- |
| 0 | “The world as we have created it is a process ... | Albert Einstein | change,deep-thoughts,thinking,world |
| 1 | “It is our choices, Harry, that show what we t... | J.K. Rowling | abilities,choices |
| 2 | “There are only two ways to live your life. On... | Albert Einstein | inspirational,life,live,miracle,miracles |
| 3 | “The person, be it gentleman or lady, who has ... | Jane Austen | aliteracy,books,classic,humor |
| 4 | “Imperfection is beauty, madness is genius and... | Marilyn Monroe | be-yourself,inspirational |
| ... | ... | ... | ... |
| 85 | “Some day you will be old enough to start read... | C.S. Lewis | age,fairytales,growing-up |
| 86 | “We are not necessarily doubting that God will... | C.S. Lewis | god |
| 87 | “The fear of death follows from the fear of li... | Mark Twain | death,life |
| 88 | “A lie can travel half way around the world wh... | Mark Twain | misattributed-mark-twain,truth |
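Another way to build the same kind of DataFrame is to collect one dictionary per quote and pass the list of dictionaries to pandas. The sketch below shows the idea on simplified, hypothetical markup; the function name parse_quotes is an assumption made for this example.

```python
import pandas as pd
from bs4 import BeautifulSoup


def parse_quotes(html):
    """Return one dictionary per quote block found in the HTML."""
    soup = BeautifulSoup(html, "html.parser")
    records = []
    for block in soup.find_all("div", {"class": "quote"}):
        records.append({
            "Quotes": block.find("span", {"class": "text"}).text,
            "Authors": block.find("small", {"class": "author"}).text,
            "Tags": block.find("meta")["content"],
        })
    return records


# Simplified, hypothetical markup standing in for a downloaded page
sample_html = """
<div class="quote">
  <span class="text">An example quote.</span>
  <small class="author">Example Author</small>
  <div class="tags"><meta class="keywords" content="example,demo"></div>
</div>
<div class="quote">
  <span class="text">Another example quote.</span>
  <small class="author">Second Author</small>
  <div class="tags"><meta class="keywords" content="sample"></div>
</div>
"""

# Each dictionary becomes one row; the keys become the column names
df_sample = pd.DataFrame(parse_quotes(sample_html))
print(df_sample)
```

Keeping each quote's fields together in one record avoids the risk of the three parallel lists drifting out of step if one lookup fails.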


# Convert the parsed information into a CSV file


In [13]:
df.to_csv('Quotes_to_Scrape.csv')

# Have a Look at the CSV File Using the Pandas Library
read_csv reads a comma-separated values (CSV) file into a DataFrame.

In [14]:
pd.read_csv('Quotes_to_Scrape.csv')

|     | Unnamed: 0 | Quotes | Authors | Tags |
| --- | --- | --- | --- | --- |
| 0 | 0 | “The world as we have created it is a process ... | Albert Einstein | change,deep-thoughts,thinking,world |
| 1 | 1 | “It is our choices, Harry, that show what we t... | J.K. Rowling | abilities,choices |
| 2 | 2 | “There are only two ways to live your life. On... | Albert Einstein | inspirational,life,live,miracle,miracles |
| 3 | 3 | “The person, be it gentleman or lady, who has ... | Jane Austen | aliteracy,books,classic,humor |
| 4 | 4 | “Imperfection is beauty, madness is genius and... | Marilyn Monroe | be-yourself,inspirational |
| ... | ... | ... | ... | ... |
| 85 | 85 | “Some day you will be old enough to start read... | C.S. Lewis | age,fairytales,growing-up |
| 86 | 86 | “We are not necessarily doubting that God will... | C.S. Lewis | god |
| 87 | 87 | “The fear of death follows from the fear of li... | Mark Twain | death,life |
| 88 | 88 | “A lie can travel half way around the world wh... | Mark Twain | misattributed-mark-twain,truth |
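The extra Unnamed: 0 column appears because to_csv writes the DataFrame index as a first, unnamed column by default, and read_csv then loads it back as ordinary data. The sketch below compares the default with index=False. It uses a small made-up DataFrame and an in-memory buffer instead of a file on disk.

```python
import io

import pandas as pd

# Made-up data used only to demonstrate the round trip
df_demo = pd.DataFrame({
    "Quotes": ["An example quote."],
    "Authors": ["Example Author"],
    "Tags": ["example,demo"],
})

# Default behaviour: the index is written as an unnamed first column
with_index = io.StringIO()
df_demo.to_csv(with_index)
with_index.seek(0)
cols_with_index = list(pd.read_csv(with_index).columns)

# index=False leaves the index out of the CSV entirely
without_index = io.StringIO()
df_demo.to_csv(without_index, index=False)
without_index.seek(0)
cols_without_index = list(pd.read_csv(without_index).columns)

print(cols_with_index)
print(cols_without_index)
```

Alternatively, the file can be written as is and read back with pd.read_csv(path, index_col=0), which treats the first column as the index again.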


# Summary

* Downloaded the web page using the requests library
* Found the quotes, author names, and quote tags for a single page by parsing the HTML source code with the Beautiful Soup library
* Combined the lists from all the required pages
* Converted those lists into a pandas DataFrame
* Converted the parsed information into a CSV file
* Had a look at the CSV file using the pandas library