### Web scraping algorithms: 

The following code uses the <span style="color:green">BeautifulSoup</span> and <span style="color:green">requests</span> Python packages to scrape quotes from the 'http://quotes.toscrape.com' website. The scraped quotes are stored in a pandas data frame with three columns: text, author, and tags.
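
The extraction step can be tried offline before any request is sent to the live site. The sketch below is only an illustration under stated assumptions: `SAMPLE_HTML` is a hand-written fragment assumed to resemble the site's markup, `parse_quotes` is a hypothetical helper, and it relies on `get_text` with Python's built-in `html.parser` rather than the lxml parser.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML fragment imitating one quote block; it is not fetched from the site.
SAMPLE_HTML = """
<div class="quote">
  <span class="text">“A day without sunshine is like, you know, night.”</span>
  <span>by <small class="author">Steve Martin</small></span>
  <div class="tags">
    <a class="tag" href="#">humor</a>
    <a class="tag" href="#">obvious</a>
    <a class="tag" href="#">simile</a>
  </div>
</div>
"""


def parse_quotes(html):
    """Return one dict per <div class="quote"> found in the given HTML string."""
    soup = BeautifulSoup(html, "html.parser")  # built-in parser, no lxml needed
    rows = []
    for quote in soup.find_all("div", class_="quote"):
        text = quote.find("span", class_="text")
        author = quote.find("small", class_="author")
        tags = quote.find_all("a", class_="tag")
        rows.append({
            # Fall back to an empty string when an element is missing
            "text": text.get_text(strip=True) if text else "",
            "author": author.get_text(strip=True) if author else "",
            "tags": ", ".join(tag.get_text(strip=True) for tag in tags),
        })
    return rows


rows = parse_quotes(SAMPLE_HTML)
print(rows[0]["author"], "|", rows[0]["tags"])
```

Keeping the parsing in a function that takes an HTML string makes it possible to check the selectors without network access.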


In [7]:
__author__ = 'Saeid SOHILY-KHAH'
"""
Web scraping algorithms: Web scraping using Beautiful Soup and Requests (quotes)
"""
import requests
import pandas as pd
from bs4 import BeautifulSoup

pd.set_option('display.max_colwidth', 1000)  # to extend the output display

# Initialization
top_n = 11  # number of pages to scrape
text_lst = [] 
author_lst = []
tags_lst = []

# Navigate top_n urls
for i in range(top_n):
    url = "http://quotes.toscrape.com/page/{}/".format(i) # generate url of page <i>
    response = requests.get(url)  # HTTP response for this page
    print('scraping {}'.format(url))

    # Pulling data out of HTML (or XML) files
    data = response.content
    soup = BeautifulSoup(data, "lxml")  # parsing website using BeautifulSoup
    
    # Navigate data structure
    objects = soup.find_all('div', attrs={'class': 'quote'})  # or soup.find_all('div', class_='quote')
    for quote in objects:
        text = quote.find('span', attrs={'class': "text"})
        try:
            quote_text = str(text.find(string=True, recursive=False))
        except AttributeError:  # text is None when the span is missing
            quote_text = " "
        text_lst.append(quote_text)

        author = quote.find('small', attrs={'class': 'author'})
        try:
            quote_author = str(author.find(string=True, recursive=False))
        except AttributeError:  # author is None when the element is missing
            quote_author = " "
        author_lst.append(quote_author)

        tags = quote.find_all('a', attrs={'class': 'tag'})
        # Join the tag names into one comma-separated string (empty if there are no tags)
        quote_tags = ', '.join(str(tag.find(string=True, recursive=False)) for tag in tags)
        tags_lst.append(quote_tags)

scraping http://quotes.toscrape.com/page/0/
scraping http://quotes.toscrape.com/page/1/
scraping http://quotes.toscrape.com/page/2/
scraping http://quotes.toscrape.com/page/3/
scraping http://quotes.toscrape.com/page/4/
scraping http://quotes.toscrape.com/page/5/
scraping http://quotes.toscrape.com/page/6/
scraping http://quotes.toscrape.com/page/7/
scraping http://quotes.toscrape.com/page/8/
scraping http://quotes.toscrape.com/page/9/
scraping http://quotes.toscrape.com/page/10/
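
The log above starts at `/page/0/` because `range(top_n)` counts from 0. A minimal sketch of generating 1-based page URLs follows, assuming the site numbers its pages from 1 (an assumption about the site, not something this notebook confirms). The helper name `page_urls` and its parameters are hypothetical.

```python
BASE_URL = "http://quotes.toscrape.com/page/{}/"


def page_urls(n_pages, start=1):
    """Build the URLs of n_pages consecutive pages, beginning at page `start`."""
    return [BASE_URL.format(i) for i in range(start, start + n_pages)]


# Print the first three page URLs, starting from page 1 (assumed first page)
for url in page_urls(3):
    print(url)
```

Passing `start=0` reproduces the numbering used in the loop above.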


In [8]:
# Create pandas dataframe from the scraped data
df = pd.DataFrame({"text": text_lst, "author": author_lst, "tags": tags_lst})  # default index is 0..n-1
df.head(100)

|  | author | tags | text |
|---|---|---|---|
| 0 | Albert Einstein | change, deep-thoughts, thinking, world | “The world as we have created it is a process of our thinking. It cannot be changed without changing our thinking.” |
| 1 | J.K. Rowling | abilities, choices | “It is our choices, Harry, that show what we truly are, far more than our abilities.” |
| 2 | Albert Einstein | inspirational, life, live, miracle, miracles | “There are only two ways to live your life. One is as though nothing is a miracle. The other is as though everything is a miracle.” |
| 3 | Jane Austen | aliteracy, books, classic, humor | “The person, be it gentleman or lady, who has not pleasure in a good novel, must be intolerably stupid.” |
| 4 | Marilyn Monroe | be-yourself, inspirational | “Imperfection is beauty, madness is genius and it's better to be absolutely ridiculous than absolutely boring.” |
| 5 | Albert Einstein | adulthood, success, value | “Try not to become a man of success. Rather become a man of value.” |
| 6 | André Gide | life, love | “It is better to be hated for what you are than to be loved for what you are not.” |
| 7 | Thomas A. Edison | edison, failure, inspirational, paraphrased | “I have not failed. I've just found 10,000 ways that won't work.” |
| 8 | Eleanor Roosevelt | misattributed-eleanor-roosevelt | “A woman is like a tea bag; you never know how strong it is until it's in hot water.” |
| 9 | Steve Martin | humor, obvious, simile | “A day without sunshine is like, you know, night.” |
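
Because the tags column holds one comma-separated string per quote, it usually needs splitting before it can be analysed. The sketch below shows one possible way, not the notebook's own method: `sample` is a hypothetical frame built from three rows copied from the output above, and `tag_rows`, `tag_counts`, and `quotes_per_author` are illustrative names.

```python
import pandas as pd

# Three rows copied from the output above; same column names as df
sample = pd.DataFrame({
    "text": [
        "“The world as we have created it is a process of our thinking. It cannot be changed without changing our thinking.”",
        "“It is our choices, Harry, that show what we truly are, far more than our abilities.”",
        "“There are only two ways to live your life. One is as though nothing is a miracle. The other is as though everything is a miracle.”",
    ],
    "author": ["Albert Einstein", "J.K. Rowling", "Albert Einstein"],
    "tags": [
        "change, deep-thoughts, thinking, world",
        "abilities, choices",
        "inspirational, life, live, miracle, miracles",
    ],
})

# Split the comma-separated tag string into a list, then give each tag its own row
tag_rows = sample.assign(tag=sample["tags"].str.split(", ")).explode("tag")

# Count how many quotes carry each tag and how many quotes each author has
tag_counts = tag_rows["tag"].value_counts()
quotes_per_author = sample["author"].value_counts()
print(quotes_per_author)
```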
