## Scraping Quotes 2

In [1]:
# These are the needed libraries
import requests
from bs4 import BeautifulSoup

Now that we know how to scrape a single page, how do we scrape multiple pages?  
Data is often spread over multiple pages (pagination), and the total number of pages is usually not known in advance.  
By clicking through all of them, we know that __http://quotes.toscrape.com/__ has 10 pages in total.  
Counting by hand is infeasible for sites with hundreds of pages, or when the number of pages changes over time.  

Right-click the 'Next' button and choose 'Inspect' to view its source.  

```html
<li class="next">
    <a href="/page/2/">
        Next
        <span aria-hidden="true">
            →
        </span>
    </a>
</li>
```

'Next' takes us to http://quotes.toscrape.com/page/2/  
On the second page there is also a 'Previous' button.  
That takes us back to http://quotes.toscrape.com/page/1/

Apparently, each page is identified by the path (**/page/X/**) after the domain (**http://quotes.toscrape.com**).  
So we need to walk through the pages, looking up the 'Next' button on each one to find the path of the next page, until there is no 'Next' button left.  
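
As a small illustration of how a domain and a path combine into a full URL, here is a minimal sketch using `urljoin` from the standard library. The helper name `build_next_url` is made up for this example and is not used elsewhere in the notebook; the URL and `href` values are the ones seen when inspecting the 'Next' button.

```python
from urllib.parse import urljoin


def build_next_url(current_url, href):
    """Resolve the href of a 'Next' link against the URL of the current page.

    Hypothetical helper for illustration only.
    """
    return urljoin(current_url, href)


# Values taken from the first page of the site
current_url = "http://quotes.toscrape.com/page/1/"
next_href = "/page/2/"

next_url = build_next_url(current_url, next_href)
print(next_url)
```

Plain string concatenation of domain and path works as well on this site, because every `href` starts with a slash. `urljoin` may be the safer choice on sites that use relative links.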

In [2]:
# We already know how to parse a page full of quotes
def get_page_soup(url):
    response = requests.get(url)
    text = response.text
    return BeautifulSoup(text, 'html.parser')

In [3]:
# We also know how to extract quotes from a page-soup
def print_quotes(page_soup):
    for quote in page_soup.find_all('div', {'class': 'quote'}):
        # Print the quote text
        quote_text = quote.find('span', {'class': "text"}).text
        print(quote_text)

        # Print the author
        quote_author = quote.find('small', {'class': "author"}).text
        print(quote_author, end=" | ")

        # Print the tags
        for content_element in quote.find_all('a', {'class': "tag"}):
            tag = content_element.text
            print(tag, end=" ")

        # Close author/tag line and add empty line
        print("\n")

In [4]:
# Get the path for the new page or None if no new page
def get_next_path(page_soup):
    li = page_soup.find('li', {"class": "next"})
    
    if li is None:
        # No next page
        return None
    else:
        # Get the path from the link inside the 'next' list item
        return li.find('a')['href']
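
The `None` return value acts as the stop signal when following 'Next' links. The pattern can be tried offline with a minimal sketch like the one below. It makes several assumptions: `walk_pages` is a hypothetical helper, `fake_site` is a made-up dictionary standing in for real requests, and the two safety guards (a set of visited paths and a `max_pages` cap) are additions for illustration, not something the site requires.

```python
def walk_pages(start_path, get_next, max_pages=100):
    """Yield each visited path, following get_next until it returns None.

    Hypothetical helper: get_next is any callable that maps a path to the
    next path, or to None when there is no next page.
    """
    path = start_path
    seen = set()
    # Stop on None, on a path we already visited, or when the cap is reached
    while path is not None and path not in seen and len(seen) < max_pages:
        seen.add(path)
        yield path
        path = get_next(path)


# Made-up three-page site: each path maps to the next one, the last to None
fake_site = {
    "/page/1/": "/page/2/",
    "/page/2/": "/page/3/",
    "/page/3/": None,
}

visited = list(walk_pages("/page/1/", fake_site.get))
print(visited)
```

The visited-set guard is a precaution for sites where a 'Next' link might point back to an earlier page, which would otherwise make the loop run forever.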

In [5]:
# Let's put it all together
def loop_pages(page_url):
    while True:
        # Get the 'soup'
        page_soup = get_page_soup(page_url)

        # Print all the quotes on the page
        print_quotes(page_soup)    
        
        # Get the path for the next page
        next_path = get_next_path(page_soup)
        
        # This was the last page
        if next_path is None:
            break
            
        # The URL of the next page
        page_url = base_url + next_path
        
# Domain and start URL
base_url = "http://quotes.toscrape.com"
start_url = base_url + "/page/1/"

# Let's go!
loop_pages(start_url)

“The world as we have created it is a process of our thinking. It cannot be changed without changing our thinking.”
Albert Einstein | change deep-thoughts thinking world 

“It is our choices, Harry, that show what we truly are, far more than our abilities.”
J.K. Rowling | abilities choices 

“There are only two ways to live your life. One is as though nothing is a miracle. The other is as though everything is a miracle.”
Albert Einstein | inspirational life live miracle miracles 

“The person, be it gentleman or lady, who has not pleasure in a good novel, must be intolerably stupid.”
Jane Austen | aliteracy books classic humor 

“Imperfection is beauty, madness is genius and it's better to be absolutely ridiculous than absolutely boring.”
Marilyn Monroe | be-yourself inspirational 

“Try not to become a man of success. Rather become a man of value.”
Albert Einstein | adulthood success value 

“It is better to be hated for what you are than to be loved for what you are not.”
André Gide 

“Anyone who has never made a mistake has never tried anything new.”
Albert Einstein | mistakes 

“A lady's imagination is very rapid; it jumps from admiration to love, from love to matrimony in a moment.”
Jane Austen | humor love romantic women 

“Remember, if the time should come when you have to make a choice between what is right and what is easy, remember what happened to a boy who was good, and kind, and brave, because he strayed across the path of Lord Voldemort. Remember Cedric Diggory.”
J.K. Rowling | integrity 

“I declare after all there is no enjoyment like reading! How much sooner one tires of any thing than of a book! -- When I have a house of my own, I shall be miserable if I have not an excellent library.”
Jane Austen | books library reading 

“There are few people whom I really love, and still fewer of whom I think well. The more I see of the world, the more am I dissatisfied with it; and every day confirms my belief of the inconsistency of all human characters, and of 