## Use Splinter
Splinter provides us with many ways to interact with webpages. It can input terms into a Google search bar for us and click the Search button, or even log us into our email accounts by inputting a username and password combination.

In the very first cell, we'll import our scraping tools: Browser from Splinter, BeautifulSoup from bs4, and ChromeDriverManager, which downloads and manages the driver for Chrome.
We're importing BeautifulSoup under an alias, "soup," to simplify our code a bit when we reference it later.

In [1]:
# Import Splinter and BeautifulSoup
from splinter import Browser
from bs4 import BeautifulSoup as soup
from webdriver_manager.chrome import ChromeDriverManager

Next, we'll set the executable path and initialize a browser. With these two lines of code, we are creating an instance of a Splinter browser.  This means that we're prepping our automated browser.  We're also specifying that we'll be using Chrome as our browser.

In [2]:
# Set up Splinter
executable_path = {'executable_path': ChromeDriverManager().install()}
browser = Browser('chrome', **executable_path, headless=False)
    #**executable_path is unpacking the dictionary we've stored the path in – think of it as unpacking a suitcase
    # headless=False means that all of the browser's actions will be displayed in a Chrome window so we can see them.



Current google-chrome version is 92.0.4515
Get LATEST driver version for 92.0.4515
Driver [C:\Users\Daniel Brock\.wdm\drivers\chromedriver\win32\92.0.4515.107\chromedriver.exe] found in cache
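
The `**executable_path` unpacking used in the setup cell can be tried on its own, without launching a browser. The snippet below is only a sketch: `open_browser` is a hypothetical stand-in for any function that accepts keyword arguments, and the driver path is a placeholder, not a real location.

```python
# Hypothetical stand-in for a function that accepts keyword arguments,
# similar in shape to the Browser() call in the setup cell.
def open_browser(name, executable_path=None, headless=True):
    return {'name': name, 'executable_path': executable_path, 'headless': headless}

# Placeholder path, for illustration only
options = {'executable_path': '/path/to/chromedriver'}

# Unpacking the dictionary with ** passes each key as a keyword argument...
unpacked = open_browser('chrome', **options, headless=False)

# ...so it is equivalent to writing the keyword argument out by hand.
explicit = open_browser('chrome', executable_path='/path/to/chromedriver', headless=False)

print(unpacked == explicit)
```

Storing the path in a dictionary keeps the Browser() call short and makes it easy to add more driver options later.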


## Practice with Splinter and BeautifulSoup
### Scrape the Title

In [3]:
# Visit the Quotes to Scrape site
    # This code tells Splinter which site we want to visit by assigning the link to the url variable.
url = 'http://quotes.toscrape.com/'
browser.visit(url)

After executing the cell above, we will use BeautifulSoup to parse the HTML.

In [4]:
# Parse the HTML
html = browser.html
html_soup = soup(html, 'html.parser')

Now we've parsed all of the HTML on the page. That means that BeautifulSoup has taken a look at the different components and can now access them. Specifically, BeautifulSoup parses the HTML text and then stores it as an object.
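
To see what "stores it as an object" looks like without opening a browser, we can parse a small HTML string directly. This is a minimal sketch: the `sample_html` string and the `sample_soup` name are made up for illustration and are not part of the Quotes to Scrape page.

```python
from bs4 import BeautifulSoup as soup

# A tiny, made-up HTML document used only for this illustration
sample_html = """
<html>
  <head><title>Sample page</title></head>
  <body>
    <h2>Top Ten tags</h2>
    <p class="intro">Hello</p>
  </body>
</html>
"""

# Parsing returns a BeautifulSoup object that we can search
sample_soup = soup(sample_html, 'html.parser')

print(type(sample_soup).__name__)                   # name of the parsed object's class
print(sample_soup.title.text)                       # text inside the <title> tag
print(sample_soup.find('p', class_='intro').text)   # text inside <p class="intro">
```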

In [5]:
# In our next cell, we will find the title and extract it.

# Scrape the Title
    # We used our html_soup object we created earlier and chained find() to it to search for the <h2 /> tag.
    # We've also extracted only the text within the HTML tags by adding .text to the end of the code.
title = html_soup.find('h2').text
title

'Top Ten tags'
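
One thing to keep in mind is that find() returns None when nothing matches, so chaining .text onto it raises an AttributeError. The sketch below shows one possible way to guard against that; the `text_or_default` helper and the sample HTML are hypothetical and not part of the original notebook.

```python
from bs4 import BeautifulSoup as soup

# Made-up HTML for illustration; it has an <h2> but no <h3>
sample_soup = soup('<div><h2>Top Ten tags</h2></div>', 'html.parser')

def text_or_default(parsed, tag_name, default=''):
    """Return the text of the first matching tag, or a default if none is found."""
    element = parsed.find(tag_name)  # find() returns None when there is no match
    return element.text.strip() if element is not None else default

found = text_or_default(sample_soup, 'h2')
missing = text_or_default(sample_soup, 'h3', default='(no title found)')

print(found)
print(missing)
```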

### Scrape All of the Tags

In [6]:
# Scrape the top ten tags
    # Create a new variable tag_box, which will be used to store the results of a search.
        # In this case, we're looking for <div /> elements with a class of tags-box, and we're searching for it in the HTML we parsed earlier and stored in the html_soup variable.
tag_box = html_soup.find('div', class_='tags-box')
# tag_box
    # Create a tags variable to hold the results of a find_all. This time we're searching through the parsed results stored in our tag_box variable to find <a /> elements with a class of tag.
        # Use find_all this time because we want to capture all results, instead of a single or specific one.
tags = tag_box.find_all("a", class_='tag')

# Add a for loop
    # This for loop cycles through each tag in the tags variable, strips the HTML code out of it, and then prints only the text of each tag.
for tag in tags: # a for loop that cycles through each tag in the list
    word = tag.text # strips the HTML from the tag and assigns the result to a variable
    print(word) # prints each word in the list

love
inspirational
life
humor
books
reading
friendship
friends
truth
simile
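
If we'd rather keep the tags than only print them, the same find and find_all pattern can feed a list comprehension. The snippet below is a sketch that runs on a simplified, made-up stand-in for the tags box (three tags only), so the names `sample_html` and `tag_words` are assumptions for illustration.

```python
from bs4 import BeautifulSoup as soup

# Simplified stand-in for the tags box on the page (illustrative only)
sample_html = """
<div class="tags-box">
  <h2>Top Ten tags</h2>
  <span class="tag-item"><a class="tag" href="/tag/love/">love</a></span>
  <span class="tag-item"><a class="tag" href="/tag/inspirational/">inspirational</a></span>
  <span class="tag-item"><a class="tag" href="/tag/life/">life</a></span>
</div>
"""
sample_soup = soup(sample_html, 'html.parser')

# find() narrows the search to the first matching <div>
tag_box = sample_soup.find('div', class_='tags-box')

# find_all() returns every matching <a>; the comprehension keeps only the text
tag_words = [tag.text for tag in tag_box.find_all('a', class_='tag')]

print(tag_words)
```

Collecting the results in a list makes them easy to reuse later, for example to store them or load them into a DataFrame.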


### Scrape Across Pages

In [9]:
for x in range(1, 6): # a for loop with five iterations (pages 1-5)
   html = browser.html # Get the current page's HTML and assign it to the html variable
   quote_soup = soup(html, 'html.parser') # Use BeautifulSoup to parse the HTML
   quotes = quote_soup.find_all('span', class_='text') # Use BeautifulSoup to find all <span /> tags with a class of "text"
   for quote in quotes: # Print statements wrapped in another for loop that prints each parsed quote
      print('page:', x, '----------')
      print(quote.text)
   browser.links.find_by_partial_text('Next').click() # Use Splinter to click the "Next" button

page: 1 ----------
“The world as we have created it is a process of our thinking. It cannot be changed without changing our thinking.”
page: 1 ----------
“It is our choices, Harry, that show what we truly are, far more than our abilities.”
page: 1 ----------
“There are only two ways to live your life. One is as though nothing is a miracle. The other is as though everything is a miracle.”
page: 1 ----------
“The person, be it gentleman or lady, who has not pleasure in a good novel, must be intolerably stupid.”
page: 1 ----------
“Imperfection is beauty, madness is genius and it's better to be absolutely ridiculous than absolutely boring.”
page: 1 ----------
“Try not to become a man of success. Rather become a man of value.”
page: 1 ----------
“It is better to be hated for what you are than to be loved for what you are not.”
page: 1 ----------
“I have not failed. I've just found 10,000 ways that won't work.”
page: 1 ----------
“A woman is like a tea bag; you never know how strong it is u