# Selenium
## Web Browser Automation and Scraping

Selenium is a browser automation tool with Python bindings. It lets you drive a real web browser from code and scrape data off the pages it loads.

###### Download Link and Instructions: http://selenium-python.readthedocs.io/installation.html

# Example 1: Scraping from IMDb

In [18]:
# imports

import numpy as np
import pandas as pd
from selenium import webdriver
import time # this is for sleeping

In [20]:
# initializing the browser and going to a web page

# open Firefox
browser = webdriver.Firefox(executable_path="/Users/stellasotos/desktop/geckodriver")

# go to the web page that we want to scrape from
browser.get('https://www.gofundme.com/salmonwillrun')

# wait for the browser/page to load before doing anything else.
# If you skip this, Selenium may fail on the next command because the
# element it looks for is not there yet. After any command that opens a
# new web page, it is usually a good idea to sleep for a few seconds.
time.sleep(2)
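
A fixed sleep is either too short on a slow connection or wastes time on a fast one. A common alternative is to poll until the thing you need is actually there. The helper below is a minimal sketch of that idea in plain Python; `wait_until` and its parameters are names made up for this illustration, not part of Selenium (Selenium ships its own `WebDriverWait` class built on the same kind of polling).

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call `condition` repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1f seconds" % timeout)
        time.sleep(poll_interval)


# Hypothetical usage with the notebook's browser object:
#   matches = wait_until(lambda: browser.find_elements_by_class_name("goal"))
# find_elements_* returns an empty (falsy) list while nothing matches,
# so the lambda is retried until at least one element appears.
```

The trade-off is a little more code in exchange for waits that end as soon as the page is ready.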

Selenium allows you to select web page elements in a variety of ways:

1. `.find_element_by_class_name`
2. `.find_element_by_css_selector`
3. `.find_element_by_id`
4. `.find_element_by_link_text`
5. `.find_element_by_name`
6. `.find_element_by_partial_link_text`
7. `.find_element_by_tag_name`
8. `.find_element_by_xpath`
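
These `find_element_by_*` helpers belong to older Selenium releases; they were removed in Selenium 4.3 in favour of a single `find_element(by, value)` method. If you are on a newer version, the sketch below shows how each helper maps onto a locator strategy. The strings are the values of Selenium's `By` constants (`By.ID == "id"` and so on); `LEGACY_TO_STRATEGY` and `modern_locator` are hypothetical names used only for this illustration.

```python
# Legacy helper name -> locator strategy string (the value of the matching
# constant on selenium.webdriver.common.by.By).
LEGACY_TO_STRATEGY = {
    "find_element_by_class_name": "class name",
    "find_element_by_css_selector": "css selector",
    "find_element_by_id": "id",
    "find_element_by_link_text": "link text",
    "find_element_by_name": "name",
    "find_element_by_partial_link_text": "partial link text",
    "find_element_by_tag_name": "tag name",
    "find_element_by_xpath": "xpath",
}


def modern_locator(legacy_method, value):
    """Return the (strategy, value) pair to pass to find_element()."""
    try:
        strategy = LEGACY_TO_STRATEGY[legacy_method]
    except KeyError:
        raise ValueError("unknown legacy method: %r" % legacy_method)
    return strategy, value


# Hypothetical usage with a Selenium 4 browser object:
#   browser.find_element(*modern_locator("find_element_by_id", "domain"))
# which is the same call as browser.find_element("id", "domain").
```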

In [27]:
# Select the element we want (find its class name with "inspect element" in your browser) and grab its inner HTML
table = browser.find_element_by_class_name("goal").get_attribute('innerHTML')
print(table)

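If this cell raises `WebDriverException: Message: Tried to run command without establishing a connection`, the browser session is no longer open (for example because `browser.close()` already ran). Rerun the cell that opens Firefox first.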
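
The notebook does not show how the scraped HTML string becomes the `cast_df` data frame that is cleaned further down. As one possible route, the sketch below turns table markup into a list of rows using only the standard library; `TableRows`, `table_rows`, and the sample markup are made up for this illustration, and the notebook may well have used a different method (`pandas.read_html` is a common choice).

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # join the fragments and collapse runs of whitespace
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def table_rows(html):
    """Parse table markup and return its rows as lists of cell text."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return parser.rows


# Made-up markup standing in for the string stored in `table`
sample = """
<tr><td>photo</td><td> Jane Doe </td><td>...</td><td>Captain</td></tr>
<tr><td>photo</td><td>John Roe</td><td>...</td><td>First Mate</td></tr>
"""
rows = table_rows(sample)
```

Passing `rows` to `pd.DataFrame(rows)` would give a data frame with integer column labels, which is the shape the cleaning cell appears to expect when it drops columns `0` and `2`.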


In [None]:
# close the browser now that we've got the data we need.
# Selenium opens a new Firefox window every time you run it, which gets
# annoying, so this is just for convenience.
browser.close()

In [None]:
# a little cleaning to make our data frame pretty…
# (cast_df is assumed to be a pandas DataFrame already holding the scraped cast table)

# drop the columns with useless information
cast_df = cast_df.drop(columns=[0, 2])

# drop the first row of NaNs
cast_df = cast_df.drop(cast_df.index[0])

# name the columns something descriptive
cast_df.columns = ['Actor', 'Role']

# voila!
cast_df

# Example 2: Slack Bot

In [None]:
# first, let's pick a channel and a message to send to that channel

# pick your channel
channel = 'bot_world'
# write a message
message = 'Hi there Slack!'

In [None]:
# open Firefox
browser = webdriver.Firefox()

# go to the Slack sign-in page
browser.get('http://slack.com/signin')

# wait for browser/page to load before doing anything else
time.sleep(2)

In [None]:
# LOGIN

# type the slack team name
browser.find_element_by_id("domain").send_keys("hackcville")

# press continue button
browser.find_element_by_id("submit_team_domain").click()

# wait for next page to load
time.sleep(2)

In [None]:
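# supply email and password for slack
from getpass import getpass

email = input("Email: ")
# getpass hides the password as you type instead of echoing it into the notebook
password = getpass("Password: ")

# type email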
browser.find_element_by_id("email").send_keys(email)
# type password
browser.find_element_by_id("password").send_keys(password)
# click sign in button
browser.find_element_by_id("signin_btn").click()

# wait for the next page to load
time.sleep(15)

In [None]:
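# click the clickable sidebar section heading (this should open the channel browser)
browser.find_element_by_class_name('p-channel_sidebar__section_heading_label--clickable').click()

# find the channel search bar
search_bar = browser.find_element_by_id('channel_browser_filter')

# type the channel name
search_bar.send_keys(channel)
# press enter ('\ue007' is the key code Selenium exposes as Keys.ENTER)
search_bar.send_keys('\ue007')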

In [None]:
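# find the element for the text bar: the <p> inside the empty message editor.
# The editor carries two classes, so a CSS selector is used instead of a class-name lookup.
text_bar = browser.find_element_by_css_selector('.ql-editor.ql-blank p')
# type your message
text_bar.send_keys(message)
# press enter ('\ue007' is the key code Selenium exposes as Keys.ENTER)
text_bar.send_keys('\ue007')

# you can repeat this cell with different messages to send multiple messages to the channel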
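
Instead of rerunning the cell by hand, the same two `send_keys` calls can be wrapped in a loop. The function below is a minimal sketch of that; `send_messages`, `find_text_bar`, and `pause` are hypothetical names introduced here. It looks the text bar up again before every message, on the assumption that the page may rebuild the editor after each send.

```python
import time

ENTER = "\ue007"  # the key code the notebook sends to press Enter


def send_messages(find_text_bar, messages, pause=1.0):
    """Type each message into the text bar and press Enter.

    `find_text_bar` is a zero-argument callable returning the element to
    type into; it is called again for every message in case the page
    replaced the editor after the previous send. Returns the number of
    messages sent.
    """
    sent = 0
    for message in messages:
        text_bar = find_text_bar()
        text_bar.send_keys(message)
        text_bar.send_keys(ENTER)
        sent += 1
        time.sleep(pause)  # give the page a moment before the next message
    return sent


# Hypothetical usage with the notebook's browser object:
#   send_messages(
#       lambda: browser.find_element_by_css_selector('.ql-editor.ql-blank p'),
#       ['Hi there Slack!', 'Second message'],
#   )
```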

So if you ever want to write a message to Slack but you're in class and don't want to get called out for not working, just whip out this script and no one will suspect anything.

## Challenge
1. Scrape the entire table by making Selenium press the "See full cast >>" button at the bottom of the table which opens up the full cast list, not just the first 15 members.

2. Scrape the full cast list from multiple IMDb pages by using the search bar to navigate between the pages.