# Data Extraction with Selenium
This tutorial shows how to use Selenium to drive a real browser and extract data from web pages.  See https://selenium-python.readthedocs.io for more details.

## Installation
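Before using Selenium, we have to install a WebDriver for the browser of our choice, for example ChromeDriver for Chrome or geckodriver for Firefox.  If the driver is installed manually, we need to know the location of its executable, because that path is passed as a parameter when starting the browser.  In this tutorial, `webdriver_manager` downloads ChromeDriver and returns its path for us.

We also have to install the `selenium` package, together with `webdriver-manager`, which the code below imports.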
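
Before starting a browser, it can help to confirm that both packages are importable. The sketch below is a plain-Python check and not part of Selenium; the helper name `missing_packages` is hypothetical. It assumes the packages were installed under their usual PyPI names (`selenium` and `webdriver-manager`), which are imported as `selenium` and `webdriver_manager`.

```python
import importlib.util


def missing_packages(names):
    """Return the import names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names used in this tutorial.
required = ["selenium", "webdriver_manager"]
missing = missing_packages(required)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

If a package is reported as missing, install it with pip before running the cells that follow.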

In [None]:
from selenium import webdriver
from selenium.webdriver.chrome.service import Service as ChromeService
from webdriver_manager.chrome import ChromeDriverManager

import time
import os

In [None]:
service = ChromeService(executable_path=ChromeDriverManager().install())
driver = webdriver.Chrome(service=service)

## Browsing a webpage
Once the browser starts, we can tell it to visit a webpage.

In [None]:
url = 'https://www.google.com'

In [None]:
driver.get(url=url)

In [None]:
html = driver.execute_script("return document.documentElement.outerHTML")
html[:3000]
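
The `html` value above is an ordinary string, so it can be handed to any HTML parser. As a minimal sketch, the code below pulls the page title out of an HTML string using only the standard library. The names `TitleParser` and `extract_title` and the sample string are assumptions made for illustration; in the notebook, `html` would take the place of `sample_html`.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text found inside the <title> element."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_title(html_text):
    """Return the stripped <title> text, or an empty string if there is none."""
    parser = TitleParser()
    parser.feed(html_text)
    return parser.title.strip()


# Stand-in for the string returned by the browser.
sample_html = "<html><head><title> Example page </title></head><body></body></html>"
print(extract_title(sample_html))
```

For the title alone, Selenium also exposes `driver.title`; parsing the string is more useful when the page has to be processed after the browser is closed.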

## Interact with a webpage
When the page is loaded, we can interact with all elements in the webpage.  In this example, we will perform a search for a particular keyword in Google.  We will have to locate the correct element and then send the proper keys.

In [None]:
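from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

# Locate the search box by its name; this matches an <input> or a <textarea>
q_element = driver.find_element(By.CSS_SELECTOR, '[name=q]')
q_element.clear()
q_element.send_keys('ประเทศไทย')  # Thai for "Thailand"
q_element.send_keys(Keys.ENTER)  # same key code as '\ue007'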
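
After the search is submitted, the results take a moment to load, so a cell that runs immediately afterwards may find no elements. Selenium provides `WebDriverWait` in `selenium.webdriver.support.ui` for this purpose. The sketch below only illustrates the underlying idea in plain Python: poll a condition until it succeeds or a timeout expires. The helper `wait_until` and the demo condition are hypothetical and not part of Selenium.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call `condition()` until it returns a truthy value, then return it.

    Raises TimeoutError if no truthy value is seen within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


# Demo condition that only succeeds on its third call.
attempts = {"count": 0}


def ready_on_third_call():
    attempts["count"] += 1
    return attempts["count"] >= 3


print(wait_until(ready_on_third_call, timeout=2.0, poll_interval=0.01))
```

With a live browser, the condition could be a lambda that looks up the result links, since an empty list is falsy and the helper keeps polling until at least one element appears.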

## Navigate the webpage
We can navigate the elements of the current page, much as we would with Beautiful Soup.  Selenium supports several locator strategies, such as CSS selectors, XPath, and element IDs.

In [None]:
from selenium.webdriver.common.by import By
all_link = driver.find_elements(By.CSS_SELECTOR, '.g a')

In [None]:
for link in all_link:
    print(link.text, link.get_attribute('href'))
    print('---')
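
Printing is fine for inspection, but extracted data is usually kept in a structure that can be saved later. The sketch below collects the text and `href` of each link into a list of dictionaries, skipping links without an `href` and duplicates. The function `links_to_records` and the `FakeLink` stand-in are assumptions for illustration; in the notebook, `all_link` would be passed in place of `demo_links`.

```python
def links_to_records(elements):
    """Convert link elements to dicts, skipping empty and duplicate hrefs."""
    records = []
    seen = set()
    for element in elements:
        href = element.get_attribute("href")
        if not href or href in seen:
            continue
        seen.add(href)
        records.append({"text": element.text.strip(), "href": href})
    return records


class FakeLink:
    """Stand-in for a Selenium element, for demonstration only."""

    def __init__(self, text, href):
        self.text = text
        self._href = href

    def get_attribute(self, name):
        return self._href if name == "href" else None


demo_links = [
    FakeLink("A", "https://example.com/a"),
    FakeLink("", None),
    FakeLink("A again", "https://example.com/a"),
]
records = links_to_records(demo_links)
print(records)
```

The function relies only on `.text` and `.get_attribute("href")`, the same two members used in the loop above.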

In [None]:
all_link[0].click()

In [None]:
from selenium.webdriver.common.by import By
all_toc = driver.find_elements(By.CSS_SELECTOR, 'li[class^=toclevel]')

In [None]:
all_toc

In [None]:
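from selenium.webdriver.common.by import By
for toc in all_toc:
    # Search inside each table-of-contents item for its link
    a = toc.find_element(By.CSS_SELECTOR, 'a')
    print(a.text, toc.get_attribute('class'), a.get_attribute('href'))
    print('---')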
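
The `class` attribute printed above can carry the nesting depth of each table-of-contents entry. The sketch below assumes the class follows a `toclevel-N` pattern (for example `toclevel-2 tocsection-5`), which is what the selector `li[class^=toclevel]` suggests but which depends on the page being scraped. The helpers `toc_level` and `indent_entry` are hypothetical.

```python
import re


def toc_level(class_attr):
    """Return N from a 'toclevel-N' class name, or None if it is absent."""
    match = re.search(r"\btoclevel-(\d+)\b", class_attr or "")
    return int(match.group(1)) if match else None


def indent_entry(text, class_attr, indent="  "):
    """Indent `text` by its table-of-contents depth (level 1 is not indented)."""
    level = toc_level(class_attr) or 1
    return indent * (level - 1) + text


# Made-up entries standing in for (a.text, toc.get_attribute('class')).
demo_entries = [
    ("History", "toclevel-1 tocsection-1"),
    ("Early period", "toclevel-2 tocsection-2"),
]
for text, class_attr in demo_entries:
    print(indent_entry(text, class_attr))
```

Entries whose class does not match the assumed pattern are treated as top level instead of raising an error.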

In [None]:
from selenium.webdriver.common.by import By
all_toc[2].find_element(By.CSS_SELECTOR, 'a').click()

## End browsing session

In [None]:
driver.quit()
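
If a cell raises an error before `driver.quit()` runs, the browser process stays open. One way to guard against that, sketched below, is a small context manager that always calls `quit()` on exit. The name `browser_session` and the `FakeDriver` class are hypothetical and exist only so the example runs without a browser.

```python
from contextlib import contextmanager


@contextmanager
def browser_session(make_driver):
    """Create a driver with `make_driver()` and always quit it on exit."""
    driver = make_driver()
    try:
        yield driver
    finally:
        driver.quit()


class FakeDriver:
    """Stand-in for a Selenium driver, for demonstration only."""

    def __init__(self):
        self.closed = False

    def quit(self):
        self.closed = True


try:
    with browser_session(FakeDriver) as fake:
        raise RuntimeError("simulated scraping failure")
except RuntimeError:
    pass

print(fake.closed)
```

With a real browser, `make_driver` could be a lambda wrapping the `webdriver.Chrome(service=service)` call used at the start of this tutorial.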