# **Web Scraping and Automation using Selenium in Python**

Selenium is a browser automation tool. Scripts written with it interact with web pages the way a user would: clicking buttons, entering text, and navigating between pages. It is commonly used for web scraping, testing web applications, and automating repetitive tasks on the web.
Selenium has bindings for several programming languages, including Python, Java, and C#. Its API covers simulating user interactions, extracting data from web pages, and repeating the same actions across many pages.

## Initializing Selenium

In [1]:
import selenium
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

In [3]:
url = 'https://en.wikipedia.org/wiki/Game_of_Thrones'
driver = webdriver.Chrome()
driver.get(url)

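# let element lookups retry for up to 10 seconds before failing
# (an implicit wait applies to find_element calls, not to the page load itself)
driver.implicitly_wait(10)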
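
The implicit wait above applies to every element lookup. When only one specific element matters, an explicit wait is a common alternative. The sketch below assumes the page is ready once an `h1` exists; the function names `first_heading_or_false` and `wait_for_heading` are made up for this example, while `WebDriverWait(...).until(...)` is Selenium's own API.

```python
def first_heading_or_false(driver):
    """Wait condition: return the first <h1> element, or False if none exists yet."""
    # "tag name" is the string value of By.TAG_NAME
    headings = driver.find_elements("tag name", "h1")
    return headings[0] if headings else False


def wait_for_heading(driver, timeout=10):
    """Poll until an <h1> is present; raises TimeoutException after `timeout` seconds."""
    # Imported here so the condition above can be tried out without a browser session.
    from selenium.webdriver.support.ui import WebDriverWait

    return WebDriverWait(driver, timeout).until(first_heading_or_false)


# Usage with the driver created above (commented out because it needs a live browser):
# heading = wait_for_heading(driver, timeout=10)
# print(heading.text)
```

`until` keeps calling the condition with the driver until it returns something truthy, so any function of that shape can serve as a condition.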

## Getting Elements from a Page

#### The `By` class defines the following locator strategies for selecting elements:

    ID = "id"
    NAME = "name"
    XPATH = "xpath"
    LINK_TEXT = "link text"
    PARTIAL_LINK_TEXT = "partial link text"
    TAG_NAME = "tag name"
    CLASS_NAME = "class name"
    CSS_SELECTOR = "css selector"

#### Pass one of the `By` constants to `find_element`, together with the value to match, to locate an element on the page:

- find_element(By.ID, "id")
- find_element(By.NAME, "name")
- find_element(By.XPATH, "xpath")
- find_element(By.LINK_TEXT, "link text")
- find_element(By.PARTIAL_LINK_TEXT, "partial link text")
- find_element(By.TAG_NAME, "tag name")
- find_element(By.CLASS_NAME, "class name")
- find_element(By.CSS_SELECTOR, "css selector")
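
Because each `By` constant is just the string listed earlier, a locator can be stored as a `(strategy, value)` pair and unpacked into `find_elements`. The following is a minimal sketch of that idea; the `LOCATORS` dictionary and the `texts_for` helper are illustrative names, not part of Selenium.

```python
# Locators as (strategy, value) pairs. The strategy strings are the values
# of the By constants, e.g. By.TAG_NAME == "tag name".
LOCATORS = {
    "title": ("tag name", "h1"),
    "section_titles": ("tag name", "h3"),
    "paragraphs": ("css selector", "p"),
}


def texts_for(driver, locator):
    """Return the visible text of every element matching the locator."""
    by, value = locator
    return [element.text for element in driver.find_elements(by, value)]


# Usage with a live driver (commented out because it needs a browser):
# print(texts_for(driver, LOCATORS["section_titles"]))
```

Keeping locators in one place makes it easier to update a scraper when the page markup changes.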

#### Getting the main heading of the page we are on

In [8]:
# get the first h1 element on the page
heading1 = driver.find_element(By.TAG_NAME, 'h1')
print(heading1.text)

Game of Thrones
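
`find_element` raises `NoSuchElementException` when nothing matches, whereas `find_elements` returns an empty list. A scraper that should keep going when an element is missing can build on the second behaviour. The helper below is a small sketch of that pattern; `text_or_default` is a name invented for this example.

```python
def text_or_default(driver, by, value, default=""):
    """Return the stripped text of the first match, or `default` if nothing matches."""
    matches = driver.find_elements(by, value)  # empty list when there is no match
    return matches[0].text.strip() if matches else default


# Usage with a live driver (commented out because it needs a browser):
# print(text_or_default(driver, "tag name", "h1", default="(no heading)"))
```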


#### Let's get the H3 tags

In [9]:
# get every h3 element (find_elements returns a list)
heading3 = driver.find_elements(By.TAG_NAME, 'h3')

for heading in heading3:
    print(heading.text)

Plot
Cast and characters
Themes
Inspirations and derivations
Conception and development
Casting
Writing
Filming
Directing
Production design
Visual effects
Music
Language
Broadcast
Home media and streaming
Copyright infringement
Critical response
Cultural influence
Awards
Viewership
Video games
Merchandise and exhibition
Thronecast
After the Thrones
Home media extras
Successors


## Selecting all the text paragraphs

In [11]:
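# find all the paragraphs on the page
# (use find_elements here: find_element returns a single WebElement, and looping
# over it raises "TypeError: 'WebElement' object is not iterable")
paragraphs = driver.find_elements(By.TAG_NAME, 'p')

for p in paragraphs:
    print(p.text)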

### Selecting Images 

In [None]:
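import requests

# Find all img elements on the page
# (find_elements_by_tag_name was removed in Selenium 4)
img_elements = driver.find_elements(By.TAG_NAME, "img")

# Loop through the img elements and download the images
for i, img_element in enumerate(img_elements):
    # Extract the src attribute containing the URL of the image
    img_url = img_element.get_attribute("src")

    # Skip images without a downloadable http(s) URL (missing src, data: URIs)
    if not img_url or not img_url.startswith("http"):
        continue

    # Download the image using the requests library
    response = requests.get(img_url, timeout=10)
    if not response.ok:
        continue

    # Save the image to your local system
    # (every file gets a .png name here, whatever the actual image format is)
    with open(f"image_{i}.png", "wb") as f:
        f.write(response.content)

# Close the browser and end the WebDriver session
driver.quit()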
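
The download loop saves every file as `.png`, although pages usually mix PNG, JPEG, and SVG images. One way to keep the real extension is to derive the file name from the URL path. The sketch below uses only the standard library; the helper name `image_filename` and the `.bin` fallback are choices made for this example.

```python
import posixpath
from urllib.parse import urlparse


def image_filename(img_url, index):
    """Build a local file name that keeps the extension found in the image URL.

    Returns None for URLs that cannot be downloaded over http(s),
    such as a missing src attribute or an inline data: URI.
    """
    if not img_url or not img_url.startswith(("http://", "https://")):
        return None
    # Take the extension from the URL path only, ignoring any query string.
    extension = posixpath.splitext(urlparse(img_url).path)[1].lower()
    return f"image_{index}{extension or '.bin'}"


print(image_filename("https://example.org/a/b/Logo.SVG?x=1", 3))  # image_3.svg
```

Inside the loop, the result could replace the hard-coded `f"image_{i}.png"`, skipping the image when the helper returns `None`.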


### Clicking on a button, sending keys 

In [14]:
search_query = "selenium automation"

# Open the browser and go to Google
browser = webdriver.Chrome()
browser.get("https://www.google.com")

# Find the search box by its name attribute
search_box = browser.find_element(By.NAME, "q")
# Type the query into the search box
search_box.send_keys(search_query)
# Press ENTER to submit the search
search_box.send_keys(Keys.ENTER)
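
The cell above submits the search with the ENTER key. Clicking a button works the same way: locate the element, then call its `click()` method. The helper below is a rough sketch; `type_and_click` is an invented name, and the button selector in the usage comment is a placeholder that has to be replaced with one matching the real page.

```python
def type_and_click(driver, box_locator, button_locator, text):
    """Type `text` into an input field, then click a button.

    Both locators are (strategy, value) pairs such as ("name", "q").
    """
    box = driver.find_element(*box_locator)
    box.clear()           # remove any text already in the field
    box.send_keys(text)   # type the new text
    driver.find_element(*button_locator).click()


# Usage with a live browser (commented out; the button selector is a placeholder):
# type_and_click(
#     browser,
#     ("name", "q"),
#     ("css selector", "button[type='submit']"),
#     "selenium automation",
# )
```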

#### Your challenge 

Select the search button on the main page of Wikipedia and search for something!

In [None]:
#YOUR CODE