## Introduction to Selenium 


### What is Selenium?

#### A web automation framework that allows you to interact with websites as if you were a human.
#### Common uses:
- Automated testing.
- Web scraping, especially for dynamic, JavaScript-heavy websites.
  
#### Why Use Selenium for Web Scraping?

- Handles JavaScript-rendered content that requests and Beautiful Soup cannot see, because they fetch and parse static HTML without running scripts.
- Simulates user actions like clicking, scrolling, and typing.
- Works with most modern browsers (e.g., Chrome, Firefox, Edge).
  
#### Basic Components of Selenium:

- WebDriver: The core interface that communicates with the browser.
- Browser Drivers: Specific drivers for each browser (e.g., ChromeDriver for Chrome).
- Locators: Methods to find elements on a webpage (e.g., id, class, xpath).

### Setting Up Selenium 
#### Install Selenium:

- pip install selenium
#### Download Browser Driver:

- Download ChromeDriver from https://googlechromelabs.github.io/chrome-for-testing/
#### Ensure the driver version matches your browser version.
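
A quick way to sanity-check the version rule above is to compare major version numbers before launching anything. This is a minimal sketch: the helper names and the version strings are hypothetical, and it assumes that comparing the leading (major) component is enough for your setup.

```python
def major_version(version: str) -> int:
    """Return the leading (major) component of a dotted version string."""
    return int(version.strip().split(".")[0])


def versions_match(browser_version: str, driver_version: str) -> bool:
    """True when browser and driver share the same major version."""
    return major_version(browser_version) == major_version(driver_version)


# Hypothetical version strings, e.g. copied from chrome://version
# and from running `chromedriver --version`.
browser = "131.0.6778.85"
driver_ok = "131.0.6778.69"
driver_old = "130.0.6723.116"

print(versions_match(browser, driver_ok))   # same major version
print(versions_match(browser, driver_old))  # different major version
```

If the check fails, download the driver build that corresponds to the installed browser.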


In [27]:
import time

import numpy as np
import pandas as pd  # For data manipulation and creating DataFrames
from fake_useragent import UserAgent  # For generating random user agents
from selenium import webdriver
from selenium.webdriver import Chrome
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import Select, WebDriverWait


# Path to the downloaded ChromeDriver executable (adjust for your machine)
service = Service(executable_path=r'C:\Users\H i - G E O R G E\Documents\All_scale\chromedriver.exe')
# Initialize the WebDriver
driver = webdriver.Chrome(service=service)
# Open a website
driver.get("https://twitter.com")
# Print the page title
print(driver.title)
# Close the browser
driver.quit()


X. It’s what’s happening / X


### Finding and Interacting with Elements 
#### Locating Elements:

##### Methods to find elements:

 

In [None]:

element_by_id = driver.find_element(By.ID, "login-button")

element_by_name = driver.find_element(By.NAME, "username")

element_by_class = driver.find_element(By.CLASS_NAME, "submit-button")

element_by_tag = driver.find_element(By.TAG_NAME, "button")

element_by_link = driver.find_element(By.LINK_TEXT, "Sign Up")

element_by_partial_link = driver.find_element(By.PARTIAL_LINK_TEXT, "Sign")

element_by_xpath = driver.find_element(By.XPATH, "//div[@id='loom-companion-mv3']")
# Equivalent XPath that matches any tag with this id: //*[@id='loom-companion-mv3']

# Finding multiple elements
elements_by_class = driver.find_elements(By.CLASS_NAME, "product-item")

# Example combining different locators
element_by_complex_xpath = driver.find_element(
    By.XPATH,
    "//div[@class='container']//button[contains(@id, 'submit')]"
)

element_by_css = driver.find_element(By.CSS_SELECTOR, "#login-form .submit-button")


#### LINK_TEXT:


- Requires an exact, complete match of the link text
- Case-sensitive
- Must match the entire visible text of the link
- More precise but less flexible


#### PARTIAL_LINK_TEXT:


- Matches any link containing the specified text
- Case-sensitive
- Can match a portion of the link text
- More flexible but less precise
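
The difference between the two strategies can be modeled in plain Python. This is only an illustration of the matching rules listed above, not Selenium's implementation; the function names and the sample link texts are hypothetical.

```python
def match_link_text(links, text):
    """Model of LINK_TEXT: the whole visible text must match, case-sensitively."""
    return [link for link in links if link.strip() == text]


def match_partial_link_text(links, text):
    """Model of PARTIAL_LINK_TEXT: the visible text must contain the given text."""
    return [link for link in links if text in link]


# Hypothetical visible texts of the <a> elements on a page
links = ["Sign Up", "Sign In", "sign up for news", "Help"]

exact = match_link_text(links, "Sign Up")
partial = match_partial_link_text(links, "Sign")
no_match = match_link_text(links, "Sign")

print(exact)     # ['Sign Up']
print(partial)   # ['Sign Up', 'Sign In']
print(no_match)  # []
```

Because a partial match can hit several links, `find_element` returns only the first one; use `find_elements` when you need all of them.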


#### ID
- Most preferred method as IDs should be unique
- Typically among the fastest locators
- Most stable for automation
- Note: Ensure IDs are unique in the page


#### NAME
- Common in forms
- Can return multiple elements (as name attribute can be repeated)


#### TAG_NAME
- When you need all elements of a specific type
- For counting elements
- For broad searches
- Usually combined with other attributes in XPath
- Rarely used alone as it's too broad
- Better to use find_elements() as there are usually multiple tags


#### XPATH
- Use when no better option (ID, name) is available
- Suits complex traversal of the DOM
- Suits dynamic elements whose attributes follow a pattern
- Avoid absolute XPath, which breaks as soon as the page layout changes
- Prefer relative XPath starting with //
- Keep XPath as short as possible


In [None]:
# HTML: 
# <div class="container">
#   <form>
#     <input type="text" class="search-input" />
#   </form>
# </div>

# Absolute XPath (not recommended)
element = driver.find_element(By.XPATH, "/html/body/div/form/input")

# Relative XPath (recommended)
element = driver.find_element(By.XPATH, "//input[@class='search-input']")

# Complex XPath examples
# Find by contains
element = driver.find_element(By.XPATH, "//button[contains(@class, 'submit')]")

# Find by text
element = driver.find_element(By.XPATH, "//button[text()='Submit']")

# Find by position
element = driver.find_element(By.XPATH, "(//button[@class='action'])[2]")

# Find parent
element = driver.find_element(By.XPATH, "//input[@id='child']/..")

# Find with multiple conditions
element = driver.find_element(By.XPATH, "//button[@class='primary' and @type='submit']")
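
Some of these expressions can be tried offline, without a browser, using the standard library's `xml.etree.ElementTree`. This is a rough sketch under two assumptions: the HTML below is a hypothetical, well-formed extension of the snippet above, and ElementTree supports only a subset of XPath (no `contains()` or `and`, and searches from an element must start with `.`), so it is not a substitute for testing in the real browser.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed markup based on the snippet above
HTML = """
<div class="container">
  <form>
    <input type="text" class="search-input" />
    <button class="action">Cancel</button>
    <button class="action">Submit</button>
  </form>
</div>
"""

root = ET.fromstring(HTML)

# Relative search by attribute (Selenium: //input[@class='search-input'])
search_input = root.find(".//input[@class='search-input']")

# Search by text (Selenium: //button[text()='Submit'])
submit = root.find(".//button[.='Submit']")

# Search by position: the second <button> child of its parent
second = root.find(".//button[2]")

# Parent of a known element (Selenium: //input[@class='search-input']/..)
parent = root.find(".//input[@class='search-input']/..")

print(search_input.get("type"))  # text
print(submit.text)               # Submit
print(second.text)               # Submit
print(parent.tag)                # form
```

Note that `//button[2]` counts position among siblings, while `(//button[@class='action'])[2]` in the cell above indexes into the full result set; the two agree here only because both buttons share a parent.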

### Clicking buttons, typing into input fields, and getting text

In [None]:
button = driver.find_element(By.ID, "login-button")
button.click()

search_box = driver.find_element(By.NAME, "login_details")
search_box.send_keys("Selenium WebDriver")
search_box.submit()

element = driver.find_element(By.ID, "Warwick")
print(element.text)


### Example Task: Search for a term on Google and navigate to Example.com


In [22]:
service = Service(executable_path=r'C:\Users\H i - G E O R G E\Documents\All_scale\chromedriver.exe')
driver = webdriver.Chrome(service=service)

driver.get("https://www.google.com")

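# "gLFyf" is an auto-generated class name and may change; (By.NAME, "q") is usually a more stable locator
search_box = driver.find_element(By.CLASS_NAME, "gLFyf")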

search_box.send_keys("Selenium Python")

search_box.submit()

driver.get("https://example.com")
driver.back()
driver.forward()
driver.quit()

### Implicit vs Explicit wait
##### An implicit wait makes the driver keep polling for up to a set time whenever it looks up an element that is not yet present, while an explicit wait blocks until a specific condition holds, such as an element being present or clickable.

In [None]:
# Sets a global wait time for the entire session
driver.implicitly_wait(10)  # Waits up to 10 seconds

#### Implicit wait
##### Characteristics:

- Global setting that applies to all elements
- Only needs to be set once per session
- Polls the DOM for specified duration
- Works based on presence of element only
- Less granular control
- Can lead to unexpected behavior with mixed waits
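
The last point can be made concrete with a little arithmetic. The sketch below is a simplified model, not Selenium's code: it assumes the element never appears, that every lookup inside the explicit wait blocks for the full implicit timeout, and that the deadline is checked only after each poll sleep. The function name is hypothetical.

```python
def worst_case_wait(explicit_timeout, implicit_timeout, poll_interval=0.5):
    """Simulated seconds until an explicit wait gives up on a missing element
    when each lookup also blocks for the implicit timeout."""
    elapsed = 0.0
    while True:
        elapsed += implicit_timeout  # the lookup blocks for the implicit wait
        elapsed += poll_interval     # then the explicit wait sleeps before re-checking
        if elapsed > explicit_timeout:
            return elapsed


print(worst_case_wait(explicit_timeout=15, implicit_timeout=10))  # 21.0
print(worst_case_wait(explicit_timeout=10, implicit_timeout=0))   # 10.5
```

Under these assumptions, a 15-second explicit wait combined with a 10-second implicit wait takes about 21 seconds to fail, which is why mixing the two makes timeouts hard to predict.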

##### Best used when:

- Simple scripts with basic interactions
- Consistent page load times
- No dynamic elements
- All elements need same timeout

In [None]:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Creates a wait object with timeout
wait = WebDriverWait(driver, 10)

# Wait for specific condition
element = wait.until(
    EC.element_to_be_clickable((By.ID, "submit-button"))
)

#### Explicit wait
##### Characteristics:

- More precise and flexible
- Applies to specific elements
- Can specify different conditions
- Better for dynamic elements
- More control over timeout behavior
- Can handle different wait conditions
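
Conceptually, an explicit wait is a polling loop around a condition. The following is a minimal, browser-free sketch of that idea; `wait_until`, `WaitTimeout`, and the sample condition are hypothetical stand-ins, loosely modeled on how `WebDriverWait.until` and `TimeoutException` behave.

```python
import time


class WaitTimeout(Exception):
    """Raised when the condition never becomes truthy (stand-in for TimeoutException)."""


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call `condition` repeatedly until it returns a truthy value or time runs out."""
    end_time = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        time.sleep(poll_interval)
        if time.monotonic() > end_time:
            raise WaitTimeout(f"condition not met within {timeout} s")


# Hypothetical condition that succeeds on its third call,
# like an element that appears after a short delay.
attempts = {"count": 0}


def ready_on_third_call():
    attempts["count"] += 1
    return "ready" if attempts["count"] >= 3 else None


value = wait_until(ready_on_third_call, timeout=1.0, poll_interval=0.01)
print(value, attempts["count"])  # ready 3
```

Each expected condition in `EC` plays the role of `condition` here: it is called with the driver on every poll and returns the element (or another truthy value) once the condition holds.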

In [None]:
# Different types of explicit wait conditions
wait = WebDriverWait(driver, 10)

# Wait for element to be clickable
element = wait.until(EC.element_to_be_clickable((By.ID, "button")))

# Wait for element to be visible
element = wait.until(EC.visibility_of_element_located((By.NAME, "username")))

# Wait for element to be present in DOM
element = wait.until(EC.presence_of_element_located((By.CLASS_NAME, "class")))

# Wait for element to disappear
wait.until(EC.invisibility_of_element_located((By.ID, "loading")))

# Wait for text to be present in element
wait.until(EC.text_to_be_present_in_element((By.CLASS_NAME, "message"), "Success"))

# Wait for title to contain specific text
wait.until(EC.title_contains("Dashboard"))

In [None]:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException

# Setup driver
driver = webdriver.Chrome()

# Set implicit wait (not recommended when using explicit waits)
driver.implicitly_wait(10)  # This will affect all subsequent operations

try:
    # Navigate to page
    driver.get("https://example.com")
    
    # Explicit wait - More precise
    wait = WebDriverWait(driver, 10)
    
    # Wait for specific element with condition
    login_button = wait.until(
        EC.element_to_be_clickable((By.ID, "login"))
    )
    
    # Wait for loading spinner to disappear
    wait.until(
        EC.invisibility_of_element_located((By.CLASS_NAME, "loading-spinner"))
    )
    
    # Wait for success message
    wait.until(
        EC.text_to_be_present_in_element(
            (By.CLASS_NAME, "message"),
            "Login Successful"
        )
    )
    
except TimeoutException:
    print("Timed out waiting for element")
finally:
    driver.quit()

In [23]:
import concurrent.futures
import time

import numpy as np
import pandas as pd  # For data manipulation and creating DataFrames
import requests  # For making HTTP requests
from bs4 import BeautifulSoup  # For parsing HTML content
from fake_useragent import UserAgent  # For generating random user agents
from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver import Chrome
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import Select, WebDriverWait

service = Service(executable_path=r'C:\Users\H i - G E O R G E\Documents\All_scale\chromedriver.exe')
driver = webdriver.Chrome(service=service)

driver.get("https://github.com/trending")
driver.implicitly_wait(10)

repos = driver.find_elements(By.XPATH, "//h2[@class='h3 lh-condensed']")

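# By.CLASS_NAME is meant for a single class name; a CSS selector matches several classes at once
descriptions = driver.find_elements(By.CSS_SELECTOR, ".col-9.color-fg-muted.my-1.pr-4")

# Print repository names
for repo in repos:
    print(repo.text)

# Print repository descriptions
for description in descriptions:
    print(description.text)

# Close the browser
driver.quit()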


black-forest-labs / flux
mediar-ai / screenpipe
FreeCAD / FreeCAD
abi / screenshot-to-code
flutter / flutter
gitroomhq / postiz-app
shader-slang / slang
EbookFoundation / free-programming-books
awslabs / multi-agent-orchestrator
openai / openai-cookbook
twentyhq / twenty
primefaces / primevue
dream-num / univer
FortAwesome / Font-Awesome
Official inference repo for FLUX.1 models
rewind.ai x cursor.com = your AI assistant that has all the context. 24/7 screen & voice recording for the age of super intelligence. get your data ready or be left behind
This is the official source code of FreeCAD, a free and opensource multiplatform 3D parametric modeler.
Drop in a screenshot and convert it to clean code (HTML/Tailwind/React/Vue)
Flutter makes it easy and fast to build beautiful apps for mobile and beyond
📨 The ultimate social media scheduling tool, with a bunch of AI 🤖
Making it easier to work with shaders
📚 Freely available programming books
Flexible and powerful framework for managing mul