# Selenium


`selenium` is a Python library that drives a real web browser from code: it can open pages, click elements and type text on my behalf.

In [None]:
!pip install selenium

In [1]:
from selenium import webdriver

A robot is very fast. We need to slow it down with pauses, because pages take time to load and the Internet is not infinitely fast.

In [3]:
from time import sleep

In [5]:
print("hola")
sleep(3)
print("adios")

hola
adios
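A fixed `sleep()` either waits too long or not long enough. As a minimal sketch of an alternative, the helper below polls a condition until it holds or a timeout expires. The names `wait_until` and `page_ready` are hypothetical and only illustrate the idea; nothing here talks to a browser.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call condition() every `poll` seconds until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds have passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(poll)


# Stand-in for "the page has loaded": it becomes true on the third check.
checks = {"count": 0}


def page_ready():
    checks["count"] += 1
    return checks["count"] >= 3


wait_until(page_ready, timeout=5, poll=0.1)
print(checks["count"])  # 3
```

Selenium ships its own version of this idea as `WebDriverWait` (in `selenium.webdriver.support.ui`), which is usually preferable to hand-written polling.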


## Webdriver

I need a Chrome **webdriver** so that Python can control the Chrome browser: [Link](https://chromedriver.chromium.org/downloads)

Let's initialize the robot and play with it

In [107]:
driver = webdriver.Chrome("./chromedriver")

In [108]:
driver.get("https://www.elpais.es")
# give the cookie banner time to appear, then accept it
sleep(4)
driver.find_element_by_id("didomi-notice-agree-button").click()
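The cells in this notebook call the `find_element_by_*` helpers. Selenium 4.3 removed them in favor of `find_element(By.<STRATEGY>, value)`, so on a recent install the calls need rewriting. The sketch below does that rewrite on source text with a regular expression; `to_selenium4` and `BY_CONSTANTS` are hypothetical names, and the function only transforms strings, it does not run Selenium.

```python
import re

# Suffix of the old helper name -> `By` constant used by Selenium 4.
BY_CONSTANTS = {
    "id": "By.ID",
    "name": "By.NAME",
    "xpath": "By.XPATH",
    "css_selector": "By.CSS_SELECTOR",
    "class_name": "By.CLASS_NAME",
    "tag_name": "By.TAG_NAME",
    "link_text": "By.LINK_TEXT",
    "partial_link_text": "By.PARTIAL_LINK_TEXT",
}

# Matches e.g. `find_element_by_id(` or `find_elements_by_css_selector(`.
_OLD_CALL = re.compile(r"find_element(s?)_by_(\w+)\(")


def to_selenium4(source):
    """Rewrite old-style locator calls in `source` to the `By`-based form."""
    def replace(match):
        plural, strategy = match.groups()
        if strategy not in BY_CONSTANTS:
            return match.group(0)  # unknown helper: leave untouched
        return f"find_element{plural}({BY_CONSTANTS[strategy]}, "
    return _OLD_CALL.sub(replace, source)


old = 'driver.find_element_by_id("didomi-notice-agree-button").click()'
print(to_selenium4(old))
# driver.find_element(By.ID, "didomi-notice-agree-button").click()
```

The rewritten code additionally needs `from selenium.webdriver.common.by import By`.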

Click on some article

In [10]:
some_article = driver.find_elements_by_css_selector("h2 a")[0]

In [11]:
some_article

<selenium.webdriver.remote.webelement.WebElement (session="269df46ecb77f464f07df19739bb16dc", element="e2eebaee-dce0-4d5f-9399-6698128f8eef")>

In [12]:
some_article.click()

Search for something on Google

In [21]:
driver.get("https://www.google.com")

To select by tag and class, use the CSS syntax `tag.class`. If the element's `class` attribute lists several names separated by spaces, join them with `.` instead.

In [22]:
buscador = driver.find_element_by_css_selector("input.gLFyf.gsfi")
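The rule above can be sketched as a small helper that turns a tag name and the raw `class` attribute into a CSS selector. The function name `css_selector` is hypothetical; it only builds a string.

```python
def css_selector(tag, class_attr):
    """Build a `tag.class1.class2` selector from a raw class attribute."""
    classes = class_attr.split()  # "gLFyf gsfi" -> ["gLFyf", "gsfi"]
    return tag + "".join(f".{name}" for name in classes)


print(css_selector("input", "gLFyf gsfi"))  # input.gLFyf.gsfi
```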

In [23]:
buscador.click()

In [24]:
buscador.send_keys("ironhack")

In [26]:
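# Keys holds the special keys (ENTER, TAB, ...) and must be imported first
from selenium.webdriver.common.keys import Keys

buscador.send_keys(Keys.ENTER)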

## The use case

I need to book an appointment with the Seguridad Social (Spain's social security administration)

In [29]:
url = "https://w6.seg-social.es/ProsaInternetAnonimo/OnlineAccess?ARQ.SPM.ACTION=LOGIN&ARQ.SPM.APPTYPE=SERVICE&ARQ.IDAPP=XV106001"

### Access application form

In [110]:
driver.get(url)

In [111]:
driver.find_element_by_xpath('//*[@id="SELECCIONAR1"]').click()

 * Web scraping lets us access the static HTML of the webpage
 * Web **crawling** (here, automating the browser) lets us interact dynamically with the page!

### Fill in the fields

Exploring...

In [112]:
user_data = {
    "name": "Francisco Lopez",
    "NIF": "12345678Z",
    "telf": "666555888",
    "mail": "pacolopez99@gmail.com"
}

Fill in the form with your own information using `.send_keys()`

In [113]:
driver.find_element_by_name("nombreApellidos").send_keys(user_data.get("name"))

driver.find_element_by_name("tipo").send_keys("N")

driver.find_element_by_name("numeroDocumento").send_keys(user_data.get("NIF"))
driver.find_element_by_name("telefono").send_keys(user_data.get("telf"))
driver.find_element_by_name("eMail").send_keys(user_data.get("mail"))

driver.find_element_by_id("radioProvincia").click()
driver.find_element_by_name("provincia").send_keys("m")
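Because the cell above reads values with `dict.get()`, a missing or misspelled key silently yields `None` and the problem only surfaces in the middle of the form. A minimal sketch of a check to run before touching the browser, assuming the four keys used in this notebook are all required (`missing_fields` and `REQUIRED_FIELDS` are hypothetical names):

```python
REQUIRED_FIELDS = ("name", "NIF", "telf", "mail")


def missing_fields(user_data, required=REQUIRED_FIELDS):
    """Return the required keys that are absent or empty in user_data."""
    return [key for key in required if not user_data.get(key)]


complete = {
    "name": "Francisco Lopez",
    "NIF": "12345678Z",
    "telf": "666555888",
    "mail": "pacolopez99@gmail.com",
}
incomplete = {"name": "Francisco Lopez", "NIF": ""}

print(missing_fields(complete))    # []
print(missing_fields(incomplete))  # ['NIF', 'telf', 'mail']
```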

### Solve captcha

In [114]:
import random

`.text` is used to access a tag's textual content

In [64]:
words = driver.find_element_by_css_selector("p.p0").text.split(": ")[:-1]

In [65]:
words

['Cerdo', 'Abuelo', 'Desván', 'Flauta', 'Madrastra']
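To see what `.split(": ")[:-1]` does, here is the same expression applied to a plain string. The sample text is made up for illustration (the real label on the page may look different); the point is only that `split` cuts on every `": "` and `[:-1]` drops the last piece.

```python
# Hypothetical label text, invented for this example.
sample_text = "Cerdo: Abuelo: Desván: Flauta: Madrastra: (rest of label)"

pieces = sample_text.split(": ")  # cut at every ": "
words = pieces[:-1]               # drop the trailing piece

print(pieces[-1])  # (rest of label)
print(words)       # ['Cerdo', 'Abuelo', 'Desván', 'Flauta', 'Madrastra']
```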

In [78]:
our_choice = random.choice(words)

In [79]:
our_choice

'Abuelo'

In [82]:
driver.find_element_by_id("ARQ.CAPTCHA").send_keys(our_choice)

In [83]:
# press Siguiente (Next)
driver.find_element_by_id("SPM.ACC.SIGUIENTE").click()

Build a while loop that keeps guessing until we pass

In [98]:
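from selenium.common.exceptions import NoSuchElementException

def we_passed():
    # the guess was rejected if the page shows an error message
    try:
        driver.find_element_by_css_selector("li.mensajeError")
        return False
    except NoSuchElementException:
        # no error message on the page: the captcha was accepted
        return True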

In [115]:
while True: 
    sleep(1)
    words = driver.find_element_by_css_selector("p.p0").text.split(": ")[:-1]
    our_choice = random.choice(words)
    driver.find_element_by_id("ARQ.CAPTCHA").send_keys(our_choice)
    # press Siguiente (Next)
    driver.find_element_by_id("SPM.ACC.SIGUIENTE").click()
    sleep(1)
    if we_passed():
        print("passed")
        break
    else:
        print("failed")

failed
failed
failed
failed
failed
failed
failed
passed
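A `while True` loop like the one above never stops if the check never succeeds. As a sketch of a bounded variant, the helper below retries a callable a limited number of times and reports how many tries it took. `retry` and the stand-in `guess_once` are hypothetical; the sequence of outcomes is scripted so the example runs without a browser.

```python
import time


def retry(attempt, max_attempts=20, pause=0.0):
    """Call attempt() until it returns True; return the number of tries used.

    Raises RuntimeError if every one of the max_attempts tries fails.
    """
    for tries in range(1, max_attempts + 1):
        if attempt():
            return tries
        time.sleep(pause)
    raise RuntimeError(f"gave up after {max_attempts} attempts")


# Scripted stand-in for one captcha round: fails twice, then passes.
outcomes = iter([False, False, True])


def guess_once():
    return next(outcomes)


print(retry(guess_once))  # 3
```

In the notebook, `attempt` would wrap one round of "type a word, press Siguiente, call `we_passed()`".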


### Choose option

In [116]:
driver.find_element_by_id("335").click()

driver.find_element_by_id("SPM.ACC.CONTINUAR_TRAS_SELECCIONAR_SERVICIO").click()

Build a while loop that keeps retrying until we pass

In [118]:
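from selenium.common.exceptions import NoSuchElementException

def we_passed_second_step():
    # we have not passed while the page still shows the li.mensajeCpmsTam2 message
    try:
        driver.find_element_by_css_selector("li.mensajeCpmsTam2")
        return False
    except NoSuchElementException:
        # the message is gone: we moved on to the next step
        return True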

In [None]:
while True:
    sleep(2)
    driver.find_element_by_id("335").click()
    driver.find_element_by_id("SPM.ACC.CONTINUAR_TRAS_SELECCIONAR_SERVICIO").click()
    
    if we_passed_second_step():
        break

## Modularization: create a class with methods

In [143]:
import random
from time import sleep

from selenium.common.exceptions import NoSuchElementException


class SSocialCrawler:
    # class variable: shared by every instance (vs instance variables set in __init__)
    url = "https://w6.seg-social.es/ProsaInternetAnonimo/OnlineAccess?ARQ.SPM.ACTION=LOGIN&ARQ.SPM.APPTYPE=SERVICE&ARQ.IDAPP=XV106001"

    def __init__(self, chromedriver, user_data):
        self.driver = chromedriver
        self.user_data = user_data

    def access_form(self):
        self.driver.get(self.url)
        self.driver.find_element_by_xpath('//*[@id="SELECCIONAR1"]').click()

    def fill_fields(self):
        self.driver.find_element_by_name("nombreApellidos").send_keys(self.user_data.get("name"))
        self.driver.find_element_by_name("tipo").send_keys("N")

        self.driver.find_element_by_name("numeroDocumento").send_keys(self.user_data.get("NIF"))
        self.driver.find_element_by_name("telefono").send_keys(self.user_data.get("telf"))
        self.driver.find_element_by_name("eMail").send_keys(self.user_data.get("mail"))

        self.driver.find_element_by_id("radioProvincia").click()
        self.driver.find_element_by_name("provincia").send_keys("m")

    def _message_absent(self, css_selector):
        # True when no element matching css_selector is on the current page.
        # Uses self.driver, so the class no longer depends on a global `driver`.
        try:
            self.driver.find_element_by_css_selector(css_selector)
            return False
        except NoSuchElementException:
            return True

    def we_passed(self):
        return self._message_absent("li.mensajeError")

    def we_passed_second_step(self):
        return self._message_absent("li.mensajeCpmsTam2")

    def solve_captcha(self):
        while True:
            sleep(1)
            words = self.driver.find_element_by_css_selector("p.p0").text.split(": ")[:-1]
            our_choice = random.choice(words)
            self.driver.find_element_by_id("ARQ.CAPTCHA").send_keys(our_choice)
            # press Siguiente (Next)
            self.driver.find_element_by_id("SPM.ACC.SIGUIENTE").click()
            sleep(1)
            if self.we_passed():
                print("passed")
                break
            else:
                print("failed")

    def choose_option(self):
        self.driver.find_element_by_id("335").click()
        self.driver.find_element_by_id("SPM.ACC.CONTINUAR_TRAS_SELECCIONAR_SERVICIO").click()

        # keep retrying every 10 seconds until the second step is passed
        while True:
            sleep(10)
            self.driver.find_element_by_id("335").click()
            self.driver.find_element_by_id("SPM.ACC.CONTINUAR_TRAS_SELECCIONAR_SERVICIO").click()

            if self.we_passed_second_step():
                break
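The "class variable (vs instance variable)" comment in the class can be illustrated with a stripped-down sketch. `Crawler` and the example URLs are hypothetical stand-ins, not the real crawler.

```python
class Crawler:
    url = "https://example.com/form"  # class variable: one value shared by all instances

    def __init__(self, user_data):
        self.user_data = user_data    # instance variable: one value per object


a = Crawler({"name": "Ana"})
b = Crawler({"name": "Luis"})

# Both instances read the same class variable.
print(a.url == b.url)  # True

# Changing it on the class is visible from every instance.
Crawler.url = "https://example.com/other"
print(a.url, b.url)  # https://example.com/other https://example.com/other

# Assigning through an instance creates an instance attribute that shadows it.
a.url = "https://example.com/mine"
print(a.url, b.url)  # https://example.com/mine https://example.com/other
```

Keeping `url` on the class fits here because every crawler targets the same form, while `driver` and `user_data` differ per run.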

In [144]:
driver = webdriver.Chrome("./chromedriver")

In [125]:
user_data

{'name': 'Francisco Lopez',
 'NIF': '12345678Z',
 'telf': '666555888',
 'mail': 'pacolopez99@gmail.com'}

In [141]:
ss = SSocialCrawler(driver, user_data)

In [142]:
ss.access_form()
sleep(1)
print("form done")
ss.fill_fields()
sleep(1)
print("fields filled")
ss.solve_captcha()
sleep(1)
print("captcha done")
ss.choose_option()
sleep(1)

form done
fields filled
failed
failed
failed
passed
captcha done


KeyboardInterrupt: 

You can set up an alarm that sounds when the process has finished, using the `pygame` library

In [148]:
import pygame

In [150]:
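pygame.mixer.init()
pygame.mixer.music.load("./beep.wav")
for _ in range(20):
    pygame.mixer.music.play()
    # wait for the beep to end; otherwise the next play() restarts it immediately
    while pygame.mixer.music.get_busy():
        sleep(0.1)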