# Texas Tow Trucks (`.apply` and Selenium)

We're going to scrape some [tow trucks in Texas](https://www.tdlr.texas.gov/tools_search/).

Try searching for the TDLR Number `006179570C`.

## Preparation

### What URL will Selenium be starting on?

- Tip: The answer is *not* `https://www.tdlr.texas.gov/tools_search/`

In [None]:
# https://www.tdlr.texas.gov/tools_search/

### Why are you using Selenium for this?

## Scrape this page

Scrape this page, displaying:

- The business name
- Phone number
- License status
- Physical address

**You should know how to do `.post` requests by now.**

- *TIP: For physical address, **ask me on the board** and I'll give you a secret trick about situations like this.*

In [147]:
import selenium

In [148]:
from selenium import webdriver

In [149]:
driver = webdriver.Chrome()

In [150]:
driver.get("https://www.tdlr.texas.gov/tools_search/")

In [151]:
# "xpath" is the same string as By.XPATH; find_element_by_xpath was removed in Selenium 4.3
box = driver.find_element("xpath", '//*[@id="mcrdata"]')

In [152]:
box.send_keys("006179570C")

In [153]:
submit = driver.find_element("xpath", '//*[@id="submit3"]')
submit.click()

In [154]:
business_name = driver.find_element("xpath", '//*[@id="t1"]/tbody/tr/td/font/table[2]/tbody/tr[2]/td[1]')
business_name.text.replace("Name:", "").strip()

'B.D. SMITH TOWING'

In [155]:
phone_number = driver.find_element("xpath", '//*[@id="t1"]/tbody/tr/td/font/table[2]/tbody/tr[4]/td[1]')
phone_number.text.replace("Phone:", "").strip()

'8173330706'

In [156]:
status = driver.find_element("xpath", '//*[@id="t1"]/tbody/tr/td/font/table[3]/tbody/tr[1]/td[2]/font/font')
status.text

'Active'

In [157]:
physical = driver.find_element("xpath", '//*[@id="t1"]/tbody/tr/td/font/table[3]/tbody/tr[2]/td[2]')
physical.text

'Carrier Type:  Tow Truck Company\nNumber of Active Tow Trucks:   0\n\nAddress Information\nMailing:\n13619 BRETT JACKSON RD\nFORT WORTH, TX. 76179\n\nPhysical:\n13619 BRETT JACKSON RD.\nFORT WORTH, TX. 76179'

In [158]:
physical.text.split(":")[-1].strip()

'13619 BRETT JACKSON RD.\nFORT WORTH, TX. 76179'
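Splitting on every colon works here only because `Physical:` happens to be the last label in the cell. A minimal sketch of a slightly sturdier approach splits on the label itself. The helper name `extract_physical_address` is made up for illustration, and the sample string is the cell text shown above.

```python
def extract_physical_address(block_text):
    """Return the text after the 'Physical:' label, or '' if the label is missing."""
    before, label, after = block_text.partition("Physical:")
    return after.strip() if label else ""


# The text of the address cell, copied from the output above
sample = (
    "Carrier Type:  Tow Truck Company\n"
    "Number of Active Tow Trucks:   0\n\n"
    "Address Information\n"
    "Mailing:\n13619 BRETT JACKSON RD\nFORT WORTH, TX. 76179\n\n"
    "Physical:\n13619 BRETT JACKSON RD.\nFORT WORTH, TX. 76179"
)

physical_address = extract_physical_address(sample)
print(physical_address)
```

With Selenium, the same helper could be called as `extract_physical_address(physical.text)`.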

# Using .apply to find data about SEVERAL tow truck companies

The file `trucks-subset.csv` has information about the trucks. We'll use it to find the pages to scrape.

### Open up `trucks-subset.csv` and save it into a dataframe

In [159]:
import pandas as pd

In [160]:
df = pd.read_csv("trucks-subset.csv")
df.head()

,TDLR Number
0,006507931C
1,006179570C
2,006502097C


### Open up `trucks-subset.csv` in a text editor, then look at your dataframe. Is something different about them? If so, make them match.

- *TIP: I can help with this.*

## Use `.apply` to go through each row of the dataset, printing out information about each tow truck company.

- The business name
- Phone number
- License status
- Physical address

Just print it out for now.

- *TIP: use .apply and a function*
- *TIP: If you need help with .apply, look at the "Using apply in pandas" notebook*

In [161]:
def process_truck(row):
    # Start every lookup from a fresh search page
    driver.get("https://www.tdlr.texas.gov/tools_search/")

    # Type this row's TDLR number into the search box and submit it
    # ("xpath" is the same string as By.XPATH)
    box = driver.find_element("xpath", '//*[@id="mcrdata"]')
    box.send_keys(row['TDLR Number'])

    driver.find_element("xpath", '//*[@id="submit3"]').click()

    # Each cell's text includes its label, so remove the label before printing
    business_name = driver.find_element("xpath", '//*[@id="t1"]/tbody/tr/td/font/table[2]/tbody/tr[2]/td[1]')
    print("Name", business_name.text.replace("Name:", "").strip())

    phone_number = driver.find_element("xpath", '//*[@id="t1"]/tbody/tr/td/font/table[2]/tbody/tr[4]/td[1]')
    print("Phone", phone_number.text.replace("Phone:", "").strip())

    status = driver.find_element("xpath", '//*[@id="t1"]/tbody/tr/td/font/table[3]/tbody/tr[1]/td[2]/font')
    print("Status", status.text)

    # The address cell holds several labelled fields; the physical address follows the last colon
    physical = driver.find_element("xpath", '//*[@id="t1"]/tbody/tr/td/font/table[3]/tbody/tr[2]/td[2]')
    print("Address", physical.text.split(":")[-1].strip())

    print("-----")

# axis=1 passes one row at a time to process_truck
df.apply(process_truck, axis=1)

Name AUGUSTUS E SMITH
Phone 9032276464
Status Active
Address 103 N MAIN ST
BONHAM, TX. 75418
-----
Name B.D. SMITH TOWING
Phone 8173330706
Status Active
Address 13619 BRETT JACKSON RD.
FORT WORTH, TX. 76179
-----
Name BARRY MICHAEL SMITH
Phone 8066544404
Status Active
Address 4501 W CEMETERY RD
CANYON, TX. 79015
-----


0    None
1    None
2    None
dtype: object

In [162]:
df

,TDLR Number
0,006507931C
1,006179570C
2,006502097C


## Scrape the following information for each row of the dataset, and save it into new columns in your dataframe.

- The business name
- Phone number
- License status
- Physical address

It's basically what we did before, but using the function a little differently.

- *TIP: Use .apply and a function*
- *TIP: Remember to use `return`*
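The return-a-Series pattern can be tried offline before wiring it into Selenium. In the sketch below, `FAKE_RESULTS`, `lookup`, and the ID `000000000X` are made up for illustration and stand in for the real scraping step.

```python
import pandas as pd

# Hypothetical stand-in for the Selenium lookup so the example runs without a browser
FAKE_RESULTS = {
    "006179570C": {"name": "B.D. SMITH TOWING", "status": "Active"},
}


def lookup(row):
    # Returning a Series makes .apply build one new column per key
    result = FAKE_RESULTS.get(row["TDLR Number"], {})
    return pd.Series({
        "name": result.get("name"),
        "status": result.get("status"),
    })


# "000000000X" is an invented ID with no matching result
df_demo = pd.DataFrame({"TDLR Number": ["006179570C", "000000000X"]})

scraped = df_demo.apply(lookup, axis=1)  # one row in, one Series out
combined = df_demo.join(scraped)         # attach the new columns by index
print(combined)
```

A function that only prints returns `None`, so `.apply` gives back a column of `None` values. Returning a Series is what turns each row's results into columns.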

In [164]:
def process_truck(row):
    # Start every lookup from a fresh search page
    driver.get("https://www.tdlr.texas.gov/tools_search/")

    # Type this row's TDLR number into the search box and submit it
    # ("xpath" is the same string as By.XPATH)
    box = driver.find_element("xpath", '//*[@id="mcrdata"]')
    box.send_keys(row['TDLR Number'])

    driver.find_element("xpath", '//*[@id="submit3"]').click()

    business_name = driver.find_element("xpath", '//*[@id="t1"]/tbody/tr/td/font/table[2]/tbody/tr[2]/td[1]')

    phone_number = driver.find_element("xpath", '//*[@id="t1"]/tbody/tr/td/font/table[2]/tbody/tr[4]/td[1]')

    status = driver.find_element("xpath", '//*[@id="t1"]/tbody/tr/td/font/table[3]/tbody/tr[1]/td[2]/font')

    physical = driver.find_element("xpath", '//*[@id="t1"]/tbody/tr/td/font/table[3]/tbody/tr[2]/td[2]')

    # Each key of the returned Series becomes a column in the result of .apply
    return pd.Series({
        'name': business_name.text.replace("Name:", "").strip(),
        'phone': phone_number.text.replace("Phone:", "").strip(),
        'status': status.text,
        'address': physical.text.split(":")[-1].strip()
    })

# Join the scraped columns back onto the original dataframe by index
complete_df = df.apply(process_truck, axis=1).join(df)
complete_df

,address,name,phone,status,TDLR Number
0,"103 N MAIN ST\nBONHAM, TX. 75418",AUGUSTUS E SMITH,9032276464,Active,006507931C
1,"13619 BRETT JACKSON RD.\nFORT WORTH, TX. 76179",B.D. SMITH TOWING,8173330706,Active,006179570C
2,"4501 W CEMETERY RD\nCANYON, TX. 79015",BARRY MICHAEL SMITH,8066544404,Active,006502097C


In [165]:
df

,TDLR Number
0,006507931C
1,006179570C
2,006502097C


### Save your dataframe as a CSV

In [166]:
complete_df.to_csv("complete-trucks.csv", index=False)

### Re-open your dataframe to confirm you didn't save any extra weird columns

In [167]:
pd.read_csv("complete-trucks.csv")

,address,name,phone,status,TDLR Number
0,"103 N MAIN ST\nBONHAM, TX. 75418",AUGUSTUS E SMITH,9032276464,Active,006507931C
1,"13619 BRETT JACKSON RD.\nFORT WORTH, TX. 76179",B.D. SMITH TOWING,8173330706,Active,006179570C
2,"4501 W CEMETERY RD\nCANYON, TX. 79015",BARRY MICHAEL SMITH,8066544404,Active,006502097C


## Repeat this process for the entire `tow-trucks.csv` file

In [168]:
df = pd.read_csv("tow-trucks.csv")
df

,TDLR Number
0,006507931C
1,006179570C
2,006502097C
3,006494912C
4,0649468VSF
5,006448786C
6,0648444VSF
7,0651667VSF
8,006017767C
9,006495492C
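Over a longer list, one failed lookup raises an exception and stops the whole `.apply` run. The sketch below shows one way to keep going by returning empty fields and recording the error. `scrape_safely` and `fake_scrape` are hypothetical names, and the idea that IDs ending in `VSF` lead to a differently laid-out page is only an assumption for the demo, not something confirmed here.

```python
def scrape_safely(scrape_one, tdlr_number):
    """Call scrape_one(tdlr_number); on any error, return empty fields plus the message."""
    try:
        result = dict(scrape_one(tdlr_number))
        result["error"] = None
    except Exception as error:  # a Selenium scraper might catch NoSuchElementException instead
        result = {"name": None, "phone": None, "status": None,
                  "address": None, "error": str(error)}
    return result


def fake_scrape(tdlr_number):
    # Hypothetical stand-in for the Selenium lookup: pretends 'VSF' pages cannot be parsed
    if tdlr_number.endswith("VSF"):
        raise LookupError("element not found for " + tdlr_number)
    return {"name": "EXAMPLE TOWING", "phone": "0000000000",
            "status": "Active", "address": "1 EXAMPLE ST"}


results = [scrape_safely(fake_scrape, number)
           for number in ["006507931C", "0649468VSF"]]
for result in results:
    print(result)
```

Inside `process_truck`, the same idea would mean wrapping the element lookups in `try`/`except` and returning a `pd.Series` with empty values when a lookup fails.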
