# Scraping Maryland Business Licenses with Selenium

Maryland has a [great portal](https://jportal.mdcourts.gov/license/pbPublicSearch.jsp) for searching business licenses. The only problem is that you have to check a box before it lets you in. Done by hand, a search looks like this:

1. Try to visit [the public search page](https://jportal.mdcourts.gov/license/pbPublicSearch.jsp)
2. Get redirected to an "I agree to this" page. Check the box saying you've read the disclaimer, then click "Enter the Site".
3. Click "Search License Records" down at the bottom of the page
4. You're now on the search page! From the "Jurisdiction" dropdown, select "Statewide"
5. In the "Trade Name" field, type "Vap%" to try to find vape shops
6. Click "Next" in the bottom right-hand corner to go to the next page
7. Click "Click for detail" to see the details for a specific business license.

That's a lot of stuff! **Let's get to work.**

## Import what you need

In [5]:
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import Select
from selenium.webdriver.support.ui import WebDriverWait

import pandas as pd
import time
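
If you're on a newer Selenium (4.3 or later), the `find_element_by_*` helpers used throughout this notebook are gone, and you call `find_element(by, value)` instead. Here's a minimal sketch of that mapping. The helper names `find_by_xpath` and `find_all_by_class` are made up for illustration, and the sketch passes the plain locator strings so it needs no imports; in real code you'd normally pass `By.XPATH` and `By.CLASS_NAME` from `selenium.webdriver.common.by`.

```python
# Locator strategies are plain strings under the hood:
# By.XPATH == "xpath" and By.CLASS_NAME == "class name".
XPATH = "xpath"
CLASS_NAME = "class name"


def find_by_xpath(driver, xpath):
    """Hypothetical helper: same job as driver.find_element_by_xpath(xpath)."""
    return driver.find_element(XPATH, xpath)


def find_all_by_class(driver, class_name):
    """Hypothetical helper: same job as driver.find_elements_by_class_name(class_name)."""
    return driver.find_elements(CLASS_NAME, class_name)
```

With a real `driver`, `find_by_xpath(driver, '//*[@id="checkbox"]').click()` would do the same thing as the checkbox cell further down.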

## Preparation

### When you search for a business license, what URL should Selenium try to visit first?

In [6]:
driver = webdriver.Chrome()

driver.get("https://jportal.mdcourts.gov/license/pbPublicSearch.jsp?slcJurisdiction=50&txtTradeName=Vap%25&txtOwnerName=&txtLocationStreetName=&slcYear=2018&slcSortBy=ownername")
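
That long URL is just the search form written out as a query string. As a rough sketch, here is how you could build it from a dictionary instead of typing it by hand. The parameter names and values are copied from the URL above; `BASE_URL`, `params` and `search_url` are names I made up. Notice that the `%` wildcard has to become `%25`, and `urlencode` takes care of that.

```python
from urllib.parse import urlencode

BASE_URL = "https://jportal.mdcourts.gov/license/pbPublicSearch.jsp"

# Same fields, in the same order, as the URL in the cell above
params = {
    "slcJurisdiction": "50",      # jurisdiction code taken from the URL above
    "txtTradeName": "Vap%",       # % is the wildcard; it gets encoded as %25
    "txtOwnerName": "",
    "txtLocationStreetName": "",
    "slcYear": "2018",
    "slcSortBy": "ownername",
}

search_url = BASE_URL + "?" + urlencode(params)
print(search_url)
```

Changing the trade name or the year then only means editing one value in the dictionary.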

**It isn't going to work, though! It's going to redirect to that intro page.** You can use *Incognito mode* to go back through the "Check the box, etc" series of pages, or you can close and re-open Chrome.

- Check the checkbox, then submit the form to accept their terms of service

Selenium can submit forms by either

- Selecting the form and using `.submit()`, or
- Selecting the button and using `.click()`

You only need **one of these to work, not both.**

- *TIP: if something doesn't have anything special about it, xpath might be your best bet*

In [7]:
check_box = driver.find_element_by_xpath('//*[@id="checkbox"]')
check_box.click()

In [8]:
enter = driver.find_element_by_xpath('/html/body/table/tbody/tr[7]/td/form/div/input[2]')
enter.click()

Now click the **Search License Records** link up top in the navigation to get to the search page.

- *TIP: Honestly you could also just visit the URL directly now since you filled out that terms of service thing*

In [9]:
search_license_rec = driver.find_element_by_xpath('/html/body/table[2]/tbody/tr[6]/td[2]/a[2]')
search_license_rec.click()

### Perform your search

Pick "Statewide" in the jurisdiction dropdown, and type `VAP%` into the Trade Name field. The `%` is a wildcard.

In [10]:
dropdown = Select(driver.find_element_by_xpath('//*[@id="slcJurisdiction"]'))
dropdown.select_by_visible_text('Statewide')

In [11]:
driver.find_element_by_xpath('//*[@id="txtTradeName"]').click()

In [12]:
text_input = driver.find_element_by_xpath('//*[@id="txtTradeName"]')

In [13]:
text_input.send_keys('VAP%')

And now, of course, **submit the form**.

- *TIP: Since scrolling to buttons can be a pain, sometimes it's easier to select the form and use `.submit()` instead of `.click()`ing the button*

In [14]:
submit_button = driver.find_element_by_xpath('/html/body/table[2]/tbody/tr[4]/td[2]/form/table/tbody/tr[14]/td/input[1]')
submit_button.click()

## (Try to) scrape the results

Let's start by just **printing this stuff**. We'll save it as a dataframe later on.

For now, just scrape **each store's name**, then cry a little. Fact: this is an impossible and miserable page. 

In [15]:
#/html/body/table[2]/tbody/tr[4]/td[2]/table[1]/tbody/tr[3]/td[2]

In [16]:
#store_name = driver.find_element_by_class_name('searchlistitem')
#print(store_name)
store_names = driver.find_elements_by_class_name('searchlistitem')

for name in store_names:
    print(name.text)

VAPE IT STORE II
VAPE IT STORE I
VAPEPAD THE
VAPE FROG
VAPE FROG


To avoid struggling with the search results page, we're going to use the **detail page** instead. Try to figure out how to select it and click it inside of your `for` loop.

- *TIP: Instead of just looking for an `a` or an `img`, you might want to look for one of its parents first, then click. This might affect the way you print out the shop's name, too*
- *TIP: Not all of them have links! You can wrap in try/except to skip it, or you can check to see if the shop's status is Pending.*

In [17]:
detail_button = driver.find_element_by_xpath('/html/body/table[2]/tbody/tr[4]/td[2]/table[1]/tbody/tr[3]/td[3]/a')
detail_button.click()

Okay, now let's get to action. For each result, **click the link to the detail page** and print out the following information:

- Mailing address
- Location address
- License information (you can keep it as one field)
- Total amount paid
- Issued by
- If you're feeling crazy, get the licenses, too.

If it doesn't have a detail page, just print out the name and that's all we need.

- *TIP: When you're done getting the information, you probably want to click back to the search results*
- *TIP: You might enjoy `find_element_by_partial_link_text` to do that*
- *TIP: Licenses can be acquired by doing some really odd list slicing - think about where it starts and where it ends, relative to the beginning and end of everything.*

> **IMPORTANT NOTE:** This is doomed. It's useful to do, but your current process is doomed. Once you get a `stale element reference` error move on to the next cell.

In [18]:
#/html/body/table[2]/tbody/tr[4]/td[2]/table[1]/tbody/tr[3]/td[3]/a
#/html/body/table[2]/tbody/tr[4]/td[2]/table[1]/tbody/tr[8]/td[3]/a

#pages = driver.find_element_by_xpath('/html/body/table[2]/tbody/tr[4]/td[2]/table[1]')
#print(pages.text)

#for page in pages:
store_links = driver.find_elements_by_class_name('searchfieldtitle')

for link in store_links:
    detail_button = link.find_element_by_tag_name('a')
    detail_button.click()
    
    info = driver.find_elements_by_class_name('tablecelltext')
    mail_address = info[0].find_elements_by_tag_name('td')[0]
    location_address = info[1].find_elements_by_tag_name('td')[0]
    total_paid = info[7].find_elements_by_tag_name('td')[1]
    issued_by = info[8].find_elements_by_tag_name('td')[0]
    license_one = info[2].find_elements_by_tag_name('td')[2]
    license_two = info[3].find_elements_by_tag_name('td')[2]
    license_third = info[4].find_elements_by_tag_name('td')[2]
    
    print(mail_address.text)
    print(location_address.text)
    print(total_paid.text)
    print(issued_by.text)
    print(license_one.text)
    print(license_two.text)
    print(license_third.text)
    
    click_back = driver.find_element_by_xpath('/html/body/table[2]/tbody/tr[4]/td[2]/table[4]/tbody/tr/td[1]')
    click_back.click()
    
  

   #If youre feeling crazy, get the licenses, too.

#detail_button=driver.find_element_by_xpath('/html/body/table[2]/tbody/tr[4]/td[2]/table[1]/tbody/tr[3]/td[3]/a')
#detail_button = driver.find_element_by_partial_link_text('click for detail')
#detail_button.click() 
    
   
    
  #  print(mail.text)
    
    #Mailing address
#Location address
#License information (you can keep it as one field)
#Total amount paid
#Issued by
#detail_buttons.click()

AMIN NARGIS
1104 PLANTERS PLACE
SALISBURY, MD 21804
VAPE IT STORE II
1015 S SALISBURY BLVD
SALISBURY, MD 21801
$ 41.00
WICOMICO COUNTY, CLERK OF CIRCUIT COURT
P.O. BOX 198
SALISBURY, MARYLAND 21803-0198
OTHER TOBACCO PROD RETLR
CHAIN STORE
TRADER'S LICENSE


StaleElementReferenceException: Message: stale element reference: element is not attached to the page document
  (Session info: chrome=67.0.3396.87)
  (Driver info: chromedriver=2.40.565498 (ea082db3280dd6843ebfb08a625e3eb905c4f5ab),platform=Windows NT 10.0.17134 x86_64)


### Stale element reference

Once you navigate away from a page, and you go back to it, you can't use the variables from the first time you were on the page. So, we got a list of results when we first visited, clicked to the details page, clicked back, and now our original list is "stale."

This is sad.

Let's try this again: loop through the results and create a dataframe with `name` and `url` columns. And yes, some of them won't have URLs.
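
The trick is to copy plain strings (the name and the link's `href`) out of each row while the results page is still loaded, because strings can't go stale. Here's a minimal sketch of that idea. The function name `snapshot_rows` is made up, and it uses Selenium's generic `find_element(by, value)` / `find_elements(by, value)` calls with the plain locator strings `"class name"` and `"tag name"` (the values behind `By.CLASS_NAME` and `By.TAG_NAME`). It also leans on the fact that `find_elements` returns an empty list when nothing matches, so a row without a link doesn't need a `try`/`except`.

```python
def snapshot_rows(row_elements):
    """Copy the name and detail URL out of each result row as plain Python values."""
    snapshots = []
    for row in row_elements:
        name = row.find_element("class name", "searchlistitem").text
        # find_elements (plural) gives back [] instead of raising when there is no link
        links = row.find_elements("tag name", "a")
        url = links[0].get_attribute("href") if links else None
        snapshots.append({"name": name, "url": url})
    return snapshots
```

You'd call it as `snapshot_rows(driver.find_elements("class name", "searchfieldtitle"))`. Using `None` for a missing URL is my choice here, not a requirement; a placeholder string works too as long as you remember to skip it later.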

In [19]:
#store_names = driver.find_elements_by_class_name('searchlistitem')
stores = []

store_urls = driver.find_elements_by_class_name('searchfieldtitle')

for info in store_urls:
    name = info.find_element_by_class_name('searchlistitem')
    
    print(name.text)
    try:
        url = info.find_element_by_tag_name('a')
        print(url.get_attribute('href'))
    
    except: 
        print("There is not a url.")

VAPE IT STORE II
https://jportal.mdcourts.gov/license/pbLicenseDetail.jsp?owi=bmDExzf4ZDw%3D
VAPE IT STORE I
https://jportal.mdcourts.gov/license/pbLicenseDetail.jsp?owi=glqAu7o7gJE%3D
VAPEPAD THE
https://jportal.mdcourts.gov/license/pbLicenseDetail.jsp?owi=zjsVR%2F6x1p8%3D
VAPE FROG
There is not a url.
VAPE FROG
There is not a url.


In [20]:
#store_names = driver.find_elements_by_class_name('searchlistitem')
stores = []

store_urls = driver.find_elements_by_class_name('searchfieldtitle')

for info in store_urls:
        
    print('-----------')
    
    dictio = {}
    
    name = info.find_element_by_class_name('searchlistitem')
    dictio['name'] = name.text
    
    #print(name.text)
    
    try:
        url = info.find_element_by_tag_name('a').get_attribute('href')
        #print(url.get_attribute('href'))
        dictio['url'] = url
    
    except: 
        dictio['url'] = "There is not a url."
    
    print(dictio)
    stores.append(dictio)
        

-----------
{'name': 'VAPE IT STORE II', 'url': 'https://jportal.mdcourts.gov/license/pbLicenseDetail.jsp?owi=bmDExzf4ZDw%3D'}
-----------
{'name': 'VAPE IT STORE I', 'url': 'https://jportal.mdcourts.gov/license/pbLicenseDetail.jsp?owi=glqAu7o7gJE%3D'}
-----------
{'name': 'VAPEPAD THE', 'url': 'https://jportal.mdcourts.gov/license/pbLicenseDetail.jsp?owi=zjsVR%2F6x1p8%3D'}
-----------
{'name': 'VAPE FROG', 'url': 'There is not a url.'}
-----------
{'name': 'VAPE FROG', 'url': 'There is not a url.'}


In [21]:
stores_df = pd.DataFrame(stores)
stores_df

|  | name | url |
|---|---|---|
| 0 | VAPE IT STORE II | https://jportal.mdcourts.gov/license/pbLicense... |
| 1 | VAPE IT STORE I | https://jportal.mdcourts.gov/license/pbLicense... |
| 2 | VAPEPAD THE | https://jportal.mdcourts.gov/license/pbLicense... |
| 3 | VAPE FROG | There is not a url. |
| 4 | VAPE FROG | There is not a url. |


### Getting all of the results

After you've looped through the results on one page, you're going to want to go to the next page! Add a line that clicks the 'Next' button down at the bottom.

- *Tip: `find_element_by_partial_link_text` will be your friend*
- *Tip: You might need to do the scrolling thing to get it onto the screen (and by that I mean, you WILL need to, so you should)*

Confirm that it moves to the next page (it doesn't need to scrape anything yet)

In [22]:
#store_urls = driver.find_elements_by_class_name('searchfieldtitle')
#store_urls[0].text


In [23]:
stores = []

store_urls = driver.find_elements_by_class_name('searchfieldtitle')

for info in store_urls:
        
    #print('-----------')
    
    dictio = {}
    
    name = info.find_element_by_class_name('searchlistitem')
    dictio['name'] = name.text
    
    #print(name.text)
    
    try:
        url = info.find_element_by_tag_name('a').get_attribute('href')
        #print(url.get_attribute('href'))
        dictio['url'] = url
    
    except: 
        dictio['url'] = "There is not a url."
    
    #print(dictio)
    stores.append(dictio)
    
next_button = driver.find_element_by_partial_link_text('Next')
#print(next_button)
next_button.click()

print('Yes, I moved to the next page!')

Yes, I moved to the next page!


### Wrapping with `while`

> Go back to the first page of results before you try to run this

You have a bunch of scraping code. It clicks the next button, then it stops. But you'd like it to go back up to the top! You can make that happen with a special `while` loop.

```python
while True:
    # Scrape your stuff
    # Click next button
```

This will go on FOREVER AND EVER until there is an error (when it can't find the Next button on the last page of results, you'll get an error).

- *Tip: Print out "Scraping a new page" every time you visit a new page, just to check that it's working*

In [24]:
stores = []

while True:

    store_urls = driver.find_elements_by_class_name('searchfieldtitle')


    for info in store_urls:

        #print('-----------')

        dictio = {}

        name = info.find_element_by_class_name('searchlistitem')
        dictio['name'] = name.text

        #print(name.text)

        try:
            url = info.find_element_by_tag_name('a').get_attribute('href')
            #print(url.get_attribute('href'))
            dictio['url'] = url

        except: 
            dictio['url'] = "There is not a url."

        # Append inside the for loop so every store is saved, not just the last one on the page
        stores.append(dictio)

    print('-------Scraping a new page--------')

    next_button = driver.find_element_by_partial_link_text('Next')
    #print(next_button)
    next_button.click()
    
#print('There is no other page.')

-------Scraping a new page--------
-------Scraping a new page--------
-------Scraping a new page--------
-------Scraping a new page--------
-------Scraping a new page--------
-------Scraping a new page--------
-------Scraping a new page--------


NoSuchElementException: Message: no such element: Unable to locate element: {"method":"partial link text","selector":"Next"}
  (Session info: chrome=67.0.3396.87)
  (Driver info: chromedriver=2.40.565498 (ea082db3280dd6843ebfb08a625e3eb905c4f5ab),platform=Windows NT 10.0.17134 x86_64)


### Making it perfect

> Go back to the first page of results before you try to run this

Wrap all of your code in a `try`/`except` so that it doesn't finish with an error and you'll be good to go.

**Confirm your list has all of the vape shops in it.** If not, check where you are creating your empty list (`[]`) - if you do it in the wrong spot, it will overwrite your list every time you visit a page.
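
One thing to watch: a bare `except` swallows every error, including typos in your own code, not only the missing Next button. Here's a minimal sketch of the same loop that stops only on the error you expect. It runs against a fake two-page result set so you can try it without a browser. `NoNextPage`, `scrape_all_pages`, `scrape_page` and `click_next` are all made-up names; with Selenium the exception to catch would be `NoSuchElementException` from `selenium.common.exceptions`.

```python
class NoNextPage(Exception):
    """Hypothetical stand-in for selenium.common.exceptions.NoSuchElementException."""


def scrape_all_pages(scrape_page, click_next):
    """Collect rows from every page until click_next() says there is no next page."""
    results = []  # created once, before the loop, so pages don't overwrite each other
    while True:
        print("Scraping a new page")
        results.extend(scrape_page())
        try:
            click_next()
        except NoNextPage:
            # The expected way out: the last page has no Next link
            break
    return results


# Fake two-page result set (made-up layout) so the loop can run without a browser
pages = [["VAPE FROG", "VAPE DOJO"], ["VAPE HAVEN"]]
current = {"index": 0}


def scrape_page():
    return list(pages[current["index"]])


def click_next():
    if current["index"] + 1 >= len(pages):
        raise NoNextPage("no Next link on the last page")
    current["index"] += 1


all_names = scrape_all_pages(scrape_page, click_next)
print(all_names)
```

Any other exception still stops the loop with a traceback, which is what you want while you're debugging.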

In [25]:
stores = []

try:
    
    while True:

        store_urls = driver.find_elements_by_class_name('searchfieldtitle')


        for info in store_urls:

            #print('-----------')

            dictio = {}

            name = info.find_element_by_class_name('searchlistitem')
            dictio['name'] = name.text

            #print(name.text)

            try:
                url = info.find_element_by_tag_name('a').get_attribute('href')
                #print(url.get_attribute('href'))
                dictio['url'] = url

            except: 
                dictio['url'] = "There is not a url."

            #print(dictio)
            print('-------Scraping data--------')
            stores.append(dictio)
        #print(stores)
            

        next_button = driver.find_element_by_partial_link_text('Next')
        #print(next_button)
        next_button.click()
        
except:
    print("There is no other page.")

-------Scraping data--------
-------Scraping data--------
-------Scraping data--------
-------Scraping data--------
-------Scraping data--------
-------Scraping data--------
-------Scraping data--------
-------Scraping data--------
-------Scraping data--------
-------Scraping data--------
-------Scraping data--------
-------Scraping data--------
-------Scraping data--------
-------Scraping data--------
-------Scraping data--------
-------Scraping data--------
-------Scraping data--------
-------Scraping data--------
-------Scraping data--------
-------Scraping data--------
-------Scraping data--------
-------Scraping data--------
-------Scraping data--------
-------Scraping data--------
-------Scraping data--------
-------Scraping data--------
-------Scraping data--------
-------Scraping data--------
-------Scraping data--------
-------Scraping data--------
-------Scraping data--------
-------Scraping data--------
-------Scraping data--------
-------Scraping data--------
There is no other page.

In [26]:
len(stores)

34

### Save this data as a csv

The filename should be `vape-shops-basic.csv`.

In [27]:
stores_df = pd.DataFrame(stores)
stores_df

|  | name | url |
|---|---|---|
| 0 | VAPE IT STORE II | https://jportal.mdcourts.gov/license/pbLicense... |
| 1 | VAPE IT STORE I | https://jportal.mdcourts.gov/license/pbLicense... |
| 2 | VAPEPAD THE | https://jportal.mdcourts.gov/license/pbLicense... |
| 3 | VAPE FROG | There is not a url. |
| 4 | VAPE FROG | There is not a url. |
| 5 | VAPE LOFT THE | https://jportal.mdcourts.gov/license/pbLicense... |
| 6 | VAPE N CIGAR | https://jportal.mdcourts.gov/license/pbLicense... |
| 7 | VAPE DOJO | https://jportal.mdcourts.gov/license/pbLicense... |
| 8 | VAPE HAVEN | There is not a url. |
| 9 | VAPORS LOUNGE | There is not a url. |


In [44]:
stores_df.to_csv("vape-shops-basic.csv", index=False)
stores_df = pd.read_csv('vape-shops-basic.csv')
stores_df.head()

|  | name | url |
|---|---|---|
| 0 | VAPE IT STORE II | https://jportal.mdcourts.gov/license/pbLicenseDetail.jsp?owi=bmDExzf4ZDw%3D |
| 1 | VAPE IT STORE I | https://jportal.mdcourts.gov/license/pbLicenseDetail.jsp?owi=glqAu7o7gJE%3D |
| 2 | VAPEPAD THE | https://jportal.mdcourts.gov/license/pbLicenseDetail.jsp?owi=zjsVR%2F6x1p8%3D |
| 3 | VAPE FROG | There is not a url. |
| 4 | VAPE FROG | There is not a url. |


# Okay, let's scrape!

All right, get the actual data!

### Look at the URL of your first row

- *TIP: Remember `pd.set_option('display.max_colwidth', None)` will let you see alllll of your strings (older pandas versions used `-1` instead of `None`)*

In [41]:
pd.set_option('display.max_colwidth', None)

In [42]:
stores_df['url'].head(1)

0    https://jportal.mdcourts.gov/license/pbLicenseDetail.jsp?owi=bmDExzf4ZDw%3D
Name: url, dtype: object

In [34]:
#stores_df['url'].tail(1)

### Use Selenium to visit that page

In [37]:
#driver.get('https://jportal.mdcourts.gov/license/pbLicenseDetail.jsp?owi=74s9PXdmgq0%3D')

In [39]:
#driver.get('https://jportal.mdcourts.gov/license/pbLicenseDetail.jsp?owi=plY4eVvXtD8%3D')

In [45]:
driver.get(stores_df.url[0])

### Now, just like you did before, grab the additional data

You should probably save it into a dictionary! Don't try to put it into the dataframe yet, though. You want:

- Mailing address
- Location address
- License information (you can keep it as one field)
- Total amount paid
- Issued by
- If you're feeling crazy, get the licenses, too.

.

- *TIP: Licenses can be acquired by doing some really odd list slicing - think about where it starts and where it ends, relative to the beginning and end of everything.*
- *TIP: If you've gotten addicted to xpath, total amount paid and issued by might not work with it when doing other shops. You'll want to test it!*
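
To make the "odd list slicing" tip concrete, here's a minimal sketch on plain strings instead of Selenium elements. It assumes a layout I can't confirm for every shop: the first two rows are the addresses, the last four rows are non-license rows, and the total and the issuer are the second-to-last and last rows. The function name `split_detail_rows`, the `(unused row)` placeholders and the second example shop are made up; check the real page before trusting the numbers.

```python
def split_detail_rows(rows):
    """Split detail-page rows into named fields.

    Assumed (hypothetical) layout: rows[0] is the mailing address,
    rows[1] the location address, then one row per license, then four
    trailing rows where rows[-2] is the total paid and rows[-1] the issuer.
    """
    return {
        "mail": rows[0],
        "location": rows[1],
        "licenses": rows[2:-4],  # everything between the addresses and the trailing rows
        "total": rows[-2],       # counted from the end, so extra licenses don't shift it
        "issued": rows[-1],
    }


# Three licenses, using values from the printed output above
three_licenses = [
    "AMIN NARGIS", "VAPE IT STORE II",
    "OTHER TOBACCO PROD RETLR", "CHAIN STORE", "TRADER'S LICENSE",
    "(unused row)", "(unused row)",
    "$ 41.00", "WICOMICO COUNTY, CLERK OF CIRCUIT COURT",
]

# A made-up shop with only one license
one_license = [
    "OWNER NAME", "SHOP NAME",
    "TRADER'S LICENSE",
    "(unused row)", "(unused row)",
    "$ 0.00", "SOME CLERK OF CIRCUIT COURT",
]

for rows in (three_licenses, one_license):
    fields = split_detail_rows(rows)
    print(fields["licenses"], fields["total"], fields["issued"])
```

Counting from the end is what keeps the total and the issuer lined up when a shop has more or fewer licenses, which is where fixed positions like `info[7]` can go wrong.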

In [46]:
#store_links = driver.find_elements_by_class_name('searchfieldtitle')
info = driver.find_elements_by_class_name('tablecelltext')

mail = info[0].find_elements_by_tag_name('td')[0]
location = info[1].find_elements_by_tag_name('td')[0]
total = info[7].find_elements_by_tag_name('td')[1]
issued = info[8].find_elements_by_tag_name('td')[0]
license = info[2].find_elements_by_tag_name('td')[2]
license_second = info[3].find_elements_by_tag_name('td')[2]
license_more = info[4].find_elements_by_tag_name('td')[2]

print(mail.text)
print(location.text)
print(total.text)
print(issued.text)
print(license.text)
print(license_second.text)
print(license_more.text)
    

AMIN NARGIS
1104 PLANTERS PLACE
SALISBURY, MD 21804
VAPE IT STORE II
1015 S SALISBURY BLVD
SALISBURY, MD 21801
$ 41.00
WICOMICO COUNTY, CLERK OF CIRCUIT COURT
P.O. BOX 198
SALISBURY, MARYLAND 21803-0198
OTHER TOBACCO PROD RETLR
CHAIN STORE
TRADER'S LICENSE


### Move all of this into one cell

It should visit the URL, then grab the data and put it into a dictionary.

In [52]:
#driver.get('https://jportal.mdcourts.gov/license/pbLicenseDetail.jsp?owi=plY4eVvXtD8%3D')

info = driver.find_elements_by_class_name('tablecelltext')

print('-----------')
    
dictio = {}

mail = info[0].find_elements_by_tag_name('td')[0]
dictio['mail'] = mail.text

location = info[1].find_elements_by_tag_name('td')[0]
dictio['location'] = location.text 

total = info[7].find_elements_by_tag_name('td')[1]
dictio['total'] = total.text 

issued = info[8].find_elements_by_tag_name('td')[0]
dictio['issued'] = issued.text 

license = info[2].find_elements_by_tag_name('td')[2]
dictio['license'] = license.text 

license_second = info[3].find_elements_by_tag_name('td')[2]
dictio['license_second'] = license_second.text 

license_more = info[4].find_elements_by_tag_name('td')[2]
dictio['license_more'] = license_more.text 

#mail = info[0].find_elements_by_tag_name('td')[0]
#location = info[1].find_elements_by_tag_name('td')[0]
#total = info[7].find_elements_by_tag_name('td')[1]
#issued = info[8].find_elements_by_tag_name('td')[0]
#license = info[2].find_elements_by_tag_name('td')[2]
#license_second = info[3].find_elements_by_tag_name('td')[2]
#license_more = info[4].find_elements_by_tag_name('td')[2]

print(dictio)

-----------
{'mail': 'YEAGER ONE INC\n545 HIGGINS DR\nODENTON, MD 21113', 'location': 'VAPERS RING\n545 BENFIELD RD\nSEVERNA PARK, MD 21146', 'total': '$ 116.00', 'issued': 'ROBERT P. DUCKWORTH, CLERK OF CIRCUIT COURT\n8 CHURCH CIRCLE, ROOM H-101\nANNAPOLIS, MARYLAND 21401', 'license': "TRADER'S LICENSE", 'license_second': 'CHAIN STORE', 'license_more': 'VAPE SHOP VENDOR'}


### Change it into a function

You'll want to have this function accept a `row`, and send back a `pd.Series`. You can just use `pd.Series(your_dictionary)` (but it better have a better name than `your_dictionary`!).

- *TIP: Make sure you `return` something!*
- *TIP: Make sure you change everything to reflect the row's url, not the URL you typed in*

In [72]:
def get_vapestores(row): 
    try:
        driver.get(row['url'])
        
        # Re-find the rows on the page we just loaded; elements from an earlier page are stale
        info = driver.find_elements_by_class_name('tablecelltext')

        mail = info[0].find_elements_by_tag_name('td')[0].text
        location = info[1].find_elements_by_tag_name('td')[0].text
        total = info[7].find_elements_by_tag_name('td')[1].text
        issued = info[8].find_elements_by_tag_name('td')[0].text
        license = info[2].find_elements_by_tag_name('td')[2].text
        #license_second = info[3].find_elements_by_tag_name('td')[2].text
        #license_more = info[4].find_elements_by_tag_name('td')[2].text

        return pd.Series({
                'mail': mail,
                'location': location,
                'total': total,
                'issued': issued,
                'license': license,
                #'license_second': license_second,
                #'license_more': license_more     
            })
        
    except:
        print('There is no url.')

In [81]:
#final_test_df = stores_df.apply(get_vapestores, axis=1).join(stores_df)
#final_test_df
##############################################################################################################

#USING def get_vapestores(row) & tag_names, results seem to be messed up. 
#That is why I am going with xpaths into a new function, including only license_status and one type of license 
#(the first one each time). Please see the following cells for my solution. 

In [82]:
def get_vapeshops(row): 
    try:
        driver.get(row['url'])
        
        #info = driver.find_elements_by_class_name('tablecelltext')

        mail_address = driver.find_element_by_xpath('/html/body/table[2]/tbody/tr[4]/td[2]/table[1]/tbody/tr[3]/td[1]').text
        location_address = driver.find_element_by_xpath('/html/body/table[2]/tbody/tr[4]/td[2]/table[1]/tbody/tr[5]/td').text
        total_cost = driver.find_element_by_xpath('/html/body/table[2]/tbody/tr[4]/td[2]/table[2]/tbody/tr[7]/td[2]').text
        issued_info = driver.find_element_by_xpath('/html/body/table[2]/tbody/tr[4]/td[2]/table[3]/tbody/tr[2]').text
        license_status = driver.find_element_by_xpath('/html/body/table[2]/tbody/tr[4]/td[2]/table[1]/tbody/tr[3]/td[2]').text
        license_a = driver.find_element_by_xpath('/html/body/table[2]/tbody/tr[4]/td[2]/table[2]/tbody/tr[2]/td[3]/a').text          
        #license_second = info[3].find_elements_by_tag_name('td')[2].text
        #license_more = info[4].find_elements_by_tag_name('td')[2].text

        return pd.Series({
                'mail_address': mail_address,
                'location_address': location_address,
                'total_cost': total_cost,
                'issued_info': issued_info,
                'license_status': license_status,
                'license_a': license_a
                #'license_second': license_second,
                #'license_more': license_more     
            })
        
    except:
        print('There is no url.')

In [83]:
#vape_df = stores_df.apply(get_vapeshops, axis=1).join(stores_df)
#vape_df

### Use your dataframe and `.apply` to pull all of the data from the vape shops

Once you know it's working, use the whole dataframe.

- *TIP: Try using it with `.head(3)` first*
- *TIP: You'll want to use `.apply` with your new function*
- *TIP: Issued By and Total Paid are going to give you problems if you tried to use xpath! Try checking the classes and think about find_elementSSSSS and working backwards instead of forwards.*
- *TIP: You might need a `try`/`except`*
- *TIP: Make sure you're using `axis=1`*
- *TIP: Use `.join` to attach the new columns to your original dataframe - make sure you name your `dfs` right!*

In [84]:
stores_df.head()
#stores_df

|  | name | url |
|---|---|---|
| 0 | VAPE IT STORE II | https://jportal.mdcourts.gov/license/pbLicenseDetail.jsp?owi=bmDExzf4ZDw%3D |
| 1 | VAPE IT STORE I | https://jportal.mdcourts.gov/license/pbLicenseDetail.jsp?owi=glqAu7o7gJE%3D |
| 2 | VAPEPAD THE | https://jportal.mdcourts.gov/license/pbLicenseDetail.jsp?owi=zjsVR%2F6x1p8%3D |
| 3 | VAPE FROG | There is not a url. |
| 4 | VAPE FROG | There is not a url. |


In [85]:
vape_df = stores_df.apply(get_vapeshops, axis=1).join(stores_df)
vape_df

There is no url.
There is no url.
There is no url.
There is no url.
There is no url.
There is no url.
There is no url.
There is no url.
There is no url.
There is no url.
There is no url.
There is no url.


Unnamed: 0,mail_address,location_address,total_cost,issued_info,license_status,license_a,name,url
0,"AMIN NARGIS\n1104 PLANTERS PLACE\nSALISBURY, MD 21804","VAPE IT STORE II\n1015 S SALISBURY BLVD\nSALISBURY, MD 21801",35.00,"WICOMICO COUNTY, CLERK OF CIRCUIT COURT\nP.O. BOX 198\nSALISBURY, MARYLAND 21803-0198",License Status: Issued\nLicense No.: 22375606\nControl No.: 22884439\nDate of Issue: 4/27/2018\nMonths Paid: 12\nExp. Date: 4/30/2019\nSubdivision: 22 Salisbury,OTHER TOBACCO PROD RETLR,VAPE IT STORE II,https://jportal.mdcourts.gov/license/pbLicenseDetail.jsp?owi=bmDExzf4ZDw%3D
1,"AMIN NARGIS\n1104 PLANTERS PLACE\nSALISBURY, MD 21804","VAPE IT STORE I\n1724 N SALISBURY BLVD UNIT 2\nSALISBURY, MD 21801",$ 24.00,"WICOMICO COUNTY, CLERK OF CIRCUIT COURT\nP.O. BOX 198\nSALISBURY, MARYLAND 21803-0198",License Status: Issued\nLicense No.: 22375605\nControl No.: 22591855\nDate of Issue: 4/27/2018\nMonths Paid: 12\nExp. Date: 4/30/2019\nSubdivision: 22 Salisbury,TRADER'S LICENSE,VAPE IT STORE I,https://jportal.mdcourts.gov/license/pbLicenseDetail.jsp?owi=glqAu7o7gJE%3D
2,"ANJ DISTRIBUTIONS LLC\n2504 ORCHARD KNOLL WAY\nODENTON, MD 21113","VAPEPAD THE\n2299 JOHNS HOPKINS ROAD\nGAMBRILLS, MD 21054",$ 94.00,"ROBERT P. DUCKWORTH, CLERK OF CIRCUIT COURT\n8 CHURCH CIRCLE, ROOM H-101\nANNAPOLIS, MARYLAND 21401",License Status: Issued\nLicense No.: 02304705\nControl No.: 02685930\nDate of Issue: 4/05/2018\nMonths Paid: 12\nExp. Date: 4/30/2019\nSubdivision: 02 Anne Arundel County,TRADER'S LICENSE,VAPEPAD THE,https://jportal.mdcourts.gov/license/pbLicenseDetail.jsp?owi=zjsVR%2F6x1p8%3D
3,,,,,,,VAPE FROG,There is not a url.
4,,,,,,,VAPE FROG,There is not a url.
5,DISBROW II EMERSON HARRINGTON,"VAPE LOFT THE\n185 MITCHELLS CHANCE RD\nEDGEWATER, MD 21037",$ 154.00,"ROBERT P. DUCKWORTH, CLERK OF CIRCUIT COURT\n8 CHURCH CIRCLE, ROOM H-101\nANNAPOLIS, MARYLAND 21401",License Status: Issued\nLicense No.: 02310799\nControl No.: 02686069\nDate of Issue: 4/03/2018\nMonths Paid: 12\nExp. Date: 4/30/2019\nSubdivision: 02 Anne Arundel County,TRADER'S LICENSE,VAPE LOFT THE,https://jportal.mdcourts.gov/license/pbLicenseDetail.jsp?owi=%2BlPU%2F8mjjj8%3D
6,DISCOUNT TOBACCO ESSEX LLC,"VAPE N CIGAR\n7104 MINSTREL UNIT #7\nCOLUMBIA, MD 21045",$ 84.00,"WAYNE A. ROBEY, CLERK OF CIRCUIT COURT\n9250 BENDIX ROAD\nCOLUMBIA, MARYLAND 21045",License Status: Issued\nLicense No.: 13343011\nControl No.: 13856368\nDate of Issue: 4/30/2018\nMonths Paid: 12\nExp. Date: 4/30/2019\nSubdivision: 13 Howard County,TRADER'S LICENSE,VAPE N CIGAR,https://jportal.mdcourts.gov/license/pbLicenseDetail.jsp?owi=rGdKgh3sllo%3D
7,FAIRGROUND VILLAGE LLC,"VAPE DOJO\n330 ONE FORTY VILLAGE ROAD\nUNIT 15\nWESTMINSTER, MD 21157",$ 179.00,"DONALD B. SEALING II, CLERK OF CIRCUIT COURT\n55 NORTH COURT STREET\nWESTMINSTER, MARYLAND 21157-5155",License Status: Issued\nLicense No.: 06327188\nControl No.: 06946760\nDate of Issue: 4/05/2018\nMonths Paid: 12\nExp. Date: 4/30/2019\nSubdivision: 06 Westminster,TRADER'S LICENSE,VAPE DOJO,https://jportal.mdcourts.gov/license/pbLicenseDetail.jsp?owi=2NDDt0T4zOs%3D
8,,,,,,,VAPE HAVEN,There is not a url.
9,,,,,,,VAPORS LOUNGE,There is not a url.


## Save as `vape-total.csv`

Make sure you don't save the index! Open it up in a text editor or Excel to make sure it's correct.

In [79]:
vape_df.to_csv("vape-total.csv", index=False)

In [80]:
vape_df = pd.read_csv('vape-total.csv')
vape_df.head()

Unnamed: 0,mail_address,location_address,total_cost,issued_info,license_status,license_a,name,url
0,"AMIN NARGIS\r\n1104 PLANTERS PLACE\r\nSALISBURY, MD 21804","VAPE IT STORE II\r\n1015 S SALISBURY BLVD\r\nSALISBURY, MD 21801",35.00,"WICOMICO COUNTY, CLERK OF CIRCUIT COURT\r\nP.O. BOX 198\r\nSALISBURY, MARYLAND 21803-0198",License Status: Issued\r\nLicense No.: 22375606\r\nControl No.: 22884439\r\nDate of Issue: 4/27/2018\r\nMonths Paid: 12\r\nExp. Date: 4/30/2019\r\nSubdivision: 22 Salisbury,OTHER TOBACCO PROD RETLR,VAPE IT STORE II,https://jportal.mdcourts.gov/license/pbLicenseDetail.jsp?owi=bmDExzf4ZDw%3D
1,"AMIN NARGIS\r\n1104 PLANTERS PLACE\r\nSALISBURY, MD 21804","VAPE IT STORE I\r\n1724 N SALISBURY BLVD UNIT 2\r\nSALISBURY, MD 21801",$ 24.00,"WICOMICO COUNTY, CLERK OF CIRCUIT COURT\r\nP.O. BOX 198\r\nSALISBURY, MARYLAND 21803-0198",License Status: Issued\r\nLicense No.: 22375605\r\nControl No.: 22591855\r\nDate of Issue: 4/27/2018\r\nMonths Paid: 12\r\nExp. Date: 4/30/2019\r\nSubdivision: 22 Salisbury,TRADER'S LICENSE,VAPE IT STORE I,https://jportal.mdcourts.gov/license/pbLicenseDetail.jsp?owi=glqAu7o7gJE%3D
2,"ANJ DISTRIBUTIONS LLC\r\n2504 ORCHARD KNOLL WAY\r\nODENTON, MD 21113","VAPEPAD THE\r\n2299 JOHNS HOPKINS ROAD\r\nGAMBRILLS, MD 21054",$ 94.00,"ROBERT P. DUCKWORTH, CLERK OF CIRCUIT COURT\r\n8 CHURCH CIRCLE, ROOM H-101\r\nANNAPOLIS, MARYLAND 21401",License Status: Issued\r\nLicense No.: 02304705\r\nControl No.: 02685930\r\nDate of Issue: 4/05/2018\r\nMonths Paid: 12\r\nExp. Date: 4/30/2019\r\nSubdivision: 02 Anne Arundel County,TRADER'S LICENSE,VAPEPAD THE,https://jportal.mdcourts.gov/license/pbLicenseDetail.jsp?owi=zjsVR%2F6x1p8%3D
3,,,,,,,VAPE FROG,There is not a url.
4,,,,,,,VAPE FROG,There is not a url.
