# Mines, Part 2

You can get information about a specific mine by using its Mine ID.

**Try searching using the Mine ID `3503598`**.

## Preparation: Knowing your tags

These questions are asked of every data set, so they might not fit yours exactly.

### What is the tag and class name for the mine operator name?

In [2]:
# tr[3] td[1]

### What is the tag and class name for the current controller?

In [3]:
# tr[] td[1]

### What is the tag and class name for the operator history area?

In [4]:
# .drsoprhistory nextSibling

### What is the tag and class name for the mine's address?

In [5]:
# tr[-1] td[1]
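
The notes above use a positional shorthand such as `tr[3] td[1]`. Here is a minimal sketch of how a position like that could be read with BeautifulSoup, assuming the shorthand means "fourth row, second cell". The table and the `cell_text` helper are made up for illustration and are not the real MSHA markup.

```python
from bs4 import BeautifulSoup

# Hypothetical table for illustration only; the real page's markup will differ.
sample_html = """
<table>
  <tr><td>Label A</td><td>Value A</td></tr>
  <tr><td>Label B</td><td>Value B</td></tr>
  <tr><td>Label C</td><td>Value C</td></tr>
  <tr><td>Operator</td><td>Example Operator</td></tr>
  <tr><td>Address</td><td>Example Address</td></tr>
</table>
"""

sample_doc = BeautifulSoup(sample_html, 'html.parser')
rows = sample_doc.find_all('tr')


def cell_text(rows, row_index, cell_index):
    """Return the stripped text of one cell, chosen by row and cell position."""
    return rows[row_index].find_all('td')[cell_index].text.strip()


operator = cell_text(rows, 3, 1)   # tr[3] td[1]
address = cell_text(rows, -1, 1)   # tr[-1] td[1]
print(operator, '|', address)
```

Positions like these are brittle: if the page gains or loses a row, the indexes shift, so it is worth re-checking them against the live page.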

## Setup: Import what you'll need to scrape the page

Use `requests`, not `urllib`.

In [6]:
import requests
from bs4 import BeautifulSoup
import pandas as pd

## Scrape this page

Scrape this page and display the following:

- The operator
- The current address
- The current controller

**You should know how to do `.post` requests by now.**

In [7]:
url = 'https://arlweb.msha.gov/drs/ASP/BasicMineInfonew.asp'

data = {
    'MineId': '3503598',
    'x': '47',
    'y': '4'
}

response = requests.post(url, data=data)
doc = BeautifulSoup(response.text, 'html.parser')
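
The request above assumes the server always answers successfully. A slightly more defensive version might look like the sketch below. The function name `fetch_mine_page`, the `session` parameter, and the 30-second timeout are assumptions added for illustration. The URL and form fields are the ones used in this notebook.

```python
import requests
from bs4 import BeautifulSoup

URL = 'https://arlweb.msha.gov/drs/ASP/BasicMineInfonew.asp'


def fetch_mine_page(mine_id, session=None, timeout=30):
    """POST the search form for one mine and return the parsed page.

    `session` is optional; anything with a requests-style `.post` method works.
    """
    http = session if session is not None else requests
    form = {'MineId': str(mine_id), 'x': '47', 'y': '4'}
    response = http.post(URL, data=form, timeout=timeout)
    # Raise an exception on 4xx/5xx responses instead of parsing an error page
    response.raise_for_status()
    return BeautifulSoup(response.text, 'html.parser')
```

Calling `fetch_mine_page('3503598')` would then stand in for the `requests.post` and `BeautifulSoup` lines above.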

In [8]:
# Every value cell uses the same inline style, so look the cells up once and pick by position
cells = doc.find_all('font', attrs={'style': 'FONT-SIZE:.80em; color:#000080'})

mine = {
    'operator': cells[3].text.strip(),
    'address': cells[-1].text.strip(),
    'controller': cells[9].text.strip()
}

mine

{'address': '6860 Anderson Rd.\r\nAurora, OR\xa0\xa097002',
 'controller': 'S-2 Contractors Inc',
 'operator': 'Newberg Rock & Dirt'}
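
The address in the output above still contains a `\r\n` line break and `\xa0` non-breaking spaces. If a single-line address is preferred, one possible cleanup is sketched below. The helper name `clean_address` and the choice of `', '` as the joiner are assumptions, not part of the assignment.

```python
def clean_address(raw):
    """Collapse a scraped multi-line address into one comma-separated line."""
    # Turn non-breaking spaces into ordinary ones, then split on any line ending
    lines = raw.replace('\xa0', ' ').splitlines()
    # Squeeze runs of whitespace inside each line down to single spaces
    cleaned = [' '.join(line.split()) for line in lines]
    # Drop empty lines and join the rest
    return ', '.join(line for line in cleaned if line)


print(clean_address('6860 Anderson Rd.\r\nAurora, OR\xa0\xa097002'))
# 6860 Anderson Rd., Aurora, OR 97002
```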

## Getting information on many mines

### Reading in our source

Using pandas, read in `mines-subset.csv`.

In [9]:
df = pd.read_csv('mines-subset.csv')
df

Unnamed: 0,id
0,2501216
1,3200965
2,2901371
3,2901544


### Scrape every single row, storing the current controller and mine operator in new columns.

You probably want to open up the Jupyter Notebook that's about `.apply`.

In [10]:
def scrape(row):
    # Copy the form fields so the shared `data` dict is not modified on every call
    form = dict(data, MineId=row['id'])
    response = requests.post(url, data=form)
    doc = BeautifulSoup(response.text, 'html.parser')
    # All value cells share one inline style; look them up once and pick by position
    cells = doc.find_all('font', attrs={'style': 'FONT-SIZE:.80em; color:#000080'})
    fields = {
        'operator': cells[3].text.strip(),
        'address': cells[-1].text.strip(),
        'controller': cells[9].text.strip()
    }
    # Keep only the fields that are not empty
    return pd.Series({name: value for name, value in fields.items() if value})

df = df.join(df.apply(scrape, axis=1))
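
To see what the `.apply` plus `.join` pattern does without touching the network, here is a small sketch with a stand-in function. `fake_scrape` and its made-up values are purely illustrative; only the two ids come from the subset file.

```python
import pandas as pd


def fake_scrape(row):
    """Stand-in for a real scraper: builds values from the id, no network call."""
    return pd.Series({
        'operator': f"Operator {row['id']}",
        'controller': f"Controller {row['id']}",
    })


mines = pd.DataFrame({'id': ['2501216', '3200965']})

# axis=1 passes each row to the function; returning a Series per row
# produces a new dataframe with one column per Series key
scraped = mines.apply(fake_scrape, axis=1)

# join lines the new columns up with the original rows by index
combined = mines.join(scraped)
print(combined)
```

Because `join` aligns on the index, the original `id` column stays next to the scraped values.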

### Save your dataframe

In [11]:
df.to_csv('mines_test_scrape.csv', index=False)

### Re-open your dataframe to confirm you didn't save any extra weird columns

In [12]:
pd.read_csv('mines_test_scrape.csv')

Unnamed: 0,id,address,controller,operator
0,2501216,"24617 W Center Rd\r\nWaterloo, NE 68069",David A Iske,Iske Dirt Sand & Gravel
1,3200965,"485 Helene St\r\nPalermo, ND 58769",John Lynn,J M Lynn Dirtwork
2,2901371,"E Hwy 60\r\nHEREFORD, TX 79045",Lawson Warner,Jake Diel Dirt & Paving Inc
3,2901544,"E Hwy 60\r\nHEREFORD, TX 79045",Lawson Warner,Jake Diel Dirt & Paving Inc
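
The check for extra columns can also be done in code rather than by eye. The sketch below uses in-memory buffers and a made-up two-row dataframe, both assumptions for illustration, to show how saving with the default `index=True` produces an `Unnamed: 0` column on re-reading while `index=False` does not.

```python
import io

import pandas as pd

frame = pd.DataFrame({'id': ['2501216', '3200965'], 'operator': ['A', 'B']})

# Default to_csv writes the index as a first column with an empty header
with_index = io.StringIO()
frame.to_csv(with_index)
with_index.seek(0)
reread_with_index = pd.read_csv(with_index)

# index=False leaves the index out of the file
without_index = io.StringIO()
frame.to_csv(without_index, index=False)
without_index.seek(0)
reread_without_index = pd.read_csv(without_index)

extra = [name for name in reread_with_index.columns if name.startswith('Unnamed')]
print(extra)
print(list(reread_without_index.columns))
```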


## Repeat this process for the entire `mines.csv` file

In [17]:
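# Read ids as strings so any leading zeros survive
df = pd.read_csv('mines.csv', converters={'id': str})
# Note: this replaces df with the scraped columns only, so the id column is not kept
df = df.apply(scrape, axis=1)
df.to_csv('mines_complete_scrape.csv', index=False)
pd.read_csv('mines_complete_scrape.csv')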

Unnamed: 0,address,controller,operator
0,"6860 Anderson Rd.\r\nAurora, OR 97002",S-2 Contractors Inc,Newberg Rock & Dirt
1,"2360 West 2nd Ave\r\nDENVER, CO 80010",Allied Dirt Moving Company,Allied Dirt Moving Company
2,"120 Dally Ln\r\nBuffalo, WY 82834",Matt Mitchell,AM Dirtworks & Aggregate Sales
3,"GREEN RIVER, UT 84525",Atlas Resources Inc & Dirty Devil Mining Co,Atlas-Dirty Devil Mining
4,"GREEN RIVER, UT 84525",Atlas Resources Inc & Dirty Devil Mining Co,Atlas-Dirty Devil Mining
5,"1773 Cedar View Rd\r\nSoda Springs, ID 83276",Steven Meyers,Babe's Dirt Work
6,"8002 Dogwood Trail\r\nHAUGHTON, LA 71037",Barlow James & John Lindsey,Bar-Lin Dirt Company
7,"1590 Terry\r\nVIDOR, TX 77662",Barber Donald,Barber'S Dirt Pit
8,"Rr 1 Box 32\r\nRUSSELL, KS 67665",Bender Marvin,Bender Sand & Dirt
9,"115 Georges Pond Road\r\nFranklin, ME 04634",Greg P Carter,BERT'S DIRT
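
When scraping a whole file, one slow or malformed page can stop the entire run. A possible way to pace the requests and keep going after a failure is sketched below. The function `scrape_politely`, its `scrape_one` callback, and the one-second default delay are all assumptions for illustration.

```python
import time


def scrape_politely(ids, scrape_one, delay_seconds=1.0):
    """Call scrape_one for each id, pausing between calls and recording failures."""
    results, failures = [], []
    for position, mine_id in enumerate(ids):
        # Pause before every request except the first
        if position > 0:
            time.sleep(delay_seconds)
        try:
            results.append(scrape_one(mine_id))
        except Exception as error:
            # Remember which id failed and why, then move on to the next one
            failures.append((mine_id, repr(error)))
    return results, failures
```

The `failures` list can be inspected afterwards to decide which ids to retry by hand.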
