# Rock and Mineral Clubs

Scrape all of the rock and mineral clubs listed at https://rocktumbler.com/blog/rock-and-mineral-clubs/ (but don't just cut and paste!)

Save a CSV called `rock-clubs.csv` with the name of the club, their URL, and the city they're located in.

**Bonus**: Add a column for the state. There are a few ways to do this, but knowing that `element.parent` goes 'up' one element might be helpful.

* _**Hint:** The name of the club and the city are both inside `td` elements, and they can't be told apart by class. Instead, ask for all of the `td`s in a row and take the text from the first or second one._
* _**Hint:** If you use BeautifulSoup, you can select elements by attributes other than class or id._
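
Both hints can be tried on a small hand-written fragment before touching the live page. This is only a sketch: the HTML below is made up to resemble the page's rows (the club names, cities, and URLs are placeholders), and it assumes club rows carry a `bgcolor="#FFFFFF"` attribute while state headers do not.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment that mimics the page layout; names and URLs are made up.
html = """
<table>
  <tr><td bgcolor="#B9EDB8"><h3>Example State Rock and Mineral Clubs</h3></td></tr>
  <tr bgcolor="#FFFFFF"><td width="60%"><a href="http://example.com/a">Club A</a></td>
  <td width="40%">Springfield</td></tr>
  <tr bgcolor="#FFFFFF"><td width="60%"><a href="http://example.com/b">Club B</a></td>
  <td width="40%">Shelbyville</td></tr>
</table>
"""

doc = BeautifulSoup(html, "html.parser")

clubs = []
# Select rows by an attribute other than class or id.
for row in doc.find_all("tr", attrs={"bgcolor": "#FFFFFF"}):
    cells = row.find_all("td")          # first td = club name, second td = city
    clubs.append({
        "club": cells[0].get_text(strip=True),
        "link": cells[0].find("a")["href"],
        "city": cells[1].get_text(strip=True),
    })

# .parent goes 'up' one element: from the <a> to the <td> that contains it.
parent_name = doc.find("a").parent.name
```

Indexing into `cells` keeps the code independent of the `width` attributes, at the cost of assuming every club row has exactly the two cells in that order.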

In [54]:
import requests
from bs4 import BeautifulSoup
import re
import pandas as pd

In [55]:
url = 'https://rocktumbler.com/blog/rock-and-mineral-clubs/'
# verify=False skips TLS certificate checks; remove it if the site's certificate validates
response = requests.get(url, verify=False)
doc = BeautifulSoup(response.text, 'html.parser')



In [56]:
rows = doc.find_all('tr')
rows

[<tr><td bgcolor="#B9EDB8">
 <h3>Alabama Rock and Mineral Clubs</h3>
 </td></tr>,
 <tr bgcolor="#FFFFFF"><td width="60%"><a href="http://www.lapidaryclub.com/">Alabama Mineral &amp; Lapidary Society</a></td>
 <td width="40%">Birmingham</td></tr>,
 <tr bgcolor="#FFFFFF"><td width="60%"><a href="http://www.wiregrassrockhounds.com/">Dothan Gem &amp; Mineral Club</a></td>
 <td width="40%">Dothan</td></tr>,
 <tr bgcolor="#FFFFFF"><td width="60%"><a href="http://huntsvillegms.org/">Huntsville Gem and Mineral Society</a></td>
 <td width="40%">Huntsville</td></tr>,
 <tr bgcolor="#FFFFFF"><td width="60%"><a href="http://www.mobilerockandgem.com/">Mobile Rock &amp; Gem Society</a></td>
 <td width="40%">Mobile</td></tr>,
 <tr bgcolor="#FFFFFF"><td width="60%"><a href="http://montgomerygemandmineralsociety.com/mgms/">Montgomery Gem &amp; Mineral Society</a></td>
 <td width="40%">Montgomery</td></tr>,
 <tr><td bgcolor="#B9EDB8">
 <h3>Alaska Rock and Mineral Clubs</h3>
 </td></tr>,
 <tr bgcolor="#FF

In [57]:
records = []
state = ''
club = ''
link = ''
city = ''

In [58]:
for row in rows:
    heading = row.find('h3')
    if heading is not None:
        # A header row names the state for the club rows that follow it.
        # The regex grabs up to two words plus the "R" of "Rock"; drop the "R"
        # and strip the trailing space it leaves behind.
        state = re.findall(r"^\w*\s?\w*\s?R", heading.text)[0][:-1].strip()
    else:
        # Club rows: the 60%-wide cell holds the linked name, the 40%-wide cell the city.
        name = row.find('td', width="60%")
        if name is not None:
            club = name.text
            link = name.find('a')['href']
        location = row.find('td', width="40%")
        if location is not None:
            city = location.text
            records.append({
                'state': state,
                'club': club,
                'city': city,
                'link': link
            })
records

[{'state': 'Alabama',
  'club': 'Alabama Mineral & Lapidary Society',
  'city': 'Birmingham',
  'link': 'http://www.lapidaryclub.com/'},
 {'state': 'Alabama',
  'club': 'Dothan Gem & Mineral Club',
  'city': 'Dothan',
  'link': 'http://www.wiregrassrockhounds.com/'},
 {'state': 'Alabama',
  'club': 'Huntsville Gem and Mineral Society',
  'city': 'Huntsville',
  'link': 'http://huntsvillegms.org/'},
 {'state': 'Alabama',
  'club': 'Mobile Rock & Gem Society',
  'city': 'Mobile',
  'link': 'http://www.mobilerockandgem.com/'},
 {'state': 'Alabama',
  'club': 'Montgomery Gem & Mineral Society',
  'city': 'Montgomery',
  'link': 'http://montgomerygemandmineralsociety.com/mgms/'},
 {'state': 'Alaska',
  'club': 'Chugach Gem & Mineral Society',
  'city': 'Anchorage',
  'link': 'http://www.chugachgemandmineralsociety.com/'},
 {'state': 'Alaska',
  'club': 'Mat-Su Rock and Mineral Club',
  'city': 'Palmer',
  'link': 'http://matsurockclub.com/'},
 {'state': 'Arizona',
  'club': 'Apache 
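
The state name is pulled out of each header with a regex that matches at most two words followed by the `R` of `Rock`. The standalone sketch below shows how that pattern behaves on sample headings and compares it with a simpler alternative that cuts a fixed suffix off the end. The helper names are hypothetical, and the suffix `" Rock and Mineral Clubs"` is an assumption based on the two headers visible in the output above.

```python
import re

SUFFIX = " Rock and Mineral Clubs"  # assumed to end every state heading

def state_from_regex(heading):
    # Up to two words, then the "R" of "Rock"; drop the "R" and the space before it.
    match = re.findall(r"^\w*\s?\w*\s?R", heading)[0]
    return match[:-1].strip()

def state_from_suffix(heading):
    # Alternative: remove the fixed suffix, however many words come before it.
    heading = heading.strip()
    if heading.endswith(SUFFIX):
        heading = heading[:-len(SUFFIX)]
    return heading

for heading in ["Alabama Rock and Mineral Clubs", "New York Rock and Mineral Clubs"]:
    print(repr(state_from_regex(heading)), repr(state_from_suffix(heading)))
```

The regex version raises an `IndexError` on a heading with three or more words before "Rock", because `re.findall` returns an empty list there; the suffix version has no such limit but returns the heading unchanged if the wording differs.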

In [59]:
# need a dictionary for each row, and then a list of them all
df = pd.DataFrame(records)
df.head()

| | city | club | link | state |
|---|---|---|---|---|
| 0 | Birmingham | Alabama Mineral & Lapidary Society | http://www.lapidaryclub.com/ | Alabama |
| 1 | Dothan | Dothan Gem & Mineral Club | http://www.wiregrassrockhounds.com/ | Alabama |
| 2 | Huntsville | Huntsville Gem and Mineral Society | http://huntsvillegms.org/ | Alabama |
| 3 | Mobile | Mobile Rock & Gem Society | http://www.mobilerockandgem.com/ | Alabama |
| 4 | Montgomery | Montgomery Gem & Mineral Society | http://montgomerygemandmineralsociety.com/mgms/ | Alabama |


In [60]:
# keep the header row so the columns in the CSV are labeled; drop only the index
df.to_csv("rock-clubs.csv", index=False)
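
One way to sanity-check a CSV export is to write it and read it straight back. The sketch below is pandas-free: `csv.DictWriter` stands in for `df.to_csv`, an in-memory buffer stands in for `rock-clubs.csv`, and the two records are made-up placeholders in the same shape as the scraped ones.

```python
import csv
import io

# Placeholder records shaped like the scraped ones (values are made up).
records = [
    {"state": "Example State", "club": "Club A", "city": "Springfield",
     "link": "http://example.com/a"},
    {"state": "Example State", "club": "Club B, Inc.", "city": "Shelbyville",
     "link": "http://example.com/b"},
]

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["club", "link", "city", "state"])
writer.writeheader()        # column names go on the first line
writer.writerows(records)   # values containing commas are quoted automatically

# Read the text back; DictReader takes its keys from the header line.
buffer.seek(0)
round_trip = list(csv.DictReader(buffer))
```

Without a header line, a reader has no column names to work from and would treat the first club as the header, which is the main reason to keep it in the file.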

In [61]:
df.state.value_counts()

California         86
Texas              31
Washington         24
Arizona            21
Michigan           20
Florida            20
Colorado           18
Pennsylvania       16
New York           15
Oregon             15
North Carolina     13
Illinois           12
Maryland           12
Ohio               12
Wisconsin          11
Missouri           10
Utah                9
Idaho               8
Massachusetts       8
Indiana             8
Virginia            8
Oklahoma            6
New Jersey          6
Nevada              6
Iowa                6
Georgia             6
Minnesota           5
Mississippi         5
New Mexico          5
Montana             5
South Carolina      5
Alabama             5
Connecticut         5
New Hampshire       4
Kansas              4
Tennessee           4
Maine               4
Arkansas            3
West Virginia       3
Louisiana           3
Wyoming             3
Kentucky            3
Alaska              2
South Dakota        2
Vermont             2
Rhode Isla