# Rock and Mineral Clubs

Scrape all of the rock and mineral clubs listed at https://rocktumbler.com/blog/rock-and-mineral-clubs/ (but don't just cut and paste!)

Save a CSV called `rock-clubs.csv` with the name of the club, their URL, and the city they're located in.

**Bonus**: Add a column for the state. There are a few ways to do this, but knowing that `element.parent` goes 'up' one element might be helpful.

* _**Hint:** The name of the club and the city are both inside of td elements, and they aren't distinguishable by class. Instead, ask for all of the tds in a row and take the text from the first or second one._
* _**Hint:** If you use BeautifulSoup, you can select elements by attributes other than class or id: instead of `doc.find_all(attrs={'class': 'cat'})` you can do things like `doc.find_all(attrs={'other_attribute': 'blah'})` (sorry for the awful example)._
* _**Hint:** If you love `pd.read_html` you might also be interested in `pd.concat` and potentially `list()`. But you'll have to clean a little more!_
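
To make the attribute hint concrete, here is a small sketch that picks out cells by their `width` attribute rather than by class. The HTML is a trimmed copy of two rows from the Alabama table shown further down, and `sample_html`, `sample_doc`, `name_cells` and `city_cells` are names made up for this example. Whether `width` stays a reliable marker across the whole live page is an assumption worth checking.

```python
from bs4 import BeautifulSoup

# Trimmed sample of the page's markup (two rows only, for illustration)
sample_html = """
<table>
<tr bgcolor="#FFFFFF"><td width="60%"><a href="http://www.lapidaryclub.com/">Alabama Mineral &amp; Lapidary Society</a></td>
<td width="40%">Birmingham</td></tr>
<tr bgcolor="#FFFFFF"><td width="60%"><a href="http://www.wiregrassrockhounds.com/">Dothan Gem &amp; Mineral Club</a></td>
<td width="40%">Dothan</td></tr>
</table>
"""

sample_doc = BeautifulSoup(sample_html, "html.parser")

# attrs= matches on any attribute, not just class or id
name_cells = sample_doc.find_all("td", attrs={"width": "60%"})
city_cells = sample_doc.find_all("td", attrs={"width": "40%"})

names = [cell.text.strip() for cell in name_cells]
cities = [cell.text.strip() for cell in city_cells]
print(names)
print(cities)
```

Selecting by position (first td, second td) and selecting by attribute both work on this sample; the attribute version just doesn't depend on the order of the cells.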

In [28]:
import requests
from bs4 import BeautifulSoup
import re

In [2]:
response = requests.get('https://rocktumbler.com/blog/rock-and-mineral-clubs/')
# name the parser explicitly so BeautifulSoup doesn't have to guess (and warn)
doc = BeautifulSoup(response.text, 'html.parser')

### just alabama info

In [10]:
alabama_stuff = doc.find_all('section')[1]
alabama_stuff

<section>
<br/><br/>
<table bgcolor="#CCCCCC" cellpadding="4" cellspacing="1" width="100%"><tr><td bgcolor="#B9EDB8">
<h3>Alabama Rock and Mineral Clubs</h3>
</td></tr></table>
<table bgcolor="#CCCCCC" border="0" cellpadding="4" cellspacing="1" class="font12" width="100%">
<tr bgcolor="#FFFFFF"><td width="60%"><a href="http://www.lapidaryclub.com/">Alabama Mineral &amp; Lapidary Society</a></td>
<td width="40%">Birmingham</td></tr>
<tr bgcolor="#FFFFFF"><td width="60%"><a href="http://www.wiregrassrockhounds.com/">Dothan Gem &amp; Mineral Club</a></td>
<td width="40%">Dothan</td></tr>
<tr bgcolor="#FFFFFF"><td width="60%"><a href="http://huntsvillegms.org/">Huntsville Gem and Mineral Society</a></td>
<td width="40%">Huntsville</td></tr>
<tr bgcolor="#FFFFFF"><td width="60%"><a href="http://www.mobilerockandgem.com/">Mobile Rock &amp; Gem Society</a></td>
<td width="40%">Mobile</td></tr>
<tr bgcolor="#FFFFFF"><td width="60%"><a href="http://montgomerygemandmineralsociety.com/mgms/">Mont

In [22]:
# club names
alabama_a = alabama_stuff.find_all('a')
for item in alabama_a:
    print(item.text.strip())

Alabama Mineral & Lapidary Society
Dothan Gem & Mineral Club
Huntsville Gem and Mineral Society
Mobile Rock & Gem Society
Montgomery Gem & Mineral Society


In [58]:
# cities
# every <td> in the section: index 0 is the state heading, then name/city pairs
alabama_tds = alabama_stuff.find_all('td')
# so the cities sit at indexes 2, 4, 6, ...
for item in alabama_tds[2::2]:
    print(item.text.strip())

Birmingham
Dothan
Huntsville
Mobile
Montgomery


In [86]:
alabama_rows = alabama_stuff.find_all('tr')
# the heading reads "<State> Rock and Mineral Clubs"; it's the same for every row
heading = alabama_stuff.find('h3').text.strip()
state = re.findall(r"(.*) Rock and Mineral Clubs", heading)[0]
# the first <tr> holds the heading, so skip it
for row in alabama_rows[1:]:
    tds = row.find_all('td')
    print(tds[0].text.strip())    # club name
    print(tds[1].text.strip())    # city
    print(row.find('a')['href'])  # url
    print(state)

Alabama Mineral & Lapidary Society
Birmingham
http://www.lapidaryclub.com/
Alabama
Dothan Gem & Mineral Club
Dothan
http://www.wiregrassrockhounds.com/
Alabama
Huntsville Gem and Mineral Society
Huntsville
http://huntsvillegms.org/
Alabama
Mobile Rock & Gem Society
Mobile
http://www.mobilerockandgem.com/
Alabama
Montgomery Gem & Mineral Society
Montgomery
http://montgomerygemandmineralsociety.com/mgms/
Alabama
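
The bonus hint mentions that `element.parent` goes 'up' one element. As an alternative to starting from the section, here is a rough sketch that starts from a single club link and climbs up to find its state. The HTML is a cut-down copy of the Alabama section above, parsed with `html.parser`, and the variable names are invented for the example. The number of `.parent` hops assumes exactly this nesting (a, td, tr, table, section); `find_parent` is the less brittle option.

```python
from bs4 import BeautifulSoup

# Cut-down copy of one state's section, for illustration only
sample_html = """
<section>
<table><tr><td bgcolor="#B9EDB8">
<h3>Alabama Rock and Mineral Clubs</h3>
</td></tr></table>
<table class="font12">
<tr bgcolor="#FFFFFF"><td width="60%"><a href="http://huntsvillegms.org/">Huntsville Gem and Mineral Society</a></td>
<td width="40%">Huntsville</td></tr>
</table>
</section>
"""

sample_doc = BeautifulSoup(sample_html, "html.parser")

# Start from a single club link, as if we had found it on its own
link = sample_doc.find("a")

# .parent goes up one element at a time: a -> td -> tr -> table -> section
row = link.parent.parent
section_by_steps = row.parent.parent

# find_parent reaches the same element without counting levels
section = link.find_parent("section")

heading = section.find("h3").text.strip()
state = heading.replace(" Rock and Mineral Clubs", "")
print(state)
```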


### putting it all together

In [98]:
stuff = doc.find_all('section')[1:51]

In [134]:
rock_list = []
for section in stuff:
    rows = section.find_all('tr')
    # the first row holds the heading "<State> Rock and Mineral Clubs"
    heading = rows[0].find('h3').text.strip()
    state = re.findall(r"(.*) Rock and Mineral Clubs", heading)[0]
    # every remaining row is one club: name (with link) and city
    for row in rows[1:]:
        tds = row.find_all('td')
        rock_list.append({
            'name': tds[0].text.strip(),
            'city': tds[1].text.strip(),
            'url': row.find('a')['href'],
            'state': state
        })
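
The loop above pulls the state out of each heading with a regular expression. If a heading ever fails to follow the "<State> Rock and Mineral Clubs" pattern, it helps to notice that explicitly. Below is a minimal sketch of one way to do that; `state_from_heading` and `STATE_PATTERN` are hypothetical names for this example, not part of the original notebook.

```python
import re

# Compile once; headings look like "Alabama Rock and Mineral Clubs"
STATE_PATTERN = re.compile(r"(.*) Rock and Mineral Clubs")

def state_from_heading(heading):
    """Return the state name from a section heading, or None if it doesn't match."""
    match = STATE_PATTERN.fullmatch(heading.strip())
    return match.group(1) if match else None

print(state_from_heading("Alabama Rock and Mineral Clubs"))
print(state_from_heading("New Mexico Rock and Mineral Clubs"))
print(state_from_heading("Something else entirely"))
```

Returning `None` for an unexpected heading makes it easy to skip or log that section rather than silently storing the wrong state.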

In [135]:
rock_list

[{'name': 'Alabama Mineral & Lapidary Society',
  'city': 'Birmingham',
  'url': 'http://www.lapidaryclub.com/',
  'state': 'Alabama'},
 {'name': 'Dothan Gem & Mineral Club',
  'city': 'Dothan',
  'url': 'http://www.wiregrassrockhounds.com/',
  'state': 'Alabama'},
 {'name': 'Huntsville Gem and Mineral Society',
  'city': 'Huntsville',
  'url': 'http://huntsvillegms.org/',
  'state': 'Alabama'},
 {'name': 'Mobile Rock & Gem Society',
  'city': 'Mobile',
  'url': 'http://www.mobilerockandgem.com/',
  'state': 'Alabama'},
 {'name': 'Montgomery Gem & Mineral Society',
  'city': 'Montgomery',
  'url': 'http://montgomerygemandmineralsociety.com/mgms/',
  'state': 'Alabama'},
 {'name': 'Chugach Gem & Mineral Society',
  'city': 'Anchorage',
  'url': 'http://www.chugachgemandmineralsociety.com/',
  'state': 'Alaska'},
 {'name': 'Mat-Su Rock and Mineral Club',
  'city': 'Palmer',
  'url': 'http://matsurockclub.com/',
  'state': 'Alaska'},
 {'name': 'Apache Junction Rock and Gem Club',
  'city'

In [120]:
import pandas as pd



In [136]:
df = pd.DataFrame(rock_list)

In [137]:
df

| | name | city | url | state |
|---|---|---|---|---|
| 0 | Alabama Mineral & Lapidary Society | Birmingham | http://www.lapidaryclub.com/ | Alabama |
| 1 | Dothan Gem & Mineral Club | Dothan | http://www.wiregrassrockhounds.com/ | Alabama |
| 2 | Huntsville Gem and Mineral Society | Huntsville | http://huntsvillegms.org/ | Alabama |
| 3 | Mobile Rock & Gem Society | Mobile | http://www.mobilerockandgem.com/ | Alabama |
| 4 | Montgomery Gem & Mineral Society | Montgomery | http://montgomerygemandmineralsociety.com/mgms/ | Alabama |
| ... | ... | ... | ... | ... |
| 481 | Weis'n'Miners Geology Club | Menasha | http://www.weismuseum.org/geology-club.html | Wisconsin |
| 482 | Wisconsin Geological Society | West Allis | http://www.wisgeologicalsociety.com/ | Wisconsin |
| 483 | Cody 59ers Rock Club | Cody | http://www.cody59ers.com/ | Wyoming |
| 484 | Riverton Mineral and Gem Society | Riverton | http://www.rivertonmgs.com/ | Wyoming |


In [140]:
df.to_csv('rock-clubs.csv', index=False)
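
If pandas isn't handy, the same kind of file can be written with the standard library's `csv.DictWriter`. The sketch below uses two records copied from the output above and writes to an in-memory buffer instead of `rock-clubs.csv`, so it runs without touching the disk. `sample_clubs`, `buffer` and `rows_read` are names invented for the example.

```python
import csv
import io

# Two records copied from the scraped list, just to have something to write
sample_clubs = [
    {'name': 'Alabama Mineral & Lapidary Society', 'city': 'Birmingham',
     'url': 'http://www.lapidaryclub.com/', 'state': 'Alabama'},
    {'name': 'Chugach Gem & Mineral Society', 'city': 'Anchorage',
     'url': 'http://www.chugachgemandmineralsociety.com/', 'state': 'Alaska'},
]

# newline='' leaves line endings to the csv module
buffer = io.StringIO(newline='')
writer = csv.DictWriter(buffer, fieldnames=['name', 'city', 'url', 'state'])
writer.writeheader()
writer.writerows(sample_clubs)

# Read it back to confirm the round trip
buffer.seek(0)
rows_read = list(csv.DictReader(buffer))
print(rows_read[0]['name'])
print(len(rows_read))
```

To write a real file, swapping the buffer for `open('rock-clubs.csv', 'w', newline='')` should be enough.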