# Rock and Mineral Clubs

Scrape all of the rock and mineral clubs listed at https://rocktumbler.com/blog/rock-and-mineral-clubs/ (but don't just cut and paste!)

Save a CSV called `rock-clubs.csv` with each club's name, its URL, and the city it's located in.

**Bonus**: Add a column for the state. There are a few ways to do this, but knowing that `element.parent` goes 'up' one element might be helpful.

* _**Hint:** The name of the club and the city are both inside `td` elements, and they can't be told apart by class. Instead, ask for all of the `td`s in a row and then take the text from the first or second one._
* _**Hint:** If you use BeautifulSoup, you can select elements by attributes other than class or id: instead of `doc.find_all(attrs={'class': 'cat'})` you can do things like `doc.find_all(attrs={'other_attribute': 'blah'})` (sorry for the awful example)._
* _**Hint:** If you love `pd.read_html` you might also be interested in `pd.concat` and potentially `list()`. But you'll have to clean a little more!_
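
The first two hints can be tried out on a small inline snippet before touching the real page. This is only a sketch: the markup below is a hand-written stand-in (the grey header row in particular is an assumption, not something copied from the site), and `sample_html`, `sample_doc`, and `sample_rows` are illustrative names.

```python
from bs4 import BeautifulSoup

# Hand-written stand-in for one club row plus a differently coloured row.
# The real page's markup may differ; this only illustrates the two hints.
sample_html = """
<table>
  <tr bgcolor="#CCCCCC"><td>Club</td><td>City</td></tr>
  <tr bgcolor="#FFFFFF">
    <td><a href="http://www.wiregrassrockhounds.com/">Dothan Gem &amp; Mineral Club</a></td>
    <td>Dothan</td>
  </tr>
</table>
"""
sample_doc = BeautifulSoup(sample_html, "html.parser")

# Select rows by an attribute that is neither class nor id
sample_rows = sample_doc.find_all("tr", attrs={"bgcolor": "#FFFFFF"})

for row in sample_rows:
    cells = row.find_all("td")
    # The tds have no class, so pick them by position instead
    name = cells[0].get_text(strip=True)
    city = cells[1].get_text(strip=True)
    print(name, "|", city)
```

Passing the attribute as a keyword (`bgcolor="#FFFFFF"`), as the cells below do, is equivalent to the `attrs=` dictionary used here.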

In [45]:
import requests
from bs4 import BeautifulSoup

In [46]:
rocks = "https://rocktumbler.com/blog/rock-and-mineral-clubs/"
raw_html = requests.get(rocks).content
doc = BeautifulSoup(raw_html, "html.parser")
print(doc.prettify())

<!DOCTYPE html>
<html>
 <head>
  <meta charset="utf-8"/>
  <link href="https://rocktumbler.com/blog/rock-and-mineral-clubs/" rel="canonical"/>
  <title>
   450+ Rock and Mineral Clubs Across the USA
  </title>
  <meta content="Rock mineral and gem clubs are a great way to meet people with knowledge and interest in the field. Find one near you." name="description"/>
  <meta content="Rock and mineral clubs in the United States" name="page-topic"/>
  <link href="https://rocktumbler.com/cssa.css" media="all" rel="stylesheet" type="text/css"/>
  <meta content="width=device-width, initial-scale=1.0" name="viewport"/>
  <meta content="text/html; charset=utf-8" http-equiv="Content-Type"/>
  <link href="https://rocktumbler.com/favicon.ico" rel="SHORTCUT ICON"/>
  <script async="" src="//pagead2.googlesyndication.com/pagead/js/adsbygoogle.js">
  </script>
  <script>
   (adsbygoogle = window.adsbygoogle || []).push({
          google_ad_client: "ca-pub-3777031107463803",
          enable_page_l

In [124]:
# Clubs: each white-background row holds one club; its name is the link text
rows = doc.find_all('tr', bgcolor="#FFFFFF")
for row in rows:
    # Loop over the links directly. Looping over the tds first prints
    # every name once per cell, i.e. twice.
    for link in row.find_all('a'):
        print(link.text)

Alabama Mineral & Lapidary Society
Dothan Gem & Mineral Club
Huntsville Gem and Mineral Society
Mobile Rock & Gem Society
Montgomery Gem & Mineral Society
Chugach Gem & Mineral Society
Mat-Su Rock and Mineral Club
Apache Junction Rock and Gem Club
Black Canyon City Rock Club
Daisy Mountain Rock & Mineral Club
Gila County Gem & Mineral Society
Huachuca Mineral and Gem Club
Lake Havasu Gem & Mineral Society
Mineralogical Society of Arizona
Mingus Gem & Mineral Club
Mohave County Gemstoners
Old Pueblo Lap

In [48]:
# URLs: the href of every link that sits inside a table cell
cells = doc.find_all('td')
for cell in cells:
    for link in cell.find_all('a'):
        print(link['href'])

http://www.lapidaryclub.com/
http://www.wiregrassrockhounds.com/
http://huntsvillegms.org/
http://www.mobilerockandgem.com/
http://montgomerygemandmineralsociety.com/mgms/
http://www.chugachgemandmineralsociety.com/
http://matsurockclub.com/
http://www.ajrockclub.com/
http://www.bccrockclub.mysite.com/
http://www.dmrmc.com/
http://gilagem.org/
http://www.huachucamineralandgemclub.info/
http://www.lakehavasugms.org/
http://www.msaaz.org/
http://www.mingusclub.org/
http://www.gemstoners.org/
http://www.lapidaryclub.org/
https://www.pinalgeologymuseum.org/index.php/pinal-gem-mineral-society/about-pgms
http://www.prescottgemmineral.org/
http://www.qrgmc.org/
http://rockhounds.scwclubs.com/
http://www.sedonagemandmineral.org/
http://scrrc.blogspot.com/
https://sites.google.com/site/cochisecountyrock/
http://www.tgms.org/
http://www.verderiverrockhounds.com/
http://www.westvalleyrockandmineralclub.com/
http://whitemountain-azrockclub.org/
http://www.wickenburggms.org/
http://www.centralarroc

In [123]:
# City: the second td of each club row
rows = doc.find_all('tr', bgcolor="#FFFFFF")
for row in rows:
    cells = row.find_all('td')
    print(cells[1].text)

Birmingham
Dothan
Huntsville
Mobile
Montgomery
Anchorage
Palmer
Apache Junction
Black Canyon City
Anthem
Miami
Sierra Vista
Lake Havasu City
Scottsdale
Cottonwood
Kingman
Tucson
Coolidge
Prescott Valley
Quartzsite
Phoenix
Sedona
Bullhead City
Pearce
Tucson
Cottonwood
Buckeye
Show Low
Wickenburg
Little Rock
Siloam Springs
Mountain Home
Sutter Creek
Anaheim
Lancaster
Antioch
Borrego Springs
Angels Camp
Pacific Grove
San Luis Obispo
Coalinga
Thousand Oaks
Concord
Culver City
Northridge
Bellflower
El Cajon
Placerville
Buena Park
Fallbrook
Oroville
Yucca Valley
North Highlands
Fresno
Placerville
Yucca Valley
Ridgecrest
Bakersfield
Lemoore
Livermore
Lone Pine
Mariposa
Hayward
Pasadena
Barstow
Monrovia
Modesto
Mohave Valley
Grass Valley
Orinda
La Habra
San Bernardino
Santa Maria
Oxnard
Palmdale
Escondido
Rolling Hills Estates
Paradise
Pasadena
Los Altos
Roseville
Brea
Sacramento
North Highlands
Salinas
San Diego
San Diego
San Francisco
Bakersfield
San Luis Obispo
Santa Ana
San Jose
Santa Cruz

In [126]:
rock_clubs = []
rows = doc.find_all('tr', bgcolor="#FFFFFF")
for row in rows:
    club = {}  # not named `rocks`, which already holds the page URL
    cells = row.find_all('td')
    club['city'] = cells[1].text
    # The club's name and URL both come from the link in the row
    for link in row.find_all('a'):
        club['club_name'] = link.text
        club['url'] = link['href']
    print(club)
    rock_clubs.append(club)

{'city': 'Birmingham', 'club_name': 'Alabama Mineral & Lapidary Society', 'url': 'http://www.lapidaryclub.com/'}
{'city': 'Dothan', 'club_name': 'Dothan Gem & Mineral Club', 'url': 'http://www.wiregrassrockhounds.com/'}
{'city': 'Huntsville', 'club_name': 'Huntsville Gem and Mineral Society', 'url': 'http://huntsvillegms.org/'}
{'city': 'Mobile', 'club_name': 'Mobile Rock & Gem Society', 'url': 'http://www.mobilerockandgem.com/'}
{'city': 'Montgomery', 'club_name': 'Montgomery Gem & Mineral Society', 'url': 'http://montgomerygemandmineralsociety.com/mgms/'}
{'city': 'Anchorage', 'club_name': 'Chugach Gem & Mineral Society', 'url': 'http://www.chugachgemandmineralsociety.com/'}
{'city': 'Palmer', 'club_name': 'Mat-Su Rock and Mineral Club', 'url': 'http://matsurockclub.com/'}
{'city': 'Apache Junction', 'club_name': 'Apache Junction Rock and Gem Club', 'url': 'http://www.ajrockclub.com/'}
{'city': 'Black Canyon City', 'club_name': 'Black Canyon City Rock Club', 'url': 'http://www.bccroc
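
For the bonus state column, one possible approach is to climb from each row to its enclosing table with `.parent` and then look backwards for the nearest heading. The sketch below runs on hypothetical markup: it assumes each state's clubs sit in their own `<table>` directly after an `<h3>` naming the state, which may not match the real page, so the tag passed to `find_previous` would need adjusting after inspecting the actual HTML. The names `sample_html`, `sample_doc`, and `clubs_with_state` are illustrative.

```python
from bs4 import BeautifulSoup

# Hypothetical layout (an assumption, not copied from the site):
# one <h3> per state, followed by a <table> of that state's clubs.
sample_html = """
<h3>Alabama</h3>
<table>
  <tr bgcolor="#FFFFFF">
    <td><a href="http://www.lapidaryclub.com/">Alabama Mineral &amp; Lapidary Society</a></td>
    <td>Birmingham</td>
  </tr>
</table>
<h3>Arizona</h3>
<table>
  <tr bgcolor="#FFFFFF">
    <td><a href="http://www.msaaz.org/">Mineralogical Society of Arizona</a></td>
    <td>Scottsdale</td>
  </tr>
</table>
"""
sample_doc = BeautifulSoup(sample_html, "html.parser")

clubs_with_state = []
for row in sample_doc.find_all("tr", bgcolor="#FFFFFF"):
    cells = row.find_all("td")
    link = row.find("a")
    table = row.parent                    # one level 'up': the enclosing <table>
    heading = table.find_previous("h3")   # nearest <h3> before that table
    clubs_with_state.append({
        "city": cells[1].get_text(strip=True),
        "club_name": link.get_text(strip=True),
        "url": link["href"],
        "state": heading.get_text(strip=True),
    })

for club in clubs_with_state:
    print(club)
```

With `html.parser` no `<tbody>` is inserted, so `row.parent` is the table itself; a different parser, or markup that already contains `<tbody>`, would need one more `.parent` step.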

In [127]:
import pandas as pd

In [129]:
df = pd.DataFrame(rock_clubs)
df.head()

| | city | club_name | url |
|---|---|---|---|
| 0 | Birmingham | Alabama Mineral & Lapidary Society | http://www.lapidaryclub.com/ |
| 1 | Dothan | Dothan Gem & Mineral Club | http://www.wiregrassrockhounds.com/ |
| 2 | Huntsville | Huntsville Gem and Mineral Society | http://huntsvillegms.org/ |
| 3 | Mobile | Mobile Rock & Gem Society | http://www.mobilerockandgem.com/ |
| 4 | Montgomery | Montgomery Gem & Mineral Society | http://montgomerygemandmineralsociety.com/mgms/ |


In [131]:
df.to_csv("rock-clubs.csv", index=False, header=True)
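
The `pd.read_html` hint in the assignment points at another route: `pd.read_html` returns a list of DataFrames, one per table it finds, and `pd.concat` stacks them into one. The sketch below shows only the stacking step. To stay runnable offline it uses two hand-built frames in place of the `read_html` result, and the per-state grouping in `tables_by_state` is typed in by hand for illustration, not scraped.

```python
import pandas as pd

# Stand-ins for per-state tables; in practice these would come from
# pd.read_html and would still need cleaning (assumption: one table per state).
tables_by_state = {
    "Alabama": pd.DataFrame({
        "club_name": ["Alabama Mineral & Lapidary Society"],
        "city": ["Birmingham"],
    }),
    "Arizona": pd.DataFrame({
        "club_name": ["Mineralogical Society of Arizona"],
        "city": ["Scottsdale"],
    }),
}

# Passing a dict makes the keys the outer index level, named "state" here
combined = pd.concat(tables_by_state, names=["state", "row"])

# Move the state label from the index into an ordinary column
combined = combined.reset_index(level="state").reset_index(drop=True)
print(combined)
```

If the tables arrive as a plain list, `pd.concat(list_of_frames, ignore_index=True)` stacks them the same way, just without the state label.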