
## Getting All of New Jersey's Municipality Names

I scraped all of the New Jersey municipality names from this [Wikipedia article](https://en.wikipedia.org/wiki/List_of_municipalities_in_New_Jersey)
and stored them in a text file named `nj_municipals.txt`.

This notebook uses `urllib` to make the HTTPS request and fetch the HTML of the Wikipedia article. It also uses `BeautifulSoup` to parse that HTML and find the specific tags we need.

In [1]:
from bs4 import BeautifulSoup
from urllib import request, error, parse
import csv

In [2]:
wiki_nj_municipalities_link = 'https://en.wikipedia.org/wiki/List_of_municipalities_in_New_Jersey'

In [3]:
response = request.urlopen(wiki_nj_municipalities_link)
nj_municipalities_html = response.read()
soup = BeautifulSoup(nj_municipalities_html,'lxml')
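The request above assumes the page is always reachable. As a minimal sketch of handling failures with the `urllib.error` module, something like the following could be used. The names `fetch_html`, `timeout`, and `opener` are hypothetical and not part of the original notebook; `opener` exists only so the function can be tried without network access.

```python
from urllib import error, request


def fetch_html(url, timeout=10, opener=request.urlopen):
    """Return the response body as bytes, or None if the request fails.

    `fetch_html`, `timeout`, and `opener` are illustrative names (assumptions),
    not something defined elsewhere in this notebook.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return response.read()
    except error.HTTPError as exc:
        # The server answered, but with an error status such as 404.
        print(f"HTTP error {exc.code} for {url}")
    except error.URLError as exc:
        # No valid answer at all (DNS failure, refused connection, ...).
        print(f"Could not reach {url}: {exc.reason}")
    return None
```

`HTTPError` is a subclass of `URLError`, so it has to be caught first. A caller would check for `None` before handing the result to `BeautifulSoup`.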

After inspecting the Wikipedia article's HTML, we see that to grab all of the municipalities' names, we first have to grab all of the `<tr>` tags from the variable `soup`.

In [4]:
# Skip the first <tr> (assumed to be the table header) and keep the 565 rows after it.
nj_municipal_rows = soup.find_all('tr')[1:566]

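First, for each row we grab the municipality name, county, population, and type based on the table column. Then we strip the leading and trailing whitespace and make every character lowercase for string cleaning purposes. Finally, we remove the commas from the population figure.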

In [5]:
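def clean_row(row):
    cells = row.find_all('td')
    # Keep the columns that hold the name, county, population, and type.
    important = [cells[1], cells[2], cells[4], cells[5]]
    # Strip surrounding whitespace and lowercase each cell's text.
    important = [item.text.strip().lower() for item in important]
    # Remove thousands separators from the population, e.g. '1,234' -> '1234'.
    important[2] = important[2].replace(',', '')
    return important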
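To see what the cleaning steps do to a single cell, here is a small sketch that works on plain strings instead of `<td>` tags. The helpers `clean_text` and `clean_population` and the sample strings are made up for illustration. Converting to `int` also assumes the population cell holds nothing but digits and commas, which may not be true if the cell carries a footnote marker.

```python
def clean_text(text):
    """Strip surrounding whitespace and lowercase the text."""
    return text.strip().lower()


def clean_population(text):
    """Drop thousands separators and parse the value as an integer.

    Assumes the cleaned text contains only digits and commas.
    """
    return int(clean_text(text).replace(',', ''))


# Made-up sample strings, not values taken from the article.
print(clean_text('  Example Town \n'))  # example town
print(clean_population(' 1,234\n'))     # 1234
```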

In [6]:
nj_municipals_data = list(map(clean_row, nj_municipal_rows))
heading = ['Municipal','County','Population','Type']

Lastly, we write the data to a CSV file named `municipal.csv` in the `data` directory, with the appropriate headings.

In [7]:
with open('data/municipal.csv', 'w+', newline='') as file:
    writer = csv.writer(file)
    writer.writerow(heading)
    writer.writerows(nj_municipals_data)
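A quick way to check the output is to read it back with `csv.DictReader`. The sketch below does the round trip against an in-memory buffer with made-up rows, so it runs without the scraped data. `sample_rows` and `buffer` are illustrative names, and with the real file you would open `data/municipal.csv` for reading instead.

```python
import csv
import io

heading = ['Municipal', 'County', 'Population', 'Type']
# Made-up rows standing in for the scraped data.
sample_rows = [
    ['example town', 'example county', '1234', 'borough'],
    ['another place', 'example county', '56789', 'township'],
]

# Write the header and rows the same way the notebook does.
buffer = io.StringIO(newline='')
writer = csv.writer(buffer)
writer.writerow(heading)
writer.writerows(sample_rows)

# Rewind and read the rows back as dictionaries keyed by the header.
buffer.seek(0)
records = list(csv.DictReader(buffer))
populations = [int(record['Population']) for record in records]
print(len(records), populations)  # 2 [1234, 56789]
```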