# Mapping largest cities
## Experiments with folium, zipfile and pandas

In [1]:
import zipfile, requests, codecs
from io import BytesIO
import pandas as pd
import folium

%matplotlib inline

## Get online data from the Census gazetteer

https://www.census.gov/geo/maps-data/data/gazetteer2010.html

In [2]:
url = 'http://www2.census.gov/geo/docs/maps-data/data/gazetteer/Gaz_places_national.zip'
r = requests.get(url)
z = zipfile.ZipFile(BytesIO(r.content))
z.extractall()

The zip archive contains just one file.

In [3]:
filename = z.namelist()[0]
filename

'Gaz_places_national.txt'
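
As a side note, the download-and-extract pattern above does not have to write anything to disk. The sketch below is a minimal illustration of reading a zip member straight from bytes; it builds a tiny archive in memory to stand in for `r.content`, and the member name `sample_places.txt` and its contents are made up for the example.

```python
import zipfile
from io import BytesIO

# Build a small zip archive in memory to stand in for the downloaded bytes
# (hypothetical member name and contents, for illustration only).
buffer = BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("sample_places.txt", "USPS\tNAME\nTX\tAlpha city\n")
payload = buffer.getvalue()  # plays the role of r.content

# Open the archive from bytes and read its only member without extracting it
with zipfile.ZipFile(BytesIO(payload)) as archive:
    member = archive.namelist()[0]
    with archive.open(member) as handle:
        text = handle.read().decode("utf-8")

lines = text.splitlines()
print(member, lines)
```

Reading the member directly avoids leaving an extracted file behind, at the cost of holding the whole file in memory.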

Reading this file directly into a pandas DataFrame raised an encoding error. The following opens the file as UTF-8 and silently skips any bytes that cannot be decoded.

In [4]:
# Decode as UTF-8, dropping undecodable bytes; the with-block closes the file afterwards
with codecs.open(filename, 'r', encoding='utf-8', errors='ignore') as f:
    df = pd.read_table(f, sep='\t')
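
To make the effect of `errors='ignore'` concrete, here is a small sketch on synthetic bytes. It assumes, purely for illustration, a place name that was written in Latin-1; whether the Census file actually uses that encoding is not something this notebook establishes.

```python
# Synthetic bytes: "Cañon City" encoded as Latin-1, so the ñ is the single
# byte 0xF1, which is not valid UTF-8 at that position.
raw = "Cañon City".encode("latin-1")

# errors='ignore' silently drops the undecodable byte
ignored = raw.decode("utf-8", errors="ignore")

# errors='replace' keeps a visible U+FFFD placeholder instead
replaced = raw.decode("utf-8", errors="replace")

# Decoding with the encoding the bytes were written in keeps the character
recovered = raw.decode("latin-1")

print(ignored, replaced, recovered)
```

The trade-off is that `ignore` never fails but can quietly change names, so it is worth checking a few non-ASCII place names after loading.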

Some column names may have leading or trailing whitespace, so strip it off.

In [5]:
df = df.rename(columns=lambda x: x.strip())

In [6]:
df.columns

Index(['USPS', 'GEOID', 'ANSICODE', 'NAME', 'LSAD', 'FUNCSTAT', 'POP10',
       'HU10', 'ALAND', 'AWATER', 'ALAND_SQMI', 'AWATER_SQMI', 'INTPTLAT',
       'INTPTLONG'],
      dtype='object')
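
The same clean-up can be sketched on a toy frame, here using the vectorized `str.strip()` on the column index as an alternative to `rename` with a lambda. The frame and its values are made up for illustration.

```python
import pandas as pd

# Toy frame whose column names carry stray spaces (illustrative values only)
toy = pd.DataFrame({"NAME ": ["Alpha city"], " POP10": [300000], "INTPTLONG   ": [-97.0]})

# Strip leading and trailing whitespace from every column name
toy.columns = toy.columns.str.strip()

cols = list(toy.columns)
print(cols)
```

Without this step, an attribute access such as `toy.POP10` would fail because the real column name would still include the padding.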

In [7]:
print('There are',len(df),'cities and towns in this database.')

There are 29514 cities and towns in this database.


Let's keep just the cities with a population above 250,000, then display a map of the USA with a Folium marker for each of them.

In [8]:
big250 = (
    df.loc[df['POP10'] > 250000, ['NAME', 'POP10', 'INTPTLAT', 'INTPTLONG']]
    .sort_values('POP10', ascending=False)
)
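
The filter-then-sort step can be checked on a small frame before running it on the full table. This is a minimal sketch with made-up cities and populations; only the column names come from the gazetteer file.

```python
import pandas as pd

# Toy data with the same column names as the gazetteer (values are invented)
toy = pd.DataFrame({
    "NAME": ["Alpha city", "Beta town", "Gamma city", "Delta city"],
    "POP10": [300000, 12000, 900000, 250000],
    "INTPTLAT": [30.0, 41.5, 35.2, 44.1],
    "INTPTLONG": [-97.0, -88.3, -101.4, -93.2],
})

# Select rows and columns in one .loc call, then sort largest first
big = (
    toy.loc[toy["POP10"] > 250000, ["NAME", "POP10", "INTPTLAT", "INTPTLONG"]]
    .sort_values("POP10", ascending=False)
)

names = big["NAME"].tolist()
print(names)
```

Note that the comparison is strict, so a place with exactly 250,000 residents is left out.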

In [9]:
# 'm' avoids shadowing the built-in map(); recent folium releases may no longer bundle the Stamen tiles
m = folium.Map(location=[30, -97], zoom_start=4, tiles='Stamen Terrain')
for _, city in big250.iterrows():
    folium.Marker(
        location=[city['INTPTLAT'], city['INTPTLONG']],
        popup=city['NAME'] + ' pop:' + str(city['POP10'])).add_to(m)

m
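
Before handing rows to Folium, it can help to build the marker inputs as plain Python values and inspect them. The sketch below does this with `itertuples` on a toy frame with invented values; the thousands separator in the popup text is a cosmetic choice not present in the cell above.

```python
import pandas as pd

# Toy frame with the same columns used for the markers (values are invented)
toy = pd.DataFrame({
    "NAME": ["Gamma city", "Alpha city"],
    "POP10": [900000, 300000],
    "INTPTLAT": [35.2, 30.0],
    "INTPTLONG": [-101.4, -97.0],
})

# One (location, popup text) pair per city; itertuples gives attribute access by column name
markers = [
    ([row.INTPTLAT, row.INTPTLONG], f"{row.NAME} pop: {row.POP10:,}")
    for row in toy.itertuples(index=False)
]

for location, label in markers:
    print(location, label)
```

Each pair can then be passed to `folium.Marker(location=..., popup=...)` in the same way as in the loop above.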