# State Polygons

## Loading Data

This notebook extracts [the data](http://econym.org.uk/gmap/states.xml) from its XML format and writes it to a CSV file to be read in later. We start by importing the necessary dependencies.

In [1]:
import xml.etree.ElementTree as ET
import pandas as pd
import numpy as np

To begin, we use `xml.etree.ElementTree` to parse the file and get its root element.

In [2]:
root = ET.parse("../data/states.xml").getroot()
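To make the shape of the parsed tree concrete, here is a minimal sketch that parses a small inline sample instead of the real file. The element names `states`, `state`, and `point` are assumptions made for illustration. The only things the code in this notebook relies on are that each child of the root has a `name` attribute and that each grandchild has `lat` and `lng` attributes.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-vertex excerpt; the tag names are assumed, not taken from the real file.
sample_xml = """
<states>
  <state name="Alaska">
    <point lat="70.0187" lng="-141.0205"/>
    <point lat="70.1292" lng="-141.7291"/>
  </state>
</states>
""".strip()

sample_root = ET.fromstring(sample_xml)

# Indexing an element returns its children in document order.
first_state = sample_root[0]
first_vertex = first_state[0]

state_name = first_state.attrib["name"]
vertex_lat = first_vertex.attrib["lat"]  # attribute values are always strings
vertex_count = len(first_state)
```

Iterating over an element (`for state in sample_root`) visits the same children that indexing does.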

To put the data into a format that can be passed to `pd.DataFrame`, we iterate through each state in `root` and extract its name, vertex ID (0, 1, 2, ...), latitude, and longitude.

In [3]:
states = []

for state in root:
    # Number each state's vertices from 0, in document order.
    for i, vertex in enumerate(state):
        states.append([state.attrib["name"], i, vertex.attrib["lat"], vertex.attrib["lng"]])

Finally, we pass the list of lists to `pd.DataFrame` and specify the column names.

In [5]:
state_polygons = pd.DataFrame(states, columns=["state", "vertex", "lat", "lon"])
state_polygons.head()

|   | state | vertex | lat | lon |
|---|---|---|---|---|
| 0 | Alaska | 0 | 70.0187 | -141.0205 |
| 1 | Alaska | 1 | 70.1292 | -141.7291 |
| 2 | Alaska | 2 | 70.4515 | -144.8163 |
| 3 | Alaska | 3 | 70.7471 | -148.4583 |
| 4 | Alaska | 4 | 70.7923 | -151.1609 |
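Because XML attribute values are strings, the `lat` and `lon` columns built this way hold text rather than floats, even though they display like numbers. If numeric columns are needed inside the notebook, one option is `pd.to_numeric`. The sketch below uses a small hypothetical frame (`demo`) in place of `state_polygons`.

```python
import pandas as pd

# Hypothetical stand-in for state_polygons, with coordinates stored as strings.
demo = pd.DataFrame(
    [["Alaska", 0, "70.0187", "-141.0205"], ["Alaska", 1, "70.1292", "-141.7291"]],
    columns=["state", "vertex", "lat", "lon"],
)

# Convert each coordinate column from text to floating point.
for column in ["lat", "lon"]:
    demo[column] = pd.to_numeric(demo[column])

lat_dtype = str(demo["lat"].dtype)
lon_dtype = str(demo["lon"].dtype)
```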


For later use, we save the dataframe as a CSV file.

In [6]:
state_polygons.to_csv("../data/state-polygons.csv", index=False)
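As a rough check of what a later notebook would see, the sketch below writes a small hypothetical frame (`demo`) to CSV and reads it back with `pd.read_csv`. It uses an in-memory buffer rather than the real `../data/state-polygons.csv` path so that it runs without touching disk.

```python
import io

import pandas as pd

# Hypothetical stand-in for state_polygons.
demo = pd.DataFrame(
    [["Alaska", 0, "70.0187", "-141.0205"], ["Alaska", 1, "70.1292", "-141.7291"]],
    columns=["state", "vertex", "lat", "lon"],
)

# Write to an in-memory buffer instead of a file, then rewind and read it back.
buffer = io.StringIO()
demo.to_csv(buffer, index=False)
buffer.seek(0)
reloaded = pd.read_csv(buffer)

reloaded_columns = list(reloaded.columns)
reloaded_lat_dtype = str(reloaded["lat"].dtype)
```

With `index=False`, the row index is not written, so the reloaded frame has only the four named columns. Without it, `pd.read_csv` would surface the saved index as an extra `Unnamed: 0` column unless `index_col=0` is passed. `pd.read_csv` also infers numeric types, so the coordinates come back as floats.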