# Visualizing Spatial Data 

## Introduction

The objective of this exercise is to show you how to programmatically build <b>map visualizations</b> from <b>spatial data</b> stored in MongoDB.

<b>Requirements</b>
<ul>
    <li> <a href="https://www.continuum.io" target="_blank">Anaconda Python 2.7</a> </li>
    <li> <a href="https://plot.ly/python/getting-started" target="_blank">Plotly</a> </li>
    <li> <a href="https://api.mongodb.com/python/current" target="_blank">PyMongo</a> </li>
    <li> Databases from the <a href="http://espinosa-oviedo.com/big-data-visualization/sharding-data-collections-with-mongodb/" target="_blank">sharding exercise</a> </li>
</ul>

<b>Execution Environment</b>

This exercise is intended to be executed as a <a href="http://jupyter.org" target="_blank">Jupyter Notebook</a> (<em>already included in Anaconda Python</em>). You can launch Jupyter by executing the following command:

    jupyter notebook
    
Then, either open this notebook in your web browser or <em>copy/paste</em> the code into a <b>new notebook</b>.



## Connecting and Querying MongoDB



<li>Connecting to a MongoDB instance</li>

In [10]:
import pymongo

mongo = pymongo.MongoClient(host="localhost", port=27022)
mongo

MongoClient(host=['localhost:27022'], document_class=dict, tz_aware=False, connect=True)

<li>Count the number of documents in collection <b>mydb.cities</b> (cf. <a href="http://espinosa-oviedo.com/big-data-visualization/sharding-data-collections-with-mongodb/" target="_blank">sharding exercise</a>).</li>

In [11]:
cities = mongo['mydb']['cities3'].find()
cities.count()

1595
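The cell above counts through the cursor. `Cursor.count()` was deprecated in PyMongo 3.7 and removed in PyMongo 4.0, so on a newer driver the same count can be obtained from the collection itself. The following is a minimal sketch under that assumption; the helper name `count_cities` is hypothetical and not part of the original exercise.

```python
def count_cities(collection):
    """Return the number of documents in a PyMongo collection.

    Assumes PyMongo 3.7 or later, where Collection.count_documents exists.
    """
    # An empty filter matches every document in the collection.
    return collection.count_documents({})
```

In the notebook this would be called as `count_cities(mongo['mydb']['cities3'])`.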

<li>Count the number of <b>cities per state</b>.</li>


In [12]:
cities = mongo['mydb']['cities3'].find()
states = {}

for city in cities:
    s = city['state']
    if s in states:
        states[s] += 1
    else:
        states[s] = 1

states

{u'NY': 1595}
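The counting loop above can also be written with `collections.Counter` from the standard library. The sketch below uses a small list of stand-in documents (hypothetical values) so that it runs without a database; the helper name `count_by_state` is an assumption, not part of the original exercise.

```python
from collections import Counter


def count_by_state(cities):
    """Count documents per value of their 'state' field."""
    return Counter(city['state'] for city in cities)


# Stand-in documents (hypothetical values). In the notebook, pass
# mongo['mydb']['cities3'].find() instead of this list.
sample_cities = [
    {'city': 'ALBANY', 'state': 'NY'},
    {'city': 'BUFFALO', 'state': 'NY'},
    {'city': 'BOSTON', 'state': 'MA'},
]

sample_states = count_by_state(sample_cities)
```

Because any iterable of dictionaries is accepted, a PyMongo cursor can be passed in directly. For large collections, the grouping could instead be pushed to the server with an aggregation pipeline using `$group`.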

## Plotting Maps with Plotly 


<li>Map showing the number of cities per state.</li>

In [13]:
import plotly
import plotly.offline  # explicit import of the offline plotting submodule

plotly.offline.init_notebook_mode()

data = [dict(
        type='choropleth',
        locations= list(states.keys()),   # two-letter state codes
        z= list(states.values()),         # number of cities per state
        locationmode= 'USA-states'
)]

layout = dict(
    title = 'Number of cities in US per state',
    geo = dict(scope='usa')
)

fig = dict( data=data, layout=layout )
plotly.offline.iplot( fig )
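Since several maps are needed later in the exercise, the figure construction above could be wrapped in a function. This is a sketch only: `build_choropleth` is a hypothetical helper, and it returns the same plain dictionary structure as the cell above, so it can be built and inspected without Plotly installed.

```python
def build_choropleth(states, title):
    """Build a Plotly figure dictionary for a US choropleth.

    `states` maps two-letter state codes to a count.
    """
    codes = sorted(states)                    # fixed order so locations and z line up
    counts = [states[code] for code in codes]
    data = [dict(
        type='choropleth',
        locations=codes,
        z=counts,
        locationmode='USA-states',
    )]
    layout = dict(title=title, geo=dict(scope='usa'))
    return dict(data=data, layout=layout)


fig_states = build_choropleth({'NY': 1595}, 'Number of cities in US per state')
```

In the notebook, the result would be rendered with `plotly.offline.iplot(fig_states)`.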

<li>Dot map showing the US cities</li>

In [5]:
lons = []
lats = []
text = []

for city in mongo['mydb']['cities'].find():
    lons.append( city['loc'][0] )
    lats.append( city['loc'][1] )
    text.append( city['city'] )
    
data = [dict(
        type = 'scattergeo',
        locationmode = 'USA-states',
        lon = lons,
        lat = lats,
        text = text,
        mode = 'markers',
        marker = dict(
            size = 1
        )
)]

layout = dict(
        title = 'US cities',
        geo = dict(
            scope='usa',
            projection= dict( type='albers usa' )
        )
)

fig = dict( data=data, layout=layout )
plotly.offline.iplot( fig )
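The loop in the cell above assumes that every document has a `loc` field holding `[longitude, latitude]`. A slightly more defensive version, sketched below, skips documents without usable coordinates. The helper name `extract_points` and the sample documents are hypothetical and only serve to make the example runnable without a database.

```python
def extract_points(cities):
    """Split city documents into parallel lists of longitudes, latitudes and names."""
    lons, lats, names = [], [], []
    for city in cities:
        loc = city.get('loc')
        if not loc or len(loc) < 2:
            continue                      # skip documents without usable coordinates
        lons.append(loc[0])               # loc[0] is the longitude
        lats.append(loc[1])               # loc[1] is the latitude
        names.append(city.get('city', ''))
    return lons, lats, names


# Stand-in documents with made-up coordinates. In the notebook, pass
# mongo['mydb']['cities'].find() instead of this list.
sample_docs = [
    {'city': 'A', 'loc': [-73.5, 42.5], 'state': 'NY'},
    {'city': 'B', 'state': 'NY'},         # no 'loc' field, so it is skipped
    {'city': 'C', 'loc': [-78.5, 43.0], 'state': 'NY'},
]

sample_lons, sample_lats, sample_names = extract_points(sample_docs)
```

The three lists can then be used as the `lon`, `lat` and `text` entries of the `scattergeo` trace.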

## TO DO

<ol>
    <li>Plot multiple maps showing the contents of the <a href="http://espinosa-oviedo.com/big-data-visualization/sharding-data-collections-with-mongodb/" target="_blank">sharded cluster</a> defined in the previous exercise</li>
    <li>Verify the distribution of the data according to each sharding technique</li>
</ol>



