### Blobs. A project of Urban CCD in Chicago.
# Tutorial 1.0 - Using Static Data

## Data Prep and Set-Up

You should have already cloned the *blobs* repo from GitHub and saved it locally. The first steps import the required libraries, then load and initialize the required data and spatial files. The data are joined to the spatial files; in this case, 311 call data are joined to census tracts.

In [9]:
ls

Chicago Census.csv* blocks/             maxp.py             tracts/
README.md           configure.py        old/                untitled
blobs.py            iPython Notebooks/  smoothing.py        untitled1
blobs.pyc           master311.csv       test_script.py


In [12]:
# Import required libraries
import blobs
import numpy as np
import pandas as pd
import pysal as ps

ImportError: dlopen(/Users/Makosak/anaconda/lib/python2.7/site-packages/matplotlib/_png.so, 2): Library not loaded: @loader_path/../../../libpng15.15.dylib
  Referenced from: /Users/Makosak/anaconda/lib/python2.7/site-packages/matplotlib/_png.so
  Reason: image not found

In [3]:
# Open census tract shapefile and database.
shp_link = 'tracts/CensusTractsTIGER2010.shp'
dbf = ps.open('tracts/CensusTractsTIGER2010.dbf')

In [4]:
# Shapefile and data preparation.
cols = np.array([dbf.by_col(col) for col in dbf.header]).T
df = pd.DataFrame(cols)
df.columns = dbf.header
df.columns = df.columns.map(lambda x: x.lower())
df.commarea = df.commarea.astype('int')
df['order'] = df.index
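
Stacking every column into one transposed `np.array`, as above, forces all values into a single type when the columns mix text and numbers, which is presumably why `commarea` is cast back to `int` afterwards. A minimal sketch of the same preparation on toy data; the `header` list and `columns` dict are hypothetical stand-ins for `dbf.header` and `dbf.by_col`, not part of pysal:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for dbf.header and dbf.by_col(col).
header = ['TRACTCE10', 'COMMAREA']
columns = {
    'TRACTCE10': ['010100', '010201', '010202'],
    'COMMAREA': [1, 1, 2],
}

# One row per tract, one column per DBF field.
cols = np.array([columns[name] for name in header]).T
toy = pd.DataFrame(cols, columns=[name.lower() for name in header])

# Restore the numeric type and remember the original shapefile order.
toy['commarea'] = toy['commarea'].astype('int')
toy['order'] = toy.index
```

The `order` column records the row position in the shapefile so the data can be put back in that order after a merge.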

In [5]:
# Load the spatial weights for the census tracts from the GAL file.
w = ps.open('tracts/CensusTractsTIGER2010.gal').read()

In [6]:
# Initialize sample 311 call data from CSV file.
init_calls = pd.read_csv('master311.csv', dtype=object)
for c in init_calls.columns[1:]:
    init_calls[c] = init_calls[c].astype('float')
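
Reading everything as `object` and then casting all but the first column to `float` keeps the tract ID as text. One likely reason, stated here as an assumption, is that tract IDs can start with a zero, which numeric parsing would drop. A small sketch with made-up CSV content:

```python
import io
import pandas as pd

# Made-up CSV content; the real file is master311.csv.
csv_text = "tractID,potholes_per1000\n010100,2.5\n010201,4\n"

# Default parsing turns the ID column into integers and loses the leading zero.
parsed_default = pd.read_csv(io.StringIO(csv_text))

# dtype=object keeps every value as text; only the data columns are cast to float.
toy_calls = pd.read_csv(io.StringIO(csv_text), dtype=object)
for c in toy_calls.columns[1:]:
    toy_calls[c] = toy_calls[c].astype('float')
```

With the IDs kept as text, they can be matched against the `tractce10` strings from the shapefile.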

In [7]:
# Format data and merge on shapefile IDs
ordered_tracts = pd.DataFrame(df.loc[:,['tractce10', 'commarea', 'order']])
calls = pd.merge(init_calls, ordered_tracts, how='right', left_on='tractID',
    right_on='tractce10', sort=False).fillna(0).sort_values('order')
calls = calls.drop(['order', 'commarea'], axis=1)

# Bundle the data, weights, shapefile path, ID column, and level into one object.
class bd:
    data = calls
    w = w
    shp_link = shp_link
    id = 'tractce10'
    level = 'tract'
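
The right merge earlier in this cell keeps one row per tract, including tracts that have no calls, and the sort on `order` puts the rows back in shapefile order. A small sketch with made-up tract IDs and values; `toy_tracts` and `toy_calls` are illustrative names, and `sort_values` and `drop(columns=...)` assume a recent pandas:

```python
import pandas as pd

# Three tracts in shapefile order; only two of them have call data.
toy_tracts = pd.DataFrame({'tractce10': ['010100', '010201', '010202'],
                           'order': [0, 1, 2]})
toy_calls = pd.DataFrame({'tractID': ['010202', '010100'],
                          'potholes_per1000': [4.0, 2.5]})

# how='right' keeps every tract, even those missing from the call data.
merged = pd.merge(toy_calls, toy_tracts, how='right',
                  left_on='tractID', right_on='tractce10', sort=False)
merged = merged.sort_values('order').drop(columns=['tractID', 'order'])

# Tracts without calls get a rate of 0 instead of NaN.
merged['potholes_per1000'] = merged['potholes_per1000'].fillna(0)
```

Keeping the rows in shapefile order matters because the spatial weights refer to tracts by their position.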

In [8]:
# Assign the attribute data.
d = bd()

## Create Blobs

You can choose which attribute to "blob" on. In this example, blobs are built from census tracts, and each blob must contain at least 10,000 people.
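
To make the minimum concrete: a blob's population is the sum over its member tracts, and every sum must reach the floor. The check below is a sketch on made-up numbers. `blobs_below_floor` is a hypothetical helper, not part of the *blobs* library, and it says nothing about how the library actually builds blobs.

```python
import pandas as pd

def blobs_below_floor(pop, assignments, floor):
    """Return the total population of every blob that falls short of the floor.

    pop: population per tract; assignments: blob label per tract (same order).
    """
    totals = pd.Series(pop).groupby(pd.Series(assignments)).sum()
    return totals[totals < floor]

# Four made-up tracts grouped into two blobs.
toy_pop = [4000, 7000, 3000, 2000]
toy_assignments = [0, 0, 1, 1]
short = blobs_below_floor(toy_pop, toy_assignments, floor=10000)
```

Here blob 0 totals 11,000 and passes, while blob 1 totals 5,000 and is returned as too small.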

In [9]:
# Create blobs with a minimum population of 10,000 in each blob
b = blobs.Blobs(d, 'pop', 10000, iterations=1)

['vehicles_per1000', 'alley_lights_per1000', 'garbage_per1000', 'graffiti_per1000', 'potholes_per1000', 'rodents_per1000', 'sanitation_per1000', 'street_lights_one_per1000', 'street_lights_all_per1000', 'tree_debris_per1000', 'tree_trims_per1000', 'buildings_per1000', 'all_calls_per1000']

### CREATING BLOBS FROM 13 VARIABLES ###
    PARAMETERS:
     # Minimum pop in each blob: 10000
     # Iterations: 1
     # Method: equal votes
     # Plot blobs: True
     # Save blobs data: False


# ITERATION 1                 
  Score: 3175.65
  Created 189 blobs (4 tracts per blob)
  Best solution so far: 3175.65
  Time taken: 5.3 seconds (0 seconds remaining)

# BEST SOLUTION:                      
  Score: 3175.65
  189 blobs (4 tracts per blob)
             



In [None]:
# Cluster the blobs along similarities in the data
cl = blobs.Cluster_blobs(b, blobs_per_cluster=10)

In [None]:
# Create blobs with a minimum of 30 tracts in each blob, and cluster
b = blobs.Blobs(d, 'areas', 30)
cl = blobs.Cluster_blobs(b, blobs_per_cluster=10)

In [None]:
# Have around 3 blobs per cluster
cl.set_n_clusters(3)

In [None]:
# See blob assignments
b.assignments

In [None]:
# Plot blobs along one of the variables
b.plot_blobs('all_calls_per1000_mean')