# Strava-viz Alpha Release
### Alex Howard & Taylor Pellerin

For the alpha release of our visualization project, we wanted to see what we could do with some of the geographic examples we had discussed, since these interest us the most and we feel they have a lot of potential.

## Data and prep
As a proof of concept, we use a handful of trips that Alex logged this past week as data. Moving forward, we would like to use all of his logged data in one place.

We also simplify things and fudge the timestamps: every trip starts at the same time, and consecutive points are spaced 120 seconds apart. In the next iteration, we will deal with the inconvenience of how Strava records the data (only a start time, rather than a timestamp for each lat-lng pair).
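
One possible way to replace the fudged timestamps later is to spread them evenly between an activity's start time and its end. The sketch below is only an illustration: `spread_timestamps` is a hypothetical helper, and it assumes that a start time and a total elapsed time are available for each activity and that points were recorded at a roughly constant rate, neither of which we have verified against the Strava export.

```python
from datetime import datetime, timedelta


def spread_timestamps(start_time, elapsed_seconds, n_points):
    """Assign one timestamp per point, evenly spaced over the activity.

    Assumption: points were recorded at a roughly constant rate, which a
    real GPS track only approximates.
    """
    if n_points == 0:
        return []
    if n_points == 1:
        return [start_time]
    # float() keeps the division exact under Python 2 as well.
    step = float(elapsed_seconds) / (n_points - 1)
    return [start_time + timedelta(seconds=i * step) for i in range(n_points)]


# Made-up activity: starts at 08:00, lasts 10 minutes, has 6 points.
start = datetime(2018, 5, 1, 8, 0, 0)
stamps = spread_timestamps(start, elapsed_seconds=600, n_points=6)
```

The first and last points land exactly on the start and end of the activity, and everything in between is linear interpolation.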

This data is in JSON blobs, so the first thing I'll do is turn it into a usable CSV.

In [1]:
! ls

Activities Processing Play.ipynb clean.pyc
Alpha-Viz.ipynb                  constants.py
Plotly play.ipynb                constants.pyc
alex.csv                         credentials.py
alex_activities_latlng.json      credentials.pyc
alex_all_acts.json               process_json.py
alpha-vis.py                     scrape.py
clean.py


In [2]:
import json
from pprint import pprint

# Use a context manager so the file handle is closed after loading
with open('alex_activities_latlng.json') as f:
    data = json.load(f)

### Subset to 5 of Alex's trips

In [3]:
sample = data[:5]

In [4]:
alex_csv = "runner_id,lat,lon,timestamp\n"

for i in range(len(sample)):
    time_stamp = 0
    lat_lng = sample[i][0]["data"]
    for lat, lon in lat_lng:
        time_stamp += 120
        alex_csv += ",".join([str(i),
                              str(lat),
                              str(lon),
                              str(time_stamp)
                             ]) + "\n"

In [5]:
# The context manager closes the file even if the write fails
with open("alex.csv", "w") as f:
    f.write(alex_csv)
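
As an alternative to building the CSV by string concatenation, the same rows could be produced with the standard-library `csv` module. This is a sketch rather than what the notebook ran: `tracks_to_csv` is a hypothetical helper, it is written for Python 3, and it assumes each entry has the same nested shape as `sample` (a list whose first element holds the lat-lng pairs under `"data"`).

```python
import csv
import io


def tracks_to_csv(tracks, step_seconds=120):
    """Return CSV text with one row per point and fudged timestamps."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["runner_id", "lat", "lon", "timestamp"])
    for trip_id, track in enumerate(tracks):
        # Timestamps start at step_seconds and grow by step_seconds per point.
        for k, (lat, lon) in enumerate(track[0]["data"], start=1):
            writer.writerow([trip_id, lat, lon, k * step_seconds])
    return buf.getvalue()


# Made-up two-point trip in the same nested shape as `sample`.
fake_sample = [[{"data": [[37.7772, -122.4493], [37.7771, -122.4496]]}]]
csv_text = tracks_to_csv(fake_sample)
```

The writer handles quoting and separators, which matters little for purely numeric columns but avoids surprises if a text column is added later.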

## Visualization  
#### Warning: this requires Python 2.7 due to import errors in 3.6

In [10]:
! pip install pyglet

Collecting pyglet
  Cache entry deserialization failed, entry ignored
  Cache entry deserialization failed, entry ignored
  Downloading https://files.pythonhosted.org/packages/1c/fc/dad5eaaab68f0c21e2f906a94ddb98175662cc5a654eee404d59554ce0fa/pyglet-1.3.2-py2.py3-none-any.whl (1.0MB)
    100% |████████████████████████████████| 1.0MB 1.4MB/s eta 0:00:01
Collecting future (from pyglet)
  Cache entry deserialization failed, entry ignored
Installing collected packages: future, pyglet
Successfully installed future-0.16.0 pyglet-1.3.2
You are using pip version 9.0.1, however version 10.0.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.


In [11]:
from geoplotlib.layers import BaseLayer
from geoplotlib.core import BatchPainter
import geoplotlib
from geoplotlib.colors import colorbrewer
from geoplotlib.utils import epoch_to_str, BoundingBox, read_csv

NotImplementedError: abstract

### The layer below shows all of Alex's runs at once

In [7]:
class AllTrailsLayer(BaseLayer):

    def __init__(self):
        self.data = read_csv('alex.csv')
        self.cmap = colorbrewer(self.data['runner_id'], alpha=220)
        self.t = self.data['timestamp'].min()
        self.painter = BatchPainter()


    def draw(self, proj, mouse_x, mouse_y, ui_manager):
        self.painter = BatchPainter()
        df = self.data.where((self.data['timestamp'] > self.t) & (self.data['timestamp'] <= self.t + 15*60))

        # Draw each trip's points in its own color
        for runner_id in set(df['runner_id']):
            grp = df.where(df['runner_id'] == runner_id)
            self.painter.set_color(self.cmap[runner_id])
            x, y = proj.lonlat_to_screen(grp['lon'], grp['lat'])
            self.painter.points(x, y, 10)

        self.t += 2*60

        if self.t > self.data['timestamp'].max():
            self.t = self.data['timestamp'].min()

        self.painter.batch_draw()
        ui_manager.info(epoch_to_str(self.t))

        
    # This box should be revisited moving forward; it might be too small.
    def bbox(self):
        return BoundingBox(north=37.801421, west=-122.517339, south=37.730097, east=-122.424474)

NameError: name 'BaseLayer' is not defined

For reasons unknown, actually running this kills the kernel, and it kills Python in PyCharm as well.
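
On the hard-coded bounding box in `bbox`: one option would be to derive the bounds from the data instead. The sketch below uses a hypothetical helper, `padded_bounds`, and a made-up 5% padding. The keys match the keyword arguments passed to `BoundingBox` above, so the result could presumably be passed as `BoundingBox(**bounds)`, though we have not been able to run that.

```python
def padded_bounds(lats, lons, pad_fraction=0.05):
    """Return north/south/east/west bounds around the points, with padding.

    Assumption: the track does not cross the antimeridian, so plain
    min/max on longitude is meaningful.
    """
    south, north = min(lats), max(lats)
    west, east = min(lons), max(lons)
    lat_pad = (north - south) * pad_fraction
    lon_pad = (east - west) * pad_fraction
    return {
        "north": north + lat_pad,
        "south": south - lat_pad,
        "east": east + lon_pad,
        "west": west - lon_pad,
    }


# Made-up extremes roughly covering San Francisco.
bounds = padded_bounds([37.73, 37.80], [-122.52, -122.42])
```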

### The layer below shows one runner, with a follow cam

In [14]:
class FollowTrailsLayer(BaseLayer):

    def __init__(self):
        self.data = read_csv('alex.csv')
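        # Sort the ids first: set ordering is arbitrary, so indexing an
        # unsorted set would not reliably pick the same trip.
        trip_ids = sorted(set(self.data['runner_id']))
        self.data = self.data.where(self.data['runner_id'] == trip_ids[2])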
        self.t = self.data['timestamp'].min()
        self.painter = BatchPainter()


    def draw(self, proj, mouse_x, mouse_y, ui_manager):
        self.painter = BatchPainter()
        self.painter.set_color([0,0,255])
        df = self.data.where((self.data['timestamp'] > self.t) & (self.data['timestamp'] <= self.t + 30*60))
        proj.fit(BoundingBox.from_points(lons=df['lon'], lats=df['lat']), max_zoom=14)
        x, y = proj.lonlat_to_screen(df['lon'], df['lat'])
        self.painter.linestrip(x, y, 10)
        self.t += 30
        if self.t > self.data['timestamp'].max():
            self.t = self.data['timestamp'].min()

        self.painter.batch_draw()
        ui_manager.info(epoch_to_str(self.t))
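
Since neither layer runs for us yet, the time-window logic that both `draw` methods share can at least be checked outside geoplotlib. The following is a plain-Python sketch of that logic only. `window_indices` and `advance` are hypothetical names, and nothing here touches the geoplotlib API.

```python
def window_indices(timestamps, t, width):
    """Indices of points with t < timestamp <= t + width.

    The lower bound is strict, as in the layers above, so a point whose
    timestamp equals t is not selected.
    """
    return [i for i, ts in enumerate(timestamps) if t < ts <= t + width]


def advance(t, step, t_min, t_max):
    """Move the clock forward and wrap to the start once past the end."""
    t += step
    return t_min if t > t_max else t


# Fudged timestamps as written to alex.csv: one point every 120 seconds.
timestamps = [120, 240, 360, 480, 600]
visible = window_indices(timestamps, t=120, width=240)
next_t = advance(600, step=30, t_min=120, t_max=600)
```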

### For reasons unknown, I am having issues getting either viz to run  
So instead, I will make a sample Plotly graphic of Alex's trip data.

In [26]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import plotly.plotly as py
import plotly.graph_objs as go

%matplotlib inline

In [27]:
alex_data = pd.read_csv('alex.csv')

In [28]:
alex_data.head()

   runner_id        lat         lon  timestamp
0          0  37.777265 -122.449372        120
1          0  37.777217 -122.449621        240
2          0  37.777140 -122.449857        360
3          0  37.777065 -122.450070        480
4          0  37.777036 -122.450320        600


### With this sample we can...
Plot the lon, lat coordinates of each of Alex's trips as x, y coordinates

In [35]:
alex_data.values[0]

array([   0.      ,   37.777265, -122.449372,  120.      ])

In [43]:
traces = []
for rid in sorted(alex_data.runner_id.unique()):
    df = alex_data.loc[alex_data.runner_id == rid].sort_values("timestamp")
    
    # Columns are runner_id, lat, lon, timestamp, so lon is row[2] and lat is row[1]
    trace = go.Scatter(x = df.lon, 
                       y = df.lat,
                       mode = 'lines', 
                       name='trip {}'.format(rid),
                       text = ['trip # {}<br>lon: {}<br>lat: {}'.format(int(row[0]),
                                                                        row[2], 
                                                                        row[1])
                               for row in df.values],
                       hoverinfo = 'text'
                       )
    
    
    traces.append(trace)
py.iplot(traces)

### For next time, we will:  
1) Get geoplotlib working, or apply background images to this plotly map  
2) Implement the summary statistic visualizations, as discussed in the proposal
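
As a starting point for item 2, here is a minimal sketch of one candidate statistic: total trip length, computed with the haversine formula. Trip length is our guess at a useful statistic and is not taken from the proposal. `haversine_km` and `trip_length_km` are hypothetical helpers, and the Earth is treated as a sphere of radius 6371 km.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two lat/lon points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def trip_length_km(points):
    """Sum the distances between consecutive (lat, lon) points."""
    return sum(haversine_km(a[0], a[1], b[0], b[1])
               for a, b in zip(points, points[1:]))


# First three points of trip 0, taken from the alex_data.head() output.
points = [(37.777265, -122.449372),
          (37.777217, -122.449621),
          (37.777140, -122.449857)]
length_km = trip_length_km(points)
```

Applied per `runner_id`, this would give one number per trip that could feed a bar chart or histogram.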