# Central London Data Science Project Nights
### Citigrapher 1: Google Maps API
The focus of this meetup is finding out different ways in which training data can be structured in order to improve a model's accuracy.

## Overview

1. User input 
2. API query
3. Storage & Extension

These three parts are demonstrated in the following tutorial, which gives an idea
of how a few packages can be used to tackle the problem.

## 1. User input

The first thing a user will want to do when using citigrapher is to input the locations
that represent their start and final destinations, for example their work and home
addresses. We want to be able to do this for several people.

In [1]:
# User input function
def request_user_address():
    # Ask user for start address
    startAddress = input('Enter a start location: ')
    # Ask user for end address
    finalAddress = input('Enter a final location: ')
    return({'Start': startAddress, 'Final': finalAddress})

We can now use this within a larger function to get everyone's addresses:

In [2]:
# User inputs function
def request_all_user_addresses():
    # Ask user for number of people
    numPeople = int(input("How many people are we connecting: "))

    # Preinitialise results
    res = {}

    # Query this many times
    for i in range(numPeople):
        # give the user an idea of who the info is for
        print("For person {}".format(i))
        res[i] = {}
        res[i]['Location'] = request_user_address()

    # And let's return this for further use
    return(res)

In [4]:
userAddresses = request_all_user_addresses()

How many people are we connecting: 3
For person 0
Enter a start location: w120rq
Enter a final location: sw165yr
For person 1
Enter a start location: nw17db
Enter a final location: e10AA
For person 2
Enter a start location: sw111qn
Enter a final location: se171je
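
The postcodes typed in above are lower case and unspaced (`w120rq`), whereas the file used later in this tutorial stores them as `W12 0RQ`. A minimal sketch of a normaliser, assuming every input is a UK postcode whose last three characters form the inward code; `normalise_postcode` is a hypothetical helper, not part of citigrapher:

```python
def normalise_postcode(raw):
    """Upper-case a UK postcode and put one space before its last 3 characters."""
    # Drop all whitespace, then upper-case
    compact = "".join(raw.split()).upper()
    # Assumption: an unspaced UK postcode is 5 to 7 characters long
    if not 5 <= len(compact) <= 7:
        raise ValueError("not a plausible UK postcode: {!r}".format(raw))
    # Assumption: the inward code is always the final three characters
    return compact[:-3] + " " + compact[-3:]
```

Calling it on each answer inside `request_user_address` would keep interactive and file-based input in the same format.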


Typing every address in by hand will clearly be tedious when there are lots of people, so in a future meetup
we will explore how best to store addresses locally, so that a user only ever has to enter the
information for a particular person once. For example, if you have already entered your friend Bob's work and
home addresses, you should not have to do that every time; the program can simply ask whether you want to use
Bob's default addresses.
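
As a taste of the local-storage idea described above, here is a minimal sketch that keeps addresses in a JSON file keyed by person name. The file layout and the names `load_saved_addresses` and `remember_address` are assumptions made for illustration, not the approach the project settled on:

```python
import json
import os

def load_saved_addresses(path):
    """Return the saved {name: {'Start': ..., 'Final': ...}} dict, or {} if no file exists."""
    if not os.path.exists(path):
        return {}
    with open(path) as f:
        return json.load(f)

def remember_address(path, name, start, final):
    """Add or overwrite one person's addresses and write the whole dict back to disk."""
    saved = load_saved_addresses(path)
    saved[name] = {'Start': start, 'Final': final}
    with open(path, 'w') as f:
        json.dump(saved, f, indent=2)
    return saved
```

With this in place, the input function could first check whether a name is already in `load_saved_addresses(path)` and only prompt for addresses when it is not.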

For now we will pretend that we saved this output to a file, and import it here so we can use it
in the next step:

In [5]:
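import csv
# Open the tab-separated file in a with-block so the handle is closed afterwards
with open('inst/extdata/user_addresses.txt', newline='') as f:
    userAddresses = list(csv.reader(f, delimiter='\t'))
userAddresses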

[['User', 'Start', 'End'],
 ['User1', 'W12 0RQ', 'SW16 5YR'],
 ['User2', 'NW1 7DB', 'E1 0AA'],
 ['User3', 'SW11 1QN', 'SE17 1JE']]

In [8]:
## turn it into a dict output like above, skipping the header row
## (note: the key is 'End', as in the file, where the interactive version used 'Final')
res = {}

for i, row in enumerate(userAddresses[1:]):
    res[i] = {}
    res[i]['Location'] = {}
    res[i]['Location']['Start'] = row[1]
    res[i]['Location']['End'] = row[2]

res

{0: {'Location': {'End': 'SW16 5YR', 'Start': 'W12 0RQ'}},
 1: {'Location': {'End': 'E1 0AA', 'Start': 'NW1 7DB'}},
 2: {'Location': {'End': 'SE17 1JE', 'Start': 'SW11 1QN'}}}
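
The same nested dict can also be built directly from the header row. A sketch using `csv.DictReader`, assuming the tab-separated layout shown above with `User`, `Start` and `End` columns; `addresses_from_tsv` is a hypothetical name, and the in-memory sample stands in for the real file:

```python
import csv
import io

def addresses_from_tsv(handle):
    """Build {index: {'Location': {'Start': ..., 'End': ...}}} from a tab-separated file handle."""
    # DictReader uses the first row as column names, so no manual header skipping is needed
    reader = csv.DictReader(handle, delimiter='\t')
    return {
        i: {'Location': {'Start': row['Start'], 'End': row['End']}}
        for i, row in enumerate(reader)
    }

# In-memory stand-in for inst/extdata/user_addresses.txt
sample = io.StringIO("User\tStart\tEnd\nUser1\tW12 0RQ\tSW16 5YR\nUser2\tNW1 7DB\tE1 0AA\n")
res_from_file = addresses_from_tsv(sample)
```

Looking columns up by name rather than by position means the code keeps working if the column order in the file changes.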

## 2. API query

With the addresses in hand, we want to work out which tube stops are best for each person's start and
end locations. To do this we will use the igraph tube map object that is saved within the citigrapher repo,
along with the ggmap package, which is superb. First we need to convert our addresses to latitude and longitude,
and for that we first need to install the geopy package.

In [20]:
!pip install geopy
from geopy.geocoders import Nominatim



To demonstrate how geopy works, we can quickly get the lat/long for one of our user-given postcodes as follows:

In [24]:
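# Recent geopy versions require a user_agent; 'citigrapher' is a placeholder name
geolocator = Nominatim(user_agent='citigrapher')
location = geolocator.geocode(res[1]['Location']['Start'])
location.latitude, location.longitude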

(50.7848302, 8.094857)
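
The coordinates returned above (latitude 50.78, longitude 8.09) are not in London, which sits near latitude 51.5 and longitude -0.1, so the bare postcode appears to have matched a place elsewhere. A hedged sketch of two guards: passing `country_codes='gb'`, which is a parameter of geopy's `Nominatim.geocode`, and checking the result against a rough bounding box. The box limits and the helper names are assumptions made for illustration:

```python
# Rough bounding box for Greater London (approximate, illustrative values)
LONDON_BOX = {'south': 51.28, 'north': 51.70, 'west': -0.52, 'east': 0.34}

def in_london_box(lat, lon, box=LONDON_BOX):
    """Return True if the point falls inside the rough London bounding box."""
    return box['south'] <= lat <= box['north'] and box['west'] <= lon <= box['east']

def geocode_uk_postcode(geocode, postcode):
    """Geocode a postcode with the search restricted to Great Britain.

    `geocode` is any callable with the same signature as Nominatim.geocode,
    e.g. Nominatim(user_agent='citigrapher').geocode.
    Returns (lat, lon), or None if nothing matched or the hit is outside the box.
    """
    location = geocode(postcode, country_codes='gb')
    if location is None:
        return None
    point = (location.latitude, location.longitude)
    return point if in_london_box(*point) else None
```

Passing the geocoder in as an argument keeps the helper easy to try out offline with a stand-in function before pointing it at the real service.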

Then let's build a function to query Nominatim (the OpenStreetMap geocoder that geopy is calling here, rather than Google) for the lat/long of our postcodes:

In [25]:
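def get_geo_details(postcode):
    # use the geopy package to query postcodes
    # (recent geopy versions require a user_agent; 'citigrapher' is a placeholder name)
    geolocator = Nominatim(user_agent='citigrapher')
    location = geolocator.geocode(postcode)
    # geocode returns None when nothing matches
    if location is None:
        return None
    # now extract the bits that we need and return as dict
    return {'lat': location.latitude, 'long': location.longitude}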
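
To geocode every address in `res`, a function like the one above can be applied to each postcode in turn. The sketch below is one possible way to do that, assuming the nested dict layout used earlier; `add_coordinates` and the output layout are hypothetical. It looks each distinct postcode up only once, which matters because the public Nominatim service asks clients to stay at or below one request per second (geopy also provides a `RateLimiter` helper in `geopy.extra.rate_limiter` for spacing requests out):

```python
def add_coordinates(res, lookup):
    """Return a new dict where every postcode in `res` is paired with its coordinates.

    `lookup` is a callable such as get_geo_details that maps a postcode
    to {'lat': ..., 'long': ...} (or None when nothing matches).
    """
    cache = {}  # postcode -> lookup result, so repeated postcodes cost one request
    out = {}
    for person, info in res.items():
        out[person] = {'Location': {}}
        for label, postcode in info['Location'].items():
            if postcode not in cache:
                cache[postcode] = lookup(postcode)
            out[person]['Location'][label] = {
                'postcode': postcode,
                'coords': cache[postcode],
            }
    return out
```

Usage would be along the lines of `add_coordinates(res, get_geo_details)`, leaving the original `res` untouched.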


We can now use this within a larger function to work out the three closest tube stations to
our requested addresses.

In [13]:
!pip install jgraph
import jgraph
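
Independently of the graph object, the "closest 3 tubes" step mentioned above can be sketched with plain great-circle distances. This is only an illustration under stated assumptions: straight-line (haversine) distance stands in for walking distance, and the station names and coordinates below are made up rather than taken from the citigrapher tube map:

```python
import heapq
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/long points."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def closest_stations(point, stations, n=3):
    """Return the n (name, distance_km) pairs nearest to point = {'lat': ..., 'long': ...}."""
    distances = (
        (name, haversine_km(point['lat'], point['long'], lat, lon))
        for name, (lat, lon) in stations.items()
    )
    # nsmallest returns the pairs sorted by increasing distance
    return heapq.nsmallest(n, distances, key=lambda pair: pair[1])

# Hypothetical stations with made-up coordinates, for illustration only
stations = {
    'Station A': (51.50, -0.10),
    'Station B': (51.51, -0.12),
    'Station C': (51.55, -0.20),
    'Station D': (51.40, 0.00),
}
nearest = closest_stations({'lat': 51.50, 'long': -0.11}, stations)
```

The `point` argument has the same shape as the dict returned by `get_geo_details`, so the two pieces can be chained once real station coordinates are available.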

