# To catch a spy...

**Problem definition:**<br>
This notebook solves the Spy Catcher problem from [puzzlOR](http://puzzlor.com/2014-04_SpyCatcher.html):

"Your government has lost track of a high profile foreign spy and they have requested your help to track him down. As part of his attempts to evade capture, he has employed a simple strategy. Each day the spy moves from the country that he is currently in to a neighboring country.

The spy cannot skip over a country (for example, he cannot go from Chile to Ecuador in one day). The movement probabilities are equally distributed amongst the neighboring countries. For example, if the spy is currently in Ecuador, there is a 50% chance he will move to Colombia and a 50% chance he will move to Peru. The spy was last seen in Chile and will only move about countries that are in South America. He has been moving about the countries for several weeks."

**Constraints:**<br>
* The spy begins in Chile, so Chile has probability 1 on Day 0
* The spy can only travel to contiguous countries within South America
* Movement probabilities are equally distributed among neighboring countries
* The spy has been moving "for several weeks"
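
These constraints describe a Markov chain, specifically a random walk on the graph of shared borders. As a sketch in notation that does not appear in the puzzle text, let $N(i)$ be the set of South American neighbors of country $i$, $P$ the transition matrix, and $p_t$ the row vector of location probabilities on day $t$:

$$
P_{ij} =
\begin{cases}
\dfrac{1}{|N(i)|} & \text{if } j \in N(i) \\
0 & \text{otherwise}
\end{cases}
\qquad
p_0 = e_{\text{Chile}},
\qquad
p_t = p_{t-1} P = p_0 P^{t}
$$

Here $e_{\text{Chile}}$ is the vector with a 1 in Chile's position and 0 elsewhere.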

### Solution:
I built a tool that asks the user how many days the spy has been moving and then produces:
* a dataframe with the probability of the spy being in each country on the previous day (where he/she may have come from) and on the current day (where he/she may be now)
* two choropleth maps based on that input (one for the potential point of departure, one for the current day)

French Guiana's status came up while building the maps: it is an overseas department of France rather than an independent country, and plotly does not seem to recognize it by name, so it is missing from the maps. I did not end up using the population data to adjust the probabilities.

In [1]:
# import dependencies
import numpy as np
import pandas as pd
from numpy.linalg import matrix_power
import plotly.graph_objs as gobj
import chart_studio.plotly
from plotly.offline import download_plotlyjs,init_notebook_mode,plot,iplot
init_notebook_mode(connected=True)

In [2]:
# define country names; this order fixes the rows and columns of the probabilities matrix
countries = ("Chile", "Argentina", "Uruguay", "Paraguay", "Brazil", "Bolivia", "Peru", "Ecuador", "Colombia", "Venezuela", "Guyana", "Suriname", "French Guiana")

In [3]:
# create dataframe of countries with population figures
# population figures from https://en.wikipedia.org/wiki/List_of_South_American_countries_by_population
country_list = ["Chile", "Argentina", "Uruguay", "Paraguay", "Brazil", "Bolivia", "Peru", "Ecuador", "Colombia", "Venezuela", "Guyana", "Suriname", "French Guiana"]
population_list = [19107216, 44938712, 3518552, 7152703, 210730000, 11469896, 32131400, 17364100, 48258494, 32219521, 782766, 581372, 296711]

country_pop = pd.DataFrame(list(zip(country_list, population_list)), 
               columns =['Country', 'Population']) 

country_pop.sort_values(by = ['Population'], ascending = False)

Unnamed: 0,Country,Population
4,Brazil,210730000
8,Colombia,48258494
1,Argentina,44938712
9,Venezuela,32219521
6,Peru,32131400
0,Chile,19107216
7,Ecuador,17364100
5,Bolivia,11469896
3,Paraguay,7152703
2,Uruguay,3518552
10,Guyana,782766
11,Suriname,581372
12,French Guiana,296711


In [4]:
# define probabilities matrix: row i holds the probabilities of moving FROM country i TO each country (rows and columns follow the order of `countries`)
country_probs = np.array ([[0, 1/3, 0, 0, 0, 1/3, 1/3, 0, 0, 0, 0, 0, 0],
                            [1/5, 0, 1/5, 1/5, 1/5, 1/5, 0, 0, 0, 0, 0, 0, 0],
                            [0, 1/2, 0, 0, 1/2, 0, 0, 0, 0, 0, 0, 0, 0],
                            [0, 1/3, 0, 0, 1/3, 1/3, 0, 0, 0, 0, 0, 0, 0],
                            [0, 1/10, 1/10, 1/10, 0, 1/10, 1/10, 0, 1/10, 1/10, 1/10, 1/10, 1/10],
                            [1/5, 1/5, 0, 1/5, 1/5, 0, 1/5, 0, 0, 0, 0, 0, 0],
                            [1/5, 0, 0, 0, 1/5, 1/5, 0, 1/5, 1/5, 0, 0, 0, 0],
                            [0, 0, 0, 0, 0, 0, 1/2, 0, 1/2, 0, 0, 0, 0],
                            [0, 0, 0, 0, 1/4, 0, 1/4, 1/4, 0, 1/4, 0, 0, 0],
                            [0, 0, 0, 0, 1/3, 0, 0, 0, 1/3, 0, 1/3, 0, 0],
                            [0, 0, 0, 0, 1/3, 0, 0, 0, 0, 1/3, 0, 1/3, 0],
                            [0, 0, 0, 0, 1/3, 0, 0, 0, 0, 0, 1/3, 0, 1/3],
                            [0, 0, 0, 0, 1/2, 0, 0, 0, 0, 0, 0, 1/2, 0]])
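
Typing a 13 x 13 matrix by hand is easy to get wrong, so it can help to generate it from a list of borders and check it. The sketch below is one possible way to do that, not the notebook's own method: `neighbors` and `build_transition_matrix` are hypothetical names, and the adjacency list was read off the non-zero entries of `country_probs` above.

```python
import numpy as np

# Same order as the `countries` tuple used throughout the notebook.
countries = ("Chile", "Argentina", "Uruguay", "Paraguay", "Brazil", "Bolivia",
             "Peru", "Ecuador", "Colombia", "Venezuela", "Guyana", "Suriname",
             "French Guiana")

# Hypothetical adjacency list: each country maps to its South American neighbors.
neighbors = {
    "Chile": ["Argentina", "Bolivia", "Peru"],
    "Argentina": ["Chile", "Uruguay", "Paraguay", "Brazil", "Bolivia"],
    "Uruguay": ["Argentina", "Brazil"],
    "Paraguay": ["Argentina", "Brazil", "Bolivia"],
    "Brazil": ["Argentina", "Uruguay", "Paraguay", "Bolivia", "Peru",
               "Colombia", "Venezuela", "Guyana", "Suriname", "French Guiana"],
    "Bolivia": ["Chile", "Argentina", "Paraguay", "Brazil", "Peru"],
    "Peru": ["Chile", "Brazil", "Bolivia", "Ecuador", "Colombia"],
    "Ecuador": ["Peru", "Colombia"],
    "Colombia": ["Brazil", "Peru", "Ecuador", "Venezuela"],
    "Venezuela": ["Brazil", "Colombia", "Guyana"],
    "Guyana": ["Brazil", "Venezuela", "Suriname"],
    "Suriname": ["Brazil", "Guyana", "French Guiana"],
    "French Guiana": ["Brazil", "Suriname"],
}


def build_transition_matrix(names, adjacency):
    """Row i gives the probabilities of moving from names[i] to each country."""
    index = {name: i for i, name in enumerate(names)}
    matrix = np.zeros((len(names), len(names)))
    for name, adjacent in adjacency.items():
        for other in adjacent:
            # Equal chance of moving to each neighbor.
            matrix[index[name], index[other]] = 1 / len(adjacent)
    return matrix


P = build_transition_matrix(countries, neighbors)

# Every row must be a probability distribution.
assert np.allclose(P.sum(axis=1), 1.0)
# Borders are mutual: if i can reach j, then j can reach i.
assert np.array_equal(P > 0, (P > 0).T)
```

The two assertions catch the most likely typos: a row that does not sum to 1, and a border entered in only one direction.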

In [5]:
# starting out in Chile on Day 0, assign zeros to all values but Chile
x = np.zeros((len(countries), 1), dtype=float)
chile_idx = 0
x[chile_idx] = 1
x

array([[1.],
       [0.],
       [0.],
       [0.],
       [0.],
       [0.],
       [0.],
       [0.],
       [0.],
       [0.],
       [0.],
       [0.],
       [0.]])

In [6]:
# define a function for calculating probability of spy location on any given day
def probability_vector(t):
    """ probability of being in any country on given day t """
    probs = np.dot(x.T, np.linalg.matrix_power(country_probs, t))
    return pd.Series(probs.ravel(), index = countries)
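
`probability_vector` only makes sense for a non-negative whole number of days: `np.linalg.matrix_power` returns the identity matrix for an exponent of 0 and, for a negative exponent, tries to invert the matrix, which has no meaning as a "day". The input cell further down also evaluates day `t - 1`, so the smallest sensible input there is 1. A minimal sketch of a check that could sit in front of the function, where `parse_days` is a hypothetical helper that is not part of the notebook:

```python
def parse_days(text):
    """Convert raw user input to a whole number of days, at least 1."""
    try:
        days = int(text)
    except ValueError:
        # Re-raise with a message that names the offending input.
        raise ValueError(f"expected a whole number of days, got {text!r}") from None
    if days < 1:
        raise ValueError("days must be at least 1 so that the previous day exists")
    return days
```

With this in place, `parse_days("21")` returns `21`, while inputs such as `"0"` or `"abc"` raise a `ValueError` instead of producing a misleading probability table.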

In [10]:
# test out with 2 days movement
probability_vector(2)

Chile            0.200000
Argentina        0.066667
Uruguay          0.066667
Paraguay         0.133333
Brazil           0.200000
Bolivia          0.133333
Peru             0.066667
Ecuador          0.066667
Colombia         0.066667
Venezuela        0.000000
Guyana           0.000000
Suriname         0.000000
French Guiana    0.000000
dtype: float64
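
The day-2 output can be checked by hand. On day 1 the spy is in Argentina, Bolivia or Peru with probability 1/3 each, and each of those countries has five neighbors, so every neighbor receives 1/15 from each day-1 country that borders it. The sketch below repeats that arithmetic with exact fractions; the variable names are illustrative and the neighbor lists are read off the Argentina, Bolivia and Peru rows of `country_probs`.

```python
from collections import defaultdict
from fractions import Fraction

# Neighbors of the three countries reachable from Chile in one day.
neighbors = {
    "Argentina": ["Chile", "Uruguay", "Paraguay", "Brazil", "Bolivia"],
    "Bolivia": ["Chile", "Argentina", "Paraguay", "Brazil", "Peru"],
    "Peru": ["Chile", "Brazil", "Bolivia", "Ecuador", "Colombia"],
}

# Day 1: Chile has three neighbors, so each holds probability 1/3.
day1 = {name: Fraction(1, 3) for name in neighbors}

# Day 2: each day-1 country splits its probability evenly among its neighbors.
day2 = defaultdict(Fraction)
for origin, prob in day1.items():
    for destination in neighbors[origin]:
        day2[destination] += prob / len(neighbors[origin])

for name, prob in sorted(day2.items()):
    print(f"{name:<10} {prob}  ({float(prob):.6f})")
```

Chile and Brazil each border all three day-1 countries and end up at 3/15 = 0.2, Paraguay and Bolivia border two of them and get 2/15, and the remaining reachable countries get 1/15, which agrees with the output above.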

In [11]:
# get user input on number of days traveling
input_text = input ("Days traveling: ")

# convert string to integer, assign variables for start point and current probabilities
input_number = int(input_text)
s = probability_vector(input_number)
start_point = probability_vector(input_number - 1)

# create dataframe and display
# s is the current-day distribution; start_point is the previous day's distribution
status = pd.DataFrame(list(zip(country_list, s, start_point)),
               columns =['Country', 'Prob_being_here', 'Prob_coming_from_here'])

# print dataframe sorted by highest probability current location
print()
status.sort_values(by = ['Prob_being_here'], ascending = False)

Days traveling: 21



Unnamed: 0,Country,Prob_being_here,Prob_coming_from_here
4,Brazil,0.19998,0.19997
5,Bolivia,0.100028,0.100042
1,Argentina,0.100027,0.100039
6,Peru,0.100018,0.100026
8,Colombia,0.079996,0.079994
0,Chile,0.060021,0.060033
3,Paraguay,0.060013,0.06002
9,Venezuela,0.059982,0.059973
10,Guyana,0.059973,0.05996
11,Suriname,0.05997,0.059955
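
The two probability columns in the table above are nearly identical because, after about three weeks, the walk has almost reached its long-run (stationary) distribution. For a random walk on a connected, undirected graph that is not bipartite, the stationary probability of a node is its number of neighbors divided by the total number of neighbors over all nodes. The border graph here meets those conditions (Chile, Argentina and Bolivia form a triangle, for example). A short sketch of that calculation, with neighbor counts read off the rows of `country_probs` and with `degree` and `stationary` as illustrative names:

```python
# Number of South American neighbors per country: a row of country_probs
# whose non-zero entries are 1/k belongs to a country with k neighbors.
degree = {
    "Chile": 3, "Argentina": 5, "Uruguay": 2, "Paraguay": 3, "Brazil": 10,
    "Bolivia": 5, "Peru": 5, "Ecuador": 2, "Colombia": 4, "Venezuela": 3,
    "Guyana": 3, "Suriname": 3, "French Guiana": 2,
}

total = sum(degree.values())

# Long-run probability of each country is proportional to its neighbor count.
stationary = {name: count / total for name, count in degree.items()}

for name, prob in sorted(stationary.items(), key=lambda item: -item[1]):
    print(f"{name:<14} {prob:.2f}")
```

This gives 0.20 for Brazil, 0.10 for Argentina, Bolivia and Peru, 0.08 for Colombia, and 0.06 for Chile, Paraguay, Venezuela, Guyana and Suriname, which is what the day-21 figures are approaching. In practice this means the exact number of days matters very little once the spy has been moving "for several weeks".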


### Mapping the spy's location
Using the input from above, create visualizations that answer:
* Where might the spy have come from?
* Where is the spy now?

In [12]:
# define data and parameters for map of where spy may have come from (start map)
data_start = dict(type = 'choropleth',
           locations = ['chile', 'argentina', 'uruguay', 'paraguay', 'brazil', 'bolivia', 'peru', 'ecuador', 
                        'colombia', 'venezuela', 'guyana', 'suriname', 'french guiana'],
           locationmode = 'country names',
           colorscale = 'RdBu',
           reversescale = True, 
           text = ['chile', 'argentina', 'uruguay', 'paraguay', 'brazil', 'bolivia', 'peru', 'ecuador', 
                   'colombia', 'venezuela', 'guyana', 'suriname', 'french guiana'],
           z = start_point,
           colorbar = {'title':'Probabilities', 'len':200,'lenmode':'pixels' })

In [13]:
# define data and parameters for map of where spy may be currently (current map)
data_current = dict(type = 'choropleth',
           locations = ['chile', 'argentina', 'uruguay', 'paraguay', 'brazil', 'bolivia', 'peru', 'ecuador', 
                        'colombia', 'venezuela', 'guyana', 'suriname', 'french guiana'],
           locationmode = 'country names',
           colorscale = 'RdBu',
           reversescale = True, 
           text = ['chile', 'argentina', 'uruguay', 'paraguay', 'brazil', 'bolivia', 'peru', 'ecuador', 
                   'colombia', 'venezuela', 'guyana', 'suriname', 'french guiana'],
           z = s,
           colorbar = {'title':'Probabilities', 'len':200,'lenmode':'pixels' })
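
The two trace definitions above differ only in their `z` values, so one option is to build them from a shared helper. This is a sketch of that refactoring rather than what the notebook does; `SOUTH_AMERICA` and `choropleth_trace` are hypothetical names, and the dictionary keys are copied from the cells above.

```python
# Lowercase names in the same order as the probability vectors.
SOUTH_AMERICA = (
    "chile", "argentina", "uruguay", "paraguay", "brazil", "bolivia", "peru",
    "ecuador", "colombia", "venezuela", "guyana", "suriname", "french guiana",
)


def choropleth_trace(z_values, locations=SOUTH_AMERICA):
    """Return a choropleth trace dict; only `z` changes between the two maps."""
    z_values = list(z_values)
    if len(z_values) != len(locations):
        raise ValueError("need exactly one probability per location")
    return dict(
        type="choropleth",
        locations=list(locations),
        locationmode="country names",
        colorscale="RdBu",
        reversescale=True,
        text=list(locations),
        z=z_values,
        colorbar={"title": "Probabilities", "len": 200, "lenmode": "pixels"},
    )


# Example call with placeholder values (all zeros), just to show the shape.
example_trace = choropleth_trace([0.0] * len(SOUTH_AMERICA))
```

In the notebook this would read `data_start = choropleth_trace(start_point)` and `data_current = choropleth_trace(s)`, which keeps the two maps from drifting apart if a styling option changes later.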

In [18]:
# initialize the layout variable for start map
layout_start = dict(geo = {'scope':'south america'},
#              title = "donde esta el espía?",
             font=dict(
                family="Arial, monospace",
                size=20,
                color="#7f7f7f")
             )

In [19]:
# initialize the layout variable for current map
layout_current = dict(geo = {'scope':'south america'},
#              title = "donde esta el espía?",
             font=dict(
                family="Arial, monospace",
                size=20,
                color="#7f7f7f")
             )

### Starting point map

In [20]:
# initialize the Figure object by passing data and layout as arguments for start map
col_map_start = gobj.Figure(data = [data_start],layout = layout_start)

# print how many days the spy has been traveling
print(f"The spy has been moving for {input_number} days. Here is where he/she may have traveled from...")

# plot the map
iplot(col_map_start)

The spy has been moving for 21 days. Here is where he/she may have traveled from...


### Current location map

In [21]:
# initialize the Figure object by passing data and layout as arguments for current map
col_map_current = gobj.Figure(data = [data_current],layout = layout_current)

# print how many days the spy has been traveling
print(f"The spy has been moving for {input_number} days. Here is where he may be...")

# plot the map
iplot(col_map_current)

The spy has been moving for 21 days. Here is where he may be...


In [None]:
# NOTES
# plotly reference: https://analyticsindiamag.com/beginners_guide_geographical_plotting_with_plotly/
# 'chile', 'argentina', 'uruguay', 'paraguay', 'brazil', 'bolivia', 'peru', 'ecuador', 'colombia', 'venezuela', 'guyana', 'suriname', 'french guiana'