<img src="http://imgur.com/1ZcRyrc.png" style="float: left; margin: 15px; height: 80px">

# Project 1

### Building "Pokemon Stay"

---
You are an analyst at a "scrappy" online gaming company that specializes in remakes of last year's fads.

Your boss, who runs the product development team, is convinced that Pokemon Go's fatal flaw was that you had to actually move around outside. She has design mock-ups for a new game called Pokemon Stay: in this version players still need to move, but just from website to website. Pokemon gyms are now popular online destinations, and catching Pokemon in the "wild" simply requires browsing the internet for hours in the comfort of your home.

She wants you to program a prototype version of the game, and analyze the planned content to help the team calibrate the design.

#### Package imports

The pprint package below is the only package imported here, and it's not even strictly required to do any of the project. Printing python variables and objects with pprint can help to format them in a "prettier" way.

In [1]:
from pprint import pprint
import numpy as np
from scipy.stats import expon
np.random.seed(5)

<img src="http://imgur.com/l5NasQj.png" style="float: left; margin: 25px 15px 0px 0px; height: 25px">

## 1. Defining a player

---

The player variables are:

    player_id : id code unique to each player (integer)
    player_name : entered name of the player (string)
    time_played : number of time played the game in minutes (float)
    player_pokemon: the player's captured pokemon (dictionary)
    gyms_visited: ids of the gyms that a player has visited (list)
    
Create the components for a player object by defining each of these variables. The dictionary and list variables should just be defined as empty; you can use any (correctly typed) values for the others.

In [2]:
player_id = 1
player_name = 'jacob tan'
time_played = 542.1
player_pokemon = {}
gyms_visited = []


<img src="http://imgur.com/l5NasQj.png" style="float: left; margin: 25px 15px 0px 0px; height: 25px">

## 2. Defining "gym" locations

---

As the sole programmer, Pokemon Stay will have to start small. To begin, there will be 10 different gym location websites on the internet. The gym locations are:

    1. 'reddit.com'
    2. 'amazon.com'
    3. 'twitter.com'
    4. 'linkedin.com'
    5. 'ebay.com'
    6. 'netflix.com'
    7. 'sporcle.com'
    8. 'stackoverflow.com'
    9. 'github.com'
    10. 'quora.com'

1. Set up a list of all the gym locations. This will be a list of strings.
2. Append two of these locations to your player's list of visited gyms.
3. Print the list.

In [3]:
# defining gyms location
pokemon_gyms = ['reddit.com', 'amazon.com', 'twitter.com', 'linkedin.com', 'ebay.com', \
              'netflix.com', 'sporcle.com', 'stackoverflow.com', 'github.com', 'quora.com']

# appending 2 gym locations to gyms visited
gyms_visited.append(pokemon_gyms[1])
gyms_visited.append(pokemon_gyms[5])

pprint(gyms_visited)

['amazon.com', 'netflix.com']


<img src="http://imgur.com/l5NasQj.png" style="float: left; margin: 25px 15px 0px 0px; height: 25px">

## 3. Create a pokedex

---

We also need to create some pokemon to catch. Each pokemon will be defined by these variables:

    pokemon_id : unique identifier for each pokemon (integer)
    name : the name of the pokemon (string)
    type : the category of pokemon (string)
    hp : base hitpoints (integer)
    attack : base attack (integer)
    defense : base defense (integer)
    special_attack : base special attack (integer)
    special_defense : base sepecial defense (integer)
    speed : base speed (integer)

We are only going to create 3 different pokemon with these `pokemon_id` and `pokemon_name` values:

    1 : 'charmander'
    2 : 'squirtle'
    3 : 'bulbasaur'

Create a dictionary that will contain the pokemon. The keys of the dictionary will be the `pokemon_id` and the values will themselves dictionaries that contain the other pokemon variables. The structure of the pokedex dictionary will start like so:
     
     {
         1: {
                 'name':'charmander',
                 'type':'fire',
                 ...
                 
The `type` of charmander, squirtle, and bulbasaur should be `'fire'`, `'water'`, and `'poison'` respectively. The other values are up to you, make them anything you like!

Print (or pretty print) the pokedex dictionary with the 3 pokemon.

In [4]:
def rand_stats():
    """rand_stats() will generate random integers from 1 to 99"""
    return np.random.randint(1,100)

# using rand_stats() to determine the value of pokemon stats
pokedex = { 1: {'name':'charmander', 'type':'fire', 'hp':rand_stats(), 'attack':rand_stats() ,\
                       'defense':rand_stats(), 'special_attack':rand_stats(), 'special_defense':rand_stats(),\
                        'speed':rand_stats()},
                    2: {'name':'squirtle', 'type':'water', 'hp':rand_stats(), 'attack':rand_stats() ,\
                       'defense':rand_stats(), 'special_attack':rand_stats(), 'special_defense':rand_stats(),\
                        'speed':rand_stats()},
                    3: {'name':'bulbasaur', 'type':'poison', 'hp':rand_stats(), 'attack':rand_stats() ,\
                       'defense':rand_stats(), 'special_attack':rand_stats(), 'special_defense':rand_stats(),\
                        'speed':rand_stats()}}

pprint(pokedex)

{1: {'attack': 62,
     'defense': 17,
     'hp': 79,
     'name': 'charmander',
     'special_attack': 74,
     'special_defense': 9,
     'speed': 63,
     'type': 'fire'},
 2: {'attack': 31,
     'defense': 81,
     'hp': 28,
     'name': 'squirtle',
     'special_attack': 8,
     'special_defense': 77,
     'speed': 16,
     'type': 'water'},
 3: {'attack': 81,
     'defense': 28,
     'hp': 54,
     'name': 'bulbasaur',
     'special_attack': 45,
     'special_defense': 78,
     'speed': 76,
     'type': 'poison'}}


<img src="http://imgur.com/l5NasQj.png" style="float: left; margin: 25px 15px 0px 0px; height: 25px">

## 4. Create a data structure for players

---

### 4.1 

In order to maintain a database of multiple players, create a dictionary that keeps track of players indexed by `player_id`. 

The keys of the dictionary will be `player_id` and values will be dictionaries containing each player's variables (from question 1). 

Construct the `players` dictionary and insert the player that you defined in question 1, then print `players`.

In [5]:
players = {player_id: {'player_name':player_name, 'time_played':time_played, \
                      'player_pokemon':player_pokemon, 'gyms_visited':gyms_visited}}

pprint(players)


{1: {'gyms_visited': ['amazon.com', 'netflix.com'],
     'player_name': 'jacob tan',
     'player_pokemon': {},
     'time_played': 542.1}}


---

### 4.2

Create a new player with `player_id = 2` in the `players` dictionary. Leave the `'player_pokemon'` dictionary empty. Append `'alcatraz'` and `'pacific_beach'` to the `'gyms_visited'` list for player 2.

The `'player_name'` and `'time_played'` values are up to you, but must be a string and float, respectively.

Remember, the player_id is the key for the player in the players dictionary.

Print the `players` dictionary with the new player inserted.

In [6]:
players[2] = {'player_name':'jamie chan', 'time_played':421., 'player_pokemon':{}, \
              'gyms_visited':gyms_visited + ['alacatraz', 'pacific_beach']}

pprint(players)

{1: {'gyms_visited': ['amazon.com', 'netflix.com'],
     'player_name': 'jacob tan',
     'player_pokemon': {},
     'time_played': 542.1},
 2: {'gyms_visited': ['amazon.com',
                      'netflix.com',
                      'alacatraz',
                      'pacific_beach'],
     'player_name': 'jamie chan',
     'player_pokemon': {},
     'time_played': 421.0}}


<img src="http://imgur.com/l5NasQj.png" style="float: left; margin: 25px 15px 0px 0px; height: 25px">

## 5. Add captured pokemon for each player

---

The `'player_pokemon'` keyed dictionaries for each player keep track of which of the pokemon each player has.

The keys of the `'player_pokemon'` dictionaries are the pokemon ids that correspond to the ids in the `pokedex` dictionary you created earlier. The values are integers specifying the stats for the pokemon.

Give player 1 a squirtle. Give player 2 charmander and a bulbasaur.

Print the players dictionary after adding the pokemon for each player.


In [7]:
# adding squirtle to player 1 pokemon list
players[1]['player_pokemon'][2] = pokedex[2]

# adding charmander and bulbasaur to player 2 pokemon list
players[2]['player_pokemon'][1] = pokedex[1]
players[2]['player_pokemon'][3] = pokedex[3]

pprint(players)

{1: {'gyms_visited': ['amazon.com', 'netflix.com'],
     'player_name': 'jacob tan',
     'player_pokemon': {2: {'attack': 31,
                            'defense': 81,
                            'hp': 28,
                            'name': 'squirtle',
                            'special_attack': 8,
                            'special_defense': 77,
                            'speed': 16,
                            'type': 'water'}},
     'time_played': 542.1},
 2: {'gyms_visited': ['amazon.com',
                      'netflix.com',
                      'alacatraz',
                      'pacific_beach'],
     'player_name': 'jamie chan',
     'player_pokemon': {1: {'attack': 62,
                            'defense': 17,
                            'hp': 79,
                            'name': 'charmander',
                            'special_attack': 74,
                            'special_defense': 9,
                            'speed': 63,
                            'typ



## 6. What gyms have players visited?

---
<img src="http://imgur.com/l5NasQj.png" style="float: left; margin: 25px 15px 0px 0px; height: 25px">
### 6.1

Write a for-loop that:

1. Iterates through the `pokemon_gyms` list of gym locations you defined before.
2. For each gym, iterate through each player in the `players` dictionary with a second, internal for-loop.
3. If the player has visited the gym, print out "[player] has visited [gym location].", filling in [player] and [gym location] with the current player's name and current gym location.

In [8]:
# adding alacatraz gym and pacific_beach gym into the list of gyms
pokemon_gyms.extend(['alacatraz', 'pacific_beach'])

#for loop
for gym in pokemon_gyms:
    for p in players:
        if gym in players[p]['gyms_visited']:
            print players[p]['player_name'] + ' has visited ' + gym + '.'

jacob tan has visited amazon.com.
jamie chan has visited amazon.com.
jacob tan has visited netflix.com.
jamie chan has visited netflix.com.
jamie chan has visited alacatraz.
jamie chan has visited pacific_beach.


<img src="http://imgur.com/xDpSobf.png" style="float: left; margin: 25px 15px 0px 0px; height: 25px">
### 6.2

How many times did that loop run? If you have N gyms and also N players, how many times would it run as a function of N?

Can you think of a more efficient way to accomplish the same thing? 

(You can write your answer as Markdown text.)

The loop ran 24 times. For every gym, it looped twice. If I have N gyms and N players, for every gym it will loop N times. Hence for N gyms it will loop N*N times.

In [9]:
# assuming that efficient is defined to be running the loop lesser times
for p in players:
    for i in players[p]['gyms_visited']:
            if i in pokemon_gyms:
                print players[p]['player_name'] + ' has visited ' + i + '.'
                
# this loop for i times for each player, where i varies according to each player.
# this for loop version will loop lesser times than the previous one because i can be any value betwwen 0 to N

jacob tan has visited amazon.com.
jacob tan has visited netflix.com.
jamie chan has visited amazon.com.
jamie chan has visited netflix.com.
jamie chan has visited alacatraz.
jamie chan has visited pacific_beach.


## <img src="http://imgur.com/l5NasQj.png" style="float: left; margin: 25px 15px 0px 0px; height: 25px">

## 7. Calculate player "power".

---

Define a function that will calculate a player's "power". Player power is defined as the sum of the base statistics all of their pokemon.

Your function will:

1. Accept the `players` dictionary, `pokedex` dictionary, and a player_id as arguments.
2. For the specified player_id, look up that player's pokemon and their level(s).
3. Find and aggregate the attack and defense values for each of the player's pokemon from the `pokedex` dictionary.
4. Print "[player name]'s power is [player power].", where the player power is the sum of the base statistics for all of their pokemon.
5. Return the player's power value.

Print out the pokemon power for each of your players.

In [10]:
# assuming sum of base statistics = attack + defense + special_attack + special_defense + speed + hp
def power(players, pokedex, player_id):
    agg_atk_def= {}
    other_stats = {}
    for key in players[player_id]['player_pokemon'].keys():
        
        # aggregating attack and defense values for each of the player's pokemon into a dictionary
        agg_atk_def[key] = pokedex[key]['attack'] + pokedex[key]['special_attack'] + pokedex[key]['defense'] + \
        pokedex[key]['special_defense']
        
        # aggregating all other stats values for each of the player's pokemon into a dictionary
        other_stats[key] = pokedex[key]['hp'] + pokedex[key]['speed']
    
    # summing up all the value items in agg_att_def dictionary
    agg_atk_def_sum = 0
    for key in agg_atk_def.keys():
        agg_atk_def_sum += agg_atk_def[key]
    
    # summing up all the value items in the other_stats dictionary
    other_stats_sum = 0
    for key in other_stats.keys():
        other_stats_sum += other_stats[key]   
    
    power = agg_atk_def_sum + other_stats_sum
    
    print players[player_id]['player_name'] + '\'s power is ' + str(power) + '.'
    return power

In [11]:
pprint (power(players, pokedex, 1))
pprint (power(players, pokedex, 2))

jacob tan's power is 241.
241
jamie chan's power is 666.
666


<img src="http://imgur.com/l5NasQj.png" style="float: left; margin: 25px 15px 0px 0px; height: 25px">

## 8. Load a pokedex file containing all the pokemon

---

### 8.1

While you were putting together the prototype code, your colleagues were preparing a dataset of Pokemon and their attributes. (This was a rush job, so they may have picked some crazy values for some...)

The code below loads information from a comma separated value (csv) file. You need to parse this string into a more useable format. The format of the string is:

- Rows are separated by newline characters: \n
- Columns are separated by commas: ,
- All cells in the csv are double quoted. Ex: "PokedexNumber" is the first cell of the first row.


Using for-loops, create a list of lists where each list within the overall list is a row of the csv/matrix, and each element in that list is a cell in that row. Additional criteria:

1. Quotes are removed from each cell item.
2. Numeric column values are converted to floats.
3. There are some cells that are empty and have no information. For these cells put a -1 value in place.

Your end result is effectively a matrix. Each list in the outer list is a row, and the *j*th elements of list together form the *j*th column, which represents a data attribute. The first three lists in your pokedex list should look like this:

    ['PokedexNumber', 'Name', 'Type', 'Total', 'HP', 'Attack', 'Defense', 'SpecialAttack', 'SpecialDefense', 'Speed']
    [1.0, 'Bulbasaur', 'GrassPoison', 318.0, 45.0, 49.0, 49.0, 65.0, 65.0, 45.0]
    [2.0, 'Ivysaur', 'GrassPoison', 405.0, 60.0, 62.0, 63.0, 80.0, 80.0, 60.0]

In [12]:
# code to read in pokedex info
import csv
raw_pd = []
pokedex_file = './pokedex_basic.csv'
with open(pokedex_file, 'rU') as f:
    reader = csv.reader(f)
    for row in reader:
        raw_pd.append(row)        
f.close() 

# the pokedex string is assigned to the raw_pd variable
pprint (raw_pd[0:2])

[['PokedexNumber',
  'Name',
  'Type',
  'Total',
  'HP',
  'Attack',
  'Defense',
  'SpecialAttack',
  'SpecialDefense',
  'Speed'],
 ['001',
  'Bulbasaur',
  'GrassPoison',
  '318',
  '45',
  '49',
  '49',
  '65',
  '65',
  '45']]


In [13]:
# for loop
ref_pd_forloops = []

for sublist in raw_pd:
    ref_sublist = []
    for i in sublist:
        # changing value of empty cells or cells with whitespace to -1
        if i == '' or i == ' ':
            ref_sublist.append(-1)
            
        # changing numbers to float
        elif i.isdigit() == True:
            ref_sublist.append(float(i))
        else:
            ref_sublist.append(i.lower())
    ref_pd_forloops.append(ref_sublist)

pprint (ref_pd_forloops[0:3])

[['pokedexnumber',
  'name',
  'type',
  'total',
  'hp',
  'attack',
  'defense',
  'specialattack',
  'specialdefense',
  'speed'],
 [1.0, 'bulbasaur', 'grasspoison', 318.0, 45.0, 49.0, 49.0, 65.0, 65.0, 45.0],
 [2.0, 'ivysaur', 'grasspoison', 405.0, 60.0, 62.0, 63.0, 80.0, 80.0, 60.0]]


<img src="http://imgur.com/xDpSobf.png" style="float: left; margin: 25px 15px 0px 0px; height: 25px">

### 8.2 Parse the raw pokedex with list comprehensions

---

Perform the same parsing as above, but **using only a single list comprehension** instead of for loops. You may have nested list comprehensions within the main list comprehension! The output should be exactly the same.

In [14]:
ref_pd_listcompre = [[-1 if i == '' or i == ' ' else float(i) if i.isdigit() == True else i.lower() for i in sublist]\
                     for sublist in raw_pd]

pprint (ref_pd_listcompre[0:3])

[['pokedexnumber',
  'name',
  'type',
  'total',
  'hp',
  'attack',
  'defense',
  'specialattack',
  'specialdefense',
  'speed'],
 [1.0, 'bulbasaur', 'grasspoison', 318.0, 45.0, 49.0, 49.0, 65.0, 65.0, 45.0],
 [2.0, 'ivysaur', 'grasspoison', 405.0, 60.0, 62.0, 63.0, 80.0, 80.0, 60.0]]


<img src="http://imgur.com/l5NasQj.png" style="float: left; margin: 25px 15px 0px 0px; height: 25px">

## 9. Write a function to generate the full pokedex

---

Write a function that recreates the pokedex you made before, but with the data read in from the full pokemon file. The `PokedexNumber` should be used as the `pokemon_id` key values for the dictionary of pokemon.

Your function should:

1. Take the parsed pokedex information you created above as an argument.
2. Return a dictionary in the same format as your original pokedex you created before containing the information from the parsed full pokedex file.

To test the function, print out the pokemon with id = 100.

In [15]:
def pokedex_func(pokemon_data):
    
    # separating the header from the rest of the data and remove 'pokedexnumber'
    header = pokemon_data[0]
    if 'pokedexnumber' in header:
        header.remove('pokedexnumber')

    # creating a new list containing pokedex number
    data = pokemon_data[1:]
    pokemon_id = [sublist[0] for sublist in data]
    
    # deleting pokedex number from data
    data_wo_id = [list(sublist) for sublist in zip(*data)]
    del data_wo_id[0]
    data = [list(sublist) for sublist in zip(*data_wo_id)]
    
    # creating a pokedex dictionary where pokemon_id is the key and the value is the dictionary of zipped header and sublist 
    pokedex = {id: dict(zip(header,s)) for id, s in zip(pokemon_id, data)}
    
    return pokedex
    
pokedex = pokedex_func(ref_pd_forloops)
pprint (pokedex[100])

{'attack': 30.0,
 'defense': 50.0,
 'hp': 40.0,
 'name': 'voltorb',
 'specialattack': 55.0,
 'specialdefense': 55.0,
 'speed': 100.0,
 'total': 330.0,
 'type': 'electric'}


<img src="http://i.imgur.com/GCAf1UX.png" style="float: left; margin: 25px 15px 0px 0px; height: 25px">

## 10. Write a function to generate a "filtered" pokedex
---
Your function should:
1. Take the parsed pokedex information you created above as an argument.
1. Take a dictionary as a parameter with keys matching the features of the Pokedex, filtering by exact match for string type values, and/or filter continuous variables specified value that is greater than or equal to the dictionary key parameter.
1. Return multiple elements from the Pokedex

Example:

```python

# Only filter based on parameters passed
filter_options = {
    'Attack':   25,
    'Defense':  30,
    'Type':     'Electric'
}

# Return records with attack >= 24, defense >= 30, and type == "Electric"
# Also anticipate that other paramters can also be passed such as "SpecialAttack", "Speed", etc.
filtered_pokedex(pokedex_data, filter=filter_options)

# Example output:
# [{'Attack': 30.0,
#  'Defense': 50.0,
#  'HP': 40.0,
#  'Name': 'Voltorb',
#  'SpecialAttack': 55.0,
#  'SpecialDefense': 55.0,
#  'Speed': 100.0,
#  'Total': 330.0,
#  'Type': 'Electric'},
#  {'Attack': 30.0,
#  'Defense': 33.0,
#  'HP': 32.0,
#  'Name': 'Pikachu',
#  'SpecialAttack': 55.0,
#  'SpecialDefense': 55.0,
#  'Speed': 100.0,
#  'Total': 330.0,
#  'Type': 'Electric'},
#  ... etc
#  ]

```



In [16]:
def filtered_pokedex(pokedex, filter):
    
    # removing the keys in pokemon and turn it into dictionaries in a list
    pokedex_wo_id = [value for key, value in pokedex.items()]
    
    filtered_pokemon = []
    for pokemon in pokedex_wo_id:
        # if the value of the pokemon[key] value fits the current criteria, it will move on to the next criteria
        for key, value in filter.items():  
            # checking value type of each key
            if type(value) == int or type(value) == float:
                if pokemon[key.lower()] <= value:  
                    # if value is less than the corresponding int/float pokemon[key] value, the loop will move on to the next pokemon
                    break                           
            elif type(value) == str:
                if pokemon[key.lower()] != value.lower():
                    # if value is not equal to the corresponding str pokemon[key] value, the loop will move on to the next pokemon
                    break
                    
        # only after all the criteria are looped through, append to filtered_pokemon list if there isn't any break 
        else:
            filtered_pokemon.append(pokemon)
    return filtered_pokemon

In [17]:
filter_options = {'specialattack':80, 'ATTACK':80, 'type':'FIRE', 'speed':80}
pprint (filtered_pokedex(pokedex, filter_options))

[{'attack': 110.0,
  'defense': 80.0,
  'hp': 90.0,
  'name': 'arcanine',
  'specialattack': 100.0,
  'specialdefense': 80.0,
  'speed': 95.0,
  'total': 555.0,
  'type': 'fire'},
 {'attack': 95.0,
  'defense': 57.0,
  'hp': 65.0,
  'name': 'magmar',
  'specialattack': 100.0,
  'specialdefense': 85.0,
  'speed': 93.0,
  'total': 495.0,
  'type': 'fire'},
 {'attack': 84.0,
  'defense': 78.0,
  'hp': 78.0,
  'name': 'typhlosion',
  'specialattack': 109.0,
  'specialdefense': 85.0,
  'speed': 100.0,
  'total': 534.0,
  'type': 'fire'},
 {'attack': 115.0,
  'defense': 85.0,
  'hp': 115.0,
  'name': 'entei',
  'specialattack': 90.0,
  'specialdefense': 75.0,
  'speed': 100.0,
  'total': 580.0,
  'type': 'fire'},
 {'attack': 95.0,
  'defense': 67.0,
  'hp': 75.0,
  'name': 'magmortar',
  'specialattack': 125.0,
  'specialdefense': 95.0,
  'speed': 83.0,
  'total': 540.0,
  'type': 'fire'},
 {'attack': 98.0,
  'defense': 63.0,
  'hp': 75.0,
  'name': 'simisear',
  'specialattack': 98.0,
  'sp


## 9. Descriptive statistics on the prototype pokedex

<img src="http://imgur.com/l5NasQj.png" style="float: left; margin: 25px 15px 0px 0px; height: 25px">
### 9.1

What is the population mean and standard deviation of the "Total" attribute for all characters in the Pokedex?



In [18]:
# separating header from the rest of the data
data = ref_pd_forloops[1:]

# creating a new list that contains only the 'total' attribute of all pokemons
total_list = [sublist[3] for sublist in data]

mean = np.mean(total_list)
sd = np.std(total_list)

print 'Population mean: ' + str(mean)
print 'Population standard deviation: ' + str(sd)

Population mean: 435.1275
Population standard deviation: 119.962020005


<img src="http://imgur.com/l5NasQj.png" style="float: left; margin: 25px 15px 0px 0px; height: 25px">
### 9.2

The game is no fun if the characters are wildly unbalanced! Are any characters "overpowered", which we'll define as having a "Total" more than three standard deviations from the population mean?

In [19]:
overpowered = mean + 3*sd

# creating a list that contains the indexes of overpowered pokemon
overpowered_list_index = [index for index, value in enumerate (total_list) if value > overpowered]

for i in overpowered_list_index:
    # data[i][1] corresponds to the name of pokemon
    print data[i][1] + ' is overpowered.'

mewtwomega mewtwo x is overpowered.


<img src="http://imgur.com/xDpSobf.png" style="float: left; margin: 25px 15px 0px 0px; height: 25px">

## 10. Calibrate the frequency of Pokemon

The design team wants you to make the powerful Pokemon rare, and the weaklings more common. How would you set the probability $p_i$ of finding Pokemon *i* each time a player visits a gym?

Write a function that takes in a Pokedex number and returns a value $p_i$ for that character.

Hint: there are many ways you could do this. What do _you_ think makes sense? Start with simplifying assumptions: for example, you could assume that the probabilities of encountering any two Pokemon on one visit to a gym are independent of each other.

In [20]:
# assuming that the probability of finding a pokemon is exponentially distributed
# weaker pokemons have a higher probability and the powerful pokemon have a lower probability
# assuming strength of a pokemon is determined by their 'total' attribute score
# assuming the probability of finding the same pokemon at every gym is different due to higher gym levels
# assuming 'total' attribute is a continuous variable

# assigning different scale for each gym given the third assumption
pokemon_gyms_scale = {'reddit.com':50, 'amazon.com':55, 'twitter.com':60, 'linkedin.com':65, 'ebay.com':70,\
                      'netflix.com':75, 'sporcle.com':80, 'stackoverflow.com':85, 'github.com':90,\
                      'quora.com':95, 'alacatraz':100, 'pacific_beach':105}

def prob_func(num, gym):
    """prob_func will take pokemon ID and a gym as a positional argument. Each gym has a different scale factor which
    will determine the steepness of the exponential curve."""
    
    # total_list is taken from the answer above
    # use set() to remove duplicates from total_list
    total_ls_uni = set(total_list)

    # sort the list from smallest to biggest
    total_ls_uni_sort= sorted(total_ls_uni)

    # normalised the list so that the starting value of the list is 1
    total_ls_uni_sort_norm = list(np.array(total_ls_uni_sort) - total_ls_uni_sort[0]+1)
    
    # identifying the index of the pokemond ID in the normalised list
    x1 = total_ls_uni_sort_norm.index(pokedex[num]['total'] - total_ls_uni_sort[0]+1)
    # identifying the previous index
    x2 = x1 - 1
    
    # expon.cdf(x) is the cdf of x in the exponential graph
    # hence, to find the probability of x, I would need to take expon.cdf(x) - expon.cdf(x-1)
    expon_x1 = expon.cdf(total_ls_uni_sort_norm[x1], scale = pokemon_gyms_scale[gym])
    expon_x2 = expon.cdf(total_ls_uni_sort_norm[x2], scale = pokemon_gyms_scale[gym])
    prob = (expon_x1 - expon_x2) *100
    
    return 'Probability of encoutering pokemon, ID: ' + str(num) +' Name: ' + pokedex[num]['name'].capitalize() + ', is ' + \
    str(prob) + ' %.'
    

In [21]:
print prob_func(721, 'pacific_beach')

Probability of encoutering pokemon, ID: 721 Name: Volcanion, is 0.10668795377 %.
