# Lab 1.01: Data Structures and Python with Pokemon

### Building "Pokemon Stay"

---
You are an analyst at a "scrappy" online gaming company that specializes in remakes of last year's fads.

Your boss, who runs the product development team, is convinced that Pokemon Go's fatal flaw was that you had to actually move around outside. She has design mock-ups for a new game called Pokemon Stay: in this version players still need to move, but just from website to website. Pokemon gyms are now popular online destinations, and catching Pokemon in the "wild" simply requires browsing the internet for hours in the comfort of your home.

She wants you to program a prototype version of the game.

## 1. Defining a player

---
Each player needs a set of characteristics, stored in variables, such as an id, a username, play data, etc. A great structure to house these variables is a `dictionary`, because the `values` can contain any Python datatype, including `list`, `dict`, `tuple`, `int`, `float`, `bool`, or `str`.

The player characteristics (keys to the player dict) are:

    player_id : id code unique to each player (integer)
    player_name : entered name of the player (string)
    time_played : total time spent playing the game, in minutes (float)
    player_pokemon: the player's captured pokemon (dictionary)
    gyms_visited: ids of the gyms that a player has visited (list)

### A) Create a `dict` for a single player.

* The `player_id` should be 1
* Since the player doesn't have a name yet, you may set the `player_name` equal to `None`
* The rest of the fields should be populated properly depending on the datatype, i.e., `0.0` or an empty iterable of the appropriate type.

In [1]:
player_1 = dict(player_id = 1,
                player_name = None, 
                time_played = 0.0, 
                player_pokemon = {}, 
                gyms_visited = [])

player_1

{'player_id': 1,
 'player_name': None,
 'time_played': 0.0,
 'player_pokemon': {},
 'gyms_visited': []}

### B) Create a `dict` to house your dataset of players.

* Because only `player_1` exists, there should only be one `key:value` pair. 
* The `keys` of this `dict` should be the `player_id`, and the `values` should be the dictionaries with single-player info, including the `player_id` (slightly redundant).

In [2]:
poke_players = {player_1['player_id']: player_1}

To see the contents of a variable, just run a code cell with the variable name in it.

In [3]:
poke_players

{1: {'player_id': 1,
  'player_name': None,
  'time_played': 0.0,
  'player_pokemon': {},
  'gyms_visited': []}}

### C) Update player 1's info with your own.

* By indexing your `poke_players` dictionary, update the `player_name` field to your own name.
* Display the contents of `poke_players` to check your work.

In [4]:
# Your code here
poke_players[1]['player_name'] = 'kevin'

In [5]:
poke_players

{1: {'player_id': 1,
  'player_name': 'kevin',
  'time_played': 0.0,
  'player_pokemon': {},
  'gyms_visited': []}}

### D) Define a function that adds a player to `poke_players`.

Your function should...

* Take arguments for `players_dict`, `player_id`, and `player_name`.
* Create a player with the above values and populate the `gyms_visited`, `player_pokemon`, and `time_played` fields in the same way you did for `player_1` above.
* Print the name of the player added.
* `return` the updated dictionary.

In [6]:
def add_player(players_dict, player_id, player_name):
    player_name = player_name.upper()
    new_player = dict(player_id = player_id,
                      player_name = player_name,
                      time_played = 0.0,
                      player_pokemon = {},
                      gyms_visited = [])
    players_dict[player_id] = new_player
    print("Player", player_name, "added!")
    return players_dict

### E) Add a new player

* Add a second player to the `poke_players` dictionary using the `add_player` function. The id should be 2, but the name is up to you!
* Reassign and overwrite the `poke_players` dictionary.
* Display the contents of `poke_players` to check your work.

In [7]:
poke_players = add_player(poke_players, 2, 'chalisse')
poke_players

Player CHALISSE added!


{1: {'player_id': 1,
  'player_name': 'kevin',
  'time_played': 0.0,
  'player_pokemon': {},
  'gyms_visited': []},
 2: {'player_id': 2,
  'player_name': 'CHALISSE',
  'time_played': 0.0,
  'player_pokemon': {},
  'gyms_visited': []}}
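
A note on the "reassign and overwrite" step: a function written like `add_player` modifies the dictionary it receives and returns that same object, so the reassignment is a readability convention rather than a requirement. The sketch below illustrates this with a hypothetical stand-in function (`add_entry`) and made-up data; neither is part of the lab.

```python
# Hypothetical stand-in for add_player, reduced to the essentials.
def add_entry(players, player_id, name):
    players[player_id] = {'player_id': player_id, 'player_name': name}
    return players

demo_players = {}
returned = add_entry(demo_players, 1, 'ash')

# The function mutated demo_players in place and returned the same object.
same_object = returned is demo_players
print(same_object)         # True
print(1 in demo_players)   # True
```

If you need the original dictionary to stay unchanged, pass a copy into the function instead.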

## 2. Defining "gym" locations

---

Because you are the sole programmer, Pokemon Stay will have to start small. To begin, there will be 10 different gym location websites on the internet. The gym locations are:

    1. 'reddit.com'
    2. 'amazon.com'
    3. 'twitter.com'
    4. 'linkedin.com'
    5. 'ebay.com'
    6. 'netflix.com'
    7. 'stackoverflow.com'
    8. 'github.com'
    9. 'quora.com'
    10. 'google.com'

* Set up a list of all the gym locations. This will be a list of strings. Print the list to check your work.
* For each player in `poke_players`, use `sample` (imported from `random` below) to randomly select 2 gyms and add these gyms to the `gyms_visited` field.
* Display the contents of `poke_players` to check your work.

In [8]:
from random import sample

In [9]:
# Run this cell a few times to understand sample. Play around with the function!
this_list = ['apple', 1, ('a','b','c'), 0.8]
sample(this_list, 3)

[1, 0.8, ('a', 'b', 'c')]
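
Because `sample` is random, the output above will differ from run to run. If you want repeatable results while experimenting, one option is to seed the generator first. This is a small sketch; the seed value `42` and the variable names are arbitrary choices for the demo.

```python
import random

demo_list = ['apple', 1, ('a', 'b', 'c'), 0.8]

random.seed(42)                       # arbitrary seed, chosen only for this demo
first = random.sample(demo_list, 3)
random.seed(42)                       # reset to the same seed
second = random.sample(demo_list, 3)

print(first == second)                # True: same seed, same draw
print(len(first))                     # 3: each position in the list is picked at most once
```

Note that `sample` draws without replacement, so asking for more items than the list contains raises a `ValueError`.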

In [10]:
gyms = ['reddit.com', 
        'amazon.com', 
        'twitter.com', 
        'linkedin.com', 
        'ebay.com', 
        'netflix.com', 
        'stackoverflow.com', 
        'github.com', 
        'quora.com', 
        'google.com']

In [11]:
for player in poke_players:
    poke_players[player]['gyms_visited'] = sample(gyms, 2)
poke_players

{1: {'player_id': 1,
  'player_name': 'kevin',
  'time_played': 0.0,
  'player_pokemon': {},
  'gyms_visited': ['ebay.com', 'quora.com']},
 2: {'player_id': 2,
  'player_name': 'CHALISSE',
  'time_played': 0.0,
  'player_pokemon': {},
  'gyms_visited': ['reddit.com', 'amazon.com']}}

## 3. Create a pokedex

---

We also need to create some pokemon to catch! Let's store the attributes of each pokemon in a `dictionary`, since each pokemon has many characteristics we'd like to store.


Each pokemon will be defined by these characteristics (keys to the pokemon dict):

    poke_id : unique identifier for each pokemon (integer, sequential)
    poke_name : the name of the pokemon (string)
    poke_type : the category of pokemon (string)
    hp : base hitpoints (integer between 400 and 500)
    attack : base attack (integer between 50 and 100)
    defense : base defense (integer between 50 and 100)
    special_attack : base special attack (integer between 100 and 150)
    special_defense : base special defense (integer between 100 and 150)
    speed : base speed (integer between 0 and 100)
    
**Note**: All integer ranges are inclusive on both ends.

### A) Create a function called `create_pokemon`

* The function should take arguments for `poke_id`, `poke_name`, and `poke_type`.
* Assign these arguments along with random stats into a `dict` using the guidelines above.
* Use `np.random.randint` to generate values for the numeric attributes based on the conditions above. If you're not clear on how this function works, there is a cell below with an example. Play around with it!
* The function should return a `dict` for the pokemon.
* Without assigning it to a variable, check the function's output by calling it with the following arguments:
  * `poke_id = 1`
  * `poke_name = 'charmander'`
  * `poke_type = 'fire'`

In [12]:
import numpy as np

In [13]:
# Play around with this cell to understand np.random.randint!

np.random.randint(0,3)

1
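
One detail worth checking against the note on inclusive ranges: the upper bound passed to `np.random.randint` is exclusive. The sketch below draws many values at once (using the `size` argument) to make that visible; the variable names are made up for the demo.

```python
import numpy as np

# 1000 draws with bounds (0, 3): only 0, 1, and 2 can appear.
draws = np.random.randint(0, 3, size=1000)
print(draws.min() >= 0, draws.max() <= 2)    # True True

# To make 100 a possible value, pass 101 as the upper bound.
speeds = np.random.randint(0, 101, size=1000)
print(speeds.max() <= 100)                   # True
```

So a stat described as "between 50 and 100, inclusive" needs the bounds `(50, 101)`.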

In [14]:
def create_pokemon(poke_id, poke_name, poke_type):
    return dict(poke_id = poke_id,
                poke_name = poke_name, 
                poke_type = poke_type, 
                hp = np.random.randint(400, 501),
                attack = np.random.randint(50, 101),
                defense = np.random.randint(50, 101),
                special_attack = np.random.randint(100, 151),
                special_defense = np.random.randint(100, 151),
                speed = np.random.randint(0, 101))
create_pokemon(1, 'charmander', 'fire')

{'poke_id': 1,
 'poke_name': 'charmander',
 'poke_type': 'fire',
 'hp': 417,
 'attack': 96,
 'defense': 95,
 'special_attack': 116,
 'special_defense': 100,
 'speed': 3}

### B) Populate the `pokedex`!

Now we need some pokemon to catch. Let's create a dictionary to store the information!

* Instantiate an empty dictionary called `pokedex`.
* Define a function called `create_and_add_to_pokedex`. This function should...
  * Take arguments for `pokedex`,  `poke_id`, `poke_name`, and `poke_type`.
  * Use the `create_pokemon` function you created earlier to create a pokemon using the provided `poke_id`, `poke_name`, and `poke_type`.
  * Add a new `key:value` pair to the `pokedex` dictionary where:
    * the `key` is the `poke_id`, and
    * the `value` is the newly-created pokemon dict, including the `poke_id` (this is slightly redundant, but that's ok!)
  * Print the name of the pokemon added to the pokedex using the string `format` method or `f` strings.
* Add the following 3 pokemon to your `pokedex` using `create_and_add_to_pokedex`:

|Id|Name|Type|
|---|---|---|
|1|charmander|fire|
|2|squirtle|water|
|3|bulbasaur|poison|

Display your `pokedex` to check your work. It should look something like...

```python
{1: {'attack': 64,
  'defense': 59,
  'hp': 495,
  'poke_id': 1,
  'poke_name': 'charmander',
  'poke_type': 'fire',
  'special_attack': 100,
  ...
```

In [15]:
pokedex = {}

In [16]:
def create_and_add_to_pokedex(pokedex, poke_id, poke_name, poke_type):
    new_pokemon = create_pokemon(poke_id, poke_name, poke_type)
    # Add (or overwrite) the entry keyed by poke_id.
    pokedex[poke_id] = new_pokemon
    print(f'New pokemon named "{poke_name}" added!')

In [17]:
create_and_add_to_pokedex(pokedex, 1, 'charmander', 'fire')
create_and_add_to_pokedex(pokedex, 2, 'squirtle', 'water')
create_and_add_to_pokedex(pokedex, 3, 'bulbasaur', 'poison')

New pokemon named "charmander" added!
New pokemon named "squirtle" added!
New pokemon named "bulbasaur" added!


In [18]:
pokedex

{1: {'poke_id': 1,
  'poke_name': 'charmander',
  'poke_type': 'fire',
  'hp': 483,
  'attack': 87,
  'defense': 89,
  'special_attack': 109,
  'special_defense': 134,
  'speed': 72},
 2: {'poke_id': 2,
  'poke_name': 'squirtle',
  'poke_type': 'water',
  'hp': 437,
  'attack': 91,
  'defense': 59,
  'special_attack': 103,
  'special_defense': 146,
  'speed': 84},
 3: {'poke_id': 3,
  'poke_name': 'bulbasaur',
  'poke_type': 'poison',
  'hp': 461,
  'attack': 98,
  'defense': 52,
  'special_attack': 140,
  'special_defense': 126,
  'speed': 77}}

## 4. Let's capture some pokemon!

---

Each player in `poke_players` should have a nested dictionary with the key `'player_pokemon'`. This is intended to be the place where we keep track of which of the pokemon each player has.

The keys of the `'player_pokemon'` dictionaries are the `poke_id`s that correspond to the ids in the `pokedex` dictionary you created earlier, and the values are the individual pokemon dicts. 

Essentially, we are replicating the structure of our `pokedex` for each user, only showing the Pokemon a particular user has captured nested within their individual player dictionary.

* Define a function called `add_pokemon_to_player` that...
  * Takes arguments for `player_id`, `poke_id`, `poke_players`, and `pokedex`.
  * Adds the desired pokemon to the `player_pokemon` field of the specified player
  * Prints which pokemon was added to which player.
  * Returns the modified `poke_players`.

In [19]:
def add_pokemon_to_player(player_id, poke_id, poke_players, pokedex):
    pokemon_to_add = pokedex[poke_id]
    destination = poke_players[player_id]['player_pokemon']
    destination.update({poke_id: pokemon_to_add})
    pokemon_to_add_name = pokemon_to_add['poke_name']
    print(pokemon_to_add_name + ' added to player ' + str(player_id))
    return poke_players

* Call your function three times to add 
  * `squirtle` to player 1
  * `charmander` to player 2
  * `bulbasaur` to player 2
* Overwrite your `poke_players` variable each time with the updated dictionary.
* Display the contents of `poke_players` to check your work.

In [20]:
poke_players = add_pokemon_to_player(1, 2, poke_players, pokedex)
poke_players = add_pokemon_to_player(2, 1, poke_players, pokedex)
poke_players = add_pokemon_to_player(2, 3, poke_players, pokedex)
poke_players

squirtle added to player 1
charmander added to player 2
bulbasaur added to player 2


{1: {'player_id': 1,
  'player_name': 'kevin',
  'time_played': 0.0,
  'player_pokemon': {2: {'poke_id': 2,
    'poke_name': 'squirtle',
    'poke_type': 'water',
    'hp': 437,
    'attack': 91,
    'defense': 59,
    'special_attack': 103,
    'special_defense': 146,
    'speed': 84}},
  'gyms_visited': ['ebay.com', 'quora.com']},
 2: {'player_id': 2,
  'player_name': 'CHALISSE',
  'time_played': 0.0,
  'player_pokemon': {1: {'poke_id': 1,
    'poke_name': 'charmander',
    'poke_type': 'fire',
    'hp': 483,
    'attack': 87,
    'defense': 89,
    'special_attack': 109,
    'special_defense': 134,
    'speed': 72},
   3: {'poke_id': 3,
    'poke_name': 'bulbasaur',
    'poke_type': 'poison',
    'hp': 461,
    'attack': 98,
    'defense': 52,
    'special_attack': 140,
    'special_defense': 126,
    'speed': 77}},
  'gyms_visited': ['reddit.com', 'amazon.com']}}
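
One thing to be aware of with this structure: storing `pokedex[poke_id]` directly in a player's `player_pokemon` stores a reference to the same dictionary, not a copy. The sketch below uses made-up miniature data (`demo_pokedex`, `demo_player`) to show the difference between sharing the dict and storing a deep copy; whether you want copies depends on how the game should behave.

```python
import copy

demo_pokedex = {1: {'poke_name': 'charmander', 'hp': 483}}
demo_player = {'player_pokemon': {}}

# Storing the dict itself: player and pokedex now point at one shared object.
demo_player['player_pokemon'][1] = demo_pokedex[1]
demo_player['player_pokemon'][1]['hp'] = 1
shared_hp = demo_pokedex[1]['hp']
print(shared_hp)        # 1: the pokedex entry changed too

# Storing a deep copy: the player's pokemon is independent of the pokedex.
demo_pokedex[1]['hp'] = 483
demo_player['player_pokemon'][1] = copy.deepcopy(demo_pokedex[1])
demo_player['player_pokemon'][1]['hp'] = 1
copied_hp = demo_pokedex[1]['hp']
print(copied_hp)        # 483: the pokedex entry is untouched
```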

## 5. What gyms have players visited?

### A) Checking gyms

Write a nested for-loop that:

1. Iterates through the `gyms` list of gym locations you defined before.
2. For each gym, iterate through each player in the `poke_players` dictionary with a second, internal for-loop, checking if the player has visited that gym (stored in the `'gyms_visited'` list).
3. If the player has visited the gym, print out "{player_name} has visited {gym}.", filling in `{player_name}` and `{gym}` with the current player's name and current gym location.

In [21]:
# Your code here
for gym in gyms:
    for player in poke_players:
        gyms_visited = poke_players[player]['gyms_visited']
        player_name = poke_players[player]['player_name']
        if gym in gyms_visited:
            print(f'{player_name} has visited {gym}.')

CHALISSE has visited reddit.com.
CHALISSE has visited amazon.com.
kevin has visited ebay.com.
kevin has visited quora.com.


### B) Computational Complexity

How many times did that loop run? If you have N gyms and also M players, how many times would it run as a function of N and M? 

(You can write your answer as Markdown text.)

$N \text{ gyms} \times M \text{ players} = N \times M$

In [22]:
len(gyms) * len(poke_players)

20

10 gyms x 2 players = 20 iterations of the inner loop
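
To confirm the count empirically, you can add a counter to the inner loop. This is a sketch on made-up miniature data (3 gyms, 2 players), not the lab's actual dictionaries.

```python
demo_gyms = ['reddit.com', 'amazon.com', 'twitter.com']      # N = 3
demo_players = {1: ['reddit.com'], 2: ['twitter.com']}       # M = 2

checks = 0
for gym in demo_gyms:
    for player_id in demo_players:
        checks += 1          # one membership test per (gym, player) pair

print(checks)                                    # 6
print(len(demo_gyms) * len(demo_players))        # 6
```

The inner loop body runs once per (gym, player) pair regardless of how many gyms each player has actually visited.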

## 6. Calculate player "power".

---

Define a function that will calculate a player's "power". Player power is defined as the sum of the base statistics of all of their pokemon.

$$
\text{player power} = \sum_{i = 1}^{n} \left( \text{attack}_i + \text{defense}_i + \text{special attack}_i + \text{special defense}_i \right)
$$

Where $i$ is an individual pokemon in a player's `player_pokemon`. ($\sum$ just means sum, so you're just adding up all the attributes listed above for all the pokemon in the player's `player_pokemon`).

Your function should:

*  Accept a `poke_players` dictionary and a `player_id` as arguments.
*  For the specified player_id, look up that player's pokemon.
*  Find and aggregate the attack, defense, special attack, and special defense values for each of the player's pokemon.
*  Print "{player_name}'s power is {player_power}.", where the `player_power` is the sum of the base statistics for all of their pokemon.
*  Return the player's power value.

Check your work by looping through all players in your `poke_players` dict.

In [23]:
def get_power(player_id, player_dict = poke_players):
    # Read from the dictionary passed in (player_dict), not the global poke_players.
    player_pokemon_group = player_dict[player_id]['player_pokemon']
    player_power = 0
    for poke_id in player_pokemon_group:
        player_power += player_pokemon_group[poke_id]['attack']
        player_power += player_pokemon_group[poke_id]['defense']
        player_power += player_pokemon_group[poke_id]['special_attack']
        player_power += player_pokemon_group[poke_id]['special_defense']
    player_name = player_dict[player_id]['player_name']
    print(f"{player_name}'s power is {player_power}.")
    return player_power

In [24]:
for player_id in poke_players:
    get_power(player_id)

kevin's power is 399.
CHALISSE's power is 835.
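
The first result can be checked by hand. Player 1 holds only squirtle, so the power is just the sum of that pokemon's four stats. The sketch below copies the stats from the output shown earlier; the names `squirtle` and `power_stats` are only for this demo.

```python
# Stats copied from player 1's squirtle in the earlier output.
squirtle = {'attack': 91, 'defense': 59,
            'special_attack': 103, 'special_defense': 146}

power_stats = ('attack', 'defense', 'special_attack', 'special_defense')
squirtle_power = sum(squirtle[stat] for stat in power_stats)
print(squirtle_power)   # 399
```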


## 7. Load a pokedex file containing all the pokemon

---

### Load data using the `with open()` method.

While you were putting together the prototype code, your colleagues were preparing a dataset of Pokemon and their attributes. (This was a rush job, so they may have picked some crazy values for some of them.) Your task is to load the data into a list of lists so you can manipulate it.

* The `type` of the data should be a `list`
  * The `type` of each element in that list should be a `list`
    * The `type` of each element in the sub-list should be `str` or `float`.

The code provided loads the data into one looooong `str`. To get it into the correct format:
* Use the string `.replace()` method to remove `"`. 
* Use the string `.split()` method to create a new row for each line. New lines are denoted with a `'\n'`.
* Use `.split()` again on each line, splitting on commas to separate your individual values.
* Iterate through your data. Use `try/except` to cast numeric data as type `float`. 

Your end result is effectively a matrix. Each list $i$ in the outer list is a row, and the $j$th elements of all the lists together form the $j$th column, which represents a data attribute. The first three lists in your pokedex list should look like this:

    ['PokedexNumber', 'Name', 'Type', 'Total', 'HP', 'Attack', 'Defense', 'SpecialAttack', 'SpecialDefense', 'Speed']
    [1.0, 'Bulbasaur', 'GrassPoison', 318.0, 45.0, 49.0, 49.0, 65.0, 65.0, 45.0]
    [2.0, 'Ivysaur', 'GrassPoison', 405.0, 60.0, 62.0, 63.0, 80.0, 80.0, 60.0]
    
In the above example, `new_pd[1][3]` would return the value `318.0`, which sits at index 3 (the 4th element) of the row at index 1 (the 2nd row), because Python is 0-indexed.
    
**WARNING:** Don't print or display your entire new pokedex! Viewing that many entries will clog up your notebook and make it difficult to read.
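
The parsing steps can be tried on a small string before touching the real file. The sketch below assumes a raw string shaped like the example rows (which fields are quoted in the actual file is a guess), trims it to four columns for brevity, and uses a hypothetical helper named `to_number`.

```python
# Assumed sample text: three lines in the style of the example rows.
sample_raw = ('"PokedexNumber","Name","Type","Total"\n'
              '1,"Bulbasaur","GrassPoison",318\n'
              '2,"Ivysaur","GrassPoison",405')

def to_number(text):
    """Return text as a float when possible, otherwise unchanged."""
    try:
        return float(text)
    except ValueError:
        return text

# Strip quotes, split into lines, split each line on commas, cast numbers.
sample_pd = [[to_number(value) for value in line.split(',')]
             for line in sample_raw.replace('"', '').split('\n')]

print(sample_pd[1])      # [1.0, 'Bulbasaur', 'GrassPoison', 318.0]
print(sample_pd[1][3])   # 318.0
```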

In [25]:
# Code to read in pokedex info
raw_pd = ''
pokedex_file = 'pokedex_basic.csv'
with open(pokedex_file, 'r') as f:
    raw_pd = f.read()
    
# the pokedex string is assigned to the raw_pd variable


FileNotFoundError: [Errno 2] No such file or directory: 'pokedex_basic.csv'

In [None]:
new_pd = []
replaced = raw_pd.replace('"', '')
for line in replaced.split('\n'):
    if not line:
        continue  # skip blank lines, such as one left by a trailing newline
    row = []
    for value in line.split(','):
        try:
            row.append(float(value))
        except ValueError:
            row.append(value)  # not numeric, so keep it as a string
    new_pd.append(row)

To preview the top 3 rows of your list of lists, use the code below:

In [None]:
new_pd[:3]

## 8. Changing Types

---

### A) Convert your data into a dictionary.

Your `dict` should...
* use the `PokedexNumber` of each pokemon as its `keys`
* have `values` containing data for each pokemon in a dictionary form, just like our `pokedex` from before
  * Keep in mind, the `keys` here are a little bit different than the original `pokedex`.
  * Be careful of the header, you do not want to include that as a pokemon.
* **WARNING:** Don't display your entire `pokedex` when turning this in! Viewing that many entries will clog up your notebook and make it difficult to read. If you'd like to visualize your `pokedex`, index with a few of its `keys`.

Your `new_pd_dict` should be organized like...

```python
{1.0: {'Attack': 49.0,
  'Defense': 49.0,
  'HP': 45.0,
  'Name': 'Bulbasaur',
  'PokedexNumber': 1.0,
  'SpecialAttack': 65.0,
  'SpecialDefense': 65.0,
  'Speed': 45.0,
  'Total': 318.0,
  'Type': 'GrassPoison'},
 2.0: {'Attack': 62.0,
  'Defense': 63.0,
  'HP': 60.0,
  'Name': 'Ivysaur',
```

In [None]:
keys = new_pd[0]
print(keys)


In [None]:
# # jacob's code w dict comprehension
# ne = {}
# for i in range(1, len(new_pd)):
#     ne[i] = {k:v for k,v in zip(new_pd[0], new_pd[i])}
# ne

In [None]:
new_pd_dict = {}
def add_pokemon(index):
    new_pokemon = {
        'Attack': new_pd[index][5],
        'Defense': new_pd[index][6],
        'HP': new_pd[index][4],
        'Name': new_pd[index][1],
        'PokedexNumber': new_pd[index][0],
        'SpecialAttack': new_pd[index][7],
        'SpecialDefense': new_pd[index][8],
        'Speed': new_pd[index][9],
        'Total': new_pd[index][3],
        'Type': new_pd[index][2]         
    }
    # Keying on PokedexNumber (new_pd[index][0]) left out 80 "redundant" pokemon
    # (those that shared a PokedexNumber with other pokemon), so key on the row index.
    key_num = index
    new_pd_dict[key_num] = new_pokemon


for i in range(1, len(new_pd)):
    add_pokemon(i)

print(len(new_pd))
print(len(new_pd_dict))

Your new pokedex is oriented by index, meaning that each entry is a row: the key of each entry becomes the row index (the `PokedexNumber` in the layout described above), the keys of each Pokemon's dictionary become the column headers, and their values become the row values for that Pokemon. If you've set this up correctly (including naming your dict **`new_pd_dict`**), the following code should display the first 10 rows of your Pokedex formatted as a Pandas DataFrame.

In [None]:
import pandas as pd
pd.DataFrame(new_pd_dict).T.head(10)

### (OPTIONAL) B) Orient your `new_pd_dict` by columns.

Your goal in this exercise is to orient the pokedex dict by columns, meaning:

* The keys of the dictionary are the column names
* The values of the dictionary are a **column vector** (this can be a list or a tuple) of that feature.
* **BONUS:** Do this with list and/or dictionary comprehensions only

You may find it's easier to work from your `new_pd` list of lists rather than your `new_pd_dict`.
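
As a hint at one possible approach (not the only one), `zip(*rows)` transposes a list of rows into columns. The sketch below works on a made-up three-column excerpt named `sample_rows` rather than the full `new_pd`.

```python
# Made-up excerpt in the same shape as new_pd: a header row, then data rows.
sample_rows = [['Name', 'Attack', 'Defense'],
               ['Bulbasaur', 49.0, 49.0],
               ['Ivysaur', 62.0, 63.0]]

header, *records = sample_rows
# zip(*records) transposes the rows, yielding one tuple per column.
by_column = {name: list(column) for name, column in zip(header, zip(*records))}

print(by_column['Attack'])   # [49.0, 62.0]
```

This only works cleanly when every row has the same number of entries, since `zip` stops at the shortest row.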

In [None]:
#pd.DataFrame(your_dict_name).head(10)

You can pass this data through to a pandas DataFrame as well, using the example code below:

`pd.DataFrame(your_dict_name).head(10)`

## (OPTIONAL) 9. Write a function to filter your pokedex!
---

Your goal in this exercise is to search your pokedex based on your own defined criteria! Build a function that...

* Takes arguments of: 
  * a pokedex dict (can be either the row or column oriented dict, pick the one of your choice!)
  * a `filter_options` dict (described below)
* For parameters in your `filter_options` dict, your function should return:
  * pokemon that are >= (greater than or equal to) the value you passed in your `filter_options` for that field for continuous values
  * pokemon of that name or type for string values (equal)
* Return a list of the individual pokemon dictionaries that meet your search criteria!

Example:

```python

# Only filter based on parameters passed
filter_options = {
    'Attack':   25,
    'Defense':  30,
    'Type':     'Electric'
}

# Return records with attack >= 25, defense >= 30, and type == "Electric"
# Also anticipate that other parameters can be passed, such as "SpecialAttack", "Speed", etc.
filtered_pokedex(pokedex_data, filter_options)

# Example output:
[{'Attack': 30.0,
  'Defense': 50.0,
  'HP': 40.0,
  'Name': 'Voltorb',
  'SpecialAttack': 55.0,
  'SpecialDefense': 55.0,
  'Speed': 100.0,
  'Total': 330.0,
  'Type': 'Electric'},
  {'Attack': 30.0,
  'Defense': 33.0,
  'HP': 32.0,
  'Name': 'Pikachu',
  'SpecialAttack': 55.0,
  'SpecialDefense': 55.0,
  'Speed': 100.0,
  'Total': 330.0,
  'Type': 'Electric'},
  ... etc
  ]

```
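
One way to approach this, sketched for a row-oriented dict: treat string filters as exact matches and numeric filters as minimums, and keep a pokemon only if it passes every filter. The data in `demo_pokedex` is a made-up miniature pokedex (the entry named "Weakling" is invented to show a rejection), and this is one possible design rather than the expected solution.

```python
def filtered_pokedex(pokedex_data, filter_options):
    """Return the pokemon dicts that satisfy every entry in filter_options."""
    matches = []
    for pokemon in pokedex_data.values():
        keep = True
        for field, wanted in filter_options.items():
            value = pokemon.get(field)
            if isinstance(wanted, str):
                passed = value == wanted                 # strings must match exactly
            else:
                # numeric filters are minimums; non-numeric values never pass
                passed = isinstance(value, (int, float)) and value >= wanted
            if not passed:
                keep = False
                break
        if keep:
            matches.append(pokemon)
    return matches

# Made-up miniature pokedex for the demo.
demo_pokedex = {
    1: {'Name': 'Voltorb', 'Type': 'Electric', 'Attack': 30.0, 'Defense': 50.0},
    2: {'Name': 'Bulbasaur', 'Type': 'GrassPoison', 'Attack': 49.0, 'Defense': 49.0},
    3: {'Name': 'Weakling', 'Type': 'Electric', 'Attack': 10.0, 'Defense': 50.0},
}
result = filtered_pokedex(demo_pokedex,
                          {'Attack': 25, 'Defense': 30, 'Type': 'Electric'})
print([pokemon['Name'] for pokemon in result])   # ['Voltorb']
```

A filter on a field that a pokemon lacks simply excludes that pokemon, because `.get` returns `None` for missing keys.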



In [None]:
# Your code here