# Lesson 6 Dictionaries and DataFrames

## This lesson provides a brief introduction to two basic ways of organizing information in Python.

## **Dictionaries** are a built-in data type in Python.

## **DataFrames** are a part of the `pandas` module.  They are very similar to dictionaries, but have become widely used because of the ease of importing and exporting data.  

In [None]:
import numpy as np 
from matplotlib import pyplot as plt

In [None]:
import pandas as pd

## 6.1 Dictionaries

### Let's recall how lists work.  

In [None]:
my_kawhi_list = ['Kawhi','Leonard','June',29,1991,79.0]
points = 26.0
rebounds = 6.6
assists = 5.0
steals = 1.7
blocks = 0.4
# Let's add these numbers to the list
my_kawhi_list.append(points)
my_kawhi_list.append(rebounds)
my_kawhi_list.append(assists)
my_kawhi_list.append(steals)
my_kawhi_list.append(blocks)

### What do you see as potential problems with organizing my data on NBA players in this manner?
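
### One possible answer, as a rough sketch: values in a list can only be retrieved by position, and positions shift when the list changes. The `'Forward'` entry is a made-up extra field used only for illustration.

```python
# The same list as above, rebuilt so this cell runs on its own.
my_kawhi_list = ['Kawhi', 'Leonard', 'June', 29, 1991, 79.0,
                 26.0, 6.6, 5.0, 1.7, 0.4]

# Nothing in the list says what each position means, so we have to
# remember that index 5 is the height and index 6 is the points.
height = my_kawhi_list[5]
points_per_game = my_kawhi_list[6]
print(height, points_per_game)

# Inserting a new item shifts everything after it, silently breaking
# any code that relied on the old positions.
my_kawhi_list.insert(2, 'Forward')   # hypothetical extra field
print(my_kawhi_list[5])              # no longer the height
```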

### 6.1.1 Creating a Dictionary 

In [None]:
Player = dict() #create an empty dictionary 
Player['First Name'] = 'Kahwi'  # This is a 'key': value pair. The key is 'First Name' and the value is 'Kahwi' (misspelled on purpose; it is corrected further down)
Player['Last Name'] = 'Leonard'
Player['Height'] = 79.0 # The value can be a float, integer, string, list, array, or even another dictionary.  

In [None]:
#You could also have created the dictionary this way but I disapprove. 
#Player = {'First Name':'Kawhi','Last Name':'Leonard','Height':79.0}
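
### Once a dictionary exists, values are looked up by key. The sketch below rebuilds a small `Player` dictionary so it runs on its own; the `'Weight'` key is deliberately absent, to show what happens when a key is missing.

```python
Player = dict()
Player['First Name'] = 'Kawhi'
Player['Last Name'] = 'Leonard'
Player['Height'] = 79.0

# Look up a value with square brackets and the key.
print(Player['Last Name'])

# Asking for a key that does not exist raises a KeyError ...
try:
    Player['Weight']
except KeyError as err:
    print('No such key:', err)

# ... while get() returns a default value instead of raising.
weight = Player.get('Weight', None)
print(weight)
```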


### We can add, modify, and remove dictionary entries.

In [None]:
# Adding, modifying, and removing dictionary entries
points = 26.0
rebounds = 6.6
assists = 5.0
steals = 1.7
blocks = 0.4
#Add items to the dictionary
Player['Points'] = points 
Player['Rebounds'] = rebounds
stats = np.array([points,rebounds,assists,steals,blocks])
Player['Stats'] = stats
#Remove items from the dictionary 
del Player['Height']
#modify items in dictionary 
Player['First Name'] = 'Kawhi'
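
### Deleting a key that is not there raises a `KeyError`. One way to guard against that, sketched here on a freshly built `Player` dictionary, is to test with `in` or to use `pop()` with a default value.

```python
Player = dict()
Player['First Name'] = 'Kahwi'
Player['Last Name'] = 'Leonard'
Player['Height'] = 79.0

# Assigning to an existing key overwrites the value; a new key adds an entry.
Player['First Name'] = 'Kawhi'   # modify
Player['Points'] = 26.0          # add

# 'in' tests whether a key is present, so we can delete safely.
if 'Height' in Player:
    del Player['Height']

# pop() removes a key and hands back its value; the second argument is
# returned if the key is missing, so no KeyError is raised.
removed = Player.pop('Height', 'already gone')
print(removed)
print(len(Player))
```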

### So why do I love dictionaries? Because I don't have to remember what's in them.

In [None]:
Player.keys()

### Everything contained in a dictionary can be made very explicit by the appropriate choice of keys. 
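
### To see the keys together with their values, you can loop over `items()`. This is a minimal sketch on a rebuilt `Player` dictionary, not the only way to inspect one.

```python
import numpy as np

Player = dict()
Player['First Name'] = 'Kawhi'
Player['Last Name'] = 'Leonard'
Player['Points'] = 26.0
Player['Stats'] = np.array([26.0, 6.6, 5.0, 1.7, 0.4])

# keys() lists the labels; items() yields each key with its value.
for key, value in Player.items():
    print(key, '->', value)

# Keys come back in the order they were added.
key_list = list(Player.keys())
print(key_list)
```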

### The information in a dictionary retains its data type 

In [None]:
kawhi_stats = Player['Stats']
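
### A quick way to confirm this is to check the type of what comes back. The sketch below rebuilds a small `Player` dictionary so that it runs on its own.

```python
import numpy as np

Player = dict()
Player['First Name'] = 'Kawhi'
Player['Stats'] = np.array([26.0, 6.6, 5.0, 1.7, 0.4])

kawhi_stats = Player['Stats']
print(type(Player['First Name']))   # the string is still a string
print(type(kawhi_stats))            # the array is still a numpy array

# Because the value is still an array, array operations work directly.
print(kawhi_stats.max())
print(kawhi_stats * 2)
```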

## 6.2 Pandas DataFrames

### `pandas` is a file I/O and data organization module that is widely used in Python programming.

### One of the best things about `pandas` is that it can read two of the most common data file types:
* csv  - comma separated values
* xls or xlsx - Microsoft Excel files.  
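
### To try `read_csv` without any file on disk, here is a small sketch that reads a made-up CSV from a string. The column names and values are invented for illustration.

```python
import io
import pandas as pd

# A tiny made-up CSV, held in memory so this cell needs no file on disk.
csv_text = """name,price,in_stock
apple,0.50,1
pear,0.75,0
"""

# read_csv accepts a file path or any file-like object.
small_table = pd.read_csv(io.StringIO(csv_text))
print(small_table)

# For Excel files the analogous call is pd.read_excel('some_file.xlsx'),
# which needs an extra engine package such as openpyxl installed.
```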

### In this example I am going to load a csv file, *candy-data.csv*, which contains data from a well-known study of candy preferences.

In [None]:
candy_data = pd.read_csv('candy-data.csv')

### Like a Dictionary, a DataFrame has keys  

In [None]:
print(candy_data.keys())
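
### The resemblance to dictionaries goes further: a DataFrame can be built directly from one. The sketch below uses made-up candy names and numbers, not values from *candy-data.csv*.

```python
import pandas as pd

# Hypothetical values, only to illustrate the structure.
candy_dict = dict()
candy_dict['name'] = ['Candy A', 'Candy B', 'Candy C']
candy_dict['chocolate'] = [1, 0, 1]
candy_dict['rating'] = [66.9, 45.2, 81.6]

small_candy = pd.DataFrame(candy_dict)

# keys() on a DataFrame returns the column labels, the same as .columns.
print(small_candy.keys())
print(list(small_candy.columns) == list(candy_dict.keys()))

# head() shows the first few rows; shape gives (rows, columns).
print(small_candy.head())
print(small_candy.shape)
```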

### Let's examine some of the entries.

In [None]:
candyname = candy_data['competitorname']
is_choco = candy_data['chocolate']
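win_percent = candy_data['winpercent']  # assuming this is the FiveThirtyEight candy file, winpercent is already on a 0-100 scale, so no factor of 100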
sugar_percent = 100*candy_data['sugarpercent']

### The datatype here is a **Series**.  Let's see what we can do with it. 

In [None]:
meansugar = np.mean(sugar_percent)
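
### A Series also carries its own methods for common statistics. This sketch uses a made-up Series in place of the candy column.

```python
import numpy as np
import pandas as pd

# Made-up numbers standing in for one column of a DataFrame.
sugar = pd.Series([10.0, 50.0, 90.0], name='sugarpercent')

print(type(sugar))
print(np.mean(sugar))   # numpy function applied to a Series
print(sugar.mean())     # the equivalent Series method
print(sugar.max(), sugar.min())
```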

### Looks like we can apply numpy functions to it. But I have sometimes run into trouble with a Series, especially when plotting. You can always do this:

In [None]:
sugar_percent = np.array(sugar_percent)

### `array` is your friend!
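
### `pandas` also offers its own conversion method, `to_numpy()`. The sketch below, on made-up values, suggests that both routes give the same array.

```python
import numpy as np
import pandas as pd

sugar = pd.Series([10.0, 50.0, 90.0])   # made-up values

as_array = np.array(sugar)      # the approach used above
also_array = sugar.to_numpy()   # pandas' own conversion method

print(type(as_array), type(also_array))
print(np.array_equal(as_array, also_array))
```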

### Let's end by making a plot!

In [None]:
fig = plt.figure(figsize = (4,4))   #creates a blank canvas 
ax = fig.add_axes([0,0,1,1]) #creates an axes - (starting pt x, starting pt y, fractional size x, fractional size y)
ax.plot(sugar_percent,win_percent,'ro')
ax.set_xlabel('Sugar Percent')
ax.set_ylabel('Win Percent')
#ax.set_xlim([0,100])
#ax.set_ylim([0,100])
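
### A column of 0s and 1s such as `chocolate` can be used to split the points into two groups. This is only a sketch with made-up arrays standing in for the candy columns; the colors and labels are arbitrary choices.

```python
import matplotlib
matplotlib.use('Agg')   # non-interactive backend; drop this line in a notebook
import numpy as np
from matplotlib import pyplot as plt

# Made-up values standing in for the candy columns.
sugar = np.array([10.0, 50.0, 90.0, 30.0])
wins = np.array([40.0, 65.0, 55.0, 80.0])
is_choco = np.array([0, 1, 0, 1])

choco = is_choco == 1   # boolean mask: True where the candy has chocolate

fig = plt.figure(figsize=(4, 4))
ax = fig.add_axes([0, 0, 1, 1])
ax.plot(sugar[choco], wins[choco], 'ko', label='Chocolate')
ax.plot(sugar[~choco], wins[~choco], 'ro', label='No chocolate')
ax.set_xlabel('Sugar Percent')
ax.set_ylabel('Win Percent')
ax.legend()
```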