[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/nuitrcs/NextStepsInPython/blob/master/pandasLoc/pandas.ipynb)

### <br>*If you are using Google Colab, first run the code cell below. You can run a cell by clicking in the cell and clicking on the arrow that appears on the left side of the cell. DO NOT run this cell if you are not using Google Colab.*

In [None]:
!wget https://raw.githubusercontent.com/nuitrcs/NextStepsInPython/master/pandasLoc/wnba-team-elo-ratings.csv

# <br><br>pandas loc, iloc, and at

<br>This lesson is not an introduction to pandas, the Python package for working with data tables (which pandas calls DataFrames). I am assuming you have some familiarity with basic pandas, either through our Python Fundamentals bootcamp or other experience.

<br>*If you are new to Jupyter notebooks, each gray cell is a piece of code. To run the code, click inside the gray cell and either click the triangle button up top, or press shift+return (or shift+enter) on your keyboard. If you are using Google Colab, shift+return should also work.*

#### <br>Import pandas

We import pandas under the short alias `pd`, which is the common convention.

In [1]:
import pandas as pd

#### <br><br>Loading in our sample dataset

Our sample dataset was taken from FiveThirtyEight. It contains game data for WNBA games since 1997.

In [2]:
df = pd.read_csv("wnba-team-elo-ratings.csv")

<br>Take a minute to examine the dataset.

In [3]:
df.head()

Unnamed: 0,season,date,team1,team2,name1,name2,neutral,playoff,score1,score2,elo1_pre,elo2_pre,elo1_post,elo2_post,prob1,is_home1
0,2019,10/10/2019,WAS,CON,Washington Mystics,Connecticut Sun,0,1,89,78,1684,1634,1692,1627,0.718,1
1,2019,10/10/2019,CON,WAS,Connecticut Sun,Washington Mystics,0,1,78,89,1634,1684,1627,1692,0.282,0
2,2019,10/8/2019,WAS,CON,Washington Mystics,Connecticut Sun,0,1,86,90,1693,1626,1684,1634,0.476,0
3,2019,10/8/2019,CON,WAS,Connecticut Sun,Washington Mystics,0,1,90,86,1626,1693,1634,1684,0.524,1
4,2019,10/6/2019,WAS,CON,Washington Mystics,Connecticut Sun,0,1,94,81,1671,1648,1693,1626,0.399,0


<br>How many rows are in the dataset?

In [4]:
len(df)

10488

## <br><br>Subsampling the data

#### Today we will be talking about ways to subsample a pandas DataFrame.

<br><br>First, let's review how we can subsample certain columns in our DataFrame. We pass a list inside square brackets:

In [5]:
df[["team1", "team2"]]

Unnamed: 0,team1,team2
0,WAS,CON
1,CON,WAS
2,WAS,CON
3,CON,WAS
4,WAS,CON
...,...,...
10483,SAC,LVA
10484,NYL,LAS
10485,LAS,NYL
10486,LVA,SAC


<br>If we only want to return one column, we can ask for a pandas **Series** object by passing only a column name inside square brackets...

In [6]:
df["team1"]

0        WAS
1        CON
2        WAS
3        CON
4        WAS
        ... 
10483    SAC
10484    NYL
10485    LAS
10486    LVA
10487    CLE
Name: team1, Length: 10488, dtype: object

<br>Or we can ask for a pandas **DataFrame** object by passing a list.

In [7]:
df[["team1"]]

Unnamed: 0,team1
0,WAS
1,CON
2,WAS
3,CON
4,WAS
...,...
10483,SAC
10484,NYL
10485,LAS
10486,LVA


<br>**A Series is a one-dimensional object (one column of values plus its index), while a DataFrame is a two-dimensional object (rows and columns). A Series can easily be turned into a list, while a DataFrame keeps its table layout even when it has only one column, so they both have their uses.**
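
<br>To make the difference concrete, here is a minimal sketch on a small made-up DataFrame. The name `toy` and its values are assumptions for illustration only and are not read from the WNBA file.

```python
import pandas as pd

# Hypothetical three-row table, built by hand for illustration.
toy = pd.DataFrame({"team1": ["WAS", "CON", "CHI"], "score1": [89, 78, 92]})

one_col_series = toy["team1"]    # single brackets -> Series (one-dimensional)
one_col_frame = toy[["team1"]]   # list inside brackets -> DataFrame (two-dimensional)

print(type(one_col_series).__name__, one_col_series.shape)
print(type(one_col_frame).__name__, one_col_frame.shape)

# A Series converts directly to a plain Python list.
team_list = one_col_series.tolist()
print(team_list)
```

The Series has a shape with one number (its length), while the one-column DataFrame has a shape with two numbers (rows and columns).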

<br><br>We can also pass a conditional expression (which produces a boolean Series) inside the square brackets to return only the rows where the condition is true:

In [8]:
df[df["team1"] == "CHI"]

Unnamed: 0,season,date,team1,team2,name1,name2,neutral,playoff,score1,score2,elo1_pre,elo2_pre,elo1_post,elo2_post,prob1,is_home1
25,2019,9/15/2019,CHI,LVA,Chicago Sky,Las Vegas Aces,0,1,92,93,1564,1541,1559,1545,0.399,0
28,2019,9/11/2019,CHI,PHO,Chicago Sky,Phoenix Mercury,0,1,105,76,1553,1450,1564,1440,0.788,1
36,2019,9/8/2019,CHI,WAS,Chicago Sky,Washington Mystics,0,0,86,100,1559,1706,1553,1712,0.214,0
47,2019,9/6/2019,CHI,CON,Chicago Sky,Connecticut Sun,0,0,109,104,1543,1615,1559,1599,0.295,0
65,2019,9/1/2019,CHI,PHO,Chicago Sky,Phoenix Mercury,0,0,105,78,1522,1527,1543,1506,0.607,1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
6371,2006,6/2/2006,CHI,HOU,Chicago Sky,Houston Comets,0,0,60,71,1286,1575,1284,1577,0.107,0
6387,2006,5/30/2006,CHI,LAS,Chicago Sky,Los Angeles Sparks,0,0,55,64,1294,1500,1286,1508,0.326,1
6393,2006,5/26/2006,CHI,IND,Chicago Sky,Indiana Fever,0,0,60,75,1304,1548,1294,1557,0.280,1
6415,2006,5/23/2006,CHI,SAC,Chicago Sky,Sacramento Monarchs,0,0,63,76,1310,1603,1304,1610,0.227,1


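<br>However, using this method, we cannot refer to individual rows by name or pull up individual cells in our DataFrame. With plain square brackets, a single value such as `25` is looked up as a column name, so this raises an error: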

In [9]:
df[25]

KeyError: 25

### <br><br><br>pandas loc

`loc` allows us to call up certain rows and columns. The syntax is:

#### `df.loc[row, column]`

#### `df.loc[list of rows, list of columns]`

#### `df.loc[range of rows, range of columns]`

`loc` can take a row, a list of rows, or a range of rows, followed by a comma, and then a column, list of columns, or range of columns. <br><br>If you want all the rows or all the columns, you can use a `:`. <br><br>The rows that we refer to here are the row names (index names) that are found in bold on the far left of our DataFrame.

<br><br>Let's get a reminder of what our DataFrame looks like:

In [10]:
df.head()

Unnamed: 0,season,date,team1,team2,name1,name2,neutral,playoff,score1,score2,elo1_pre,elo2_pre,elo1_post,elo2_post,prob1,is_home1
0,2019,10/10/2019,WAS,CON,Washington Mystics,Connecticut Sun,0,1,89,78,1684,1634,1692,1627,0.718,1
1,2019,10/10/2019,CON,WAS,Connecticut Sun,Washington Mystics,0,1,78,89,1634,1684,1627,1692,0.282,0
2,2019,10/8/2019,WAS,CON,Washington Mystics,Connecticut Sun,0,1,86,90,1693,1626,1684,1634,0.476,0
3,2019,10/8/2019,CON,WAS,Connecticut Sun,Washington Mystics,0,1,90,86,1626,1693,1634,1684,0.524,1
4,2019,10/6/2019,WAS,CON,Washington Mystics,Connecticut Sun,0,1,94,81,1671,1648,1693,1626,0.399,0


<br><br>To reference one cell:

In [11]:
df.loc[25, "date"]

'9/15/2019'

<br><br>All rows for one column:

In [13]:
df.loc[:, ["team1"]]

Unnamed: 0,team1
0,WAS
1,CON
2,WAS
3,CON
4,WAS
...,...
10483,SAC
10484,NYL
10485,LAS
10486,LVA


<br><br>All columns for one row:

In [14]:
df.loc[12, :]

season                     2019
date                  9/22/2019
team1                       WAS
team2                       LVA
name1        Washington Mystics
name2            Las Vegas Aces
neutral                       0
playoff                       1
score1                       75
score2                       92
elo1_pre                   1717
elo2_pre                   1540
elo1_post                  1687
elo2_post                  1570
prob1                     0.669
is_home1                      0
Name: 12, dtype: object

### <br><br>Exercise 1

The very first game played by the Chicago Sky is in the row with index 6427.

Write code to return all columns in that row:

In [15]:
df.loc[6427, :]

season                  2006
date               5/20/2006
team1                    CHI
team2                    CHA
name1            Chicago Sky
name2        Charlotte Sting
neutral                    0
playoff                    0
score1                    83
score2                    82
elo1_pre                1300
elo2_pre                1428
elo1_post               1310
elo2_post               1418
prob1                  0.232
is_home1                   0
Name: 6427, dtype: object

Did the Chicago Sky play their very first game at home or away? Write code to return the data in the column "is_home1" for that row:

In [16]:
df.loc[6427, "is_home1"]

0

*(1 is True and 0 is False)* 

#### <br><br>Sampling a range

This code will return all columns for the rows 0 through 10.

In [17]:
df.loc[0:10, :]

Unnamed: 0,season,date,team1,team2,name1,name2,neutral,playoff,score1,score2,elo1_pre,elo2_pre,elo1_post,elo2_post,prob1,is_home1
0,2019,10/10/2019,WAS,CON,Washington Mystics,Connecticut Sun,0,1,89,78,1684,1634,1692,1627,0.718,1
1,2019,10/10/2019,CON,WAS,Connecticut Sun,Washington Mystics,0,1,78,89,1634,1684,1627,1692,0.282,0
2,2019,10/8/2019,WAS,CON,Washington Mystics,Connecticut Sun,0,1,86,90,1693,1626,1684,1634,0.476,0
3,2019,10/8/2019,CON,WAS,Connecticut Sun,Washington Mystics,0,1,90,86,1626,1693,1634,1684,0.524,1
4,2019,10/6/2019,WAS,CON,Washington Mystics,Connecticut Sun,0,1,94,81,1671,1648,1693,1626,0.399,0
5,2019,10/6/2019,CON,WAS,Connecticut Sun,Washington Mystics,0,1,81,94,1648,1671,1626,1693,0.601,1
6,2019,10/1/2019,WAS,CON,Washington Mystics,Connecticut Sun,0,1,87,99,1700,1618,1671,1648,0.763,1
7,2019,10/1/2019,CON,WAS,Connecticut Sun,Washington Mystics,0,1,99,87,1618,1700,1648,1671,0.237,0
8,2019,9/29/2019,WAS,CON,Washington Mystics,Connecticut Sun,0,1,95,86,1694,1624,1700,1618,0.747,1
9,2019,9/29/2019,CON,WAS,Connecticut Sun,Washington Mystics,0,1,86,95,1624,1694,1618,1700,0.253,0


<br>**Unlike Python slicing, which leaves out the end position, a `loc` slice refers to rows by their index names and includes both endpoints, so row 10 is included.**
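
<br>As a quick check of this behavior, the sketch below compares an ordinary Python list slice with a `loc` slice. The DataFrame `toy` and the list `letters` are made up for illustration.

```python
import pandas as pd

# Hypothetical five-row table with the default index labels 0..4.
toy = pd.DataFrame({"team1": ["WAS", "CON", "WAS", "CON", "WAS"]})
letters = ["a", "b", "c", "d", "e"]

python_slice = letters[0:3]   # positions 0, 1, 2 (the end is excluded)
loc_slice = toy.loc[0:3, :]   # labels 0, 1, 2, 3 (the end is included)

print(len(python_slice), len(loc_slice))
```

The same `0:3` gives three items from the list but four rows from `loc`.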

<br><br>We can also ask for a range of columns, from left to right:

In [18]:
df.loc[0:10, "season":"name2"]

Unnamed: 0,season,date,team1,team2,name1,name2
0,2019,10/10/2019,WAS,CON,Washington Mystics,Connecticut Sun
1,2019,10/10/2019,CON,WAS,Connecticut Sun,Washington Mystics
2,2019,10/8/2019,WAS,CON,Washington Mystics,Connecticut Sun
3,2019,10/8/2019,CON,WAS,Connecticut Sun,Washington Mystics
4,2019,10/6/2019,WAS,CON,Washington Mystics,Connecticut Sun
5,2019,10/6/2019,CON,WAS,Connecticut Sun,Washington Mystics
6,2019,10/1/2019,WAS,CON,Washington Mystics,Connecticut Sun
7,2019,10/1/2019,CON,WAS,Connecticut Sun,Washington Mystics
8,2019,9/29/2019,WAS,CON,Washington Mystics,Connecticut Sun
9,2019,9/29/2019,CON,WAS,Connecticut Sun,Washington Mystics


<br><br>Again, `loc` uses the column and row names, not their positions, so this will not work:

In [19]:
df.loc[0:10, 0:4]

TypeError: cannot do slice indexing on <class 'pandas.core.indexes.base.Index'> with these indexers [0] of <class 'int'>

#### <br><br>Sampling a list

We can also pass a list of rows or columns:

In [20]:
df.loc[[0, 10, 8], ["team1", "score1", "team2", "score2"]]

Unnamed: 0,team1,score1,team2,score2
0,WAS,89,CON,78
10,WAS,94,LVA,90
8,WAS,95,CON,86


*Notice how the returned DataFrame used the same order given in the lists.*

### <br><br>Exercise 2

Run the following cell to store the list of row indexes for the first 5 games played by the Chicago Sky:

In [21]:
first5 = [6427, 6415, 6393, 6387, 6371]

Write code to return the rows for the first 5 Chicago Sky games, and return only the columns "team2", "score1", "score2", and "is_home1":

In [22]:
df.loc[first5, ["team2", "score1", "score2", "is_home1"]]

Unnamed: 0,team2,score1,score2,is_home1
6427,CHA,83,82,0
6415,SAC,63,76,1
6393,IND,60,75,1
6387,LAS,55,64,1
6371,HOU,60,71,0


#### <br><br>`loc` with a conditional

You can use a conditional to filter rows. The conditional is written the same way as we would write it without using `loc`:

In [23]:
df.loc[df["team1"] == "CHI", :]

Unnamed: 0,season,date,team1,team2,name1,name2,neutral,playoff,score1,score2,elo1_pre,elo2_pre,elo1_post,elo2_post,prob1,is_home1
25,2019,9/15/2019,CHI,LVA,Chicago Sky,Las Vegas Aces,0,1,92,93,1564,1541,1559,1545,0.399,0
28,2019,9/11/2019,CHI,PHO,Chicago Sky,Phoenix Mercury,0,1,105,76,1553,1450,1564,1440,0.788,1
36,2019,9/8/2019,CHI,WAS,Chicago Sky,Washington Mystics,0,0,86,100,1559,1706,1553,1712,0.214,0
47,2019,9/6/2019,CHI,CON,Chicago Sky,Connecticut Sun,0,0,109,104,1543,1615,1559,1599,0.295,0
65,2019,9/1/2019,CHI,PHO,Chicago Sky,Phoenix Mercury,0,0,105,78,1522,1527,1543,1506,0.607,1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
6371,2006,6/2/2006,CHI,HOU,Chicago Sky,Houston Comets,0,0,60,71,1286,1575,1284,1577,0.107,0
6387,2006,5/30/2006,CHI,LAS,Chicago Sky,Los Angeles Sparks,0,0,55,64,1294,1500,1286,1508,0.326,1
6393,2006,5/26/2006,CHI,IND,Chicago Sky,Indiana Fever,0,0,60,75,1304,1548,1294,1557,0.280,1
6415,2006,5/23/2006,CHI,SAC,Chicago Sky,Sacramento Monarchs,0,0,63,76,1310,1603,1304,1610,0.227,1


<br><br>Here I use the same filter for the rows, but I only ask for three columns to be returned:

In [24]:
df.loc[df["team1"] == "CHI", ["team2", "score1", "score2"]]

Unnamed: 0,team2,score1,score2
25,LVA,92,93
28,PHO,105,76
36,WAS,86,100
47,CON,109,104
65,PHO,105,78
...,...,...,...
6371,HOU,60,71
6387,LAS,55,64
6393,IND,60,75
6415,SAC,63,76


### <br><br>Exercise 3

Write code to return all games played in the 2012 season. Only return the columns "date", "name1", and "name2".

In [25]:
df.loc[df["season"] == 2012, ["date", "name1", "name2"]]

Unnamed: 0,date,name1,name2
3096,10/21/2012,Indiana Fever,Minnesota Lynx
3097,10/21/2012,Minnesota Lynx,Indiana Fever
3098,10/19/2012,Indiana Fever,Minnesota Lynx
3099,10/19/2012,Minnesota Lynx,Indiana Fever
3100,10/17/2012,Minnesota Lynx,Indiana Fever
...,...,...,...
3537,5/19/2012,Chicago Sky,Washington Mystics
3538,5/19/2012,Washington Mystics,Chicago Sky
3539,5/19/2012,Tulsa Shock,San Antonio Silver Stars
3540,5/18/2012,Seattle Storm,Los Angeles Sparks


<br>**I've included a bonus section at the end of this notebook to remind you how to use multiple conditionals in pandas.**

### <br><br>pandas `iloc`

While `loc` selects by row and column names, `iloc` selects only by integer position (0 for the first row or column, 1 for the second, and so on).

Here, I'm asking for the top 10 rows and the first four columns:

In [26]:
df.iloc[0:10, 0:4]

Unnamed: 0,season,date,team1,team2
0,2019,10/10/2019,WAS,CON
1,2019,10/10/2019,CON,WAS
2,2019,10/8/2019,WAS,CON
3,2019,10/8/2019,CON,WAS
4,2019,10/6/2019,WAS,CON
5,2019,10/6/2019,CON,WAS
6,2019,10/1/2019,WAS,CON
7,2019,10/1/2019,CON,WAS
8,2019,9/29/2019,WAS,CON
9,2019,9/29/2019,CON,WAS


<br>**Notice that `iloc` uses Python indexing!** When we ask for rows 0:10, it returns rows 0 to 9. Also notice that the index (the bold number on the left side of each row) does not count as a true column.
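
<br>To see that `iloc` counts positions and ignores the labels, here is a small sketch on a made-up DataFrame whose index labels are deliberately not 0, 1, 2. The name `toy` and its values are assumptions for illustration.

```python
import pandas as pd

# Hypothetical table whose index labels are 10, 20, 30, 40.
toy = pd.DataFrame(
    {"team1": ["WAS", "CON", "CHI", "LVA"], "score1": [89, 78, 92, 93]},
    index=[10, 20, 30, 40],
)

rows_by_position = toy.iloc[0:2, :]   # first two rows; the end is excluded
rows_by_label = toy.loc[10:30, :]     # labels 10 through 30; the end is included
first_cell = toy.iloc[0, 0]           # first row, first column

print(rows_by_position.index.tolist())
print(rows_by_label.index.tolist())
print(first_cell)
```

Here `iloc[0:2]` returns the rows labeled 10 and 20, because those are the rows in positions 0 and 1.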

<br>Because `iloc` uses Python indexing, we can use negative numbers:

In [27]:
df.iloc[0:-5000, 4:-10]

Unnamed: 0,name1,name2
0,Washington Mystics,Connecticut Sun
1,Connecticut Sun,Washington Mystics
2,Washington Mystics,Connecticut Sun
3,Connecticut Sun,Washington Mystics
4,Washington Mystics,Connecticut Sun
...,...,...
5483,Los Angeles Sparks,Houston Comets
5484,Phoenix Mercury,Sacramento Monarchs
5485,Connecticut Sun,New York Liberty
5486,Indiana Fever,San Antonio Silver Stars


### <br><br>Exercise 4

The games are included in reverse chronological order, so the last row in the table is the very first game ever played.

Was the very first game played at home or away for team1? Use `iloc` to return the column "is_home1" for the very last row in the DataFrame:

In [28]:
df.iloc[-1, -1]

1

Use `iloc` to write code to return the columns "team1" and "team2" for the most recent 20 games:

In [29]:
df.iloc[0:20, 2:4]

Unnamed: 0,team1,team2
0,WAS,CON
1,CON,WAS
2,WAS,CON
3,CON,WAS
4,WAS,CON
5,CON,WAS
6,WAS,CON
7,CON,WAS
8,WAS,CON
9,CON,WAS


### <br><br>pandas `at` and `iat`

If you are looking for the contents of only a single cell (called a **scalar**) in the DataFrame, you can use `loc` or `iloc`:

In [30]:
df.loc[0, "season"]

2019

In [31]:
df.iloc[0, 0]

2019

<br>However, there is another pair of pandas accessors designed to look up only a single cell. `at` will look up a single cell by row name and column name (like `loc`), and `iat` will look up a single cell by index position (like `iloc`).

Why does pandas have a separate way to look up a single cell? Because `at` and `iat` are very fast. If you write code to look up 10,000 single points in a DataFrame, it would be much faster to use `at` or `iat` than `loc` or `iloc`.
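
<br>If you want to check the speed difference on your own machine, one rough way is to time repeated lookups with Python's built-in `timeit` module. This is only a sketch: `toy` is a made-up DataFrame, and the timings will vary by computer and pandas version, so no particular numbers are promised here.

```python
import timeit
import pandas as pd

# Hypothetical small table, built by hand for illustration.
toy = pd.DataFrame({"season": [2019, 2019, 2018], "score1": [89, 78, 86]})

# All four accessors return the same scalar for the same cell.
via_loc = toy.loc[0, "season"]
via_iloc = toy.iloc[0, 0]
via_at = toy.at[0, "season"]
via_iat = toy.iat[0, 0]

# Time 10,000 single-cell lookups with loc and with at.
n = 10_000
loc_seconds = timeit.timeit(lambda: toy.loc[0, "season"], number=n)
at_seconds = timeit.timeit(lambda: toy.at[0, "season"], number=n)
print(f"loc: {loc_seconds:.3f}s  at: {at_seconds:.3f}s for {n} lookups")
```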

In [32]:
df.at[0, "season"]

2019

In [33]:
df.iat[0, 0]

2019

<br><br>Just to reiterate, `at` and `iat` cannot be used with multiple rows or columns:

In [34]:
df.at[0, ["season", "date"]]

TypeError: unhashable type: 'list'

### <br><br>Exercise 5

Use `at` to write code to find out if the game in row 5485 was played at home or away:

In [35]:
df.at[5485, "is_home1"]

0

Now use `iat` to find the same answer:

In [36]:
df.iat[5485, -1]

0

### <br><br>A note about index labels

The bold numbers on the far left of each row were assigned when the csv file was originally loaded into pandas.

In [37]:
df.head()

Unnamed: 0,season,date,team1,team2,name1,name2,neutral,playoff,score1,score2,elo1_pre,elo2_pre,elo1_post,elo2_post,prob1,is_home1
0,2019,10/10/2019,WAS,CON,Washington Mystics,Connecticut Sun,0,1,89,78,1684,1634,1692,1627,0.718,1
1,2019,10/10/2019,CON,WAS,Connecticut Sun,Washington Mystics,0,1,78,89,1634,1684,1627,1692,0.282,0
2,2019,10/8/2019,WAS,CON,Washington Mystics,Connecticut Sun,0,1,86,90,1693,1626,1684,1634,0.476,0
3,2019,10/8/2019,CON,WAS,Connecticut Sun,Washington Mystics,0,1,90,86,1626,1693,1634,1684,0.524,1
4,2019,10/6/2019,WAS,CON,Washington Mystics,Connecticut Sun,0,1,94,81,1671,1648,1693,1626,0.399,0


<br>If we make a new DataFrame out of only some rows, the index labels will stay the same, leaving gaps. Let's make a new DataFrame that only includes games played by the Chicago Sky:

In [38]:
CHIdf = df.loc[df["team1"] == "CHI", :]
CHIdf.head()

Unnamed: 0,season,date,team1,team2,name1,name2,neutral,playoff,score1,score2,elo1_pre,elo2_pre,elo1_post,elo2_post,prob1,is_home1
25,2019,9/15/2019,CHI,LVA,Chicago Sky,Las Vegas Aces,0,1,92,93,1564,1541,1559,1545,0.399,0
28,2019,9/11/2019,CHI,PHO,Chicago Sky,Phoenix Mercury,0,1,105,76,1553,1450,1564,1440,0.788,1
36,2019,9/8/2019,CHI,WAS,Chicago Sky,Washington Mystics,0,0,86,100,1559,1706,1553,1712,0.214,0
47,2019,9/6/2019,CHI,CON,Chicago Sky,Connecticut Sun,0,0,109,104,1543,1615,1559,1599,0.295,0
65,2019,9/1/2019,CHI,PHO,Chicago Sky,Phoenix Mercury,0,0,105,78,1522,1527,1543,1506,0.607,1


<br>I can now use `iloc` to get Chicago's most recent 30 games:

In [39]:
CHIdf.iloc[0:30, :]

Unnamed: 0,season,date,team1,team2,name1,name2,neutral,playoff,score1,score2,elo1_pre,elo2_pre,elo1_post,elo2_post,prob1,is_home1
25,2019,9/15/2019,CHI,LVA,Chicago Sky,Las Vegas Aces,0,1,92,93,1564,1541,1559,1545,0.399,0
28,2019,9/11/2019,CHI,PHO,Chicago Sky,Phoenix Mercury,0,1,105,76,1553,1450,1564,1440,0.788,1
36,2019,9/8/2019,CHI,WAS,Chicago Sky,Washington Mystics,0,0,86,100,1559,1706,1553,1712,0.214,0
47,2019,9/6/2019,CHI,CON,Chicago Sky,Connecticut Sun,0,0,109,104,1543,1615,1559,1599,0.295,0
65,2019,9/1/2019,CHI,PHO,Chicago Sky,Phoenix Mercury,0,0,105,78,1522,1527,1543,1506,0.607,1
78,2019,8/29/2019,CHI,DAL,Chicago Sky,Dallas Wings,0,0,83,88,1541,1395,1522,1413,0.787,1
86,2019,8/27/2019,CHI,MIN,Chicago Sky,Minnesota Lynx,0,0,85,93,1551,1539,1541,1548,0.404,0
96,2019,8/25/2019,CHI,PHO,Chicago Sky,Phoenix Mercury,0,0,94,86,1535,1522,1551,1506,0.405,0
107,2019,8/23/2019,CHI,WAS,Chicago Sky,Washington Mystics,0,0,85,78,1519,1692,1535,1676,0.369,1
117,2019,8/20/2019,CHI,ATL,Chicago Sky,Atlanta Dream,0,0,87,83,1513,1347,1519,1341,0.621,0


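<br>But I could not use `loc` to get the same thing. `loc` reads `0:30` as index labels, so it returns only the rows whose labels fall between 0 and 30: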

In [40]:
CHIdf.loc[0:30, :]

Unnamed: 0,season,date,team1,team2,name1,name2,neutral,playoff,score1,score2,elo1_pre,elo2_pre,elo1_post,elo2_post,prob1,is_home1
25,2019,9/15/2019,CHI,LVA,Chicago Sky,Las Vegas Aces,0,1,92,93,1564,1541,1559,1545,0.399,0
28,2019,9/11/2019,CHI,PHO,Chicago Sky,Phoenix Mercury,0,1,105,76,1553,1450,1564,1440,0.788,1
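
<br>The same effect can be reproduced on a tiny made-up DataFrame. In this sketch, `toy` and `chi` are hypothetical names, and the last line uses `reset_index(drop=True)`, which is one possible way to renumber the rows if you would rather not have gaps.

```python
import pandas as pd

# Hypothetical five-row table, built by hand for illustration.
toy = pd.DataFrame({
    "team1": ["WAS", "CHI", "CON", "CHI", "CHI"],
    "score1": [89, 92, 78, 105, 86],
})

# Filtering keeps the original labels, so chi has labels 1, 3, 4.
chi = toy.loc[toy["team1"] == "CHI", :]

first_two_by_position = chi.iloc[0:2, :]   # first two rows: labels 1 and 3
rows_by_label = chi.loc[0:2, :]            # only label 1 lies between 0 and 2

# Optional: throw away the old labels and renumber from 0.
renumbered = chi.reset_index(drop=True)

print(first_two_by_position.index.tolist())
print(rows_by_label.index.tolist())
print(renumbered.index.tolist())
```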


<br>You can, however, set one of your columns as the index labels:

In [41]:
CHIdf = CHIdf.set_index("date")

In [42]:
CHIdf.head()

Unnamed: 0_level_0,season,team1,team2,name1,name2,neutral,playoff,score1,score2,elo1_pre,elo2_pre,elo1_post,elo2_post,prob1,is_home1
date,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1
9/15/2019,2019,CHI,LVA,Chicago Sky,Las Vegas Aces,0,1,92,93,1564,1541,1559,1545,0.399,0
9/11/2019,2019,CHI,PHO,Chicago Sky,Phoenix Mercury,0,1,105,76,1553,1450,1564,1440,0.788,1
9/8/2019,2019,CHI,WAS,Chicago Sky,Washington Mystics,0,0,86,100,1559,1706,1553,1712,0.214,0
9/6/2019,2019,CHI,CON,Chicago Sky,Connecticut Sun,0,0,109,104,1543,1615,1559,1599,0.295,0
9/1/2019,2019,CHI,PHO,Chicago Sky,Phoenix Mercury,0,0,105,78,1522,1527,1543,1506,0.607,1


<br>Now I can use `loc` to reference the games by date:

In [43]:
CHIdf.loc["9/13/2012", :]

season                     2012
team1                       CHI
team2                       LAS
name1               Chicago Sky
name2        Los Angeles Sparks
neutral                       0
playoff                       0
score1                       77
score2                       86
elo1_pre                   1490
elo2_pre                   1566
elo1_post                  1483
elo2_post                  1573
prob1                      0.29
is_home1                      0
Name: 9/13/2012, dtype: object

<br><br>I can still use a range of row labels:

In [44]:
CHIdf.loc["9/13/2012":"5/19/2012", :]

Unnamed: 0_level_0,season,team1,team2,name1,name2,neutral,playoff,score1,score2,elo1_pre,elo2_pre,elo1_post,elo2_post,prob1,is_home1
date,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1
9/13/2012,2012,CHI,LAS,Chicago Sky,Los Angeles Sparks,0,0,77,86,1490,1566,1483,1573,0.29,0
9/11/2012,2012,CHI,MIN,Chicago Sky,Minnesota Lynx,0,0,83,70,1460,1727,1490,1697,0.254,1
9/9/2012,2012,CHI,CON,Chicago Sky,Connecticut Sun,0,0,77,82,1465,1552,1460,1556,0.277,0
9/7/2012,2012,CHI,NYL,Chicago Sky,New York Liberty,0,0,92,83,1446,1460,1465,1442,0.368,0
9/2/2012,2012,CHI,LAS,Chicago Sky,Los Angeles Sparks,0,0,85,74,1426,1585,1446,1565,0.389,1
9/1/2012,2012,CHI,IND,Chicago Sky,Indiana Fever,0,0,64,81,1433,1592,1426,1599,0.202,0
8/28/2012,2012,CHI,CON,Chicago Sky,Connecticut Sun,0,0,72,83,1447,1550,1433,1564,0.467,1
8/26/2012,2012,CHI,CON,Chicago Sky,Connecticut Sun,0,0,82,70,1415,1583,1447,1550,0.193,0
8/24/2012,2012,CHI,DAL,Chicago Sky,Tulsa Shock,0,0,78,81,1423,1325,1415,1333,0.526,0
8/22/2012,2012,CHI,ATL,Chicago Sky,Atlanta Dream,0,0,71,82,1430,1539,1423,1545,0.252,0
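
<br>One thing to keep in mind is that these date labels are plain strings, so the range follows the order of the rows in the table and not the calendar. Here is a small sketch on a made-up DataFrame (`toy` is a hypothetical name, and it assumes each label appears only once):

```python
import pandas as pd

# Hypothetical table indexed by date strings, newest first.
toy = pd.DataFrame(
    {"team2": ["LAS", "MIN", "CON", "NYL"]},
    index=["9/13/2012", "9/11/2012", "9/9/2012", "9/7/2012"],
)
toy.index.name = "date"

# The slice runs from the row labeled "9/11/2012" down to the row
# labeled "9/9/2012", including both ends.
middle = toy.loc["9/11/2012":"9/9/2012", :]
print(middle.index.tolist())
```

The start label has to come before the end label in the table, which is why the later date is written first here.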


### <br><br>Exercise 6

Using `CHIdf`, write code (use either `loc` or `at`) to find out which team Chicago played against on 6/16/2017:

### <br><br>BONUS SECTION

#### A reminder about searching for multiple conditionals in pandas

<br><br>Let's say we want to search through the original DataFrame, `df`, for all games played by the Chicago Sky where the Chicago Sky won. For each of these games, we want to return only the columns for season and the name of the opposing team.

The conditional for only Chicago Sky games is:
<br>`df["team1"] == "CHI"`
<br><br>The conditional for games that Chicago won is:
<br>`df["score1"] > df["score2"]`

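We might try to use Python operators (`and`, `or`, `not`), but this raises a `ValueError` because Python cannot reduce a whole column of True/False values to a single True or False: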

In [None]:
df.loc[df["team1"] == "CHI" and df["score1"] > df["score2"], ["season", "name2"]]

<br><br>However, as a reminder, pandas uses the operators `&` (and), `|` (or), and `~` (not). pandas also requires you to include each conditional inside parentheses.

In [None]:
df.loc[(df["team1"] == "CHI") & (df["score1"] > df["score2"]), ["season", "name2"]]
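
<br>Here is a minimal sketch of the three pandas operators on a small made-up DataFrame. The names `toy`, `is_chi`, and `chi_won` are assumptions for illustration. Storing each conditional in its own variable is one way to avoid the extra parentheses.

```python
import pandas as pd

# Hypothetical four-row table, built by hand for illustration.
toy = pd.DataFrame({
    "season": [2019, 2019, 2019, 2019],
    "team1": ["CHI", "CHI", "WAS", "CHI"],
    "name2": ["Las Vegas Aces", "Washington Mystics",
              "Connecticut Sun", "Phoenix Mercury"],
    "score1": [92, 86, 89, 105],
    "score2": [93, 100, 78, 76],
})

is_chi = toy["team1"] == "CHI"
chi_won = toy["score1"] > toy["score2"]

chi_wins = toy.loc[is_chi & chi_won, ["season", "name2"]]     # and
chi_losses = toy.loc[is_chi & ~chi_won, ["season", "name2"]]  # and not
chi_or_win = toy.loc[is_chi | chi_won, ["season", "name2"]]   # or

# The Python keyword `and` does not work on whole columns.
try:
    toy.loc[is_chi and chi_won, ["season", "name2"]]
    and_error = None
except ValueError as err:
    and_error = type(err).__name__

print(chi_wins)
print(and_error)
```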

### <br><br>Exercise 7

Use `loc` to return the rows in the DataFrame for games played in either the 1999 or the 2000 season. For each row, return all columns:

Has Chicago ever played in any playoff games? Return rows that have CHI in the "team1" column and 1 in the "playoff" column. Only return the columns "season", "team2", and "date":