# Practice

The source code summarised below was taken from my post about a <a href="http://www.carlosrodrigues.me/2021/07/18/brief-introduction-to-pandas/" target="_blank">brief introduction to pandas</a>.

Here I present and discuss basic features and functionality of the pandas library using a dataset of player statistics from the [NBA finals 2021](https://www.nba.com/playoffs/2021/the-finals/stats), collected right after the Bucks defeated the Suns on the road in game 5 and took a 3-2 lead in the series.

## Step 1. Import pandas

After installing pandas, the first step is to load the library using the regular Python import statement:

In [1]:
import pandas as pd

I believe the library is usually imported under the `pd` alias for the sake of practicality and to avoid name clashes with Python *built-in* functions such as `map()`, `all()`, `any()`, `filter()`, `max()` and `min()`.

## Step 2. Load data

In [2]:
df = pd.read_csv("nba_finals_gm5_2021.csv")

In general, .csv files use a comma (`,`) as the column separator, but it is common for other programs to use a tab (`\t`), pipe (`|`), semicolon (`;`) or other characters. Pandas lets you specify the character used to separate columns in your input file via the `sep` parameter.

Column names are automatically read from the first line of the input file. For input files without a header line, set `header=None` to disable this behaviour. Alternatively, to replace the existing column names, set `header=0` and pass the new names as an ordered list via the `names` parameter.

By default, pandas automatically infers the data type of each column. The `dtype` parameter accepts a dictionary mapping column names to types and allows one to override this behaviour.
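
As a minimal sketch of these parameters working together, the snippet below parses a small semicolon-separated input with no header line. The inline text and the `players` variable are made up for illustration; `io.StringIO` simply stands in for a file on disk.

```python
import io

import pandas as pd

# Hypothetical semicolon-separated input without a header line
raw = "Devin Booker;Suns;30.0\nChris Paul;Suns;21.0\n"

players = pd.read_csv(
    io.StringIO(raw),                 # a file path would work here as well
    sep=";",                          # column separator used in the input
    header=None,                      # the input has no header line
    names=["PLAYER", "TEAM", "PTS"],  # column labels to assign, in order
    dtype={"PTS": "float64"},         # force the type of a given column
)

print(players)
print(players.dtypes)
```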

## Step 3. Basic data exploration

Once the data is loaded, the dimensions of the dataframe (number of rows and columns) and the labels assigned to its columns can be accessed via the `shape` and `columns` attributes:

In [4]:
df.shape

(25, 23)

In [5]:
df.columns

Index(['PLAYER', 'TEAM', 'GP', 'MIN', 'FGM', 'FGA', 'FG%', '3PM', '3PA', '3P%',
       'FTM', 'FTA', 'FT%', 'OREB', 'DREB', 'REB', 'AST', 'TOV', 'STL', 'BLK',
       'PF', 'PTS', '+/-'],
      dtype='object')

Descriptive statistics for the numeric columns of the dataframe can be generated using the `.describe()` method:

In [6]:
df.describe()

Unnamed: 0,GP,MIN,FGM,FGA,FG%,3PM,3PA,3P%,FTM,FTA,...,OREB,DREB,REB,AST,TOV,STL,BLK,PF,PTS,+/-
count,25.0,25.0,25.0,25.0,25.0,25.0,25.0,25.0,25.0,25.0,...,25.0,25.0,25.0,25.0,25.0,25.0,25.0,25.0,25.0,25.0
mean,3.76,19.78,3.504,7.284,41.508,1.008,2.548,28.452,1.208,1.576,...,0.9,2.66,3.56,1.804,0.864,0.556,0.28,1.572,9.224,0.004
std,1.762574,15.547267,3.716235,7.304718,27.301494,1.028721,2.487489,25.004502,1.847052,2.796617,...,0.966954,2.886463,3.588523,2.662781,0.98737,0.54626,0.432049,1.328696,9.629005,3.6464
min,1.0,0.3,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,-7.0
25%,2.0,4.1,1.0,1.7,30.0,0.0,0.0,0.0,0.0,0.0,...,0.3,0.6,1.0,0.0,0.0,0.0,0.0,0.4,2.0,-3.3
50%,5.0,16.6,2.2,6.0,44.6,0.8,2.2,32.1,0.4,0.6,...,0.6,2.2,3.0,0.8,0.6,0.6,0.0,1.2,6.0,0.0
75%,5.0,36.5,4.6,9.6,54.3,1.8,4.0,45.5,1.6,2.0,...,1.2,3.2,4.8,2.0,1.2,1.0,0.4,2.4,13.0,3.6
max,5.0,42.8,12.0,24.2,100.0,3.0,8.2,100.0,7.8,13.2,...,4.0,10.8,13.2,9.0,3.6,1.8,1.4,3.8,32.2,5.4
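
Note that text columns such as **PLAYER** and **TEAM** do not appear in the summary above, because `.describe()` only covers numeric columns by default. The sketch below, built on a tiny made-up frame called `stats` with a few values copied from the dataset, shows how `include="all"` extends the summary to text columns as well.

```python
import pandas as pd

# Tiny illustrative frame; the variable names are not part of the original post
stats = pd.DataFrame({
    "PLAYER": ["Devin Booker", "Chris Paul", "Deandre Ayton"],
    "TEAM": ["Suns", "Suns", "Suns"],
    "PTS": [30.0, 21.0, 15.2],
})

numeric_summary = stats.describe()            # numeric columns only
full_summary = stats.describe(include="all")  # numeric and text columns

print(numeric_summary.columns.tolist())
print(full_summary.columns.tolist())
```

For text columns, the extended summary reports rows such as `unique`, `top` and `freq` instead of the mean and quartiles.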


Values stored in each column can be accessed directly using the column's label as a key to the dataframe, similar to the key/value association of Python dictionaries:

In [7]:
df['PLAYER']

0               Devin Booker
1                 Chris Paul
2              Deandre Ayton
3              Mikal Bridges
4                Jae Crowder
5            Cameron Johnson
6              Cameron Payne
7               Torrey Craig
8             Frank Kaminsky
9          Ty-Shon Alexander
10               Abdel Nader
11               Dario Saric
12     Giannis Antetokounmpo
13           Khris Middleton
14              Jrue Holiday
15               Brook Lopez
16           Pat Connaughton
17              Bobby Portis
18               P.J. Tucker
19               Bryn Forbes
20              Jordan Nwora
21               Jeff Teague
22             Elijah Bryant
23               Sam Merrill
24    Thanasis Antetokounmpo
Name: PLAYER, dtype: object

Multiple columns can be retrieved at the same time if a list of labels is provided:

In [8]:
df[['PLAYER', 'PTS']]

Unnamed: 0,PLAYER,PTS
0,Devin Booker,30.0
1,Chris Paul,21.0
2,Deandre Ayton,15.2
3,Mikal Bridges,13.0
4,Jae Crowder,11.0
5,Cameron Johnson,9.6
6,Cameron Payne,6.8
7,Torrey Craig,3.4
8,Frank Kaminsky,2.0
9,Ty-Shon Alexander,2.0


**TIP:** If a single label is used as input, **pandas** returns a *Series* object; if a list of columns is provided, the output is returned in the form of a *DataFrame*.
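
A quick way to see the difference is to check the type of each result. The frame below is a two-row toy example, and the names `frame`, `single` and `listed` are chosen only for this sketch.

```python
import pandas as pd

frame = pd.DataFrame({
    "PLAYER": ["Devin Booker", "Chris Paul"],
    "PTS": [30.0, 21.0],
})

single = frame["PTS"]    # one label -> Series
listed = frame[["PTS"]]  # list of labels, even of length one -> DataFrame

print(type(single).__name__)
print(type(listed).__name__)
```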

Use Python-style slicing to select rows by position, for example to keep only the first dozen rows of our NBA dataframe:

In [9]:
df[['PLAYER', 'PTS']][0:12]

Unnamed: 0,PLAYER,PTS
0,Devin Booker,30.0
1,Chris Paul,21.0
2,Deandre Ayton,15.2
3,Mikal Bridges,13.0
4,Jae Crowder,11.0
5,Cameron Johnson,9.6
6,Cameron Payne,6.8
7,Torrey Craig,3.4
8,Frank Kaminsky,2.0
9,Ty-Shon Alexander,2.0


When dealing with a large number of rows, use `.head()` and `.tail()` to display the top and bottom rows (five by default). In the same context, the `.sample()` method can be used to retrieve a random sample of items from an axis of the object (rows by default). The number of items to be "sampled" is specified via the `n` or `frac` parameters (a specific number or a fraction of the data, respectively).

In [10]:
df.head()

Unnamed: 0,PLAYER,TEAM,GP,MIN,FGM,FGA,FG%,3PM,3PA,3P%,...,OREB,DREB,REB,AST,TOV,STL,BLK,PF,PTS,+/-
0,Devin Booker,Suns,5,39.0,11.4,24.2,47.1,2.2,6.8,32.4,...,1.2,2.4,3.6,3.8,2.6,1.0,0.4,3.8,30.0,5.4
1,Chris Paul,Suns,5,36.9,8.8,16.2,54.3,2.2,4.2,52.4,...,0.6,2.2,2.8,8.8,3.6,0.6,0.2,2.8,21.0,-0.6
2,Deandre Ayton,Suns,5,37.8,6.0,10.4,57.7,0.0,0.0,0.0,...,2.4,10.8,13.2,2.0,1.2,1.4,1.4,3.6,15.2,3.2
3,Mikal Bridges,Suns,5,30.6,4.6,8.4,54.8,1.8,4.0,45.0,...,0.6,3.2,3.8,1.0,1.4,0.8,0.6,1.2,13.0,4.4
4,Jae Crowder,Suns,5,36.5,3.4,8.0,42.5,2.8,6.0,46.7,...,0.4,7.2,7.6,2.0,1.0,1.2,1.0,3.4,11.0,1.2


In [11]:
df.tail()

Unnamed: 0,PLAYER,TEAM,GP,MIN,FGM,FGA,FG%,3PM,3PA,3P%,...,OREB,DREB,REB,AST,TOV,STL,BLK,PF,PTS,+/-
20,Jordan Nwora,Bucks,1,1.3,1.0,1.0,100.0,1.0,1.0,100.0,...,0.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,3.0,1.0
21,Jeff Teague,Bucks,5,10.8,0.4,2.0,20.0,0.2,0.6,33.3,...,0.4,0.6,1.0,0.8,0.6,0.6,0.0,0.4,1.8,-1.4
22,Elijah Bryant,Bucks,1,0.3,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
23,Sam Merrill,Bucks,1,1.3,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0
24,Thanasis Antetokounmpo,Bucks,1,1.6,0.0,2.0,0.0,0.0,0.0,0.0,...,1.0,2.0,3.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0


In [12]:
df.sample(frac=0.10)

Unnamed: 0,PLAYER,TEAM,GP,MIN,FGM,FGA,FG%,3PM,3PA,3P%,...,OREB,DREB,REB,AST,TOV,STL,BLK,PF,PTS,+/-
19,Bryn Forbes,Bucks,3,7.3,1.0,3.3,30.0,1.0,3.0,33.3,...,0.0,0.3,0.3,0.0,0.3,0.3,0.0,0.7,3.0,-3.3
5,Cameron Johnson,Suns,5,23.9,3.2,6.0,53.3,1.8,3.8,47.4,...,0.8,2.4,3.2,1.0,0.8,0.4,0.4,2.4,9.6,-7.0
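
Because `.sample()` is random, each run returns different rows. The sketch below uses a made-up ten-row frame to contrast `n` with `frac`, and passes `random_state` (not used in the original example) to make the draw repeatable.

```python
import pandas as pd

# Made-up frame with ten rows, used only for illustration
frame = pd.DataFrame({
    "PLAYER": [f"player_{i}" for i in range(10)],
    "PTS": range(10),
})

by_count = frame.sample(n=3, random_state=42)          # exactly 3 rows
by_fraction = frame.sample(frac=0.2, random_state=42)  # 20% of 10 rows = 2 rows
again = frame.sample(n=3, random_state=42)             # same seed, same rows

print(len(by_count), len(by_fraction))
print(by_count.equals(again))
```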


By default, rows are stored in the same order in which they are read from the input source. **pandas** allows for controlling the order of entries in a **dataframe** using the `sort_values()` method. This method requires a *string* or a *list* of column identifiers. The order (ascending or descending) can be controlled via the `ascending` parameter, which accepts either a single value or one value per column.

In [13]:
df.sort_values(['PTS','PLAYER'], ascending=[False, True])

Unnamed: 0,PLAYER,TEAM,GP,MIN,FGM,FGA,FG%,3PM,3PA,3P%,...,OREB,DREB,REB,AST,TOV,STL,BLK,PF,PTS,+/-
12,Giannis Antetokounmpo,Bucks,5,39.3,12.0,19.6,61.2,0.4,2.4,16.7,...,4.0,9.0,13.0,5.6,1.6,1.4,1.2,3.2,32.2,4.0
0,Devin Booker,Suns,5,39.0,11.4,24.2,47.1,2.2,6.8,32.4,...,1.2,2.4,3.6,3.8,2.6,1.0,0.4,3.8,30.0,5.4
13,Khris Middleton,Bucks,5,42.8,10.0,22.4,44.6,3.0,8.2,36.6,...,0.4,6.2,6.6,5.4,2.8,1.0,0.0,2.2,25.4,-0.8
1,Chris Paul,Suns,5,36.9,8.8,16.2,54.3,2.2,4.2,52.4,...,0.6,2.2,2.8,8.8,3.6,0.6,0.2,2.8,21.0,-0.6
14,Jrue Holiday,Bucks,5,40.8,7.0,17.8,39.3,1.8,5.6,32.1,...,1.2,4.4,5.6,9.0,2.0,1.8,0.8,2.0,17.6,5.0
2,Deandre Ayton,Suns,5,37.8,6.0,10.4,57.7,0.0,0.0,0.0,...,2.4,10.8,13.2,2.0,1.2,1.4,1.4,3.6,15.2,3.2
3,Mikal Bridges,Suns,5,30.6,4.6,8.4,54.8,1.8,4.0,45.0,...,0.6,3.2,3.8,1.0,1.4,0.8,0.6,1.2,13.0,4.4
15,Brook Lopez,Bucks,5,24.0,4.6,9.6,47.9,1.0,3.6,27.8,...,1.8,3.0,4.8,0.2,1.0,0.6,0.8,2.2,11.8,-3.0
4,Jae Crowder,Suns,5,36.5,3.4,8.0,42.5,2.8,6.0,46.7,...,0.4,7.2,7.6,2.0,1.0,1.2,1.0,3.4,11.0,1.2
16,Pat Connaughton,Bucks,5,31.4,3.8,7.2,52.8,3.0,6.0,50.0,...,1.6,3.8,5.4,1.2,0.6,0.2,0.0,2.4,11.0,4.4


**FYI**: most of the commands I have shown so far can also be "chained" into **"one-liners"** to form more specific subsets of data. For example, to retrieve the top 5 scoring players from our dataset we can do something like the following:

In [14]:
df.sort_values(['PTS','PLAYER'], ascending=[False, True])[0:5][['PLAYER','TEAM','PTS']]

Unnamed: 0,PLAYER,TEAM,PTS
12,Giannis Antetokounmpo,Bucks,32.2
0,Devin Booker,Suns,30.0
13,Khris Middleton,Bucks,25.4
1,Chris Paul,Suns,21.0
14,Jrue Holiday,Bucks,17.6


The operations are applied in order from left to right:

1. `sort_values` sorts the dataframe by **PTS** in descending order (highest values first), using **PLAYER** in ascending order to break ties
2. then we use a slicing operation with `[0:5]` to select the first 5 entries from the dataset sorted in the previous step
3. finally we select only **PLAYER**, **TEAM** and **PTS** to be displayed using the indexing operator (**[]**)
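
The three steps above can also be written as a multi-line chain, which some readers find easier to follow. This is only a sketch on a four-row toy frame: `.head()` and `.loc[]` replace the two slicing operations, and `nlargest()` is shown as a possible shortcut rather than as the approach used in this post.

```python
import pandas as pd

frame = pd.DataFrame({
    "PLAYER": ["Devin Booker", "Chris Paul", "Deandre Ayton", "Giannis Antetokounmpo"],
    "TEAM": ["Suns", "Suns", "Suns", "Bucks"],
    "PTS": [30.0, 21.0, 15.2, 32.2],
})

top_two = (
    frame
    .sort_values(["PTS", "PLAYER"], ascending=[False, True])  # step 1: sort
    .head(2)                                                  # step 2: keep the first rows
    .loc[:, ["PLAYER", "TEAM", "PTS"]]                        # step 3: pick the columns
)

# Possible shortcut for steps 1 and 2 when sorting on a single numeric column
shortcut = frame.nlargest(2, "PTS")[["PLAYER", "TEAM", "PTS"]]

print(top_two)
print(shortcut)
```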

For columns storing categorical variables (strings), the `value_counts()` method is very handy for identifying the number of entries per category. From our NBA dataframe, we can obtain the number of players on each team with:

In [15]:
df['TEAM'].value_counts()

Bucks    13
Suns     12
Name: TEAM, dtype: int64

Finally, the `.drop` method allows one to remove entries from a dataframe:

In [16]:
df.shape

(25, 23)

In [17]:
df.drop(0).shape

(24, 23)

In [18]:
df.drop('GP', axis=1).shape

(25, 22)

By default, `drop` removes the row with the given index label; use the `axis` parameter (`axis=1`) to remove a column instead.
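
Note that `drop` returns a new dataframe and leaves the original untouched, which is why `df.shape` stays at `(25, 23)` in the cells above. The sketch below illustrates this on a two-row toy frame; the `columns=` keyword is an alternative spelling of `axis=1`.

```python
import pandas as pd

frame = pd.DataFrame({
    "PLAYER": ["Devin Booker", "Chris Paul"],
    "GP": [5, 5],
    "PTS": [30.0, 21.0],
})

without_row = frame.drop(0)             # drop the row labelled 0
without_col = frame.drop(columns="GP")  # same as frame.drop("GP", axis=1)

print(frame.shape)        # the original frame is unchanged
print(without_row.shape)
print(without_col.shape)
```

To keep the result, assign it back to a variable, for example `frame = frame.drop(0)`.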

## Step 4. .loc[] and .iloc[]

Among other methods implemented within **pandas**, `.loc[]` and `.iloc[]` are very useful for making slightly more complex queries with great flexibility.

### .loc[]

`.loc[]` is a label-based method for querying data from a dataframe. In this case, one is required to provide the *name* of the row and/or column to be selected. Using our NBA dataset, here are a few examples of operations that we can perform using `.loc[]`:

- To check for players on the Milwaukee Bucks who average at least 10 points in the series

In [20]:
df.loc[(df.TEAM == 'Bucks') & (df.PTS >= 10)]

Unnamed: 0,PLAYER,TEAM,GP,MIN,FGM,FGA,FG%,3PM,3PA,3P%,...,OREB,DREB,REB,AST,TOV,STL,BLK,PF,PTS,+/-
12,Giannis Antetokounmpo,Bucks,5,39.3,12.0,19.6,61.2,0.4,2.4,16.7,...,4.0,9.0,13.0,5.6,1.6,1.4,1.2,3.2,32.2,4.0
13,Khris Middleton,Bucks,5,42.8,10.0,22.4,44.6,3.0,8.2,36.6,...,0.4,6.2,6.6,5.4,2.8,1.0,0.0,2.2,25.4,-0.8
14,Jrue Holiday,Bucks,5,40.8,7.0,17.8,39.3,1.8,5.6,32.1,...,1.2,4.4,5.6,9.0,2.0,1.8,0.8,2.0,17.6,5.0
15,Brook Lopez,Bucks,5,24.0,4.6,9.6,47.9,1.0,3.6,27.8,...,1.8,3.0,4.8,0.2,1.0,0.6,0.8,2.2,11.8,-3.0
16,Pat Connaughton,Bucks,5,31.4,3.8,7.2,52.8,3.0,6.0,50.0,...,1.6,3.8,5.4,1.2,0.6,0.2,0.0,2.4,11.0,4.4


**TIP:** Each condition should be wrapped in parentheses, and conditions are combined with the operators *and* (`&`) and *or* (`|`); Python's `and` and `or` keywords do not work on columns.
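
The parentheses are needed because `&` and `|` bind more tightly than comparisons such as `==` and `>=` in Python. One way to sidestep the issue, sketched below on a four-row toy frame, is to store each condition in a named variable first (the names `is_bucks` and `scores_25_plus` are made up for this example). The `~` operator negates a condition.

```python
import pandas as pd

frame = pd.DataFrame({
    "PLAYER": ["Devin Booker", "Chris Paul", "Giannis Antetokounmpo", "Jeff Teague"],
    "TEAM": ["Suns", "Suns", "Bucks", "Bucks"],
    "PTS": [30.0, 21.0, 32.2, 1.8],
})

is_bucks = frame["TEAM"] == "Bucks"
scores_25_plus = frame["PTS"] >= 25

either = frame.loc[is_bucks | scores_25_plus]  # Bucks players OR 25+ point scorers
not_bucks = frame.loc[~is_bucks]               # everyone who is NOT on the Bucks

print(either["PLAYER"].tolist())
print(not_bucks["PLAYER"].tolist())
```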

**TIP:** Use `.str.contains()` for substring searching. In the example below, we are looking for the Greek brothers by their last name:

In [21]:
df.loc[(df['PLAYER'].str.contains("Antetokounmpo"))]

Unnamed: 0,PLAYER,TEAM,GP,MIN,FGM,FGA,FG%,3PM,3PA,3P%,...,OREB,DREB,REB,AST,TOV,STL,BLK,PF,PTS,+/-
12,Giannis Antetokounmpo,Bucks,5,39.3,12.0,19.6,61.2,0.4,2.4,16.7,...,4.0,9.0,13.0,5.6,1.6,1.4,1.2,3.2,32.2,4.0
24,Thanasis Antetokounmpo,Bucks,1,1.6,0.0,2.0,0.0,0.0,0.0,0.0,...,1.0,2.0,3.0,0.0,0.0,0.0,0.0,1.0,0.0,1.0


- To select a range of rows based on their index labels (unlike regular Python slicing, both endpoints are included)

In [23]:
df.loc[0:5]

Unnamed: 0,PLAYER,TEAM,GP,MIN,FGM,FGA,FG%,3PM,3PA,3P%,...,OREB,DREB,REB,AST,TOV,STL,BLK,PF,PTS,+/-
0,Devin Booker,Suns,5,39.0,11.4,24.2,47.1,2.2,6.8,32.4,...,1.2,2.4,3.6,3.8,2.6,1.0,0.4,3.8,30.0,5.4
1,Chris Paul,Suns,5,36.9,8.8,16.2,54.3,2.2,4.2,52.4,...,0.6,2.2,2.8,8.8,3.6,0.6,0.2,2.8,21.0,-0.6
2,Deandre Ayton,Suns,5,37.8,6.0,10.4,57.7,0.0,0.0,0.0,...,2.4,10.8,13.2,2.0,1.2,1.4,1.4,3.6,15.2,3.2
3,Mikal Bridges,Suns,5,30.6,4.6,8.4,54.8,1.8,4.0,45.0,...,0.6,3.2,3.8,1.0,1.4,0.8,0.6,1.2,13.0,4.4
4,Jae Crowder,Suns,5,36.5,3.4,8.0,42.5,2.8,6.0,46.7,...,0.4,7.2,7.6,2.0,1.0,1.2,1.0,3.4,11.0,1.2
5,Cameron Johnson,Suns,5,23.9,3.2,6.0,53.3,1.8,3.8,47.4,...,0.8,2.4,3.2,1.0,0.8,0.4,0.4,2.4,9.6,-7.0


- To update the values of a given column

In [24]:
df['STATUS'] = ""

df.loc[(df['PTS'] >= 10) & (df['REB'] >= 10), 'STATUS'] = 'Double-Double'

print(df[['PLAYER', 'PTS', 'REB', 'AST', 'STATUS']])

                    PLAYER   PTS   REB  AST         STATUS
0             Devin Booker  30.0   3.6  3.8               
1               Chris Paul  21.0   2.8  8.8               
2            Deandre Ayton  15.2  13.2  2.0  Double-Double
3            Mikal Bridges  13.0   3.8  1.0               
4              Jae Crowder  11.0   7.6  2.0               
5          Cameron Johnson   9.6   3.2  1.0               
6            Cameron Payne   6.8   2.6  2.0               
7             Torrey Craig   3.4   1.6  0.2               
8           Frank Kaminsky   2.0   1.3  0.7               
9        Ty-Shon Alexander   2.0   0.0  0.0               
10             Abdel Nader   0.0   0.0  0.0               
11             Dario Saric   0.0   1.0  0.0               
12   Giannis Antetokounmpo  32.2  13.0  5.6  Double-Double
13         Khris Middleton  25.4   6.6  5.4               
14            Jrue Holiday  17.6   5.6  9.0               
15             Brook Lopez  11.8   4.8  0.2             

In the above example, we created a new column **STATUS** with an empty string as its default value. Then we passed the set of conditions we were looking for as the first argument of `.loc[]` and the name of the column we wanted to change as the second; the value on the right-hand side of the assignment is the new value.

Since a **double-double** is a status that occurs when a player reaches double digits in two of the three categories considered here (**PTS**, **REB** and **AST**), we are better off writing the last example as:

In [26]:
df.loc[((df['PTS'] >= 10) & (df['REB'] >= 10)) | \
        ((df['PTS'] >= 10) & (df['AST'] >= 10)) | \
        ((df['AST'] >= 10) & (df['REB'] >= 10)), 'STATUS'] = 'Double-Double'
df

Unnamed: 0,PLAYER,TEAM,GP,MIN,FGM,FGA,FG%,3PM,3PA,3P%,...,DREB,REB,AST,TOV,STL,BLK,PF,PTS,+/-,STATUS
0,Devin Booker,Suns,5,39.0,11.4,24.2,47.1,2.2,6.8,32.4,...,2.4,3.6,3.8,2.6,1.0,0.4,3.8,30.0,5.4,
1,Chris Paul,Suns,5,36.9,8.8,16.2,54.3,2.2,4.2,52.4,...,2.2,2.8,8.8,3.6,0.6,0.2,2.8,21.0,-0.6,
2,Deandre Ayton,Suns,5,37.8,6.0,10.4,57.7,0.0,0.0,0.0,...,10.8,13.2,2.0,1.2,1.4,1.4,3.6,15.2,3.2,Double-Double
3,Mikal Bridges,Suns,5,30.6,4.6,8.4,54.8,1.8,4.0,45.0,...,3.2,3.8,1.0,1.4,0.8,0.6,1.2,13.0,4.4,
4,Jae Crowder,Suns,5,36.5,3.4,8.0,42.5,2.8,6.0,46.7,...,7.2,7.6,2.0,1.0,1.2,1.0,3.4,11.0,1.2,
5,Cameron Johnson,Suns,5,23.9,3.2,6.0,53.3,1.8,3.8,47.4,...,2.4,3.2,1.0,0.8,0.4,0.4,2.4,9.6,-7.0,
6,Cameron Payne,Suns,5,16.6,3.0,7.0,42.9,0.6,2.4,25.0,...,2.2,2.6,2.0,0.8,0.8,0.0,1.2,6.8,-6.0,
7,Torrey Craig,Suns,5,12.6,1.2,3.0,40.0,0.6,2.0,30.0,...,1.2,1.6,0.2,0.4,0.0,0.0,1.4,3.4,-4.8,
8,Frank Kaminsky,Suns,3,6.2,1.0,1.7,60.0,0.0,0.0,0.0,...,1.0,1.3,0.7,0.3,0.0,0.0,0.3,2.0,-3.3,
9,Ty-Shon Alexander,Suns,1,1.3,1.0,1.0,100.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2.0,-1.0,


In fact, this modification did not make a difference to the output, since **Deandre Ayton** and **Giannis Antetokounmpo** are the only players averaging a double-double in this series.
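
Listing every pair of categories by hand gets long quickly. As an alternative sketch (not the approach used in the original post), one can count how many categories reach double digits per row and flag the rows where that count is at least two. The four-row frame below reuses values from the dataset, and `categories_in_double_digits` is a name chosen for this example.

```python
import pandas as pd

frame = pd.DataFrame({
    "PLAYER": ["Devin Booker", "Deandre Ayton", "Giannis Antetokounmpo", "Jrue Holiday"],
    "PTS": [30.0, 15.2, 32.2, 17.6],
    "REB": [3.6, 13.2, 13.0, 5.6],
    "AST": [3.8, 2.0, 5.6, 9.0],
})

# Count, per row, how many of the three categories reach double digits
categories_in_double_digits = (frame[["PTS", "REB", "AST"]] >= 10).sum(axis=1)

frame["STATUS"] = ""
frame.loc[categories_in_double_digits >= 2, "STATUS"] = "Double-Double"

print(frame[["PLAYER", "STATUS"]])
```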

### .iloc[]

Unlike `.loc[]`, `.iloc[]` is an integer-position-based selection method: it selects rows and columns by their position rather than by their label. Some useful ways of using `.iloc[]`:

- to select specific rows using a list of indices

In [27]:
indices2select = [0,2,3,4,7,6]

df.iloc[indices2select]

Unnamed: 0,PLAYER,TEAM,GP,MIN,FGM,FGA,FG%,3PM,3PA,3P%,...,DREB,REB,AST,TOV,STL,BLK,PF,PTS,+/-,STATUS
0,Devin Booker,Suns,5,39.0,11.4,24.2,47.1,2.2,6.8,32.4,...,2.4,3.6,3.8,2.6,1.0,0.4,3.8,30.0,5.4,
2,Deandre Ayton,Suns,5,37.8,6.0,10.4,57.7,0.0,0.0,0.0,...,10.8,13.2,2.0,1.2,1.4,1.4,3.6,15.2,3.2,Double-Double
3,Mikal Bridges,Suns,5,30.6,4.6,8.4,54.8,1.8,4.0,45.0,...,3.2,3.8,1.0,1.4,0.8,0.6,1.2,13.0,4.4,
4,Jae Crowder,Suns,5,36.5,3.4,8.0,42.5,2.8,6.0,46.7,...,7.2,7.6,2.0,1.0,1.2,1.0,3.4,11.0,1.2,
7,Torrey Craig,Suns,5,12.6,1.2,3.0,40.0,0.6,2.0,30.0,...,1.2,1.6,0.2,0.4,0.0,0.0,1.4,3.4,-4.8,
6,Cameron Payne,Suns,5,16.6,3.0,7.0,42.9,0.6,2.4,25.0,...,2.2,2.6,2.0,0.8,0.8,0.0,1.2,6.8,-6.0,


- to select a range of columns and rows simultaneously

In [28]:
df.iloc[2:9, 0:5]

Unnamed: 0,PLAYER,TEAM,GP,MIN,FGM
2,Deandre Ayton,Suns,5,37.8,6.0
3,Mikal Bridges,Suns,5,30.6,4.6
4,Jae Crowder,Suns,5,36.5,3.4
5,Cameron Johnson,Suns,5,23.9,3.2
6,Cameron Payne,Suns,5,16.6,3.0
7,Torrey Craig,Suns,5,12.6,1.2
8,Frank Kaminsky,Suns,3,6.2,1.0


In the example above, the first argument of `.iloc[]` selects the rows by a range of positions (`2:9`, that is, positions 2 to 8), and the second selects only the first 5 columns (`0:5`). In both cases, the input could also have been a list of positions.
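
With the default index, labels and positions coincide, so the difference between `.loc[]` and `.iloc[]` is easy to miss. The sketch below, on a four-row toy frame with made-up variable names, shows two places where they diverge: the end of a slice, and a frame whose rows have been reordered.

```python
import pandas as pd

frame = pd.DataFrame({
    "PLAYER": ["Devin Booker", "Chris Paul", "Deandre Ayton", "Mikal Bridges"],
    "PTS": [30.0, 21.0, 15.2, 13.0],
})

by_label = frame.loc[0:2]      # labels 0, 1 and 2: the end label is included
by_position = frame.iloc[0:2]  # positions 0 and 1: the end position is excluded

ranked = frame.sort_values("PTS")   # ascending sort: row labels are now 3, 2, 1, 0
first_by_position = ranked.iloc[0]  # first row as displayed
row_labelled_zero = ranked.loc[0]   # row whose label is 0, wherever it sits

print(len(by_label), len(by_position))
print(first_by_position["PLAYER"], "|", row_labelled_zero["PLAYER"])
```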

## Step 5. Other resources

- <a href="https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html" target="_blank">Pandas documentation</a>
- <a href="https://www.geeksforgeeks.org/difference-between-loc-and-iloc-in-pandas-dataframe/" target="_blank">Difference between loc() and iloc() in Pandas Dataframe</a>
- <a href="https://towardsdatascience.com/all-the-pandas-read-csv-you-should-know-to-speed-up-your-data-analysis-1e16fe1039f3" target="_blank">Pandas read_csv() tricks you should know to speed up your data analysis</a>
- <a href="https://github.com/BindiChen/machine-learning#pandas-tutorials" target="_blank">Pandas Tutorials</a>