# DataFrames I

In [3]:
import pandas as pd

## Methods and Attributes between Series and DataFrames
- A **DataFrame** is a 2-dimensional table consisting of rows and columns.
- Pandas uses a `NaN` designation for cells that have a missing value. It is short for "not a number". Most operations on `NaN` values will produce `NaN` values.
- Like with a **Series**, Pandas assigns an index position/label to each **DataFrame** row.
- The **DataFrame** and **Series** have common and exclusive methods/attributes.
- The `hasnans` attribute exists only on a **Series**. The `columns` attribute exists only on a **DataFrame**.
- Some methods/attributes will return different types of data.
- The `info` method prints a summary of the pandas object, including non-null counts and data types.
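
The points above can be checked on a small example. This is a minimal sketch that uses made-up toy data (the names `s` and `df` and their values are assumptions, not the notebook's CSV files):

```python
import pandas as pd

# Hypothetical toy data standing in for the notebook's datasets.
s = pd.Series([1.0, None, 3.0])
df = pd.DataFrame({"Name": ["Ann", "Bob"], "Weight": [150.0, None]})

# A shared attribute can return differently shaped results.
print(s.shape)   # one-element tuple: (rows,)
print(df.shape)  # two-element tuple: (rows, columns)

# hasnans exists only on a Series; columns exists only on a DataFrame.
print(s.hasnans)
print(hasattr(df, "hasnans"))
print(list(df.columns))
print(hasattr(s, "columns"))

# To check a single DataFrame column for missing values, extract it first.
print(df["Weight"].hasnans)
```

Extracting a column first is one way to reach Series-only attributes from a DataFrame.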

In [3]:
nba = pd.read_csv("nba.csv")
nba

Unnamed: 0,Name,Team,Position,Height,Weight,College,Salary
0,Saddiq Bey,Atlanta Hawks,F,6-7,215.0,Villanova,4556983.0
1,Bogdan Bogdanovic,Atlanta Hawks,G,6-5,225.0,Fenerbahce,18700000.0
2,Kobe Bufkin,Atlanta Hawks,G,6-5,195.0,Michigan,4094244.0
3,Clint Capela,Atlanta Hawks,C,6-10,256.0,Elan Chalon,20616000.0
4,Bruno Fernando,Atlanta Hawks,F-C,6-10,240.0,Maryland,2581522.0
...,...,...,...,...,...,...,...
587,Ryan Rollins,Washington Wizards,G,6-3,180.0,Toledo,1719864.0
588,Landry Shamet,Washington Wizards,G,6-4,190.0,Wichita State,10250000.0
589,Tristan Vukcevic,Washington Wizards,F,6-10,220.0,Real Madrid,
590,Delon Wright,Washington Wizards,G,6-5,185.0,Utah,8195122.0


In [4]:
s = pd.Series([1, 2, 3, 4, 5])
s

0    1
1    2
2    3
3    4
4    5
dtype: int64

In [6]:
nba.head()

Unnamed: 0,Name,Team,Position,Height,Weight,College,Salary
0,Saddiq Bey,Atlanta Hawks,F,6-7,215.0,Villanova,4556983.0
1,Bogdan Bogdanovic,Atlanta Hawks,G,6-5,225.0,Fenerbahce,18700000.0
2,Kobe Bufkin,Atlanta Hawks,G,6-5,195.0,Michigan,4094244.0
3,Clint Capela,Atlanta Hawks,C,6-10,256.0,Elan Chalon,20616000.0
4,Bruno Fernando,Atlanta Hawks,F-C,6-10,240.0,Maryland,2581522.0


In [7]:
nba.tail(n=10)

Unnamed: 0,Name,Team,Position,Height,Weight,College,Salary
582,Kyle Kuzma,Washington Wizards,F,6-9,221.0,Utah,25568182.0
583,Mike Muscala,Washington Wizards,F-C,6-11,240.0,Bucknell,3500000.0
584,Kendrick Nunn,Washington Wizards,G,6-3,190.0,Oakland,
585,Eugene Omoruyi,Washington Wizards,F,6-6,235.0,Oregon,559782.0
586,Jordan Poole,Washington Wizards,G,6-4,194.0,Michigan,27955357.0
587,Ryan Rollins,Washington Wizards,G,6-3,180.0,Toledo,1719864.0
588,Landry Shamet,Washington Wizards,G,6-4,190.0,Wichita State,10250000.0
589,Tristan Vukcevic,Washington Wizards,F,6-10,220.0,Real Madrid,
590,Delon Wright,Washington Wizards,G,6-5,185.0,Utah,8195122.0
591,,,,,,,


In [8]:
s.index

RangeIndex(start=0, stop=5, step=1)

In [9]:
nba.index

RangeIndex(start=0, stop=592, step=1)

In [11]:
s.values

array([1, 2, 3, 4, 5], dtype=int64)

In [12]:
nba.values

array([['Saddiq Bey', 'Atlanta Hawks', 'F', ..., 215.0, 'Villanova',
        4556983.0],
       ['Bogdan Bogdanovic', 'Atlanta Hawks', 'G', ..., 225.0,
        'Fenerbahce', 18700000.0],
       ['Kobe Bufkin', 'Atlanta Hawks', 'G', ..., 195.0, 'Michigan',
        4094244.0],
       ...,
       ['Tristan Vukcevic', 'Washington Wizards', 'F', ..., 220.0,
        'Real Madrid', nan],
       ['Delon Wright', 'Washington Wizards', 'G', ..., 185.0, 'Utah',
        8195122.0],
       [nan, nan, nan, ..., nan, nan, nan]], dtype=object)

In [13]:
s.shape

(5,)

In [15]:
nba.shape # provides the number of rows and columns in a data frame

(592, 7)

In [16]:
nba.dtypes # returns a Series: the index holds the column names and the values hold each column's data type

Name         object
Team         object
Position     object
Height       object
Weight      float64
College      object
Salary      float64
dtype: object

In [17]:
nba.tail() # Salary holds whole numbers, but NaN is a float, so pandas stores the column as float64 instead of an integer type

Unnamed: 0,Name,Team,Position,Height,Weight,College,Salary
587,Ryan Rollins,Washington Wizards,G,6-3,180.0,Toledo,1719864.0
588,Landry Shamet,Washington Wizards,G,6-4,190.0,Wichita State,10250000.0
589,Tristan Vukcevic,Washington Wizards,F,6-10,220.0,Real Madrid,
590,Delon Wright,Washington Wizards,G,6-5,185.0,Utah,8195122.0
591,,,,,,,


In [18]:
s.hasnans # does our series have missing values?

False

In [20]:
nba.hasnans #hasnans is not available for a data frame, only a series

AttributeError: 'DataFrame' object has no attribute 'hasnans'

In [23]:
nba.columns

Index(['Name', 'Team', 'Position', 'Height', 'Weight', 'College', 'Salary'], dtype='object')

In [24]:
nba.head() # pandas assigns each column a position just like the rows, starting at 0, so Name is at position 0

Unnamed: 0,Name,Team,Position,Height,Weight,College,Salary
0,Saddiq Bey,Atlanta Hawks,F,6-7,215.0,Villanova,4556983.0
1,Bogdan Bogdanovic,Atlanta Hawks,G,6-5,225.0,Fenerbahce,18700000.0
2,Kobe Bufkin,Atlanta Hawks,G,6-5,195.0,Michigan,4094244.0
3,Clint Capela,Atlanta Hawks,C,6-10,256.0,Elan Chalon,20616000.0
4,Bruno Fernando,Atlanta Hawks,F-C,6-10,240.0,Maryland,2581522.0


In [25]:
nba.axes # the RangeIndex is the rows and the Index is the columns. Provides info on both rows and cols

[RangeIndex(start=0, stop=592, step=1),
 Index(['Name', 'Team', 'Position', 'Height', 'Weight', 'College', 'Salary'], dtype='object')]

In [27]:
s.info()

<class 'pandas.core.series.Series'>
RangeIndex: 5 entries, 0 to 4
Series name: None
Non-Null Count  Dtype
--------------  -----
5 non-null      int64
dtypes: int64(1)
memory usage: 172.0 bytes


In [28]:
nba.info() # this is useful because it gives you missing values as Non-Null count. Position is missing 8 (592-584) values

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 592 entries, 0 to 591
Data columns (total 7 columns):
 #   Column    Non-Null Count  Dtype  
---  ------    --------------  -----  
 0   Name      591 non-null    object 
 1   Team      591 non-null    object 
 2   Position  584 non-null    object 
 3   Height    585 non-null    object 
 4   Weight    584 non-null    float64
 5   College   578 non-null    object 
 6   Salary    488 non-null    float64
dtypes: float64(2), object(5)
memory usage: 32.5+ KB


## Differences between Shared Methods
- The `sum` method adds a **Series's** values.
- On a **DataFrame**, the `sum` method defaults to adding the values by traversing the index (row values).
- The `axis` parameter customizes the direction that we add across. Pass `"columns"` or `1` to add "across" the columns.
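
As a small illustration of the two directions, the sketch below rebuilds only the first two rows of the revenue table by hand (the variable names are assumptions chosen for the example):

```python
import pandas as pd

# First two rows of the revenue data, typed in by hand for illustration.
rev = pd.DataFrame(
    {"New York": [985, 738], "Los Angeles": [122, 788], "Miami": [499, 534]},
    index=["1/1/26", "1/2/26"],
)

col_totals = rev.sum()                # default: one total per column
row_totals = rev.sum(axis="columns")  # one total per row
grand_total = row_totals.sum()        # a Series sum collapses to one number

print(col_totals)
print(row_totals)
print(grand_total)
```

Summing the row totals and summing the column totals give the same grand total, which is a quick sanity check on either direction.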

In [31]:
rev = pd.read_csv("revenue.csv", index_col="Date")
rev

Unnamed: 0_level_0,New York,Los Angeles,Miami
Date,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
1/1/26,985,122,499
1/2/26,738,788,534
1/3/26,14,20,933
1/4/26,730,904,885
1/5/26,114,71,253
1/6/26,936,502,497
1/7/26,123,996,115
1/8/26,935,492,886
1/9/26,846,954,823
1/10/26,54,285,216


In [32]:
s = pd.Series([1,2,3])  # for a Series, the sum function is easy to apply: there is only one set of values
s.sum()

6

In [36]:
rev.sum() # on a DataFrame we must choose a direction: down each column or across each row. The default sums each column

New York       5475
Los Angeles    5134
Miami          5641
dtype: int64

In [38]:
rev.sum(axis="index") # this sums along the row index, so each column is summed

New York       5475
Los Angeles    5134
Miami          5641
dtype: int64

In [39]:
rev.sum(axis="columns") # this sums across the columns for each row

Date
1/1/26     1606
1/2/26     2060
1/3/26      967
1/4/26     2519
1/5/26      438
1/6/26     1935
1/7/26     1234
1/8/26     2313
1/9/26     2623
1/10/26     555
dtype: int64

In [43]:
rev.sum(axis="columns").sum()

16250

## Select One Column from a DataFrame
- We can use attribute syntax (`df.column_name`) to select a column from a **DataFrame**. The syntax will not work if the column name has spaces.
- We can also use square bracket syntax (`df["column name"]`) which will work for any column name.
- Pandas extracts a column from a **DataFrame** as a **Series**.
- The **Series** is a view, so changes to the **Series** *will* affect the **DataFrame**.
- Pandas will display a warning if you mutate the **Series**. Use the `copy` method to create a duplicate.
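
A minimal sketch of both selection syntaxes and of the `copy` method, using a hypothetical two-row table (the `Home City` column is invented here to show a name containing a space):

```python
import pandas as pd

# Hypothetical data; "Home City" is an assumed column name with a space.
df = pd.DataFrame(
    {"Team": ["Hawks", "Wizards"], "Home City": ["Atlanta", "Washington"]}
)

teams = df["Team"]        # bracket syntax returns a Series
cities = df["Home City"]  # bracket syntax also handles names with spaces
same = df.Team.equals(df["Team"])  # attribute syntax works for simple names

# Work on an independent copy so edits cannot reach the DataFrame.
names = df["Team"].copy()
names.iloc[0] = "Celtics"

print(names.iloc[0])
print(df["Team"].iloc[0])
```

Attribute syntax has no way to express `Home City`, which is why bracket syntax is the safer habit.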

In [44]:
nba = pd.read_csv("nba.csv")
nba

Unnamed: 0,Name,Team,Position,Height,Weight,College,Salary
0,Saddiq Bey,Atlanta Hawks,F,6-7,215.0,Villanova,4556983.0
1,Bogdan Bogdanovic,Atlanta Hawks,G,6-5,225.0,Fenerbahce,18700000.0
2,Kobe Bufkin,Atlanta Hawks,G,6-5,195.0,Michigan,4094244.0
3,Clint Capela,Atlanta Hawks,C,6-10,256.0,Elan Chalon,20616000.0
4,Bruno Fernando,Atlanta Hawks,F-C,6-10,240.0,Maryland,2581522.0
...,...,...,...,...,...,...,...
587,Ryan Rollins,Washington Wizards,G,6-3,180.0,Toledo,1719864.0
588,Landry Shamet,Washington Wizards,G,6-4,190.0,Wichita State,10250000.0
589,Tristan Vukcevic,Washington Wizards,F,6-10,220.0,Real Madrid,
590,Delon Wright,Washington Wizards,G,6-5,185.0,Utah,8195122.0


In [51]:
nba.Team # attribute syntax; avoid it, because it cannot select a column whose name contains a space

0           Atlanta Hawks
1           Atlanta Hawks
2           Atlanta Hawks
3           Atlanta Hawks
4           Atlanta Hawks
              ...        
587    Washington Wizards
588    Washington Wizards
589    Washington Wizards
590    Washington Wizards
591                   NaN
Name: Team, Length: 592, dtype: object

In [53]:
nba["Team"] # this method will allow you to select the column even if it has a space

0           Atlanta Hawks
1           Atlanta Hawks
2           Atlanta Hawks
3           Atlanta Hawks
4           Atlanta Hawks
              ...        
587    Washington Wizards
588    Washington Wizards
589    Washington Wizards
590    Washington Wizards
591                   NaN
Name: Team, Length: 592, dtype: object

In [60]:
Teams = nba["Team"]
Teams

0           Atlanta Hawks
1           Atlanta Hawks
2           Atlanta Hawks
3           Atlanta Hawks
4           Atlanta Hawks
              ...        
587    Washington Wizards
588    Washington Wizards
589    Washington Wizards
590    Washington Wizards
591                   NaN
Name: Team, Length: 592, dtype: object

In [61]:
names = nba["Name"]
names

0             Saddiq Bey
1      Bogdan Bogdanovic
2            Kobe Bufkin
3           Clint Capela
4         Bruno Fernando
             ...        
587         Ryan Rollins
588        Landry Shamet
589     Tristan Vukcevic
590         Delon Wright
591                  NaN
Name: Name, Length: 592, dtype: object

In [64]:
names.iloc[3]="Dr J" # the message below is a warning, not an error: we are modifying a Series extracted from the DataFrame

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  names.iloc[3]="Dr J"


In [65]:
names.head() # the data has been changed

0           Saddiq Bey
1    Bogdan Bogdanovic
2          Kobe Bufkin
3                 Dr J
4       Bruno Fernando
Name: Name, dtype: object

In [70]:
names = nba["Name"].copy()
names

0             Saddiq Bey
1      Bogdan Bogdanovic
2            Kobe Bufkin
3                   Dr J
4         Bruno Fernando
             ...        
587         Ryan Rollins
588        Landry Shamet
589     Tristan Vukcevic
590         Delon Wright
591                  NaN
Name: Name, Length: 592, dtype: object

In [71]:
names.iloc[3] = "Michael Jordan"

In [72]:
names.head() # the change happens in the copy but not the original nba df

0           Saddiq Bey
1    Bogdan Bogdanovic
2          Kobe Bufkin
3       Michael Jordan
4       Bruno Fernando
Name: Name, dtype: object

In [73]:
nba.head() # because we made a copy, Dr J was not overwritten

Unnamed: 0,Name,Team,Position,Height,Weight,College,Salary
0,Saddiq Bey,Atlanta Hawks,F,6-7,215.0,Villanova,4556983.0
1,Bogdan Bogdanovic,Atlanta Hawks,G,6-5,225.0,Fenerbahce,18700000.0
2,Kobe Bufkin,Atlanta Hawks,G,6-5,195.0,Michigan,4094244.0
3,Dr J,Atlanta Hawks,C,6-10,256.0,Elan Chalon,20616000.0
4,Bruno Fernando,Atlanta Hawks,F-C,6-10,240.0,Maryland,2581522.0


In [None]:
# selecting multiple columns

In [3]:
import pandas as pd
nba = pd.read_csv("nba.csv")
nba

Unnamed: 0,Name,Team,Position,Height,Weight,College,Salary
0,Saddiq Bey,Atlanta Hawks,F,6-7,215.0,Villanova,4556983.0
1,Bogdan Bogdanovic,Atlanta Hawks,G,6-5,225.0,Fenerbahce,18700000.0
2,Kobe Bufkin,Atlanta Hawks,G,6-5,195.0,Michigan,4094244.0
3,Clint Capela,Atlanta Hawks,C,6-10,256.0,Elan Chalon,20616000.0
4,Bruno Fernando,Atlanta Hawks,F-C,6-10,240.0,Maryland,2581522.0
...,...,...,...,...,...,...,...
587,Ryan Rollins,Washington Wizards,G,6-3,180.0,Toledo,1719864.0
588,Landry Shamet,Washington Wizards,G,6-4,190.0,Wichita State,10250000.0
589,Tristan Vukcevic,Washington Wizards,F,6-10,220.0,Real Madrid,
590,Delon Wright,Washington Wizards,G,6-5,185.0,Utah,8195122.0


In [6]:
# two ways to select a subset of columns
nba[["Name","Team"]] # The first square brackets mean extraction. The second set are because it is a list
columns_selected = ["Name","Team"] # you can also create a column list and then call them in the statement below
nba[columns_selected]

Unnamed: 0,Name,Team
0,Saddiq Bey,Atlanta Hawks
1,Bogdan Bogdanovic,Atlanta Hawks
2,Kobe Bufkin,Atlanta Hawks
3,Clint Capela,Atlanta Hawks
4,Bruno Fernando,Atlanta Hawks
...,...,...
587,Ryan Rollins,Washington Wizards
588,Landry Shamet,Washington Wizards
589,Tristan Vukcevic,Washington Wizards
590,Delon Wright,Washington Wizards
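
A short sketch of list-based selection on hypothetical data follows (the rows and the variable names are assumptions). It shows that a list of column names always yields a **DataFrame**, even when the list has one entry, and that the list order sets the column order:

```python
import pandas as pd

# Hypothetical three-column table for illustration.
df = pd.DataFrame(
    {
        "Name": ["Ann", "Bob"],
        "Team": ["Hawks", "Wizards"],
        "Salary": [1.0, 2.0],
    }
)

subset = df[["Name", "Team"]]     # DataFrame with two columns
reordered = df[["Team", "Name"]]  # the list order controls the column order
one_col_frame = df[["Name"]]      # one-item list: still a DataFrame
one_col_series = df["Name"]       # plain string: a Series

print(type(one_col_frame).__name__, type(one_col_series).__name__)
```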


In [None]:
# add a column to a df

In [81]:
nba["Sport"] = "Basketball" # adds a "Sport" column at the end of the table with "Basketball" in every row

In [82]:
nba.head()

Unnamed: 0,Name,Team,Position,Height,Weight,College,Salary,Sport
0,Saddiq Bey,Atlanta Hawks,F,6-7,215.0,Villanova,4556983.0,Basketball
1,Bogdan Bogdanovic,Atlanta Hawks,G,6-5,225.0,Fenerbahce,18700000.0,Basketball
2,Kobe Bufkin,Atlanta Hawks,G,6-5,195.0,Michigan,4094244.0,Basketball
3,Clint Capela,Atlanta Hawks,C,6-10,256.0,Elan Chalon,20616000.0,Basketball
4,Bruno Fernando,Atlanta Hawks,F-C,6-10,240.0,Maryland,2581522.0,Basketball


In [85]:
nba.insert(loc=3, column="Test", value="Test")  # inserts a new column at position 3 (counting from 0) and fills it with the value "Test"

In [86]:
nba.head()

Unnamed: 0,Name,Team,Position,Test,Height,Weight,College,Salary,Sport
0,Saddiq Bey,Atlanta Hawks,F,Test,6-7,215.0,Villanova,4556983.0,Basketball
1,Bogdan Bogdanovic,Atlanta Hawks,G,Test,6-5,225.0,Fenerbahce,18700000.0,Basketball
2,Kobe Bufkin,Atlanta Hawks,G,Test,6-5,195.0,Michigan,4094244.0,Basketball
3,Clint Capela,Atlanta Hawks,C,Test,6-10,256.0,Elan Chalon,20616000.0,Basketball
4,Bruno Fernando,Atlanta Hawks,F-C,Test,6-10,240.0,Maryland,2581522.0,Basketball


In [8]:
#Salary_2x = nba["Salary"] * 2 
nba.insert(loc=7, column="Salary_2x", value=nba["Salary"] * 2)
nba.head()

Unnamed: 0,Name,Team,Position,Height,Weight,College,Salary,Salary_2x
0,Saddiq Bey,Atlanta Hawks,F,6-7,215.0,Villanova,4556983.0,9113966.0
1,Bogdan Bogdanovic,Atlanta Hawks,G,6-5,225.0,Fenerbahce,18700000.0,37400000.0
2,Kobe Bufkin,Atlanta Hawks,G,6-5,195.0,Michigan,4094244.0,8188488.0
3,Clint Capela,Atlanta Hawks,C,6-10,256.0,Elan Chalon,20616000.0,41232000.0
4,Bruno Fernando,Atlanta Hawks,F-C,6-10,240.0,Maryland,2581522.0,5163044.0


In [11]:
nba["Weight"] * 2 # this code tests it
nba["Weight_2x"] = nba["Weight"] * 2 # this code creates the column.  It automatically places it at the end
nba.head()

Unnamed: 0,Name,Team,Position,Height,Weight,College,Salary,Salary_2x,Weight_2x
0,Saddiq Bey,Atlanta Hawks,F,6-7,215.0,Villanova,4556983.0,9113966.0,430.0
1,Bogdan Bogdanovic,Atlanta Hawks,G,6-5,225.0,Fenerbahce,18700000.0,37400000.0,450.0
2,Kobe Bufkin,Atlanta Hawks,G,6-5,195.0,Michigan,4094244.0,8188488.0,390.0
3,Clint Capela,Atlanta Hawks,C,6-10,256.0,Elan Chalon,20616000.0,41232000.0,512.0
4,Bruno Fernando,Atlanta Hawks,F-C,6-10,240.0,Maryland,2581522.0,5163044.0,480.0


In [21]:
#nba = nba.drop(columns=['Salary_3x'])
nba.columns
nba.head()
nba.insert(loc=8, column="Salary_3x", value=nba["Salary"] * 3)
nba.head()

Unnamed: 0,Name,Team,Position,Height,Weight,College,Salary,Salary_2x,Salary_3x,Weight_2x
0,Saddiq Bey,Atlanta Hawks,F,6-7,215.0,Villanova,4556983.0,9113966.0,13670949.0,430.0
1,Bogdan Bogdanovic,Atlanta Hawks,G,6-5,225.0,Fenerbahce,18700000.0,37400000.0,56100000.0,450.0
2,Kobe Bufkin,Atlanta Hawks,G,6-5,195.0,Michigan,4094244.0,8188488.0,12282732.0,390.0
3,Clint Capela,Atlanta Hawks,C,6-10,256.0,Elan Chalon,20616000.0,41232000.0,61848000.0,512.0
4,Bruno Fernando,Atlanta Hawks,F-C,6-10,240.0,Maryland,2581522.0,5163044.0,7744566.0,480.0


In [29]:
#nba["Salary"] - 5000000 # reduces salary by 5M
#nba["Salary"].sub(5000000) # same as above
nba["New Salary"] = nba["Salary"].sub(5000000) # creates a new column, "New Salary"
nba

Unnamed: 0,Name,Team,Position,Height,Weight,College,Salary,Salary_2x,Salary_3x,Weight_2x,New Salary
0,Saddiq Bey,Atlanta Hawks,F,6-7,215.0,Villanova,4556983.0,9113966.0,13670949.0,430.0,-443017.0
1,Bogdan Bogdanovic,Atlanta Hawks,G,6-5,225.0,Fenerbahce,18700000.0,37400000.0,56100000.0,450.0,13700000.0
2,Kobe Bufkin,Atlanta Hawks,G,6-5,195.0,Michigan,4094244.0,8188488.0,12282732.0,390.0,-905756.0
3,Clint Capela,Atlanta Hawks,C,6-10,256.0,Elan Chalon,20616000.0,41232000.0,61848000.0,512.0,15616000.0
4,Bruno Fernando,Atlanta Hawks,F-C,6-10,240.0,Maryland,2581522.0,5163044.0,7744566.0,480.0,-2418478.0
...,...,...,...,...,...,...,...,...,...,...,...
587,Ryan Rollins,Washington Wizards,G,6-3,180.0,Toledo,1719864.0,3439728.0,5159592.0,360.0,-3280136.0
588,Landry Shamet,Washington Wizards,G,6-4,190.0,Wichita State,10250000.0,20500000.0,30750000.0,380.0,5250000.0
589,Tristan Vukcevic,Washington Wizards,F,6-10,220.0,Real Madrid,,,,440.0,
590,Delon Wright,Washington Wizards,G,6-5,185.0,Utah,8195122.0,16390244.0,24585366.0,370.0,3195122.0
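
The column-adding cells above can be condensed into one sketch. The data and the `Salary_less_50` column name are assumptions made for the example:

```python
import pandas as pd

# Hypothetical data with one missing salary.
df = pd.DataFrame({"Name": ["A", "B", "C"], "Salary": [100.0, None, 300.0]})

# Bracket assignment appends the column at the end; a scalar fills every row.
df["Sport"] = "Basketball"

# insert places the column at a zero-based position instead.
df.insert(loc=1, column="Salary_2x", value=df["Salary"] * 2)

# Arithmetic on NaN yields NaN, so the missing salary stays missing.
df["Salary_less_50"] = df["Salary"].sub(50)

# insert refuses to add a column name that already exists.
duplicate_error = None
try:
    df.insert(loc=0, column="Sport", value="Hockey")
except ValueError as err:
    duplicate_error = str(err)

print(list(df.columns))
print(duplicate_error)
```

The duplicate check is a likely reason for the commented-out `drop` call in the cell that inserts `Salary_3x`: rerunning an `insert` cell fails until the existing column is removed.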


## A Review of the value_counts Method
- The `value_counts` method counts the number of times that each unique value occurs in a **Series**.
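
A minimal sketch on a hypothetical position column (the values are invented) shows counts, proportions, and how missing values are treated:

```python
import pandas as pd

# Hypothetical positions, including one missing value.
positions = pd.Series(["G", "F", "G", "C", "G", None])

counts = positions.value_counts()                    # missing values are skipped
shares = positions.value_counts(normalize=True)      # proportions that sum to 1
with_missing = positions.value_counts(dropna=False)  # count missing values too

print(counts)
print(shares)
print(with_missing)
```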

In [32]:
nba["Team"].value_counts() # counts how often each unique value occurs in one column
nba.value_counts() # on a DataFrame, counts each unique combination of values across all columns

Name             Team                    Position  Height  Weight  College                   Salary      Salary_2x   Salary_3x    Weight_2x  New Salary 
AJ Green         Milwaukee Bucks         G         6-5     190.0   Northern Iowa             1901769.0   3803538.0   5705307.0    380.0      -3098231.0     1
Leonard Miller   Minnesota Timberwolves  F         6-10    210.0   NBA G League Ignite       1800000.0   3600000.0   5400000.0    420.0      -3200000.0     1
Malaki Branham   San Antonio Spurs       F         6-4     180.0   Ohio State                3071880.0   6143760.0   9215640.0    360.0      -1928120.0     1
Malachi Flynn    Toronto Raptors         G         6-1     175.0   San Diego State           3873024.0   7746048.0   11619072.0   350.0      -1126976.0     1
Luke Kornet      Boston Celtics          C-F       7-2     250.0   Vanderbilt                2413304.0   4826608.0   7239912.0    500.0      -2586696.0     1
                                                         

In [109]:
nba["Position"].value_counts() 

Position
G      229
F      187
C       47
G-F     46
F-C     37
C-F     23
F-G     15
Name: count, dtype: int64

In [110]:
nba["Position"].value_counts(normalize=True) # proportions (relative frequencies) instead of counts

Position
G      0.392123
F      0.320205
C      0.080479
G-F    0.078767
F-C    0.063356
C-F    0.039384
F-G    0.025685
Name: proportion, dtype: float64

In [111]:
nba["Salary"].value_counts()

Salary
559782.0      59
1119563.0     27
3196448.0     13
2019706.0      9
1719864.0      8
              ..
9600000.0      1
3536280.0      1
8809284.0      1
40806300.0     1
8195122.0      1
Name: count, Length: 298, dtype: int64

In [35]:
nba["Salary"].mean() 

9218978.288934426

In [113]:
nba["Salary"].median() # tip: press the Tab key after the period to see all of the available methods

4018638.5

## Drop Rows with Missing Values
- Pandas uses a `NaN` designation for cells that have a missing value.
- The `dropna` method deletes rows with missing values. Its default behavior is to remove a row if it has *any* missing values.
- Pass the `how` parameter an argument of "all" to delete rows where all the values are `NaN`.
- The `subset` parameter customizes/limits the columns that pandas will use to drop rows with missing values.
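
The interaction between `how` and `subset` is easiest to see on a tiny table. The four rows below are hypothetical and were chosen so that each call keeps a different number of rows:

```python
import pandas as pd

# Hypothetical rows: complete, one gap, two gaps, entirely empty.
df = pd.DataFrame(
    {
        "Name": ["A", "B", "C", None],
        "College": ["Duke", None, None, None],
        "Salary": [1.0, 2.0, None, None],
    }
)

any_missing = df.dropna()           # drop rows with at least one NaN
all_missing = df.dropna(how="all")  # drop only rows that are entirely NaN

# With subset, only the listed columns are inspected.
either = df.dropna(subset=["College", "Salary"])           # either one missing
both = df.dropna(subset=["College", "Salary"], how="all")  # both missing

print(len(any_missing), len(all_missing), len(either), len(both))
```

None of these calls modifies `df`; each returns a new **DataFrame**.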

In [5]:
nba = pd.read_csv("nba.csv")
nba

Unnamed: 0,Name,Team,Position,Height,Weight,College,Salary
0,Saddiq Bey,Atlanta Hawks,F,6-7,215.0,Villanova,4556983.0
1,Bogdan Bogdanovic,Atlanta Hawks,G,6-5,225.0,Fenerbahce,18700000.0
2,Kobe Bufkin,Atlanta Hawks,G,6-5,195.0,Michigan,4094244.0
3,Clint Capela,Atlanta Hawks,C,6-10,256.0,Elan Chalon,20616000.0
4,Bruno Fernando,Atlanta Hawks,F-C,6-10,240.0,Maryland,2581522.0
...,...,...,...,...,...,...,...
587,Ryan Rollins,Washington Wizards,G,6-3,180.0,Toledo,1719864.0
588,Landry Shamet,Washington Wizards,G,6-4,190.0,Wichita State,10250000.0
589,Tristan Vukcevic,Washington Wizards,F,6-10,220.0,Real Madrid,
590,Delon Wright,Washington Wizards,G,6-5,185.0,Utah,8195122.0


In [37]:
remove_nan_ds = nba.dropna() # drops any row with a NaN in it
remove_nan_ds

Unnamed: 0,Name,Team,Position,Height,Weight,College,Salary
0,Saddiq Bey,Atlanta Hawks,F,6-7,215.0,Villanova,4556983.0
1,Bogdan Bogdanovic,Atlanta Hawks,G,6-5,225.0,Fenerbahce,18700000.0
2,Kobe Bufkin,Atlanta Hawks,G,6-5,195.0,Michigan,4094244.0
3,Clint Capela,Atlanta Hawks,C,6-10,256.0,Elan Chalon,20616000.0
4,Bruno Fernando,Atlanta Hawks,F-C,6-10,240.0,Maryland,2581522.0
...,...,...,...,...,...,...,...
585,Eugene Omoruyi,Washington Wizards,F,6-6,235.0,Oregon,559782.0
586,Jordan Poole,Washington Wizards,G,6-4,194.0,Michigan,27955357.0
587,Ryan Rollins,Washington Wizards,G,6-3,180.0,Toledo,1719864.0
588,Landry Shamet,Washington Wizards,G,6-4,190.0,Wichita State,10250000.0


In [118]:
remove_nan_ds.shape #went from 592 rows to 475 rows

(475, 7)

In [119]:
nba.dropna(how="any") #This is default so same as above

Unnamed: 0,Name,Team,Position,Height,Weight,College,Salary
0,Saddiq Bey,Atlanta Hawks,F,6-7,215.0,Villanova,4556983.0
1,Bogdan Bogdanovic,Atlanta Hawks,G,6-5,225.0,Fenerbahce,18700000.0
2,Kobe Bufkin,Atlanta Hawks,G,6-5,195.0,Michigan,4094244.0
3,Clint Capela,Atlanta Hawks,C,6-10,256.0,Elan Chalon,20616000.0
4,Bruno Fernando,Atlanta Hawks,F-C,6-10,240.0,Maryland,2581522.0
...,...,...,...,...,...,...,...
585,Eugene Omoruyi,Washington Wizards,F,6-6,235.0,Oregon,559782.0
586,Jordan Poole,Washington Wizards,G,6-4,194.0,Michigan,27955357.0
587,Ryan Rollins,Washington Wizards,G,6-3,180.0,Toledo,1719864.0
588,Landry Shamet,Washington Wizards,G,6-4,190.0,Wichita State,10250000.0


In [120]:
nba.dropna(how="all") # only removes rows where all vars are missing

Unnamed: 0,Name,Team,Position,Height,Weight,College,Salary
0,Saddiq Bey,Atlanta Hawks,F,6-7,215.0,Villanova,4556983.0
1,Bogdan Bogdanovic,Atlanta Hawks,G,6-5,225.0,Fenerbahce,18700000.0
2,Kobe Bufkin,Atlanta Hawks,G,6-5,195.0,Michigan,4094244.0
3,Clint Capela,Atlanta Hawks,C,6-10,256.0,Elan Chalon,20616000.0
4,Bruno Fernando,Atlanta Hawks,F-C,6-10,240.0,Maryland,2581522.0
...,...,...,...,...,...,...,...
586,Jordan Poole,Washington Wizards,G,6-4,194.0,Michigan,27955357.0
587,Ryan Rollins,Washington Wizards,G,6-3,180.0,Toledo,1719864.0
588,Landry Shamet,Washington Wizards,G,6-4,190.0,Wichita State,10250000.0
589,Tristan Vukcevic,Washington Wizards,F,6-10,220.0,Real Madrid,


In [6]:
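nba.dropna(subset=["College", "Salary"], how="all") # limits the check to these columns. With how="all", a row is dropped only when both College and Salary are missing; the default how="any" drops it when either one is missing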

Unnamed: 0,Name,Team,Position,Height,Weight,College,Salary
0,Saddiq Bey,Atlanta Hawks,F,6-7,215.0,Villanova,4556983.0
1,Bogdan Bogdanovic,Atlanta Hawks,G,6-5,225.0,Fenerbahce,18700000.0
2,Kobe Bufkin,Atlanta Hawks,G,6-5,195.0,Michigan,4094244.0
3,Clint Capela,Atlanta Hawks,C,6-10,256.0,Elan Chalon,20616000.0
4,Bruno Fernando,Atlanta Hawks,F-C,6-10,240.0,Maryland,2581522.0
...,...,...,...,...,...,...,...
586,Jordan Poole,Washington Wizards,G,6-4,194.0,Michigan,27955357.0
587,Ryan Rollins,Washington Wizards,G,6-3,180.0,Toledo,1719864.0
588,Landry Shamet,Washington Wizards,G,6-4,190.0,Wichita State,10250000.0
589,Tristan Vukcevic,Washington Wizards,F,6-10,220.0,Real Madrid,


## Fill in Missing Values with the fillna Method
- The `fillna` method replaces missing `NaN` values with its argument.
- The `fillna` method is available on both **DataFrames** and **Series**.
- An extracted **Series** is a view on the original **DataFrame**, but the `fillna` method returns a copy.
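
A minimal sketch of `fillna` on a **Series** and on a **DataFrame**, using hypothetical two-row data. Passing a dictionary to the DataFrame version fills each named column with its own value:

```python
import pandas as pd

# Hypothetical data with one missing value per column.
df = pd.DataFrame({"College": ["Duke", None], "Salary": [1.0, None]})

# fillna returns a new Series; the DataFrame is untouched so far.
filled_college = df["College"].fillna("unknown")
still_missing = df["College"].isna().sum()

# Assign the result back to make the change stick.
df["College"] = filled_college

# On a DataFrame, a dict maps column names to their fill values.
df_filled = df.fillna({"Salary": 0})

print(still_missing)
print(df_filled)
```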

In [38]:
nba = pd.read_csv("nba.csv").dropna(how="all")
nba

Unnamed: 0,Name,Team,Position,Height,Weight,College,Salary
0,Saddiq Bey,Atlanta Hawks,F,6-7,215.0,Villanova,4556983.0
1,Bogdan Bogdanovic,Atlanta Hawks,G,6-5,225.0,Fenerbahce,18700000.0
2,Kobe Bufkin,Atlanta Hawks,G,6-5,195.0,Michigan,4094244.0
3,Clint Capela,Atlanta Hawks,C,6-10,256.0,Elan Chalon,20616000.0
4,Bruno Fernando,Atlanta Hawks,F-C,6-10,240.0,Maryland,2581522.0
...,...,...,...,...,...,...,...
586,Jordan Poole,Washington Wizards,G,6-4,194.0,Michigan,27955357.0
587,Ryan Rollins,Washington Wizards,G,6-3,180.0,Toledo,1719864.0
588,Landry Shamet,Washington Wizards,G,6-4,190.0,Wichita State,10250000.0
589,Tristan Vukcevic,Washington Wizards,F,6-10,220.0,Real Madrid,


In [131]:
nba["College"] = nba["College"].fillna(value="unknown")  # replaces the missing values in "College" with "unknown" and overwrites the column
nba

Unnamed: 0,Name,Team,Position,Height,Weight,College,Salary
0,Saddiq Bey,Atlanta Hawks,F,6-7,215.0,Villanova,4556983.0
1,Bogdan Bogdanovic,Atlanta Hawks,G,6-5,225.0,Fenerbahce,18700000.0
2,Kobe Bufkin,Atlanta Hawks,G,6-5,195.0,Michigan,4094244.0
3,Clint Capela,Atlanta Hawks,C,6-10,256.0,Elan Chalon,20616000.0
4,Bruno Fernando,Atlanta Hawks,F-C,6-10,240.0,Maryland,2581522.0
...,...,...,...,...,...,...,...
586,Jordan Poole,Washington Wizards,G,6-4,194.0,Michigan,27955357.0
587,Ryan Rollins,Washington Wizards,G,6-3,180.0,Toledo,1719864.0
588,Landry Shamet,Washington Wizards,G,6-4,190.0,Wichita State,10250000.0
589,Tristan Vukcevic,Washington Wizards,F,6-10,220.0,Real Madrid,0.0


In [51]:
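nba.value_counts(["College"]) == "unknown" # compares each count (an integer) to the string "unknown", so every result is False; (nba["College"] == "unknown").sum() would count the filled-in values instead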

College               
Kentucky                  False
Duke                      False
UCLA                      False
Arizona                   False
Gonzaga                   False
                          ...  
Loyola-Maryland           False
Louisiana Tech            False
Lipscomb                  False
Lietuvos rytas Vilnius    False
Zalgiris                  False
Name: count, Length: 182, dtype: bool

## The astype Method I
- The `astype` method converts a **Series's** values to a specified type.
- Pass in the specified type as either a string or the core Python data type.
- Pandas cannot convert `NaN` values to integer types, so we need to eliminate/replace them before we perform the conversion.
- The `dtypes` attribute returns a **Series** with the **DataFrame's** columns and their types.
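
The sketch below shows why the missing values have to go first. It uses a hypothetical salary **Series** and asks for `"int64"` explicitly, since a plain `"int"` can map to a different width depending on the platform and pandas version:

```python
import pandas as pd

# Hypothetical salaries with one missing value.
salary = pd.Series([4556983.0, None, 8195122.0])

# Converting while NaN is present fails.
try:
    salary.astype("int64")
    failed = False
except (ValueError, TypeError):
    failed = True

# Replace the NaN first, then convert.
clean = salary.fillna(0).astype("int64")

print(failed)
print(clean.dtype)
print(salary.dtype)  # astype returned a new Series; the original is unchanged
```

Filling with 0 is only one option; whether 0 is a sensible stand-in for a missing salary depends on the analysis.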

In [133]:
nba = pd.read_csv("nba.csv").dropna(how="all")
nba

Unnamed: 0,Name,Team,Position,Height,Weight,College,Salary
0,Saddiq Bey,Atlanta Hawks,F,6-7,215.0,Villanova,4556983.0
1,Bogdan Bogdanovic,Atlanta Hawks,G,6-5,225.0,Fenerbahce,18700000.0
2,Kobe Bufkin,Atlanta Hawks,G,6-5,195.0,Michigan,4094244.0
3,Clint Capela,Atlanta Hawks,C,6-10,256.0,Elan Chalon,20616000.0
4,Bruno Fernando,Atlanta Hawks,F-C,6-10,240.0,Maryland,2581522.0
...,...,...,...,...,...,...,...
586,Jordan Poole,Washington Wizards,G,6-4,194.0,Michigan,27955357.0
587,Ryan Rollins,Washington Wizards,G,6-3,180.0,Toledo,1719864.0
588,Landry Shamet,Washington Wizards,G,6-4,190.0,Wichita State,10250000.0
589,Tristan Vukcevic,Washington Wizards,F,6-10,220.0,Real Madrid,


In [135]:
nba["Salary"] = nba["Salary"].fillna(value=0) 
nba

Unnamed: 0,Name,Team,Position,Height,Weight,College,Salary
0,Saddiq Bey,Atlanta Hawks,F,6-7,215.0,Villanova,4556983.0
1,Bogdan Bogdanovic,Atlanta Hawks,G,6-5,225.0,Fenerbahce,18700000.0
2,Kobe Bufkin,Atlanta Hawks,G,6-5,195.0,Michigan,4094244.0
3,Clint Capela,Atlanta Hawks,C,6-10,256.0,Elan Chalon,20616000.0
4,Bruno Fernando,Atlanta Hawks,F-C,6-10,240.0,Maryland,2581522.0
...,...,...,...,...,...,...,...
586,Jordan Poole,Washington Wizards,G,6-4,194.0,Michigan,27955357.0
587,Ryan Rollins,Washington Wizards,G,6-3,180.0,Toledo,1719864.0
588,Landry Shamet,Washington Wizards,G,6-4,190.0,Wichita State,10250000.0
589,Tristan Vukcevic,Washington Wizards,F,6-10,220.0,Real Madrid,0.0


In [136]:
nba.dtypes

Name         object
Team         object
Position     object
Height       object
Weight      float64
College      object
Salary      float64
dtype: object

In [138]:
nba["Salary"].astype("int") #this method does not overwrite the original

0       4556983
1      18700000
2       4094244
3      20616000
4       2581522
         ...   
586    27955357
587     1719864
588    10250000
589           0
590     8195122
Name: Salary, Length: 591, dtype: int32

In [139]:
nba.dtypes

Name         object
Team         object
Position     object
Height       object
Weight      float64
College      object
Salary      float64
dtype: object

In [140]:
nba["Salary"] = nba["Salary"].astype("int") # because nba["Salary"] is on the left of the assignment, the converted values overwrite the column

In [141]:
nba.dtypes

Name         object
Team         object
Position     object
Height       object
Weight      float64
College      object
Salary        int32
dtype: object

In [142]:
nba

Unnamed: 0,Name,Team,Position,Height,Weight,College,Salary
0,Saddiq Bey,Atlanta Hawks,F,6-7,215.0,Villanova,4556983
1,Bogdan Bogdanovic,Atlanta Hawks,G,6-5,225.0,Fenerbahce,18700000
2,Kobe Bufkin,Atlanta Hawks,G,6-5,195.0,Michigan,4094244
3,Clint Capela,Atlanta Hawks,C,6-10,256.0,Elan Chalon,20616000
4,Bruno Fernando,Atlanta Hawks,F-C,6-10,240.0,Maryland,2581522
...,...,...,...,...,...,...,...
586,Jordan Poole,Washington Wizards,G,6-4,194.0,Michigan,27955357
587,Ryan Rollins,Washington Wizards,G,6-3,180.0,Toledo,1719864
588,Landry Shamet,Washington Wizards,G,6-4,190.0,Wichita State,10250000
589,Tristan Vukcevic,Washington Wizards,F,6-10,220.0,Real Madrid,0


In [145]:
nba["Weight"] = nba["Weight"].fillna(value=0) 
nba["Weight"] = nba["Weight"].astype("int")
nba
nba.dtypes

Name        object
Team        object
Position    object
Height      object
Weight       int32
College     object
Salary       int32
dtype: object

## The astype Method II
- The `category` type is ideal for columns with a limited number of unique values.
- On a **DataFrame**, the `nunique` method returns a **Series** with the number of unique values in each column. On a **Series**, it returns a single integer.
- With categories, pandas does not create a separate value in memory for each "cell". Rather, the cells point to a single copy for each unique value.
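
To see the memory effect in isolation, the sketch below repeats three hypothetical team names many times and compares the footprint before and after the conversion. The exact byte counts depend on the pandas version, so only the comparison matters:

```python
import pandas as pd

# Three unique values repeated 1000 times each (hypothetical data).
teams = pd.Series(
    ["Atlanta Hawks", "Boston Celtics", "Washington Wizards"] * 1000
)

as_category = teams.astype("category")

# deep=True asks pandas to include the memory used by the strings themselves.
original_bytes = teams.memory_usage(deep=True)
category_bytes = as_category.memory_usage(deep=True)

print(teams.nunique())
print(original_bytes, category_bytes)
print(list(as_category.cat.categories))
```

The saving shrinks as the share of unique values grows, so a column such as `Name`, where almost every value is distinct, is a poor candidate.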

In [146]:
nba = pd.read_csv("nba.csv").dropna(how="all")
nba.tail()

Unnamed: 0,Name,Team,Position,Height,Weight,College,Salary
586,Jordan Poole,Washington Wizards,G,6-4,194.0,Michigan,27955357.0
587,Ryan Rollins,Washington Wizards,G,6-3,180.0,Toledo,1719864.0
588,Landry Shamet,Washington Wizards,G,6-4,190.0,Wichita State,10250000.0
589,Tristan Vukcevic,Washington Wizards,F,6-10,220.0,Real Madrid,
590,Delon Wright,Washington Wizards,G,6-5,185.0,Utah,8195122.0


In [147]:
nba["Team"].nunique()

30

In [149]:
nba.nunique()

Name        591
Team         30
Position      7
Height       20
Weight       93
College     182
Salary      298
dtype: int64

In [150]:
nba.info()

<class 'pandas.core.frame.DataFrame'>
Index: 591 entries, 0 to 590
Data columns (total 7 columns):
 #   Column    Non-Null Count  Dtype  
---  ------    --------------  -----  
 0   Name      591 non-null    object 
 1   Team      591 non-null    object 
 2   Position  584 non-null    object 
 3   Height    585 non-null    object 
 4   Weight    584 non-null    float64
 5   College   578 non-null    object 
 6   Salary    488 non-null    float64
dtypes: float64(2), object(5)
memory usage: 36.9+ KB


In [154]:
nba["Position"]= nba["Position"].astype("category")

In [155]:
nba["Team"]= nba["Team"].astype("category")

In [157]:
nba.info()

<class 'pandas.core.frame.DataFrame'>
Index: 591 entries, 0 to 590
Data columns (total 7 columns):
 #   Column    Non-Null Count  Dtype   
---  ------    --------------  -----   
 0   Name      591 non-null    object  
 1   Team      591 non-null    category
 2   Position  584 non-null    category
 3   Height    585 non-null    object  
 4   Weight    584 non-null    float64 
 5   College   578 non-null    object  
 6   Salary    488 non-null    float64 
dtypes: category(2), float64(2), object(3)
memory usage: 30.5+ KB


In [161]:
nba.dtypes

Name          object
Team        category
Position    category
Height        object
Weight       float64
College       object
Salary       float64
dtype: object

## Sort a DataFrame with the sort_values Method I
- The `sort_values` method sorts a **DataFrame** by the values in one or more columns. The default sort is an ascending one (alphabetical for strings).
- The first parameter (`by`) expects the column(s) to sort by.
- If sorting by a single column, pass a string with its name.
- The `ascending` parameter customizes the sort order.
- The `na_position` parameter controls where pandas places `NaN` values: `"last"` (the default) or `"first"`.
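
The parameters listed above can be tried on a tiny frame before running them on the full dataset. This is a sketch only: the names and salaries below are invented for illustration and do not come from `nba.csv`.

```python
import pandas as pd

# Hypothetical mini-roster; one salary is missing on purpose.
df = pd.DataFrame({
    "Name": ["Cara", "Abe", "Bo", "Dee"],
    "Salary": [300.0, None, 100.0, 200.0],
})

# Ascending (A to Z) is the default.
by_name = df.sort_values(by="Name")
# Descending sort; NaN rows still go last by default.
high_to_low = df.sort_values(by="Salary", ascending=False)
# Ascending sort with NaN rows moved to the top.
nan_first = df.sort_values(by="Salary", na_position="first")

print(by_name["Name"].tolist())      # ['Abe', 'Bo', 'Cara', 'Dee']
print(high_to_low["Name"].tolist())  # ['Cara', 'Dee', 'Bo', 'Abe']
print(nan_first["Name"].tolist())    # ['Abe', 'Bo', 'Dee', 'Cara']
```

Note that `sort_values` returns a new **DataFrame**; `df` itself keeps its original row order.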

In [162]:
nba = pd.read_csv("nba.csv")
nba.tail()

Unnamed: 0,Name,Team,Position,Height,Weight,College,Salary
587,Ryan Rollins,Washington Wizards,G,6-3,180.0,Toledo,1719864.0
588,Landry Shamet,Washington Wizards,G,6-4,190.0,Wichita State,10250000.0
589,Tristan Vukcevic,Washington Wizards,F,6-10,220.0,Real Madrid,
590,Delon Wright,Washington Wizards,G,6-5,185.0,Utah,8195122.0
591,,,,,,,


In [169]:
nba.sort_values(by="Name", ascending=False)

Unnamed: 0,Name,Team,Position,Height,Weight,College,Salary
370,Zion Williamson,New Orleans Pelicans,F,6-6,284.0,Duke,34005250.0
291,Ziaire Williams,Memphis Grizzlies,F,6-9,185.0,Stanford,4810200.0
149,Zeke Nnaji,Denver Nuggets,F-C,6-9,240.0,Arizona,4306281.0
83,Zach LaVine,Chicago Bulls,G,6-5,200.0,UCLA,40064220.0
515,Zach Collins,San Antonio Spurs,F-C,6-11,250.0,Gonzaga,7700000.0
...,...,...,...,...,...,...,...
141,Aaron Gordon,Denver Nuggets,F,6-8,235.0,Arizona,22266182.0
6,AJ Griffin,Atlanta Hawks,F,6-6,220.0,Duke,3712920.0
324,AJ Green,Milwaukee Bucks,G,6-5,190.0,Northern Iowa,1901769.0
122,A.J. Lawson,Dallas Mavericks,G,6-6,179.0,South Carolina,


In [174]:
nba.sort_values(by="Salary", ascending=False, na_position="last")

Unnamed: 0,Name,Team,Position,Height,Weight,College,Salary
175,Stephen Curry,Golden State Warriors,G,6-2,185.0,Davidson,51915615.0
461,Kevin Durant,Phoenix Suns,F,6-10,240.0,Texas,47649433.0
436,Joel Embiid,Philadelphia 76ers,C-F,7-0,280.0,Kansas,47607350.0
145,Nikola Jokic,Denver Nuggets,C,6-11,284.0,Mega Basket,47607350.0
261,LeBron James,Los Angeles Lakers,F,6-9,250.0,St. Vincent-St. Mary HS (OH),47607350.0
...,...,...,...,...,...,...,...
547,Gary Trent Jr.,Toronto Raptors,G-F,6-5,209.0,Duke,
578,Taj Gibson,Washington Wizards,F,6-9,232.0,Southern California,
584,Kendrick Nunn,Washington Wizards,G,6-3,190.0,Oakland,
589,Tristan Vukcevic,Washington Wizards,F,6-10,220.0,Real Madrid,


## Sort a DataFrame with the sort_values Method II
- To sort by multiple columns, pass the `by` parameter a list of column names. Pandas will sort in the specified column order (first to last).
- Pass the `ascending` parameter a Boolean to sort all columns in a consistent order (all ascending or all descending).
- Pass `ascending` a list of Booleans to customize the sort order *per* column. The `ascending` list must have the same length as the `by` list.
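
A per-column sort order is easiest to see on a handful of rows. The frame below is a made-up example (hypothetical names, positions, and salaries), sorted by `Position` ascending and then by `Salary` descending within each position.

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({
    "Position": ["G", "F", "G", "F"],
    "Name": ["Ann", "Ben", "Cal", "Dot"],
    "Salary": [10, 40, 30, 20],
})

# Position A to Z first; ties in Position are broken by Salary, high to low.
result = df.sort_values(by=["Position", "Salary"], ascending=[True, False])

print(result["Name"].tolist())  # ['Ben', 'Dot', 'Cal', 'Ann']
print(result.index.tolist())    # [1, 3, 2, 0]
```

The second column only matters where the first column has ties, which is why both `F` rows appear before either `G` row regardless of salary.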

In [175]:
nba = pd.read_csv("nba.csv")
nba.tail()

Unnamed: 0,Name,Team,Position,Height,Weight,College,Salary
587,Ryan Rollins,Washington Wizards,G,6-3,180.0,Toledo,1719864.0
588,Landry Shamet,Washington Wizards,G,6-4,190.0,Wichita State,10250000.0
589,Tristan Vukcevic,Washington Wizards,F,6-10,220.0,Real Madrid,
590,Delon Wright,Washington Wizards,G,6-5,185.0,Utah,8195122.0
591,,,,,,,


In [183]:
nba.sort_values(by=["Position", "Salary"], ascending=[False, False])

Unnamed: 0,Name,Team,Position,Height,Weight,College,Salary
46,Ben Simmons,Brooklyn Nets,G-F,6-10,240.0,Louisiana State,37893408.0
21,Jaylen Brown,Boston Celtics,G-F,6-6,223.0,California,31830357.0
78,DeMar DeRozan,Chicago Bulls,G-F,6-6,220.0,Southern California,28600000.0
192,Dillon Brooks,Houston Rockets,G-F,6-6,225.0,Oregon,22627671.0
212,Bruce Brown,Indiana Pacers,G-F,6-4,202.0,Miami,22000000.0
...,...,...,...,...,...,...,...
299,Caleb Daniels,Miami Heat,,,,,1119563.0
541,Kevin Obanor,Toronto Raptors,,6-8,235.0,,1119563.0
564,Nick Ongenda,Utah Jazz,,,,,1119563.0
15,Miles Norris,Atlanta Hawks,,,,,559782.0


## Sort a DataFrame by its Index
- The `sort_index` method sorts the **DataFrame** by its index positions/labels.
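
One common use, sketched here on a hypothetical three-row frame, is restoring the original row order after a `sort_values` call has shuffled the index labels.

```python
import pandas as pd

# Hypothetical data; the default index is 0, 1, 2.
df = pd.DataFrame({"Name": ["Cal", "Ann", "Ben"]})

shuffled = df.sort_values("Name")  # rows reordered, index labels travel with them
restored = shuffled.sort_index()   # back to ascending index order

print(shuffled.index.tolist())     # [1, 2, 0]
print(restored.index.tolist())     # [0, 1, 2]
print(restored["Name"].tolist())   # ['Cal', 'Ann', 'Ben']
```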

In [185]:
nba = pd.read_csv("nba.csv")
nba = nba.sort_values(["Team", "Name"])
nba

Unnamed: 0,Name,Team,Position,Height,Weight,College,Salary
6,AJ Griffin,Atlanta Hawks,F,6-6,220.0,Duke,3712920.0
1,Bogdan Bogdanovic,Atlanta Hawks,G,6-5,225.0,Fenerbahce,18700000.0
4,Bruno Fernando,Atlanta Hawks,F-C,6-10,240.0,Maryland,2581522.0
3,Clint Capela,Atlanta Hawks,C,6-10,256.0,Elan Chalon,20616000.0
8,De'Andre Hunter,Atlanta Hawks,F-G,6-8,221.0,Virginia,20089286.0
...,...,...,...,...,...,...,...
578,Taj Gibson,Washington Wizards,F,6-9,232.0,Southern California,
589,Tristan Vukcevic,Washington Wizards,F,6-10,220.0,Real Madrid,
580,Tyus Jones,Washington Wizards,G,6-2,196.0,Duke,14000000.0
573,Xavier Cooks,Washington Wizards,F,6-8,183.0,Winthrop,1719864.0


In [187]:
nba.sort_index(ascending=False)

Unnamed: 0,Name,Team,Position,Height,Weight,College,Salary
591,,,,,,,
590,Delon Wright,Washington Wizards,G,6-5,185.0,Utah,8195122.0
589,Tristan Vukcevic,Washington Wizards,F,6-10,220.0,Real Madrid,
588,Landry Shamet,Washington Wizards,G,6-4,190.0,Wichita State,10250000.0
587,Ryan Rollins,Washington Wizards,G,6-3,180.0,Toledo,1719864.0
...,...,...,...,...,...,...,...
4,Bruno Fernando,Atlanta Hawks,F-C,6-10,240.0,Maryland,2581522.0
3,Clint Capela,Atlanta Hawks,C,6-10,256.0,Elan Chalon,20616000.0
2,Kobe Bufkin,Atlanta Hawks,G,6-5,195.0,Michigan,4094244.0
1,Bogdan Bogdanovic,Atlanta Hawks,G,6-5,225.0,Fenerbahce,18700000.0


## Rank Values with the rank Method
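- The `rank` method assigns a numeric ranking to each **Series** value. By default the smallest value receives rank 1; pass `ascending=False` to rank the largest value first.
- Pandas assigns equal values the same rank. With the default `method="average"`, that rank is the mean of the positions the tied values occupy, which leaves a "gap" in the ranks around the tie.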
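
How ties are ranked depends on the `method` parameter. The sketch below uses made-up salary figures with a three-way tie and compares the default (`"average"`) with two alternatives, `"min"` and `"dense"`.

```python
import pandas as pd

# Hypothetical salaries; three values are tied.
salaries = pd.Series([500, 400, 400, 400, 100])

# The tied values occupy positions 2, 3 and 4 in descending order.
average = salaries.rank(ascending=False)                  # default method="average"
minimum = salaries.rank(ascending=False, method="min")    # lowest tied position
dense = salaries.rank(ascending=False, method="dense")    # no gaps after a tie

print(average.tolist())  # [1.0, 3.0, 3.0, 3.0, 5.0]
print(minimum.tolist())  # [1.0, 2.0, 2.0, 2.0, 5.0]
print(dense.tolist())    # [1.0, 2.0, 2.0, 2.0, 3.0]
```

`rank` returns floats because an average rank can be fractional (a two-way tie for positions 2 and 3 ranks as 2.5). Converting with `astype(int)` truncates such values.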

In [191]:
nba = pd.read_csv("nba.csv").dropna(how="all")
nba.tail()

Unnamed: 0,Name,Team,Position,Height,Weight,College,Salary
586,Jordan Poole,Washington Wizards,G,6-4,194.0,Michigan,27955357.0
587,Ryan Rollins,Washington Wizards,G,6-3,180.0,Toledo,1719864.0
588,Landry Shamet,Washington Wizards,G,6-4,190.0,Wichita State,10250000.0
589,Tristan Vukcevic,Washington Wizards,F,6-10,220.0,Real Madrid,
590,Delon Wright,Washington Wizards,G,6-5,185.0,Utah,8195122.0


In [9]:
nba["Salary"] = nba["Salary"].fillna(0).astype(int)

In [10]:
nba["Salary"].rank(ascending=False).astype(int)

0      231
1       80
2      243
3       69
4      308
      ... 
587    394
588    140
589    540
590    163
Name: Salary, Length: 591, dtype: int32

In [197]:
nba["Salary Rank"] = nba["Salary"].rank(ascending=False).astype(int)

In [198]:
nba.sort_values("Salary", ascending=False)

Unnamed: 0,Name,Team,Position,Height,Weight,College,Salary,Salary Rank
175,Stephen Curry,Golden State Warriors,G,6-2,185.0,Davidson,51915615,1
461,Kevin Durant,Phoenix Suns,F,6-10,240.0,Texas,47649433,2
261,LeBron James,Los Angeles Lakers,F,6-9,250.0,St. Vincent-St. Mary HS (OH),47607350,4
145,Nikola Jokic,Denver Nuggets,C,6-11,284.0,Mega Basket,47607350,4
436,Joel Embiid,Philadelphia 76ers,C-F,7-0,280.0,Kansas,47607350,4
...,...,...,...,...,...,...,...,...
64,James Nnaji,Charlotte Hornets,F,6-11,250.0,FC Barcelona,0,540
132,Christian Wood,Dallas Mavericks,F,6-9,214.0,UNLV,0,540
126,Theo Pinson,Dallas Mavericks,G-F,6-7,212.0,North Carolina,0,540
125,Markieff Morris,Dallas Mavericks,F,6-9,245.0,Kansas,0,540
