# DataFrames I

In [1]:
import pandas as pd

## Methods and Attributes between Series and DataFrames
- A **DataFrame** is a 2-dimensional table consisting of rows and columns.
- Pandas uses a `NaN` designation for cells that have a missing value. It is short for "not a number". Most operations on `NaN` values will produce `NaN` values.
- Like with a **Series**, Pandas assigns an index position/label to each **DataFrame** row.
- The **DataFrame** and **Series** have common and exclusive methods/attributes.
- The `hasnans` attribute exists only on a **Series**. The `columns` attribute exists only on a **DataFrame**.
- Some methods/attributes will return different types of data.
- The `info` method prints a summary of the pandas object (index range, non-null counts, column types, and memory usage).
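As a minimal sketch of the points above, the cell below uses a small made-up table (the `players` DataFrame and its values are illustrative assumptions, not rows from `nba.csv`) to show a Series-only attribute, a DataFrame-only attribute, and how arithmetic treats `NaN`.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, not taken from nba.csv
players = pd.DataFrame({
    "Name": ["Ana", "Ben", "Cal"],
    "Weight": [215.0, np.nan, 195.0],
})

weights = players["Weight"]             # a single column is a Series

has_missing = weights.hasnans           # attribute available on a Series
column_labels = list(players.columns)   # attribute available on a DataFrame
doubled = weights * 2                   # arithmetic on NaN produces NaN

print(has_missing)
print(column_labels)
print(doubled)
```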

In [3]:
nba= pd.read_csv('nba.csv')
nba

,Name,Team,Position,Height,Weight,College,Salary
0,Saddiq Bey,Atlanta Hawks,F,6-7,215.0,Villanova,4556983.0
1,Bogdan Bogdanovic,Atlanta Hawks,G,6-5,225.0,Fenerbahce,18700000.0
2,Kobe Bufkin,Atlanta Hawks,G,6-5,195.0,Michigan,4094244.0
3,Clint Capela,Atlanta Hawks,C,6-10,256.0,Elan Chalon,20616000.0
4,Bruno Fernando,Atlanta Hawks,F-C,6-10,240.0,Maryland,2581522.0
...,...,...,...,...,...,...,...
587,Ryan Rollins,Washington Wizards,G,6-3,180.0,Toledo,1719864.0
588,Landry Shamet,Washington Wizards,G,6-4,190.0,Wichita State,10250000.0
589,Tristan Vukcevic,Washington Wizards,F,6-10,220.0,Real Madrid,
590,Delon Wright,Washington Wizards,G,6-5,185.0,Utah,8195122.0


In [7]:
nba.head(10)
nba.tail(4)

,Name,Team,Position,Height,Weight,College,Salary
588,Landry Shamet,Washington Wizards,G,6-4,190.0,Wichita State,10250000.0
589,Tristan Vukcevic,Washington Wizards,F,6-10,220.0,Real Madrid,
590,Delon Wright,Washington Wizards,G,6-5,185.0,Utah,8195122.0
591,,,,,,,


In [8]:
nba.index

RangeIndex(start=0, stop=592, step=1)

In [9]:
nba.values

array([['Saddiq Bey', 'Atlanta Hawks', 'F', ..., 215.0, 'Villanova',
        4556983.0],
       ['Bogdan Bogdanovic', 'Atlanta Hawks', 'G', ..., 225.0,
        'Fenerbahce', 18700000.0],
       ['Kobe Bufkin', 'Atlanta Hawks', 'G', ..., 195.0, 'Michigan',
        4094244.0],
       ...,
       ['Tristan Vukcevic', 'Washington Wizards', 'F', ..., 220.0,
        'Real Madrid', nan],
       ['Delon Wright', 'Washington Wizards', 'G', ..., 185.0, 'Utah',
        8195122.0],
       [nan, nan, nan, ..., nan, nan, nan]], dtype=object)

In [10]:
nba.shape # returns a tuple of (number of rows, number of columns)

(592, 7)

In [14]:
nba.dtypes

Name         object
Team         object
Position     object
Height       object
Weight      float64
College      object
Salary      float64
dtype: object

In [17]:
nba.columns

Index(['Name', 'Team', 'Position', 'Height', 'Weight', 'College', 'Salary'], dtype='object')

In [18]:
nba.axes

[RangeIndex(start=0, stop=592, step=1),
 Index(['Name', 'Team', 'Position', 'Height', 'Weight', 'College', 'Salary'], dtype='object')]

In [19]:
nba.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 592 entries, 0 to 591
Data columns (total 7 columns):
 #   Column    Non-Null Count  Dtype  
---  ------    --------------  -----  
 0   Name      591 non-null    object 
 1   Team      591 non-null    object 
 2   Position  584 non-null    object 
 3   Height    585 non-null    object 
 4   Weight    584 non-null    float64
 5   College   578 non-null    object 
 6   Salary    488 non-null    float64
dtypes: float64(2), object(5)
memory usage: 32.5+ KB


## Differences between Shared Methods
- The `sum` method adds a **Series's** values.
- On a **DataFrame**, the `sum` method defaults to adding down the index (the row values), which produces one total per column.
- The `axis` parameter customizes the direction that we add across. Pass `"columns"` or `1` to add "across" the columns.

In [24]:
revenue= pd.read_csv('revenue.csv', index_col=['Date'])
revenue

Date,New York,Los Angeles,Miami
1/1/26,985,122,499
1/2/26,738,788,534
1/3/26,14,20,933
1/4/26,730,904,885
1/5/26,114,71,253
1/6/26,936,502,497
1/7/26,123,996,115
1/8/26,935,492,886
1/9/26,846,954,823
1/10/26,54,285,216


In [32]:
revenue.sum()
revenue.sum(axis='index')

New York       5475
Los Angeles    5134
Miami          5641
dtype: int64

In [33]:
revenue.sum(axis='columns').sum() # the first sum returns a Series of row totals, so the second sum adds them into a single grand total

16250

## Select One Column from a DataFrame
- We can use attribute syntax (`df.column_name`) to select a column from a **DataFrame**. The syntax will not work if the column name has spaces.
- We can also use square bracket syntax (`df["column name"]`) which will work for any column name.
- Pandas extracts a column from a **DataFrame** as a **Series**.
- In pandas versions without Copy-on-Write, the extracted **Series** can be a view of the **DataFrame**, so changes to the **Series** may affect the **DataFrame**.
- Those versions display a warning if you mutate the **Series**. Use the `copy` method to create an independent duplicate.

In [35]:
nba= pd.read_csv('nba.csv')
nba.head()

,Name,Team,Position,Height,Weight,College,Salary
0,Saddiq Bey,Atlanta Hawks,F,6-7,215.0,Villanova,4556983.0
1,Bogdan Bogdanovic,Atlanta Hawks,G,6-5,225.0,Fenerbahce,18700000.0
2,Kobe Bufkin,Atlanta Hawks,G,6-5,195.0,Michigan,4094244.0
3,Clint Capela,Atlanta Hawks,C,6-10,256.0,Elan Chalon,20616000.0
4,Bruno Fernando,Atlanta Hawks,F-C,6-10,240.0,Maryland,2581522.0


In [None]:
nba.Team
nba.Salary
nba.Name

# attribute syntax only works when the column name is a valid Python identifier (no spaces or special characters)

0             Saddiq Bey
1      Bogdan Bogdanovic
2            Kobe Bufkin
3           Clint Capela
4         Bruno Fernando
             ...        
587         Ryan Rollins
588        Landry Shamet
589     Tristan Vukcevic
590         Delon Wright
591                  NaN
Name: Name, Length: 592, dtype: object

In [43]:
nba['Team']
nba['Salary']
nba['Position']

0        F
1        G
2        G
3        C
4      F-C
      ... 
587      G
588      G
589      F
590      G
591    NaN
Name: Position, Length: 592, dtype: object

In [45]:
names= nba['Name'].copy()
names

0             Saddiq Bey
1      Bogdan Bogdanovic
2            Kobe Bufkin
3           Clint Capela
4         Bruno Fernando
             ...        
587         Ryan Rollins
588        Landry Shamet
589     Tristan Vukcevic
590         Delon Wright
591                  NaN
Name: Name, Length: 592, dtype: object

In [46]:
names.iloc[0]= 'teste'

In [47]:
names.head()

0                teste
1    Bogdan Bogdanovic
2          Kobe Bufkin
3         Clint Capela
4       Bruno Fernando
Name: Name, dtype: object

In [48]:
nba.head()

,Name,Team,Position,Height,Weight,College,Salary
0,Saddiq Bey,Atlanta Hawks,F,6-7,215.0,Villanova,4556983.0
1,Bogdan Bogdanovic,Atlanta Hawks,G,6-5,225.0,Fenerbahce,18700000.0
2,Kobe Bufkin,Atlanta Hawks,G,6-5,195.0,Michigan,4094244.0
3,Clint Capela,Atlanta Hawks,C,6-10,256.0,Elan Chalon,20616000.0
4,Bruno Fernando,Atlanta Hawks,F-C,6-10,240.0,Maryland,2581522.0


## Select Multiple Columns from a DataFrame
- Use square brackets with a list of names to extract multiple **DataFrame** columns.
- Pandas stores the result in a new **DataFrame** (a copy).

In [49]:
nba[['Team', 'Name']]

,Team,Name
0,Atlanta Hawks,Saddiq Bey
1,Atlanta Hawks,Bogdan Bogdanovic
2,Atlanta Hawks,Kobe Bufkin
3,Atlanta Hawks,Clint Capela
4,Atlanta Hawks,Bruno Fernando
...,...,...
587,Washington Wizards,Ryan Rollins
588,Washington Wizards,Landry Shamet
589,Washington Wizards,Tristan Vukcevic
590,Washington Wizards,Delon Wright


In [53]:
for column in nba.columns:
    print(column)

Name
Team
Position
Height
Weight
College
Salary


In [54]:
columns_select= [col for col in nba.columns if col not in ['Height', 'Weight']]
columns_select

['Name', 'Team', 'Position', 'College', 'Salary']

In [55]:
nba[columns_select]

,Name,Team,Position,College,Salary
0,Saddiq Bey,Atlanta Hawks,F,Villanova,4556983.0
1,Bogdan Bogdanovic,Atlanta Hawks,G,Fenerbahce,18700000.0
2,Kobe Bufkin,Atlanta Hawks,G,Michigan,4094244.0
3,Clint Capela,Atlanta Hawks,C,Elan Chalon,20616000.0
4,Bruno Fernando,Atlanta Hawks,F-C,Maryland,2581522.0
...,...,...,...,...,...
587,Ryan Rollins,Washington Wizards,G,Toledo,1719864.0
588,Landry Shamet,Washington Wizards,G,Wichita State,10250000.0
589,Tristan Vukcevic,Washington Wizards,F,Real Madrid,
590,Delon Wright,Washington Wizards,G,Utah,8195122.0


## Add New Column to DataFrame
- Use square bracket extraction syntax with an equal sign to add a new **Series** to a **DataFrame**.
- The `insert` method allows us to insert a new column at a specific column position.
- On the right-hand side, we can reference an existing **DataFrame** column and perform a broadcasting operation on it to create the new **Series**.
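The cells in this section assign a constant and the result of `apply`; the broadcasting case can be sketched as follows. The `team` DataFrame and the new column names are illustrative assumptions, not part of `nba.csv`.

```python
import pandas as pd

# Hypothetical toy data standing in for nba.csv
team = pd.DataFrame({
    "Name": ["Ana", "Ben"],
    "Salary": [4_000_000.0, 18_000_000.0],
})

# Broadcasting: the division is applied to every value in the Salary column
team["Salary in Millions"] = team["Salary"] / 1_000_000

# insert places a broadcast result at a chosen column position (0-based)
team.insert(loc=1, column="Salary Doubled", value=team["Salary"] * 2)

print(team)
```

Square-bracket assignment always appends the new column at the end, while `insert` controls where it lands.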

In [63]:
nba= pd.read_csv('nba.csv')
nba

,Name,Team,Position,Height,Weight,College,Salary
0,Saddiq Bey,Atlanta Hawks,F,6-7,215.0,Villanova,4556983.0
1,Bogdan Bogdanovic,Atlanta Hawks,G,6-5,225.0,Fenerbahce,18700000.0
2,Kobe Bufkin,Atlanta Hawks,G,6-5,195.0,Michigan,4094244.0
3,Clint Capela,Atlanta Hawks,C,6-10,256.0,Elan Chalon,20616000.0
4,Bruno Fernando,Atlanta Hawks,F-C,6-10,240.0,Maryland,2581522.0
...,...,...,...,...,...,...,...
587,Ryan Rollins,Washington Wizards,G,6-3,180.0,Toledo,1719864.0
588,Landry Shamet,Washington Wizards,G,6-4,190.0,Wichita State,10250000.0
589,Tristan Vukcevic,Washington Wizards,F,6-10,220.0,Real Madrid,
590,Delon Wright,Washington Wizards,G,6-5,185.0,Utah,8195122.0


In [59]:
nba['Sport'] = 'Basketball'

In [60]:
nba

,Name,Team,Position,Height,Weight,College,Salary,Sport
0,Saddiq Bey,Atlanta Hawks,F,6-7,215.0,Villanova,4556983.0,Basketball
1,Bogdan Bogdanovic,Atlanta Hawks,G,6-5,225.0,Fenerbahce,18700000.0,Basketball
2,Kobe Bufkin,Atlanta Hawks,G,6-5,195.0,Michigan,4094244.0,Basketball
3,Clint Capela,Atlanta Hawks,C,6-10,256.0,Elan Chalon,20616000.0,Basketball
4,Bruno Fernando,Atlanta Hawks,F-C,6-10,240.0,Maryland,2581522.0,Basketball
...,...,...,...,...,...,...,...,...
587,Ryan Rollins,Washington Wizards,G,6-3,180.0,Toledo,1719864.0,Basketball
588,Landry Shamet,Washington Wizards,G,6-4,190.0,Wichita State,10250000.0,Basketball
589,Tristan Vukcevic,Washington Wizards,F,6-10,220.0,Real Madrid,,Basketball
590,Delon Wright,Washington Wizards,G,6-5,185.0,Utah,8195122.0,Basketball


In [61]:
nba.insert(loc= 3, column= 'NewColumnTest', value=None)
nba

,Name,Team,Position,NewColumnTest,Height,Weight,College,Salary,Sport
0,Saddiq Bey,Atlanta Hawks,F,,6-7,215.0,Villanova,4556983.0,Basketball
1,Bogdan Bogdanovic,Atlanta Hawks,G,,6-5,225.0,Fenerbahce,18700000.0,Basketball
2,Kobe Bufkin,Atlanta Hawks,G,,6-5,195.0,Michigan,4094244.0,Basketball
3,Clint Capela,Atlanta Hawks,C,,6-10,256.0,Elan Chalon,20616000.0,Basketball
4,Bruno Fernando,Atlanta Hawks,F-C,,6-10,240.0,Maryland,2581522.0,Basketball
...,...,...,...,...,...,...,...,...,...
587,Ryan Rollins,Washington Wizards,G,,6-3,180.0,Toledo,1719864.0,Basketball
588,Landry Shamet,Washington Wizards,G,,6-4,190.0,Wichita State,10250000.0,Basketball
589,Tristan Vukcevic,Washington Wizards,F,,6-10,220.0,Real Madrid,,Basketball
590,Delon Wright,Washington Wizards,G,,6-5,185.0,Utah,8195122.0,Basketball


In [77]:
def weight_rate(x):
    # map a weight to a label; NaN and weights of 100 or less match no range and return None
    if 100 < x <= 150:
        return 'Very Light'
    elif 150 < x <= 200:
        return 'Light'
    elif 200 < x <= 250:
        return 'Normal Weight'
    elif 250 < x <= 300:
        return 'Heavy'
    elif x > 300:
        return 'Very Heavy'

# insert the column only once, so re-running the cell does not raise an error
if 'Weight Rate' not in nba.columns:
    nba.insert(loc= 5, column= 'Weight Rate', value= nba['Weight'].apply(weight_rate))
nba

,Name,Team,Position,Height,Weight,Weight Rate,College,Salary
0,Saddiq Bey,Atlanta Hawks,F,6-7,215.0,Normal Weight,Villanova,4556983.0
1,Bogdan Bogdanovic,Atlanta Hawks,G,6-5,225.0,Normal Weight,Fenerbahce,18700000.0
2,Kobe Bufkin,Atlanta Hawks,G,6-5,195.0,Light,Michigan,4094244.0
3,Clint Capela,Atlanta Hawks,C,6-10,256.0,Heavy,Elan Chalon,20616000.0
4,Bruno Fernando,Atlanta Hawks,F-C,6-10,240.0,Normal Weight,Maryland,2581522.0
...,...,...,...,...,...,...,...,...
587,Ryan Rollins,Washington Wizards,G,6-3,180.0,Light,Toledo,1719864.0
588,Landry Shamet,Washington Wizards,G,6-4,190.0,Light,Wichita State,10250000.0
589,Tristan Vukcevic,Washington Wizards,F,6-10,220.0,Normal Weight,Real Madrid,
590,Delon Wright,Washington Wizards,G,6-5,185.0,Light,Utah,8195122.0


In [79]:
nba.drop(columns=['Weight Rate'], inplace= True) 

# inplace=True drops the column from the existing DataFrame, so no reassignment is needed; drop can also remove rows by index label


In [80]:
nba

,Name,Team,Position,Height,Weight,College,Salary
0,Saddiq Bey,Atlanta Hawks,F,6-7,215.0,Villanova,4556983.0
1,Bogdan Bogdanovic,Atlanta Hawks,G,6-5,225.0,Fenerbahce,18700000.0
2,Kobe Bufkin,Atlanta Hawks,G,6-5,195.0,Michigan,4094244.0
3,Clint Capela,Atlanta Hawks,C,6-10,256.0,Elan Chalon,20616000.0
4,Bruno Fernando,Atlanta Hawks,F-C,6-10,240.0,Maryland,2581522.0
...,...,...,...,...,...,...,...
587,Ryan Rollins,Washington Wizards,G,6-3,180.0,Toledo,1719864.0
588,Landry Shamet,Washington Wizards,G,6-4,190.0,Wichita State,10250000.0
589,Tristan Vukcevic,Washington Wizards,F,6-10,220.0,Real Madrid,
590,Delon Wright,Washington Wizards,G,6-5,185.0,Utah,8195122.0


## A Review of the value_counts Method
- The `value_counts` method counts the number of times that each unique value occurs in a **Series**.

In [2]:
nba= pd.read_csv('nba.csv')
nba.head()

,Name,Team,Position,Height,Weight,College,Salary
0,Saddiq Bey,Atlanta Hawks,F,6-7,215.0,Villanova,4556983.0
1,Bogdan Bogdanovic,Atlanta Hawks,G,6-5,225.0,Fenerbahce,18700000.0
2,Kobe Bufkin,Atlanta Hawks,G,6-5,195.0,Michigan,4094244.0
3,Clint Capela,Atlanta Hawks,C,6-10,256.0,Elan Chalon,20616000.0
4,Bruno Fernando,Atlanta Hawks,F-C,6-10,240.0,Maryland,2581522.0


In [87]:
nba['Team'].value_counts()
nba['Position'].value_counts()
nba.College.value_counts(normalize= True) # returns proportions; multiply by 100 to get percentages

College
Kentucky                    0.050173
Duke                        0.043253
UCLA                        0.025952
Southern California         0.020761
Gonzaga                     0.020761
                              ...   
Radford                     0.001730
Loyola-Maryland             0.001730
Austin Peay                 0.001730
California-Santa Barbara    0.001730
Toledo                      0.001730
Name: proportion, Length: 182, dtype: float64

## Drop Rows with Missing Values
- Pandas uses a `NaN` designation for cells that have a missing value.
- The `dropna` method deletes rows with missing values. Its default behavior is to remove a row if it has *any* missing values.
- Pass the `how` parameter an argument of "all" to delete rows where all the values are `NaN`.
- The `subset` parameter limits the columns that pandas checks when deciding which rows to drop.

In [None]:
nba.dropna()
nba.dropna(how= 'any') # the default: drops a row if it has at least one missing value
nba.dropna(how= 'all') # drops a row only if all of its values are missing

,Name,Team,Position,Height,Weight,College,Salary
0,Saddiq Bey,Atlanta Hawks,F,6-7,215.0,Villanova,4556983.0
1,Bogdan Bogdanovic,Atlanta Hawks,G,6-5,225.0,Fenerbahce,18700000.0
2,Kobe Bufkin,Atlanta Hawks,G,6-5,195.0,Michigan,4094244.0
3,Clint Capela,Atlanta Hawks,C,6-10,256.0,Elan Chalon,20616000.0
4,Bruno Fernando,Atlanta Hawks,F-C,6-10,240.0,Maryland,2581522.0
...,...,...,...,...,...,...,...
585,Eugene Omoruyi,Washington Wizards,F,6-6,235.0,Oregon,559782.0
586,Jordan Poole,Washington Wizards,G,6-4,194.0,Michigan,27955357.0
587,Ryan Rollins,Washington Wizards,G,6-3,180.0,Toledo,1719864.0
588,Landry Shamet,Washington Wizards,G,6-4,190.0,Wichita State,10250000.0


In [90]:
nba.dropna(subset=['College']) # returns a new DataFrame without the rows where College is NaN

,Name,Team,Position,Height,Weight,College,Salary
0,Saddiq Bey,Atlanta Hawks,F,6-7,215.0,Villanova,4556983.0
1,Bogdan Bogdanovic,Atlanta Hawks,G,6-5,225.0,Fenerbahce,18700000.0
2,Kobe Bufkin,Atlanta Hawks,G,6-5,195.0,Michigan,4094244.0
3,Clint Capela,Atlanta Hawks,C,6-10,256.0,Elan Chalon,20616000.0
4,Bruno Fernando,Atlanta Hawks,F-C,6-10,240.0,Maryland,2581522.0
...,...,...,...,...,...,...,...
586,Jordan Poole,Washington Wizards,G,6-4,194.0,Michigan,27955357.0
587,Ryan Rollins,Washington Wizards,G,6-3,180.0,Toledo,1719864.0
588,Landry Shamet,Washington Wizards,G,6-4,190.0,Wichita State,10250000.0
589,Tristan Vukcevic,Washington Wizards,F,6-10,220.0,Real Madrid,


In [92]:
nba.dropna(subset=['College', 'Salary'])

,Name,Team,Position,Height,Weight,College,Salary
0,Saddiq Bey,Atlanta Hawks,F,6-7,215.0,Villanova,4556983.0
1,Bogdan Bogdanovic,Atlanta Hawks,G,6-5,225.0,Fenerbahce,18700000.0
2,Kobe Bufkin,Atlanta Hawks,G,6-5,195.0,Michigan,4094244.0
3,Clint Capela,Atlanta Hawks,C,6-10,256.0,Elan Chalon,20616000.0
4,Bruno Fernando,Atlanta Hawks,F-C,6-10,240.0,Maryland,2581522.0
...,...,...,...,...,...,...,...
585,Eugene Omoruyi,Washington Wizards,F,6-6,235.0,Oregon,559782.0
586,Jordan Poole,Washington Wizards,G,6-4,194.0,Michigan,27955357.0
587,Ryan Rollins,Washington Wizards,G,6-3,180.0,Toledo,1719864.0
588,Landry Shamet,Washington Wizards,G,6-4,190.0,Wichita State,10250000.0


In [14]:
# count the NaN values in each column to help decide whether to drop them
nba['Name'].isna().sum() # number of NaN values in the Name column

# isna() returns a Boolean for each value: True if it is missing, False otherwise

# sum() adds the values, and pandas counts True as 1 and False as 0

# combining them therefore counts the True results returned by isna()


1

In [17]:
# using the previous approach, a dict comprehension shows how many NaN values each column has

null_rows_per_column= {
    col: nba[col].isna().sum() for col in nba.columns
}
null_rows_per_column

{'Name': 1,
 'Team': 1,
 'Position': 8,
 'Height': 7,
 'Weight': 8,
 'College': 14,
 'Salary': 104}

In [18]:
# a simpler way to get the same result without building a dict is to call the same methods on the whole DataFrame instead of on one column at a time

nba.isna().sum()

# this gives the same counts, now as a Series instead of a dict

Name          1
Team          1
Position      8
Height        7
Weight        8
College      14
Salary      104
dtype: int64

## Fill in Missing Values with the fillna Method
- The `fillna` method replaces missing `NaN` values with its argument.
- The `fillna` method is available on both **DataFrames** and **Series**.
- In pandas versions without Copy-on-Write, an extracted **Series** can be a view on the original **DataFrame**, but the `fillna` method always returns a new object.
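A minimal sketch of these points, using a made-up `team` DataFrame (an assumption for illustration, not `nba.csv`): `fillna` leaves the original untouched unless the result is assigned back.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data
team = pd.DataFrame({
    "College": ["Duke", np.nan, "Utah"],
    "Salary": [1000.0, np.nan, 3000.0],
})

# fillna returns a new Series; the Salary column in team keeps its NaN
filled_salary = team["Salary"].fillna(0)

# Assign the result back to the column to keep the replacement
team["College"] = team["College"].fillna("None")

# On a DataFrame, a dict maps column names to their fill values
filled_frame = team.fillna({"Salary": 0})

print(filled_salary)
print(team)
print(filled_frame)
```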

## The astype Method I
- The `astype` method converts a **Series's** values to a specified type.
- Pass in the specified type as either a string or the core Python data type.
- Pandas cannot convert `NaN` values to integer types, so we need to eliminate/replace them before we perform the conversion.
- The `dtypes` attribute returns a **Series** with the **DataFrame's** columns and their types.
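The conversion steps above can be sketched as follows. The `team` DataFrame is illustrative, and replacing `NaN` with `0` is only one possible choice made here for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data
team = pd.DataFrame({
    "Name": ["Ana", "Ben", "Cal"],
    "Weight": [215.0, np.nan, 195.0],
})

# Converting a column that still holds NaN to an integer type raises an error
try:
    team["Weight"].astype("int64")
    conversion_failed = False
except ValueError:
    conversion_failed = True

# Replace the NaN first, then convert; the type is passed as a string here,
# and the built-in int is accepted as well
team["Weight"] = team["Weight"].fillna(0).astype("int64")

print(conversion_failed)
print(team.dtypes)
```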

## The astype Method II
- The `category` type is ideal for columns with a limited number of unique values.
- The `nunique` method will return a **Series** with the number of unique values in each column.
- With categories, pandas does not create a separate value in memory for each "cell". Rather, the cells point to a single copy for each unique value.
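As a rough illustration, the sketch below builds a made-up table with heavily repeated values (an assumption for the example), counts the unique values per column, and compares the memory used by one column before and after converting it to `category`. The exact byte counts depend on the pandas version, so none are shown.

```python
import pandas as pd

# Hypothetical toy data with few unique values repeated many times
team = pd.DataFrame({
    "Team": ["Hawks", "Wizards"] * 500,
    "Position": ["G", "F", "C", "G"] * 250,
})

unique_counts = team.nunique()   # Series: number of unique values per column

before = team["Team"].memory_usage(deep=True)
team["Team"] = team["Team"].astype("category")
after = team["Team"].memory_usage(deep=True)

print(unique_counts)
print(before, after)
```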

## Sort a DataFrame with the sort_values Method I
- The `sort_values` method sorts a **DataFrame** by the values in one or more columns. The default sort is an ascending one (alphabetical for strings).
- The first parameter (`by`) expects the column(s) to sort by.
- If sorting by a single column, pass a string with its name.
- The `ascending` parameter customizes the sort order.
- The `na_position` parameter customizes where pandas places `NaN` values.
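A short sketch of the single-column case, on a made-up `team` DataFrame (illustrative data only):

```python
import numpy as np
import pandas as pd

# Hypothetical toy data
team = pd.DataFrame({
    "Name": ["Cal", "Ana", "Ben"],
    "Salary": [3000.0, np.nan, 1000.0],
})

by_name = team.sort_values("Name")                               # ascending, alphabetical
by_salary_desc = team.sort_values(by="Salary", ascending=False)  # NaN rows go last by default
nan_first = team.sort_values(by="Salary", na_position="first")   # NaN rows go first

print(by_name)
print(by_salary_desc)
print(nan_first)
```

Each call returns a new sorted **DataFrame** and leaves `team` in its original order.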

## Sort a DataFrame with the sort_values Method II
- To sort by multiple columns, pass the `by` parameter a list of column names. Pandas will sort in the specified column order (first to last).
- Pass the `ascending` parameter a Boolean to sort all columns in a consistent order (all ascending or all descending).
- Pass `ascending` a list to customize the sort order *per* column. The `ascending` list length must match the `by` list.
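The multi-column options can be sketched like this, again on made-up data (the `team` DataFrame is an assumption for illustration):

```python
import pandas as pd

# Hypothetical toy data
team = pd.DataFrame({
    "Team": ["Wizards", "Hawks", "Hawks", "Wizards"],
    "Name": ["Dee", "Ana", "Cal", "Ben"],
})

# Sort by Team first, then by Name within each team
both_ascending = team.sort_values(by=["Team", "Name"])

# A single Boolean applies to every column in by
both_descending = team.sort_values(by=["Team", "Name"], ascending=False)

# A list sets the order per column: Team ascending, Name descending
mixed = team.sort_values(by=["Team", "Name"], ascending=[True, False])

print(both_ascending)
print(both_descending)
print(mixed)
```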

## Sort a DataFrame by its Index
- The `sort_index` method sorts the **DataFrame** by its index positions/labels.
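One common use, sketched on made-up data, is restoring the original row order after sorting by values:

```python
import pandas as pd

# Hypothetical toy data
team = pd.DataFrame({
    "Name": ["Cal", "Ana", "Ben"],
    "Salary": [3000, 1000, 2000],
})

shuffled = team.sort_values("Name")                # index order becomes 1, 2, 0
restored = shuffled.sort_index()                   # back to 0, 1, 2
descending = shuffled.sort_index(ascending=False)  # 2, 1, 0

print(shuffled)
print(restored)
print(descending)
```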

## Rank Values with the rank Method
- The `rank` method assigns a numeric ranking to each **Series** value.
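- Pandas assigns the same rank to equal values. With the default `method="average"`, tied values share the average of the positions they occupy, and the next rank skips ahead, leaving a "gap" in the ranks.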
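A small sketch of how ties are ranked, using a made-up `salaries` Series (illustrative values only). The `method` argument is shown with three of its options to compare how each one handles ties.

```python
import pandas as pd

# Hypothetical toy data with a tie for the highest value
salaries = pd.Series([300, 100, 300, 200])

# Default method="average": the two tied values share rank 1.5, and rank 2 is skipped
average_ranks = salaries.rank(ascending=False)

# method="min": tied values both get rank 1, and the next value gets rank 3
min_ranks = salaries.rank(ascending=False, method="min")

# method="dense": tied values both get rank 1, and no rank is skipped
dense_ranks = salaries.rank(ascending=False, method="dense")

print(average_ranks)
print(min_ranks)
print(dense_ranks)
```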