<a href="https://www.kaggle.com/code/mlvprasad/pandas-part-1-of-5-indepth-notebook?scriptVersionId=146781166" target="_blank"><img align="left" alt="Kaggle" title="Open in Kaggle" src="https://kaggle.com/static/images/open-in-kaggle.svg"></a>

![mlv prasad](https://github.com/MlvPrasadOfficial/kaggle_notebooks/raw/main/mlvprasad.png)


![Pandas](https://github.com/MlvPrasadOfficial/kaggle_notebooks/raw/main/CANVA%20PANDAS/P11.png)


# Pandas 💫 Chapters index

1. [Introduction to Pandas](#introduction-to-pandas)
2. [Overview of Data Manipulation and Analysis](#overview-of-data-manipulation-and-analysis)
3. [Installing Pandas and Required Dependencies](#installing-pandas-and-required-dependencies)
4. [Importing the Pandas Library](#importing-the-pandas-library)
5. [Series and DataFrame: Introduction to Data Structures](#series-and-dataframe-introduction-to-data-structures)
6. [Creating a DataFrame from Scratch](#creating-a-dataframe-from-scratch)
7. [Loading Data into a DataFrame: CSV, Excel, and other formats](#loading-data-into-a-dataframe-csv-excel-and-other-formats)
8. [Inspecting DataFrames: head(), tail(), shape, info(), describe()](#inspecting-dataframes-head-tail-shape-info-describe)
9. [Indexing and Selecting Data: loc, iloc, column selection](#indexing-and-selecting-data-loc-iloc-column-selection)
10. [Conditional Selection and Filtering Data](#conditional-selection-and-filtering-data)
    
11. [Sorting Data in a DataFrame](#sorting-data-in-a-dataframe)
12. [Adding, Updating, and Deleting Columns](#adding-updating-and-deleting-columns)
13. [Handling Missing Data: isnull(), dropna(), fillna()](#handling-missing-data-isnull-dropna-fillna)
14. [Removing Duplicate Rows: duplicated(), drop_duplicates()](#removing-duplicate-rows-duplicated-drop_duplicates)
15. [Data Type Conversion and Handling](#data-type-conversion-and-handling)
16. [Dealing with Outliers](#dealing-with-outliers)
17. [Concatenating and Merging DataFrames](#concatenating-and-merging-dataframes)
18. [Reshaping Data: Pivot, Stack, Unstack](#reshaping-data-pivot-stack-unstack)
19. [Grouping and Aggregating Data: groupby(), aggregating functions](#grouping-and-aggregating-data-groupby-aggregating-functions)
    
20. [Applying Functions to Data: apply(), applymap()](#applying-functions-to-data-apply-applymap)



<h1 align="left"><font color='red'>1</font></h1>


# Chapter 1 : Introduction to Pandas

## Welcome to the magic world of Pandas, the bread and butter of data analysts and data scientists

<h1 align="left"><font color='red'>2</font></h1>


# Chapter 2 : Overview of Data Manipulation and Analysis
## 2.1 Introduction to Data Manipulation and Analysis
#### Data manipulation and analysis are essential steps in extracting valuable insights from raw data. These processes are crucial in the field of data science and analytics. Pandas is a powerful open-source library in Python that provides efficient tools for data manipulation and analysis.

```python
# Example: Importing the Pandas library
import pandas as pd

# Load data from a CSV file into a DataFrame
df = pd.read_csv('data.csv')
```

## 2.2 Key Features of Pandas
#### Pandas offers key features that make it a popular choice for data manipulation and analysis. The two primary data structures in Pandas are Series and DataFrame. A Series is a one-dimensional array-like object, and a DataFrame is a two-dimensional tabular data structure.


```python
# Example: Creating a Series
import pandas as pd

# Create a Series with data and custom index
s = pd.Series([10, 20, 30, 40], index=['A', 'B', 'C', 'D'])
s
```

```
A    10
B    20
C    30
D    40
dtype: int64
```


## 2.3 Data Manipulation with Pandas
#### Pandas provides various techniques for data manipulation. You can load data from different sources, clean and preprocess the data, perform transformations, and merge or reshape datasets.


```python
import pandas as pd

# Example: Loading data from a CSV file into a DataFrame
df = pd.read_csv('data.csv')

# Example: Cleaning data and handling missing values
# Both methods return a new DataFrame; df itself is unchanged
df_no_missing = df.dropna()   # remove rows with missing values
df_filled = df.fillna(0)      # fill missing values with a specific value

# Example: Data transformation and feature engineering
# Create a new column by performing calculations on existing columns
df['total'] = df['column1'] + df['column2']

# Example: Merging datasets
# Merge two DataFrames (df1 and df2 are placeholders) based on a common column
merged_df = pd.merge(df1, df2, on='common_column')

# Example: Reshaping data
# Pivot a DataFrame based on specific columns
pivoted_df = df.pivot(index='index_column', columns='column_to_pivot')
```

<h1 align="left"><font color='red'>3</font></h1>


# Chapter 3: Installing Pandas and Required Dependencies
## 3.1 Installation of Pandas
#### To install Pandas, you can use the pip package installer, which is the recommended method. Open your command prompt or terminal and run the following command:

```bash
pip install pandas
```


## 3.2 Installing Required Dependencies
#### Pandas depends on NumPy, which pip installs automatically along with Pandas. Matplotlib is optional and only needed for plotting. You can also install or upgrade either package explicitly using pip.


```bash
# Example: Installing NumPy
pip install numpy

# Example: Installing matplotlib
pip install matplotlib
```


## 3.3 Verifying the Installation
#### To ensure that Pandas is installed correctly, you can import it into your Python environment and check its version.

```python
# Example: Importing Pandas and checking the version
import pandas as pd

print(pd.__version__)
```

```
1.5.3
```


<h1 align="left"><font color='red'>4</font></h1>


# Chapter 4: Importing the Pandas Library
## 4.1 Importing Pandas in Python
#### You can import the Pandas library using the import statement. It is convention to alias Pandas as pd for brevity in your code.


```python
# Example: Importing Pandas
import pandas as pd
```



## 4.2 Exploring the Pandas Namespace
#### The Pandas library provides various functions, classes, and attributes that you can access using the Pandas namespace.

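```python
# Example: Exploring the Pandas namespace
import pandas as pd

# Accessing functions, e.g. pd.concat()
combined = pd.concat([pd.Series([1, 2]), pd.Series([3])])

# Accessing classes, e.g. pd.DataFrame
df = pd.DataFrame({'A': [1, 2]})

# Accessing attributes, e.g. pd.__version__
print(pd.__version__)
```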

## 4.3 Common Importing Techniques
#### In addition to importing the entire Pandas library, you can import specific submodules, functions, or classes from Pandas. Importing a single name still loads the whole package, so this does not reduce memory use; it mainly shortens your code, at the cost of possible naming conflicts with your own variables.

```python
# Example: Importing a specific submodule
import pandas.api.types

# Example: Importing specific functions or classes
from pandas import DataFrame, read_csv
from pandas.api.types import is_numeric_dtype

# Example: Importing Pandas alongside other commonly used libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
```

<h1 align="left"><font color='red'>5</font></h1>


# Chapter 5: Series and DataFrame: Introduction to Data Structures
## 5.1 Introduction to Series
#### A Series is a one-dimensional labeled array in Pandas that can hold any data type. In this subchapter, we will explore the basic properties and functionalities of Series, such as indexing, slicing, and mathematical operations.

```python
# Example: Creating a Series
import pandas as pd

# Create a Series from a list
s = pd.Series([10, 20, 30, 40])

# Example: Indexing and slicing a Series
# Accessing values by index
value = s[0]

# Slicing the Series
sliced_series = s[1:3]
sliced_series
```

```
1    20
2    30
dtype: int64
```


## 5.2 Introduction to DataFrame
#### A DataFrame is a two-dimensional tabular data structure in Pandas that consists of rows and columns. It is the primary data structure used for data analysis and manipulation. In this subchapter, we will explore the basic properties and functionalities of DataFrames, including indexing, slicing, and column operations.


```python
# Example: Creating a DataFrame
import pandas as pd

# Create a DataFrame from a dictionary
data = {'Name': ['John', 'Emily', 'Michael'],
        'Age': [25, 30, 35]}
df = pd.DataFrame(data)

# Example: Indexing and slicing a DataFrame
# Accessing a column by name
column = df['Name']

# Slicing the DataFrame
sliced_df = df.iloc[1:3]
sliced_df
```

```
      Name  Age
1    Emily   30
2  Michael   35
```


<h1 align="left"><font color='red'>6</font></h1>


# Chapter 6: Creating a DataFrame from Scratch
## 6.1 Creating a DataFrame from a Dictionary
#### In this subchapter, we will learn how to create a DataFrame from scratch using a Python dictionary. We will explore different ways of specifying the index and columns of the DataFrame, as well as handling missing values.



```python
# Example: Creating a DataFrame from a dictionary
import pandas as pd

# Create a DataFrame from a dictionary with default index and columns
data = {'Name': ['John', 'Emily', 'Michael'],
        'Age': [25, 30, 35]}
df = pd.DataFrame(data)

# Create a DataFrame with custom index and columns
# ('City' is not a key in data, so that column is filled with NaN)
df_custom = pd.DataFrame(data, index=['A', 'B', 'C'], columns=['Name', 'Age', 'City'])
df_custom
```

```
      Name  Age  City
A     John   25   NaN
B    Emily   30   NaN
C  Michael   35   NaN
```



## 6.2 Creating a DataFrame from a List of Lists or Numpy Arrays
#### In this subchapter, we will explore how to create a DataFrame from a list of lists or NumPy arrays. We will cover scenarios where the lists or arrays have different lengths, and how to handle missing or mismatched values.


```python
# Example: Creating a DataFrame from a list of lists
import pandas as pd
import numpy as np

# Create a DataFrame from a list of lists
data = [['John', 25], ['Emily', 30], ['Michael', 35]]
df = pd.DataFrame(data, columns=['Name', 'Age'])

# Example: Creating a DataFrame from NumPy arrays
data = np.array([[1, 2, 3], [4, 5, 6]])
df = pd.DataFrame(data, columns=['A', 'B', 'C'])
df
```

```
   A  B  C
0  1  2  3
1  4  5  6
```
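
#### The section above mentions rows of different lengths but does not show one. The following is a minimal sketch with made-up sample rows (`rows` and `df_ragged` are illustrative names, not from the original notebook). When a list of lists is ragged, Pandas pads the shorter rows with missing values; a dictionary whose lists differ in length raises a `ValueError` instead.

```python
import pandas as pd

# Hypothetical sample rows; the second row has no age
rows = [['John', 25], ['Emily'], ['Michael', 35]]
df_ragged = pd.DataFrame(rows, columns=['Name', 'Age'])

# The short row is padded with a missing value (NaN)
print(df_ragged)
print(df_ragged['Age'].isna().tolist())
```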


<h1 align="left"><font color='red'>7</font></h1>

# Chapter 7: Loading Data into a DataFrame: CSV, Excel, and other formats
## 7.1 Loading Data from a CSV File
#### In this subchapter, we will learn how to load data from a CSV file into a DataFrame using Pandas. We will cover different options for customizing the import process, such as specifying delimiter, encoding, and handling missing values.

```python
# Example: Loading data from a CSV file
import pandas as pd

# Load data from a CSV file into a DataFrame
df = pd.read_csv('data.csv')

# Example: Customizing the import process
import pandas as pd

# Load data with custom delimiter and missing values handling
df = pd.read_csv('data.csv', delimiter=';', na_values=['NA', 'N/A'])
df
```
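
#### To try these options without a file on disk, here is a small sketch that reads from an in-memory string. The contents of `csv_text` and the `'-'` missing-value marker are assumptions made for illustration.

```python
import io
import pandas as pd

# Hypothetical file contents; StringIO stands in for a file on disk
csv_text = "name;score\nJohn;10\nEmily;-\nMichael;30\n"

# sep=';' sets the delimiter, na_values marks '-' as missing
df_csv = pd.read_csv(io.StringIO(csv_text), sep=';', na_values=['-'])

print(df_csv)
print(df_csv['score'].isna().sum())  # number of missing scores
```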



## 7.2 Loading Data from an Excel File
#### In this subchapter, we will explore how to load data from an Excel file into a DataFrame using Pandas. We will cover different options for importing specific sheets, selecting columns, and skipping rows.


```python
# Example: Loading data from an Excel file
import pandas as pd

# Load data from an Excel file into a DataFrame
df = pd.read_excel('data.xlsx')

# Example: Importing specific sheets and selecting columns
import pandas as pd

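# Load data from specific sheets and select columns by header name
# (a list selects columns named 'A', 'B', ...; use usecols='A:B' for Excel column letters)
df_sheet1 = pd.read_excel('data.xlsx', sheet_name='Sheet1', usecols=['A', 'B'])
df_sheet2 = pd.read_excel('data.xlsx', sheet_name='Sheet2', usecols=['C', 'D'])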
```



## 7.3 Loading Data from Other Formats
#### In this subchapter, we will discuss other common formats for loading data into a DataFrame using Pandas. We will cover loading data from JSON, SQL databases, and other file formats.



```python
# Example: Loading data from a JSON file
import pandas as pd

# Load data from a JSON file into a DataFrame
df = pd.read_json('data.json')

# Example: Loading data from a SQL database
import pandas as pd
import sqlite3

# Connect to a SQLite database
conn = sqlite3.connect('database.db')

# Load data from a SQL query into a DataFrame
# ('my_table' is a placeholder; 'table' itself is a reserved word in SQL)
df = pd.read_sql_query('SELECT * FROM my_table', conn)
```
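
#### The SQL example can be run end to end with an in-memory SQLite database. This is a sketch under stated assumptions: the `people` table and its rows are invented for illustration.

```python
import sqlite3
import pandas as pd

# In-memory SQLite database with a hypothetical "people" table
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE people (name TEXT, age INTEGER)')
conn.executemany('INSERT INTO people VALUES (?, ?)',
                 [('John', 25), ('Emily', 30)])
conn.commit()

# params fills the ? placeholder safely instead of formatting the string
df_sql = pd.read_sql_query(
    'SELECT name, age FROM people WHERE age > ?', conn, params=(26,)
)
conn.close()
print(df_sql)
```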


<h1 align="left"><font color='red'>8</font></h1>

# Chapter 8: Inspecting DataFrames: head(), tail(), shape, info(), describe()
## 8.1 The head() and tail() Methods
#### In this subchapter, we will explore two useful methods for inspecting DataFrames: head() and tail(). These methods allow us to view a sample of the DataFrame's rows, providing a quick overview of the data.

```python
# Example: Using head() to display the first few rows of a DataFrame
import pandas as pd

# Display the first 5 rows of the DataFrame
df.head()

# Example: Using tail() to display the last few rows of a DataFrame
import pandas as pd

# Display the last 5 rows of the DataFrame
df.tail()
```


## 8.2 The shape Attribute
#### The shape attribute of a DataFrame provides information about its dimensions, i.e., the number of rows and columns.

```python
# Example: Using shape to get the dimensions of a DataFrame
import pandas as pd

# Get the dimensions of the DataFrame
rows, columns = df.shape
print(f"The DataFrame has {rows} rows and {columns} columns.")
```

## 8.3 The info() Method
#### The info() method provides a summary of the DataFrame's structure, including the number of non-null values and the data types of each column.

```python
# Example: Using info() to get the summary of a DataFrame
import pandas as pd

# Display the summary of the DataFrame
df.info()
```


## 8.4 The describe() Method
#### The describe() method generates descriptive statistics of the DataFrame's numerical columns, such as count, mean, standard deviation, minimum, and maximum values.

```python
# Example: Using describe() to get descriptive statistics of a DataFrame
import pandas as pd

# Generate descriptive statistics of the DataFrame
df.describe()
```
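
#### The snippets in this chapter assume an existing `df`. A self-contained sketch on a small made-up DataFrame (`df_people` is an illustrative name) shows the inspection tools together:

```python
import pandas as pd

# Hypothetical data with one missing age
df_people = pd.DataFrame({
    'Name': ['John', 'Emily', 'Michael', 'Sara'],
    'Age': [25, 30, 35, None],
})

print(df_people.head(2))     # first two rows
print(df_people.tail(1))     # last row
print(df_people.shape)       # (rows, columns)
df_people.info()             # dtypes and non-null counts, printed directly
print(df_people.describe())  # statistics for the numeric column only
```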

<h1 align="left"><font color='red'>9</font></h1>


# Chapter 9: Indexing and Selecting Data: loc, iloc, column selection
## 9.1 Indexing with loc
#### The loc indexer is used for label-based indexing in Pandas. It allows you to select specific rows and columns using their labels.

```python
# Example: Using loc to select rows and columns by labels
import pandas as pd

# Select a single row by label
row = df.loc['label']

# Select multiple rows by labels
rows = df.loc[['label1', 'label2']]

# Select rows and columns by labels
subset = df.loc[['label1', 'label2'], ['column1', 'column2']]
```

## 9.2 Indexing with iloc
#### The iloc indexer is used for integer-based indexing in Pandas. It allows you to select specific rows and columns using their integer positions.

```python
# Example: Using iloc to select rows and columns by integer positions
import pandas as pd

# Select a single row by integer position
row = df.iloc[0]

# Select multiple rows by integer positions
rows = df.iloc[[0, 1, 2]]

# Select rows and columns by integer positions
subset = df.iloc[[0, 1, 2], [0, 1, 2]]
```


## 9.3 Selecting Columns
#### You can select specific columns from a DataFrame using indexing notation or the loc and iloc indexers.

```python
# Example: Selecting columns from a DataFrame
import pandas as pd

# Select a single column by label
column = df['column_name']

# Select multiple columns by labels
columns = df[['column1', 'column2']]

# Select a single column by integer position using iloc
column = df.iloc[:, 0]

# Select multiple columns by integer positions using iloc
columns = df.iloc[:, [0, 1, 2]]
```
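
#### One difference between the two indexers is easy to miss: label slices with `loc` include the end label, while position slices with `iloc` exclude the end position. A minimal sketch, assuming a made-up DataFrame with string row labels:

```python
import pandas as pd

df_idx = pd.DataFrame(
    {'Name': ['John', 'Emily', 'Michael'], 'Age': [25, 30, 35]},
    index=['a', 'b', 'c'],
)

by_label = df_idx.loc['a':'b', 'Age']   # label slice includes 'b'
by_position = df_idx.iloc[0:2, 1]       # position slice excludes position 2

print(by_label.tolist())
print(by_position.tolist())
```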

<h1 align="left"><font color='red'>10</font></h1>

# Chapter 10: Conditional Selection and Filtering Data
## 10.1 Conditional Selection with Comparison Operators
#### In this subchapter, we will learn how to use comparison operators to perform conditional selection on DataFrames. This allows us to filter rows based on specific conditions.

```python
# Example: Conditional selection using comparison operators
import pandas as pd

# Select rows where a column meets a specific condition
subset = df[df['column'] > 10]

# Select rows based on multiple conditions
subset = df[(df['column1'] > 10) & (df['column2'] < 20)]
```

## 10.2 Conditional Selection with isin() Method
#### The isin() method allows you to perform conditional selection based on multiple values in a column.

```python
# Example: Conditional selection using isin() method
import pandas as pd

# Select rows where a column's value is in a list of values
subset = df[df['column'].isin(['value1', 'value2', 'value3'])]
```

## 10.3 Conditional Selection with String Methods
#### Pandas provides various string methods that can be used for conditional selection based on string values.

```python
# Example: Conditional selection using string methods
import pandas as pd

# Select rows where a column's value contains a specific substring
# (na=False treats missing values as non-matches instead of raising an error)
subset = df[df['column'].str.contains('substring', na=False)]

# Select rows where a column's value starts with a specific string
subset = df[df['column'].str.startswith('string', na=False)]
```
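
#### The filtering techniques from this chapter can be tried on a small made-up DataFrame. The column names and values below are assumptions for illustration. Note that each condition needs its own parentheses when combined with `&` or `|`.

```python
import pandas as pd

df_f = pd.DataFrame({
    'city': ['Paris', 'Pune', None, 'Lima'],
    'sales': [5, 15, 25, 35],
})

high = df_f[df_f['sales'] > 10]                                   # comparison
either = df_f[(df_f['sales'] < 10) | (df_f['city'] == 'Lima')]    # OR of two conditions
in_list = df_f[df_f['city'].isin(['Paris', 'Lima'])]              # membership test
starts_p = df_f[df_f['city'].str.startswith('P', na=False)]       # string method

print(high)
print(starts_p)
```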


<h1 align="left"><font color='red'>11</font></h1>

# Chapter 11: Sorting Data in a DataFrame
## 11.1 Sorting by Columns
#### In this subchapter, we will explore how to sort a DataFrame based on one or more columns. We can specify the sorting order as ascending or descending.

```python
# Example: Sorting a DataFrame by one column
import pandas as pd

# Sort the DataFrame by a single column in ascending order
sorted_df = df.sort_values(by='column_name')

# Sort the DataFrame by a single column in descending order
sorted_df = df.sort_values(by='column_name', ascending=False)
```


## 11.2 Sorting by Multiple Columns
#### We can also sort a DataFrame based on multiple columns. The sorting order can be specified independently for each column.

```python
# Example: Sorting a DataFrame by multiple columns
import pandas as pd

# Sort the DataFrame by multiple columns in ascending order
sorted_df = df.sort_values(by=['column1', 'column2'])

# Sort the DataFrame by multiple columns with different sorting orders
sorted_df = df.sort_values(by=['column1', 'column2'], ascending=[True, False])
```
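
#### A runnable sketch of mixed sort orders, using made-up team scores. Sorting keeps the original row labels, so `reset_index(drop=True)` is used here to renumber the rows afterwards.

```python
import pandas as pd

df_s = pd.DataFrame({
    'team': ['B', 'A', 'B', 'A'],
    'score': [10, 30, 20, 40],
})

# team ascending, then score descending within each team
ordered = df_s.sort_values(by=['team', 'score'], ascending=[True, False])
ordered = ordered.reset_index(drop=True)
print(ordered)
```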


<h1 align="left"><font color='red'>12</font></h1>


# Chapter 12: Adding, Updating, and Deleting Columns
## 12.1 Adding Columns
#### In this subchapter, we will learn how to add new columns to a DataFrame. We can assign constant values or perform computations based on existing columns.

```python
# Example: Adding a new column to a DataFrame
import pandas as pd

# Add a new column with a constant value
df['new_column'] = 10

# Add a new column based on computations with existing columns
df['total'] = df['column1'] + df['column2']

```

## 12.2 Updating Columns
#### We can update the values of existing columns in a DataFrame by assigning new values or applying functions to the column values.

```python
# Example: Updating column values in a DataFrame
import pandas as pd

# Update column values based on a condition
df.loc[df['column'] > 10, 'column'] = 20

# Update column values using a function
df['column'] = df['column'].apply(lambda x: x * 2)
```

## 12.3 Deleting Columns
#### In this subchapter, we will explore how to delete columns from a DataFrame. We can remove columns using the drop() method or the del statement.

```python
# Example: Deleting columns from a DataFrame
import pandas as pd

# Remove columns using the drop() method
df.drop(['column1', 'column2'], axis=1, inplace=True)

# Remove columns using the del statement
del df['column']
```
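
#### The examples above modify `df` in place. As an alternative sketch, `assign()` and `drop(columns=...)` return new DataFrames and leave the original untouched, which can make chained steps easier to follow. The `price` and `qty` columns are made up for illustration.

```python
import pandas as pd

df_c = pd.DataFrame({'price': [10.0, 20.0], 'qty': [3, 1]})

# assign() returns a new DataFrame with the extra column
with_total = df_c.assign(total=df_c['price'] * df_c['qty'])

# drop(columns=...) also returns a new DataFrame
without_qty = with_total.drop(columns=['qty'])

print(without_qty)
print(list(df_c.columns))  # the original is unchanged
```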


<h1 align="left"><font color='red'>13</font></h1>


# Chapter 13: Handling Missing Data: isnull(), dropna(), fillna()
## 13.1 Checking for Missing Data
#### In this subchapter, we will learn how to identify missing or null values in a DataFrame using the isnull() method. This will help us in understanding the presence of missing data in our dataset.

```python
# Example: Checking for missing data in a DataFrame
import pandas as pd

# Check for missing values in the entire DataFrame
missing_values = df.isnull().sum()

# Check for missing values in a specific column
missing_values = df['column'].isnull().sum()
```


## 13.2 Dropping Rows or Columns with Missing Data
#### We can remove rows or columns that contain missing data using the dropna() method. This allows us to clean our dataset and ensure data integrity.



```python
# Example: Dropping rows or columns with missing data
import pandas as pd

# Drop rows with missing values
df_rows = df.dropna(axis=0)

# Alternatively, drop columns with missing values
df_cols = df.dropna(axis=1)
```


## 13.3 Filling Missing Data
#### In this subchapter, we will explore how to fill missing values in a DataFrame using the fillna() method. This allows us to impute or replace missing data with appropriate values.

```python
# Example: Filling missing data in a DataFrame
import pandas as pd

# Fill missing values with a constant value
df_zero = df.fillna(0)

# Alternatively, fill missing values in numeric columns with each column's mean
df_mean = df.fillna(df.mean(numeric_only=True))
```
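
#### A self-contained sketch of the three steps on made-up data. `numeric_only=True` is used so that the text column is skipped when computing means.

```python
import numpy as np
import pandas as pd

df_m = pd.DataFrame({
    'name': ['John', 'Emily', 'Michael'],
    'age': [20.0, np.nan, 40.0],
})

print(df_m.isnull().sum())   # missing values per column

dropped = df_m.dropna()      # rows with any missing value removed
# Fill numeric columns with their column means
filled = df_m.fillna(df_m.mean(numeric_only=True))
print(filled)
```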

<h1 align="left"><font color='red'>14</font></h1>


# Chapter 14: Removing Duplicate Rows: duplicated(), drop_duplicates()
## 14.1 Checking for Duplicate Rows
#### In this subchapter, we will learn how to identify duplicate rows in a DataFrame using the duplicated() method. This will help us in detecting and handling duplicate data.

```python
# Example: Checking for duplicate rows in a DataFrame
import pandas as pd

# Check for duplicate rows in the entire DataFrame
duplicate_rows = df.duplicated()

# Check for duplicate rows in specific columns
duplicate_rows = df.duplicated(subset=['column1', 'column2'])
```


## 14.2 Dropping Duplicate Rows
#### We can remove duplicate rows from a DataFrame using the drop_duplicates() method. This ensures that our dataset only contains unique rows.

```python
# Example: Dropping duplicate rows from a DataFrame
import pandas as pd

# Drop duplicate rows based on all columns
df.drop_duplicates(inplace=True)

# Drop duplicate rows based on specific columns
df.drop_duplicates(subset=['column1', 'column2'], inplace=True)

```
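
#### By default, the first occurrence of each duplicate is kept. The sketch below, on made-up data, also shows the `keep='last'` option of `drop_duplicates()`, which is not covered above.

```python
import pandas as pd

df_d = pd.DataFrame({
    'id': [1, 1, 2, 2],
    'value': ['x', 'x', 'y', 'z'],
})

print(df_d.duplicated().tolist())   # True marks a repeat of an earlier row

unique_rows = df_d.drop_duplicates()                            # full-row duplicates
last_per_id = df_d.drop_duplicates(subset=['id'], keep='last')  # last row per id
print(last_per_id)
```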

<h1 align="left"><font color='red'>15</font></h1>


# Chapter 15: Data Type Conversion and Handling
## 15.1 Converting Data Types
#### In this subchapter, we will explore how to convert the data types of columns in a DataFrame. This can be useful when the original data types need to be adjusted for specific operations or analysis.

```python
# Example: Converting data types in a DataFrame
import pandas as pd

# Convert a column to a different data type
df['column'] = df['column'].astype('new_data_type')

# Convert multiple columns to different data types
df = df.astype({'column1': 'new_data_type1', 'column2': 'new_data_type2'})
```


## 15.2 Handling Categorical Data
#### Categorical data can be represented as strings or numerical codes. In this subchapter, we will learn how to handle categorical data in a DataFrame, including converting strings to categories and performing operations on categorical columns.


```python
# Example: Handling categorical data in a DataFrame
import pandas as pd

# Convert a column to categorical data type
df['column'] = df['column'].astype('category')

# Perform operations on categorical columns
df['column'] = df['column'].cat.codes
```
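
#### `astype()` raises an error when a value cannot be converted. As a sketch of one common alternative (not used in the original notebook), `pd.to_numeric(..., errors='coerce')` turns unparseable values into NaN. The data below is made up for illustration.

```python
import pandas as pd

df_t = pd.DataFrame({
    'amount': ['1', '2', 'oops'],
    'size': ['small', 'large', 'small'],
})

# errors='coerce' turns unparseable strings into NaN instead of raising
df_t['amount'] = pd.to_numeric(df_t['amount'], errors='coerce')

# Categories are sorted alphabetically here: 'large' -> 0, 'small' -> 1
df_t['size'] = df_t['size'].astype('category')
codes = df_t['size'].cat.codes

print(df_t.dtypes)
print(codes.tolist())
```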

<h1 align="left"><font color='red'>16</font></h1>


# Chapter 16: Dealing with Outliers
## 16.1 Identifying Outliers
#### In this subchapter, we will learn how to identify outliers in a DataFrame. Outliers are extreme values that deviate significantly from the majority of the data points.

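```python
# Example: Identifying outliers in a DataFrame
import pandas as pd

# Calculate the z-score for each value in the numeric columns
numeric = df.select_dtypes(include='number')
z_scores = (numeric - numeric.mean()) / numeric.std()

# Identify rows with at least one outlier, based on a threshold
threshold = 3  # a common choice; adjust for your data
outliers = df[(z_scores.abs() > threshold).any(axis=1)]
```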


## 16.2 Handling Outliers
#### Once outliers are identified, we can choose to handle them in different ways. Some common approaches include removing the outliers, capping them at a chosen limit, or transforming the data.

```python
# Example: Handling outliers in a DataFrame
# (pick one approach; z_scores and threshold come from section 16.1)
import numpy as np
import pandas as pd

# Remove rows that contain outliers
df_clean = df[~(z_scores.abs() > threshold).any(axis=1)]

# Or cap values to a specific range (lower_cap and upper_cap are placeholders)
df['column'] = df['column'].clip(lower=lower_cap, upper=upper_cap)

# Or transform the data, e.g. with a log transform (positive values only)
df['column'] = np.log(df['column'])
```
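
#### On a handful of points, a z-score threshold of 3 may never trigger, so this runnable sketch swaps in a different technique: the 1.5 × IQR rule. The values are made up, and the 1.5 multiplier is a convention rather than something taken from this notebook.

```python
import pandas as pd

s_v = pd.Series([10, 12, 11, 13, 12, 11, 100])

# Quartiles and interquartile range
q1, q3 = s_v.quantile(0.25), s_v.quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

is_outlier = (s_v < lower) | (s_v > upper)
removed = s_v[~is_outlier]                     # option 1: drop outliers
capped = s_v.clip(lower=lower, upper=upper)    # option 2: cap at the bounds

print(s_v[is_outlier].tolist())
print(capped.tolist())
```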

<h1 align="left"><font color='red'>17</font></h1>


# Chapter 17: Concatenating and Merging DataFrames
## 17.1 Concatenating DataFrames
#### In this subchapter, we will explore how to concatenate multiple DataFrames along different axes (rows or columns). This allows us to combine data from multiple sources into a single DataFrame.

```python
# Example: Concatenating DataFrames along rows
import pandas as pd

# Concatenate DataFrames along rows
concatenated_df = pd.concat([df1, df2, df3], axis=0)

# Example: Concatenating DataFrames along columns
import pandas as pd

# Concatenate DataFrames along columns
concatenated_df = pd.concat([df1, df2, df3], axis=1)
```


## 17.2 Merging DataFrames
#### We can merge DataFrames based on common columns or indices using the merge() function. This allows us to combine data from multiple DataFrames into a single DataFrame based on specified merge keys.

```python
# Example: Merging DataFrames based on common columns
import pandas as pd

# Merge DataFrames based on common columns
merged_df = pd.merge(df1, df2, on='common_column')

# Example: Merging DataFrames based on common indices
import pandas as pd

# Merge DataFrames based on common indices
merged_df = pd.merge(df1, df2, left_index=True, right_index=True)
```
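
#### `pd.merge()` performs an inner join by default, so rows without a match on both sides are dropped. A minimal sketch with made-up tables shows the difference from `how='outer'`, and a row-wise `concat` with a fresh index:

```python
import pandas as pd

left = pd.DataFrame({'key': [1, 2, 3], 'name': ['John', 'Emily', 'Michael']})
right = pd.DataFrame({'key': [2, 3, 4], 'city': ['Pune', 'Lima', 'Oslo']})

inner = pd.merge(left, right, on='key')                # keys present in both
outer = pd.merge(left, right, on='key', how='outer')   # keys from either side

# ignore_index=True renumbers rows instead of repeating 0, 1, 2
stacked = pd.concat([left, left], axis=0, ignore_index=True)

print(inner)
print(outer)
```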

<h1 align="left"><font color='red'>18</font></h1>


# Chapter 18: Reshaping Data: Pivot, Stack, Unstack
## 18.1 Reshaping with Pivot
#### In this subchapter, we will learn how to reshape data using the pivot() function. Pivot allows us to convert data from a long format to a wide format, reorganizing the data based on specified columns.


```python
# Example: Reshaping data using pivot
import pandas as pd

# Reshape the data from long to wide format
pivot_df = df.pivot(index='index_column', columns='columns_column', values='values_column')
```

## 18.2 Reshaping with Stack and Unstack
#### We can also reshape data using the stack() and unstack() methods. These methods transform data between wide and long formats, allowing us to manipulate hierarchical index levels.

```python
# Example: Reshaping data using stack and unstack
import pandas as pd

# Reshape data from wide to long format using stack
stacked_df = df.stack()

# Reshape data from long back to wide format using unstack
unstacked_df = stacked_df.unstack()
```
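
#### A round trip on made-up temperature readings may help connect the three methods. All names and values below are illustrative assumptions.

```python
import pandas as pd

long_df = pd.DataFrame({
    'date': ['d1', 'd1', 'd2', 'd2'],
    'city': ['Pune', 'Lima', 'Pune', 'Lima'],
    'temp': [30, 20, 31, 21],
})

# Long to wide: one row per date, one column per city
wide = long_df.pivot(index='date', columns='city', values='temp')

# Wide to long: a Series indexed by (date, city)
stacked = wide.stack()

# And back to wide again
round_trip = stacked.unstack()

print(wide)
print(stacked)
```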


<h1 align="left"><font color='red'>19</font></h1>

# Chapter 19: Grouping and Aggregating Data: groupby(), aggregating functions
## 19.1 Grouping Data
#### In this subchapter, we will explore how to group data in a DataFrame using the groupby() function. Grouping allows us to split the data into groups based on specified criteria.

```python
# Example: Grouping data in a DataFrame
import pandas as pd

# Group the data based on a column
grouped_data = df.groupby('column')

# Group the data based on multiple columns
grouped_data = df.groupby(['column1', 'column2'])
```


## 19.2 Aggregating Data
#### Once the data is grouped, we can apply aggregating functions to obtain summary statistics for each group. Common aggregating functions include mean, sum, count, max, min, etc.

```python
# Example: Aggregating data in grouped DataFrame
import pandas as pd

# Calculate the mean of each group (numeric columns only)
mean_values = grouped_data.mean(numeric_only=True)

# Calculate the sum of each group
sum_values = grouped_data.sum()

# Calculate the count of each group
count_values = grouped_data.size()
```
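
#### Several aggregations can be computed in one call with `agg()`. This sketch uses made-up data and selects the `points` column first, so non-numeric columns cannot interfere with the calculation.

```python
import pandas as pd

df_g = pd.DataFrame({
    'team': ['A', 'A', 'B', 'B', 'B'],
    'points': [10, 20, 5, 15, 25],
})

# One row per team, one column per aggregation
summary = df_g.groupby('team')['points'].agg(['mean', 'sum', 'count'])
print(summary)
```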


<h1 align="left"><font color='red'>20</font></h1>


# Chapter 20: Applying Functions to Data: apply(), applymap()
## 20.1 Applying Functions to Series
#### In this subchapter, we will learn how to apply functions to Series objects using the apply() method. This allows us to perform custom operations on each element of a Series.

```python
# Example: Applying a function to a Series
import pandas as pd

# Apply a function to each element of a Series
result = series.apply(function)
```


## 20.2 Applying Functions to DataFrames
#### We can also apply functions to DataFrames using the apply() method. This allows us to apply custom operations to either rows or columns of a DataFrame.

```python
# Example: Applying a function to a DataFrame
import pandas as pd

# Apply a function to each row of a DataFrame
result = df.apply(function, axis=1)

# Apply a function to each column of a DataFrame
result = df.apply(function, axis=0)
```


## 20.3 Applying Element-wise Functions
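#### To apply functions element-wise to a DataFrame, we can use the applymap() method. This allows us to perform custom operations on each element of a DataFrame. In pandas 2.1 and later, applymap() is deprecated in favor of the equivalent DataFrame.map() method.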
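
#### A minimal sketch of element-wise and row-wise application on made-up data. The `hasattr` check is an assumption made so the example runs both on older versions that only have `applymap()` and on newer ones that provide `DataFrame.map()`; `square` is an illustrative helper.

```python
import pandas as pd

df_e = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})

def square(x):
    """Return x multiplied by itself."""
    return x * x

# Pick whichever element-wise method this pandas version provides
elementwise = df_e.map if hasattr(df_e, 'map') else df_e.applymap
squared = elementwise(square)

# apply() with axis=1 passes each row to the function
row_sums = df_e.apply(lambda row: row['a'] + row['b'], axis=1)

print(squared)
print(row_sums.tolist())
```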