# Python Tutorial: Data Wrangling

Data wrangling, also known as data munging, is the process of cleaning, transforming, and enriching raw data so that it is ready for analysis. Python offers several libraries for this work, and pandas is among the most widely used.


## Installation
  
You can install pandas using pip. In a notebook, the `%pip` magic installs the package into the environment of the running kernel:


In [None]:
%pip install pandas


In [None]:
import pandas as pd


## Example 1: Loading Data


In [None]:
# Read a CSV file into a DataFrame
df = pd.read_csv('data.csv')
print(df.head(50))
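
If you do not have a `data.csv` file at hand, the same call can be tried on in-memory text. The sketch below is only an illustration: the sample rows are made up, and the column names (`Date`, `Category`, `Quantity`, `Price`) are assumed from the later examples in this tutorial.

```python
import io

import pandas as pd

# Hypothetical stand-in for data.csv; the rows are invented for illustration.
csv_text = """Date,Category,Quantity,Price
2024-01-05,Books,2,12.50
2024-01-06,Toys,1,8.00
2024-01-06,Books,3,12.50
"""

# read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

print(df.head())    # first rows (5 by default)
print(df.shape)     # (number of rows, number of columns)
print(df.dtypes)    # the type pandas inferred for each column
```

Checking `shape` and `dtypes` right after loading is a quick way to spot columns that were not parsed the way you expected.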


## Example 2: Cleaning Data


In [None]:
# Remove rows that are exact duplicates of an earlier row
df = df.drop_duplicates()
print(df.head(50))
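
By default `drop_duplicates()` compares whole rows. The sketch below, on a small made-up DataFrame (the `OrderID` column is hypothetical), shows how to count duplicates first and how the `subset` and `keep` parameters change what counts as a duplicate.

```python
import pandas as pd

# Invented sample data; OrderID is a hypothetical key column.
df = pd.DataFrame({
    "OrderID": [1, 1, 2, 3, 3],
    "Category": ["Books", "Books", "Toys", "Books", "Games"],
    "Quantity": [2, 2, 1, 3, 3],
})

# Count rows that are identical to an earlier row.
n_dupes = df.duplicated().sum()

# Drop rows only when every column matches an earlier row.
deduped_all = df.drop_duplicates()

# Treat rows with the same OrderID as duplicates and keep the last one.
deduped_by_id = df.drop_duplicates(subset=["OrderID"], keep="last")

print(n_dupes)
print(deduped_all)
print(deduped_by_id)
```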


In [None]:
# Handle missing values by dropping every row that contains at least one
df = df.dropna()
print(df.head(50))
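
Dropping every incomplete row can discard a lot of data, so it helps to count the missing values first and consider alternatives. The following is a minimal sketch on invented data; which strategy is appropriate depends on your dataset.

```python
import pandas as pd

nan = float("nan")

# Invented sample data with one missing value in each column.
df = pd.DataFrame({
    "Category": ["Books", "Toys", None, "Games"],
    "Quantity": [2, nan, 1, 4],
    "Price": [12.5, 8.0, 5.0, nan],
})

# Count missing values per column before deciding what to do.
missing_per_column = df.isna().sum()

# Option 1: drop rows with a missing value in any column.
dropped_any = df.dropna()

# Option 2: drop rows only when a specific column is missing.
dropped_subset = df.dropna(subset=["Category"])

# Option 3: fill missing values in a chosen column with a fixed value.
filled = df.fillna({"Quantity": 0})

print(missing_per_column)
print(len(dropped_any), len(dropped_subset))
print(filled)
```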


In [None]:
# Convert the 'Date' column from strings to datetime values
df['Date'] = pd.to_datetime(df['Date'])
print(df.head(50))
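
Real files often contain values that cannot be converted. As a sketch on invented data, passing `errors="coerce"` turns unparseable entries into missing values instead of raising an error; `pd.to_numeric` offers the same option for numbers.

```python
import pandas as pd

# Invented sample data: the last row holds values that cannot be parsed.
df = pd.DataFrame({
    "Date": ["2024-01-05", "2024-02-10", "not a date"],
    "Quantity": ["2", "3", "x"],
})

# Unparseable dates become NaT (the datetime missing value).
df["Date"] = pd.to_datetime(df["Date"], errors="coerce")

# Unparseable numbers become NaN.
df["Quantity"] = pd.to_numeric(df["Quantity"], errors="coerce")

print(df)
print(df.dtypes)
```

Counting the missing values after conversion (for example with `df.isna().sum()`) shows how many entries failed to parse.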


## Example 3: Transforming Data


In [None]:
# Create a new column
df['Total'] = df['Quantity'] * df['Price']

# Group by a categorical variable and calculate statistics
summary = df.groupby('Category')['Total'].sum()
print(summary)
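
To see the column arithmetic and the grouping on concrete numbers, here is a self-contained sketch with invented rows. It also uses `agg` to compute several statistics in one pass, which is one possible extension of the `sum()` call above.

```python
import pandas as pd

# Invented sample data using the column names from the example above.
df = pd.DataFrame({
    "Category": ["Books", "Toys", "Books", "Games"],
    "Quantity": [2, 1, 3, 4],
    "Price": [12.5, 8.0, 12.5, 20.0],
})

# Element-wise multiplication: one Total per row.
df["Total"] = df["Quantity"] * df["Price"]

# Several statistics per category; the result is indexed by Category.
summary = df.groupby("Category")["Total"].agg(["sum", "mean", "count"])

print(summary)
```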


## Exercise 1: Loading Data

Read a CSV file named 'sales-data.csv' into a DataFrame and display the first 5 rows.


In [None]:
# Solution
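
One possible solution is sketched below. Because 'sales-data.csv' is not provided with this tutorial, the sketch first writes a small hypothetical file to a temporary folder; the `Product`, `Revenue`, and `Cost` columns and their values are assumptions made for illustration.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical contents for sales-data.csv, written to a temporary folder
# so the sketch runs without the real file.
csv_path = Path(tempfile.mkdtemp()) / "sales-data.csv"
csv_path.write_text(
    "Product,Revenue,Cost\n"
    "A,100,60\n"
    "B,150,80\n"
    "C,300,180\n"
    "D,200,100\n"
    "E,120,90\n"
    "F,90,40\n"
)

sales = pd.read_csv(csv_path)

# head() returns the first 5 rows when called without an argument.
first_five = sales.head()
print(first_five)
```

With the real file in your working directory, `pd.read_csv('sales-data.csv')` followed by `head()` is all that is needed.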


## Exercise 2: Filling Missing Values

Handle missing values in the DataFrame from Exercise 1 by filling the gaps in each numeric column with the mean of that column.


In [None]:
# Solution
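
One possible approach is sketched below on an invented DataFrame (the `Product`, `Revenue`, and `Cost` columns are assumptions). A mean only exists for numeric columns, so the sketch restricts the fill to those and leaves text columns untouched.

```python
import pandas as pd

nan = float("nan")

# Invented stand-in for the DataFrame from Exercise 1.
sales = pd.DataFrame({
    "Product": ["A", "B", "C", "D"],
    "Revenue": [100.0, nan, 300.0, 200.0],
    "Cost": [60.0, 80.0, nan, 100.0],
})

# Select only the numeric columns, since text columns have no mean.
numeric_cols = sales.select_dtypes(include="number").columns

# mean() skips missing values; fillna matches each mean to its column by name.
sales[numeric_cols] = sales[numeric_cols].fillna(sales[numeric_cols].mean())

print(sales)
```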


## Exercise 3: Creating a New Column

Add a new column to the DataFrame from Exercise 1 that holds the profit for each row (Revenue - Cost).


In [None]:
# Solution
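
A minimal sketch of one solution, again on invented data. It assumes the file has columns named `Revenue` and `Cost`, as the exercise text suggests; the new column name `Profit` is a choice made here.

```python
import pandas as pd

# Invented stand-in for the DataFrame from Exercise 1.
sales = pd.DataFrame({
    "Product": ["A", "B", "C"],
    "Revenue": [100.0, 150.0, 300.0],
    "Cost": [60.0, 80.0, 180.0],
})

# Subtraction is applied row by row across the two columns.
sales["Profit"] = sales["Revenue"] - sales["Cost"]

print(sales)
```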


## Summary

Data wrangling is an essential step in data analysis because it makes the data clean, consistent, and correctly typed before any conclusions are drawn from it. This tutorial used pandas to load a CSV file, remove duplicates, handle missing values, convert data types, and derive and aggregate columns.
                                                                                                                                                                                                                     
