# Pandas

Pandas is a powerful and widely used open-source Python library for data manipulation and analysis. It provides easy-to-use data structures and functions to work with structured data, making it an essential tool for data scientists, analysts, and researchers.

Key features of the Pandas library include:

* **DataFrame:** 
The primary data structure in Pandas is the DataFrame, which is a two-dimensional labeled data structure with columns of potentially different data types. It can be thought of as a spreadsheet or SQL table. DataFrames allow for easy manipulation and analysis of data, including filtering, sorting, grouping, and aggregation.

In [1]:
import pandas as pd

# Creating a DataFrame from a dictionary
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Age': [25, 30, 35, 40],
        'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']}
df = pd.DataFrame(data)
print(df)

      Name  Age         City
0    Alice   25     New York
1      Bob   30  Los Angeles
2  Charlie   35      Chicago
3    David   40      Houston
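
As a small follow-up sketch of the "columns of potentially different data types" point, the snippet below rebuilds the same DataFrame and inspects its structure. The variable names `ages`, `subset`, `n_rows`, and `n_cols` are illustrative only, and the exact dtype labels printed depend on the installed pandas version.

```python
import pandas as pd

# Same example data as above
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 30, 35, 40],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston'],
})

# Each column carries its own data type
print(df.dtypes)

# Selecting a single column returns a Series
ages = df['Age']

# Selecting a list of columns returns a smaller DataFrame
subset = df[['Name', 'City']]

# shape is a (rows, columns) tuple
n_rows, n_cols = df.shape
print(n_rows, n_cols)
```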


* **Reading Data from a File:**
Pandas can read data from various sources, including CSV, Excel, and JSON files as well as SQL databases, through functions such as `read_csv`, `read_excel`, `read_json`, and `read_sql`.

In [None]:
# Reading data from a CSV file
df = pd.read_csv('data.csv')
print(df.head())  # Print first few rows of the DataFrame
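
The cell above needs a `data.csv` file on disk. As a self-contained sketch, the reader functions also accept file-like objects, so the example below parses in-memory text instead. The sample contents of `csv_text` and `json_text` are made up for illustration.

```python
import io
import pandas as pd

# Hypothetical CSV content held in memory instead of a file on disk
csv_text = "Name,Age,City\nAlice,25,New York\nBob,30,Los Angeles\n"
df_csv = pd.read_csv(io.StringIO(csv_text))

# usecols restricts parsing to the listed columns
df_names_ages = pd.read_csv(io.StringIO(csv_text), usecols=['Name', 'Age'])

# Hypothetical JSON content: a list of records, one object per row
json_text = '[{"Name": "Alice", "Age": 25}, {"Name": "Bob", "Age": 30}]'
df_json = pd.read_json(io.StringIO(json_text), orient='records')

print(df_csv.head())
print(df_json.head())
```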

* **Basic Data Manipulation:**
Pandas provides functions for data manipulation, including filtering, sorting, grouping, and aggregation.

In [None]:
# These examples assume df has 'Age' and 'City' columns, as in the first DataFrame
# Filtering rows based on a condition
young_people = df[df['Age'] < 30]

# Sorting DataFrame by Age in descending order
df_sorted = df.sort_values(by='Age', ascending=False)

# Grouping data by City and calculating average Age
avg_age_by_city = df.groupby('City')['Age'].mean()
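
Grouping is easier to see when some groups have more than one row. The sketch below uses a small made-up DataFrame named `people` (not part of the data above) with repeated cities, and shows a combined filter, a multi-column sort, and several aggregations in one call.

```python
import pandas as pd

# Illustrative data with repeated cities so each group has several rows
people = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
    'Age': [25, 30, 35, 40, 28],
    'City': ['New York', 'Chicago', 'Chicago', 'New York', 'Chicago'],
})

# Combine conditions with & (and) or | (or); each condition needs parentheses
young_chicago = people[(people['Age'] < 30) & (people['City'] == 'Chicago')]

# Sort by City ascending, then by Age descending within each city
by_city_then_age = people.sort_values(by=['City', 'Age'], ascending=[True, False])

# Compute several aggregates per city in one call
summary = people.groupby('City')['Age'].agg(['mean', 'max', 'count'])
print(summary)
```

Here `summary` is a DataFrame indexed by city, with one column per aggregate.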

* **Handling Missing Values:**
Pandas provides functions to handle missing values, including checking for missing values, dropping rows with missing values, and filling missing values with a specified value.

In [None]:
# Checking for missing values
print(df.isnull().sum())

# Dropping rows with missing values
df_cleaned = df.dropna()

# Filling missing Age values with the mean of the Age column
df_filled = df.fillna({'Age': df['Age'].mean()})
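
The calls above only have a visible effect when the data contains gaps. The following sketch builds a small DataFrame with deliberately missing entries (the name `df_missing` and the placeholder `'Unknown'` are illustrative choices) so each step can be checked.

```python
import numpy as np
import pandas as pd

# Illustrative data with one missing Age and one missing City
df_missing = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, np.nan, 35, 40],
    'City': ['New York', 'Los Angeles', None, 'Houston'],
})

# Count missing values per column
missing_counts = df_missing.isnull().sum()
print(missing_counts)

# Drop every row that has at least one missing value
rows_complete = df_missing.dropna()

# Drop rows only when a specific column is missing
rows_with_age = df_missing.dropna(subset=['Age'])

# Fill each column with its own replacement value
filled = df_missing.fillna({
    'Age': df_missing['Age'].mean(),  # mean() skips missing values by default
    'City': 'Unknown',
})
print(filled)
```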

* **Exporting Data to a File:**
Pandas can export a DataFrame to various file formats, such as CSV, Excel, and JSON.


In [None]:
# Exporting DataFrame to a CSV file
df.to_csv('output.csv', index=False)
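
To see what the export produces without writing to disk, the sketch below relies on `to_csv` and `to_json` returning a string when no path is given. The DataFrame `df_out` is a made-up example, and reading the CSV text back is a quick way to confirm the round trip.

```python
import io
import pandas as pd

# Illustrative data to export
df_out = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30]})

# With no path argument, to_csv returns the CSV text as a string
csv_text = df_out.to_csv(index=False)

# Without index=False, the row index is written as an extra first column
csv_with_index = df_out.to_csv()

# orient='records' writes one JSON object per row
json_text = df_out.to_json(orient='records')

# Reading the exported text back recovers the same columns and values
round_trip = pd.read_csv(io.StringIO(csv_text))

print(csv_text)
print(json_text)
```

Passing `index=False` is usually what you want when the index is just the default row numbers, since it would otherwise reappear as an unnamed column when the file is read back.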

Overall, Pandas covers the core workflow for structured data in Python: loading it, cleaning it, filtering, sorting, and aggregating it, and writing the results back out. Its concise syntax and broad functionality make it a popular choice for data professionals.