# Exploring Data Using Pandas

Exploring data with Pandas means loading a dataset, inspecting its structure, and running operations that reveal patterns or problems in it. Common techniques include:

**Loading Data:**
   - Read data from files with `read_csv()` or `read_excel()`, or from a database query with `read_sql()`.
   - Explore the structure and initial rows of the DataFrame using `head()`, `tail()`, or `sample()` methods.
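
As a minimal sketch of loading and previewing data, the example below reads CSV text from an in-memory buffer so that it runs without any external file. The CSV content and variable names are made up for illustration; in practice you would pass a file path to `read_csv()`.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a real file on disk.
csv_text = """Name,Age,City
John,25,New York
Jane,30,London
Mike,35,Sydney
Anna,28,Berlin
"""

# read_csv() accepts a path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

first_rows = df.head(2)                        # first two rows
last_rows = df.tail(1)                         # last row
random_rows = df.sample(n=2, random_state=0)   # two random rows, reproducible

print(first_rows)
print(last_rows)
print(random_rows)
```

Passing `random_state` to `sample()` makes the random selection repeatable between runs.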

**Understanding the Data:**
   - Check the dimensions of the DataFrame using `shape`.
   - Inspect the column names using `columns`.
   - Obtain summary statistics of numeric columns with `describe()`.
   - Get an overview of the DataFrame, including column data types, non-null counts, and memory usage, using `info()`.
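
The inspection steps above might look like the following sketch. The small DataFrame is invented for illustration and deliberately contains two missing values so that the non-null counts reported by `info()` differ between columns.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing Age and one missing City.
df = pd.DataFrame({
    "Name": ["John", "Jane", "Mike", "Anna"],
    "Age": [25, 30, np.nan, 28],
    "City": ["New York", "London", "Sydney", None],
})

print(df.shape)              # (rows, columns)
print(df.columns.tolist())   # column names as a plain list
print(df.describe())         # summary statistics for numeric columns
df.info()                    # dtypes and non-null counts (prints directly)

# info() reports non-null counts; isna().sum() gives missing counts explicitly.
missing_per_column = df.isna().sum()
print(missing_per_column)
```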

**Data Selection and Filtering:**
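   - Select a column using indexing (`df['column_name']`) or dot notation (`df.column_name`). Dot notation works only when the name is a valid Python identifier and does not clash with an existing DataFrame attribute or method.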
   - Filter rows based on conditions using boolean indexing (`df[condition]`).
   - Combine conditions with the element-wise operators `&` (AND), `|` (OR), and `~` (NOT), wrapping each condition in parentheses.
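
A short sketch of selection and filtering, using made-up data. Note the parentheses around each condition: `&` and `|` bind more tightly than comparison operators in Python, so omitting them typically raises an error or gives an unintended result.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({
    "Name": ["John", "Jane", "Mike", "Anna"],
    "Age": [25, 30, 35, 28],
    "City": ["New York", "London", "Sydney", "Berlin"],
})

ages = df["Age"]                      # single column, returned as a Series
name_and_city = df[["Name", "City"]]  # list of columns, returned as a DataFrame

# Boolean indexing: keep rows where the condition is True.
over_28 = df[df["Age"] > 28]

# Combined conditions, each wrapped in parentheses.
both = df[(df["Age"] > 26) & (df["City"] != "London")]
either = df[(df["Age"] < 26) | (df["City"] == "Sydney")]

print(over_28)
print(both)
print(either)
```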

**Data Aggregation and Grouping:**
   - Group data based on one or more columns using `groupby()`.
   - Apply aggregation functions like `sum()`, `mean()`, `count()`, etc., to calculate statistics for grouped data.
   - Perform multiple aggregations simultaneously using `agg()`.
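
Grouping and aggregation could be sketched as follows. The sales figures and column names are hypothetical. The last call uses named aggregation, which lets you choose the output column names.

```python
import pandas as pd

# Hypothetical sales records.
df = pd.DataFrame({
    "City": ["London", "London", "Sydney", "Sydney", "Sydney"],
    "Sales": [100, 150, 200, 50, 250],
})

# One aggregation per group.
total_by_city = df.groupby("City")["Sales"].sum()

# Several aggregations at once with agg().
summary = df.groupby("City")["Sales"].agg(["sum", "mean", "count"])

# Named aggregation: output_name=(column, function).
named = df.groupby("City").agg(
    total_sales=("Sales", "sum"),
    max_sale=("Sales", "max"),
)

print(total_by_city)
print(summary)
print(named)
```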

**Data Visualization:**
   - Generate plots using Pandas' integration with Matplotlib or other visualization libraries.
   - Use methods like `plot()`, `hist()`, and `boxplot()`, or `plot.scatter()` for scatter plots, to create visual representations of data.
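
The sketch below draws three common exploratory plots side by side. It assumes Matplotlib is installed, and the age and income values are invented for illustration. In a notebook the figure usually renders on its own; in a script you would call `plt.show()` or `fig.savefig(...)` afterwards.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({
    "Age": [22, 25, 30, 35, 41, 48, 52, 60],
    "Income": [28, 32, 45, 50, 62, 70, 68, 75],
})

# One figure with three panels.
fig, axes = plt.subplots(1, 3, figsize=(12, 3))

# Histogram of a single column.
df["Age"].plot(kind="hist", bins=5, ax=axes[0], title="Age distribution")

# Box plot of a single column.
df.boxplot(column="Income", ax=axes[1])

# Scatter plot of two columns; scatter lives under the plot accessor.
df.plot.scatter(x="Age", y="Income", ax=axes[2], title="Age vs. Income")

fig.tight_layout()
```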

**Handling Missing Data:**
   - Identify missing values in the DataFrame using `isna()` or its alias `isnull()`.
   - Handle missing data by either dropping rows/columns using `dropna()` or filling missing values with `fillna()`.
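
A minimal sketch of both strategies on made-up data. Filling `Age` with the column mean and `City` with a placeholder string is just one possible choice; the right fill value depends on the dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical data with gaps in two columns.
df = pd.DataFrame({
    "Name": ["John", "Jane", "Mike", "Anna"],
    "Age": [25, np.nan, 35, np.nan],
    "City": ["New York", "London", None, "Berlin"],
})

# Count missing values per column.
missing_counts = df.isna().sum()

# Drop every row that has at least one missing value.
complete_rows = df.dropna()

# Drop rows only when a specific column is missing.
rows_with_age = df.dropna(subset=["Age"])

# Fill each column with its own replacement value.
filled = df.fillna({"Age": df["Age"].mean(), "City": "Unknown"})

print(missing_counts)
print(complete_rows)
print(filled)
```

Each of these calls returns a new DataFrame and leaves `df` unchanged.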

**Data Transformation:**
   - Clean the data by flagging duplicate rows with `duplicated()` and removing them with `drop_duplicates()`.
   - Convert data types of columns using `astype()`.
   - Extract information from text or date columns using the `.str` and `.dt` accessors.
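
These transformations might be combined as in the sketch below. The data is hypothetical: ages and dates are stored as strings, and one row is repeated, which is a common state for freshly loaded data.

```python
import pandas as pd

# Hypothetical raw data: string-typed columns and one duplicated row.
df = pd.DataFrame({
    "Name": ["John", "Jane", "Jane", "Mike"],
    "Age": ["25", "30", "30", "35"],
    "Joined": ["2021-01-15", "2022-06-01", "2022-06-01", "2023-11-30"],
})

# duplicated() flags repeats of an earlier row; drop_duplicates() removes them.
dup_mask = df.duplicated()
df = df.drop_duplicates().reset_index(drop=True)

# Convert column types.
df["Age"] = df["Age"].astype(int)
df["Joined"] = pd.to_datetime(df["Joined"])

# The .str accessor works on text columns, .dt on datetime columns.
df["NameUpper"] = df["Name"].str.upper()
df["JoinYear"] = df["Joined"].dt.year

print(dup_mask)
print(df)
```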

These are only some of the ways to explore data with Pandas. Depending on your dataset and analysis goals, you may need additional operations. Pandas provides tools to manipulate, analyze, and visualize data, which makes it well suited to data exploration in Python.

```python
import pandas as pd

# Create a DataFrame from a dictionary of columns
data = {'Name': ['John', 'Jane', 'Mike'],
        'Age': [25, 30, 35],
        'City': ['New York', 'London', 'Sydney']}
df = pd.DataFrame(data)

# Keep rows where Age is greater than 28
filtered_df = df[df['Age'] > 28]

# Sort rows alphabetically by Name (returns a new DataFrame)
sorted_df = df.sort_values('Name')

# Display the original, filtered, and sorted DataFrames
print(df)
print(filtered_df)
print(sorted_df)
```