# **Pandas**

`Pandas` is an `open-source` Python library for `data manipulation and analysis` that is widely used in `data science`, `AI`, and `machine learning`. It offers high-level data structures and tools that make it easy to work with structured (tabular) data, including `large datasets` that would be awkward to handle in a spreadsheet.

Put simply, `Pandas` is a `super-organized tool` that helps you `manage` and `understand data`. Think of it as a combination of a powerful spreadsheet (like Excel) and a database, but all controlled through `Python`.

> ### Pandas gives you two main data structures:

`Series`: Imagine a single column in Excel, where each item has a label or index. This is useful for lists or single columns of data.

`DataFrame`: Think of a DataFrame as a full table with rows and columns, like a whole spreadsheet page where each column can hold different types of data (like numbers, words, or dates).
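
To make the two structures concrete, here is a minimal sketch. The item names, prices, and stock flags are made-up values used only for illustration.

```python
import pandas as pd

# A Series: one labelled column of values (hypothetical prices).
prices = pd.Series([10.5, 20.0, 7.25], index=["pen", "book", "eraser"], name="price")

# A DataFrame: a table whose columns can each hold a different type of data.
items = pd.DataFrame(
    {
        "item": ["pen", "book", "eraser"],
        "price": [10.5, 20.0, 7.25],
        "in_stock": [True, False, True],
    }
)

print(prices["book"])        # look up a value by its label
print(items.shape)           # (number of rows, number of columns)
print(type(items["price"]))  # each DataFrame column is itself a Series
```

Selecting a single column from a DataFrame returns a Series, which is why the two structures are usually learned together.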

> ### Pandas is designed to help you do things like:

- Clean up data (fixing or filling in missing values, removing bad data)

- Organize and filter data (find specific rows or groups)

- Calculate summaries (like averages or counts)

- Combine data from different tables

- Visualize data (Pandas works well with chart libraries)
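
The tasks listed above can be sketched in a few lines. This is only an illustration on a tiny, made-up sales table (the cities, unit counts, and province lookup are assumptions, not data from this notebook).

```python
import pandas as pd

# Hypothetical sales data with one missing value in "units".
sales = pd.DataFrame(
    {
        "city": ["Lahore", "Karachi", "Lahore", "Karachi"],
        "units": [10, None, 30, 20],
    }
)

# Clean: replace the missing value with 0.
cleaned = sales.fillna({"units": 0})

# Filter: keep only the rows for one city.
lahore = cleaned[cleaned["city"] == "Lahore"]

# Summarize: average units per city.
avg_units = cleaned.groupby("city")["units"].mean()

# Combine: attach a column from a second (hypothetical) lookup table.
regions = pd.DataFrame({"city": ["Lahore", "Karachi"], "province": ["Punjab", "Sindh"]})
merged = cleaned.merge(regions, on="city", how="left")

print(lahore)
print(avg_units)
print(merged)
```

Filling missing values with 0 is just one possible choice here; depending on the data, dropping the rows or filling with an average may be more appropriate.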

> ### Where Do We Use Pandas?
You can use Pandas anytime you’re dealing with a lot of structured data, like:

- `Data Science`: Preparing and exploring data before you build a model

- `Finance:` Analyzing stock prices, trends, or economic data

- `Web Apps:` Storing, processing, and displaying data on the backend

- `Machine Learning:` Prepping data so models can use it effectively

- `Research:` Organizing experimental data for analysis

> ### Why is Pandas Important for Data Science and AI?
- `Cleaning and Preparing Data:` Data usually needs fixing before you can analyze it, like filling in missing info or removing outliers. Pandas makes this fast and easy.

- `Exploring and Understanding Data:` Before making predictions, you need to understand your data (patterns, relationships, trends). Pandas makes it easy to explore and find insights.

- `Handling Large Datasets Efficiently:` Pandas is built on top of NumPy and uses vectorized operations, so it can process datasets that fit in memory (often millions of rows) far faster than plain Python loops.

- `Converting Data for AI:` Pandas can help format and transform data into a form that AI and machine learning models can understand.

- `Works Well with Other Tools:` Pandas works smoothly with other libraries like NumPy (for math), Matplotlib (for charts), and Scikit-learn (for machine learning), making it a core part of data science in Python.
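
As a rough illustration of "converting data for AI", the sketch below turns a text column into numeric columns and then hands the result to NumPy. The brands and prices are made-up values, and one-hot encoding with `pd.get_dummies` is only one of several possible ways to encode categories.

```python
import numpy as np
import pandas as pd

# Hypothetical product table: one text column, one numeric column.
df = pd.DataFrame({"brand": ["Oppo", "Vivo", "Oppo"], "price": [300, 250, 320]})

# One-hot encode the text column so every column is numeric.
encoded = pd.get_dummies(df, columns=["brand"], dtype=int)

# Convert to a NumPy array, the input format most ML libraries expect.
features = encoded.to_numpy()

print(encoded)
print(features.shape)
```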

In short, Pandas makes it easy to turn messy or complex data into organized, ready-to-use data for data science and AI.

You can learn more about `Pandas` [here](https://pandas.pydata.org/about/index.html).


---

## **Install the Pandas Library**

In [2]:
# Uncomment the next line to install Pandas (-q keeps pip's output quiet)
# !pip install pandas -q

## **Importing Libraries**

In [3]:
import pandas as pd

## **Series**


A `Series` in Pandas is a `one-dimensional array-like structure` with `labels`, also known as an `index`, for each element. Think of a Series as a single column of data, similar to a list or column in a spreadsheet, where each entry has a specific label (index) that helps you locate it. A Series can hold various types of data, such as numbers, text, dates, or even more complex data types.

> ### Key Features of a Pandas Series:

- `Values:` The actual data in the Series (e.g., numbers, strings).

- `Index:` A set of labels that allows you to access each value quickly. If not specified, Pandas will create a default integer index starting from 0.

- `Homogeneity:` A Series has a single `dtype`. Its elements are usually all of the same type; if you mix types (for example, numbers and strings), Pandas stores them under the generic `object` dtype.

- `Flexible Indexing:` You can use the index to retrieve or manipulate specific elements quickly, similar to a dictionary.
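
The features above can be checked directly on a small Series. This is a minimal sketch; the subject names and marks are sample values chosen for illustration.

```python
import pandas as pd

marks = pd.Series([100, 30, 40], index=["Maths", "Urdu", "Science"])

print(marks.values)   # Values: the underlying data as a NumPy array
print(marks.index)    # Index: the labels attached to each value
print(marks.dtype)    # Homogeneity: one dtype for the whole Series
print(marks["Urdu"])  # Flexible indexing: dictionary-style lookup by label
print(marks.iloc[0])  # lookup by position instead of label

# Without an explicit index, Pandas numbers the entries from 0.
default = pd.Series([100, 30, 40])
print(list(default.index))

# Mixing numbers and strings falls back to the generic object dtype.
mixed = pd.Series([100, "Hasnain"])
print(mixed.dtype)
```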

### **Series from list**

In [4]:
mobiles = ['Iphone', 'Samsung', 'Oppo', 'Vivo']

In [5]:
type(mobiles)

list

In [6]:
import numpy as np

type(np.array(mobiles))

numpy.ndarray

In [7]:
pd.Series(mobiles)

0     Iphone
1    Samsung
2       Oppo
3       Vivo
dtype: object

In [8]:
# a Series built from the list of strings is a pandas Series object
type(pd.Series(mobiles))

pandas.core.series.Series

In [9]:
# integers mixed with strings: the Series falls back to the object dtype
marks = [100, 30, 40, 20, 'Hasnain', "Ahmad"]
pd.Series(marks)

0        100
1         30
2         40
3         20
4    Hasnain
5      Ahmad
dtype: object

In [11]:
np.array(marks)

array(['100', '30', '40', '20', 'Hasnain', 'Ahmad'], dtype='<U21')

In [12]:
# custom index
marks = [100, 30, 40, 20]
index = ['Maths', 'Urdu', 'Science', 'English']

pd.Series(marks, index=index)

Maths      100
Urdu        30
Science     40
English     20
dtype: int64

In [13]:
# setting a name ('Hasnain ke Marks' means "Hasnain's marks")
marks = [100, 30, 40, 20]
index = ['Maths', 'Urdu', 'Science', 'English']

pd.Series(marks, index=index, name='Hasnain ke Marks')

Maths      100
Urdu        30
Science     40
English     20
Name: Hasnain ke Marks, dtype: int64

#### `Examples of Series with Questions:`

In [14]:
# Student Scores 
pd.Series(['90%', "100%", '0%'], index=['', '', ''])

     90%
    100%
      0%
dtype: object

In [15]:
# Monthly expenses (labels are Urdu for tomato, onion, potato, ginger)
pd.Series([100, 80, 90, 40], index=['Tamatar', 'Piyaz', 'Alu', 'Adrak'])

Tamatar    100
Piyaz       80
Alu         90
Adrak       40
dtype: int64

In [16]:
# Product Prices