# Pandas Examples
Pandas is an open-source Python library widely used for data manipulation and analysis. Built on top of NumPy, it provides flexible and efficient data structures that make working with structured data intuitive and fast. It is a cornerstone tool for data scientists and analysts, supporting tasks from data cleaning and transformation to aggregation and visualization.

[pandas documentation](https://pandas.pydata.org/)
### Pandas Features
* Fast and efficient data structures: primarily Series and DataFrame, designed for handling large data sets.
* Integrated handling of missing data: Simplifies dealing with NaN values.
* Flexible reshaping and pivoting of data sets.
* Powerful group-by functionality for performing split-apply-combine operations on data sets.
* Intelligent label-based slicing, fancy indexing, and subsetting of large data sets.
* Intuitive merging and joining of data sets.
* Robust I/O tools for loading data from flat files (CSV, delimited), Excel files, databases, and more.
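
A few of the features listed above can be sketched in a handful of lines. The example below is only an illustration: the `sales` and `managers` tables, their column names, and their values are invented here and do not come from any real data set. It touches missing-data handling, group-by, pivoting, and merging.

```python
import numpy as np
import pandas as pd

# Hypothetical sales records; names and values are invented for this sketch.
sales = pd.DataFrame({
    "region": ["East", "East", "West", "West", "West"],
    "product": ["A", "B", "A", "B", "A"],
    "units": [10, np.nan, 7, 3, 5],
})

# Missing data: count the NaN entries, then replace them with 0.
missing_count = int(sales["units"].isna().sum())
sales["units"] = sales["units"].fillna(0)

# Split-apply-combine: total units per region.
units_by_region = sales.groupby("region")["units"].sum()

# Reshaping: one row per region, one column per product.
pivoted = sales.pivot_table(index="region", columns="product",
                            values="units", aggfunc="sum")

# Merging: attach a (hypothetical) manager to each row via the shared "region" column.
managers = pd.DataFrame({"region": ["East", "West"],
                         "manager": ["Dana", "Lee"]})
merged = sales.merge(managers, on="region", how="left")

print("Missing values:", missing_count)
print(units_by_region)
print(pivoted)
print(merged)
```

Filling with 0 is just one possible policy for missing values; depending on the data, dropping the rows or filling with a mean or median may be more appropriate.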


#### Pandas Series

import pandas as pd
import numpy as np

# 1. From a list
data_list = [10, 20, 30, np.nan, 40, 50]
s1 = pd.Series(data_list)
print("Series from list:")
print(s1)
print("-" * 30)

# 2. From a NumPy array
data_array = np.array([100, 200, 300, 400])
s2 = pd.Series(data_array)
print("Series from NumPy array:")
print(s2)
print("-" * 30)

# 3. From a dictionary (keys become the index)
data_dict = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
s3 = pd.Series(data_dict)
print("Series from dictionary:")
print(s3)
print("-" * 30)

# 4. Series with a custom index
s4 = pd.Series([10, 20, 30], index=['x', 'y', 'z'])
print("Series with custom index:")
print(s4)


Series from list:
0    10.0
1    20.0
2    30.0
3     NaN
4    40.0
5    50.0
dtype: float64
------------------------------
Series from NumPy array:
0    100
1    200
2    300
3    400
dtype: int64
------------------------------
Series from dictionary:
a    1
b    2
c    3
d    4
dtype: int64
------------------------------
Series with custom index:
x    10
y    20
z    30
dtype: int64
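
Once a Series exists, its labels and missing values are what you typically work with next. The following is a minimal sketch that reuses the values from the list and custom-index examples above; the variable names are chosen here purely for illustration.

```python
import numpy as np
import pandas as pd

# Same values as the "from a list" example; the NaN forces a float64 dtype.
s = pd.Series([10, 20, 30, np.nan, 40, 50])

# Detect and handle the missing entry.
n_missing = int(s.isna().sum())   # number of NaN values
dropped = s.dropna()              # drops the NaN row, keeps the original labels
filled = s.fillna(s.mean())       # mean() skips NaN by default

# Label-based access on a Series with a custom index.
labeled = pd.Series([10, 20, 30], index=["x", "y", "z"])
single = labeled["y"]             # one value, looked up by label
subset = labeled.loc["x":"y"]     # label slices include both endpoints

# Arithmetic aligns on index labels, not on positions.
other = pd.Series([1, 2], index=["z", "x"])
total = labeled + other           # "y" has no partner, so its result is NaN

print(n_missing)
print(dropped)
print(filled)
print(single)
print(subset)
print(total)
```

Note that `dropna` and `fillna` return new Series and leave `s` unchanged.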


#### Pandas DataFrames

import pandas as pd
import numpy as np

# 1. From a dictionary of lists (most common)
# Each key becomes a column name, and its value (a list) becomes the column's data.
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 30, 35, 40],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston']
}
df1 = pd.DataFrame(data)
print("DataFrame from dictionary of lists:")
print(df1)
print("-" * 30)

# 2. From a dictionary of Series
# Similar to a dictionary of lists, but values are Pandas Series.
data_series = {
    'ColA': pd.Series([1, 2, 3]),
    'ColB': pd.Series([4, 5, 6])
}
df2 = pd.DataFrame(data_series)
print("DataFrame from dictionary of Series:")
print(df2)
print("-" * 30)

# 3. From a list of dictionaries (each dictionary represents a row)
# The keys of the dictionaries become column names. If a key is missing in a dict, NaN is used.
data_rows = [
    {'Name': 'Eve', 'Age': 28},
    {'Name': 'Frank', 'Age': 32, 'City': 'Miami'}
]
df3 = pd.DataFrame(data_rows)
print("DataFrame from list of dictionaries:")
print(df3)
print("-" * 30)

# 4. From a NumPy array (index and column labels are optional; both default to integers 0, 1, 2, ...)
# The array provides the values, and you provide the labels for rows (index) and columns.
df4 = pd.DataFrame(np.random.rand(3, 2), # 3 rows, 2 columns of random numbers in [0, 1); values differ on each run
                   index=['row1', 'row2', 'row3'],
                   columns=['colA', 'colB'])
print("DataFrame from NumPy array with custom index and columns:")
print(df4)

DataFrame from dictionary of lists:
      Name  Age         City
0    Alice   25     New York
1      Bob   30  Los Angeles
2  Charlie   35      Chicago
3    David   40      Houston
------------------------------
DataFrame from dictionary of Series:
   ColA  ColB
0     1     4
1     2     5
2     3     6
------------------------------
DataFrame from list of dictionaries:
    Name  Age   City
0    Eve   28    NaN
1  Frank   32  Miami
------------------------------
DataFrame from NumPy array with custom index and columns:
          colA      colB
row1  0.931644  0.660143
row2  0.494502  0.648583
row3  0.562405  0.546478
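
With a DataFrame built, the usual next steps are selecting data and reading or writing it. The sketch below rebuilds the dictionary-of-lists DataFrame from the first example and shows column selection, boolean filtering, `.loc` and `.iloc`, and a CSV round trip. The in-memory `io.StringIO` buffer is an assumption made to keep the example self-contained; in practice you would usually pass a file path.

```python
import io

import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David"],
    "Age": [25, 30, 35, 40],
    "City": ["New York", "Los Angeles", "Chicago", "Houston"],
})

# A single column is a Series; a list of columns gives a DataFrame.
ages = df["Age"]
name_city = df[["Name", "City"]]

# Boolean mask: keep rows where Age is greater than 30.
over_30 = df[df["Age"] > 30]

# Label-based selection with .loc: row by index label, column by name.
by_name = df.set_index("Name")
bob_city = by_name.loc["Bob", "City"]

# Position-based selection with .iloc: first two rows, first two columns.
top_left = df.iloc[:2, :2]

# I/O round trip through an in-memory CSV buffer (stands in for a file on disk).
buffer = io.StringIO()
df.to_csv(buffer, index=False)
buffer.seek(0)
restored = pd.read_csv(buffer)

print(over_30)
print(bob_city)
print(top_left)
print(restored)
```

Passing `index=False` to `to_csv` keeps the default integer index out of the file, so `read_csv` returns the same three columns.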
