## Series

A Series is a one-dimensional labeled array. Think of it as a single column in a spreadsheet or a single variable in a dataset. It holds a sequence of values (of any data type) along with an associated index that labels each value.

**Creating a Pandas Series**

Passing a list to `pd.Series` builds a Series. Unlike a plain list or array, each element carries a label taken from the index.



In [7]:
import pandas as pd
data = pd.Series([0.25, 0.5, 0.75, 1.0])
print(data)

0    0.25
1    0.50
2    0.75
3    1.00
dtype: float64



This creates a Series with default integer indices.
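
To make the split between values and labels concrete, here is a minimal sketch that inspects the same Series. The variable names `values`, `index_labels`, and `second` are illustrative only.

```python
import pandas as pd

data = pd.Series([0.25, 0.5, 0.75, 1.0])

# The stored values as a NumPy array
values = data.to_numpy()

# The default index holds the integer labels 0, 1, 2, 3
index_labels = list(data.index)

# Lookup by label; here the labels happen to be the integers 0..3
second = data[1]

print(values)
print(index_labels)
print(second)
```

Because the default labels are consecutive integers, looking up label `1` returns the second element.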

**Series with Custom Index:**

Pass the `index` argument to give each element a custom label instead of the default integers.


In [13]:
data = pd.Series([0.25, 0.5, 0.75, 1.0], index=['a', 'b', 'c', 'd'])
print(data)

a    0.25
b    0.50
c    0.75
d    1.00
dtype: float64


**Series as a Dictionary:**

A Series can be created from a dictionary, where keys become the index.


In [17]:
population_dict = {'California': 38332521, 'Texas': 26448193, 'New York': 19651127}
population = pd.Series(population_dict)
print(population)

California    38332521
Texas         26448193
New York      19651127
dtype: int64
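
A Series built from a dictionary also behaves like one in a few ways. The sketch below shows a membership test, and what happens when an explicit `index` is passed alongside the dictionary. The label `'Florida'` is a hypothetical key that is not in the dictionary, used only to show how missing keys are handled.

```python
import pandas as pd

population_dict = {'California': 38332521, 'Texas': 26448193, 'New York': 19651127}
population = pd.Series(population_dict)

# Membership tests check the index labels, as with dictionary keys
has_california = 'California' in population

# An explicit index selects and orders the keys; a label missing from
# the dictionary ('Florida' is hypothetical here) is filled with NaN
selected = pd.Series(population_dict, index=['Texas', 'Florida'])

print(has_california)
print(selected)
```

Because NaN is a floating-point value, `selected` has dtype `float64` even though the original values are integers.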


## Pandas DataFrame

**Creating a DataFrame**

A DataFrame is like a table or a spreadsheet in Python. It can be created from a dictionary of lists, where each key becomes a column name and each list supplies that column's values.



In [20]:
data = {'state': ['California', 'Texas', 'New York'], 'population': [38332521, 26448193, 19651127]}
df = pd.DataFrame(data)
print(df)


        state  population
0  California    38332521
1       Texas    26448193
2    New York    19651127
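
As a quick check on the result, the sketch below inspects the shape and column names, then uses `set_index` to turn the `state` column into the row labels. The name `by_state` is illustrative.

```python
import pandas as pd

data = {'state': ['California', 'Texas', 'New York'],
        'population': [38332521, 26448193, 19651127]}
df = pd.DataFrame(data)

shape = df.shape            # (number of rows, number of columns)
columns = list(df.columns)  # column names come from the dictionary keys

# Use the 'state' column as the row index instead of 0, 1, 2
by_state = df.set_index('state')

print(shape, columns)
print(by_state)
```

`set_index` returns a new DataFrame and leaves `df` unchanged.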


**DataFrame from Series**

A DataFrame can also be created from multiple Series.



In [23]:
population = pd.Series({'California': 38332521, 'Texas': 26448193, 'New York': 19651127})
area = pd.Series({'California': 423967, 'Texas': 695662, 'New York': 141297})
df = pd.DataFrame({'population': population, 'area': area})
print(df)

            population    area
California    38332521  423967
Texas         26448193  695662
New York      19651127  141297
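
When a DataFrame is built from several Series, rows are matched by index label rather than by position. The sketch below illustrates this with a deliberately incomplete Series, `partial_area`, which omits New York purely for demonstration.

```python
import pandas as pd

population = pd.Series({'California': 38332521, 'Texas': 26448193, 'New York': 19651127})

# Deliberately leave out 'New York' to show label alignment
partial_area = pd.Series({'California': 423967, 'Texas': 695662})

aligned = pd.DataFrame({'population': population, 'area': partial_area})

# The missing label is filled with NaN, so the column becomes float64
print(aligned)
print(aligned['area'].dtype)
```

The label missing from `partial_area` produces a NaN entry in the `area` column instead of raising an error.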


## DataFrame Operations

**Adding Columns** 

Assigning to a new column label adds that column to the DataFrame. Here the new column is computed from two existing ones.



In [28]:
df['density'] = df['population'] / df['area']
print(df)

            population    area     density
California    38332521  423967   90.413926
Texas         26448193  695662   38.018740
New York      19651127  141297  139.076746
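
Assigning with `df['density'] = ...` modifies `df` in place. As an alternative, the sketch below uses `DataFrame.assign`, which returns a new DataFrame and leaves the original untouched. The name `df_with_density` is illustrative.

```python
import pandas as pd

population = pd.Series({'California': 38332521, 'Texas': 26448193, 'New York': 19651127})
area = pd.Series({'California': 423967, 'Texas': 695662, 'New York': 141297})
df = pd.DataFrame({'population': population, 'area': area})

# assign() returns a copy with the extra column; df itself is not modified
df_with_density = df.assign(density=df['population'] / df['area'])

print(list(df.columns))
print(df_with_density)
```

This style is convenient when the original DataFrame is still needed, or when several transformations are chained.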




**Accessing Data**

Index a DataFrame with a column label to get a column as a Series, and use `.loc` with a row label to get a row.



In [33]:
print(df['population'])  # Access a column
print(df.loc['California'])  # Access a row


California    38332521
Texas         26448193
New York      19651127
Name: population, dtype: int64
population    3.833252e+07
area          4.239670e+05
density       9.041393e+01
Name: California, dtype: float64
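
In the output above, the row for California prints as floats. A row is returned as a single Series, and since `density` is a float column, the integer values are upcast to a common `float64` dtype. The sketch below shows a few other common access patterns; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'population': pd.Series({'California': 38332521, 'Texas': 26448193, 'New York': 19651127}),
    'area': pd.Series({'California': 423967, 'Texas': 695662, 'New York': 141297}),
})
df['density'] = df['population'] / df['area']

cell = df.loc['Texas', 'population']  # one value by row label and column label
first_row = df.iloc[0]                # one row by integer position
subset = df[['population', 'area']]   # several columns at once

print(cell)
print(first_row.name, first_row.dtype)
print(subset)
```

Selecting a single cell with `.loc[row, column]` returns the value with the column's own dtype, so the population stays an integer.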


## Pandas Index Object

**Index**

The Index object in Pandas holds the axis labels for Series and DataFrame.




In [37]:
ind = pd.Index([2, 3, 5, 7, 11])
print(ind)

Index([2, 3, 5, 7, 11], dtype='int64')
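
An Index can be read like an array, but it cannot be modified in place. The following minimal sketch shows indexing, slicing, and the error raised on item assignment. The `is_immutable` flag is only there to record the outcome.

```python
import pandas as pd

ind = pd.Index([2, 3, 5, 7, 11])

second = ind[1]          # element access, like an array
every_other = ind[::2]   # slicing returns another Index

# Item assignment is not allowed on an Index
is_immutable = False
try:
    ind[1] = 0
except TypeError:
    is_immutable = True

print(second)
print(every_other)
print(is_immutable)
```

Immutability makes it safe to share one Index between several Series and DataFrames.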


**Set Operations** 

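The Index object supports set operations such as intersection, union, and difference through methods of the same names. In recent pandas versions the `&`, `|`, and `^` operators act elementwise (bitwise) on an Index rather than as set operations, so the methods are used here.



In [65]:
a = pd.Index([1, 3, 5, 7, 9])
b = pd.Index([2, 3, 5, 7, 11])
print(a.intersection(b))  # Intersection: labels present in both
print(a.union(b))         # Union: labels present in either


Index([3, 5, 7], dtype='int64')
Index([1, 2, 3, 5, 7, 9, 11], dtype='int64')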


In [67]:
import pandas as pd

indA = pd.Index([1, 3, 5, 7, 9])
indB = pd.Index([2, 3, 5, 7, 11])


In [69]:
# Symmetric difference: labels that appear in exactly one of the two indexes
sym_diff = indA.symmetric_difference(indB)
print(sym_diff)


Index([1, 2, 9, 11], dtype='int64')
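
The text mentions difference alongside intersection and union. A minimal sketch using the `Index.difference` method on the same two indexes follows; the names `only_in_a` and `only_in_b` are illustrative.

```python
import pandas as pd

a = pd.Index([1, 3, 5, 7, 9])
b = pd.Index([2, 3, 5, 7, 11])

# Labels in a that are not in b, and the reverse
only_in_a = a.difference(b)
only_in_b = b.difference(a)

print(only_in_a)
print(only_in_b)
```

Unlike union and intersection, difference depends on the order of its operands.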
