### Pandas Series
A Pandas Series is a one-dimensional labeled array capable of holding any data type (integers, strings, floats, etc.).

### Key Characteristics:
* Similar to a NumPy array, but with labels (indexes).
* Default index starts from 0 if not specified.
* Built with the `pd.Series()` constructor.

### 1. Create a Series
Syntax: `pd.Series(data, index=None, dtype=None)`

| Parameter | Description                                    |
| --------- | ---------------------------------------------- |
| `data`    | List, NumPy array, dictionary, or scalar value |
| `index`   | Optional labels for elements                   |
| `dtype`   | Data type (optional)                           |
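As a quick illustration of the three parameters together, here is a minimal sketch. The variable name `prices` and its values are made up for the example.

```python
import pandas as pd

# Hypothetical values: three integers stored as floats under custom labels.
prices = pd.Series([10, 20, 30], index=["a", "b", "c"], dtype="float64")
print(prices)
print(prices.dtype)
```

Passing `dtype` asks pandas to convert the data on construction instead of inferring the type from the values.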


### Methods of Creating Series
#### 1. From a List or Tuple



In [52]:
import pandas as pd
import numpy as np
s1 = pd.Series([10, 20, 30])
print(s1)

0    10
1    20
2    30
dtype: int64


In [24]:
sss1 = pd.Series((10,20,30))
sss1

0    10
1    20
2    30
dtype: int64

In [27]:
# With mixed types
s2 = pd.Series(['a', 2.5, True])
print(s2)

0       a
1     2.5
2    True
dtype: object


In [29]:
# 3: With float values
s3 = pd.Series([1.1, 2.2, 3.3])
print(s3)

0    1.1
1    2.2
2    3.3
dtype: float64


#### 2. From a List with Custom Index

In [32]:
s1 = pd.Series([100, 200, 300], index=['a', 'b', 'c'])
print(s1)

a    100
b    200
c    300
dtype: int64


In [36]:
s2 = pd.Series(('apple', 'banana', 'grape'), index=['x', 'y', 'z'])
print(s2)


x     apple
y    banana
z     grape
dtype: object


In [38]:
s3 = pd.Series([10.5, 20.5, 30.5], index=[101, 102, 103])
print(s3)

101    10.5
102    20.5
103    30.5
dtype: float64


This is useful when your data has labels like product IDs or dates.
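A small sketch of why labels help, assuming hypothetical product IDs (`P101` and so on) and stock counts invented for the example:

```python
import pandas as pd

# Hypothetical product IDs used as labels for stock counts.
stock = pd.Series([12, 0, 7], index=["P101", "P102", "P103"])

print(stock.loc["P103"])   # look up by label
print(stock.iloc[0])       # look up by position
print(stock[stock == 0])   # labels of out-of-stock products
```

`.loc` selects by label and `.iloc` by position, so the same Series can be read either way.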

#### 3. From a Dictionary

In [42]:
s1 = pd.Series({'a': 1, 'b': 2, 'c': 3})
print(s1)

a    1
b    2
c    3
dtype: int64


In [44]:
s2 = pd.Series({'x': ['red','yellow'], 'y': 'green'})
print(s2)


x    [red, yellow]
y            green
dtype: object


In [46]:
# A label with no matching dictionary key gets NaN
s3 = pd.Series({'a': 10, 'b': 20}, index=['a', 'b', 'c'])
print(s3)

a    10.0
b    20.0
c     NaN
dtype: float64


If `index` is provided and it contains a label that is not a key in the dictionary, that label's value becomes NaN, and the Series is upcast to float64 to hold it.
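The `index` argument also selects and orders the dictionary keys. The sketch below uses made-up keys to show both effects:

```python
import pandas as pd

data = {"a": 10, "b": 20, "c": 30}

# Only the requested keys are kept, in the order given by index.
subset = pd.Series(data, index=["c", "a"])
print(subset)

# A label with no matching key becomes NaN, which forces a float dtype.
with_gap = pd.Series(data, index=["a", "z"])
print(with_gap)
```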

#### 4. Using np.array
You can create a Pandas Series using a NumPy array. This is useful when you already have data in NumPy format and want to leverage Pandas functionalities.

In [54]:
# 1: Basic numeric array
s1 = pd.Series(np.array([10, 20, 30]))
print(s1)

0    10
1    20
2    30
dtype: int64


In [56]:
# 2: With custom index
s2 = pd.Series(np.array([100, 200, 300]), index=['x', 'y', 'z'])
print(s2)

x    100
y    200
z    300
dtype: int64


In [58]:
# 3: Mixed data types (NumPy converts every element to a string first)
s3 = pd.Series(np.array(['apple', 10.5, True]))
print(s3)

0    apple
1     10.5
2     True
dtype: object
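One thing worth checking with mixed input is what NumPy does before pandas sees the data. A minimal sketch, with variable names chosen only for this example:

```python
import numpy as np
import pandas as pd

# NumPy picks one common type for the array; here that is a string type.
arr = np.array(["apple", 10.5, True])
s_str = pd.Series(arr)
print(type(s_str.iloc[1]))  # the float has become a string

# Asking NumPy for dtype=object keeps the original Python objects.
s_obj = pd.Series(np.array(["apple", 10.5, True], dtype=object))
print(type(s_obj.iloc[1]))
```

If the original types matter, building the array with `dtype=object` (or passing a plain list to `pd.Series`) avoids the string conversion.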


#### 5. Using Scalar Values
You can also create a Series with a single scalar value, repeated across a specified index.

In [61]:
# 1: Fill Series with same number
s1 = pd.Series(5, index=['a', 'b', 'c'])
print(s1)

a    5
b    5
c    5
dtype: int64


In [63]:
# 2: Fill Series with a string
s2 = pd.Series("Hello", index=[0, 1, 2])
print(s2)

0    Hello
1    Hello
2    Hello
dtype: object


In [65]:
# 3: Fill with Boolean
s3 = pd.Series(True, index=['yes', 'no', 'maybe'])
print(s3)

yes      True
no       True
maybe    True
dtype: bool


##### A scalar is repeated only across the index you pass. Without the index parameter, the result is a single-element Series.

In [72]:
s4 = pd.Series('Welcome')
s4

0    Welcome
dtype: object

### Series Functions and Methods

### A. Basic Inspection Methods

#### .head(n) – Returns first n elements
* Purpose:
    * Returns the first n elements of the Series.
* Why Use It:
    * Used to preview the top records (especially in large datasets).
* Notes:
    * Default value of n is 5.

In [80]:
s = pd.Series([10, 20, 30, 40, 50,34,56,23,17,89])
s

0    10
1    20
2    30
3    40
4    50
5    34
6    56
7    23
8    17
9    89
dtype: int64

In [86]:
s.head()

0    10
1    20
2    30
3    40
4    50
dtype: int64

In [82]:
s.head(2)

0    10
1    20
dtype: int64

In [84]:
s.head(3)

0    10
1    20
2    30
dtype: int64
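Two edge cases of `.head()` are easy to miss. The sketch below uses a short Series named `nums`, chosen only for this example:

```python
import pandas as pd

nums = pd.Series([10, 20, 30, 40, 50])

# A negative n drops the last |n| elements instead of counting from the top.
print(nums.head(-2))

# Asking for more elements than exist simply returns the whole Series.
print(nums.head(100))
```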

#### .tail(n) – Returns last n elements
* Purpose:
    * Returns the last n elements of the Series.
* Why Use It:
    * Used to inspect recent values, often useful with time-series data.

In [89]:
s.tail()

5    34
6    56
7    23
8    17
9    89
dtype: int64

In [91]:
s.tail(4)

6    56
7    23
8    17
9    89
dtype: int64
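For the time-series use mentioned above, a minimal sketch might look like this. The dates and readings are invented for illustration.

```python
import pandas as pd

# Hypothetical daily readings indexed by date.
dates = pd.date_range("2024-01-01", periods=5, freq="D")
readings = pd.Series([3.1, 3.4, 2.9, 3.8, 4.0], index=dates)

# The two most recent observations.
print(readings.tail(2))
```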

#### .dtype – Data Type of Series Elements
* Purpose:
    * Returns the data type (dtype) of the Series elements.
* Why Use It:
    * Important for understanding how data is stored internally, e.g., int64 or object.

##### Common Data Types
| Data Type        | Description                 |
| ---------------- | --------------------------- |
| `int64`          | Integer values              |
| `float64`        | Decimal numbers             |
| `object`         | Strings or mixed types      |
| `bool`           | Boolean values (True/False) |
| `datetime64[ns]` | Date and time values        |


In [94]:
s.dtype

dtype('int64')

In [108]:
s2 = pd.Series(['a', 'b'])
print(s2.dtype)

object


In [110]:
s3 = pd.Series([1.5, 2.5])
print(s3.dtype)

float64


In [112]:
s4 = pd.Series(['a', 2.5, True])
s4.dtype

dtype('O')
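One practical use of the dtype is to guard numeric operations. The sketch below assumes the helper `is_numeric_dtype` from `pandas.api.types`; the Series names are illustrative.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

ints = pd.Series([1, 2, 3])
mixed = pd.Series(["a", 2.5, True])

# Check the dtype before a numeric operation instead of assuming it.
for name, ser in [("ints", ints), ("mixed", mixed)]:
    if is_numeric_dtype(ser):
        print(name, "mean:", ser.mean())
    else:
        print(name, "is not numeric, dtype:", ser.dtype)
```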

#### .index - Returns the index (labels) of the Series
* Why Use It:
    * To inspect or manipulate row labels of the Series.

In [116]:
s = pd.Series([10, 20, 30], index=['a', 'b', 'c'])
print(s.index)

Index(['a', 'b', 'c'], dtype='object')


In [120]:
print(list(s.index))

['a', 'b', 'c']


In [122]:
s.index = ['x', 'y', 'z']
s.index

Index(['x', 'y', 'z'], dtype='object')
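When relabeling, two details are worth knowing: the new index must have the same length as the data, and `rename()` offers a non-mutating alternative. A minimal sketch with illustrative names:

```python
import pandas as pd

labeled = pd.Series([10, 20, 30], index=["a", "b", "c"])

# Assigning an index of the wrong length is rejected.
try:
    labeled.index = ["x", "y"]
except ValueError as err:
    print("ValueError:", err)

# rename() returns a new Series and leaves the original labels alone.
renamed = labeled.rename({"a": "alpha"})
print(list(renamed.index), list(labeled.index))
```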

#### .values - Returns the underlying data as a NumPy array
* Why Use It:
    * To perform raw numerical operations or NumPy integrations.

In [127]:
s

x    10
y    20
z    30
dtype: int64

In [129]:
s.values

array([10, 20, 30])

In [131]:
print(type(s.values))

<class 'numpy.ndarray'>
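The pandas documentation suggests `to_numpy()` as an alternative to `.values`. The sketch below, using a made-up Series `ser`, shows it next to applying a NumPy function directly to the Series:

```python
import numpy as np
import pandas as pd

ser = pd.Series([4, 9, 16], index=["x", "y", "z"])

# to_numpy() returns a plain ndarray, so the labels are gone.
raw = ser.to_numpy()
print(np.sqrt(raw))

# NumPy functions applied to the Series itself keep the labels.
print(np.sqrt(ser))
```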


### B. Mathematical & Statistical Functions

#### .sum(), .mean(), .median() 
* These methods compute aggregate statistics.
* Used for quick numeric summaries — common in EDA.

In [137]:
s = pd.Series([10, 20, 30])
print(s.sum())    
print(s.mean())    
print(s.median()) 


60
20.0
20.0


#### .min(), .max(), .std()

In [140]:
print(s.min())  
print(s.max())   
print(s.std())   


10
30
10.0
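The `10.0` above is the sample standard deviation, because pandas defaults to `ddof=1` while NumPy defaults to `ddof=0`. A short sketch of the difference:

```python
import numpy as np
import pandas as pd

vals = pd.Series([10, 20, 30])

print(vals.std())                # sample standard deviation (ddof=1)
print(vals.std(ddof=0))          # population standard deviation
print(np.std(vals.to_numpy()))   # NumPy's default is ddof=0
```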


#### .describe() – Summary statistics
* Gives a summary of statistics for the Series.

In [143]:
print(s.describe())


count     3.0
mean     20.0
std      10.0
min      10.0
25%      15.0
50%      20.0
75%      25.0
max      30.0
dtype: float64


In [258]:
# describe() works differently for object and bool types:
s2 = pd.Series([True, False, False])
print(s2.describe())

count         3
unique        2
top       False
freq          2
dtype: object


In [155]:
s3 = pd.Series(['c', 'b', 'b', 'a'])
print(s3.describe())


count     4
unique    3
top       b
freq      2
dtype: object
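For numeric data, `describe()` also accepts a `percentiles` argument. A minimal sketch with made-up values:

```python
import pandas as pd

sample = pd.Series([10, 20, 30, 40, 50])

# Request the 10th and 90th percentiles in addition to the usual summary.
summary = sample.describe(percentiles=[0.1, 0.9])
print(summary)
```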


### C. Data Cleaning & Missing Values

#### .isnull() and .notnull()
* Detect missing (NaN) values.

In [261]:
s = pd.Series([1, None, 3, np.nan])
# True for missing values (None or NaN)
print(s.isnull())

0    False
1     True
2    False
3     True
dtype: bool


In [263]:
# True for valid
print(s.notnull())

0     True
1    False
2     True
3    False
dtype: bool


In [267]:
# Filter non-null values
print(s[s.notnull()])

0    1.0
2    3.0
dtype: float64
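Because the result of `isnull()` is a boolean Series, it can be summed or averaged to measure how much data is missing. A small sketch, using the same values under the name `obs`:

```python
import numpy as np
import pandas as pd

obs = pd.Series([1, None, 3, np.nan])

print(obs.isnull().sum())    # number of missing values
print(obs.isnull().mean())   # fraction of missing values
print(obs.dropna())          # same result as obs[obs.notnull()]
```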


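#### .fillna(value) - Fill Missing Data
* Replaces all NaN values with a specific value or method.
* Syntax:
  * `series.fillna(value)`
  * `series.fillna(method='ffill')` or `series.fillna(method='bfill')`
* In pandas 2.1 and later, the `method` argument of `fillna` is deprecated; `series.ffill()` and `series.bfill()` are the equivalents.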

In [166]:
print(s.fillna(0))

0    1.0
1    0.0
2    3.0
3    0.0
dtype: float64


When working with missing data (NaN), we often want to fill the gaps with nearby values. Pandas offers two methods for this:

| Method           | Meaning           | How it works                                     |
| ---------------- | ----------------- | ------------------------------------------------ |
| `method='ffill'` | **Forward fill**  | Fills `NaN` with the **previous** non-null value |
| `method='bfill'` | **Backward fill** | Fills `NaN` with the **next** non-null value     |


Use Case:
* Time-series or sequential data where a value may be missing but nearby values are good approximations.
* Stock prices, sensor readings, form submissions, etc.

##### Forward fill

In [168]:
s = pd.Series([1, None, 3, np.nan])
print(s.fillna(method='ffill'))  

0    1.0
1    1.0
2    3.0
3    3.0
dtype: float64




##### Backward fill

In [173]:
s = pd.Series([1, None, 3, np.nan])
print(s.fillna(method='bfill'))

0    1.0
1    3.0
2    3.0
3    NaN
dtype: float64




In [289]:
# print(s.fillna(method='ffill', limit=2))
# limit=2 fills at most 2 consecutive NaNs after each valid value.
# With bfill, trailing NaNs (at the end) stay NaN because there is no later value to pull back.

| Method    | Action                    | Use Case                            |
| --------- | ------------------------- | ----------------------------------- |
| `ffill`   | Fill with previous value  | Fill future blanks with past values |
| `bfill`   | Fill with next value      | Fill past blanks with future values |
| `limit=n` | Limit the number of fills | Avoid filling long runs of missing data |
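The three rows of the table can be compared on one Series with a longer gap. This sketch uses the dedicated `ffill()` and `bfill()` methods, and the values are made up for the example:

```python
import numpy as np
import pandas as pd

gappy = pd.Series([1.0, np.nan, np.nan, np.nan, 5.0])

print(gappy.ffill())          # every gap takes the previous value
print(gappy.bfill())          # every gap takes the next value

# Only the first NaN in the run is filled; the rest stay missing.
limited = gappy.ffill(limit=1)
print(limited)
```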


### When to Use Which?
| Scenario                             | Recommended Fill |
| ------------------------------------ | ---------------- |
| Time-series going forward            | Forward fill     |
| Backfilling previous missing records | Backward fill    |
| Capping how many consecutive NaNs get filled | Use the `limit` parameter |


### D. Transformation Methods

#### .apply(function) - Apply a function to each element
* Used to apply any custom function element-wise.

In [275]:
s = pd.Series([15, 27, 39])
s

0    15
1    27
2    39
dtype: int64

In [191]:
s22 = s.apply(lambda x: x + 5)
s22

0    20
1    32
2    44
dtype: int64

In [193]:
s

0    15
1    27
2    39
dtype: int64

In [185]:
s.apply(lambda x: x**2)

0     225
1     729
2    1521
dtype: int64

In [197]:
s.apply(str)

0    15
1    27
2    39
dtype: object

In [277]:
def square(x): 
    return x**2
print(s.apply(square))

0     225
1     729
2    1521
dtype: int64
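For simple arithmetic like the examples above, plain operators give the same values without `apply()`. A sketch for comparison, with `nums` as an illustrative name:

```python
import pandas as pd

nums = pd.Series([15, 27, 39])

# Arithmetic on a Series is already element-wise, so no apply() is needed.
shifted = nums + 5
squared = nums ** 2

print(shifted)
print(squared)
```

`apply()` remains the tool for logic that cannot be written as a vectorized expression, such as a custom Python function with branching.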


#### .map(dict/function) - Element-wise Transformation
* Transforms each value in a Series using a function or mapping dict.

In [216]:
s = pd.Series([1, 2, 3])
print(s.map({1: 'One', 2: 'Two', 3: 'Three'}))

0      One
1      Two
2    Three
dtype: object


In [208]:
print(s.map(lambda x: x * 10))

0    10
1    20
2    30
dtype: int64


In [218]:
s2 = pd.Series(['dog', 'cat'])
print(s2.map(str.upper))
s

0    DOG
1    CAT
dtype: object


0    1
1    2
2    3
dtype: int64
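When a value has no entry in the mapping dictionary, `map()` returns NaN for it. The sketch below uses a made-up Series `codes` and shows one way to supply a default:

```python
import pandas as pd

codes = pd.Series([1, 2, 4])
names = {1: "One", 2: "Two", 3: "Three"}

mapped = codes.map(names)
print(mapped)                     # 4 has no entry, so it becomes NaN

# One way to supply a default for unmapped values.
print(mapped.fillna("Unknown"))
```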

#### .astype(type) – Convert data type

In [221]:
s = pd.Series([1, 2, 3])
print(s.astype(float))
print(s.astype(str))
print(s.astype("int32"))


0    1.0
1    2.0
2    3.0
dtype: float64
0    1
1    2
2    3
dtype: object
0    1
1    2
2    3
dtype: int32
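Conversions can fail when a value does not fit the target type. The sketch below, with illustrative names `clean` and `dirty`, shows the failure and a more forgiving alternative, `pd.to_numeric`:

```python
import pandas as pd

clean = pd.Series(["1", "2", "3"])
print(clean.astype("int64"))

dirty = pd.Series(["1", "2", "x"])
try:
    dirty.astype("int64")
except ValueError as err:
    print("ValueError:", err)

# to_numeric with errors="coerce" turns unparseable entries into NaN.
coerced = pd.to_numeric(dirty, errors="coerce")
print(coerced)
```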


### E. Sorting & Uniqueness

#### .sort_values(), .sort_index()
* Sort the Series by values or index.

In [225]:
s = pd.Series([50, 10, 30,90,40], index=['c', 'a', 'b','d','e'])
print(s.sort_values())

a    10
b    30
e    40
c    50
d    90
dtype: int64


In [227]:
print(s.sort_index())

a    10
b    30
c    50
d    90
e    40
dtype: int64


In [229]:
print(s.sort_values(ascending=False))

d    90
c    50
e    40
b    30
a    10
dtype: int64
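Both sort methods return a new Series and leave the original untouched. A minimal sketch, which also shows `nlargest()` as a shortcut for the top values (the name `scores` is illustrative):

```python
import pandas as pd

scores = pd.Series([50, 10, 30, 90, 40], index=["c", "a", "b", "d", "e"])

ordered = scores.sort_values()
print(list(scores.index))    # original order is unchanged
print(list(ordered.index))

# nlargest(n) is a shortcut for the top n values.
print(scores.nlargest(2))
```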


#### .unique(), .nunique()
* .unique() returns the unique values as a NumPy array.
* .nunique() counts the number of unique values.

In [232]:
s = pd.Series([1, 2, 2, 3, 3, 3])
print(s.unique()) 

[1 2 3]


In [234]:
print(s.nunique())   

3


In [236]:
print(len(s.unique())) 

3
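A related method, `value_counts()`, reports how often each unique value occurs. The sketch also shows how `nunique()` treats missing values; the Series names are illustrative.

```python
import pandas as pd

items = pd.Series([1, 2, 2, 3, 3, 3])

counts = items.value_counts()
print(counts)                 # most frequent value first

# nunique() ignores NaN unless dropna=False is passed.
with_nan = pd.Series([1, 1, None])
print(with_nan.nunique(), with_nan.nunique(dropna=False))
```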


### F. Miscellaneous

#### .copy() – Create a copy
* Creates a deep copy of the Series by default, so changes to the copy do not affect the original.

In [241]:
s = pd.Series([10, 20, 30])
s_copy = s.copy()

In [243]:
print(s_copy)

0    10
1    20
2    30
dtype: int64


In [245]:
s_copy[0] = 99
print(s[0], s_copy[0])

10 99
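To see why `.copy()` matters, compare it with plain assignment, which only creates a second name for the same object. A minimal sketch with illustrative names:

```python
import pandas as pd

original = pd.Series([10, 20, 30])

alias = original            # second name for the same object
alias.iloc[0] = 99
print(original.iloc[0])     # the change is visible through both names

independent = original.copy()
independent.iloc[1] = -1
print(original.iloc[1])     # the original is unaffected
```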


#### .equals() – Compare two series
* Checks if two Series are exactly equal (including index and values).

In [248]:
print(s.equals(s_copy)) 

False


In [250]:
s1 = pd.Series([1, 2])
s2 = pd.Series([1, 2])
print(s1.equals(s2))

True


In [252]:
s3 = pd.Series([2, 1])
print(s1.equals(s3))

False


In [254]:
print(s1.equals(s1.copy()))

True
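`equals()` differs from the `==` operator in two ways that are easy to overlook: it treats NaNs in the same position as equal, and it also compares dtypes. A short sketch with made-up Series:

```python
import numpy as np
import pandas as pd

a = pd.Series([1.0, np.nan])
b = pd.Series([1.0, np.nan])

print((a == b).all())   # False: NaN is not equal to NaN element-wise
print(a.equals(b))      # True: NaNs in the same position count as equal

# Same values but different dtypes are not considered equal.
print(pd.Series([1, 2]).equals(pd.Series([1.0, 2.0])))
```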
