### SESSION 16 - PANDAS SERIES

### What is Pandas
- Pandas is a fast, powerful, flexible and easy to use open source data analysis and manipulation tool, built on top of the Python programming language.
- https://pandas.pydata.org/about/index.html

### Pandas Series
- A Pandas Series is like a column in a table. It is a 1-D array holding data of any type.
- **Syntax : pd.Series( data, index, name )**
- The index is generated automatically. To supply custom labels, use the `index` parameter.

### Importing Pandas


In [2]:
import numpy as np
import pandas as pd

**Create a Series from lists:**
1. string
2. integers
3. custom index
4. setting a name

In [4]:
# string
countries = ['India', 'Nepal', 'Srilanka', 'Bhutan']
print(pd.Series(countries))
# Note: the dtype is object, which is how pandas reports string data here

0       India
1       Nepal
2    Srilanka
3      Bhutan
dtype: object


In [6]:
# integer
runs = [54,76,100,74]
print(pd.Series(runs))

0     54
1     76
2    100
3     74
dtype: int64


In [9]:
# Custom index
marks = [58,93,89,60]
subjects = ['C++','Python','NumPy','Java']
print(pd.Series(marks, index=subjects))

C++       58
Python    93
NumPy     89
Java      60
dtype: int64
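
The list above also mentions setting a name, which the list-based cells do not show. A minimal sketch, reusing the same `marks` and `subjects` data and passing the `name` parameter from the syntax line (the name `'Marks'` and the variable `marks_series` are chosen here only for illustration):

```python
import pandas as pd

marks = [58, 93, 89, 60]
subjects = ['C++', 'Python', 'NumPy', 'Java']

# Pass both a custom index and a name when building the Series.
marks_series = pd.Series(marks, index=subjects, name='Marks')

print(marks_series.name)       # Marks
print(marks_series['Python'])  # 93
```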


**Create a Series from Dictionary:**

In [4]:
import pandas as pd
marks = {
    'maths':78,
    'english': 70,
    'science': 89
}
print(pd.Series(marks, name='Student Score'))

maths      78
english    70
science    89
Name: Student Score, dtype: int64
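
As the output above suggests, the dictionary keys become the index labels. A short sketch of looking a value up by its key (the variable names `scores` and `science_score` are illustrative):

```python
import pandas as pd

marks = {'maths': 78, 'english': 70, 'science': 89}
scores = pd.Series(marks, name='Student Score')

# Dictionary keys are used as index labels, in insertion order.
print(list(scores.index))  # ['maths', 'english', 'science']

# A value can then be retrieved by its label.
science_score = scores['science']
print(science_score)       # 89
```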


### Series Attributes
- A Series object has several attributes that describe the data it holds and how it is structured.
- The most commonly used ones are covered below.

**size:**
- The size attribute returns the number of elements in a Series, including any elements that might contain missing or NaN (Not-a-Number) values.

In [29]:
import pandas as pd
data = [10, 20, 30, None, 50]
print(pd.Series(data).size) 

data1 = [10, 20, 30, 50]
print(pd.Series(data1).size) 

5
4
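
To see that `size` really does include missing values, it can help to compare it with the `count()` method, which counts only non-missing entries. A small sketch using the same data as above (the variable name `s` is illustrative):

```python
import pandas as pd

s = pd.Series([10, 20, 30, None, 50])

print(s.size)     # 5 -> every element, including the missing one
print(s.count())  # 4 -> only the non-missing elements
```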


**dtype:** 
- This attribute returns the data type of the elements in the Series, such as `int64` for the integers below.

In [30]:
import pandas as pd
data = [10, 20, 30, 40, 50]
print(pd.Series(data).dtype)

int64
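
Checking `dtype` is also useful when the data contains missing values. As a sketch, a list of integers that includes `None` is typically stored as floating point, because the missing entry becomes `NaN` (the variable name `with_missing` is illustrative):

```python
import pandas as pd

with_missing = pd.Series([10, 20, None])

# None is converted to NaN, so the integers are stored as floats.
print(with_missing.dtype)         # float64
print(with_missing.isna().sum())  # 1
```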


**name:** 
- You can assign a name to a Series when creating it or later using the name attribute. 
- The name is typically used in the context of DataFrames, where a Series can represent a column.


In [31]:
import pandas as pd
data = [10, 20, 30, 40, 50]
print(pd.Series(data, name='MyData').name)

MyData
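
The example above sets the name at creation time. A sketch of the other case described, assigning the name afterwards and seeing it used as a column label when the Series is converted to a DataFrame with `to_frame()` (variable names are illustrative):

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40, 50])
print(s.name)  # None, because no name was given

# Assign the name after creation.
s.name = 'MyData'
print(s.name)  # MyData

# The name becomes the column label in a DataFrame.
frame = s.to_frame()
print(list(frame.columns))  # ['MyData']
```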


**is_unique:**
- The is_unique attribute returns a boolean value indicating whether all the values in the Series are unique (no duplicates) or not.

In [32]:
import pandas as pd
data1 = [10, 20, 30, 40, 50]
print(pd.Series(data1).is_unique)  # Returns True

data2 = [10, 10, 30, 50, 50]
print(pd.Series(data2).is_unique)  # Returns False

True
False


**index:** 
- This attribute returns the index labels associated with the Series. 
- The index labels can be integers, strings, or any other data type. 
- If no index labels were explicitly provided when creating the Series, a default integer index will be generated.

In [33]:
import pandas as pd
data = [10, 20, 30, 40, 50]
index_labels = ['A', 'B', 'C', 'D', 'E']
print(pd.Series(data, index=index_labels).index)

Index(['A', 'B', 'C', 'D', 'E'], dtype='object')
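
For comparison, here is a sketch of the default case, where no index labels are passed and pandas generates an integer index starting at 0 (the variable name `s` is illustrative):

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40, 50])

# With no index argument, a default integer index is created.
print(s.index)        # RangeIndex(start=0, stop=5, step=1)
print(list(s.index))  # [0, 1, 2, 3, 4]
```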


**values:** 
- This attribute returns the data in the Series as a NumPy array.


In [39]:
import pandas as pd
marks = {'maths':78,'english': 70,'science': 89}
print(pd.Series(marks).values)

[78 70 89]
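
A short sketch confirming the type of the result and showing that the index labels are not carried over. It also uses `to_numpy()`, a Series method that gives the same array for this data (variable names are illustrative):

```python
import numpy as np
import pandas as pd

marks = pd.Series({'maths': 78, 'english': 70, 'science': 89})

arr = marks.values
print(type(arr))   # <class 'numpy.ndarray'>
print(arr.mean())  # 79.0

# to_numpy() returns an equivalent array for this integer data.
same = np.array_equal(arr, marks.to_numpy())
print(same)        # True
```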
