# Series

The Series object is a one-dimensional, labeled data structure. It can hold numeric data, dates and times, strings, or arbitrary Python objects. For numeric data, a Series is usually a better choice than a plain Python list: operations on it are typically faster, it generally uses less memory, and it comes with built-in methods for manipulating the data.
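
As a rough illustration of the kinds of data a Series can hold, the sketch below builds three small Series and prints the dtype pandas picks for each. The variable names and sample values are made up for this example, and the exact dtype names printed can differ between pandas versions.

```python
import pandas as pd

# Illustrative sample data (not from the notebook above)
numbers = pd.Series([1.5, 2.0, 3.25])
timestamps = pd.Series(pd.to_datetime(['2021-01-01', '2021-06-15']))
mixed = pd.Series([1, 'two', [3, 4]])  # arbitrary Python objects

# Each Series stores its values under a single dtype
for s in (numbers, timestamps, mixed):
    print(s.dtype)

# Built-in methods operate on the whole Series at once
print(numbers.mean())  # 2.25
```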

In [1]:
import pandas as pd
import numpy as np

Construct a Series from a Python list

In [2]:
scores = [90, 67, 60, 70, 50]
names = ['Saheen', 'Dom', 'Jia', 'Kamal', 'Tim']

data_series = pd.Series(data=scores, index=names, name='scores')

In [3]:
data_series

Saheen    90
Dom       67
Jia       60
Kamal     70
Tim       50
Name: scores, dtype: int64
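
A Series can also be built from a Python dict, in which case the keys become the index. The following is a minimal sketch using the same scores; `score_by_name` and `dict_series` are names introduced only for this example.

```python
import pandas as pd

# Dict keys become the index labels, dict values become the data
score_by_name = {'Saheen': 90, 'Dom': 67, 'Jia': 60, 'Kamal': 70, 'Tim': 50}
dict_series = pd.Series(score_by_name, name='scores')

print(dict_series)
```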

In [4]:
print(data_series.shape)
print(data_series.ndim)

(5,)
1


Accessing the index

In [5]:
data_series.index

Index(['Saheen', 'Dom', 'Jia', 'Kamal', 'Tim'], dtype='object')

Like a Python dict or a NumPy array, a pandas Series supports indexing

In [6]:
data_series['Saheen']

90
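
Beyond a single label lookup, a Series can be indexed by label with `.loc`, by position with `.iloc`, and with slices or lists of labels. The sketch below rebuilds the scores Series under the hypothetical name `s` so that it runs on its own.

```python
import pandas as pd

s = pd.Series([90, 67, 60, 70, 50],
              index=['Saheen', 'Dom', 'Jia', 'Kamal', 'Tim'],
              name='scores')

by_label = s.loc['Dom']              # label-based lookup: 67
by_position = s.iloc[0]              # position-based lookup: 90
label_slice = s.loc['Dom':'Kamal']   # label slices include both endpoints
position_slice = s.iloc[1:3]         # position slices exclude the end
several = s[['Saheen', 'Tim']]       # a list of labels returns a Series

print(label_slice)
print(position_slice)
```

Note that the label slice returns three entries (Dom, Jia, Kamal) while the position slice returns two (Dom, Jia).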

Append values to a pandas Series (with nulls)

In [7]:
new_data = pd.Series(data=[78, None], index=['Kim', 'Jimmy'])
# Series.append was removed in pandas 2.0; pd.concat is the replacement
data_series = pd.concat([data_series, new_data])

In [8]:
data_series

Saheen    90.0
Dom       67.0
Jia       60.0
Kamal     70.0
Tim       50.0
Kim       78.0
Jimmy      NaN
dtype: float64
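
The dtype changed from int64 to float64 because the missing value is stored as NaN, which is a floating-point value. If integer values should stay integers, one option is the nullable `Int64` extension dtype. This is a sketch of that alternative, not what the notebook above does; `base`, `extra`, and `combined` are illustrative names.

```python
import pandas as pd

base = pd.Series([90, 67], index=['Saheen', 'Dom'])

# 'Int64' (capital I) is the nullable integer dtype; None becomes <NA>
extra = pd.Series([78, None], index=['Kim', 'Jimmy'], dtype='Int64')

# Cast the original to the same dtype so the result stays integer
combined = pd.concat([base.astype('Int64'), extra])

print(combined.dtype)  # Int64
print(combined)
```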

Count null values

In [22]:
data_series.isnull().sum()

1
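
Once the nulls are counted, they are usually either dropped or filled. The sketch below shows both on a small stand-in Series; the fill value `0` is an arbitrary choice for illustration.

```python
import numpy as np
import pandas as pd

s = pd.Series([78.0, np.nan], index=['Kim', 'Jimmy'])

without_nulls = s.dropna()  # removes entries whose value is NaN
filled = s.fillna(0)        # replaces NaN with the given value

# Both methods return new Series; the original is left unchanged
print(without_nulls)
print(filled)
print(s.isnull().sum())  # 1
```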

## Like NumPy arrays, Series support aggregate operations

In [20]:
print(f'sum:{data_series.sum()}')
print(f'mean:{data_series.mean():.2f}')

sum:415.0
mean:69.17
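
The mean above is 415 / 6, not 415 / 7, because pandas aggregations skip NaN by default. The sketch below makes that behavior visible on a smaller stand-in Series and also shows element-wise arithmetic; the `+ 5` adjustment is purely illustrative.

```python
import numpy as np
import pandas as pd

s = pd.Series([90.0, 50.0, np.nan], index=['Saheen', 'Tim', 'Jimmy'])

print(s.sum())              # 140.0, NaN is skipped by default
print(s.sum(skipna=False))  # nan, because NaN propagates
print(s.count())            # 2, the number of non-null values

# Arithmetic is applied element by element; NaN stays NaN
curved = s + 5
print(curved)
```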


## Filter data

In [10]:
score = 60
# say a person passes only with a score of 60 or above
passed = data_series >= score
passed  # a boolean mask (NaN compares as False)

Saheen     True
Dom        True
Jia        True
Kamal      True
Tim       False
Kim        True
Jimmy     False
dtype: bool

In [11]:
# keep only the entries where the mask is True
data_series[passed]

Saheen    90.0
Dom       67.0
Jia       60.0
Kamal     70.0
Kim       78.0
dtype: float64
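
Masks can be combined with `&` (and), `|` (or), and `~` (not); each comparison needs its own parentheses. The sketch below rebuilds the scores as `s` so it runs on its own, and the 60 to 80 range is an arbitrary example threshold.

```python
import numpy as np
import pandas as pd

s = pd.Series([90.0, 67.0, 60.0, 70.0, 50.0, 78.0, np.nan],
              index=['Saheen', 'Dom', 'Jia', 'Kamal', 'Tim', 'Kim', 'Jimmy'])

# Scores from 60 up to, but not including, 80
mid_range = s[(s >= 60) & (s < 80)]

# Negating the pass mask also selects the NaN entry,
# because NaN >= 60 evaluates to False
not_passed = s[~(s >= 60)]

# An explicit comparison excludes NaN
failed = s[s < 60]

print(mid_range)
print(not_passed)
print(failed)
```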

NumPy arrays can be filtered in much the same way

In [12]:
numpy_series = np.array(scores, dtype=float)
filtered_data = numpy_series[numpy_series >= score]
filtered_data

array([90., 67., 60., 70.])

Hint: NumPy and pandas behave very similarly here; boolean masks work the same way in both.
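
One practical difference is that a Series keeps its labels after filtering, while a NumPy array holds only the values. The sketch below converts a small stand-in Series with `to_numpy()` and applies the same mask to both.

```python
import pandas as pd

scores_series = pd.Series([90.0, 67.0, 50.0], index=['Saheen', 'Dom', 'Tim'])

values = scores_series.to_numpy()  # the underlying values as a NumPy array
mask = values >= 60                # plain NumPy boolean array

print(values[mask])         # values only, the labels are gone
print(scores_series[mask])  # the same mask on the Series keeps the labels
```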

# Series CRUD