<h1 style="">Pandas Series Object</h1>




References:

1. Series @ pandas user guide: https://pandas.pydata.org/docs/user_guide/dsintro.html#basics-series
1. Series @ pandas API: https://pandas.pydata.org/docs/reference/api/pandas.Series.html


In [1]:
import pandas as pd

## Create Series Object

A Series is a one-dimensional **labeled array** capable of holding any data type (integers, strings, floating point numbers, Python objects, etc.).

The basic method to create a Series is to call:

```
s = pd.Series(data, index=index)
```
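Besides a list or a dictionary, `data` can also be a NumPy array or a single scalar. The following is a minimal sketch of those two cases; the variable names and the labels `x`, `y`, `z` are made up for illustration.

```python
import numpy as np
import pandas as pd

# From a NumPy array: the index must have the same length as the data.
from_array = pd.Series(np.array([0.5, 1.5, 2.5]), index=["x", "y", "z"])

# From a scalar: the value is repeated for every label in the index.
from_scalar = pd.Series(5.0, index=["x", "y", "z"])

print(from_array)
print(from_scalar)
```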

### Create Series with Explicit Indexing

The passed index is a list of *axis labels*.

In [2]:
s1 = pd.Series([1,2,3,4], index=['a', 'b', 'c', 'd'])
s1

a    1
b    2
c    3
d    4
dtype: int64

### Create Series with Implicit Indexing

When no index is passed, one will be created implicitly, having values [0, ..., len(data) - 1].

In [3]:
s1 = pd.Series([1,2,3,4])
s1

0    1
1    2
2    3
3    4
dtype: int64
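To see what the implicit index looks like, it can be inspected directly. This short sketch (variable names are illustrative) suggests that pandas builds a `RangeIndex` when no index is passed:

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4])

# With no index argument, pandas creates a RangeIndex starting at 0.
default_index = s.index
index_labels = list(default_index)
is_range_index = isinstance(default_index, pd.RangeIndex)

print(default_index)
print(index_labels, is_range_index)
```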

### Create Series from Dictionary

In [4]:
data_dict = {
    "d":4,
    "a":1,
    "c":3,
    "b":2,
    "e":5
}
s2 = pd.Series(data_dict)
s2

d    4
a    1
c    3
b    2
e    5
dtype: int64

In [5]:
s2 = pd.Series(data_dict, index=['a','c',3,4,5])
s2

a    1.0
c    3.0
3    NaN
4    NaN
5    NaN
dtype: float64
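When both a dictionary and an index are passed, values are looked up by label: labels missing from the dictionary get `NaN`, and dictionary keys not listed in the index are dropped. Because `NaN` is a floating point value, the result becomes `float64`. A small sketch of how the missing entries could be detected (variable names are illustrative):

```python
import pandas as pd

data_dict = {"d": 4, "a": 1, "c": 3, "b": 2, "e": 5}
s = pd.Series(data_dict, index=["a", "c", 3, 4, 5])

# isna() marks the labels that were not found in the dictionary.
missing = s.isna()
n_missing = int(missing.sum())

# dropna() keeps only the labels that had a value.
present = s.dropna()

print(n_missing)
print(present)
```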

## Access Elements (Series Indexing)

Consider the following Series:

In [6]:
prices_ds = pd.Series(
    [1.5, 2, 2.5, 3],
    index=["apples", "oranges", "bananas", "strawberries"]
)
prices_ds

apples          1.5
oranges         2.0
bananas         2.5
strawberries    3.0
dtype: float64

Access elements by position (implicit index)

In [7]:
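# .iloc selects by position; plain prices_ds[1] is deprecated for positional access in recent pandas versions
prices_ds.iloc[1]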

2.0

Access elements by index labels (explicit index)

In [8]:
# using dot notation (works only when labels are valid identifiers)
prices_ds.oranges

2.0

In [9]:
# using square brackets notation:
prices_ds['oranges']

2.0

#### Show the index

In [10]:
prices_ds.index

Index(['apples', 'oranges', 'bananas', 'strawberries'], dtype='object')

In [11]:
# the same element by position (the implicit, zero-based index that pandas keeps alongside the labels)
prices_ds.iloc[1]

2.0

#### List indexes
**Note the double brackets**: the outer pair is the indexing operator and the inner pair is a list of index labels.

In [12]:
prices_ds[ ["apples", "oranges"] ]

apples     1.5
oranges    2.0
dtype: float64

### Series Slicing

In [13]:
# when slicing by position (implicit index), the end position is excluded
prices_ds[0:2]

apples     1.5
oranges    2.0
dtype: float64

In [14]:
# when slicing by label (explicit index), the end label is included
prices_ds['apples':'bananas']

apples     1.5
oranges    2.0
bananas    2.5
dtype: float64

In [15]:
prices_ds[2:]

bananas         2.5
strawberries    3.0
dtype: float64

In [16]:
prices_ds['bananas':]

bananas         2.5
strawberries    3.0
dtype: float64
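The same slices can be written with `.iloc` and `.loc`, which makes it explicit whether positions or labels are meant. A minimal sketch, reusing the prices from above under illustrative variable names:

```python
import pandas as pd

prices = pd.Series(
    [1.5, 2, 2.5, 3],
    index=["apples", "oranges", "bananas", "strawberries"],
)

# Position-based slice: the stop position (2) is excluded.
by_position = prices.iloc[0:2]

# Label-based slice: the stop label ("bananas") is included.
by_label = prices.loc["apples":"bananas"]

print(by_position)
print(by_label)
```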

### Indexing a Series Using .loc[] and .iloc[]

.loc[] is primarily **label based**

.iloc[] is primarily **integer position based**

Both .loc[] and .iloc[] can be used with a boolean array (see 'Filtering by value (masking)' below)

In [17]:
prices_ds.loc['bananas']

2.5

In [18]:
prices_ds.iloc[2]

2.5
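The difference between the two accessors matters most when the labels are themselves integers. The sketch below uses a made-up Series whose integer labels do not match the positions, and also shows that both accessors accept a boolean list:

```python
import pandas as pd

# Hypothetical Series: integer labels in reverse order.
s = pd.Series([10, 20, 30], index=[3, 2, 1])

by_label = s.loc[1]      # the element whose label is 1
by_position = s.iloc[1]  # the second element

# A boolean list selects the same elements with either accessor.
mask = [True, False, True]
loc_masked = s.loc[mask]
iloc_masked = s.iloc[mask]

print(by_label, by_position)
print(loc_masked)
```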

## Series Operations

### Arithmetic operations are element-wise

In [19]:
prices_ds + 2

apples          3.5
oranges         4.0
bananas         4.5
strawberries    5.0
dtype: float64
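Other operators and NumPy functions behave the same way: they are applied to each element and the index is preserved. A brief sketch, assuming the same prices as above (variable names are illustrative):

```python
import numpy as np
import pandas as pd

prices = pd.Series(
    [1.5, 2, 2.5, 3],
    index=["apples", "oranges", "bananas", "strawberries"],
)

# Multiplication by a scalar is applied to every element.
doubled = prices * 2

# NumPy universal functions also work element-wise and keep the labels.
rounded_up = np.ceil(prices)

print(doubled)
print(rounded_up)
```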

### Comparison operations are element-wise

In [20]:
prices_ds>2

apples          False
oranges         False
bananas          True
strawberries     True
dtype: bool

### Filtering by value (masking)

In [21]:
mask = [False, False, True, True]
prices_ds[mask]

bananas         2.5
strawberries    3.0
dtype: float64

In [22]:
prices_ds[prices_ds>2]

bananas         2.5
strawberries    3.0
dtype: float64
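Several conditions can be combined into one mask with `&` (and), `|` (or) and `~` (not). Each condition needs its own parentheses, because these operators bind more tightly than comparisons. A minimal sketch with illustrative variable names:

```python
import pandas as pd

prices = pd.Series(
    [1.5, 2, 2.5, 3],
    index=["apples", "oranges", "bananas", "strawberries"],
)

# Both conditions must hold.
mid_range = prices[(prices > 1.5) & (prices < 3)]

# At least one condition must hold.
cheap_or_expensive = prices[(prices <= 1.5) | (prices >= 3)]

# Negate a condition.
not_expensive = prices[~(prices > 2)]

print(mid_range)
print(cheap_or_expensive)
print(not_expensive)
```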

### Dictionary-like Operations on Series

In [23]:
"apples" in prices_ds

True

In [24]:
# `in` tests the index labels, not the values, so this is False even though 3.0 is a value
3 in prices_ds

False
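As with a dictionary, the `in` operator looks at the keys (the index labels), not at the values. The sketch below shows one possible way to test for a value instead, along with the dictionary-style `keys()` and `get()` methods; the label `"kiwis"` is made up to demonstrate the default.

```python
import pandas as pd

prices = pd.Series(
    [1.5, 2, 2.5, 3],
    index=["apples", "oranges", "bananas", "strawberries"],
)

label_found = "apples" in prices      # membership in the index
value_as_label = 3 in prices          # 3 is a value, not a label
value_found = 3 in prices.values      # membership in the underlying array
value_found_isin = bool(prices.isin([3]).any())

# keys() returns the index; get() returns a default for a missing label.
keys = list(prices.keys())
default_price = prices.get("kiwis", 0.0)

print(label_found, value_as_label, value_found, value_found_isin)
print(keys, default_price)
```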

### Missing Data

In [25]:
s1 = pd.Series([1,3], index=["a","c"], dtype="int32")
s2 = pd.Series([2,3], index=["b","c"], dtype="int32")

In [26]:
s1+s2

a    NaN
b    NaN
c    6.0
dtype: float64
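The result above comes from label alignment: the two Series are matched by index label, and any label present in only one of them yields `NaN`, which in turn forces the result to `float64`. If the missing side should instead count as zero, `Series.add` with `fill_value` is one option. A minimal sketch (variable names are illustrative):

```python
import pandas as pd

s1 = pd.Series([1, 3], index=["a", "c"], dtype="int32")
s2 = pd.Series([2, 3], index=["b", "c"], dtype="int32")

# Plain addition: labels found in only one Series become NaN.
total = s1 + s2

# Treat a label missing from one Series as 0 before adding.
filled = s1.add(s2, fill_value=0)

# Alternatively, keep only the labels present in both Series.
common_only = total.dropna()

print(filled)
print(common_only)
```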