In [1]:
import numpy as np
import pandas as pd

A *Series* is a one-dimensional labeled array capable of holding any data type. The axis labels are collectively referred to as the index. The basic way to create a *Series* is to call:

In [None]:
s = pd.Series(data, index=index)

*data* can be many different things:
- a Python dict
- an ndarray
- a scalar value

## From ndarray

In [22]:
# Here, we specify the index
s = pd.Series(np.random.randn(5), index=['a', 'b', 'c', 'd', 'e'])
s

a   -0.387112
b   -0.340261
c    1.989447
d   -0.381424
e   -0.344201
dtype: float64

In [23]:
s.index

Index(['a', 'b', 'c', 'd', 'e'], dtype='object')

In [24]:
# Here, we let pandas create a default integer index (0 to len(data) - 1)
pd.Series(np.random.randn(5))

0   -0.485756
1   -0.439825
2   -0.132159
3    2.006349
4    0.343410
dtype: float64
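When *data* is an ndarray and an index is passed, the index has to be the same length as the data. The sketch below uses a hypothetical three-element array and a two-label index (both made up for illustration) to show that a mismatch is rejected with a `ValueError`:

```python
import numpy as np
import pandas as pd

values = np.arange(3)  # three values (illustrative data)

try:
    # Only two labels for three values: pandas refuses to build the Series
    pd.Series(values, index=['a', 'b'])
except ValueError as err:
    message = str(err)
    print('Error occurred with:', message)
```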

***

## From dict
A *Series* can be created from a dict. The keys become the index labels, in the dict's insertion order:

In [25]:
d = {'b': 1, 'a': 0, 'c': 2}

pd.Series(d)

b    1
a    0
c    2
dtype: int64
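If an index is passed together with a dict, the values are pulled out by matching the index labels against the dict keys, and labels without a matching key get `NaN`. A minimal sketch, using the same dict as above and an index in which the label `'d'` is deliberately missing from the dict:

```python
import numpy as np
import pandas as pd

d = {'b': 1, 'a': 0, 'c': 2}

# Values are looked up by label, in the order given by `index`.
# 'd' is not a key of the dict, so its value is NaN, and the dtype
# is upcast from int64 to float64 to hold the missing value.
from_dict = pd.Series(d, index=['b', 'c', 'd', 'a'])
print(from_dict)
```

`NaN` (not a number) is the standard missing-data marker used in pandas.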

***

## From scalar value
If *data* is a scalar value, an index must be provided. The value is repeated to match the length of the index:

In [26]:
pd.Series(5., index=['a', 'b', 'c', 'd', 'e'])

a    5.0
b    5.0
c    5.0
d    5.0
e    5.0
dtype: float64

***

## Series is *ndarray*-like
A *Series* acts much like a NumPy *ndarray* and is a valid argument to most NumPy functions. Operations such as slicing also slice the index.

In [27]:
# Select by position; plain s[0] on a label index is deprecated in recent pandas
s.iloc[0]

-0.38711206338345133

In [28]:
s[:3]

a   -0.387112
b   -0.340261
c    1.989447
dtype: float64

In [29]:
s[s > s.median()]

b   -0.340261
c    1.989447
dtype: float64

In [30]:
# Select several positions at once, in the order given
s.iloc[[4, 3, 1]]

e   -0.344201
d   -0.381424
b   -0.340261
dtype: float64

In [32]:
np.exp(s)

a    0.679015
b    0.711585
c    7.311493
d    0.682888
e    0.708787
dtype: float64

In [33]:
s.dtype

dtype('float64')

In [34]:
# While Series is ndarray-like, if you need an
# actual ndarray, then use Series.to_numpy()
s.to_numpy()

array([-0.38711206, -0.34026066,  1.98944748, -0.38142445, -0.34420066])
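Because a *Series* carries both positions and labels, it helps to say explicitly which one is meant. The following sketch contrasts the `.iloc` (position) and `.loc` (label) accessors on a small Series whose values are made up for illustration:

```python
import pandas as pd

t = pd.Series([10.0, 20.0, 30.0, 40.0], index=['a', 'b', 'c', 'd'])

first_by_position = t.iloc[0]   # element at position 0
first_by_label = t.loc['a']     # element with label 'a'
picked = t.iloc[[3, 1]]         # positions 3 and 1, in that order
position_slice = t.iloc[0:2]    # position slices exclude the end position
label_slice = t.loc['a':'c']    # label slices include the end label
```

Note the difference between the two slices: `t.iloc[0:2]` returns two elements, while `t.loc['a':'c']` returns three.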

***

## Series is *dict*-like
A *Series* is like a fixed-size dict: you can get and set values by index label.

In [35]:
s['a']

-0.38711206338345133

In [36]:
s['e'] = 12.
s

a    -0.387112
b    -0.340261
c     1.989447
d    -0.381424
e    12.000000
dtype: float64

In [37]:
'e' in s

True

In [38]:
'f' in s

False

In [49]:
# If a label is not in the index, a KeyError is raised:
try:
    s['f']
except KeyError as e:
    print('Error occurred with:', e)

Error occurred with: 'f'
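As with a dict, the `get` method avoids the exception: a missing label returns `None` or a default of your choice. A minimal sketch on a small Series with illustrative values:

```python
import numpy as np
import pandas as pd

t = pd.Series([1.0, 2.0, 3.0], index=['a', 'b', 'c'])

present = t.get('a')            # label exists: returns its value
missing = t.get('f')            # label absent: returns None, no exception
fallback = t.get('f', np.nan)   # label absent: returns the given default
```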


***

## Vectorized Operations
A key difference between *Series* and *ndarray* is that operations between *Series* automatically align the data by label. Thus, you can write computations without considering whether the *Series* involved have the same labels. The result has the union of the labels involved, and a label missing from either operand is marked as NaN.

In [50]:
s + s

a    -0.774224
b    -0.680521
c     3.978895
d    -0.762849
e    24.000000
dtype: float64

In [51]:
s * 2

a    -0.774224
b    -0.680521
c     3.978895
d    -0.762849
e    24.000000
dtype: float64

In [52]:
np.exp(s)

a         0.679015
b         0.711585
c         7.311493
d         0.682888
e    162754.791419
dtype: float64

In [53]:
s1 = s[1:]
s2 = s[:-1]
s1 + s2

a         NaN
b   -0.680521
c    3.978895
d   -0.762849
e         NaN
dtype: float64
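The NaN entries come from labels that appear in only one operand. If that is not what you want, two common options are to treat the missing side as a fill value with `Series.add(..., fill_value=...)`, or to drop the unmatched labels with `dropna()`. A short sketch with two small Series whose values are made up for illustration:

```python
import pandas as pd

left = pd.Series([1.0, 2.0, 3.0], index=['a', 'b', 'c'])
right = pd.Series([10.0, 20.0, 30.0], index=['b', 'c', 'd'])

# Plain addition: union of labels, NaN where a label exists on one side only
aligned = left + right

# Treat a label missing from one side as 0 instead of producing NaN
filled = left.add(right, fill_value=0)

# Keep only the labels present in both operands
overlap = aligned.dropna()
```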

***

## Name Attribute
A *Series* can also have a *name* attribute:

In [54]:
s = pd.Series(np.random.randn(5), name='something')
s

0    0.884453
1   -1.274602
2   -0.140365
3   -0.495843
4    1.705437
Name: something, dtype: float64

In [55]:
s.name

'something'
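The name can be changed with `Series.rename`, which returns a new *Series* and leaves the original untouched. The name is also used as the column label when the *Series* is converted to a DataFrame. A minimal sketch with illustrative values:

```python
import pandas as pd

named = pd.Series([1, 2, 3], name='something')

# rename with a scalar returns a new Series carrying the new name
renamed = named.rename('different')

# The Series name becomes the column label of the resulting DataFrame
frame = named.to_frame()
```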