# Working with Pandas `Series`

## Creating `Series`

In [3]:
import numpy as np
import pandas as pd

### From array

#### `index`: must be same size as `ndarray`/`list`

In [4]:
a = pd.Series([1, 3, 5, 7], index=['a', 'c', 'd', 'b'])
a

a    1
c    3
d    5
b    7
dtype: int64

#### No `index`: default integer index starting at 0

In [5]:
b = pd.Series([1, 3, 5, np.nan, 6, 8])
b

0    1.0
1    3.0
2    5.0
3    NaN
4    6.0
5    8.0
dtype: float64

### From `dict`

#### No `index`: use keys

In [7]:
c = pd.Series({"b": 1, "a": 0, "c": 2})
c

b    1
a    0
c    2
dtype: int64

#### `index`: pull corresponding values from data

In [8]:
d = pd.Series({"b": 1, "a": 0, "c": 2}, index=["b", "d", "c"])
d

b    1.0
d    NaN
c    2.0
dtype: float64
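
As a rough sketch of what happens here, building the `Series` from the `dict` and then selecting labels with `reindex` gives the same result. The label `"d"` has no value in the data, so it is filled with `NaN`, which is why the dtype becomes `float64`. The variable names below are illustrative only.

```python
import pandas as pd

data = {"b": 1, "a": 0, "c": 2}

# Build from the dict first, then select labels; "d" has no value, so it becomes NaN.
from_reindex = pd.Series(data).reindex(["b", "d", "c"])

# Passing index= directly to the constructor yields the same labels and values.
from_index = pd.Series(data, index=["b", "d", "c"])

print(from_reindex.equals(from_index))
```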

### From scalar

#### `index` (required): set every value to data value

In [9]:
e = pd.Series(2, index=["a", "b", "c"])
e

a    2
b    2
c    2
dtype: int64

## Working with `Series`

### `Series` is `ndarray`-like

`Series` acts much like an `ndarray` and is a valid argument to most NumPy functions.
Operations such as slicing also slice the index, so the labels stay attached to their values.

In [12]:
f = pd.Series(range(5), index=['a', 'b', 'c', 'd', 'e'])
f

a    0
b    1
c    2
d    3
e    4
dtype: int64

In [14]:
f.iloc[0]  # explicit positional access

0

In [15]:
f[:2]

a    0
b    1
dtype: int64

In [17]:
f[f > f.median()]

d    3
e    4
dtype: int64

In [18]:
f.iloc[[4, 3, 1]]  # explicit positional access, in the given order

e    4
d    3
b    1
dtype: int64

In [19]:
np.exp(f)

a     1.000000
b     2.718282
c     7.389056
d    20.085537
e    54.598150
dtype: float64
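
Integer keys can be ambiguous when the index holds non-integer labels, so a minimal sketch of the explicit accessors may help: `.iloc` selects by position, `.loc` selects by label, and `to_numpy()` returns the values as a plain `ndarray` without the index. The name `h` is illustrative and stands in for a `Series` like `f`.

```python
import numpy as np
import pandas as pd

h = pd.Series(range(5), index=["a", "b", "c", "d", "e"])

first = h.iloc[0]            # by position
picked = h.iloc[[4, 3, 1]]   # several positions, in the given order
by_label = h.loc["c"]        # by label
values = h.to_numpy()        # plain ndarray, index dropped

print(first, by_label, list(picked.index), values)
```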

### `Series` is `dict`-like

You can get and set values by index label.

In [20]:
f['a']

0

In [21]:
f['a'] = 100
f

a    100
b      1
c      2
d      3
e      4
dtype: int64

In [22]:
f['f']

KeyError: 'f'

In [23]:
f.get('f', np.nan)

nan

In [25]:
f.a

100
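
Continuing the `dict` analogy, the `in` operator tests membership against the index labels, not the values. The sketch below uses an illustrative `Series` named `h` and shows one way to guard a lookup instead of catching `KeyError`.

```python
import numpy as np
import pandas as pd

h = pd.Series(range(5), index=["a", "b", "c", "d", "e"])

# Membership is checked against the index labels, not the values.
has_e = "e" in h
has_f = "f" in h

# Guard a lookup rather than relying on a KeyError.
value = h["f"] if "f" in h else np.nan

print(has_e, has_f, value)
```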

### Vectorized operations

Arithmetic on a `Series` is applied element-wise, so explicit loops are usually unnecessary.

In [27]:
f + f

a    200
b      2
c      4
d      6
e      8
dtype: int64

In [29]:
f ** 2

a    10000
b        1
c        4
d        9
e       16
dtype: int64

### Label considerations

Operations between `Series` automatically align the data by label.
The result of an operation between unaligned `Series` has the union of the indexes involved. If a label is missing from either `Series`, the result for that label is marked as missing (`NaN`).

In [36]:
g = pd.Series(.5, index=['b', 'c', 'e', 'f'])
f + g

a    NaN
b    1.5
c    2.5
d    NaN
e    4.5
f    NaN
dtype: float64
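
When the `NaN` results from alignment are not wanted, two common options are sketched here, assuming illustrative `Series` named `left` and `right`: `Series.add` with `fill_value` treats a label missing from one side as that value, while `dropna()` keeps only the labels present in both operands.

```python
import pandas as pd

left = pd.Series(range(5), index=["a", "b", "c", "d", "e"])
right = pd.Series(0.5, index=["b", "c", "e", "f"])

# Treat a label missing from one side as 0 instead of producing NaN.
filled = left.add(right, fill_value=0)

# Alternatively, keep only the labels present in both operands.
common = (left + right).dropna()

print(filled)
print(common)
```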

## `name` attribute

You can give a `Series` a `name`.

In [40]:
s = pd.Series(np.random.randn(5), name="random series")
s

0   -0.784135
1   -0.354688
2   -0.888289
3   -0.051576
4   -0.938784
Name: random series, dtype: float64

In [41]:
s.name

'random series'

In a `DataFrame`, the `Series` name will be the column label.
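
A small sketch of that behavior, using fixed values instead of random ones so the result is reproducible: `to_frame()` turns the `Series` into a single-column `DataFrame`, and `rename` returns a copy with a different name. The name `"other"` is purely illustrative.

```python
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0], name="random series")

frame = s.to_frame()                    # one column, labelled with the Series name
renamed = s.rename("other")             # new Series; s keeps its original name
both = pd.concat([s, renamed], axis=1)  # each Series name becomes a column label

print(list(frame.columns), list(both.columns))
```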