# Pandas series

1. Creating a series
2. Indexes (Pandas vs. NumPy)
3. `.loc` and `.iloc`
4. dtypes
5. `nan` and Pandas
6. Methods, mask indexes

In [1]:
import numpy as np
import pandas as pd
from pandas import Series, DataFrame

In [2]:
a = np.array([10, 20, 30, 40, 50], dtype=np.int8)
a

array([10, 20, 30, 40, 50], dtype=int8)

In [3]:
# Create a series in a similar way

s = Series([10, 20, 30, 40, 50])
s

0    10
1    20
2    30
3    40
4    50
dtype: int64

In [4]:
# Many NumPy-style methods work on a series (defaults can differ, e.g. s.std() uses ddof=1)

In [5]:
s.min()

np.int64(10)

In [6]:
s.max()

np.int64(50)

In [7]:
s.mean()

np.float64(30.0)

In [8]:
s.std()

np.float64(15.811388300841896)
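The value above is the sample standard deviation: `Series.std()` defaults to `ddof=1`, whereas NumPy's `std` defaults to `ddof=0`. A minimal sketch of the difference, reusing the same five values (the variable names are only illustrative):

```python
import numpy as np
from pandas import Series

values = [10, 20, 30, 40, 50]
s = Series(values)
a = np.array(values)

sample_std = s.std()          # pandas default: ddof=1 (sample standard deviation)
population_std = a.std()      # NumPy default: ddof=0 (population standard deviation)
matched_std = s.std(ddof=0)   # ask pandas for the NumPy default explicitly

print(sample_std, population_std, matched_std)
```

Passing `ddof` explicitly is a reasonable habit whenever results from the two libraries are compared.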

In [9]:
# retrieve values from our series with [] (an integer key is looked up as an index label)
s[2]

np.int64(30)
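With the default index (0, 1, 2, ...), labels and positions coincide, so `s[2]` looks like NumPy indexing. The sketch below uses a hypothetical series `t` whose integer labels start at 1 to show how `.loc` (by label) and `.iloc` (by position) differ:

```python
from pandas import Series

# Hypothetical series whose integer labels do not start at 0
t = Series([10, 20, 30], index=[1, 2, 3])

by_label = t.loc[2]       # the element whose label is 2
by_position = t.iloc[2]   # the element at position 2 (the third one)
plain = t[2]              # with an integer index, [] looks up the label

print(by_label, by_position, plain)
```

Using `.loc` or `.iloc` explicitly avoids having to remember which rule plain `[]` applies.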

In [10]:
# fancy indexing

s[ [2, 4] ]

2    30
4    50
dtype: int64

In [11]:
# a one-element list still returns a series, not a scalar
s[ [2] ]

2    30
dtype: int64

In [12]:
# comparisons

s == 30

0    False
1    False
2     True
3    False
4    False
dtype: bool

In [13]:
# mask index / boolean index
s[ s == 30  ]

2    30
dtype: int64

In [15]:
s[ s <= 30 ]

0    10
1    20
2    30
dtype: int64
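Boolean masks can also be combined. As a small sketch on the same values: use `&`, `|` and `~` rather than `and`, `or` and `not`, and wrap each comparison in parentheses because the bitwise operators bind more tightly than the comparisons.

```python
from pandas import Series

s = Series([10, 20, 30, 40, 50])

middle = s[(s >= 20) & (s <= 40)]   # both conditions must hold
edges = s[(s < 20) | (s > 40)]      # either condition may hold
not_thirty = s[~(s == 30)]          # invert a mask

print(middle)
print(edges)
print(not_thirty)
```

Each result keeps the original index labels of the rows that were selected.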

In [16]:
# a series built from a NumPy array keeps the array's dtype (int8 here)
s = Series(a)
s

0    10
1    20
2    30
3    40
4    50
dtype: int8
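The dtype can also be chosen without going through a NumPy array first. A minimal sketch, assuming the values fit into `int8` (range -128 to 127):

```python
import numpy as np
from pandas import Series

default = Series([10, 20, 30, 40, 50])                # pandas picks the integer dtype
small = Series([10, 20, 30, 40, 50], dtype=np.int8)   # request int8 at creation time
converted = default.astype(np.int8)                   # convert an existing series

print(default.dtype, small.dtype, converted.dtype)
```

`astype` returns a new series and leaves `default` unchanged.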

# Exercise: Pandas series

1. Create a series for the expected high temperatures in the next 10 days.
2. What are the min and max high temperatures?
3. On how many days will we have temperatures below the mean?
4. What is the mean in the first 5 days? In the last 5 days?

In [17]:
s = Series([18, 15, 12, 11, 10, 9, 12, 15, 17, 18])
s

0    18
1    15
2    12
3    11
4    10
5     9
6    12
7    15
8    17
9    18
dtype: int64

In [18]:
# 2. What are the min and max high temperatures?

s.min()

np.int64(9)

In [20]:
s.max()

np.int64(18)
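`min` and `max` return the values themselves. To find out on which day they occur, `idxmin` and `idxmax` return the matching index label instead. A short sketch on the same temperatures (when a value is tied, the first matching label is returned):

```python
from pandas import Series

s = Series([18, 15, 12, 11, 10, 9, 12, 15, 17, 18])

coldest_day = s.idxmin()   # index label of the lowest temperature
warmest_day = s.idxmax()   # 18 appears twice; the first label is returned

print(coldest_day, warmest_day)
```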

In [21]:
help(s.max)

Help on method max in module pandas.core.series:

max(
    axis: 'Axis | None' = 0,
    skipna: 'bool' = True,
    numeric_only: 'bool' = False,
    **kwargs
) method of pandas.core.series.Series instance
    Return the maximum of the values over the requested axis.

    If you want the *index* of the maximum, use ``idxmax``. This is the equivalent of the ``numpy.ndarray`` method ``argmax``.

    Parameters
    ----------
    axis : {index (0)}
        Axis for the function to be applied on.
        For `Series` this parameter is unused and defaults to 0.

        For DataFrames, specifying ``axis=None`` will apply the aggregation
        across both axes.

        .. versionadded:: 2.0.0

    skipna : bool, default True
        Exclude NA/null values when computing the result.
    numeric_only : bool, default False
        Include only float, int, boolean columns. Not implemented for Series.

    **kwargs
        Additional keyword arguments to be passed to the function.

    Returns


In [22]:
s.describe()

count    10.000
mean     13.700
std       3.335
min       9.000
25%      11.250
50%      13.500
75%      16.500
max      18.000
dtype: float64

In [None]:
s.agg
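The cell above names `s.agg` without calling it. One possible use, sketched here as an assumption about where the cell was heading, is to request several aggregations at once by passing a list of method names:

```python
from pandas import Series

s = Series([18, 15, 12, 11, 10, 9, 12, 15, 17, 18])

# Each name in the list becomes an index label in the resulting series
summary = s.agg(['min', 'max', 'mean'])

print(summary)
```

The result is itself a series, so individual values can be read back with `summary['mean']`.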

In [None]:
# 3. On how many days will we have temperatures below the mean?
# 4. What is the mean in the first 5 days? In the last 5 days?
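One way to approach questions 3 and 4, offered as a sketch rather than the only solution: compare the series against its mean and sum the resulting booleans (`True` counts as 1), then slice by position with `.iloc` for the two halves.

```python
from pandas import Series

s = Series([18, 15, 12, 11, 10, 9, 12, 15, 17, 18])

# 3. Count the days below the mean by summing the boolean mask
days_below_mean = (s < s.mean()).sum()

# 4. Mean of the first five and the last five days, selected by position
first_five_mean = s.iloc[:5].mean()
last_five_mean = s.iloc[-5:].mean()

print(days_below_mean, first_five_mean, last_five_mean)
```

`s.head(5)` and `s.tail(5)` would select the same two halves.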