In [1]:
%load_ext nb_black

# Indexing Series

## Setup

In [2]:
import pandas as pd

## Creation

In [3]:
data = {
    "Capital": {
        "Spain": "Madrid",
        "Belgium": "Brussels",
        "France": "Paris",
        "Italy": "Roma",
        "Germany": "Berlin",
        "Portugal": "Lisbon",
        "Norway": "Oslo",
        "Greece": "Athens",
    },
    "Population": {
        "Spain": 46733038,
        "Belgium": 11449656,
        "France": 67076000,
        "Italy": 60390560,
        "Germany": 83122889,
        "Portugal": 10295909,
        "Norway": 5391369,
        "Greece": 10718565,
    },
    "Monarch": {
        "Spain": "Felipe VI",
        "Belgium": "Philippe",
        "Norway": "Harald V",
    },
    "Area": {
        "Spain": 505990,
        "Belgium": 30688,
        "France": 640679,
        "Italy": 301340,
        "Germany": 357022,
        "Portugal": 92212,
        "Norway": 385207,
        "Greece": 131957,
    },
}

In [4]:
# Don't worry about the details of these steps for now:
df = pd.DataFrame(data)
df["Capital"] = df["Capital"].astype("string")
df["Monarch"] = df["Monarch"].astype("string")

In [5]:
df

           Capital  Population    Monarch    Area
Spain       Madrid    46733038  Felipe VI  505990
Belgium   Brussels    11449656   Philippe   30688
France       Paris    67076000       <NA>  640679
Italy         Roma    60390560       <NA>  301340
Germany     Berlin    83122889       <NA>  357022
Portugal    Lisbon    10295909       <NA>   92212
Norway        Oslo     5391369   Harald V  385207
Greece      Athens    10718565       <NA>  131957

In [6]:
df.index

Index(['Spain', 'Belgium', 'France', 'Italy', 'Germany', 'Portugal', 'Norway',
       'Greece'],
      dtype='object')

Create a `Series` whose index contains (non-integer) labels:

In [7]:
s_labels = df["Capital"]

In [8]:
s_labels

Spain         Madrid
Belgium     Brussels
France         Paris
Italy           Roma
Germany       Berlin
Portugal      Lisbon
Norway          Oslo
Greece        Athens
Name: Capital, dtype: string

Create a `Series` with an index containing integers (i.e. the default index):

In [9]:
s_integers = df.reset_index()["Capital"]

In [10]:
s_integers

0      Madrid
1    Brussels
2       Paris
3        Roma
4      Berlin
5      Lisbon
6        Oslo
7      Athens
Name: Capital, dtype: string

## Bonus: Indexing Series

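With a plain integer, `[]` returns the element at that position when the index holds (non-integer) labels, and the element with that label when the index holds integers. Here both give the same element: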

In [11]:
s_labels[1]

'Brussels'

In [12]:
s_integers[1]

'Brussels'

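Because a bare integer inside `[]` can mean either a position or a label, it may be clearer to spell out the intent. The following is a minimal sketch, assuming a small stand-in Series (`capitals`, a hypothetical name built here so the block runs on its own) rather than the notebook's `s_labels`:

```python
import pandas as pd

# Hypothetical stand-in for s_labels, so this block is self-contained.
capitals = pd.Series(
    ["Madrid", "Brussels", "Paris"],
    index=["Spain", "Belgium", "France"],
    name="Capital",
    dtype="string",
)

by_label = capitals["Belgium"]   # look up by index label
by_position = capitals.iloc[1]   # look up by integer position

print(by_label, by_position)
```

Recent pandas releases warn that treating an integer key as a position in `[]` is deprecated, which is another reason to prefer these explicit forms.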

However, a negative integer only works this way on a Series with (non-integer) labels, where it counts positions from the end:

In [13]:
s_labels[-1]

'Athens'

In [14]:
# Raises a KeyError: with an integer index, -1 is looked up as a label, and no such label exists:
s_integers[-1]

KeyError: -1

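When the last element of an integer-indexed Series is wanted, `.iloc[]` states the positional intent explicitly. A minimal sketch, assuming a hypothetical stand-in Series `capitals_int` with the default index (not the notebook's `s_integers`):

```python
import pandas as pd

# Hypothetical stand-in for s_integers: default integer index 0, 1, 2.
capitals_int = pd.Series(
    ["Madrid", "Brussels", "Paris"], name="Capital", dtype="string"
)

# .iloc[] always counts positions, so -1 means "last element".
last = capitals_int.iloc[-1]

# .loc[] always looks up labels, and there is no label -1 in this index.
try:
    capitals_int.loc[-1]
    label_found = True
except KeyError:
    label_found = False

print(last, label_found)
```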

The `.loc[]` (label-based) and `.iloc[]` (position-based) indexers work exactly as for DataFrames:

In [15]:
s_labels.iloc[1]

'Brussels'

In [16]:
s_integers.iloc[1]

'Brussels'

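The cells above only exercise `.iloc[]`. For completeness, here is a small sketch of the label-based counterpart, `.loc[]`, assuming two hypothetical stand-in Series (`capitals` and `capitals_int`) built so the block runs on its own:

```python
import pandas as pd

values = ["Madrid", "Brussels", "Paris"]

# Hypothetical stand-ins for s_labels and s_integers.
capitals = pd.Series(
    values, index=["Spain", "Belgium", "France"], name="Capital", dtype="string"
)
capitals_int = pd.Series(values, name="Capital", dtype="string")

# .loc[] takes whatever the index contains: strings here ...
from_string_label = capitals.loc["Belgium"]
# ... and integers here (the integer is a label, not a position).
from_integer_label = capitals_int.loc[1]

# A list of labels returns a Series with just those entries.
subset = capitals.loc[["Spain", "France"]].tolist()

print(from_string_label, from_integer_label, subset)
```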

Indexing with slices also works exactly as for DataFrames: an integer slice inside `[]` selects by position:

In [17]:
s_labels[:1]

Spain    Madrid
Name: Capital, dtype: string

In [18]:
s_integers[:1]

0    Madrid
Name: Capital, dtype: string

Note that these four approaches select the same element, because they all treat the integers as positions:

In [19]:
s_labels.iloc[:1]

Spain    Madrid
Name: Capital, dtype: string

In [20]:
s_integers.iloc[:1]

0    Madrid
Name: Capital, dtype: string

In [21]:
s_labels[:1]

Spain    Madrid
Name: Capital, dtype: string

In [22]:
s_integers[:1]

0    Madrid
Name: Capital, dtype: string

However, `.loc[]` refers to labels and includes the end label of a slice, and hence gives a different result:

In [23]:
s_integers.loc[:1]

0      Madrid
1    Brussels
Name: Capital, dtype: string

In [24]:
# Raises a TypeError, because the integer 1 cannot be used to slice an index of string labels:
s_labels.loc[:1]

TypeError: cannot do slice indexing on Index with these indexers [1] of type int

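The difference between positional and label-based slices comes down to whether the end of the slice is included. A minimal sketch of that rule, assuming hypothetical stand-in Series (`capitals` and `capitals_int`) so the block is self-contained:

```python
import pandas as pd

values = ["Madrid", "Brussels", "Paris"]

# Hypothetical stand-ins for s_labels and s_integers.
capitals = pd.Series(
    values, index=["Spain", "Belgium", "France"], name="Capital", dtype="string"
)
capitals_int = pd.Series(values, name="Capital", dtype="string")

# Positional slice: the stop position is excluded, like a Python list slice.
by_position = capitals.iloc[:1]

# Label slice: the stop label is included.
by_label = capitals.loc[:"Belgium"]

# With an integer index the same rule applies: .loc[:1] stops at label 1, inclusive.
by_integer_label = capitals_int.loc[:1]

print(len(by_position), len(by_label), len(by_integer_label))
```

On a Series with string labels, a label slice therefore needs string endpoints, such as `capitals.loc[:"Belgium"]`.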