
In [2]:
import pandas as pd


In [6]:
george = pd.Series([10, 7],
                   index = ['1968', '1969'],
                   name = 'George Songs')
george

1968    10
1969     7
Name: George Songs, dtype: int64

In [8]:
"""
pandas stores string index labels with the generic object dtype.
Note the dtype of the index attribute:
"""

george.index

Index(['1968', '1969'], dtype='object')

In [13]:
dupe = pd.Series([10, 2, 7],
                 index = ['1968', '1968', '1969'],
                 name = 'George Songs')
dupe

1968    10
1968     2
1969     7
Name: George Songs, dtype: int64

In [14]:
dupe.index.is_unique

False

In [15]:
george.index.is_unique

True
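
The uniqueness check matters because it changes what a label lookup returns. The following is a minimal sketch, reusing the `george` and `dupe` Series from above; the names `unique_hit` and `dupe_hit` are illustrative and not part of the original notebook.

```python
import pandas as pd

george = pd.Series([10, 7], index=['1968', '1969'], name='George Songs')
dupe = pd.Series([10, 2, 7], index=['1968', '1968', '1969'], name='George Songs')

# A unique label yields a single scalar value.
unique_hit = george.loc['1968']

# A duplicated label yields a Series holding every matching entry.
dupe_hit = dupe.loc['1968']

print(unique_hit)
print(list(dupe_hit))

# Index.duplicated() flags every repeat of a label after its first occurrence.
print(dupe.index.duplicated().tolist())
```

Code that expects a scalar from `.loc` can therefore break silently when the index contains duplicates, which is one reason to check `is_unique` first.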

In [17]:
"""
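NOTE
With string labels, an integer key such as -1 falls back to
position-based indexing, so george_i[-1] returns the last value.
If the index already uses integer labels, this fallback does not work:
the integer is treated as a label, and a missing label raises a KeyError.
Recent pandas releases deprecate the fallback; prefer .iloc for positions.
"""
george_i = pd.Series([10, 7],
                     index = ['1968', '1969'],
                     name = 'George Songs')
george_i[-1]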


7
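
To see the integer-label case from the note, here is a small sketch. It assumes a hypothetical variant of the Series, `george_int`, whose index holds the integers 1968 and 1969 instead of strings; that variant does not appear in the original notebook.

```python
import pandas as pd

# Hypothetical variant: the same data, but with integer index labels.
george_int = pd.Series([10, 7], index=[1968, 1969], name='George Songs')

try:
    george_int[-1]          # -1 is looked up as a label, not as a position
    fallback_worked = True
except KeyError:
    fallback_worked = False

last_by_position = george_int.iloc[-1]   # explicit positional lookup
value_by_label = george_int.loc[1969]    # explicit label lookup

print(fallback_worked, last_by_position, value_by_label)
```

Using `.iloc` for positions and `.loc` for labels avoids the ambiguity regardless of the index type.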

In [18]:
# .iloc and .loc
"""
Indexing through the .iloc attribute looks values up by position, much
like indexing a Python list. pandas raises an IndexError if there is no
entry at that position:
"""
george.iloc[0]

10

In [20]:
george.iloc[-1]

7

In [None]:
# Raises IndexError: position 4 is out of bounds for a Series of length 2
george.iloc[4]

In [None]:
george.iloc['1968']
# Raises TypeError: .iloc accepts integer positions, not labels
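
The two failing `.iloc` calls above raise different exception types. The sketch below wraps the lookup in a helper so both can be observed without stopping the notebook; `describe_iloc` is a hypothetical helper written for this illustration, not a pandas function.

```python
import pandas as pd

george = pd.Series([10, 7], index=['1968', '1969'], name='George Songs')

def describe_iloc(series, key):
    """Return the value at position `key`, or the name of the exception raised."""
    try:
        return series.iloc[key]
    except (IndexError, TypeError) as exc:
        return type(exc).__name__

print(describe_iloc(george, 0))        # valid position
print(describe_iloc(george, 4))        # position past the end
print(describe_iloc(george, '1968'))   # a label instead of a position
```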


In [25]:
"""
In addition to pulling out a single item, we can slice just as in normal
Python. A stop value past the end is clipped rather than raising an error:

"""

george.iloc[0:3] # slice

1968    10
1969     7
Name: George Songs, dtype: int64
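
A few more slices illustrate how `.iloc` follows Python list semantics. This is a sketch on the same `george` Series; the variable names are illustrative only.

```python
import pandas as pd

george = pd.Series([10, 7], index=['1968', '1969'], name='George Songs')

# The stop value 3 exceeds the length 2, so the slice is clipped.
clipped = george.iloc[0:3]

# A slice that starts past the end gives an empty Series, not an error.
empty = george.iloc[5:10]

# A negative step reverses the Series; the labels travel with the values.
reversed_series = george.iloc[::-1]

print(len(clipped), len(empty))
print(reversed_series)
```

This contrasts with a single out-of-range position such as `george.iloc[4]`, which raises an IndexError.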

In [27]:
# You can pass in a list of index locations to the index operation:
george.iloc[[0,1]]


1968    10
1969     7
Name: George Songs, dtype: int64

In [None]:
george.loc['1970']  # Raises KeyError: the label '1970' is not in the index
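
When a label may be missing, the KeyError from `.loc` can be caught or avoided. The following sketch shows three options on the same `george` Series; the default value `0` passed to `Series.get` is an arbitrary choice for illustration.

```python
import pandas as pd

george = pd.Series([10, 7], index=['1968', '1969'], name='George Songs')

# Option 1: catch the KeyError raised by .loc for a missing label.
try:
    george.loc['1970']
    found = True
except KeyError:
    found = False

# Option 2: `in` tests membership against the index labels, not the values.
has_1970 = '1970' in george

# Option 3: Series.get returns a default instead of raising.
songs_1970 = george.get('1970', 0)
songs_1968 = george.get('1968', 0)

print(found, has_1970, songs_1970, songs_1968)
```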


In [32]:
# .at and .iat
"""
The .at and .iat index accessors are analogous to .loc and .iloc, but they
are meant for pulling out a single value. Older references say that .at
returns a numpy.ndarray for a duplicated label; in the pandas version used
here it returns a Series, just as .loc does:

"""
george_dupe = pd.Series([10, 7, 1, 22],
                        index=['1968', '1969', '1970', '1970'],
                        name = 'George Songs')
george_dupe

1968    10
1969     7
1970     1
1970    22
Name: George Songs, dtype: int64

In [38]:
george_dupe.at['1970']
# with a duplicated label, .at returns a Series here, not a numpy.ndarray

1970     1
1970    22
Name: George Songs, dtype: int64

In [40]:
george_dupe.loc['1970']
# .loc likewise returns a Series for a duplicated label:

1970     1
1970    22
Name: George Songs, dtype: int64
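
For comparison, here is a sketch of `.at` and `.iat` on the original `george` Series, whose labels are unique. In that case both accessors return a single scalar; the variable names are illustrative.

```python
import pandas as pd

george = pd.Series([10, 7], index=['1968', '1969'], name='George Songs')

by_label = george.at['1968']     # label-based, like .loc
by_position = george.iat[-1]     # position-based, like .iloc

# With unique labels, neither accessor returns a Series.
print(by_label, isinstance(by_label, pd.Series))
print(by_position, isinstance(by_position, pd.Series))
```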