# Series

The first main data type we will learn about for pandas is the Series data type. Let us import pandas and explore the Series object.

A Series is very similar to a NumPy array (in fact, it is built on top of the NumPy array object). What differentiates a Series from a NumPy array is that a Series can have axis labels, meaning it can be indexed by a label instead of only by a numeric position. It also doesn't need to hold numeric data; it can hold any arbitrary Python object.

Let's explore this concept through some examples:

In [1]:
import numpy as np
import pandas as pd

##  Creating a Series

You can convert a Python list, a NumPy array, or a Python dictionary to a pandas Series object.

In [2]:
# Create a list of characters which will act as labels
labels = ['a', 'b', 'c']

# Create a list of numbers which will be associated with the labels
my_list = [10, 20, 30]

# Create a NumPy array from the list
arr = np.array(my_list)

# Create a Python dictionary with labels as keys and numbers as values
d = {'a': 10, 'b': 20, 'c': 30}

Creating a Series object using lists:

In [3]:
# Convert a Python list to a Pandas series without specifying the index labels
pd.Series(data=my_list)

0    10
1    20
2    30
dtype: int64

The Series above looks very similar to a NumPy array, where 0, 1, 2 are the indices and 10, 20, 30 are the data values. The key feature of a pandas Series is that we can specify the index labels ourselves.

In [4]:
# Convert a Python list to a Pandas series. Also specify the index labels of our choice.
pd.Series(data=my_list, index=labels)

a    10
b    20
c    30
dtype: int64

This Series is almost the same as the previous one, except that here we have specified index labels of our own choice. Consequently, we can access the data values in the Series using those labels.
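Label-based access can be sketched as follows. This is a minimal illustration that rebuilds the same `labels` and `my_list` as above; the names `s`, `value_b`, and `subset` are introduced here only for the example.

```python
import pandas as pd

labels = ['a', 'b', 'c']
my_list = [10, 20, 30]

# `s` is a hypothetical name used only for this illustration
s = pd.Series(data=my_list, index=labels)

# Look up a single value by its index label
value_b = s['b']

# Look up several values at once by passing a list of labels
subset = s[['a', 'c']]

print(value_b)
print(subset)
```

Passing a single label returns the stored value, while passing a list of labels returns a smaller Series that keeps those labels as its index.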

In [5]:
# Convert a NumPy array to a Series object without specifying the index labels
pd.Series(data=arr)

0    10
1    20
2    30
dtype: int32

In [6]:
# Convert a NumPy array to a pandas Series object. Also specify the index labels of our choice.
pd.Series(data=arr, index=labels)

a    10
b    20
c    30
dtype: int32

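After converting both a Python list and a NumPy array to a pandas Series, we can see that the two behave the same way as far as passing them to `pd.Series` is concerned. The only visible difference in the outputs above is the dtype (int32 versus int64), which comes from the NumPy array's own integer type and can vary by platform.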
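One way to check this equivalence is to compare the two results directly. The sketch below assumes the same `my_list` and `arr` as above; the names `from_list`, `from_arr`, `same_values`, and `same_index` are illustrative only. It compares values and index rather than dtype, because the integer dtype of the array-based Series may depend on the platform and NumPy version.

```python
import numpy as np
import pandas as pd

my_list = [10, 20, 30]
arr = np.array(my_list)

# Build one Series from the list and one from the NumPy array
from_list = pd.Series(data=my_list)
from_arr = pd.Series(data=arr)

# Element-wise comparison of the values, reduced to a single boolean
same_values = bool((from_list == from_arr).all())

# Both Series get the same default whole-number index
same_index = from_list.index.equals(from_arr.index)

print(same_values, same_index)
# The dtypes are printed separately because they may differ by platform
print(from_list.dtype, from_arr.dtype)
```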

In [7]:
# Convert a Python dictionary into a Pandas Series object.
pd.Series(data=d)

a    10
b    20
c    30
dtype: int64

Here we can see that pandas automatically uses the keys of the dictionary as the index labels of the resulting Series.
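When a dictionary is combined with an explicit `index` argument, pandas picks the values by key. The following is a small sketch of that behaviour, assuming the same dictionary `d` as above; the label `'z'` and the name `picked` are made up for this illustration.

```python
import pandas as pd

d = {'a': 10, 'b': 20, 'c': 30}

# Ask for two existing keys and one key ('z') that is not in the dictionary.
# 'z' is a hypothetical label chosen only to show the missing-key case.
picked = pd.Series(data=d, index=['a', 'c', 'z'])

print(picked)
```

Keys that are not listed in `index` (here `'b'`) are left out, and labels with no matching key (here `'z'`) get NaN. Because NaN is a floating-point value, the resulting dtype becomes float64.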

## Data in a Series 

Another thing that sets a pandas Series apart from a typical numeric NumPy array is that a Series can hold a wide variety of data types.

In [8]:
# A pandas series holding numbers
pd.Series(data=arr)

0    10
1    20
2    30
dtype: int32

In [9]:
# A pandas series holding characters
pd.Series(data=labels)

0    a
1    b
2    c
dtype: object

As we can see, the default index of a pandas Series consists of whole numbers starting at 0.

In [10]:
# We can even pack functions into a Series object
pd.Series(data=[sum, print, len])
# Although it is highly unlikely that we will use this often

0      <built-in function sum>
1    <built-in function print>
2      <built-in function len>
dtype: object

As shown above, a Series can hold even built-in functions. In that case the dtype is object, just as it is for strings.
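The dtype reported for a Series depends on the values it holds. The sketch below illustrates this with three small Series; the names `ints`, `floats`, and `mixed` and their contents are invented for the example.

```python
import pandas as pd

# Hypothetical example data, chosen only to show different dtypes
ints = pd.Series(data=[10, 20, 30])      # whole numbers only
floats = pd.Series(data=[1.5, 2.5])      # floating-point numbers only
mixed = pd.Series(data=[10, 'a', 3.5])   # numbers and a string together

print(ints.dtype)
print(floats.dtype)
print(mixed.dtype)
```

When the values do not share a single numeric type, pandas falls back to the general-purpose object dtype, and each element keeps its original Python type.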

## Using an index 

The key to using a Series is understanding its index. Pandas uses these index names or numbers to allow fast lookups of information (the index works like a hash table or dictionary).

Let us see some examples of how to grab information from a series. Let us create two series, ser1 and ser2:

In [11]:
ser1 = pd.Series(data=[1, 2, 3, 4], index=['USA', 'Germany', 'USSR', 'Japan'])

In [12]:
ser1

USA        1
Germany    2
USSR       3
Japan      4
dtype: int64

In [13]:
ser2 = pd.Series(data=[1, 2, 5, 4], index=['USA', 'Germany', 'Italy', 'Japan'])

In [14]:
ser2

USA        1
Germany    2
Italy      5
Japan      4
dtype: int64

Now, if we want to grab information out of a Series, it works very much like getting information out of a dictionary.

In [15]:
# To extract the data value 1 from the first series (ser1), we specify the index label 'USA'
ser1['USA']

# Here, the index label USA is mentioned in quotes because we know that the indices are 
# strings. By default they are whole numbers

1
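The dictionary analogy extends to membership tests and safe lookups. The following is a minimal sketch that rebuilds `ser1` as defined above; the names `has_usa`, `has_italy`, `missing`, and `fallback` are illustrative only.

```python
import pandas as pd

ser1 = pd.Series(data=[1, 2, 3, 4], index=['USA', 'Germany', 'USSR', 'Japan'])

# `in` checks the index labels, not the data values
has_usa = 'USA' in ser1
has_italy = 'Italy' in ser1

# .get() returns None (or a supplied default) instead of raising KeyError
missing = ser1.get('Italy')
fallback = ser1.get('Italy', 0)

print(has_usa, has_italy, missing, fallback)
```

Using `ser1['Italy']` directly would raise a `KeyError`, just as it would for a dictionary with no such key.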

In [16]:
ser3 = pd.Series(data=labels)

In [17]:
ser3

0    a
1    b
2    c
dtype: object

The **dtype** is object, which here reflects the fact that the data values are strings.

In [19]:
# To extract the first data value 'a' we use the index 0
ser3[0]

'a'
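When the index labels are not whole numbers, it helps to be explicit about whether a lookup is by label or by position. The sketch below uses the `.loc` and `.iloc` accessors on a rebuilt `ser1`; the variable names are introduced only for this illustration.

```python
import pandas as pd

ser1 = pd.Series(data=[1, 2, 3, 4], index=['USA', 'Germany', 'USSR', 'Japan'])

# .loc looks up by index label
by_label = ser1.loc['USSR']

# .iloc looks up by whole-number position (0-based)
by_position = ser1.iloc[2]

# Negative positions count from the end, as with Python lists
last = ser1.iloc[-1]

print(by_label, by_position, last)
```

Here `'USSR'` is the third label, so the label lookup and the position lookup return the same value.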

Basic operations on pandas Series objects are also index-based.

In [20]:
ser1 + ser2

Germany    4.0
Italy      NaN
Japan      8.0
USA        2.0
USSR       NaN
dtype: float64

The `+` operator matches index labels across the two Series (the operands) and adds the corresponding data values. So the values associated with shared labels like 'USA', 'Japan' and 'Germany' get added, whereas labels that appear in only one Series, like 'Italy' and 'USSR', get NaN. Because NaN is a floating-point value, the result has dtype float64.
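If NaN is not the desired result for non-matching labels, one option is the `add` method with its `fill_value` parameter, which substitutes a value for a label missing from one of the operands. The following is a sketch under that assumption, rebuilding `ser1` and `ser2` as defined above; the names `plain_sum` and `total` are illustrative only.

```python
import pandas as pd

ser1 = pd.Series(data=[1, 2, 3, 4], index=['USA', 'Germany', 'USSR', 'Japan'])
ser2 = pd.Series(data=[1, 2, 5, 4], index=['USA', 'Germany', 'Italy', 'Japan'])

# Plain addition: labels present in only one Series produce NaN
plain_sum = ser1 + ser2

# Treat a label missing from one Series as 0 before adding
total = ser1.add(ser2, fill_value=0)

print(plain_sum)
print(total)
```

With `fill_value=0`, 'Italy' and 'USSR' keep the value from the one Series that contains them instead of becoming NaN.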