- The first main pandas data type we will learn about is the Series. Let's import pandas and explore the Series object.

- A Series is very similar to a NumPy array (in fact, it is built on top of the NumPy array object). What differentiates a Series from a NumPy array is that a Series can have axis labels, meaning it can be indexed by a label instead of just a numeric position. It also doesn't need to hold numeric data; it can hold any arbitrary Python object.

Let's explore this concept through some examples:

In [2]:
import numpy as np
import pandas as pd

# <u>Definition:

    - A Series is very similar to a NumPy array (in fact, it is built on top of the NumPy array object). What differentiates a Series from a NumPy array is that a Series can be indexed by a label instead of just a numeric position.
    - It can hold any arbitrary Python object.

# <u>Creating a Series.

    We can convert a list, NumPy array, or dictionary to a Series.

    Syntax: pd.Series(data = None, index = None)

    There are more arguments, but for now we are exploring just data and index.

In [5]:
labels = ['a', 'b', 'c']
my_list = [10, 20, 30]
arr = np.array([10, 20, 30])
d = {'a':10, 'b':20, 'c':30}

- <u>**Using Lists**

In [7]:
pd.Series(data = my_list)

0    10
1    20
2    30
dtype: int64

- <u>NOTE:

    - The Series looks a lot like a NumPy array, except that it has an index.

The key feature of a pandas Series is that we can specify what we want that index to be (**a Series with a distinct label index**):

In [10]:
pd.Series(data = my_list, index = labels)

a    10
b    20
c    30
dtype: int64

- <u>NOTE:

    - We now have distinct labels alongside the actual data points.
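
To make the separation between labels and data points concrete, the sketch below pulls both halves out of a labeled Series. It relies on the `index` attribute and the `to_numpy()` method; the variable names are only illustrative.

```python
import pandas as pd

s = pd.Series([10, 20, 30], index=['a', 'b', 'c'])

# The labels live in the index...
index_labels = list(s.index)

# ...and the data points live in an underlying NumPy array.
values = s.to_numpy()

print(index_labels)
print(values)
```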

`data` and `index` are the first two positional parameters, in that order, so we can pass them without writing `data =` or `index =`.

In [13]:
pd.Series(my_list, labels)

a    10
b    20
c    30
dtype: int64

- <u>**NumPy Arrays**

In [15]:
pd.Series(arr)

0    10
1    20
2    30
dtype: int32

In [16]:
pd.Series(arr, labels)

a    10
b    20
c    30
dtype: int32
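
The `int32` in these outputs comes from the array rather than from pandas: a Series built from a NumPy array keeps that array's dtype, and NumPy's default integer size can depend on the platform and NumPy version. A small sketch, assuming we set the dtype explicitly so the result is the same everywhere (the names `arr32` and `arr64` are illustrative):

```python
import numpy as np
import pandas as pd

# Explicit dtypes make the result independent of the platform default.
arr32 = np.array([10, 20, 30], dtype=np.int32)
arr64 = np.array([10, 20, 30], dtype=np.int64)

s32 = pd.Series(arr32)
s64 = pd.Series(arr64)

print(s32.dtype)  # int32
print(s64.dtype)  # int64
```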

- <u>**Dictionary**

In [18]:
pd.Series(d)

a    10
b    20
c    30
dtype: int64

- <u>NOTE:

    - pandas automatically uses the dictionary's keys as the index and each key's value as the corresponding data point.
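
One consequence worth seeing: if an `index` is passed together with a dictionary, pandas looks each label up among the dictionary keys instead of relabeling the values by position. The sketch below assumes a label `'z'` that is deliberately absent from the dictionary.

```python
import pandas as pd

d = {'a': 10, 'b': 20, 'c': 30}

# Labels are matched against the dictionary keys; 'z' has no key, so it gets NaN.
s = pd.Series(d, index=['c', 'a', 'z'])
print(s)
```

The values come back in the order of the requested labels, and the missing label shows up as NaN rather than raising an error.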

### <u>Data in a Series.

    A pandas Series can hold a variety of object types:

In [21]:
# Strings.

pd.Series(data = labels)

0    a
1    b
2    c
dtype: object

In [22]:
# Even built-in functions (although you are unlikely to need this).
# Like the strings above, these are stored with dtype object.

pd.Series([sum, print, len])

0      <built-in function sum>
1    <built-in function print>
2      <built-in function len>
dtype: object

---

# <u>Using an Index.

The key to using a Series is understanding its index. pandas uses these index names or numbers to allow fast lookups of information (the index works like a hash table or dictionary).

Let's see some examples of how to grab information from a Series. Let us create two Series, ser1 and ser2:

In [25]:
ser1 = pd.Series(data = [1,2,3,4], index = ['USA','Germany','USSR','Japan'])                                   

In [26]:
ser1

USA        1
Germany    2
USSR       3
Japan      4
dtype: int64

- <u>NOTE:

    - The dtype is int64 because all the data points are integers.

In [28]:
ser2 = pd.Series([1,2,5,4], ['USA','Germany','Italy','Japan'])                                   

In [29]:
ser2

USA        1
Germany    2
Italy      5
Japan      4
dtype: int64

In [30]:
# Works much like looking up a value in a Python dictionary.

ser1['USA']

1

- <u>NOTE:

    - We pass 'USA' as a string because the index labels are strings. What you pass depends on the data type of your index.
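
Because the index behaves much like the keys of a dictionary, a few dictionary-style idioms carry over. A brief sketch using the same data as `ser1`; the label `'France'` is a made-up example of a label that is not in the index.

```python
import pandas as pd

ser1 = pd.Series([1, 2, 3, 4], index=['USA', 'Germany', 'USSR', 'Japan'])

# `in` tests membership against the index labels, not the values.
has_usa = 'USA' in ser1
has_france = 'France' in ser1

# .get() returns a default instead of raising KeyError for a missing label.
france = ser1.get('France', 0)

print(has_usa, has_france, france)
```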

In [32]:
ser3 = pd.Series(labels)

In [33]:
ser3

0    a
1    b
2    c
dtype: object

In [34]:
ser3[1]

'b'

- <u>NOTE:

    - Here the dtype is object because the values are Python strings rather than numbers.
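
In `ser3` the default index is made of integers, so `ser3[1]` is still a label lookup that happens to coincide with a position. To keep the two ideas apart, pandas offers `.loc` for labels and `.iloc` for positions. A short sketch on a Series with string labels (the result variable names are illustrative):

```python
import pandas as pd

ser1 = pd.Series([1, 2, 3, 4], index=['USA', 'Germany', 'USSR', 'Japan'])

by_label = ser1.loc['Germany']   # look up by index label
by_position = ser1.iloc[1]       # look up by integer position

# Slicing with .loc includes the end label; .iloc excludes the end position.
label_slice = ser1.loc['USA':'USSR']
position_slice = ser1.iloc[0:2]

print(by_label, by_position)
print(list(label_slice.index), list(position_slice.index))
```

Using `.loc` and `.iloc` explicitly avoids ambiguity when the index labels are themselves integers.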

### Basic Operations:

Operations are also performed by matching index labels. Wherever a label has no match in the other Series, the result holds a null (NaN) at that label:

In [37]:
ser1

USA        1
Germany    2
USSR       3
Japan      4
dtype: int64

In [38]:
ser2

USA        1
Germany    2
Italy      5
Japan      4
dtype: int64

In [39]:
ser1 + ser2

Germany    4.0
Italy      NaN
Japan      8.0
USA        2.0
USSR       NaN
dtype: float64

- <u>NOTE:

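    - The result is float64 even though both inputs are int64. NaN is a floating-point value that the default integer dtype cannot represent, so pandas converts the integers to floats whenever index alignment introduces missing values. When every label matches, adding integer Series keeps the integer dtype.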
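
If a missing label should count as zero rather than produce NaN, the `add` method accepts a `fill_value`. The sketch below repeats the addition that way and also checks the dtype of a sum where every label matches. Treating an absent country as 0 is an assumption about what the data means, not something the original example states.

```python
import pandas as pd

ser1 = pd.Series([1, 2, 3, 4], index=['USA', 'Germany', 'USSR', 'Japan'])
ser2 = pd.Series([1, 2, 5, 4], index=['USA', 'Germany', 'Italy', 'Japan'])

# Labels present in only one Series are treated as 0 in the other.
filled = ser1.add(ser2, fill_value=0)
print(filled)

# With identical labels no NaN is introduced, so the integer dtype is kept.
same_labels = ser1 + ser1
print(same_labels.dtype)
```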

---