# Series

The first main pandas data type we will learn about is the **Series**.

A _Series_ is very similar to a NumPy array (in fact, it is built on top of the NumPy array object). What differentiates a Series from a NumPy array is that _a Series can have axis labels_, meaning it can be indexed by a label instead of just a numeric position. It also doesn't need to hold numeric data: _it can hold any arbitrary Python object_.

Let's explore this concept:

In [1]:
# First import the needed libraries
import numpy as np
import pandas as pd

### I. Creating a Series

You can convert a list, NumPy array, or dictionary to a Series:

In [7]:
labels = ['a','b','c']

# Creating the variable objects
my_list = [10,20,30]
arr = np.array([10,20,30])
d = {'a':10,'b':20,'c':30}

#### i. Using Lists

In [3]:
pd.Series(data=my_list)

0    10
1    20
2    30
dtype: int64

In [4]:
# We can customize the index with the labels variable we created
pd.Series(data=my_list,index=labels)

a    10
b    20
c    30
dtype: int64

#### ii. Using NumPy Arrays

In [5]:
# Customize the index with the labels variable
pd.Series(arr,labels)

a    10
b    20
c    30
dtype: int64
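Since a Series is described above as being built on top of a NumPy array, it may help to see the two pieces separately. The following is a minimal sketch, assuming a recent pandas version where `Series.to_numpy()` is available; the variable names are only illustrative.

```python
import numpy as np
import pandas as pd

arr = np.array([10, 20, 30])
ser = pd.Series(arr, index=['a', 'b', 'c'])

# The data of the Series as a plain NumPy array
values = ser.to_numpy()

# The axis labels that the plain array does not have
index_labels = list(ser.index)

# NumPy functions accept a Series directly
total = np.sum(ser)

print(values, index_labels, total)
```

The labels live in `ser.index`, while the data itself can still be handed to NumPy functions.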

#### iii. Using Dictionaries

In [8]:
# The index doesn't need to be specified because the dictionary keys are used as labels.
pd.Series(d)

a    10
b    20
c    30
dtype: int64
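A dictionary can also be combined with an explicit `index`. As a rough sketch of that behavior (the label `'z'` is a made-up example that is not in the dictionary), pandas selects the values whose keys match the requested labels and fills the rest with NaN:

```python
import pandas as pd

d = {'a': 10, 'b': 20, 'c': 30}

# Only the labels listed in `index` are kept, in the order given.
# 'z' is not a key of d, so its value becomes NaN.
selected = pd.Series(d, index=['a', 'c', 'z'])

print(selected)
```

Because NaN is a floating-point value, the resulting Series is stored as floats rather than integers.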

### II. Data in a Series

A pandas Series can hold a variety of object types. Up to this point, the values were all numeric, but a Series can also hold strings or any other Python object:

In [9]:
pd.Series(data=labels)

0    a
1    b
2    c
dtype: object

In [10]:
# Even functions (though unlikely to be used)
pd.Series([sum,print,len])

0      <built-in function sum>
1    <built-in function print>
2      <built-in function len>
dtype: object

### III. Using an Index

The key to using a Series is understanding its index. pandas uses these index labels (names or numbers) for fast lookups of information, much like a hash table or dictionary.

Let's see some examples of how to grab information from a Series:

In [11]:
ser1 = pd.Series([1,2,3,4],index = ['USA', 'Germany','USSR', 'Japan'])
ser1

USA        1
Germany    2
USSR       3
Japan      4
dtype: int64

In [12]:
ser2 = pd.Series([1,2,5,4],index = ['USA', 'Germany','Italy', 'Japan'])
ser2

USA        1
Germany    2
Italy      5
Japan      4
dtype: int64

In [13]:
# Extract the value for the 'USA' label
ser1['USA']

1
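Besides the square-bracket lookup, pandas offers explicit accessors for label-based and position-based selection. The sketch below reuses the `ser1` data from this section; the variable names are illustrative only.

```python
import pandas as pd

ser1 = pd.Series([1, 2, 3, 4], index=['USA', 'Germany', 'USSR', 'Japan'])

# Label-based lookup: select by index label
by_label = ser1.loc['Germany']

# Position-based lookup: select by integer position (0-based)
by_position = ser1.iloc[1]

# Selecting several labels at once returns a new Series
subset = ser1.loc[['USA', 'Japan']]

# Membership tests check the index labels, as with dictionary keys
has_italy = 'Italy' in ser1

print(by_label, by_position, has_italy)
print(subset)
```

Using `.loc` and `.iloc` makes it explicit whether a label or a position is meant, which matters most when the index itself consists of integers.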

Operations between Series are also aligned on the index:

In [17]:
# NOTE:
# NaN appears for any label that doesn't exist in every Series being operated on.
#
# NOTE:
# Because NaN is a floating-point value, the integer results are converted to floats.
#
ser1 * ser2

Germany     4.0
Italy       NaN
Japan      16.0
USA         1.0
USSR        NaN
dtype: float64
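If NaN is not the desired result for unmatched labels, the method form of the operator accepts a `fill_value`. The following is a hedged sketch using the same two Series; choosing `1` as the fill value is just an assumption for this example (it leaves the other operand unchanged under multiplication).

```python
import pandas as pd

ser1 = pd.Series([1, 2, 3, 4], index=['USA', 'Germany', 'USSR', 'Japan'])
ser2 = pd.Series([1, 2, 5, 4], index=['USA', 'Germany', 'Italy', 'Japan'])

# Operator form: unmatched labels become NaN, so the result is float
product = ser1 * ser2

# Method form: a label missing from one Series is treated as 1
product_filled = ser1.mul(ser2, fill_value=1)

# When both operands share exactly the same labels, no NaN is introduced
# and the integer dtype is preserved
same_labels = ser1 * ser1

print(product_filled)
print(same_labels.dtype)
```

The last case shows that the conversion to floats comes from the missing labels, not from the arithmetic itself.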

Let's move on to _DataFrames_, which will expand on the concept of Series...