# Exercise 13 (Pandas - Series)
Dated: May 1, 2020

***A good resource***
https://www.geeksforgeeks.org/python-pandas-series/

# Series
The first main data type we will learn about for pandas is the Series data type.

A Series is very similar to a NumPy array (in fact, it is built on top of the NumPy array object). What differentiates a Series from a NumPy array is that a Series can have axis labels, meaning it can be indexed by a label instead of just a numeric position. It also doesn't need to hold numeric data; it can hold any arbitrary Python object.
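To make the difference concrete, here is a minimal sketch (the names `arr`, `ser` and `mixed` are made up for illustration) that attaches labels to a NumPy array and then stores non-numeric objects in a Series:

```python
import numpy as np
import pandas as pd

# A plain NumPy array is addressed by integer position only.
arr = np.array([10, 20, 30])

# The same values as a Series, with a string label attached to each position.
ser = pd.Series(arr, index=["a", "b", "c"])

by_label = ser["b"]        # look up by label
by_position = ser.iloc[1]  # look up by integer position

# A Series is not restricted to numbers: here it holds a string,
# a float and a built-in function, so pandas falls back to dtype object.
mixed = pd.Series(["text", 3.5, len])

print(by_label, by_position)
print(mixed.dtype)
```

Both lookups return the same element; `.iloc` is the explicit way to ask for a position when the index consists of labels.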


### Creating a Series

You can convert a list, NumPy array, or dictionary to a Series:

In [1]:
# Import Pandas and NumPy

In [2]:
# Create following 
# A simple list of 'labels' string containing (a,b,c)
# A simple list of integers containing (10,20,30)
# A Numpy array containing (10,20,30)
# A dictionary containing (10,20,30) as values and (a,b,c) as labels

In [54]:
import pandas as pd 
import numpy as np

# A simple list of 'labels' string containing (a,b,c)
S1 = pd.Series(['a','b','c'])
print(S1)
print('\n')

# A simple list of integers containing (10,20,30)
S2 = pd.Series([10,20,30])
print(S2)
print('\n')

# A Numpy array containing (10,20,30)
x = np.arange(10,40,10)
print(x)
print('\n')

# A dictionary containing (10,20,30) as values and (a,b,c) as labels
D = dict(zip(S1, S2))
print(D)

0    a
1    b
2    c
dtype: object


0    10
1    20
2    30
dtype: int64


[10 20 30]


{'a': 10, 'b': 20, 'c': 30}


In [3]:
# Create a Series from the simple (not NumPy) list of integers

0    10
1    20
2    30
dtype: int64

In [35]:
# Use a name other than 'list' so the built-in list type is not shadowed
values = [10,20,30]
S3 = pd.Series(values)
S3

0    10
1    20
2    30
dtype: int64

In [4]:
# Set the indexes equal to the list of labels that we created before

a    10
b    20
c    30
dtype: int64

In [43]:
# Use a name other than 'list' so the built-in list type is not shadowed
values = [10,20,30]
S3 = pd.Series(values)
S3.index = S1
S3

a    10
b    20
c    30
dtype: int64

***NumPy arrays to Series***

In [5]:
# Convert the numpy array to Series

0    10
1    20
2    30
dtype: int64

In [44]:
data = np.array([10,20,30])
S2 = pd.Series(data)
print(S2)

0    10
1    20
2    30
dtype: int32
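The `int32` here, compared with `int64` in the expected output, is most likely a platform difference rather than a mistake: a Series built from a NumPy array keeps the array's dtype, and NumPy's default integer type depends on the platform and NumPy version. A small sketch, assuming you want the same dtype everywhere, is to request it explicitly:

```python
import numpy as np
import pandas as pd

# Ask NumPy for 64-bit integers instead of relying on the platform default.
data = np.array([10, 20, 30], dtype=np.int64)
ser = pd.Series(data)

# An existing int32 Series can also be widened after the fact.
narrow = pd.Series(np.array([10, 20, 30], dtype=np.int32))
widened = narrow.astype("int64")

print(ser.dtype, narrow.dtype, widened.dtype)
```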


In [6]:
# Convert the numpy array to Series and set the indexes with the labels list

a    10
b    20
c    30
dtype: int64

In [55]:
data1 = ['a','b','c']
S1 = pd.Series(data1)

data2 = np.array([10,20,30])
S2 = pd.Series(data2)

S2.index = S1
S2


a    10
b    20
c    30
dtype: int32

***Dictionary to Series***

In [7]:
# Convert the dictionary to Series

a    10
b    20
c    30
dtype: int64

In [58]:
D = dict(zip(S1, S2))
print(D)

S4 = pd.Series(D)
print(S4)


{'a': 10, 'b': 20, 'c': 30}
a    10
b    20
c    30
dtype: int64


### Data in a Series

A pandas Series can hold a variety of object types:

In [8]:
# Create a Series of the labels list that we created above.

In [9]:
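# S4 must exist in the current session; recreate it before reading its index
S4 = pd.Series({'a': 10, 'b': 20, 'c': 30})
S5 = pd.Series(S4.index)
S5

0    a
1    b
2    c
dtype: object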

In [10]:
# A Series can even contain functions (although you are unlikely to use this)
pd.Series([sum,print,len])
# Leaving the code here intentionally

0      <built-in function sum>
1    <built-in function print>
2      <built-in function len>
dtype: object

## Using an Index

The key to using a Series is understanding its index. Pandas uses these index names or numbers for fast lookups of information, much like the keys of a dictionary.
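The dictionary comparison can be sketched as follows. The Series `ser` and the default value `-1` are made up for illustration:

```python
import pandas as pd

ser = pd.Series([1, 2, 3, 4], index=["USA", "Germany", "USSR", "Japan"])

has_japan = "Japan" in ser          # membership tests the index labels, like dict keys
japan = ser.loc["Japan"]            # explicit label-based lookup
missing = ser.get("Italy", -1)      # .get returns a default instead of raising KeyError
subset = ser.loc[["USA", "Japan"]]  # a list of labels returns a smaller Series

print(has_japan, japan, missing)
print(subset)
```

Unlike a dictionary, a Series also keeps its entries in order and accepts a list of labels in a single lookup.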

Let's see how to grab information from a Series. Let us create two Series, ser1 and ser2:

In [11]:
# Your code goes here


In [12]:
# Recreate this Series (ser1)

In [19]:
# Your code goes here
import pandas as pd
import numpy as np

country = ['USA','Germany','USSR','Japan']
S6 = pd.Series(country)

Data = np.array([1,2,3,4])
Ser1 = pd.Series(Data)

Ser1.index = S6
Ser1

USA        1
Germany    2
USSR       3
Japan      4
dtype: int32

In [20]:
# Recreate this Series (ser2)

In [21]:
country = ['USA','Germany','Italy','Japan']
S6 = pd.Series(country)

Data = np.array([1,2,5,4])
Ser2 = pd.Series(Data)

Ser2.index = S6
Ser2

USA        1
Germany    2
Italy      5
Japan      4
dtype: int32

In [22]:

# put in notes
import pandas as pd
Ser3 = pd.Series([1,2,5,4], index = ['USA','Germany','Italy','Japan'])
Ser3



USA        1
Germany    2
Italy      5
Japan      4
dtype: int64

In [23]:
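# Alternative: I was not sure how to replace an entire entry, so I did it in two steps:
# rename the 'USSR' label to 'Italy' in place, then replace the value 3 with 5.
# Note: replace() returns a new Series, so Ser1 itself still holds 3 unless the result is assigned back.
# Locating 'USSR' first and assigning the new value at that position would probably be cleaner.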

Ser1.rename(index={'USSR':'Italy'},inplace=True)
Ser1.replace(3,5)

USA        1
Germany    2
Italy      5
Japan      4
dtype: int32
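A possible single-object version of the two steps above, as a sketch rather than the intended solution (the lowercase names `ser1` and `ser2` are used here to avoid touching the notebook's variables): rename the label on a copy, then assign the new value through `.loc`.

```python
import pandas as pd

ser1 = pd.Series([1, 2, 3, 4], index=["USA", "Germany", "USSR", "Japan"])

# Without inplace=True, rename() returns a new Series and leaves ser1 untouched.
ser2 = ser1.rename(index={"USSR": "Italy"})

# Assigning through .loc changes the value stored under an existing label.
ser2.loc["Italy"] = 5

print(ser2)
```

Because the label is renamed rather than dropped and re-added, 'Italy' stays in the third position.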

In [24]:
# From series (ser1), grab the following information

In [25]:
# Use .iloc for positional access; Ser1[0] on a label index is deprecated in recent pandas
Ser1.iloc[0]

1

In [27]:
Ser1['USA']
Ser1['Japan']

4

In [19]:
# Same as above

4

In [78]:
# Positional access to the last element, equivalent to Ser1['Japan']
Ser1.iloc[3]

4

## Following is a list of operations that can be performed on Series.
The resource shared above contains all the information you need. It's a ***very good*** resource.

In [20]:
# Add Ser1 and Ser2 and observe the results

Germany    4.0
Italy      NaN
Japan      8.0
USA        2.0
USSR       NaN
dtype: float64

In [28]:
Ser1 = pd.Series([1,2,3,4], index = ['USA','Germany','USSR','Japan'])
Ser2 = pd.Series([1,2,5,4], index = ['USA','Germany','Italy','Japan'])

Ser1.add(Ser2)

Germany    4.0
Italy      NaN
Japan      8.0
USA        2.0
USSR       NaN
dtype: float64

***Write a short note on what you observed***

In [84]:
# Your answer goes here.

# - add() aligns the two Series on their index labels before summing
# - 'USSR' exists only in Ser1 and 'Italy' only in Ser2, so neither has a partner value
# - For those unmatched labels the result is NaN, and the dtype becomes float64 because NaN is a float
# - One way to handle the missing values is to pass fill_value=0
Ser1.add(Ser2,fill_value=0)


Germany    4.0
Italy      5.0
Japan      8.0
USA        2.0
USSR       3.0
dtype: float64
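The alignment behaviour noted above can be checked directly. This sketch rebuilds the two Series under lowercase names and lists which labels ended up without a partner:

```python
import pandas as pd

ser1 = pd.Series([1, 2, 3, 4], index=["USA", "Germany", "USSR", "Japan"])
ser2 = pd.Series([1, 2, 5, 4], index=["USA", "Germany", "Italy", "Japan"])

# The + operator aligns on labels exactly like ser1.add(ser2).
total = ser1 + ser2

# Labels present in only one of the two Series come out as NaN.
unmatched = total[total.isna()].index.tolist()

# fill_value=0 treats a missing partner as 0 instead of producing NaN.
filled = ser1.add(ser2, fill_value=0)

print(unmatched)
print(filled)
```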

### Series to list conversion

In [26]:
# Recreate the following Series

a    1
b    2
c    3
dtype: int64

In [85]:
data1 = ['a','b','c']
S1 = pd.Series(data1)
data2 = np.array([1,2,3])
S2 = pd.Series(data2)
S2.index = S1
S2

a    1
b    2
c    3
dtype: int32

In [86]:
# Convert Series into list 
S2.tolist()

[1, 2, 3]

In [33]:
# Create a list of values

[1, 2, 3]

In [None]:
# Unclear how this differs from the previous question; the same call gives the expected list
S2.tolist()

In [35]:
# Create a list of indices

['a', 'b', 'c']

In [89]:
S3 = S2.index.tolist()
S3

['a', 'b', 'c']

In [39]:
# Check type of newly created list

list

In [90]:
type(S3)

list
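As a recap of the conversions in this section, here is a short sketch (variable names are illustrative) that pulls the values and labels out as plain Python objects and then rebuilds the Series from them:

```python
import pandas as pd

ser = pd.Series([1, 2, 3], index=["a", "b", "c"])

values = ser.tolist()        # plain Python list of the values
labels = ser.index.tolist()  # plain Python list of the index labels
as_dict = ser.to_dict()      # label -> value mapping

# The two lists are enough to reconstruct an equal Series.
rebuilt = pd.Series(values, index=labels)

print(values, labels, as_dict)
print(rebuilt.equals(ser))
```

`to_dict()` is convenient when both labels and values are needed together, since it keeps each label paired with its value.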

## Good luck!