The pandas library was first developed by Wes McKinney in 2008 for data manipulation and analysis.

#### References:
    www.python.org
    www.numpy.org
    www.matplotlib.org
    https://pandas.pydata.org

#### Questions/feedback: petert@digipen.edu

# Chapter07: Pandas Series
## pandas
    - Series, Index, Values
    - Selection and Filtering
    
A **Series** is a one-dimensional labeled array capable of holding data of any type. It is similar to a column or a row in a spreadsheet.

### Import pandas:
    importing pandas as 'pd' is the standard convention among Python users
    importing the frequently used DataFrame and Series classes into the local namespace is optional, but considered good practice

In [None]:
import pandas as pd                     # 'pd' is the standard alias among Python users
#from pandas import DataFrame            # optional, good practice
#from pandas import Series               # optional, good practice

import numpy as np

### Series
    - one dimensional, similar to an array
    - a sequence (series) of values
    - associated (series of) data labels
##### Examples:

In [None]:
['apple', 2.7, 5, 'Friday', 42]

In [None]:
pd.Series(['apple', 2.7, 5, 'Friday', 42])

In [None]:
# create Series from a list of values
myseries = pd.Series(['apple', 2.7, 5, 'Friday', 42])
print(type(myseries))
myseries

In [None]:
l = ['apple', 2.7, 5, 'Friday', 42]

In [None]:
l[2]

In [None]:
print(['apple', 2.7, 5, 'Friday', 42])

In [None]:
type(myseries)

In [None]:
myseries[2]

Notes:
- the dtype is shown as object because the elements are a mix of ints, floats and strings
- there is an index line starting from 0
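
Both notes can be checked by inspecting the Series directly. The following is a minimal sketch; the name `mixed` is introduced here only for illustration and is not used elsewhere in this chapter.

```python
import pandas as pd

# 'mixed' is an illustrative name, not part of the chapter's own code
mixed = pd.Series(['apple', 2.7, 5, 'Friday', 42])

print(mixed.dtype)                         # dtype of the Series as a whole
print(list(mixed.index))                   # default labels, starting from 0
print([type(v).__name__ for v in mixed])   # Python type of each element
```

The dtype describes the container, while each element keeps its own Python type.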

Let's see what happens if we create a series using integers and floats:

In [None]:
# create Series from a numerical array of values (ints)
myseries = pd.Series(range(4))
myseries

In [None]:
# create Series from a numerical array of values (floats)
myseries = pd.Series(np.arange(4.0))  # using numpy array this time
myseries

Now let's mix: load a list of integers and floats:

In [None]:
# create Series from a mixed list of floats and ints
myseries = pd.Series([3.14, 2.71, 42, 101])
myseries
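
When ints and floats are mixed, pandas stores all of them as floats so the Series can keep a single numeric dtype. The sketch below compares the three cases side by side; the variable names are illustrative assumptions.

```python
import pandas as pd

# illustrative names, chosen only for this comparison
ints_only = pd.Series([1, 2, 3])
floats_only = pd.Series([1.0, 2.0, 3.0])
mixed_numbers = pd.Series([3.14, 2.71, 42, 101])

print(ints_only.dtype)       # an integer dtype
print(floats_only.dtype)     # a float dtype
print(mixed_numbers.dtype)   # ints were upcast, so this is a float dtype
print(mixed_numbers[2])      # the integer 42 is now stored as 42.0
```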

Now let's load a list of strings when creating a series:

In [None]:
# create Series from a list of strings
myseries = pd.Series(["first", "second", "third", "fourth"])
myseries

In [None]:
myseries[2]

In [None]:
print(type(myseries[2]))

Notes:
- The dtype of the series is reported as object (newer pandas releases may report a dedicated string dtype instead)
- Individual elements of the series are still recognized as strings:

In [None]:
print("The type of the third element in the series:", type(myseries[2]))

##### Specifying and reordering the index
Specify using a list of indices:

In [None]:
# specify a series of values and associated indices
myseries = pd.Series(["first | a", "second | c", "third | d", "fourth | b"], index=['a', 'c', 'd', 'b'])
myseries

Relabel using a list of indices:

In [None]:
myseries.index

In [None]:
# reassign the index labels; note that the order of the values does not change, only the index names
myseries.index = ['a', 'b', 'c', 'd']
myseries

Note that only the index labels changed; the values kept their original order, so each value is now paired with a different label.
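
If the goal is to reorder the Series so that each value travels with its label, `Series.reindex` may be the better fit. The sketch below contrasts the two approaches; the names `s`, `relabeled` and `reordered` are illustrative assumptions.

```python
import pandas as pd

s = pd.Series(["first | a", "second | c", "third | d", "fourth | b"],
              index=['a', 'c', 'd', 'b'])

# assigning to .index only renames the labels; values stay in place
relabeled = s.copy()
relabeled.index = ['a', 'b', 'c', 'd']

# reindex returns a new Series with values rearranged to follow the labels
reordered = s.reindex(['a', 'b', 'c', 'd'])

print(relabeled['b'])   # "second | c": the label no longer matches the value
print(reordered['b'])   # "fourth | b": the value moved with its label
```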

##### Reference values and indices of a Series:

In [None]:
# recreate the same Series
myseries = pd.Series(["first | a", "second | c", "third | d", "fourth | b"], index=['a', 'c', 'd', 'b'])
print("The series:")
print(myseries, "\n")

# retrieve index values
print('Index values:')
print(myseries.index)

# retrieve values of the series
print('\nValues in the series:')
print(myseries.values)

Both the indices and the values are iterable and can be accessed by position:

In [None]:
print("The second index is:", myseries.index[1])
print("The second value is:", myseries.values[1])
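
Because both are iterable, they can also be walked in a loop. The sketch below uses `Series.items()`, which yields (label, value) pairs, and shows that zipping the index with the values gives the same pairs; the names `s` and `pairs` are illustrative.

```python
import pandas as pd

s = pd.Series(["first | a", "second | c", "third | d", "fourth | b"],
              index=['a', 'c', 'd', 'b'])

# iterate over labels and values together
for label, value in s.items():
    print(label, '->', value)

# the same pairs, built from the index and the values separately
pairs = list(zip(s.index, s.values))
print(pairs[1])
```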

Reference and retrieve by index:

In [None]:
myseries['c']

In [None]:
# reference and retrieve by index
print('Reference a single value by its index:')
print(myseries['c'])
print('\nReference multiple values by their indices:')
# reference and retrieve by multiple indices
print(myseries[['c', 'a']])

In [None]:
l = ['a', 'c', 'a']
myseries[l]

In [None]:
pd.Series([3.14, 42, 2.71, 101])

In [None]:
numberSeries = pd.Series([3.14, 42, 2.71, 101])

In [None]:
numberSeries

In [None]:
numberSeries > 40

In [None]:
numberSeries[numberSeries > 40]

In [None]:
# retrieve values based on condition
numberSeries = pd.Series([3.14, 42, 2.71, 101])
print('List all values if they are greater than 40:')
numberSeries[numberSeries>40]
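
Conditions can also be combined. The sketch below assumes a Series named `numbers` (an illustrative name) and uses `&` for "and" and `~` for "not"; each comparison needs its own parentheses because `&` binds more tightly than `>` and `<`.

```python
import pandas as pd

numbers = pd.Series([3.14, 42, 2.71, 101])

# a boolean mask: True where both conditions hold
mask = (numbers > 3) & (numbers < 100)
selected = numbers[mask]
print(selected)

# ~ inverts the mask, selecting everything else
rest = numbers[~mask]
print(rest)
```

Note that the selected values keep their original index labels.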

Let's go back to the indices and values.

Swap the values so they follow their original indices:

In [None]:
# recreate the same Series
myseries = pd.Series(["first | a", "second | c", "third | d", "fourth | b"], 
                     index=['a', 'c', 'd', 'b'])
myseries

In [None]:
myseries.iloc[1]   # select by position; the labels here are strings, so use .iloc

In [None]:
# swap 2nd and 4th elements
t = myseries.iloc[1]   # store 2nd temporarily (.iloc selects by position)
myseries.iloc[1] = myseries.iloc[3]
myseries.iloc[3] = t
myseries

In [None]:
# now swap 3rd and 4th elements
t = myseries.iloc[2]   # store 3rd temporarily (.iloc selects by position)
myseries.iloc[2] = myseries.iloc[3]
myseries.iloc[3] = t
myseries

Notice that this time only the values moved; the index labels stayed where they were.
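
When the labels are not integers, plain square brackets with an integer can be ambiguous, and recent pandas versions discourage relying on them for positional access. The sketch below shows the explicit alternatives: `.loc` selects by label and `.iloc` selects by position. The name `s` is illustrative.

```python
import pandas as pd

s = pd.Series(["first | a", "second | c", "third | d", "fourth | b"],
              index=['a', 'c', 'd', 'b'])

print(s.loc['c'])    # by label     -> "second | c"
print(s.iloc[1])     # by position  -> "second | c"

print(s.loc['b'])    # by label     -> "fourth | b"
print(s.iloc[3])     # by position  -> "fourth | b"
```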

How do you handle such an operation when there is no built-in method? Write your own function!

In [None]:
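# swap takes two positions (n, m) and a series; it swaps both values and index labels in place
def swap(n, m, s):
    labels = s.index.tolist()                      # Index objects are immutable, so work on a list copy

    s.iloc[n], s.iloc[m] = s.iloc[m], s.iloc[n]    # swap the values by position
    labels[n], labels[m] = labels[m], labels[n]    # swap the labels by position
    s.index = labels                               # assign the rebuilt index back to the series

    print(s)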

In [None]:
# specify the original series again
myseries = pd.Series(["first | a", "second | c", "third | d", "fourth | b"], index=['a', 'c', 'd', 'b'])
myseries

Swap 2nd and 4th elements:

In [None]:
swap(1, 3, myseries)

Swap 3rd and 4th elements:

In [None]:
swap(2, 3, myseries)

Now both values and indices are swapped at the same time.
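
An alternative that leaves the original Series untouched is to build the new order as a list of positions and let `.iloc` return a rearranged copy. This is a sketch of one possible approach, not the only way to do it; the function name `swapped` and the variable names are hypothetical.

```python
import pandas as pd

def swapped(s, n, m):
    """Return a new Series with the rows at positions n and m exchanged."""
    order = list(range(len(s)))               # positions 0 .. len(s)-1
    order[n], order[m] = order[m], order[n]   # exchange the two positions
    return s.iloc[order]                      # values and labels move together

original = pd.Series(["first | a", "second | c", "third | d", "fourth | b"],
                     index=['a', 'c', 'd', 'b'])

result = swapped(original, 1, 3)
print(result)
print(original)   # unchanged
```

Returning a new Series instead of printing makes the result easy to reuse or chain with further operations.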

In [None]:
myseries = pd.Series(["first | a", "second first | a", "second | c", "third | d", "fourth | b"], 
                     index=['a', 'a', 'c', 'd', 'b'])
myseries

In [None]:
myseries['a']

In [None]:
myseries.iloc[2]   # select by position; the labels here are strings, so use .iloc

In [None]:
myseries.values

In [None]:
myseries.index
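
The cells above use an index with a repeated label. Index labels do not have to be unique, and a repeated label changes what a lookup returns. The sketch below makes this explicit; the name `dup` is illustrative.

```python
import pandas as pd

dup = pd.Series(["first | a", "second first | a", "second | c", "third | d", "fourth | b"],
                index=['a', 'a', 'c', 'd', 'b'])

print(dup.index.is_unique)      # False: the label 'a' appears twice
print(dup.index.duplicated())   # marks repeats after the first occurrence

print(type(dup['a']))           # a repeated label returns a Series
print(type(dup['c']))           # a unique label returns a single element
```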

#### Exercise 7.1:
Create a series:
- create a series using 5 random integers between 1 and 9
- display a pie chart using pd.Series.plot.pie()

In [None]:
# Exercise 7.1 code:



#### Exercise 7.2:
- create a series using 20 random integers between 1 and 9
- display a histogram using pd.Series.plot.hist()

In [None]:
# Exercise 7.2 code:



#### Exercise 7.3:
- create a series using 20 random integers between 1 and 9
- display a box plot using pd.Series.plot.box()

In [None]:
# Exercise 7.3 code:

