<h2>The Series Object</h2>
<h3>This is part 2 of pandas learning</h3>

In this notebook I will practice:

- Modules, Classes, and Instances
- Populating a Series with values
- Customizing the index
- Creating a Series with Missing Values
- Creating a Series from:
  - Dictionaries
  - Tuples
  - Sets
  - NumPy Arrays
- Retrieving First and Last Rows
- Mathematical Operations
  - Arithmetic
  - Broadcasting
- Passing Series to Built-in Functions

In [1]:
import numpy as np
import pandas as pd

<h4>Modules, Classes, and Instances</h4>

In [2]:
#Instantiate a Series Object
pd.Series()

Series([], dtype: float64)

<h4>Populating the Series with Values</h4>

In [3]:
#A list can be passed to the Series constructor
pet_list = ["Dog","Cat","Rat","Gecko"]
series = pd.Series(pet_list)

series

0      Dog
1      Cat
2      Rat
3    Gecko
dtype: object

<h4>Customizing the Index</h4>

Each element in a pandas Series has both a position (its integer location) and an index label.

In [4]:
#Create a new Series with days of the week as index labels
ice_cream_flavors = ['Chocolate', 'Vanilla', 'Strawberry', 'Raisin']

days_of_week = ("Monday", "Wednesday", "Friday", "Saturday")

pd.Series(data=ice_cream_flavors,index=days_of_week)

Monday        Chocolate
Wednesday       Vanilla
Friday       Strawberry
Saturday         Raisin
dtype: object
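
The difference between position and label can be sketched with `.iloc` (integer position) and `.loc` (index label). This is a minimal illustration that rebuilds the same Series under the assumed name `flavors`, which does not appear in the notebook above.

```python
import pandas as pd

# Same data as the cell above; `flavors` is a name chosen for this sketch
flavors = pd.Series(
    data=["Chocolate", "Vanilla", "Strawberry", "Raisin"],
    index=("Monday", "Wednesday", "Friday", "Saturday"),
)

by_label = flavors.loc["Friday"]   # look up by index label
by_position = flavors.iloc[2]      # look up by 0-based integer position

print(by_label, by_position)
```

Both lookups reach the same element, since "Friday" is the label of the third row.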

In [5]:
#Repeat series construction, but using duplicate index labels
days_of_week_2 = ('Monday','Wednesday','Friday','Monday')
pd.Series(data=ice_cream_flavors,index=days_of_week_2)

Monday        Chocolate
Wednesday       Vanilla
Friday       Strawberry
Monday           Raisin
dtype: object
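
Duplicate labels are allowed, but they change what a label lookup returns. The sketch below assumes the same data as the cell above, stored under the illustrative name `flavors`.

```python
import pandas as pd

# Same data as the cell above; `flavors` is a name chosen for this sketch
flavors = pd.Series(
    data=["Chocolate", "Vanilla", "Strawberry", "Raisin"],
    index=("Monday", "Wednesday", "Friday", "Monday"),
)

monday = flavors.loc["Monday"]   # two rows share this label, so a Series comes back
friday = flavors.loc["Friday"]   # only one row has this label, so a single value comes back

print(flavors.index.is_unique)   # reports whether every label is distinct
print(list(monday))
```

Checking `index.is_unique` before relying on label lookups is one way to avoid being surprised by a Series where a single value was expected.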

<h4>Creating a Series with Missing Values</h4>

In [6]:
#A Series can still be constructed when some values are missing (NaN); pandas infers the dtype from the data
pets = ['dog','cat',np.nan]
pd.Series(pets)

0    dog
1    cat
2    NaN
dtype: object
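
With numeric data the effect of a missing value on the inferred dtype is easier to see. The following is a small sketch using made-up numbers under the hypothetical name `temps`.

```python
import numpy as np
import pandas as pd

# Hypothetical numeric data with one missing entry
temps = pd.Series([94, 88, np.nan, 91])

print(temps.dtype)     # float64, because NaN is itself a float
print(temps.isna())    # boolean mask marking the missing entry
print(temps.count())   # counts only the non-missing values
```

Here the integers are stored as floats so that they can sit alongside NaN, and `.count()` skips the missing entry.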

<h3>Creating a Series from Python Objects</h3>

<h5>Dictionaries</h5>

In [7]:
#Passing a dictionary to the Series constructor uses its keys as index labels

color_moods = {
    "Blue":"Sad",
    "Red":"Energetic",
    "Yellow":"Happy",
    "Purple":"Calm"
}

colors = pd.Series(color_moods)
colors

Blue            Sad
Red       Energetic
Yellow        Happy
Purple         Calm
dtype: object

In [8]:
colors.values

array(['Sad', 'Energetic', 'Happy', 'Calm'], dtype=object)

In [9]:
colors.index

Index(['Blue', 'Red', 'Yellow', 'Purple'], dtype='object')

<h5>Tuples</h5>

In [10]:
#The Series constructor also accepts tuples
tuple_ex = ("Red","Blue","Green")
pd.Series(tuple_ex)

0      Red
1     Blue
2    Green
dtype: object

<h5>Sets</h5>

In [11]:
#A set is an unordered collection of unique values
#It is declared with curly braces like a dictionary, but holds single values instead of key-value pairs
#The Series constructor does not accept sets and raises a TypeError

s = {"Blue","Black"}
pd.Series(s)

TypeError: ignored

In [12]:
#Convert the set to a list before passing it to the Series constructor
#A set has no guaranteed order, so the row order may vary
s2 = list(s)
pd.Series(s2)

0     Blue
1    Black
dtype: object
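
If a predictable row order matters, one option is to sort the set while converting it. This is a sketch of that idea; `color_set` and `ordered` are names chosen for the example.

```python
import pandas as pd

color_set = {"Blue", "Black"}

# sorted() returns a list, so the result can go straight into the constructor
ordered = pd.Series(sorted(color_set))
print(ordered)
```

Sorting makes the output reproducible between runs, which plain `list(s)` does not promise.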

<h5>NumPy Arrays</h5>

In [13]:
#The Series constructor also accepts a 1D NumPy array
#Here: 10 random integers from 1 to 100 (the upper bound 101 is exclusive)
sample_list = np.random.randint(1,101,10)
pd.Series(sample_list)

0    70
1    29
2    89
3     2
4    78
5    70
6    16
7    72
8    88
9    42
dtype: int64

<h4>Retrieving the First and Last Rows</h4>

In [14]:
#First create a sample sequence of values using range
#range(start, stop, step), where stop is exclusive
#To get the first rows, use the .head() method
values = range(0,500,5)
nums = pd.Series(values)
nums.head(3)

0     0
1     5
2    10
dtype: int64

In [15]:
#To get the last rows, use the .tail() method
nums.tail(3)

97    485
98    490
99    495
dtype: int64
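
When no argument is given, `.head()` and `.tail()` both default to five rows. A quick sketch on the same range of values:

```python
import pandas as pd

nums = pd.Series(range(0, 500, 5))

print(len(nums))     # 100 values: 0, 5, ..., 495
print(nums.head())   # first five rows
print(nums.tail())   # last five rows
```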

<h3>Mathematical Operations</h3>

In [16]:
#A Series has many built-in methods for mathematical operations
numbers = range(1,11)
nums = pd.Series(numbers)
nums

0     1
1     2
2     3
3     4
4     5
5     6
6     7
7     8
8     9
9    10
dtype: int64

In [17]:
#Get the sum of all the values
nums.sum()

55

In [18]:
#Get the product of all the values
nums.product()

3628800

In [20]:
#And the summary statistics
the_mean = nums.mean() #average
the_median = nums.median() #middle value
the_mode = nums.mode() #most frequent value(s), returned as a Series
the_std = nums.std() #standard deviation
the_max = nums.max() #the maximum value
the_min = nums.min() #the minimum value
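
Since the cell above prints nothing, here is a short sketch of what a few of these calls return. The second Series, `repeats`, is made-up data added only to show `.mode()` when some values actually repeat.

```python
import pandas as pd

nums = pd.Series(range(1, 11))

print(nums.mean())     # 5.5
print(nums.median())   # 5.5

# Every value in nums occurs exactly once, so all ten are tied for the mode
print(nums.mode().tolist())

# Hypothetical data with repeated values: 2 and 3 each occur twice
repeats = pd.Series([1, 2, 2, 3, 3])
print(repeats.mode().tolist())   # [2, 3]
```

Because ties are possible, `.mode()` always returns a Series rather than a single number.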

In [21]:
#Alternatively, the .describe() method returns several statistics at once
nums.describe()

count    10.00000
mean      5.50000
std       3.02765
min       1.00000
25%       3.25000
50%       5.50000
75%       7.75000
max      10.00000
dtype: float64

In [23]:
#The .sample(n) method returns n randomly chosen rows
nums.sample(3)

4    5
6    7
7    8
dtype: int64

In [24]:
#The .nunique() method returns number of unique values
nums.nunique()

10
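
The arithmetic and broadcasting items from the outline can be sketched as follows. The Series `a` and `b` and their labels are invented for this example; the point is that a scalar is applied to every element, while two Series are matched up by index label.

```python
import pandas as pd

nums = pd.Series(range(1, 11))

doubled = nums * 2    # the scalar 2 is broadcast to every element
shifted = nums + 10   # same idea with addition

# Hypothetical Series with partly overlapping labels
a = pd.Series([1, 2, 3], index=["x", "y", "z"])
b = pd.Series([10, 20, 30], index=["y", "z", "w"])

# Values are added where labels match; labels found in only one Series give NaN
total = a + b

print(doubled.tolist())
print(total)
```

Only "y" and "z" appear in both Series, so those are the only labels with a numeric result in `total`.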

<h4>Passing the Series to Built-in Functions</h4>

In [31]:
#Some of Python's built-in operators and functions also work with a Series

#The 'in' keyword examines only the index labels, not the values
sample_group = pd.Series(["blue","red","black","yellow","white","green"])

"blue" in sample_group #Evaluates to False since "blue" is a value, not an index label


indexgroup = ["boo","foo","boo","foo","boo","foo"]
sample_group = pd.Series(["blue","red","black","yellow","white","green"], 
                         index=indexgroup)
"foo" in sample_group #Evaluates to True since "foo" is an index label

True

In [32]:
#To use 'in' on the values, check against the .values attribute

"blue" in sample_group.values

True
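
For checking several candidate values at once, `.isin()` is an alternative to `in` that works element by element. The sketch below reuses the Series from the cells above; `mask` and `has_blue` are names chosen for the example.

```python
import pandas as pd

sample_group = pd.Series(
    ["blue", "red", "black", "yellow", "white", "green"],
    index=["boo", "foo", "boo", "foo", "boo", "foo"],
)

in_index = "foo" in sample_group              # checks the index labels
in_values = "blue" in sample_group.values     # checks the values

# Element-wise membership test: one True/False per row
mask = sample_group.isin(["blue", "green"])

# Collapse an element-wise comparison to a single answer
has_blue = (sample_group == "blue").any()

print(mask.tolist())
```

The mask from `.isin()` can also be used to filter, for example `sample_group[mask]`.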