# Pandas Series Part II Lesson

In [1]:
import pandas as pd

<hr style="border:2px solid gray">

In this lesson, we will cover:
- Subsetting
- The ```.str``` attribute
    - ```.lower()```
    - ```.replace()```
- Chaining methods
- More Series methods
    - ```.any()```
    - ```.all()```
    - ```.isin()```
    - ```.apply()```

### Subsetting
<b>Subsetting</b>: We can subset a Series by label, by position, or with a boolean sequence.
- We subset when we only want to work with a certain part of our Series.
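
To make the three options concrete, here is a minimal sketch. The Series ```labeled``` and its string labels are made up for illustration; they are not part of the lesson data.

```python
import pandas as pd

# Hypothetical example data: three values with string labels
labeled = pd.Series([3, 1, 4], index=['a', 'b', 'c'])

# By label: .loc looks up the index label
by_label = labeled.loc['b']

# By position: .iloc counts from 0, regardless of the labels
by_position = labeled.iloc[0]

# By boolean sequence: keep only the rows where the mask is True
by_mask = labeled[labeled > 2]

print(by_label, by_position)
print(by_mask)
```

The rest of this section focuses on the boolean approach, since it is the one we use most often to filter data.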

In [2]:
#create a series of numbers
pi_series = pd.Series([3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5])

#call that variable
pi_series

0     3
1     1
2     4
3     1
4     5
5     9
6     2
7     6
8     5
9     3
10    5
dtype: int64

In [3]:
#create a boolean mask:
##True wherever the number in the series is greater than 5
booleans = pi_series > 5

#return the boolean values
booleans

0     False
1     False
2     False
3     False
4     False
5      True
6     False
7      True
8     False
9     False
10    False
dtype: bool

In [4]:
#return only the values
##where the boolean mask is True
pi_series[booleans]

5    9
7    6
dtype: int64

We can combine conditions to narrow or widen what our subset returns.

- The pipe ```|``` character means <b>or</b>: keep the values that meet either condition
- The ```&``` character means <b>and</b>: keep only the values that meet both conditions

In [5]:
# Find the numbers that are even or greater than 5
pi_series[(pi_series % 2 == 0) | (pi_series > 5)]

2    4
5    9
6    2
7    6
dtype: int64

In [7]:
#using and
pi_series[(pi_series % 2 == 0) & (pi_series > 5)]

7    6
dtype: int64

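The parentheses are required because ```|``` and ```&``` bind more tightly than comparisons such as ```==``` and ```>```. What if we did not want to use parentheses? We can assign each condition to its own variable first.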

In [9]:
#assign even variable
is_even = pi_series % 2 == 0

#assign greater than five variable
greater_than_five = pi_series > 5

#use the | to say either, or
pi_series[is_even | greater_than_five]

2    4
5    9
6    2
7    6
dtype: int64

As we can see, we get the same output as above.
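
For contrast, here is a small sketch of what happens if the parentheses are dropped without naming the conditions first. Python evaluates ```0 | pi_series``` before the comparisons, and the chained comparison that remains needs a single True or False, which a Series cannot provide.

```python
import pandas as pd

pi_series = pd.Series([3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5])

try:
    # Without parentheses, `0 | pi_series` is evaluated first, and the rest
    # becomes a chained comparison that pandas cannot reduce to one boolean
    pi_series[pi_series % 2 == 0 | pi_series > 5]
    raised = False
except ValueError as error:
    raised = True
    print(f'ValueError: {error}')
```

Either wrap each condition in parentheses or give each condition its own variable, as shown above.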

<hr style="border:1px solid black">

## The ```.str``` Attribute

In [10]:
#define the series of ds instructors
ds_team_series = pd.Series(['Adam', 'Amanda', 'Andrew','Brooke', 'John', 'John', 
                            'Madeleine','Margaret', 'Misty', 'Ryan', 'Tasha', 
                           ])

### ```.lower()```

In [11]:
ds_team_series.str.lower()

0          adam
1        amanda
2        andrew
3        brooke
4          john
5          john
6     madeleine
7      margaret
8         misty
9          ryan
10        tasha
dtype: object

In [12]:
#assign a new series
string_series = pd.Series(['Hello', 'CodeuP', 'StUDenTs'])

#take a look
string_series

0       Hello
1      CodeuP
2    StUDenTs
dtype: object

In [13]:
#.lower() lowercases every letter, not just the first one
string_series.str.lower()

0       hello
1      codeup
2    students
dtype: object

<hr style="border:1px solid grey">

### ```.replace()```

In [14]:
ds_team_series

0          Adam
1        Amanda
2        Andrew
3        Brooke
4          John
5          John
6     Madeleine
7      Margaret
8         Misty
9          Ryan
10        Tasha
dtype: object

In [15]:
#replace the substring 'rgaret' with 'ggie' (Margaret becomes Maggie)
ds_team_series = ds_team_series.str.replace('rgaret', 'ggie')

ds_team_series

0          Adam
1        Amanda
2        Andrew
3        Brooke
4          John
5          John
6     Madeleine
7        Maggie
8         Misty
9          Ryan
10        Tasha
dtype: object

In [16]:
#replace all the e's with _
string_series.str.replace('e', '_')

0       H_llo
1      Cod_uP
2    StUD_nTs
dtype: object
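
The examples above replace a literal substring. As a hedged extension that goes beyond the lesson's own cells, ```.str.replace()``` also accepts a ```regex=True``` argument so the first argument is treated as a regular expression. The pattern below is an illustrative choice.

```python
import pandas as pd

string_series = pd.Series(['Hello', 'CodeuP', 'StUDenTs'])

# Treat the first argument as a regular expression:
# replace every lowercase vowel with an underscore
no_vowels = string_series.str.replace('[aeiou]', '_', regex=True)

print(no_vowels)
```

Uppercase vowels are left alone here because the pattern only lists lowercase letters.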

<hr style="border:1px solid grey">

### Chaining methods

<b>We can also chain methods together on a Series!</b>

In [17]:
#assign a new series
string_series = pd.Series(['Hello', 'CodeuP', 'StUDenTs'])
string_series

0       Hello
1      CodeuP
2    StUDenTs
dtype: object

In [20]:
#lowercase AND replace
string_series.str.lower().str.replace('e', '_')

0       h_llo
1      cod_up
2    stud_nts
dtype: object

<b>We can even combine method chaining with indexing!</b>
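
A minimal sketch of chaining followed by indexing, reusing ```string_series``` from above. The variable names ```cleaned```, ```first_value```, ```last_two```, and ```first_letters``` are chosen here for illustration.

```python
import pandas as pd

string_series = pd.Series(['Hello', 'CodeuP', 'StUDenTs'])

# Chain two string methods: lowercase, then replace
cleaned = string_series.str.lower().str.replace('e', '_')

# Index the chained result by position
first_value = cleaned.iloc[0]

# Slice the chained result
last_two = cleaned.iloc[1:]

# .str also supports indexing into each string: grab every first letter
first_letters = string_series.str.lower().str[0]

print(first_value)
print(last_two)
print(first_letters)
```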

<hr style="border:1px solid grey">

## More Series Methods

### ```.any()```
Returns a single boolean: do <b>any</b> values in the series meet the condition?

In [22]:
pi_series

0     3
1     1
2     4
3     1
4     5
5     9
6     2
7     6
8     5
9     3
10    5
dtype: int64

In [21]:
(pi_series > 3).any()

True

### ```.all()```
Returns a single boolean: do <b>all</b> values in the series meet the condition?

In [23]:
(pi_series > 3).all()

False
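
Both methods work on the boolean Series that a comparison produces. The sketch below stores that intermediate Series in a variable (the name ```greater_than_three``` is an illustrative choice) and also counts the matches with ```.sum()```, since each True counts as 1.

```python
import pandas as pd

pi_series = pd.Series([3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5])

# The comparison gives one True/False per value
greater_than_three = pi_series > 3

any_match = greater_than_three.any()    # is at least one value True?
all_match = greater_than_three.all()    # is every value True?
match_count = greater_than_three.sum()  # how many values are True?

# A condition that every value meets
all_positive = (pi_series > 0).all()

print(any_match, all_match, match_count, all_positive)
```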

### ```.isin()```
Compares each item in a Series against a list of values: is this item found in the list? Returns a Series of boolean values. Our example uses strings, but the values can be of any type.

In [24]:
# Use `isin()` to tell whether each value is in a set of known values. 
vowels = list('aeiouy')
vowels

['a', 'e', 'i', 'o', 'u', 'y']

In [25]:
#create a list of letters
letters = list('abcdefghijkeliminnow')
letters

['a',
 'b',
 'c',
 'd',
 'e',
 'f',
 'g',
 'h',
 'i',
 'j',
 'k',
 'e',
 'l',
 'i',
 'm',
 'i',
 'n',
 'n',
 'o',
 'w']

In [26]:
#turn that into a series
letter_series = pd.Series(letters)
letter_series

0     a
1     b
2     c
3     d
4     e
5     f
6     g
7     h
8     i
9     j
10    k
11    e
12    l
13    i
14    m
15    i
16    n
17    n
18    o
19    w
dtype: object

In [27]:
#flag each letter as a vowel or not, then count each outcome
#we have 7 vowels and 13 others
letter_series.isin(vowels).value_counts()

False    13
True      7
dtype: int64
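
Because ```.isin()``` returns a boolean Series, it can also be used for subsetting, just like the comparisons at the start of the lesson. A minimal sketch follows; the ```~``` operator, which flips True and False, is not covered elsewhere in this lesson.

```python
import pandas as pd

vowels = list('aeiouy')
letter_series = pd.Series(list('abcdefghijkeliminnow'))

# Keep only the letters that are vowels
vowel_letters = letter_series[letter_series.isin(vowels)]

# ~ negates the mask, so this keeps everything that is not a vowel
other_letters = letter_series[~letter_series.isin(vowels)]

print(vowel_letters)
print(len(other_letters))
```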

### ```.apply()```
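Applies a function to each item in a Series. There are two common ways to supply the function:
- Define a named function, then pass it: ```series.apply(your_function)```
- Use a lambda: ```series.apply(lambda n: ...)```

Define the function, then call ```.apply(your_function)```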

Below we define a function, ```even_or_odd```, then reference that function when we call ```.apply()```. Notice that we are not calling ```even_or_odd``` ourselves. Instead, we pass the function itself to ```.apply()``` as an argument, and pandas calls it on every element of the Series.

In [28]:
def even_or_odd(n):
    '''
    this function takes a number and returns a string indicating 
    whether the passed number is even or odd
    '''
    if n % 2 == 0:
        return 'even'
    else:
        return 'odd'

pi_series.apply(even_or_odd)

0      odd
1      odd
2     even
3      odd
4      odd
5      odd
6     even
7     even
8      odd
9      odd
10     odd
dtype: object

<b>Using a lambda instead</b>
- It is also very common to see lambda functions used with ```.apply()```. We could rewrite the above example with a lambda function like so:

In [29]:
pi_series.apply(lambda n: 'even' if n % 2 == 0 else 'odd')

0      odd
1      odd
2     even
3      odd
4      odd
5      odd
6     even
7     even
8      odd
9      odd
10     odd
dtype: object