# Series

In [1]:
import pandas as pd

## Create a Series Object from a List
- A pandas **Series** is a one-dimensional labelled array.
- A Series combines the best features of a list and a dictionary.
- A Series maintains a single collection of ordered values (i.e. a single column of data).
- We can assign each value an identifier, which does not have to *be* unique.

In [3]:
ice_cream= ['chocolate', 'vanilla', 'strawberry', 'rum raisin']
pd.Series(ice_cream)

0     chocolate
1       vanilla
2    strawberry
3    rum raisin
dtype: object

- The final row, *dtype*, reports the type of the data stored in the Series. Although this Series contains only strings, pandas reports *object*, the dtype it uses here for strings and for more complex or mixed data.
- The *object* dtype applies only to strings and mixed data. Other data types get a dedicated dtype, such as *int64* for integers and *bool* for booleans.
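
As a quick check of the dtype rules above, the sketch below builds a few small Series and inspects their `dtype` attribute. The sample values are made up for illustration. Pure-string data is left out on purpose, since the dtype reported for strings may differ between pandas versions.

```python
import pandas as pd

# Sample values are illustrative only.
integers = pd.Series([10, 64, 44])
floats = pd.Series([1, 2.5])          # a single float upcasts the whole Series
booleans = pd.Series([True, False])
mixed = pd.Series([1, "two", 3.0])    # mixed types fall back to object

for name, s in [("integers", integers), ("floats", floats),
                ("booleans", booleans), ("mixed", mixed)]:
    print(name, s.dtype)
```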

In [4]:
lottery_numbers= [10,64,44,32,98]
pd.Series(lottery_numbers)

0    10
1    64
2    44
3    32
4    98
dtype: int64

In [6]:
registrations= [True, False, False, True, True]
pd.Series(registrations)

0     True
1    False
2    False
3     True
4     True
dtype: bool

- A Series associates an identifier (an index label) with every value. When we don't specify the labels, pandas assigns integers starting from 0, just like the positions in a list.
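
A minimal sketch of both points about identifiers: the default integer labels, and the fact that labels do not have to be unique. The values and labels here are invented for the example.

```python
import pandas as pd

# No index given: pandas assigns 0, 1, 2, ...
default_index = pd.Series(["a", "b", "c"])
print(list(default_index.index))   # [0, 1, 2]

# Labels may repeat; looking up a repeated label returns every match.
repeated = pd.Series([1, 2, 3], index=["x", "x", "y"])
print(repeated.loc["x"].tolist())  # [1, 2]
print(repeated.index.is_unique)    # False
```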

## Create a Series Object from a Dictionary

In [9]:
sushi= {
    'Salmon': 'Orange',
    'Tuna': 'Red',
    'Eel': 'Brown'
}

pd.Series(sushi)

Salmon    Orange
Tuna         Red
Eel        Brown
dtype: object

- A Series combines features of lists and dictionaries: like a list, it preserves the order of its values; like a dictionary, it associates each key with a value. In this case, the Series knows both the key-value associations and the order of the stored values.
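
The short sketch below illustrates this dual nature with the same sushi data. It uses the `loc` and `iloc` accessors, which are covered later in these notes.

```python
import pandas as pd

sushi = pd.Series({"Salmon": "Orange", "Tuna": "Red", "Eel": "Brown"})

print(sushi.loc["Tuna"])     # dictionary-style lookup by key
print(sushi.iloc[0])         # list-style lookup by position
print(list(sushi.index))     # keys keep their insertion order
```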

## Intro to Series Methods
- The syntax to invoke a method on any object is `object.method()`.
- The `sum` method adds together the **Series'** values.
- The `product` method multiplies the **Series'** values.
- The `mean` method finds the average of the **Series'** values.
- The `std` method finds the standard deviation of the **Series'** values.

In [10]:
prices= pd.Series([2.44, 4.00, 8.92])
prices

0    2.44
1    4.00
2    8.92
dtype: float64

In [11]:
prices.sum()

15.36

In [14]:
prices.product()

87.0592

In [12]:
prices.mean()

5.12

In [13]:
prices.std()

3.3820703718284753
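
To see where the `std` result comes from, the sketch below recomputes it by hand. By default pandas computes the sample standard deviation (dividing by n - 1, i.e. `ddof=1`); passing `ddof=0` gives the population version. The helper variable names are only for illustration.

```python
import math
import pandas as pd

prices = pd.Series([2.44, 4.00, 8.92])

mean = prices.sum() / prices.size
squared_deviations = [(p - mean) ** 2 for p in prices]

# Sample standard deviation: divide by n - 1 (pandas' default, ddof=1).
sample_std = math.sqrt(sum(squared_deviations) / (prices.size - 1))
# Population standard deviation: divide by n (ddof=0).
population_std = math.sqrt(sum(squared_deviations) / prices.size)

print(sample_std, prices.std())
print(population_std, prices.std(ddof=0))
```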

## Intro to Attributes
- An **attribute** is a piece of data that lives on an object.
- An **attribute** is a fact, a detail, a characteristic of the object.
- Access an attribute with `object.attribute` syntax.
- The `size` attribute returns a count of the number of values in the **Series**.
- The `is_unique` attribute returns True if the **Series** has no duplicate values.
- The `values` and `index` attributes return the underlying objects that hold the **Series'** values and index labels.

In [15]:
adjectives= pd.Series(['Smart', 'Handsome', 'Charming', 'Brilliant', 'Humble', 'Smart'])
adjectives

0        Smart
1     Handsome
2     Charming
3    Brilliant
4       Humble
5        Smart
dtype: object

In [16]:
adjectives.size

6

In [17]:
adjectives.is_unique

False

In [19]:
adjectives.values

array(['Smart', 'Handsome', 'Charming', 'Brilliant', 'Humble', 'Smart'],
      dtype=object)

In [21]:
adjectives.index

RangeIndex(start=0, stop=6, step=1)

In [24]:
type(adjectives.values) # a NumPy array, i.e. an object from another library

numpy.ndarray

In [25]:
type(adjectives.index)

pandas.core.indexes.range.RangeIndex

## Parameters and Arguments
- A **parameter** is the name for an expected input to a function/method/class instantiation.
- An **argument** is the concrete value we provide for a parameter during invocation.
- We can pass arguments either sequentially (based on parameter order) or with explicit parameter names written out.
- The first two parameters for the **Series** constructor are `data` and `index`, which represent the values and the index labels.

In [26]:
fruits= ['Apple', 'Watermelon', 'Orange', 'Grapefruit', 'Pineapple', 'Strawberry']
weekdays= ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday']

In [30]:
pd.Series(fruits)
pd.Series(weekdays)
pd.Series(fruits, weekdays)
pd.Series(weekdays, fruits)

Apple            Monday
Watermelon      Tuesday
Orange        Wednesday
Grapefruit     Thursday
Pineapple        Friday
Strawberry     Saturday
dtype: object

In [32]:
pd.Series(data= fruits, index= weekdays)
pd.Series(index= weekdays, data= fruits)

Monday            Apple
Tuesday      Watermelon
Wednesday        Orange
Thursday     Grapefruit
Friday        Pineapple
Saturday     Strawberry
dtype: object

In [33]:
pd.Series(fruits, index= weekdays)

Monday            Apple
Tuesday      Watermelon
Wednesday        Orange
Thursday     Grapefruit
Friday        Pineapple
Saturday     Strawberry
dtype: object

In [34]:
pd.Series(fruits)

0         Apple
1    Watermelon
2        Orange
3    Grapefruit
4     Pineapple
5    Strawberry
dtype: object

In [35]:
pd.Series()

Series([], dtype: object)
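
A small sketch of the two ways of passing arguments, using shortened versions of the lists above. It also shows what happens when `data` and `index` have different lengths, a case the cells above do not cover.

```python
import pandas as pd

fruits = ["Apple", "Orange"]
weekdays = ["Monday", "Tuesday"]

# Positional and keyword arguments build the same Series.
by_position = pd.Series(fruits, weekdays)
by_keyword = pd.Series(index=weekdays, data=fruits)
print(by_position.equals(by_keyword))  # True

# A list of values and an index of different lengths cannot be paired up.
length_error = None
try:
    pd.Series(fruits, index=["Monday", "Tuesday", "Wednesday"])
except ValueError as error:
    length_error = error
    print("ValueError:", error)
```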

## Import Series with the pd.read_csv Function
- A **CSV** is a plain text file that uses line breaks to separate rows and commas to separate row values.
- Pandas ships with many different `read_` functions for different types of files.
- The `read_csv` function accepts many different parameters. The first one specifies the file name/path.
- The `read_csv` function will import the dataset as a **DataFrame**, a 2-dimensional table.
- The `usecols` parameter accepts a list of the column(s) to import.
- The `squeeze` method converts a single-column **DataFrame** to a **Series**.

In [10]:
df= pd.read_csv('pokemon.csv') # a dataframe
df

              Name              Type
0        Bulbasaur     Grass, Poison
1          Ivysaur     Grass, Poison
2         Venusaur     Grass, Poison
3       Charmander              Fire
4       Charmeleon              Fire
...            ...               ...
1005  Iron Valiant   Fairy, Fighting
1006      Koraidon  Fighting, Dragon
1007      Miraidon  Electric, Dragon
1008  Walking Wake     Water, Dragon


In [12]:
# selecting a single column from a DataFrame already returns a Series object,
# so no extra conversion (such as .squeeze('columns')) is needed

new_series= df['Name']
new_series

0          Bulbasaur
1            Ivysaur
2           Venusaur
3         Charmander
4         Charmeleon
            ...     
1005    Iron Valiant
1006        Koraidon
1007        Miraidon
1008    Walking Wake
1009     Iron Leaves
Name: Name, Length: 1010, dtype: object

In [8]:
# alternatively, import a single column and squeeze the resulting DataFrame into a Series
series = pd.read_csv('google_stock_price.csv', usecols=['Price']).squeeze('columns')

series

0         2.490664
1         2.515820
2         2.758411
3         2.770615
4         2.614201
           ...    
4788    132.080002
4789    132.998001
4790    135.570007
4791    137.050003
4792    138.429993
Name: Price, Length: 4793, dtype: float64
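
The sketch below reproduces the import steps without needing a file on disk. It assumes an in-memory CSV (via `io.StringIO`) with invented rows in place of `pokemon.csv`, and it also shows that `squeeze` leaves a multi-column DataFrame as it is.

```python
import io
import pandas as pd

# Stand-in for a CSV file on disk; the rows are invented.
csv_text = "Name,Type\nAlpha,Fire\nBeta,Water\nGamma,Grass\n"

frame = pd.read_csv(io.StringIO(csv_text), usecols=["Name"])
print(type(frame).__name__, frame.shape)      # DataFrame (3, 1)

names = frame.squeeze("columns")
print(type(names).__name__, names.shape)      # Series (3,)

# With more than one column there is nothing to squeeze.
both = pd.read_csv(io.StringIO(csv_text)).squeeze("columns")
print(type(both).__name__)                    # DataFrame
```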

## The head and tail Methods
- The `head` method returns a number of rows from the top/beginning of the `Series`.
- The `tail` method returns a number of rows from the bottom/end of the `Series`.

In [14]:
pokemon= pd.read_csv('pokemon.csv', usecols=['Name']).squeeze('columns')
google= pd.read_csv('google_stock_price.csv', usecols=['Price']).squeeze('columns')

In [17]:
pokemon.head() # 5 is default

0     Bulbasaur
1       Ivysaur
2      Venusaur
3    Charmander
4    Charmeleon
Name: Name, dtype: object

In [18]:
google.tail(10) # 5 is default

4783    127.849998
4784    129.130005
4785    130.850006
4786    134.727005
4787    130.139999
4788    132.080002
4789    132.998001
4790    135.570007
4791    137.050003
4792    138.429993
Name: Price, dtype: float64

## Passing Series to Python's Built-In Functions
- The `len` function returns the length of the **Series**.
- The `type` function returns the type of an object.
- The `list` function converts the **Series** to a list.
- The `dict` function converts the **Series** to a dictionary.
- The `sorted` function converts the **Series** to a sorted list.
- The `max` function returns the largest value in the **Series**.
- The `min` function returns the smallest value in the **Series**.

In [19]:
pokemon= pd.read_csv('pokemon.csv', usecols=['Name']).squeeze('columns')
google= pd.read_csv('google_stock_price.csv', usecols=['Price']).squeeze('columns')

In [32]:
len(pokemon)
type(pokemon)
list(pokemon)
sorted(pokemon)
type(sorted(pokemon))
sorted(google)
dict(pokemon)

# max and min can be used either as functions or as methods
max(google)
google.min()

2.47049

In [31]:
google.max()

151.863495
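
Since the cell above only displays its last result, here is a small sketch that prints what each built-in returns. It uses a made-up Series with string labels, so it is easy to see which functions keep the index and which drop it.

```python
import pandas as pd

scores = pd.Series([30, 10, 20], index=["c", "a", "b"])

print(len(scores))      # 3
print(list(scores))     # [30, 10, 20] (values only)
print(dict(scores))     # index labels become the keys
print(sorted(scores))   # [10, 20, 30] (a plain list, index dropped)
print(max(scores), scores.max())
```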

## Check for Inclusion with Python's in Keyword
- The `in` keyword checks if a value exists within an object.
- The `in` keyword will look for a value in the **Series's** index.
- Use the `index` and `values` attributes to access "nested" objects within the **Series**.
- Combine the `in` keyword with `values` to search within the **Series's** values.

In [38]:
pokemon= pd.read_csv('pokemon.csv', usecols=['Name']).squeeze('columns')
google= pd.read_csv('google_stock_price.csv', usecols=['Price']).squeeze('columns')

In [39]:
pokemon.head()

0     Bulbasaur
1       Ivysaur
2      Venusaur
3    Charmander
4    Charmeleon
Name: Name, dtype: object

In [42]:
'Venusaur' in pokemon

False

- This happens because, by default, the `in` keyword searches the index of the Series, not its values. To search the values, we need to say so explicitly.

In [43]:
'Venusaur' in pokemon.values

True

In [44]:
print(5 in pokemon.index)
print(1999999 in pokemon.index)

True
False
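
The same behaviour on a tiny, made-up Series. The last line uses the `isin` method as one possible pandas-native alternative to `in ... .values`; it is an addition for illustration, not part of the cells above.

```python
import pandas as pd

colors = pd.Series(["red", "green", "blue"])

print("green" in colors)             # False: "green" is not an index label
print(1 in colors)                   # True: 1 is an index label
print("green" in colors.values)      # True: searches the values
print(colors.isin(["green"]).any())  # True: a pandas-native alternative
```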


## The sort_values Method
- The `sort_values` method sorts a **Series'** values in order.
- By default, pandas applies an ascending sort (smallest to largest).
- Customize the sort order with the `ascending` parameter.

In [45]:
pokemon= pd.read_csv('pokemon.csv', usecols=['Name']).squeeze('columns')
google= pd.read_csv('google_stock_price.csv', usecols=['Price']).squeeze('columns')

google.head()

0    2.490664
1    2.515820
2    2.758411
3    2.770615
4    2.614201
Name: Price, dtype: float64

In [51]:
google.sort_values() # returns a new Series; google itself is not modified
google.sort_values(ascending= True)
google.sort_values(ascending= False).head()

4395    151.863495
4345    151.000000
4346    150.141754
4341    150.000000
4336    150.000000
Name: Price, dtype: float64

In [50]:
sorted_series = google.sort_values()
sorted_series
google

0         2.490664
1         2.515820
2         2.758411
3         2.770615
4         2.614201
           ...    
4788    132.080002
4789    132.998001
4790    135.570007
4791    137.050003
4792    138.429993
Name: Price, Length: 4793, dtype: float64

In [58]:
pokemon.sort_values()
pokemon.sort_values(ascending= True)
pokemon.sort_values(ascending= False).tail()

680    Aegislash
616     Accelgor
358        Absol
62          Abra
459    Abomasnow
Name: Name, dtype: object
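
A compact sketch of the same ideas on three invented prices: `sort_values` returns a new Series, each index label stays attached to its value, and the original is left in its initial order.

```python
import pandas as pd

prices = pd.Series([3.5, 1.2, 2.8])

ascending = prices.sort_values()                   # new Series, smallest first
descending = prices.sort_values(ascending=False)   # new Series, largest first

print(ascending.index.tolist())    # [1, 2, 0]: labels travel with their values
print(descending.tolist())         # [3.5, 2.8, 1.2]
print(prices.tolist())             # [3.5, 1.2, 2.8]: original order untouched
```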

## The sort_index Method
- The `sort_index` method sorts a **Series** by its index.
- The `sort_index` method also accepts an `ascending` parameter to set sort order.

In [69]:
pokemon= pd.read_csv('pokemon.csv', usecols=['Name', 'Type'], index_col=['Name']).squeeze('columns')
pokemon

Name
Bulbasaur          Grass, Poison
Ivysaur            Grass, Poison
Venusaur           Grass, Poison
Charmander                  Fire
Charmeleon                  Fire
                      ...       
Iron Valiant     Fairy, Fighting
Koraidon        Fighting, Dragon
Miraidon        Electric, Dragon
Walking Wake       Water, Dragon
Iron Leaves       Grass, Psychic
Name: Type, Length: 1010, dtype: object

In [63]:
pokemon.sort_index()
pokemon.sort_index(ascending= True)
pokemon.sort_index(ascending= False)

Name
Zygarde      Dragon, Ground
Zweilous       Dark, Dragon
Zubat        Poison, Flying
Zorua                  Dark
Zoroark                Dark
                  ...      
Aegislash      Steel, Ghost
Accelgor                Bug
Absol                  Dark
Abra                Psychic
Abomasnow        Grass, Ice
Name: Type, Length: 1010, dtype: object
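
One detail worth knowing when sorting string labels: uppercase letters sort before lowercase ones. The sketch below uses invented labels to show this, and uses the `key` parameter of `sort_index` (available in recent pandas versions) as one way to sort case-insensitively.

```python
import pandas as pd

types = pd.Series(["Fire", "Water", "Grass"], index=["Charlie", "alpha", "Bravo"])

# Default sort: capital letters come before lowercase letters.
default_order = types.sort_index().index.tolist()
print(default_order)       # ['Bravo', 'Charlie', 'alpha']

# Case-insensitive sort: compare the lowercased labels instead.
ignore_case = types.sort_index(key=lambda labels: labels.str.lower()).index.tolist()
print(ignore_case)         # ['alpha', 'Bravo', 'Charlie']
```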

## Extract Series Value by Index Position
- Use the `iloc` accessor to extract a **Series** value by its index position.
- `iloc` is short for "index location".
- Python's list slicing syntaxes (slices, slices from start, slices to end, etc.) are supported with **Series** objects.

In [75]:
pokemon= pd.read_csv('pokemon.csv', usecols=['Name']).squeeze('columns')
pokemon.head()

0     Bulbasaur
1       Ivysaur
2      Venusaur
3    Charmander
4    Charmeleon
Name: Name, dtype: object

In [77]:
pokemon.iloc[0] # the pokemon name at position 0
pokemon.iloc[500] # the pokemon name at position 500

pokemon.iloc[[1,10,100]] # a new series built from a list of positions

pokemon.iloc[10:34] # a slice: positions 10 through 33 (the end position is excluded)

10       Metapod
11    Butterfree
12        Weedle
13        Kakuna
14      Beedrill
15        Pidgey
16     Pidgeotto
17       Pidgeot
18       Rattata
19      Raticate
20       Spearow
21        Fearow
22         Ekans
23         Arbok
24       Pikachu
25        Raichu
26     Sandshrew
27     Sandslash
28      Nidoran♀
29      Nidorina
30     Nidoqueen
31      Nidoran♂
32      Nidorino
33      Nidoking
Name: Name, dtype: object
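
The slicing variants mentioned above, sketched on a five-element Series with made-up values so that every result is easy to verify by eye.

```python
import pandas as pd

letters = pd.Series(["a", "b", "c", "d", "e"])

print(letters.iloc[0])                   # a
print(letters.iloc[-1])                  # e (negative positions count from the end)
print(letters.iloc[[0, 2]].tolist())     # ['a', 'c']
print(letters.iloc[1:3].tolist())        # ['b', 'c'] (end position excluded)
print(letters.iloc[:2].tolist())         # ['a', 'b']
print(letters.iloc[-2:].tolist())        # ['d', 'e']
```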

## Extract Series Value by Index Label
- Use the `loc` accessor to extract a **Series** value by its index label.
- Pass a list to extract multiple values by index label.
- If one index label/position in the list does not exist, Pandas will raise an error.

In [78]:
guitars_dict = {
    "Fender Telecaster": "Baby Blue",
    "Gibson Les Paul": "Sunburst",
    "ESP Eclipse": "Dark Green"
}
guitars = pd.Series(guitars_dict)
guitars

Fender Telecaster     Baby Blue
Gibson Les Paul        Sunburst
ESP Eclipse          Dark Green
dtype: object

In [83]:
print(guitars.iloc[0])
print(guitars.loc['Fender Telecaster'])

Baby Blue
Baby Blue


- Even though we don't see integer positions in a Series defined with custom index labels, the object still keeps track of the order of its values. That's why we can still use the `iloc` accessor on it.

In [86]:
pokemon= pd.read_csv('pokemon.csv', usecols=['Name', 'Type'], index_col= ['Name']).squeeze('columns')
pokemon.head()

Name
Bulbasaur     Grass, Poison
Ivysaur       Grass, Poison
Venusaur      Grass, Poison
Charmander             Fire
Charmeleon             Fire
Name: Type, dtype: object

In [89]:
print(pokemon.loc['Charmander'])
print(pokemon.loc['Venusaur'])
print('-'*10)
print(pokemon.loc[['Charmander', 'Venusaur', 'Lucario']])

Fire
Grass, Poison
----------
Name
Charmander               Fire
Venusaur        Grass, Poison
Lucario       Fighting, Steel
Name: Type, dtype: object
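
Two `loc` behaviours that the cells above mention or imply, sketched with the guitars data: a label slice includes its end label (unlike `iloc`), and a list containing an unknown label raises a `KeyError`. The label `'Ibanez RG'` is a made-up example of a missing label.

```python
import pandas as pd

guitars = pd.Series({
    "Fender Telecaster": "Baby Blue",
    "Gibson Les Paul": "Sunburst",
    "ESP Eclipse": "Dark Green",
})

# Unlike iloc, a label slice includes its end label.
sliced = guitars.loc["Fender Telecaster":"Gibson Les Paul"].tolist()
print(sliced)   # ['Baby Blue', 'Sunburst']

# A list containing an unknown label raises KeyError.
missing_error = None
try:
    guitars.loc[["ESP Eclipse", "Ibanez RG"]]
except KeyError as error:
    missing_error = error
    print("KeyError:", error)
```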


## The get Method on a Series
- The `get` method extracts a **Series** value by index label. It is an alternative option to square brackets.
- The `get` method's second argument sets the fallback value to return if the label/position does not exist.

In [90]:
pokemon= pd.read_csv('pokemon.csv', index_col= ['Name']).squeeze('columns')
pokemon.head()

Name
Bulbasaur     Grass, Poison
Ivysaur       Grass, Poison
Venusaur      Grass, Poison
Charmander             Fire
Charmeleon             Fire
Name: Type, dtype: object

In [94]:
pokemon.get('Moltres') # works like the loc accessor

'Fire, Flying'

In [99]:
# the advantage of the get method is that it does not raise an error when you search for a label that does not exist

#pokemon.loc['Digimon']
pokemon.get('Digimon')
pokemon.get('Digimon', 'NotFound') # string to be returned in case label not found
pokemon.get('Moltres', 'NotFound')

pokemon.get(['Moltres', 'Digimon'], 'One of the values in the list was not found')

'One of the values in the list was not found'
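
Because the cell above only displays its last result, the sketch below prints each `get` call separately on a two-entry stand-in Series. When no fallback is given, `get` returns `None`.

```python
import pandas as pd

types = pd.Series({"Moltres": "Fire, Flying", "Pikachu": "Electric"})

print(types.get("Moltres"))               # Fire, Flying
print(types.get("Digimon"))               # None (the default fallback)
print(types.get("Digimon", "NotFound"))   # NotFound
```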

## Overwrite a Series Value
- Use the `loc/iloc` accessor to target an index label/position, then use an equal sign to provide a new value.

In [101]:
pokemon= pd.read_csv('pokemon.csv', usecols= ['Name']).squeeze('columns')
pokemon.head()

0     Bulbasaur
1       Ivysaur
2      Venusaur
3    Charmander
4    Charmeleon
Name: Name, dtype: object

In [103]:
pokemon.iloc[0]= 'Test'
pokemon.head()

0          Test
1       Ivysaur
2      Venusaur
3    Charmander
4    Charmeleon
Name: Name, dtype: object

In [104]:
pokemon.iloc[[1,2,3]] = ['Test2', 'Test3', 'Test5']
pokemon.head(10)

0          Test
1         Test2
2         Test3
3         Test5
4    Charmeleon
5     Charizard
6      Squirtle
7     Wartortle
8     Blastoise
9      Caterpie
Name: Name, dtype: object

In [105]:
pokemon= pd.read_csv('pokemon.csv', index_col= ['Name']).squeeze('columns')
pokemon.head()

Name
Bulbasaur     Grass, Poison
Ivysaur       Grass, Poison
Venusaur      Grass, Poison
Charmander             Fire
Charmeleon             Fire
Name: Type, dtype: object

In [107]:
pokemon.loc['Bulbasaur']= 'Veneno'
pokemon.head()

Name
Bulbasaur            Veneno
Ivysaur       Grass, Poison
Venusaur      Grass, Poison
Charmander             Fire
Charmeleon             Fire
Name: Type, dtype: object

In [108]:
pokemon.loc[['Charmander', 'Pikachu', 'Squirtle']] = ['Water', 'Watermelon', 'Pineapple, Juice']
pokemon.loc[['Charmander', 'Pikachu', 'Squirtle']]

Name
Charmander               Water
Pikachu             Watermelon
Squirtle      Pineapple, Juice
Name: Type, dtype: object
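
A related point, sketched under the assumption of a small stand-in Series: assigning through `loc` to a label that does not exist yet adds a new entry, while `iloc` refuses to assign to a position beyond the end. The label `'Newmon'` is hypothetical.

```python
import pandas as pd

types = pd.Series({"Bulbasaur": "Grass, Poison", "Charmander": "Fire"})

types.loc["Charmander"] = "Water"      # existing label: value is overwritten
types.loc["Newmon"] = "Ghost"          # unknown label: a new entry is appended
print(types.to_dict())

position_error = None
try:
    types.iloc[10] = "Dark"            # iloc cannot grow the Series
except IndexError as error:
    position_error = error
    print("IndexError:", error)
```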

## The copy Method
- A **copy** is a duplicate/replica of an object.
- Changes to a copy do not modify the original object.
- A **view** is a different way of looking at the *same* data.
- Changes to a view *do* modify the original object.
- The `copy` method creates a copy of a pandas object.

In [134]:
pokemon_df = pd.read_csv('pokemon.csv', usecols= ['Name'])
pokemon_series= pokemon_df.squeeze('columns') 

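# in this case our series is just a view of the original dataframe (another reference to the same underlying data)
# note: this is the behaviour of the pandas version used here; with Copy-on-Write enabled, the dataframe would stay unchanged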

In [119]:
pokemon_df

              Name
0        Bulbasaur
1          Ivysaur
2         Venusaur
3       Charmander
4       Charmeleon
...            ...
1005  Iron Valiant
1006      Koraidon
1007      Miraidon
1008  Walking Wake


In [120]:
pokemon_series

0          Bulbasaur
1            Ivysaur
2           Venusaur
3         Charmander
4         Charmeleon
            ...     
1005    Iron Valiant
1006        Koraidon
1007        Miraidon
1008    Walking Wake
1009     Iron Leaves
Name: Name, Length: 1010, dtype: object

In [121]:
pokemon_series.iloc[0] = 'Whatever'
pokemon_series.head()

0      Whatever
1       Ivysaur
2      Venusaur
3    Charmander
4    Charmeleon
Name: Name, dtype: object

In [122]:
pokemon_df.head()

         Name
0    Whatever
1     Ivysaur
2    Venusaur
3  Charmander
4  Charmeleon


In [140]:
pokemon_df = pd.read_csv('pokemon.csv', usecols= ['Name'])
pokemon_series= pokemon_df.squeeze('columns').copy() 

# this forces pandas to create a completely new series object (it is no longer a view of the dataframe)

In [141]:
pokemon_series.iloc[0] = 'Whatever'
pokemon_series.head()

0      Whatever
1       Ivysaur
2      Venusaur
3    Charmander
4    Charmeleon
Name: Name, dtype: object

In [142]:
pokemon_df.head()

         Name
0   Bulbasaur
1     Ivysaur
2    Venusaur
3  Charmander
4  Charmeleon
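
A minimal, file-free sketch of the safe pattern: take an explicit `copy` before modifying, and the source DataFrame is guaranteed to stay untouched. The two-row DataFrame is a stand-in for `pokemon.csv`.

```python
import pandas as pd

frame = pd.DataFrame({"Name": ["Bulbasaur", "Ivysaur"]})
independent = frame["Name"].copy()   # explicit copy: no shared data

independent.iloc[0] = "Whatever"

print(independent.iloc[0])       # Whatever
print(frame.loc[0, "Name"])      # Bulbasaur: the DataFrame is unaffected
```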


## Math Methods on Series Objects
- The `count` method returns the number of values in the **Series**. It excludes missing values; the `size` attribute includes missing values.
- The `sum` method adds together the **Series's** values.
- The `product` method multiplies together the **Series's** values.
- The `mean` method calculates the average of the **Series's** values.
- The `std` method calculates the standard deviation of the **Series's** values.
- The `max` method returns the largest value in the **Series**.
- The `min` method returns the smallest value in the **Series**.
- The `median` method returns the median of the **Series** (the value in the middle).
- The `mode` method returns the mode of the **Series** (the most frequent value).
- The `describe` method returns a summary with various mathematical calculations.

In [143]:
google= pd.read_csv('google_stock_price.csv', usecols=['Price']).squeeze('columns')
google

0         2.490664
1         2.515820
2         2.758411
3         2.770615
4         2.614201
           ...    
4788    132.080002
4789    132.998001
4790    135.570007
4791    137.050003
4792    138.429993
Name: Price, Length: 4793, dtype: float64

In [None]:
print(google.count()) # counts only the non-null values
print(google.size) # counts all values, including nulls

In [144]:
google.sum()

192733.129338

In [145]:
google.product() # multiplies all the values of the series; the result overflows float64, hence inf (NumPy warns about the overflow)

inf

In [146]:
google.mean()

40.211376870018775

In [147]:
google.std()

37.274752943868094

In [148]:
google.max()

151.863495

In [149]:
google.min()

2.47049

In [150]:
google.median()

26.327717

In [151]:
google.mode() # values that occur the most in the series

0    14.719826
1    49.000000
Name: Price, dtype: float64

In [159]:
google[google== 14.719826]

934     14.719826
1332    14.719826
1595    14.719826
1650    14.719826
Name: Price, dtype: float64

In [160]:
google[google== 49.000000]

3222    49.0
3308    49.0
3309    49.0
3321    49.0
Name: Price, dtype: float64

In [162]:
google.describe() # summary of the series

count    4793.000000
mean       40.211377
std        37.274753
min         2.470490
25%        12.767395
50%        26.327717
75%        56.311001
max       151.863495
Name: Price, dtype: float64
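
The Google price data has no missing values, so `count` and `size` agree there. The sketch below uses a tiny invented Series with one missing value to show where they differ, and how the other methods skip the gap.

```python
import pandas as pd

readings = pd.Series([1.0, None, 3.0, 3.0])   # None is stored as NaN

print(readings.size)             # 4: every slot, including the missing one
print(readings.count())          # 3: non-missing values only
print(readings.sum())            # 7.0: missing values are skipped
print(readings.median())         # 3.0
print(readings.mode().tolist())  # [3.0]: mode returns a Series, since ties are possible
```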

## Broadcasting
- **Broadcasting** describes the process of applying an arithmetic operation to an array (i.e., a **Series**).
- We can combine mathematical operators with a **Series** to apply the mathematical operation to every value.
- There are also methods to accomplish the same results (`add`, `sub`, `mul`, `div`, etc.)

In [163]:
google = pd.read_csv('google_stock_price.csv', usecols=['Price']).squeeze('columns')
google.head()

0    2.490664
1    2.515820
2    2.758411
3    2.770615
4    2.614201
Name: Price, dtype: float64

In [166]:
google.add(10)
google+10  # adds 10 to every value of the series

# (this is a feature of numpy, on which pandas is built)

0        12.490664
1        12.515820
2        12.758411
3        12.770615
4        12.614201
           ...    
4788    142.080002
4789    142.998001
4790    145.570007
4791    147.050003
4792    148.429993
Name: Price, Length: 4793, dtype: float64

In [168]:
google.sub(30)
google-30

0       -27.509336
1       -27.484180
2       -27.241589
3       -27.229385
4       -27.385799
           ...    
4788    102.080002
4789    102.998001
4790    105.570007
4791    107.050003
4792    108.429993
Name: Price, Length: 4793, dtype: float64

In [170]:
google.mul(2)
google*2

0         4.981328
1         5.031640
2         5.516822
3         5.541230
4         5.228402
           ...    
4788    264.160004
4789    265.996002
4790    271.140014
4791    274.100006
4792    276.859986
Name: Price, Length: 4793, dtype: float64

In [172]:
google.div(2)
google/2

0        1.245332
1        1.257910
2        1.379206
3        1.385307
4        1.307100
          ...    
4788    66.040001
4789    66.499000
4790    67.785004
4791    68.525002
4792    69.214996
Name: Price, Length: 4793, dtype: float64
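
The operator and the method give the same result, as sketched below on invented values. One reason to prefer the method form is its extra parameters: `fill_value`, for instance, substitutes a value for missing entries before the operation is applied.

```python
import pandas as pd

prices = pd.Series([2.0, 4.0, None])

# Operator and method are interchangeable.
print((prices * 2).equals(prices.mul(2)))     # True

# A missing value stays missing with the plain operator...
print((prices + 10).tolist())                 # [12.0, 14.0, nan]
# ...but fill_value replaces it (here with 0) before adding.
print(prices.add(10, fill_value=0).tolist())  # [12.0, 14.0, 10.0]
```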

## The value_counts Method
- The `value_counts` method returns the number of times each unique value occurs in the **Series**.
- The `normalize` parameter returns the relative frequencies/percentages of the values instead of the counts.

In [174]:
pokemon = pd.read_csv('pokemon.csv', index_col=['Name']).squeeze('columns')
pokemon.head()

Name
Bulbasaur     Grass, Poison
Ivysaur       Grass, Poison
Venusaur      Grass, Poison
Charmander             Fire
Charmeleon             Fire
Name: Type, dtype: object

In [179]:
pokemon.value_counts() # how many times each value appears in the series

Type
Water               74
Normal              74
Grass               46
Psychic             39
Fire                36
                    ..
Fighting, Ice        1
Fire, Dragon         1
Normal, Dragon       1
Psychic, Steel       1
Fighting, Dragon     1
Name: count, Length: 200, dtype: int64

In [180]:
pokemon.value_counts(normalize=True) *100 # to get a percentage basis

Type
Water               7.326733
Normal              7.326733
Grass               4.554455
Psychic             3.861386
Fire                3.564356
                      ...   
Fighting, Ice       0.099010
Fire, Dragon        0.099010
Normal, Dragon      0.099010
Psychic, Steel      0.099010
Fighting, Dragon    0.099010
Name: proportion, Length: 200, dtype: float64
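
A hand-checkable sketch of `value_counts` on four invented values: with `normalize=True` each count is divided by the total, so the results add up to 1.

```python
import pandas as pd

types = pd.Series(["Fire", "Water", "Fire", "Grass"])

counts = types.value_counts()
shares = types.value_counts(normalize=True)

print(counts.to_dict())    # Fire appears twice, Water and Grass once each
print(shares["Fire"])      # 0.5
print(shares.sum())        # 1.0
```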

## The apply Method
- The `apply` method accepts a function. It invokes that function on every `Series` value.

In [181]:
pokemon= pd.read_csv('pokemon.csv', usecols=['Name']).squeeze('columns')
pokemon.head()

0     Bulbasaur
1       Ivysaur
2      Venusaur
3    Charmander
4    Charmeleon
Name: Name, dtype: object

In [184]:
def to_upper_case(x: str) -> str:
    return x.upper()

pokemon.apply(to_upper_case)

0          BULBASAUR
1            IVYSAUR
2           VENUSAUR
3         CHARMANDER
4         CHARMELEON
            ...     
1005    IRON VALIANT
1006        KORAIDON
1007        MIRAIDON
1008    WALKING WAKE
1009     IRON LEAVES
Name: Name, Length: 1010, dtype: object

In [186]:
pokemon.apply(len) # the length of each string value in the series

0        9
1        7
2        8
3       10
4       10
        ..
1005    12
1006     8
1007     8
1008    12
1009    11
Name: Name, Length: 1010, dtype: int64

In [187]:
def count_of_a(name: str) -> int:
    return name.count('a')

pokemon.apply(count_of_a)

0       2
1       1
2       1
3       2
4       1
       ..
1005    2
1006    1
1007    1
1008    2
1009    1
Name: Name, Length: 1010, dtype: int64
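
For common string operations, pandas also offers the `.str` accessor, which is not used in the cells above. The sketch below compares it with `apply` on three names and is meant as an illustration of an alternative, not a replacement for the approach shown.

```python
import pandas as pd

names = pd.Series(["Bulbasaur", "Ivysaur", "Venusaur"])

# apply with an anonymous function instead of a named one
upper_with_apply = names.apply(lambda name: name.upper()).tolist()
# the same result through the .str accessor
upper_with_str = names.str.upper().tolist()
print(upper_with_apply == upper_with_str)   # True

print(names.apply(len).tolist())            # [9, 7, 8]
print(names.str.len().tolist())             # [9, 7, 8]
```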

## The map Method
- The `map` method "maps" or connects each **Series** value to another value.
- We can pass the method a dictionary or a **Series**. Both types connect keys to values.
- The `map` method uses our argument to connect or bridge together the values.

In [188]:
pokemon= pd.read_csv('pokemon.csv', index_col=['Name']).squeeze('columns')
pokemon.head()

Name
Bulbasaur     Grass, Poison
Ivysaur       Grass, Poison
Venusaur      Grass, Poison
Charmander             Fire
Charmeleon             Fire
Name: Type, dtype: object

In [190]:
pokemon.map({'Grass, Poison': 'Folha'}).head()

# values with no matching key in the dictionary become NaN
# (NaN = Not a Number, which is how pandas marks a missing value)

Name
Bulbasaur     Folha
Ivysaur       Folha
Venusaur      Folha
Charmander      NaN
Charmeleon      NaN
Name: Type, dtype: object

In [193]:
pokemon.unique().size

200

In [194]:
pokemon.size

1010
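
When only some values have a mapping, the unmapped ones become NaN, as seen above. One possible way to keep the original value in those cases is to chain `fillna` with the original Series, sketched here on a three-row stand-in.

```python
import pandas as pd

types = pd.Series(
    ["Grass, Poison", "Fire", "Fire"],
    index=["Bulbasaur", "Charmander", "Charmeleon"],
)
translation = {"Grass, Poison": "Folha"}

mapped = types.map(translation)
print(mapped.tolist())          # ['Folha', nan, nan]

# Fall back to the original value wherever the mapping found nothing.
with_fallback = mapped.fillna(types)
print(with_fallback.tolist())   # ['Folha', 'Fire', 'Fire']
```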