## Series

In [1]:
import pandas as pd

## Create a Series Object from a List
- A pandas **Series** is a one-dimensional labelled array.
- A Series combines the best features of a list and a dictionary.
- A Series maintains a single collection of ordered values (i.e. a single column of data).
- We can assign each value an identifier, which does not have to *be* unique.

In [3]:
ice_cream=["chocolate","vanilla","strawberry","butterscotch"]
pd.Series(ice_cream)

0       chocolate
1         vanilla
2      strawberry
3    butterscotch
dtype: object

In [4]:
integers=[10,11,112,13]
pd.Series(integers)

0     10
1     11
2    112
3     13
dtype: int64

In [5]:
boolean=[True,False,False,True]
pd.Series(boolean)

0     True
1    False
2    False
3     True
dtype: bool

### Create a Series Object from a Dictionary

In [6]:
ages={
    "Sandeep":23,
    "Smith":36,
    "Cummins":35
}
pd.Series(ages)

Sandeep    23
Smith      36
Cummins    35
dtype: int64
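
The point above that identifiers do not have to be unique can be sketched with a custom index. This is a minimal illustration with hypothetical data (the `days` labels are not from the notebook); it passes the labels through the `index` parameter of `pd.Series`.

```python
import pandas as pd

# Hypothetical data for illustration only.
flavors = ["chocolate", "vanilla", "strawberry"]
days = ["Mon", "Wed", "Mon"]  # "Mon" is deliberately repeated

by_day = pd.Series(flavors, index=days)

# A duplicated label selects every matching row, so the result is a Series.
monday = by_day.loc["Mon"]
print(monday.tolist())
```

Because `"Mon"` appears twice, `by_day.loc["Mon"]` returns both values rather than a single one.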

## Import a Series with the pd.read_csv Function
- A CSV is a plain-text file that uses line breaks to separate rows and commas to separate the values within a row.
- pandas ships with many different `read_` functions for different types of files.
- The `read_csv` function accepts many different parameters. The first one specifies the file name/path.
- The `read_csv` function imports the dataset as a DataFrame, a 2-dimensional table.
- The `usecols` parameter accepts a list of the columns to import.
- The `squeeze` method converts a single-column DataFrame to a Series.

In [7]:
pd.read_csv("pokemon.csv")

Unnamed: 0,Name,Type
0,Bulbasaur,"Grass, Poison"
1,Ivysaur,"Grass, Poison"
2,Venusaur,"Grass, Poison"
3,Charmander,Fire
4,Charmeleon,Fire
...,...,...
1005,Iron Valiant,"Fairy, Fighting"
1006,Koraidon,"Fighting, Dragon"
1007,Miraidon,"Electric, Dragon"
1008,Walking Wake,"Water, Dragon"


In [8]:
pd.read_csv("pokemon.csv",usecols=["Name"])

Unnamed: 0,Name
0,Bulbasaur
1,Ivysaur
2,Venusaur
3,Charmander
4,Charmeleon
...,...
1005,Iron Valiant
1006,Koraidon
1007,Miraidon
1008,Walking Wake


In [9]:
pd.read_csv("pokemon.csv",usecols=["Name"]).squeeze("columns")

0          Bulbasaur
1            Ivysaur
2           Venusaur
3         Charmander
4         Charmeleon
            ...     
1005    Iron Valiant
1006        Koraidon
1007        Miraidon
1008    Walking Wake
1009     Iron Leaves
Name: Name, Length: 1010, dtype: object
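
The same import steps can be tried without a file on disk. The sketch below assumes a tiny hypothetical CSV held in a string and wrapped in `io.StringIO`, which `read_csv` accepts in place of a path; the two rows are illustrative only.

```python
import io
import pandas as pd

# Hypothetical two-row CSV; quoted fields may contain commas.
csv_text = '''Name,Type
Bulbasaur,"Grass, Poison"
Charmander,Fire
'''

# usecols keeps only the listed columns; the result is still a DataFrame.
df = pd.read_csv(io.StringIO(csv_text), usecols=["Name"])

# squeeze("columns") turns the single-column DataFrame into a Series.
names = df.squeeze("columns")
print(type(df).__name__, type(names).__name__)
```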

## The head and tail Methods
- The `head` method returns a number of rows from the top/beginning of the `Series`.
- The `tail` method returns a number of rows from the bottom/end of the `Series`.

In [10]:
pokemon=pd.read_csv("pokemon.csv",usecols=["Name"]).squeeze("columns")


In [11]:
pokemon.head()

0     Bulbasaur
1       Ivysaur
2      Venusaur
3    Charmander
4    Charmeleon
Name: Name, dtype: object

In [12]:
pokemon.tail(6)

1004    Roaring Moon
1005    Iron Valiant
1006        Koraidon
1007        Miraidon
1008    Walking Wake
1009     Iron Leaves
Name: Name, dtype: object
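
As a small self-contained sketch of the two methods, using a hypothetical Series of ten integers: `head` and `tail` default to five rows when called without an argument, and the original index labels are preserved.

```python
import pandas as pd

numbers = pd.Series(range(10))  # hypothetical data: 0 through 9

first_five = numbers.head()   # no argument: defaults to 5 rows
last_two = numbers.tail(2)    # explicit row count

print(first_five.tolist())
print(last_two.index.tolist())
```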

## Passing Series to Python's Built-In Functions
- The `len` function returns the length of the **Series**.
- The `type` function returns the type of an object.
- The `list` function converts the **Series** to a list.
- The `dict` function converts the **Series** to a dictionary.
- The `sorted` function converts the **Series** to a sorted list.
- The `max` function returns the largest value in the **Series**.
- The `min` function returns the smallest value in the **Series**.

In [13]:
google = pd.read_csv("google_stock_price.csv", usecols=["Price"]).squeeze("columns")

In [14]:
len(pokemon)

1010

In [15]:
type(pokemon)

pandas.core.series.Series

In [16]:
list(pokemon)

['Bulbasaur',
 'Ivysaur',
 'Venusaur',
 'Charmander',
 'Charmeleon',
 'Charizard',
 'Squirtle',
 'Wartortle',
 'Blastoise',
 'Caterpie',
 'Metapod',
 'Butterfree',
 'Weedle',
 'Kakuna',
 'Beedrill',
 'Pidgey',
 'Pidgeotto',
 'Pidgeot',
 'Rattata',
 'Raticate',
 'Spearow',
 'Fearow',
 'Ekans',
 'Arbok',
 'Pikachu',
 'Raichu',
 'Sandshrew',
 'Sandslash',
 'Nidoran♀',
 'Nidorina',
 'Nidoqueen',
 'Nidoran♂',
 'Nidorino',
 'Nidoking',
 'Clefairy',
 'Clefable',
 'Vulpix',
 'Ninetales',
 'Jigglypuff',
 'Wigglytuff',
 'Zubat',
 'Golbat',
 'Oddish',
 'Gloom',
 'Vileplume',
 'Paras',
 'Parasect',
 'Venonat',
 'Venomoth',
 'Diglett',
 'Dugtrio',
 'Meowth',
 'Persian',
 'Psyduck',
 'Golduck',
 'Mankey',
 'Primeape',
 'Growlithe',
 'Arcanine',
 'Poliwag',
 'Poliwhirl',
 'Poliwrath',
 'Abra',
 'Kadabra',
 'Alakazam',
 'Machop',
 'Machoke',
 'Machamp',
 'Bellsprout',
 'Weepinbell',
 'Victreebel',
 'Tentacool',
 'Tentacruel',
 'Geodude',
 'Graveler',
 'Golem',
 'Ponyta',
 'Rapidash',
 'Slowpoke',
 'Sl

In [17]:
dict(pokemon)

{0: 'Bulbasaur',
 1: 'Ivysaur',
 2: 'Venusaur',
 3: 'Charmander',
 4: 'Charmeleon',
 5: 'Charizard',
 6: 'Squirtle',
 7: 'Wartortle',
 8: 'Blastoise',
 9: 'Caterpie',
 10: 'Metapod',
 11: 'Butterfree',
 12: 'Weedle',
 13: 'Kakuna',
 14: 'Beedrill',
 15: 'Pidgey',
 16: 'Pidgeotto',
 17: 'Pidgeot',
 18: 'Rattata',
 19: 'Raticate',
 20: 'Spearow',
 21: 'Fearow',
 22: 'Ekans',
 23: 'Arbok',
 24: 'Pikachu',
 25: 'Raichu',
 26: 'Sandshrew',
 27: 'Sandslash',
 28: 'Nidoran♀',
 29: 'Nidorina',
 30: 'Nidoqueen',
 31: 'Nidoran♂',
 32: 'Nidorino',
 33: 'Nidoking',
 34: 'Clefairy',
 35: 'Clefable',
 36: 'Vulpix',
 37: 'Ninetales',
 38: 'Jigglypuff',
 39: 'Wigglytuff',
 40: 'Zubat',
 41: 'Golbat',
 42: 'Oddish',
 43: 'Gloom',
 44: 'Vileplume',
 45: 'Paras',
 46: 'Parasect',
 47: 'Venonat',
 48: 'Venomoth',
 49: 'Diglett',
 50: 'Dugtrio',
 51: 'Meowth',
 52: 'Persian',
 53: 'Psyduck',
 54: 'Golduck',
 55: 'Mankey',
 56: 'Primeape',
 57: 'Growlithe',
 58: 'Arcanine',
 59: 'Poliwag',
 60: 'Poliwhirl',

In [18]:
sorted(pokemon)

['Abomasnow',
 'Abra',
 'Absol',
 'Accelgor',
 'Aegislash',
 'Aerodactyl',
 'Aggron',
 'Aipom',
 'Alakazam',
 'Alcremie',
 'Alomomola',
 'Altaria',
 'Amaura',
 'Ambipom',
 'Amoonguss',
 'Ampharos',
 'Annihilape',
 'Anorith',
 'Appletun',
 'Applin',
 'Araquanid',
 'Arbok',
 'Arboliva',
 'Arcanine',
 'Arceus',
 'Archen',
 'Archeops',
 'Arctibax',
 'Arctovish',
 'Arctozolt',
 'Ariados',
 'Armaldo',
 'Armarouge',
 'Aromatisse',
 'Aron',
 'Arrokuda',
 'Articuno',
 'Audino',
 'Aurorus',
 'Avalugg',
 'Axew',
 'Azelf',
 'Azumarill',
 'Azurill',
 'Bagon',
 'Baltoy',
 'Banette',
 'Barbaracle',
 'Barboach',
 'Barraskewda',
 'Basculegion',
 'Basculin',
 'Bastiodon',
 'Baxcalibur',
 'Bayleef',
 'Beartic',
 'Beautifly',
 'Beedrill',
 'Beheeyem',
 'Beldum',
 'Bellibolt',
 'Bellossom',
 'Bellsprout',
 'Bergmite',
 'Bewear',
 'Bibarel',
 'Bidoof',
 'Binacle',
 'Bisharp',
 'Blacephalon',
 'Blastoise',
 'Blaziken',
 'Blipbug',
 'Blissey',
 'Blitzle',
 'Boldore',
 'Boltund',
 'Bombirdier',
 'Bonsly',
 'Bo

In [20]:
max(google)

151.863495

In [21]:
max(pokemon)
# For strings, max returns the value that sorts last alphabetically (character by character); length plays no role.

'Zygarde'

In [22]:
min(google)

2.47049

In [23]:
min(pokemon)
# For strings, min returns the value that sorts first alphabetically (character by character); length plays no role.

'Abomasnow'
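
Python compares strings character by character, which explains the `max` and `min` results for names. The sketch below uses a hypothetical four-name Series to show the built-in functions side by side.

```python
import pandas as pd

names = pd.Series(["Zubat", "Zygarde", "Abra", "Abomasnow"])  # hypothetical sample

count = len(names)        # number of values
as_list = list(names)     # values only
as_dict = dict(names)     # index label -> value
ordered = sorted(names)   # a new sorted Python list

# "Zy" sorts after "Zu", and "Abo" sorts before "Abr".
largest = max(names)
smallest = min(names)
print(largest, smallest)
```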

## Check for Inclusion with Python's in Keyword
- The `in` keyword checks if a value exists within an object.
- The `in` keyword will look for a value in the **Series's** index.
- Use the `index` and `values` attributes to access "nested" objects within the **Series**.
- Combine the `in` keyword with `values` to search within the **Series's** values.

In [25]:
0 in pokemon

True

In [26]:
pokemon.head()

0     Bulbasaur
1       Ivysaur
2      Venusaur
3    Charmander
4    Charmeleon
Name: Name, dtype: object

In [28]:
"Bulbasaur" in pokemon
# The result is False because `in` checks the Series's index labels, not its values.


False

In [29]:
"Bulbasaur" in pokemon.values

True

In [30]:
23 in pokemon.index

True
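
The difference between searching the index and searching the values can be sketched on a hypothetical two-item Series:

```python
import pandas as pd

names = pd.Series(["Bulbasaur", "Ivysaur"])  # hypothetical sample, default index 0 and 1

label_found = 0 in names                       # checks the index labels
value_in_series = "Bulbasaur" in names         # still checks the index labels
value_in_values = "Bulbasaur" in names.values  # checks the underlying values

print(label_found, value_in_series, value_in_values)
```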

## The sort_values Method
- The `sort_values` method sorts a **Series**'s values in order.
- By default, pandas applies an ascending sort (smallest to largest).
- Customize the sort order with the `ascending` parameter.

In [31]:
pokemon.sort_values(ascending=True)

459    Abomasnow
62          Abra
358        Absol
616     Accelgor
680    Aegislash
         ...    
570      Zoroark
569        Zorua
40         Zubat
633     Zweilous
717      Zygarde
Name: Name, Length: 1010, dtype: object

In [32]:
pokemon.sort_values(ascending=False)

717      Zygarde
633     Zweilous
40         Zubat
569        Zorua
570      Zoroark
         ...    
680    Aegislash
616     Accelgor
358        Absol
62          Abra
459    Abomasnow
Name: Name, Length: 1010, dtype: object

In [34]:
pokemon.sort_values(ascending=True).head(10)

459     Abomasnow
62           Abra
358         Absol
616      Accelgor
680     Aegislash
141    Aerodactyl
305        Aggron
189         Aipom
64       Alakazam
868      Alcremie
Name: Name, dtype: object
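
A short sketch with hypothetical data shows two details of `sort_values`: each value keeps its original index label, and the method returns a new **Series** rather than changing the original.

```python
import pandas as pd

prices = pd.Series([30, 10, 20], index=["c", "a", "b"])  # hypothetical data

ascending = prices.sort_values()                  # smallest to largest
descending = prices.sort_values(ascending=False)  # largest to smallest

print(ascending.index.tolist())
print(prices.tolist())  # the original order is untouched
```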

## The sort_index Method
- The `sort_index` method sorts a **Series** by its index.
- The `sort_index` method also accepts an `ascending` parameter to set sort order.

In [38]:
pokemon=pd.read_csv("pokemon.csv",index_col="Name").squeeze("columns")

In [39]:
pokemon.sort_index(ascending=True)

Name
Abomasnow        Grass, Ice
Abra                Psychic
Absol                  Dark
Accelgor                Bug
Aegislash      Steel, Ghost
                  ...      
Zoroark                Dark
Zorua                  Dark
Zubat        Poison, Flying
Zweilous       Dark, Dragon
Zygarde      Dragon, Ground
Name: Type, Length: 1010, dtype: object
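
The same idea in miniature, assuming a hypothetical three-row Series with string labels:

```python
import pandas as pd

# Hypothetical sample with names as index labels.
types = pd.Series(
    ["Fire", "Psychic", "Poison"],
    index=["Charmander", "Abra", "Zubat"],
)

a_to_z = types.sort_index()                 # ascending by label
z_to_a = types.sort_index(ascending=False)  # descending by label

print(a_to_z.index.tolist())
```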

## Extract Series Value by Index Position
- Use the `iloc` accessor to extract a **Series** value by its index position.
- `iloc` is short for "integer location".
- Python's list slicing syntaxes (slices, slices from start, slices to end, etc.) are supported with **Series** objects.

In [40]:
pokemon=pd.read_csv("pokemon.csv",usecols=["Name"]).squeeze("columns")

In [41]:
pokemon.iloc[0:10]

0     Bulbasaur
1       Ivysaur
2      Venusaur
3    Charmander
4    Charmeleon
5     Charizard
6      Squirtle
7     Wartortle
8     Blastoise
9      Caterpie
Name: Name, dtype: object
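
The slicing forms mentioned above can be sketched on a hypothetical five-item Series. As with Python lists, the stop position of a slice is excluded and negative positions count from the end.

```python
import pandas as pd

names = pd.Series(["Bulbasaur", "Ivysaur", "Venusaur", "Charmander", "Charmeleon"])

first = names.iloc[0]          # single position
last = names.iloc[-1]          # negative positions count from the end
middle = names.iloc[1:3]       # positions 1 and 2; the stop is excluded
picked = names.iloc[[0, 4]]    # a list of positions
from_third = names.iloc[3:]    # position 3 to the end

print(first, last, middle.tolist())
```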

## Extract Series Value by Index Label
- Use the `loc` accessor to extract a **Series** value by its index label.
- Pass a list to extract multiple values by index label.
- If any index label in the list does not exist, pandas raises a `KeyError`.

In [42]:
pokemon = pd.read_csv("pokemon.csv", index_col="Name").squeeze("columns")
pokemon.head()

Name
Bulbasaur     Grass, Poison
Ivysaur       Grass, Poison
Venusaur      Grass, Poison
Charmander             Fire
Charmeleon             Fire
Name: Type, dtype: object

In [43]:
pokemon.iloc[0]

'Grass, Poison'

In [44]:
pokemon.loc["Mewtwo"]
pokemon.loc[["Charizard", "Jolteon", "Meowth"]]

# pokemon.loc["Digimon"]

# pokemon.loc[["Pikachu", "Digimon"]]

Name
Charizard    Fire, Flying
Jolteon          Electric
Meowth             Normal
Name: Type, dtype: object
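
The commented-out lookups in the cell above fail because `"Digimon"` is not an index label. A minimal sketch of that behaviour, using a hypothetical two-row Series, together with the `get` method, which returns a fallback value instead of raising:

```python
import pandas as pd

types = pd.Series({"Pikachu": "Electric", "Meowth": "Normal"})  # hypothetical sample

# One missing label in the list is enough to raise a KeyError.
try:
    types.loc[["Pikachu", "Digimon"]]
    raised = False
except KeyError:
    raised = True

# get returns the supplied default when the label is absent.
fallback = types.get("Digimon", "unknown")
print(raised, fallback)
```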

### Overwrite a Series Value

In [45]:
pokemon = pd.read_csv("pokemon.csv", usecols=["Name"]).squeeze("columns")
pokemon.head()

0     Bulbasaur
1       Ivysaur
2      Venusaur
3    Charmander
4    Charmeleon
Name: Name, dtype: object

In [46]:
pokemon.iloc[0]="centaraus"

In [47]:
pokemon.head()

0     centaraus
1       Ivysaur
2      Venusaur
3    Charmander
4    Charmeleon
Name: Name, dtype: object

In [48]:
pokemon.iloc[[1,2,3]]=["dinosaur","segasaur","triasaur"]

In [49]:
pokemon.head()

0     centaraus
1      dinosaur
2      segasaur
3      triasaur
4    Charmeleon
Name: Name, dtype: object

In [50]:
pokemon = pd.read_csv("pokemon.csv", index_col="Name").squeeze("columns")
pokemon.head()

Name
Bulbasaur     Grass, Poison
Ivysaur       Grass, Poison
Venusaur      Grass, Poison
Charmander             Fire
Charmeleon             Fire
Name: Type, dtype: object

In [54]:
pokemon.loc["Bulbasaur"]="segasaur"

In [55]:
pokemon.head()

Name
Bulbasaur          segasaur
Ivysaur       Grass, Poison
Venusaur      Grass, Poison
Charmander             Fire
Charmeleon             Fire
Name: Type, dtype: object

## The copy Method
- A **copy** is a duplicate/replica of an object.
- Changes to a copy do not modify the original object.
- A **view** is a different way of looking at the *same* data.
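- Changes to a view *do* modify the original object. This depends on the pandas version: with Copy-on-Write enabled (opt-in in pandas 2.x, the default from pandas 3.0), pandas copies the data on modification instead, and the original stays unchanged.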
- The `copy` method creates a copy of a pandas object.

In [56]:
# example of a view
pokemon_df=pd.read_csv("pokemon.csv",usecols=["Name"])
pokemon_series=pokemon_df.squeeze("columns")

In [57]:
pokemon_series.head()

0     Bulbasaur
1       Ivysaur
2      Venusaur
3    Charmander
4    Charmeleon
Name: Name, dtype: object

In [58]:
pokemon_series.iloc[0]="centaraus"

In [59]:
pokemon_df

Unnamed: 0,Name
0,centaraus
1,Ivysaur
2,Venusaur
3,Charmander
4,Charmeleon
...,...
1005,Iron Valiant
1006,Koraidon
1007,Miraidon
1008,Walking Wake


In [60]:
# example using the copy method
pokemon_df=pd.read_csv("pokemon.csv",usecols=["Name"])
pokemon_series=pokemon_df.squeeze("columns").copy()

In [61]:
pokemon_series.iloc[0]="centaraus"

In [62]:
pokemon_df

Unnamed: 0,Name
0,Bulbasaur
1,Ivysaur
2,Venusaur
3,Charmander
4,Charmeleon
...,...
1005,Iron Valiant
1006,Koraidon
1007,Miraidon
1008,Walking Wake
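
The guarantee that `copy` provides can be sketched without a CSV file. The example below uses a hypothetical two-item Series; it only demonstrates the copy case, since whether an object behaves as a view depends on the pandas version and settings.

```python
import pandas as pd

original = pd.Series(["Bulbasaur", "Ivysaur"])  # hypothetical sample

duplicate = original.copy()      # independent copy of the data
duplicate.iloc[0] = "centaraus"  # modify only the copy

print(original.iloc[0], duplicate.iloc[0])
```

The original keeps `"Bulbasaur"` because the assignment touched only the copy.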


## Math Methods on Series Objects
- The `count` method returns the number of values in the **Series**. It excludes missing values; the `size` attribute includes missing values.
- The `sum` method adds together the **Series's** values.
- The `product` method multiplies together the **Series's** values.
- The `mean` method calculates the average of the **Series's** values.
- The `std` method calculates the standard deviation of the **Series's** values.
- The `max` method returns the largest value in the **Series**.
- The `min` method returns the smallest value in the **Series**.
- The `median` method returns the median of the **Series** (the value in the middle).
- The `mode` method returns the mode of the **Series** (the most frequent value). It returns a **Series** because several values can tie for most frequent.
- The `describe` method returns a summary with various mathematical calculations.

In [64]:
google=pd.read_csv("google_stock_price.csv",usecols=["Price"]).squeeze("columns")
google.head()


0    2.490664
1    2.515820
2    2.758411
3    2.770615
4    2.614201
Name: Price, dtype: float64

In [65]:
google.count()

4793

In [66]:
google.size()
# size is an attribute, not a method: it already holds an integer, so calling it raises a TypeError. Use google.size instead.

TypeError: 'int' object is not callable

In [67]:
google.sum()
# it returns the total sum of values i.e price

192733.129338

In [69]:
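google.product()
# the product of thousands of prices exceeds the largest float64 value, so the result overflows to inf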

inf

In [70]:
google.mean()

40.211376870018775

In [71]:
google.std()

37.274752943868094

In [72]:
google.median()

26.327717

In [73]:
google.mode()

0    14.719826
1    49.000000
Name: Price, dtype: float64

In [74]:
google.min()

2.47049

In [75]:
google.max()

151.863495

In [76]:
google.describe()

count    4793.000000
mean       40.211377
std        37.274753
min         2.470490
25%        12.767395
50%        26.327717
75%        56.311001
max       151.863495
Name: Price, dtype: float64
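
The difference between `count` and `size` only shows up when data is missing. The sketch below assumes a hypothetical four-item Series with one `NaN`; the statistical methods skip the missing value by default.

```python
import pandas as pd

values = pd.Series([1.0, 2.0, float("nan"), 2.0])  # hypothetical data with one missing value

non_missing = values.count()    # excludes NaN
total_slots = values.size       # attribute, no parentheses; includes NaN
total = values.sum()            # NaN is skipped
average = values.mean()         # 5.0 divided by the 3 non-missing values
middle = values.median()
most_common = values.mode()     # a Series, since ties are possible

print(non_missing, total_slots, total)
```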

## Broadcasting
- **Broadcasting** describes the process of applying an arithmetic operation to an array (i.e., a **Series**).
- We can combine mathematical operators with a **Series** to apply the mathematical operation to every value.
- There are also methods to accomplish the same results (`add`, `sub`, `mul`, `div`, etc.)

In [77]:
google = pd.read_csv("google_stock_price.csv", usecols=["Price"]).squeeze("columns")
google.head()

0    2.490664
1    2.515820
2    2.758411
3    2.770615
4    2.614201
Name: Price, dtype: float64

In [78]:
google.add(5)
# or we can also use the syntax google + 5

0         7.490664
1         7.515820
2         7.758411
3         7.770615
4         7.614201
           ...    
4788    137.080002
4789    137.998001
4790    140.570007
4791    142.050003
4792    143.429993
Name: Price, Length: 4793, dtype: float64

In [82]:
google.sub(2)
# or we can also use the syntax google - 2

0         0.490664
1         0.515820
2         0.758411
3         0.770615
4         0.614201
           ...    
4788    130.080002
4789    130.998001
4790    133.570007
4791    135.050003
4792    136.429993
Name: Price, Length: 4793, dtype: float64

In [83]:
google.mul(5)

0        12.453320
1        12.579100
2        13.792055
3        13.853075
4        13.071005
           ...    
4788    660.400010
4789    664.990005
4790    677.850035
4791    685.250015
4792    692.149965
Name: Price, Length: 4793, dtype: float64

In [84]:
google.div(5)

0        0.498133
1        0.503164
2        0.551682
3        0.554123
4        0.522840
          ...    
4788    26.416000
4789    26.599600
4790    27.114001
4791    27.410001
4792    27.685999
Name: Price, Length: 4793, dtype: float64
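
A compact sketch, using a hypothetical three-item Series, confirms that the operators and the named methods produce the same results:

```python
import pandas as pd

prices = pd.Series([2.0, 4.0, 6.0])  # hypothetical data

plus_five = prices + 5       # operator form
added = prices.add(5)        # method form
halved = prices.div(2)
doubled = prices * 2

# equals compares two Series value by value.
same = plus_five.equals(added)
print(same, halved.tolist())
```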

## The value_counts Method
- The `value_counts` method returns the number of times each unique value occurs in the **Series**.
- The `normalize` parameter returns the relative frequencies of the values (proportions between 0 and 1) instead of the counts.

In [85]:
pokemon=pd.read_csv("pokemon.csv",index_col="Name").squeeze("columns")
pokemon.head()

Name
Bulbasaur     Grass, Poison
Ivysaur       Grass, Poison
Venusaur      Grass, Poison
Charmander             Fire
Charmeleon             Fire
Name: Type, dtype: object

In [86]:
pokemon.value_counts()

Type
Water               74
Normal              74
Grass               46
Psychic             39
Fire                36
                    ..
Fighting, Ice        1
Fire, Dragon         1
Normal, Dragon       1
Psychic, Steel       1
Fighting, Dragon     1
Name: count, Length: 200, dtype: int64

In [87]:
pokemon.value_counts(ascending=True)
# by default, ascending=False

Type
Ice, Ghost           1
Fire, Water          1
Fighting, Flying     1
Normal, Ground       1
Dragon, Electric     1
                    ..
Fire                36
Psychic             39
Grass               46
Normal              74
Water               74
Name: count, Length: 200, dtype: int64

In [88]:
pokemon.value_counts(normalize=True)
# by default, normalize=False; normalize=True returns each value's share of the total (a proportion) instead of its count

Type
Water               0.073267
Normal              0.073267
Grass               0.045545
Psychic             0.038614
Fire                0.035644
                      ...   
Fighting, Ice       0.000990
Fire, Dragon        0.000990
Normal, Dragon      0.000990
Psychic, Steel      0.000990
Fighting, Dragon    0.000990
Name: proportion, Length: 200, dtype: float64
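
With a small hypothetical Series the arithmetic behind `normalize` is easy to check by hand: each proportion is the count divided by the total number of values, so the proportions sum to 1.

```python
import pandas as pd

types = pd.Series(["Fire", "Water", "Water", "Grass"])  # hypothetical sample

counts = types.value_counts()                # occurrences per unique value
shares = types.value_counts(normalize=True)  # occurrences divided by the total

print(counts["Water"], shares["Water"])
```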

## The apply Method
- The `apply` method accepts a function. It invokes that function on every `Series` value.

In [89]:
pokemon = pd.read_csv("pokemon.csv", usecols=["Name"]).squeeze("columns")
pokemon.head()

0     Bulbasaur
1       Ivysaur
2      Venusaur
3    Charmander
4    Charmeleon
Name: Name, dtype: object

In [90]:
pokemon.apply(len)

0        9
1        7
2        8
3       10
4       10
        ..
1005    12
1006     8
1007     8
1008    12
1009    11
Name: Name, Length: 1010, dtype: int64

In [92]:
def count_of_a(name):
    # name is a single string value from the Series
    return name.count("a")

In [93]:
pokemon.apply(count_of_a)

0       2
1       1
2       1
3       2
4       1
       ..
1005    2
1006    1
1007    1
1008    2
1009    1
Name: Name, Length: 1010, dtype: int64
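
For a one-off calculation, a `lambda` can replace the named function. This is a sketch on a hypothetical two-name Series; the function receives one value at a time, here a single string.

```python
import pandas as pd

names = pd.Series(["Bulbasaur", "Ivysaur"])  # hypothetical sample

lengths = names.apply(len)                            # built-in function
a_counts = names.apply(lambda name: name.count("a"))  # anonymous function

print(lengths.tolist(), a_counts.tolist())
```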

## The map Method
- The `map` method "maps" or connects each **Series** value to another value.
- We can pass the method a dictionary or a **Series**. Both types connect keys to values.
- The `map` method uses our argument to connect or bridge together the values. Values without a matching key become `NaN`.

In [94]:
pokemon = pd.read_csv("pokemon.csv", index_col="Name").squeeze("columns")
pokemon

Name
Bulbasaur          Grass, Poison
Ivysaur            Grass, Poison
Venusaur           Grass, Poison
Charmander                  Fire
Charmeleon                  Fire
                      ...       
Iron Valiant     Fairy, Fighting
Koraidon        Fighting, Dragon
Miraidon        Electric, Dragon
Walking Wake       Water, Dragon
Iron Leaves       Grass, Psychic
Name: Type, Length: 1010, dtype: object

In [95]:
attack_power={
    "Grass, Poison":5,
    "Fire":10,
    "Water, Dragon":15
}

In [96]:
pokemon.map(attack_power)

Name
Bulbasaur        5.0
Ivysaur          5.0
Venusaur         5.0
Charmander      10.0
Charmeleon      10.0
                ... 
Iron Valiant     NaN
Koraidon         NaN
Miraidon         NaN
Walking Wake    15.0
Iron Leaves      NaN
Name: Type, Length: 1010, dtype: float64
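
The `NaN` entries above appear because those types are not keys in `attack_power`. A minimal sketch with hypothetical data shows both accepted argument types, a dictionary with a missing key and a **Series** that covers every value:

```python
import pandas as pd

# Hypothetical sample; the power values are illustrative only.
types = pd.Series(
    ["Fire", "Water", "Fire"],
    index=["Charmander", "Squirtle", "Vulpix"],
)

partial = {"Fire": 10}                          # no entry for "Water"
full = pd.Series({"Fire": 10, "Water": 15})     # index labels act as the keys

from_dict = types.map(partial)   # "Water" has no match, so it becomes NaN
from_series = types.map(full)    # every value finds a match

print(from_dict.tolist(), from_series.tolist())
```

The missing match also changes the result's dtype to float, since `NaN` is a floating-point value.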