# 3. Series

A **series** is a single column of data. It combines the best features of lists and dictionaries, i.e., the elements of a series are stored in order, *like a list*, and an identifier can be assigned to each value, *like a dictionary*.

Series also offer far more built-in methods than lists or dictionaries.

In [None]:
%pip install pandas

In [None]:
import pandas as pd

## Creating a series from a list

In [3]:
ice_cream = ["Chocolate", "Vanilla", "Strawberry", "Rocky Road"]
pd.Series(ice_cream)

0     Chocolate
1       Vanilla
2    Strawberry
3    Rocky Road
dtype: object

Here, we did not specify the index labels when instantiating the series, so pandas defaulted to a numeric index starting at 0. The output shows `dtype` as `object` because pandas stores strings and other arbitrary Python objects under the generic `object` dtype.
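
As a minimal sketch of the default index described above, the snippet below builds a series from a list and inspects its labels. The `flavors` list and the variable names are hypothetical, chosen only for illustration.

```python
import pandas as pd

# Hypothetical data for illustration only.
flavors = ["Mint", "Mango", "Pistachio"]
flavor_series = pd.Series(flavors)

# With no index supplied, pandas numbers the values 0, 1, 2, ...
default_labels = list(flavor_series.index)
print(default_labels)

# Values keep the order they had in the list.
first_value = flavor_series.iloc[0]
print(first_value)
```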

## Creating a series from a dictionary

In [4]:
sushi = {"Salmon": "orange",
         "Tuna": "red",
         "Eel": "brown"}

pd.Series(sushi)

Salmon    orange
Tuna         red
Eel        brown
dtype: object

Since Python 3.7, dictionaries preserve insertion order (in earlier versions they were unordered). When a dictionary is converted to a series, its keys become the index labels and the series keeps that order.
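
A small sketch of how dictionary keys map to index labels, assuming a hypothetical `planets` dictionary that is not part of the original notebook:

```python
import pandas as pd

# Hypothetical data for illustration only.
planets = {"Mercury": 0, "Venus": 0, "Earth": 1}
moons = pd.Series(planets)

# The dictionary keys become the index labels, in insertion order.
labels = list(moons.index)
print(labels)

# A value can be looked up by its label.
earth_moons = moons.loc["Earth"]
print(earth_moons)
```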

## Intro to Series Methods

In [5]:
prices = pd.Series([2.99, 4.45, 1.36])
prices

0    2.99
1    4.45
2    1.36
dtype: float64

In [6]:
print("The sum of the series is:", prices.sum())
print("The product of the series is:", prices.product())
print("The average of the series is:", prices.mean())
print("The standard deviation of the series is:", prices.std())

The sum of the series is: 8.8
The product of the series is: 18.095480000000006
The average of the series is: 2.9333333333333336
The standard deviation of the series is: 1.5457791994115246


## Intro to Attributes

**Attributes** are pieces of data that belong to an object. They are also known as object variables. An object's attribute can be accessed using `object.attribute`, without parentheses.

In [7]:
adjectives = pd.Series(["Smart", "Handsome", "Charming", "Brilliant", "Humble", "Smart"])
adjectives

0        Smart
1     Handsome
2     Charming
3    Brilliant
4       Humble
5        Smart
dtype: object

In [8]:
adjectives.size # Attribute that shows how many elements are in the series

6

In [9]:
adjectives.is_unique # Attribute that shows if all elements in the series are unique

False

In [10]:
adjectives.values

array(['Smart', 'Handsome', 'Charming', 'Brilliant', 'Humble', 'Smart'],
      dtype=object)

In [11]:
adjectives.index

RangeIndex(start=0, stop=6, step=1)

This shows that the indices and the actual contents in a series both exist separately and are composed together to form a series object.

In [12]:
type(adjectives.values)

numpy.ndarray

In [13]:
type(adjectives.index)

pandas.core.indexes.range.RangeIndex

It is interesting to note that the `values` object is a NumPy array (NumPy is a separate Python library that pandas builds on), whereas the `index` object is a pandas object.
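
The attributes above can be tried on a smaller series. This is only a sketch with a hypothetical `words` series; note that attributes are read without parentheses, unlike method calls.

```python
import pandas as pd

# Hypothetical data for illustration only.
words = pd.Series(["up", "down", "up"])

n_items = words.size          # attribute: number of elements
all_unique = words.is_unique  # attribute: False here, "up" repeats

# The index and the values are separate objects that together form the series.
label_list = list(words.index)
value_list = list(words.values)

print(n_items, all_unique, label_list, value_list)
```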

## Parameters and Arguments

**Parameters** are the names given to the expected inputs to a function/method/class instantiation.
<br>**Arguments** are the concrete values provided in place of the parameters at the time of invocation.

In [14]:
names = ["Lohit", "Nithya", "Sarthak", "Shivam", "Vishnu", "Nitesh"]
player_tags = ["cracc", "hydratedbicc", "Bauji", "sh1ver", "Flip", "Tauji"]

pd.Series(data=player_tags, index=names)

Lohit             cracc
Nithya     hydratedbicc
Sarthak           Bauji
Shivam           sh1ver
Vishnu             Flip
Nitesh            Tauji
dtype: object

In [15]:
pd.Series(index=names)

Lohit     NaN
Nithya    NaN
Sarthak   NaN
Shivam    NaN
Vishnu    NaN
Nitesh    NaN
dtype: float64
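
To make the distinction concrete, here is a minimal sketch that passes the same arguments to `pd.Series` first by position and then by parameter name. The `cities` and `codes` lists are hypothetical.

```python
import pandas as pd

# Hypothetical data for illustration only.
cities = ["Pune", "Delhi", "Kochi"]
codes = ["PNQ", "DEL", "COK"]

# Positional arguments: matched to parameters by position (data first, then index).
by_position = pd.Series(codes, cities)

# Keyword arguments: matched by parameter name, so the order is free.
by_keyword = pd.Series(index=cities, data=codes)

same = by_position.equals(by_keyword)
print(same)
```

Keyword arguments are usually easier to read, because the call site states which parameter each value is meant for.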

## Import a Series with the pd.read_csv Function

The `pd.read_csv()` function reads a CSV file and loads its contents into the Python runtime.

In [16]:
df = pd.read_csv("datasets/pokemon.csv", usecols=["Name"])
df

              Name
0        Bulbasaur
1          Ivysaur
2         Venusaur
3       Charmander
4       Charmeleon
...            ...
1005  Iron Valiant
1006      Koraidon
1007      Miraidon
1008  Walking Wake


The object returned by `pd.read_csv` is called a **dataframe**.

A dataframe is a two-dimensional data structure, whereas a series is a one-dimensional data structure. You can use the `squeeze` method to coerce a single-column dataframe into a series.

In [17]:
pokemon = df.squeeze("columns")
pokemon

0          Bulbasaur
1            Ivysaur
2           Venusaur
3         Charmander
4         Charmeleon
            ...     
1005    Iron Valiant
1006        Koraidon
1007        Miraidon
1008    Walking Wake
1009     Iron Leaves
Name: Name, Length: 1010, dtype: object

In [18]:
google_stocks = pd.read_csv("datasets/google_stock_price.csv", usecols=["Price"]).squeeze('columns')
google_stocks

0         2.490664
1         2.515820
2         2.758411
3         2.770615
4         2.614201
           ...    
4788    132.080002
4789    132.998001
4790    135.570007
4791    137.050003
4792    138.429993
Name: Price, Length: 4793, dtype: float64
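
The same read-then-squeeze pattern can be sketched without a file on disk. The example below assumes a hypothetical in-memory CSV (`csv_text`) wrapped in `io.StringIO`, which `pd.read_csv` accepts in place of a path.

```python
import io
import pandas as pd

# Hypothetical in-memory CSV standing in for a file on disk.
csv_text = "Name,Type\nAlpha,Fire\nBeta,Water\nGamma,Grass\n"

# read_csv accepts a file path or a file-like object such as StringIO.
frame = pd.read_csv(io.StringIO(csv_text), usecols=["Name"])
print(type(frame).__name__)

# squeeze("columns") collapses the single column into a Series.
names = frame.squeeze("columns")
print(type(names).__name__)
```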

## The Head and Tail Methods

The `head` method returns a specified number of rows from the top of a series whereas the `tail` method returns a specified number of rows from the bottom of the series.

If no argument is provided, pandas defaults to 5 as the number of rows.

In [19]:
pokemon.head(8)

0     Bulbasaur
1       Ivysaur
2      Venusaur
3    Charmander
4    Charmeleon
5     Charizard
6      Squirtle
7     Wartortle
Name: Name, dtype: object

In [20]:
google_stocks.tail(7)

4786    134.727005
4787    130.139999
4788    132.080002
4789    132.998001
4790    135.570007
4791    137.050003
4792    138.429993
Name: Price, dtype: float64
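
A short sketch of the default behavior, using a hypothetical series of the integers 0 through 9:

```python
import pandas as pd

# Hypothetical data: the integers 0 through 9.
numbers = pd.Series(range(10))

top_default = numbers.head()    # no argument: first 5 values
bottom_three = numbers.tail(3)  # last 3 values

print(list(top_default))
print(list(bottom_three))
```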

## Passing a Series to Python's Built-In Functions

Series work well with several of Python's built-in functions, such as:
- `len` - returns the number of values in the series
- `type` - returns the type of the object (here, a pandas `Series`)
- `list` - converts the series values to a list
- `dict` - converts the series to a dictionary keyed by index label
- `sorted` - returns the series values as a sorted list
- `max` - returns the largest value in the series
- `min` - returns the smallest value in the series

In [21]:
len(pokemon)

1010

In [22]:
type(pokemon)

pandas.core.series.Series

In [23]:
list(pokemon)

['Bulbasaur',
 'Ivysaur',
 'Venusaur',
 'Charmander',
 'Charmeleon',
 'Charizard',
 'Squirtle',
 'Wartortle',
 'Blastoise',
 'Caterpie',
 'Metapod',
 'Butterfree',
 'Weedle',
 'Kakuna',
 'Beedrill',
 'Pidgey',
 'Pidgeotto',
 'Pidgeot',
 'Rattata',
 'Raticate',
 'Spearow',
 'Fearow',
 'Ekans',
 'Arbok',
 'Pikachu',
 'Raichu',
 'Sandshrew',
 'Sandslash',
 'Nidoran♀',
 'Nidorina',
 'Nidoqueen',
 'Nidoran♂',
 'Nidorino',
 'Nidoking',
 'Clefairy',
 'Clefable',
 'Vulpix',
 'Ninetales',
 'Jigglypuff',
 'Wigglytuff',
 'Zubat',
 'Golbat',
 'Oddish',
 'Gloom',
 'Vileplume',
 'Paras',
 'Parasect',
 'Venonat',
 'Venomoth',
 'Diglett',
 'Dugtrio',
 'Meowth',
 'Persian',
 'Psyduck',
 'Golduck',
 'Mankey',
 'Primeape',
 'Growlithe',
 'Arcanine',
 'Poliwag',
 'Poliwhirl',
 'Poliwrath',
 'Abra',
 'Kadabra',
 'Alakazam',
 'Machop',
 'Machoke',
 'Machamp',
 'Bellsprout',
 'Weepinbell',
 'Victreebel',
 'Tentacool',
 'Tentacruel',
 'Geodude',
 'Graveler',
 'Golem',
 'Ponyta',
 'Rapidash',
 'Slowpoke',
 ...

In [24]:
sorted(pokemon)

['Abomasnow',
 'Abra',
 'Absol',
 'Accelgor',
 'Aegislash',
 'Aerodactyl',
 'Aggron',
 'Aipom',
 'Alakazam',
 'Alcremie',
 'Alomomola',
 'Altaria',
 'Amaura',
 'Ambipom',
 'Amoonguss',
 'Ampharos',
 'Annihilape',
 'Anorith',
 'Appletun',
 'Applin',
 'Araquanid',
 'Arbok',
 'Arboliva',
 'Arcanine',
 'Arceus',
 'Archen',
 'Archeops',
 'Arctibax',
 'Arctovish',
 'Arctozolt',
 'Ariados',
 'Armaldo',
 'Armarouge',
 'Aromatisse',
 'Aron',
 'Arrokuda',
 'Articuno',
 'Audino',
 'Aurorus',
 'Avalugg',
 'Axew',
 'Azelf',
 'Azumarill',
 'Azurill',
 'Bagon',
 'Baltoy',
 'Banette',
 'Barbaracle',
 'Barboach',
 'Barraskewda',
 'Basculegion',
 'Basculin',
 'Bastiodon',
 'Baxcalibur',
 'Bayleef',
 'Beartic',
 'Beautifly',
 'Beedrill',
 'Beheeyem',
 'Beldum',
 'Bellibolt',
 'Bellossom',
 'Bellsprout',
 'Bergmite',
 'Bewear',
 'Bibarel',
 'Bidoof',
 'Binacle',
 'Bisharp',
 'Blacephalon',
 'Blastoise',
 'Blaziken',
 'Blipbug',
 'Blissey',
 'Blitzle',
 'Boldore',
 'Boltund',
 'Bombirdier',
 'Bonsly',
 ...

In [25]:
dict(pokemon)

{0: 'Bulbasaur',
 1: 'Ivysaur',
 2: 'Venusaur',
 3: 'Charmander',
 4: 'Charmeleon',
 5: 'Charizard',
 6: 'Squirtle',
 7: 'Wartortle',
 8: 'Blastoise',
 9: 'Caterpie',
 10: 'Metapod',
 11: 'Butterfree',
 12: 'Weedle',
 13: 'Kakuna',
 14: 'Beedrill',
 15: 'Pidgey',
 16: 'Pidgeotto',
 17: 'Pidgeot',
 18: 'Rattata',
 19: 'Raticate',
 20: 'Spearow',
 21: 'Fearow',
 22: 'Ekans',
 23: 'Arbok',
 24: 'Pikachu',
 25: 'Raichu',
 26: 'Sandshrew',
 27: 'Sandslash',
 28: 'Nidoran♀',
 29: 'Nidorina',
 30: 'Nidoqueen',
 31: 'Nidoran♂',
 32: 'Nidorino',
 33: 'Nidoking',
 34: 'Clefairy',
 35: 'Clefable',
 36: 'Vulpix',
 37: 'Ninetales',
 38: 'Jigglypuff',
 39: 'Wigglytuff',
 40: 'Zubat',
 41: 'Golbat',
 42: 'Oddish',
 43: 'Gloom',
 44: 'Vileplume',
 45: 'Paras',
 46: 'Parasect',
 47: 'Venonat',
 48: 'Venomoth',
 49: 'Diglett',
 50: 'Dugtrio',
 51: 'Meowth',
 52: 'Persian',
 53: 'Psyduck',
 54: 'Golduck',
 55: 'Mankey',
 56: 'Primeape',
 57: 'Growlithe',
 58: 'Arcanine',
 59: 'Poliwag',
 60: 'Poliwhirl',
 ...

In [26]:
max(google_stocks)

151.863495

In [27]:
min(google_stocks)

2.47049
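
Because the outputs above are long, the same built-ins are easier to follow on a tiny series. This is a sketch with a hypothetical `scores` series:

```python
import pandas as pd

# Hypothetical data for illustration only.
scores = pd.Series([70, 95, 82], index=["ann", "bob", "cy"])

count = len(scores)       # number of values
as_list = list(scores)    # values only, index labels dropped
as_dict = dict(scores)    # index labels become the keys
ordered = sorted(scores)  # a plain sorted list of the values
highest, lowest = max(scores), min(scores)

print(count, as_list, ordered, highest, lowest)
print(as_dict)
```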

## Checking for Inclusion in a Series

We already know that we can check for inclusion in Python using the `in` keyword. We also know that `series.values` returns a NumPy ndarray containing the values in a series. So, to check whether a value is in a series, we can use `value in series.values`. Applying `in` to the series itself checks the index labels, not the values.

In [28]:
pokemon1 = "Bulbasaur"
pokemon2 = "Pikachu"
pokemon3 = "Oogly-Boogly"

print(f'1. {pokemon1} {"does" if pokemon1 in pokemon.values else "does not"} exist in the Pokemon series')
print(f'2. {pokemon2} {"does" if pokemon2 in pokemon.values else "does not"} exist in the Pokemon series')
print(f'3. {pokemon3} {"does" if pokemon3 in pokemon.values else "does not"} exist in the Pokemon series')

1. Bulbasaur does exist in the Pokemon series
2. Pikachu does exist in the Pokemon series
3. Oogly-Boogly does not exist in the Pokemon series
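
The difference between testing the index and testing the values is easy to miss, so here is a minimal sketch with a hypothetical `colors` series whose labels and values are both strings:

```python
import pandas as pd

# Hypothetical data for illustration only.
colors = pd.Series(["red", "green"], index=["apple", "lime"])

# `in` on the series itself looks at the index labels, not the values.
label_found = "apple" in colors
value_found_in_series = "red" in colors

# To test the values, check against `.values` instead.
value_found = "red" in colors.values

print(label_found, value_found_in_series, value_found)
```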


## Sorting Series

The `sort_values` method sorts a series by its values, in ascending order by default. Pass `ascending=False` to sort in descending order instead.

In [29]:
google_stocks.sort_values(ascending=False)

4395    151.863495
4345    151.000000
4346    150.141754
4341    150.000000
4336    150.000000
           ...    
12        2.515820
11        2.514326
13        2.509095
0         2.490664
10        2.470490
Name: Price, Length: 4793, dtype: float64

The `sort_index` method sorts a series by its index labels and accepts the same `ascending` parameter as `sort_values`.

In [30]:
pokemon_and_type = pd.read_csv("datasets/pokemon.csv", index_col="Name", usecols=["Name","Type"]).squeeze("columns")
pokemon_and_type.sort_index(ascending=False).head()

Name
Zygarde     Dragon, Ground
Zweilous      Dark, Dragon
Zubat       Poison, Flying
Zorua                 Dark
Zoroark               Dark
Name: Type, dtype: object
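
As a compact sketch of both sorting methods, assuming a hypothetical `heights` series:

```python
import pandas as pd

# Hypothetical data for illustration only.
heights = pd.Series([170, 155, 182], index=["Mo", "Al", "Zed"])

ascending = heights.sort_values()                  # default: smallest first
descending = heights.sort_values(ascending=False)  # largest first
by_label = heights.sort_index()                    # alphabetical by index label

# Each call returns a new series; the original order is untouched.
print(list(ascending), list(descending), list(by_label.index), list(heights))
```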

## Extracting Series Values

### Extracting Series Value with Index Position

To extract a value from a series using its index position, we use the accessor `iloc`, which is short for 'integer location'. It is an attribute, not a method, so it takes square brackets rather than parentheses. The syntax is: `series.iloc[index]`. Slicing works the same way as with other sequences like lists, tuples, and strings.

In [31]:
pokemon_and_type.iloc[[100, 200, 300]]

Name
Electrode    Electric
Unown         Psychic
Delcatty       Normal
Name: Type, dtype: object

In [32]:
pokemon_and_type.iloc[27:36:2]

Name
Sandslash            Ground
Nidorina             Poison
Nidoran♂             Poison
Nidoking     Poison, Ground
Clefable              Fairy
Name: Type, dtype: object
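
The forms of positional access shown above can be summarized in one sketch. The `letters` series is hypothetical.

```python
import pandas as pd

# Hypothetical data for illustration only.
letters = pd.Series(["a", "b", "c", "d", "e", "f"])

single = letters.iloc[2]       # one position -> a scalar
picked = letters.iloc[[0, 5]]  # list of positions -> a series
sliced = letters.iloc[1:6:2]   # start:stop:step, stop excluded
last = letters.iloc[-1]        # negative positions count from the end

print(single, list(picked), list(sliced), last)
```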

### Extracting Series Value with Index Label

To extract a value using the label instead, we use the accessor `loc`.

In [33]:
pokemon_and_type.loc["Pikachu"]

'Electric'

In [34]:
pokemon_and_type.loc[["Charmander", "Geodude", "Snorlax"]]

Name
Charmander            Fire
Geodude       Rock, Ground
Snorlax             Normal
Name: Type, dtype: object
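
A minimal sketch of label-based access, including what happens when a label is missing. The `capitals` series is hypothetical.

```python
import pandas as pd

# Hypothetical data for illustration only.
capitals = pd.Series(
    ["Paris", "Rome", "Oslo"],
    index=["France", "Italy", "Norway"],
)

one = capitals.loc["Italy"]                   # single label -> a scalar
several = capitals.loc[["Norway", "France"]]  # list of labels -> a series

# A label that is not in the index raises KeyError.
try:
    capitals.loc["Spain"]
    missing_raises = False
except KeyError:
    missing_raises = True

print(one, list(several), missing_raises)
```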

### Using the get Method

The `get` method can also be used to extract a series value by its label. You can also pass a second argument which behaves as a fallback value if the label does not exist.

In [35]:
print(pokemon_and_type.get("Celebi", "This pokemon does not exist"))
print(pokemon_and_type.get("Oogabooga", "This pokemon does not exist"))

Psychic, Grass
This pokemon does not exist
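
The fallback behavior can be sketched on a small, hypothetical `stock` series. When no fallback is supplied, `get` returns `None` for a missing label instead of raising an error.

```python
import pandas as pd

# Hypothetical data for illustration only.
stock = pd.Series([12, 0], index=["pens", "clips"])

found = stock.get("pens", -1)        # label exists -> its value
fallback = stock.get("staples", -1)  # label missing -> the fallback
no_default = stock.get("staples")    # no fallback given -> None

print(found, fallback, no_default)
```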


## Overwriting a Series Value

You can overwrite an existing value in a series by assigning to it through the `iloc` accessor (by position) or the `loc` accessor (by label).

In [36]:
pokemon.iloc[0] = 'Borisaur'
pokemon.iloc[[1, 2, 4]] = ["Firemon", "Flamemon", "Blazemon"]
pokemon.head()

0      Borisaur
1       Firemon
2      Flamemon
3    Charmander
4      Blazemon
Name: Name, dtype: object

In [37]:
pokemon_and_type.loc[["Bulbasaur", "Ivysaur", "Venusaur", "Charmeleon"]] = ["Grass, Poison", "Grass, Poison", "Grass, Poison", "Fire"]
pokemon_and_type.head()

Name
Bulbasaur     Grass, Poison
Ivysaur       Grass, Poison
Venusaur      Grass, Poison
Charmander             Fire
Charmeleon             Fire
Name: Type, dtype: object
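
Both styles of assignment can be sketched together on a hypothetical `grades` series:

```python
import pandas as pd

# Hypothetical data for illustration only.
grades = pd.Series(["B", "C", "A"], index=["ana", "ben", "cal"])

grades.iloc[0] = "A"                       # overwrite by position
grades.loc["ben"] = "B"                    # overwrite by label
grades.loc[["ana", "cal"]] = ["A+", "A-"]  # overwrite several labels at once

print(list(grades))
```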

## The copy Method

The `copy` method creates a replica of an object. Changes made to the copy are not reflected in the original.

A **view** is just a different way of looking at the same data. If any change is made to a view, it does affect the original object.

In [38]:
pokemon_df = pd.read_csv("datasets/pokemon.csv", usecols=["Name"])
pokemon_df.head()

         Name
0   Bulbasaur
1     Ivysaur
2    Venusaur
3  Charmander
4  Charmeleon


In [39]:
pokemon_series = pokemon_df.squeeze("columns")
pokemon_series.iloc[0] = "Random"
pokemon_series.head()

0        Random
1       Ivysaur
2      Venusaur
3    Charmander
4    Charmeleon
Name: Name, dtype: object

In [40]:
pokemon_df.head()

         Name
0      Random
1     Ivysaur
2    Venusaur
3  Charmander
4  Charmeleon


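This shows that, in the pandas version used here, a series squeezed from a dataframe is a view of the dataframe and not a distinct copy. This behavior is version-dependent: with Copy-on-Write enabled (the default planned for pandas 3.0), writing to the series would leave the dataframe unchanged.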

In [41]:
pokemon_series_dup = pokemon_df.squeeze("columns").copy()
pokemon_series_dup.iloc[1] = "Whatever"
pokemon_series_dup.head()

0        Random
1      Whatever
2      Venusaur
3    Charmander
4    Charmeleon
Name: Name, dtype: object

In [42]:
pokemon_df.head()

         Name
0      Random
1     Ivysaur
2    Venusaur
3  Charmander
4  Charmeleon
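
The independence of a copy can also be sketched without a dataframe. The `original` and `duplicate` names below are hypothetical.

```python
import pandas as pd

# Hypothetical data for illustration only.
original = pd.Series(["x", "y", "z"])

# copy() returns an independent series with its own data.
duplicate = original.copy()
duplicate.iloc[0] = "changed"

print(list(original), list(duplicate))
```

Calling `copy` explicitly is the safer choice whenever you intend to modify a series derived from another object and want the source left untouched.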
