# Pandas Series

This notebook is an introduction to Pandas Series. Complete the tasks in the code cells below.

First things first, we should **import pandas**.

In [1]:
import pandas as pd  #pd is the commonly used alias for pandas

## Pandas Series

A Pandas Series is a one-dimensional array-like object that can hold data of any type (int, float, string, etc.). The data of a Pandas Series is labeled, meaning each element has an index label. If no index is provided, the elements are labeled with their position (starting from 0). \
\
Pandas Series can be created from lists, arrays, dictionaries, and existing Series objects and are a building block for the Pandas DataFrame. You can compare a Pandas Series with a single column of a database table. 

**Important** 
* The vast majority of Pandas methods **produce new objects**, leaving the input data untouched. 
* Immutability is favored where sensible.
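
To make the first point concrete, here is a minimal sketch. The Series `names` and the use of `str.upper()` are chosen for illustration only; any method that returns a new object would do.

```python
import pandas as pd

names = pd.Series(["Mickey", "Minnie"])

# str.upper() returns a new Series; `names` itself is not modified
upper_names = names.str.upper()

print(upper_names.tolist())  # ['MICKEY', 'MINNIE']
print(names.tolist())        # ['Mickey', 'Minnie']
```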



## Creating a Pandas Series from a list

In [2]:
data = ["Mickey", "Minnie", "Pluto", "Donald Duck"]
series_from_list = pd.Series(data)
series_from_list 

0         Mickey
1         Minnie
2          Pluto
3    Donald Duck
dtype: object

* If no index labels are specified, the elements are labeled with their position (starting from 0).
* Assigning a Series to a variable does not show the Series in the output field. To display it, put the variable name on a separate line at the end of the cell, as the third line above does.

### Creating a Series with a custom index

**🧰 Task**
* Create a list `data` containing several names
* Create a list `idx` of the same length as `data` containing "Participant1, Participant2, etc."
* Create the Series `series_custom_index` as we did above, but pass an extra argument `index=idx`
* What happens when you `print(series_custom_index)`?

In [3]:
# Your code

data = ['John', 'Alice', 'Bob']

idx = ['Participant1', 'Participant2', 'Participant3']

series_custom_index = pd.Series(data, index=idx)

print(series_custom_index)

Participant1     John
Participant2    Alice
Participant3      Bob
dtype: object
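
For longer lists, typing every label by hand gets tedious. The sketch below is one possible approach, not part of the task itself: it generates the labels with a list comprehension and also shows what happens when the index length does not match the data.

```python
import pandas as pd

data = ["John", "Alice", "Bob"]

# Build "Participant1", "Participant2", ... to match the length of data
idx = [f"Participant{i}" for i in range(1, len(data) + 1)]
series_custom_index = pd.Series(data, index=idx)
print(series_custom_index)

# An index of the wrong length is rejected with a ValueError
try:
    pd.Series(data, index=["Participant1", "Participant2"])
except ValueError as err:
    print("ValueError:", err)
```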


### Changing the indices
Say we forgot to use custom indices and now the index is number based... Luckily we can still change the indices! \
\
First let's make a copy of `series_from_list`.

#### Making a copy

In [4]:
print("Original series_from_list:")
print(series_from_list)
copy = pd.Series(series_from_list)
copy[1] = "Winnie" #Assign Winnie to index 1
print("series_from_list:")
print(series_from_list)
print("copy:")
print(copy)


Original series_from_list:
0         Mickey
1         Minnie
2          Pluto
3    Donald Duck
dtype: object
series_from_list:
0         Mickey
1         Winnie
2          Pluto
3    Donald Duck
dtype: object
copy:
0         Mickey
1         Winnie
2          Pluto
3    Donald Duck
dtype: object


Well, that did not work... `pd.Series(series_from_list)` does create a new Series object named `copy`, but in the pandas version used here it does not copy the underlying data: `copy` and `series_from_list` share the same values. Assigning Winnie to the second element of `copy` therefore changes `series_from_list` as well. (With Copy-on-Write enabled, which newer pandas versions adopt as the default, the original would stay untouched.)

To get an independent copy of a Series, we need to make a deep copy.

In [5]:
new_copy = series_from_list.copy()  #deep=True is the default
new_copy[1] = "Timon"
print("Original:")
print(series_from_list)
print("New copy")
print(new_copy)

Original:
0         Mickey
1         Winnie
2          Pluto
3    Donald Duck
dtype: object
New copy
0         Mickey
1          Timon
2          Pluto
3    Donald Duck
dtype: object


#### Customizing the index
Now that we have a copy, we can customize the index.
Say we want to change the indices to figure1, figure2, etc.

In [6]:
idx = ["figure1", "figure2", "figure3", "figure4"]
new_copy.index = idx
new_copy

figure1         Mickey
figure2          Timon
figure3          Pluto
figure4    Donald Duck
dtype: object
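
Assigning to `.index` changes the Series in place. If you would rather get a new Series and keep the original as it is, `set_axis` and `rename` are two alternatives. The sketch below uses a fresh Series `figures`, a name made up for this example.

```python
import pandas as pd

figures = pd.Series(["Mickey", "Timon", "Pluto", "Donald Duck"])

# set_axis replaces all labels and returns a new Series
relabeled = figures.set_axis(["figure1", "figure2", "figure3", "figure4"])

# rename with a dict changes only the labels that are listed
partly_renamed = figures.rename({0: "first"})

print(relabeled.index.tolist())
print(partly_renamed.index.tolist())
print(figures.index.tolist())  # the original keeps 0, 1, 2, 3
```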

## Creating a Pandas Series from a dictionary

The dictionary `data` maps letters of the alphabet to their position, starting at 1 (a:1, b:2, etc.). We create a Series `series_from_dict` from `data` and display it. The keys become the index labels and the values become the data.



In [7]:
data = {"a": 1, "b": 2, "c": 3, "d":4}
series_from_dict = pd.Series(data)
series_from_dict

a    1
b    2
c    3
d    4
dtype: int64
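
The cell above types out only the first four letters. As a sketch of how the full alphabet could be built, the example below uses `string.ascii_lowercase` from the standard library; the names `alphabet` and `alphabet_series` are made up for illustration.

```python
import string
import pandas as pd

# Map every lowercase letter to its position in the alphabet: a -> 1, ..., z -> 26
alphabet = {
    letter: number
    for number, letter in enumerate(string.ascii_lowercase, start=1)
}
alphabet_series = pd.Series(alphabet)

print(alphabet_series.head(4))
print(alphabet_series.size)  # 26
```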

**🧰Task** 
* Create a Series `my_series` from a dictionary with keys London, Tripoli, Cairo and their values 10, 100, 10 respectively. Do this without storing the dictionary in a variable.

In [8]:
# Your code
my_series = pd.Series({'London': 10, 'Tripoli': 100, 'Cairo': 10})

Display the Series. The index labels of the Series are 'London', 'Tripoli', 'Cairo'
and its values are 10, 100, 10.

In [9]:
# Displaying the Series
my_series

London      10
Tripoli    100
Cairo       10
dtype: int64

### Creating a Series with specified indices

When creating a new Series from `my_series`, we can pass `index` to specify which labels we want to include. `lc_series` only contains the data for London and Cairo.

In [10]:
lc_series = pd.Series(my_series, index=["London", "Cairo"])
lc_series

London    10
Cairo     10
dtype: int64
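
It is worth knowing what happens when a requested label does not exist. The sketch below recreates `my_series` so it runs on its own and asks for "Paris", a label added only for this example. It also shows `.loc` with a list of labels as another way to select existing entries.

```python
import pandas as pd

my_series = pd.Series({"London": 10, "Tripoli": 100, "Cairo": 10})

# "Paris" is not a label of my_series, so its value becomes NaN
# and the dtype changes to float64 to be able to hold NaN
lp_series = pd.Series(my_series, index=["London", "Paris"])
print(lp_series)

# Selecting existing labels with .loc keeps the integer dtype
print(my_series.loc[["London", "Cairo"]])
```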

### Creating a Series with values of different type 

**🧰Task** Find out if it is possible to create a Pandas Series with both integers and strings as values.

In [11]:
# Your code

data = [42, "hello", 3.14, "world"]
series = pd.Series(data)

print(series)
print(series.dtype)

0       42
1    hello
2     3.14
3    world
dtype: object
object
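
So mixing types is possible, at the cost of the general-purpose `object` dtype. The sketch below compares the dtype pandas infers for a few small inputs; the variable names are made up for this example.

```python
import pandas as pd

only_ints = pd.Series([1, 2, 3])
ints_and_floats = pd.Series([1, 2.5])
ints_and_strings = pd.Series([1, "two"])

print(only_ints.dtype)         # int64
print(ints_and_floats.dtype)   # float64: the integer is converted to a float
print(ints_and_strings.dtype)  # object: each element keeps its own Python type

print(type(ints_and_strings.iloc[0]), type(ints_and_strings.iloc[1]))
```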


### 💼 Make exercise 1. train delay part 1

## Accessing data 
### Using the index label

In [12]:
# Accessing a specific element in the Series using the index label
my_series['Tripoli']

np.int64(100)

In [13]:
# more examples
my_series['Tripoli']
(series_from_list[0])
(series_from_dict["b"])


np.int64(2)

Notice that only the result of the last line is shown in the second output cell: a notebook displays only the value of the final expression in a cell. Use `print()` to show multiple outcomes.
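
A minimal sketch of printing several results from one cell. It recreates `my_series` so it runs on its own, and uses `.loc` for label-based access and `.iloc` for position-based access, which makes the intent explicit.

```python
import pandas as pd

my_series = pd.Series({"London": 10, "Tripoli": 100, "Cairo": 10})

# print() shows every result, not only the last expression of the cell
print(my_series.loc["Tripoli"])  # by label
print(my_series.iloc[0])         # by position: the first element
```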

### Based on a condition

In [14]:
# Filtering elements based on a condition
my_series[my_series > 10] #shows all values (with their index), greater than 10.

Tripoli    100
dtype: int64

In [15]:
my_series

London      10
Tripoli    100
Cairo       10
dtype: int64

Notice how `my_series` remains unchanged. As mentioned earlier, the majority of Pandas methods **produce new objects**, leaving the input data untouched.
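
To see why this kind of filtering works, it helps to look at the condition on its own. The sketch below recreates `my_series`; the names `mask` and `selected` are made up for this example.

```python
import pandas as pd

my_series = pd.Series({"London": 10, "Tripoli": 100, "Cairo": 10})

# The condition itself is a Series of True/False values
mask = my_series > 10
print(mask)

# Combine conditions with & (and) or | (or); each condition needs parentheses
selected = my_series[(my_series >= 10) & (my_series < 100)]
print(selected)
```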

### Attributes
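A Series object has several attributes: index, values, dtype, shape, ndim, size, name... The cell below also uses a few methods (`count()`, `value_counts()`, `drop()`), which are called with parentheses.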

In [16]:
#Show the indices of the values of the Series
print(my_series.index)

#Show the index of the first element from the Series
print(my_series.index[0])

#Show the values of the series
print(my_series.values)

#Show the second value of the Series
print(my_series.values[1])

#Show how many elements the Series contains (count() ignores missing values)
print(my_series.size)
print(my_series.count())

print("__________________________")

#Show how many times each value occurs in the Series
print(my_series.value_counts())

#Show the data type of the elements of the Series
print(my_series.dtype)

#Remove data entries based on their index label (returns a new Series)
print(my_series.drop(labels=["London", "Tripoli"]))


Index(['London', 'Tripoli', 'Cairo'], dtype='object')
London
[ 10 100  10]
100
3
3
__________________________
10     2
100    1
Name: count, dtype: int64
int64
Cairo    10
dtype: int64
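
The attributes `shape`, `ndim` and `name` are listed above but not shown in the cell. A minimal sketch follows; the name "visits" and the Series `with_gap` are made up for this example.

```python
import pandas as pd

# "visits" is a hypothetical name, used only to show the name attribute
my_series = pd.Series({"London": 10, "Tripoli": 100, "Cairo": 10}, name="visits")

print(my_series.shape)  # (3,)
print(my_series.ndim)   # 1
print(my_series.name)   # visits

# size counts all elements, count() skips missing values
with_gap = pd.Series([1.0, None, 3.0])
print(with_gap.size, with_gap.count())  # 3 2
```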


Check out the [API reference](https://pandas.pydata.org/docs/reference/series.html#constructor) for more functionality.

## Manipulating data 
Besides accessing data we can also change the data of a Series.
### Using the index label

In [17]:
#Assign a new value to the element with label London
my_series["London"] = 200
my_series

London     200
Tripoli    100
Cairo       10
dtype: int64

### Based on a condition

In [18]:
#Assign 0 to all values equal to 100
my_series[my_series==100] = 0
my_series

London     200
Tripoli      0
Cairo       10
dtype: int64
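
The assignment above changes `my_series` itself. If the original should stay as it is, `Series.mask` is one alternative that returns a new Series. The sketch below recreates the Series so it runs on its own; the name `replaced` is made up for this example.

```python
import pandas as pd

my_series = pd.Series({"London": 200, "Tripoli": 100, "Cairo": 10})

# mask() replaces the values where the condition is True and returns a new Series
replaced = my_series.mask(my_series == 100, 0)

print(replaced)
print(my_series["Tripoli"])  # still 100 in the original
```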

### Changing the index
Here we change the index labels of `my_series` to Brussels, Amsterdam and Berlin.

In [19]:
my_series.index = ["Brussels", "Amsterdam", "Berlin"]
my_series

Brussels     200
Amsterdam      0
Berlin        10
dtype: int64

### 💼 Make exercise 1. train delay part 2
### 💼 Make exercise 2. Zoo animals