## Pandas Data Structures
The two workhorse data structures in pandas are **Series** and **DataFrame**. It is important to understand both in order to work effectively with pandas. In what follows, we will take a look at the pandas Series with practical examples.


### Pandas Series
A pandas Series is a one-dimensional array containing a sequence of values (typically of a single type, for instance integers, floats or strings) together with an index. The points below summarize why Series are useful.

- Series are fundamental to data analysis tasks and allow operations on data to be performed efficiently.
- They provide methods to handle and clean data, making it easier to prepare data for analysis.
- They allow easy manipulation of data, such as changing values, renaming index labels, and more.
- Integration: Series integrate seamlessly with other pandas data structures, like DataFrames, making it easy to work with large and complex datasets.

#### Characteristics of a Pandas Series
- Each element in a Series has a label (its index), which can be used to access the data. Labels can be numeric or strings. They are usually unique, although pandas does not require this.
- A Series can hold various data types, including integers, floats, strings, and even Python objects.
- You can perform various operations on Series, such as arithmetic operations, filtering, and more, using the index to align data.
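
To make the last point concrete, here is a minimal sketch of index alignment and filtering. The names `home` and `away` and their values are made up for illustration and are not part of the tutorial data.

```python
import pandas as pd

# Two small Series with partly overlapping labels (illustrative values).
home = pd.Series({"a": 1, "b": 2, "c": 3})
away = pd.Series({"b": 10, "c": 20, "d": 30})

# Arithmetic aligns on the index labels, not on position.
# Labels present in only one Series ("a" and "d") produce NaN.
total = home + away

# Boolean filtering keeps only the elements that satisfy the condition.
large = home[home > 1]

print(total)
print(large)
```

Because of the alignment step, the result covers the union of both indexes, which is why its dtype becomes float (NaN is a float value).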





We begin by importing the relevant library. We will use the pandas library in Python.

In [89]:
import pandas as pd


We will create a simple pandas Series made up of six numbers chosen at random.

In [90]:
data_a = pd.Series([1,3,5,6,7,8])
data_a

0    1
1    3
2    5
3    6
4    7
5    8
dtype: int64

Note from the output above that the numbers 0 to 5 in the left column are the index, whilst the values are shown in the right column. Each value is assigned an index label, and by default the labels start at 0.
The data type (dtype) is int64, a 64-bit integer.
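
The three parts of a Series described above can also be inspected separately. This is a small sketch that rebuilds the same `data_a`; the variable names `labels`, `values` and `kind` are only illustrative.

```python
import pandas as pd

data_a = pd.Series([1, 3, 5, 6, 7, 8])

labels = data_a.index       # the default index: integer positions 0 to 5
values = data_a.to_numpy()  # the underlying values as a NumPy array
kind = data_a.dtype         # the data type of the values

print(list(labels))
print(values)
print(kind)
```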

We can also assign an index of our choice to the data, as shown below. In this case, we have decided to use strings as the index. I will show you two ways of doing this.

With the first approach, we pass our string labels as a list to the "index" argument, as shown below.

In [91]:
data_b = pd.Series([1,3,5,6,7,8], index = ["e", "f", "g", "h", "i", "j"])
data_b

e    1
f    3
g    5
h    6
i    7
j    8
dtype: int64



Alternatively, we can create a Series and assign our own index by passing a dictionary, as shown below. The dictionary keys become the index and the dictionary values become the data.

Below, I use the names of football teams as the index and assign each team some random points (values).

In [92]:
data_c = pd.Series({"Arsenal": 30, "Chelsea":22, "Dortmund":42, "Liverpool":29, "Bochum":15, "Schalke_04":15, "FC_Koln":18} )
data_c

Arsenal       30
Chelsea       22
Dortmund      42
Liverpool     29
Bochum        15
Schalke_04    15
FC_Koln       18
dtype: int64

From the output above we can see all the teams (index) with their respective points.

Once again, the index can be inspected with the code below.

In [93]:
data_c.index

Index(['Arsenal', 'Chelsea', 'Dortmund', 'Liverpool', 'Bochum', 'Schalke_04',
       'FC_Koln'],
      dtype='object')


Now suppose we would like to select a subset of the Series.

For instance, we may want to find out how many points a particular team accumulated. Let's look up the points accumulated by Chelsea.

In [94]:
data_c["Chelsea"]

22

Better still, suppose we want the points for two or more teams, for instance Dortmund, Bochum and Arsenal. In that case we pass a list of labels inside the brackets.

In [95]:
data_c[["Dortmund", "Bochum", "Arsenal"]]

Dortmund    42
Bochum      15
Arsenal     30
dtype: int64
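
Besides plain brackets, pandas offers explicit accessors for selection. The sketch below rebuilds `data_c` and shows label-based selection with `.loc`, position-based selection with `.iloc`, and a boolean filter. The threshold of 25 points is an arbitrary choice for illustration.

```python
import pandas as pd

data_c = pd.Series({"Arsenal": 30, "Chelsea": 22, "Dortmund": 42,
                    "Liverpool": 29, "Bochum": 15, "Schalke_04": 15,
                    "FC_Koln": 18})

by_label = data_c.loc["Chelsea"]   # select by index label
by_position = data_c.iloc[1]       # select by integer position (second row)

# Label slices with .loc include BOTH endpoints.
label_slice = data_c.loc["Chelsea":"Liverpool"]

# Boolean filter: teams with more than 25 points.
above_25 = data_c[data_c > 25]

print(by_label, by_position)
print(label_slice)
print(above_25)
```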


Great. Now suppose we would like to find out whether a particular team is part of the Series. In other words, we would like to know if a label is in the index.

For instance, we can check whether Juventus is one of the teams in the Series, using the code below.

In [96]:
"Juventus" in data_c

False

Python returns False because Juventus is not in the index. Note that the "in" operator checks the index labels of a Series, not its values.
What if we check whether Arsenal is in the index?

In [97]:
"Arsenal" in data_c

True
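
Since `in` looks at the labels, checking for a value needs a different expression. The sketch below shows two options on a rebuilt `data_c`; the variable names are illustrative only.

```python
import pandas as pd

data_c = pd.Series({"Arsenal": 30, "Chelsea": 22, "Dortmund": 42,
                    "Liverpool": 29, "Bochum": 15, "Schalke_04": 15,
                    "FC_Koln": 18})

label_found = "Arsenal" in data_c          # membership test against the index
value_as_label = 30 in data_c              # 30 is a value, not a label

# To test the values instead, look at the underlying array ...
value_found = 30 in data_c.to_numpy()
# ... or use isin(), which returns a boolean Series, combined with any().
value_found_alt = bool(data_c.isin([30]).any())

print(label_found, value_as_label, value_found, value_found_alt)
```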

This indicates that Arsenal is in the index, since Python returns True. Now suppose we would like to add Juventus to the Series and assign it 35 points.

First of all, we create a new one-element Series for Juventus. Then we concatenate it (concatenation means joining items end to end) with the original Series, as in the steps below.

In [98]:
new_team = pd.Series([35], index = ["Juventus"])

data_new = pd.concat([data_c,new_team])
data_new

Arsenal       30
Chelsea       22
Dortmund      42
Liverpool     29
Bochum        15
Schalke_04    15
FC_Koln       18
Juventus      35
dtype: int64
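
One thing to be aware of is that `pd.concat` does not check for repeated labels by default. The sketch below uses a shortened, made-up Series and a hypothetical `duplicate` entry to show this, along with the `verify_integrity` option that raises an error instead.

```python
import pandas as pd

teams = pd.Series({"Arsenal": 30, "Chelsea": 22})
duplicate = pd.Series([99], index=["Arsenal"])  # label already exists in teams

# By default both "Arsenal" rows are kept.
combined = pd.concat([teams, duplicate])
has_duplicates = combined.index.has_duplicates

# With verify_integrity=True, overlapping labels raise a ValueError.
try:
    pd.concat([teams, duplicate], verify_integrity=True)
    raised = False
except ValueError:
    raised = True

print(combined)
print(has_duplicates, raised)
```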

This approach is rather long and can be a little confusing for Python newcomers. There is a simpler way to add a new team to the Series: assign a value to a new label.
Let's add Manchester City and assign it 30 points.

In [99]:
data_new["Manchester_city"] = 30
data_new

Arsenal            30
Chelsea            22
Dortmund           42
Liverpool          29
Bochum             15
Schalke_04         15
FC_Koln            18
Juventus           35
Manchester_city    30
dtype: int64
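
The same assignment syntax behaves differently depending on whether the label already exists. This is a small sketch on a shortened, made-up Series; the value 33 is arbitrary.

```python
import pandas as pd

teams = pd.Series({"Arsenal": 30, "Chelsea": 22})

teams["Juventus"] = 35   # new label: a row is added at the end
teams["Arsenal"] = 33    # existing label: the value is overwritten

print(teams)
```

So it is worth checking with `in` first if you want to avoid overwriting an existing entry by accident.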

There you go, pretty straightforward and easy.

Now suppose we would like to delete (drop) one of the rows. Let us drop Bochum from the Series.

In [100]:
data_new = data_new.drop("Bochum")
data_new

Arsenal            30
Chelsea            22
Dortmund           42
Liverpool          29
Schalke_04         15
FC_Koln            18
Juventus           35
Manchester_city    30
dtype: int64
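
A few details of `drop` are worth knowing: it returns a new Series and leaves the original untouched, it accepts a list of labels, and `errors="ignore"` skips labels that do not exist. The sketch below uses a shortened, made-up Series to show all three.

```python
import pandas as pd

teams = pd.Series({"Arsenal": 30, "Bochum": 15, "Chelsea": 22})

# drop() returns a new Series; teams itself still contains Bochum.
without_bochum = teams.drop("Bochum")

# Several labels can be dropped at once by passing a list.
only_arsenal = teams.drop(["Bochum", "Chelsea"])

# A missing label normally raises a KeyError; errors="ignore" suppresses it.
unchanged = teams.drop("Juventus", errors="ignore")

print(without_bochum)
print(only_arsenal)
print(unchanged)
```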

As can be seen, Bochum is no longer part of the Series.

Now suppose we made a mistake with one of the index labels: we typed Liverpool instead of Leicester. How can we correct this? We can rename the index label, replacing the old name with the new one.

In [101]:
data_new.rename(index = {"Liverpool": "Leicester"}, inplace = True)
data_new

Arsenal            30
Chelsea            22
Dortmund           42
Leicester          29
Schalke_04         15
FC_Koln            18
Juventus           35
Manchester_city    30
dtype: int64
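
As an alternative to `inplace = True`, `rename` can return a new Series that you assign to a variable, which keeps the original available. The sketch below uses a shortened, made-up Series; it also shows that a function can be passed instead of a dictionary.

```python
import pandas as pd

teams = pd.Series({"Arsenal": 30, "Liverpool": 29})

# Without inplace, rename() returns a new Series and leaves teams unchanged.
renamed = teams.rename(index={"Liverpool": "Leicester"})

# A function is applied to every label; here all labels are upper-cased.
upper = teams.rename(index=str.upper)

print(renamed)
print(upper)
```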

We can compute simple descriptive statistics for our Series.
We call the describe() method and then round the result to 3 decimal places.

The count, mean, standard deviation, and the 25th, 50th (median) and 75th percentiles are displayed. We can also see the minimum and maximum points.

In [102]:
data_new.describe().round(3)

count     8.000
mean     27.625
std       8.927
min      15.000
25%      21.000
50%      29.500
75%      31.250
max      42.000
dtype: float64
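
If only one of these figures is needed, each has its own method. The sketch below rebuilds the Series in its state at this point of the walkthrough; the variable names are illustrative.

```python
import pandas as pd

data_new = pd.Series({"Arsenal": 30, "Chelsea": 22, "Dortmund": 42,
                      "Leicester": 29, "Schalke_04": 15, "FC_Koln": 18,
                      "Juventus": 35, "Manchester_city": 30})

mean_points = data_new.mean()        # 27.625
median_points = data_new.median()    # 29.5
q75 = data_new.quantile(0.75)        # 31.25
top_team = data_new.idxmax()         # label of the highest value: "Dortmund"

print(mean_points, median_points, q75, top_team)
```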

Finally, let's give our Series and its index a name by setting their name attributes.



In [103]:
data_new.name = "Football Teams and points"
data_new.index.name = "Teams"

data_new

Teams
Arsenal            30
Chelsea            22
Dortmund           42
Leicester          29
Schalke_04         15
FC_Koln            18
Juventus           35
Manchester_city    30
Name: Football Teams and points, dtype: int64
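
These names carry over when a Series is converted to a DataFrame. The sketch below uses a shortened, made-up Series with the hypothetical name "Points" to show this.

```python
import pandas as pd

points = pd.Series({"Arsenal": 30, "Chelsea": 22}, name="Points")
points.index.name = "Teams"

# to_frame() uses the Series name as the column name and keeps the index.
frame = points.to_frame()

# reset_index() turns the index into a regular column named after the index.
flat = points.reset_index()

print(frame)
print(flat)
```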

That's all for pandas Series. Thanks and see you next week!