**Module-2: Part-01**

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

# Python Pandas

**Pandas** is an open-source library built mainly for working with relational or labeled data. It provides high-performance, easy-to-use data structures and data analysis tools for the Python programming language. In this module, we will look at the main features of Pandas and how to use them in practice.

This module is divided into two parts:

- **Python Pandas-Series**

- **Python Pandas-Dataframe**

## Python Pandas-Series

In this part of the module we will discuss the Pandas Series.

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)
## 1.Importing Modules

In [1]:
import pandas as pd
import numpy as np
print('Modules Imported')

Modules Imported


![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)
## 2.Pandas Series

We can create a Series with the **pd.Series()** constructor:

In [2]:
# FIFA 20 Player Ratings
pl_rat = pd.Series([94, 93, 92, 89, 88, 83, 82])

In [3]:
pl_rat

0    94
1    93
2    92
3    89
4    88
5    83
6    82
dtype: int64

In [12]:
pl_rat.name = 'FIFA 20 Player Ratings'

In [5]:
pl_rat

0    94
1    93
2    92
3    89
4    88
5    83
6    82
Name: FIFA 20 Player Ratings, dtype: int64

Let's check the type of data stored using the **dtype** attribute (it is an attribute, so no parentheses are needed):

In [13]:
pl_rat.dtype

dtype('int64')

In [14]:
pl_rat.values

array([94, 93, 92, 89, 88, 83, 82])

In [15]:
type(pl_rat.values)

numpy.ndarray

A Series may _look_ like a simple Python list or NumPy array, but it actually behaves more like a Python `dict`.

A Series has an `index` that is similar to the automatic index assigned to Python's lists:

In [6]:
pl_rat

0    94
1    93
2    92
3    89
4    88
5    83
6    82
Name: FIFA 20 Player Ratings, dtype: int64

In [16]:
pl_rat[0]

94

In [17]:
pl_rat[1]

93

In [7]:
pl_rat.index = [
    'Lionel Messi',
    'Cristiano Ronaldo',
    'Neymar Jr',
    'Manuel Neuer',
    'Dani Alves',
    'Matuidi',
    'Oscar',
]

In [8]:
pl_rat

Lionel Messi         94
Cristiano Ronaldo    93
Neymar Jr            92
Manuel Neuer         89
Dani Alves           88
Matuidi              83
Oscar                82
Name: FIFA 20 Player Ratings, dtype: int64

#### We can create the whole Pandas-series in the following way:

In [9]:
pd.Series(
    [94, 93, 92, 89, 88, 83, 82],
    index=['Lionel Messi', 'Cristiano Ronaldo', 'Neymar Jr', 'Manuel Neuer', 'Dani Alves', 'Matuidi',
       'Oscar'],
    name='FIFA 20 Player Ratings')

Lionel Messi         94
Cristiano Ronaldo    93
Neymar Jr            92
Manuel Neuer         89
Dani Alves           88
Matuidi              83
Oscar                82
Name: FIFA 20 Player Ratings, dtype: int64
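The dict-like behaviour mentioned earlier can be sketched as follows. This is only an illustration: the names `ratings_by_player` and `ratings` are made up for this example, and the values reuse three of the ratings from this notebook.

```python
import pandas as pd

# Hypothetical example data: a plain dict mapping player names to ratings
ratings_by_player = {'Lionel Messi': 94, 'Cristiano Ronaldo': 93, 'Neymar Jr': 92}

# A dict can be passed straight to pd.Series(); its keys become the index
ratings = pd.Series(ratings_by_player)

# Like a dict, membership tests look at the labels, not the values
print('Neymar Jr' in ratings)   # True
print(94 in ratings)            # False

# Lookup by label works like a dict lookup
print(ratings['Lionel Messi'])  # 94
```

Unlike a dict, the Series also keeps its entries in order and supports positional access through `.iloc`.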

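#### We can build a new Series from selected labels of an existing one. Labels that are missing from the original (here `Marcelo`) get `NaN`, which also changes the dtype to `float64`: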

In [10]:
pd.Series(pl_rat, index=['Dani Alves', 'Matuidi', 'Oscar', 'Marcelo'])

Dani Alves    88.0
Matuidi       83.0
Oscar         82.0
Marcelo        NaN
Name: FIFA 20 Player Ratings, dtype: float64
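The same selection can probably be written more explicitly with `Series.reindex()`. The sketch below uses a small stand-alone Series (the variable names `ratings` and `subset` are chosen for this example only) to show where the `NaN` and the `float64` dtype come from.

```python
import pandas as pd

# Small stand-alone Series reusing three ratings from this notebook
ratings = pd.Series([88, 83, 82], index=['Dani Alves', 'Matuidi', 'Oscar'])

# reindex() keeps the values of labels that exist and inserts NaN for the rest
subset = ratings.reindex(['Matuidi', 'Oscar', 'Marcelo'])
print(subset)

# NaN is a floating-point value, so the integer data is upcast to float64
print(subset.dtype)            # float64

# isna() flags the labels that were not found in the original Series
print(subset.isna().tolist())  # [False, False, True]
```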

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)
## 3.Indexing

#### Indexing works similarly to lists and dictionaries: we use the **index** label of the element we are looking for:

In [11]:
pl_rat

Lionel Messi         94
Cristiano Ronaldo    93
Neymar Jr            92
Manuel Neuer         89
Dani Alves           88
Matuidi              83
Oscar                82
Name: FIFA 20 Player Ratings, dtype: int64

In [19]:
pl_rat['Oscar']

82

In [20]:
pl_rat['Neymar Jr']

92

In [23]:
pl_rat.iloc[0] # Returns the first element in the series

94

In [24]:
pl_rat.iloc[-1] # Returns the last element in the series

82

#### Selecting multiple elements at once:

In [25]:
pl_rat[['Oscar', 'Matuidi']]

Oscar      82
Matuidi    83
Name: FIFA 20 Player Ratings, dtype: int64

In [26]:
pl_rat.iloc[[0, 1]]

Lionel Messi         94
Cristiano Ronaldo    93
Name: FIFA 20 Player Ratings, dtype: int64

#### When slicing by label, Pandas also includes the upper limit (unlike positional slicing of Python lists):

In [29]:
pl_rat['Lionel Messi': 'Oscar']

Lionel Messi         94
Cristiano Ronaldo    93
Neymar Jr            92
Manuel Neuer         89
Dani Alves           88
Matuidi              83
Oscar                82
Name: FIFA 20 Player Ratings, dtype: int64
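To make the difference concrete, here is a minimal sketch comparing a label slice with a positional slice. The Series `ratings` and the names `by_label` and `by_position` are introduced only for this example.

```python
import pandas as pd

ratings = pd.Series(
    [94, 93, 92, 89],
    index=['Lionel Messi', 'Cristiano Ronaldo', 'Neymar Jr', 'Manuel Neuer'],
)

# Label-based slice: the end label 'Neymar Jr' is included
by_label = ratings.loc['Lionel Messi':'Neymar Jr']

# Position-based slice: the end position 2 is excluded, as with Python lists
by_position = ratings.iloc[0:2]

print(by_label.index.tolist())     # ['Lionel Messi', 'Cristiano Ronaldo', 'Neymar Jr']
print(by_position.index.tolist())  # ['Lionel Messi', 'Cristiano Ronaldo']
```

Using `.loc` for labels and `.iloc` for positions keeps the intent explicit, which helps when the index itself contains integers.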

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)
## 4.Boolean Arrays

In [30]:
pl_rat

Lionel Messi         94
Cristiano Ronaldo    93
Neymar Jr            92
Manuel Neuer         89
Dani Alves           88
Matuidi              83
Oscar                82
Name: FIFA 20 Player Ratings, dtype: int64

In [32]:
pl_rat > 90 # Returns True for ratings greater than 90

Lionel Messi          True
Cristiano Ronaldo     True
Neymar Jr             True
Manuel Neuer         False
Dani Alves           False
Matuidi              False
Oscar                False
Name: FIFA 20 Player Ratings, dtype: bool

In [46]:
pl_rat[(pl_rat > 80) | (pl_rat < 90)] # Returns ratings greater than 80 or less than 90 (every rating here satisfies at least one condition)

Lionel Messi         94
Cristiano Ronaldo    93
Neymar Jr            92
Manuel Neuer         89
Dani Alves           88
Matuidi              83
Oscar                82
Name: FIFA 20 Player Ratings, dtype: int64

In [47]:
pl_rat[(pl_rat > 80) & (pl_rat < 90)] # Returns ratings greater than 80 and less than 90

Manuel Neuer    89
Dani Alves      88
Matuidi         83
Oscar           82
Name: FIFA 20 Player Ratings, dtype: int64

In [34]:
pl_rat[pl_rat > 90] # Returns the ratings that are greater than 90

Lionel Messi         94
Cristiano Ronaldo    93
Neymar Jr            92
Name: FIFA 20 Player Ratings, dtype: int64

In [42]:
pl_rat.mean() # Calculates the mean of all ratings

88.71428571428571

In [39]:
pl_rat[pl_rat > pl_rat.mean()] # Returns the ratings that are greater than the overall mean (i.e. 88.714)

Lionel Messi         94
Cristiano Ronaldo    93
Neymar Jr            92
Manuel Neuer         89
Name: FIFA 20 Player Ratings, dtype: int64

In [41]:
pl_rat.std() # Returns the standard deviation of the ratings

4.7509397566616824

##### ~ not
##### | or
##### & and

In [43]:
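# Ratings above mean - std/2; the second condition is a subset of the first, so the OR selects the same rows as the first condition alone
pl_rat[(pl_rat > pl_rat.mean() - pl_rat.std() / 2) | (pl_rat > pl_rat.mean() + pl_rat.std() / 2)]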

Lionel Messi         94
Cristiano Ronaldo    93
Neymar Jr            92
Manuel Neuer         89
Dani Alves           88
Name: FIFA 20 Player Ratings, dtype: int64
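The cells above use `|` and `&` but not `~`. The following sketch, on a small stand-alone Series (all variable names are chosen for this example), shows negation and why each comparison needs its own parentheses and the element-wise operators rather than Python's `and`/`or`.

```python
import pandas as pd

ratings = pd.Series([94, 89, 82], index=['Lionel Messi', 'Manuel Neuer', 'Oscar'])

# ~ flips every True/False in a boolean Series
above_90 = ratings > 90
not_above_90 = ~above_90
print(ratings[not_above_90].index.tolist())  # ['Manuel Neuer', 'Oscar']

# & combines two conditions element by element; the parentheses are required
# because & binds more tightly than the comparison operators
in_range = (ratings > 85) & (ratings < 92)
print(ratings[in_range].tolist())            # [89]

# Python's `and` asks for a single True/False value, which a Series cannot give
try:
    (ratings > 85) and (ratings < 92)
    used_plain_and = True
except ValueError:
    used_plain_and = False
print(used_plain_and)                        # False
```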

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)
## 5.Operations & Methods

In [48]:
pl_rat

Lionel Messi         94
Cristiano Ronaldo    93
Neymar Jr            92
Manuel Neuer         89
Dani Alves           88
Matuidi              83
Oscar                82
Name: FIFA 20 Player Ratings, dtype: int64

In [50]:
pl_rat * 100 # Multiplies every rating by 100

Lionel Messi         9400
Cristiano Ronaldo    9300
Neymar Jr            9200
Manuel Neuer         8900
Dani Alves           8800
Matuidi              8300
Oscar                8200
Name: FIFA 20 Player Ratings, dtype: int64

In [51]:
pl_rat.mean()

88.71428571428571

In [55]:
np.log(pl_rat) # Returns the natural logarithm of each rating

Lionel Messi         4.543295
Cristiano Ronaldo    4.532599
Neymar Jr            4.521789
Manuel Neuer         4.488636
Dani Alves           4.477337
Matuidi              4.418841
Oscar                4.406719
Name: FIFA 20 Player Ratings, dtype: float64

In [57]:
pl_rat['Lionel Messi': 'Dani Alves'].mean() # Returns the mean rating from Lionel Messi to Dani Alves (inclusive)

91.2
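One detail worth checking in the operations above is that they return new Series and leave the original untouched. A minimal sketch, using a stand-alone Series and example-only variable names (`scaled`, `logged`):

```python
import numpy as np
import pandas as pd

ratings = pd.Series([94, 93, 92], index=['Lionel Messi', 'Cristiano Ronaldo', 'Neymar Jr'])

# Arithmetic is applied element by element and produces a new Series
scaled = ratings / 100

# NumPy functions such as np.log also work element-wise and keep the index labels
logged = np.log(ratings)

print(scaled.dtype)           # float64
print(logged.index.tolist())  # ['Lionel Messi', 'Cristiano Ronaldo', 'Neymar Jr']
print(ratings.tolist())       # [94, 93, 92], the original is unchanged
```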

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)
## 6.Modifying Series

In [58]:
pl_rat['Lionel Messi']

94

In [61]:
pl_rat['Lionel Messi'] = 97 # Replaces the current value of rating with new value

In [60]:
pl_rat

Lionel Messi         97
Cristiano Ronaldo    93
Neymar Jr            92
Manuel Neuer         89
Dani Alves           88
Matuidi              83
Oscar                82
Name: FIFA 20 Player Ratings, dtype: int64

In [62]:
pl_rat.iloc[-1] = 79

In [63]:
pl_rat

Lionel Messi         97
Cristiano Ronaldo    93
Neymar Jr            92
Manuel Neuer         89
Dani Alves           88
Matuidi              83
Oscar                79
Name: FIFA 20 Player Ratings, dtype: int64

In [64]:
pl_rat[pl_rat < 90] = 70 # Sets every rating below 90 to 70

In [65]:
pl_rat

Lionel Messi         97
Cristiano Ronaldo    93
Neymar Jr            92
Manuel Neuer         70
Dani Alves           70
Matuidi              70
Oscar                70
Name: FIFA 20 Player Ratings, dtype: int64
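All the assignments above change `pl_rat` in place. If the original values are still needed, one option is to work on a copy. The sketch below assumes a small stand-alone Series; the names `ratings` and `adjusted`, and the extra `Marcelo` entry with its rating, are hypothetical and only serve the illustration.

```python
import pandas as pd

ratings = pd.Series([97, 89, 79], index=['Lionel Messi', 'Manuel Neuer', 'Oscar'])

# copy() returns an independent Series, so edits do not touch the original
adjusted = ratings.copy()
adjusted[adjusted < 90] = 70

# Assigning to a label that does not exist yet appends a new entry
adjusted['Marcelo'] = 85

print(ratings.tolist())   # [97, 89, 79]
print(adjusted.tolist())  # [97, 70, 70, 85]
```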

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

[Module-2-Part-2: Pandas Dataframe](https://github.com/ffarhaaan/Data-Visualization-Using-Python-Libraries/blob/master/M02-2-pandas-dataframe.ipynb)