
**Notes about Pandas**

In [0]:
import numpy as np
import pandas as pd

# Creating pandas Series

A Pandas Series is a one-dimensional, array-like object that can hold many data types.

| Pandas Series | NumPy ndarray |
| --- | --- |
| each element can be given an index label | elements have no labels |
| can hold elements of different data types | all elements share one dtype; mixed input is coerced |
| data can be accessed by label name as well as by position | data is accessed by integer position only |
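
To make the contrast concrete, here is a minimal sketch that builds both containers from the same mixed list. The names `mixed`, `arr` and `ser` are illustrative and not part of the notes.

```python
import numpy as np
import pandas as pd

mixed = [30, 6, 'Yes', 'No']

# NumPy coerces the mixed list to one common dtype (here: unicode strings)
arr = np.array(mixed)
print(arr.dtype.kind)       # 'U' marks a fixed-width unicode string dtype

# A Series with dtype object keeps each element as its original Python object
ser = pd.Series(mixed, index=['eggs', 'apples', 'milk', 'bread'])
print(ser.dtype)
print(type(ser['eggs']))

# The Series accepts a label; the ndarray only accepts a position
print(ser['milk'], arr[2])
```

In the array the number 30 ends up as the string `'30'`, while the Series still returns it as an integer.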



In [2]:
# create a Pandas Series that stores a grocery list
groceries = pd.Series(data = [30, 6, 'Yes', 'No'], index = ['eggs', 'apples', 'milk', 'bread'])

# We display the Groceries Pandas Series
print(groceries)

print('\nGroceries has shape:', groceries.shape)
print('Groceries has dimension:', groceries.ndim)
print('Groceries has a total of', groceries.size, 'elements')

eggs       30
apples      6
milk      Yes
bread      No
dtype: object

Groceries has shape: (4,)
Groceries has dimension: 1
Groceries has a total of 4 elements


In [3]:
# print the index and data of the series
print('The data in Groceries is:', groceries.values)
print('The index of Groceries is:', groceries.index)

The data in Groceries is: [30 6 'Yes' 'No']
The index of Groceries is: Index(['eggs', 'apples', 'milk', 'bread'], dtype='object')


In [4]:
# Check if a label exists in the series
x = 'apples' in groceries
y = 'elephant' in groceries
print(x, y)

True False
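
Note that `in` tests the index labels, not the stored values. A minimal sketch, which rebuilds the grocery Series so that the block runs on its own (the variable names on the left are illustrative):

```python
import pandas as pd

groceries = pd.Series(data=[30, 6, 'Yes', 'No'],
                      index=['eggs', 'apples', 'milk', 'bread'])

# `in` looks at the index labels only
label_found = 'milk' in groceries
value_as_label = 'Yes' in groceries

# To test the stored values, check against .values instead
value_found = 'Yes' in groceries.values

print(label_found, value_as_label, value_found)  # True False True
```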


# Accessing and Deleting Elements in pandas Series

The attribute *.loc* stands for location; it states that we are using a **labeled** index.

The attribute *.iloc* stands for integer location; it states that we are using a **numerical** (positional) index.
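
The difference matters most when the labels are themselves integers. The sketch below uses a hypothetical Series `stock` whose labels (10, 20, 30) do not match the positions (0, 1, 2):

```python
import pandas as pd

# Hypothetical Series whose integer labels differ from the positions
stock = pd.Series(data=['low', 'ok', 'full'], index=[10, 20, 30])

by_label = stock.loc[10]            # label 10 -> first element
by_position = stock.iloc[0]         # position 0 -> first element
last_by_position = stock.iloc[-1]   # negative positions count from the end

print(by_label, by_position, last_by_position)

# .loc[0] fails here, because no element carries the label 0
try:
    stock.loc[0]
except KeyError:
    print('label 0 does not exist')
```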

In [5]:
# single index label
print('How many eggs do we need to buy:', groceries['eggs'])
# access multiple index labels
print('Do we need milk and bread:\n', groceries[['milk', 'bread']]) 
# use loc to access multiple index labels
print('How many eggs and apples do we need to buy:\n', groceries.loc[['eggs', 'apples']]) 
print()

# use multiple numerical indices (.iloc makes the positional lookup explicit;
# plain [] with integers on a labeled Series is deprecated in recent pandas)
print('How many eggs and apples do we need to buy:\n',  groceries.iloc[[0, 1]]) 
# use a negative numerical index
print('Do we need bread:\n', groceries.iloc[[-1]]) 
# use a single numerical index
print('How many eggs do we need to buy:', groceries.iloc[0]) 
# access multiple numerical indices
print('Do we need milk and bread:\n', groceries.iloc[[2, 3]]) 

How many eggs do we need to buy: 30
Do we need milk and bread:
 milk     Yes
bread     No
dtype: object
How many eggs and apples do we need to buy:
 eggs      30
apples     6
dtype: object

How many eggs and apples do we need to buy:
 eggs      30
apples     6
dtype: object
Do we need bread:
 bread    No
dtype: object
How many eggs do we need to buy: 30
Do we need milk and bread:
 milk     Yes
bread     No
dtype: object


The `Series.drop(label)` method drops elements out of place: it returns a new Series and leaves the original unchanged.

To delete items from a Pandas Series in place, set the keyword `inplace` to `True` in the call to `.drop()`.
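
As a small sketch of the out-of-place behaviour, the hypothetical Series `basket` below is reduced by reassigning the result of `.drop()`, which is an alternative to `inplace = True`. The `errors='ignore'` argument is shown as an optional extra for labels that may be missing.

```python
import pandas as pd

basket = pd.Series(data=[30, 6, 'Yes'], index=['eggs', 'apples', 'milk'])

# drop() returns a new Series; basket itself is untouched
without_apples = basket.drop('apples')
print(list(without_apples.index))  # ['eggs', 'milk']
print(list(basket.index))          # ['eggs', 'apples', 'milk']

# Reassigning the result has the same effect as inplace=True
basket = basket.drop('apples')

# Dropping a missing label raises KeyError unless errors='ignore' is passed
basket = basket.drop('elephant', errors='ignore')
print(list(basket.index))          # ['eggs', 'milk']
```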

In [6]:
print('Original Grocery List:\n', groceries)

# remove apples from our grocery list. The drop function removes elements out of place
print('\nWe remove apples (out of place):\n', groceries.drop('apples'))

print('\nOriginal grocery List after removing apples out of place:\n', groceries)

groceries.drop(['apples', 'eggs'], inplace = True)
print('\nRemoved apples and eggs in place:\n', groceries )

Original Grocery List:
 eggs       30
apples      6
milk      Yes
bread      No
dtype: object

We remove apples (out of place):
 eggs      30
milk     Yes
bread     No
dtype: object

Original grocery List after removing apples out of place:
 eggs       30
apples      6
milk      Yes
bread      No
dtype: object

Removed apples and eggs in place:
 milk     Yes
bread     No
dtype: object


# Arithmetic operations

In [14]:
# a Pandas Series that stores a grocery list of just fruits
fruits = pd.Series(data = [10, 6, 3], index = ['apples', 'oranges', 'bananas'])

fruits

apples     10
oranges     6
bananas     3
dtype: int64

In [8]:
# apply arithmetic operations to each item in fruits
print('fruits + 2:\n', fruits + 2)
print('\nfruits - 2:\n', fruits - 2)
print('\nfruits * 2:\n', fruits * 2) 
print('\nfruits / 2:\n', fruits / 2) # results in dtype float64

fruits + 2:
 apples     12
oranges     8
bananas     5
dtype: int64

fruits - 2:
 apples     8
oranges    4
bananas    1
dtype: int64

fruits * 2:
 apples     20
oranges    12
bananas     6
dtype: int64

fruits / 2:
 apples     5.0
oranges    3.0
bananas    1.5
dtype: float64
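
The change of dtype comes from true division. As a brief sketch (the names `halves` and `whole_halves` are illustrative), floor division with `//` keeps an integer dtype when whole numbers are wanted:

```python
import pandas as pd

fruits = pd.Series(data=[10, 6, 3], index=['apples', 'oranges', 'bananas'])

halves = fruits / 2          # true division yields float64
whole_halves = fruits // 2   # floor division keeps an integer dtype

print(halves.dtype, whole_halves.dtype)
print(whole_halves['bananas'])  # 3 // 2 is 1
```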


In [13]:
# apply different mathematical functions to all elements of fruits
print('\nEXP(X) = \n', np.exp(fruits))
print('\nSQRT(X) =\n', np.sqrt(fruits))
print('\nPOW(X,2) =\n',np.power(fruits,2)) # raise to the power of 2


EXP(X) = 
 apples     22026.465795
oranges      403.428793
bananas       20.085537
dtype: float64

SQRT(X) =
 apples     3.162278
oranges    2.449490
bananas    1.732051
dtype: float64

POW(X,2) =
 apples     100
oranges     36
bananas      9
dtype: int64


Arithmetic operations can be applied to a Pandas Series of mixed data types, provided that the **arithmetic operation is defined for all data types** in the Series. Multiplication by an integer, for instance, is defined for both numbers and strings (where it repeats the string).
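
A minimal sketch of what happens when the operation is not defined for every element, using a hypothetical Series `mixed` that combines strings and a number:

```python
import pandas as pd

mixed = pd.Series(data=['Yes', 'No', 100], index=['milk', 'bread', 'cherries'])

# Multiplication is defined for both str and int, so this works
doubled = mixed * 2
print(doubled['milk'], doubled['cherries'])

# Division is not defined for str, so the whole operation fails
try:
    mixed / 2
except TypeError as err:
    print('division failed:', type(err).__name__)
```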

In [28]:
series2 = pd.Series([100], index=['cherries'])

# Series.append was removed in pandas 2.0; pd.concat is the replacement
print(pd.concat([groceries, series2]) * 2)
print()
groceries * 2

milk        YesYes
bread         NoNo
cherries       200
dtype: object



milk     YesYes
bread      NoNo
dtype: object