# Data Manipulation 
Data manipulation means cleaning and reshaping data so that it is organised and easier to read. This section covers a few common data manipulation techniques in pandas.

# Single element change
This section shows how to change a single element of a dataframe. Let's first create a dataframe containing some NaN values.

In [None]:
import pandas as pd 
import numpy as np
A = ['a','b','c','d']
B = ['e',np.nan,'g','h']
C = ['i', 'j', np.nan, 'l']
D = ['a', 'e', 'i', 'o']
E = ['s', 'k', np.nan, 'g']
#create dataframe
df = pd.DataFrame(data=[A, B, C, D, E], columns=['one', 'two', 'three', 'four'])

#display dataframe
df

Unnamed: 0,one,two,three,four
0,a,b,c,d
1,e,,g,h
2,i,j,,l
3,a,e,i,o
4,s,k,,g


We use the `loc` indexer with the index operator `[]`. We pass the row label and the column name, then assign the required value.

In [None]:
df.loc[0, 'one'] = 5
df

Unnamed: 0,one,two,three,four
0,5,b,c,d
1,e,,g,h
2,i,j,,l
3,a,e,i,o
4,s,k,,g
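When only one cell is being read or written, pandas also offers the scalar accessors `at` (by label) and `iat` (by integer position). The following is a minimal sketch on a smaller dataframe named `demo`, which is an illustrative name and not part of the example above.

```python
import numpy as np
import pandas as pd

# Small dataframe with the same column names as above (rebuilt so the sketch is self-contained)
rows = [
    ['a', 'b', 'c', 'd'],
    ['e', np.nan, 'g', 'h'],
    ['i', 'j', np.nan, 'l'],
]
demo = pd.DataFrame(rows, columns=['one', 'two', 'three', 'four'])

# .at reads or writes exactly one cell, addressed by row label and column label
demo.at[1, 'two'] = 'f'

# .iat does the same by integer position (row 2, column 2 is 'three')
demo.iat[2, 2] = 'k'

print(demo)
```

Both accessors only accept a single row and a single column, so `loc` remains the tool to use when several cells change at once.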


# Fillna()
`fillna()` replaces missing values with a value you supply. Let's see how to use it.

In [None]:
# inplace=True modifies the original dataframe instead of returning a new one

df.fillna('2', inplace=True)

In [None]:
df

Unnamed: 0,one,two,three,four
0,5,b,c,d
1,e,2,g,h
2,i,j,2,l
3,a,e,i,o
4,s,k,2,g
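`fillna()` can also be used without `inplace=True`, and it accepts a dict to give each column its own replacement value. The sketch below assumes a small dataframe named `demo`; the replacement values `'missing'`, `'x'` and `'y'` are arbitrary placeholders.

```python
import numpy as np
import pandas as pd

demo = pd.DataFrame({
    'two': ['b', np.nan, 'j'],
    'three': ['c', 'g', np.nan],
})

# Without inplace=True, fillna returns a new dataframe and leaves demo untouched
filled = demo.fillna('missing')

# A dict maps column names to their own replacement values
per_column = demo.fillna({'two': 'x', 'three': 'y'})

print(filled)
print(per_column)
```

Returning a new dataframe keeps the original data available, which helps when you want to compare several fill strategies.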


# dropna()
To exclude rows or columns that contain missing data, use `dropna()`. It drops rows or columns with null values, depending on the `axis` parameter. `axis` takes an int or a string: 0 or 'index' drops rows, and 1 or 'columns' drops columns.

In [None]:
import pandas as pd 
import numpy as np
A = ['a','b','c','d']
B = ['e',np.nan,'g','h']
C = ['i', 'j', np.nan, 'l']
D = ['a', 'e', 'i', 'o']
E = ['s', 'k', np.nan, 'g']
#create dataframe
df = pd.DataFrame(data=[A, B, C, D, E], columns=['one', 'two', 'three', 'four'])

#display dataframe
df

Unnamed: 0,one,two,three,four
0,a,b,c,d
1,e,,g,h
2,i,j,,l
3,a,e,i,o
4,s,k,,g


## Drop along the row
To drop rows that contain missing values, set `axis=0`, or omit the `axis` parameter, since 0 is the default.

In [None]:
df.dropna()

Unnamed: 0,one,two,three,four
0,a,b,c,d
3,a,e,i,o


## Drop along the column
To drop columns that contain missing values, set `axis=1`.

In [None]:
df.dropna(axis=1)

Unnamed: 0,one,four
0,a,d
1,e,h
2,i,l
3,a,o
4,s,g
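Besides `axis`, `dropna()` has parameters that control how strict the drop is: `how`, `thresh` and `subset`. The following sketch uses a small dataframe named `demo` (an illustrative name) whose last row is entirely missing.

```python
import numpy as np
import pandas as pd

demo = pd.DataFrame({
    'one': ['a', 'e', np.nan],
    'two': ['b', np.nan, np.nan],
    'three': ['c', 'g', np.nan],
})

# how='all' drops a row only when every value in it is missing
all_missing_dropped = demo.dropna(how='all')

# thresh=3 keeps only rows with at least 3 non-missing values
complete_rows = demo.dropna(thresh=3)

# subset limits the missing-value check to the listed columns
checked_subset = demo.dropna(subset=['one', 'three'])

print(all_missing_dropped)
print(complete_rows)
print(checked_subset)
```

Here `how='all'` and the `subset` call both keep rows 0 and 1, while `thresh=3` keeps only row 0.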


# Apply functions
Suppose you have some data and want to apply a function along the rows or columns of a dataframe. `apply()` passes each column or each row to the function, depending on the `axis` parameter. The function can also be a custom one that you write.

In [None]:
import pandas as pd
dic = {'A':[1,2,3,4], 'B':[5,6,7,8]}
df = pd.DataFrame(dic)
df

Unnamed: 0,A,B
0,1,5
1,2,6
2,3,7
3,4,8


Consider the dataframe above. Now we choose a function to apply to each column or each row of the dataframe. Let us use the built-in function `sum()`.

In [None]:
df.apply(sum, axis=0)

A    10
B    26
dtype: int64

Notice that we passed the function without parentheses. With `axis=0`, the function is applied to each column, so we get the sum over all the rows of A and of B. With `axis=1`, the function is applied to each row instead, summing across the columns.

In [None]:
df.apply(sum, axis=1)

0     6
1     8
2    10
3    12
dtype: int64
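A custom function works the same way as `sum`. The sketch below is illustrative: `value_range` is a hypothetical helper written for this example, and `numbers` is just a copy of the dataframe above.

```python
import pandas as pd

numbers = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [5, 6, 7, 8]})

def value_range(values):
    """Hypothetical helper: largest value minus smallest value."""
    return values.max() - values.min()

# axis=0: the function receives each column as a Series
col_ranges = numbers.apply(value_range, axis=0)

# axis=1: the function receives each row as a Series
row_ranges = numbers.apply(value_range, axis=1)

# A lambda can combine columns within each row
row_products = numbers.apply(lambda row: row['A'] * row['B'], axis=1)

print(col_ranges)
print(row_ranges)
print(row_products)
```

For simple arithmetic like this, vectorised expressions such as `numbers['A'] * numbers['B']` are usually faster. `apply()` is most useful when the logic does not fit a built-in operation.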

# Filter()
Next, let's look at the pandas `filter()` function. It subsets the rows or columns of a dataframe, or the entries of a series, according to the specified index labels. Note that the filter is applied to the labels of the index; it does not filter on the contents.

In [None]:
# importing pandas as pd
import pandas as pd
  
# Creating the Series
sr = pd.Series({'Coca Cola': 45, 'Coke': 40, 'Fanta': 40, 'Dew': 50, 'Thumbs Up':30})

# Print the series
sr

Coca Cola    45
Coke         40
Fanta        40
Dew          50
Thumbs Up    30
dtype: int64

In [None]:
# keep labels that contain a space between two characters
sr.filter(regex='. .')


Coca Cola    45
Thumbs Up    30
dtype: int64

In [None]:
sr.filter(items=['Coke', 'Fanta'])

Coke     40
Fanta    40
dtype: int64
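To make the difference between labels and contents concrete, the sketch below uses the `like` parameter of `filter()`, then a boolean mask for filtering by value. The dataframe `prices` and its column names are made up for illustration.

```python
import pandas as pd

sr = pd.Series({'Coca Cola': 45, 'Coke': 40, 'Fanta': 40, 'Dew': 50, 'Thumbs Up': 30})

# like keeps the labels that contain the given substring
co_labels = sr.filter(like='Co')

# Filtering on the values themselves needs a boolean mask, not filter()
at_least_45 = sr[sr >= 45]

# On a dataframe, filter() works on the column labels by default
prices = pd.DataFrame({'price_usd': [1, 2], 'price_eur': [3, 4], 'qty': [5, 6]})
price_columns = prices.filter(like='price')

print(co_labels)
print(at_least_45)
print(price_columns)
```

`items`, `like` and `regex` are mutually exclusive, so only one of them can be passed per call.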

# References:
* [Pandas fillna](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.fillna.html)
* [Pandas dropna](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.dropna.html)
* [Pandas Filter](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.filter.html)