# Data Manipulation Functions

### reindex

The `reindex` function in pandas conforms a DataFrame to a new index, with optional filling logic. It can change both the row labels and the column labels of a DataFrame. Labels that do not exist in the original DataFrame are filled with `NaN` by default.

In [None]:
import pandas as pd
import numpy as np

In [None]:
# create a dataframe
index = ["Firefox", "Chrome", "Safari", "IE10", "Konqueror"]
df = pd.DataFrame({"http_status": [200,200,404,404,301],
    "response_time": [0.04, 0.02, 0.07, 0.08, 1.0]},
    index=index)
df


Unnamed: 0,http_status,response_time
Firefox,200,0.04
Chrome,200,0.02
Safari,404,0.07
IE10,404,0.08
Konqueror,301,1.0


In [None]:
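# reindex to a new set of row labels; labels missing from df get NaN,
# which is why http_status is shown as float in the result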
new_index = ["Safari", "Iceweasel", "Comodo Dragon", "IE10", "Chrome"]
df.reindex(new_index)

Unnamed: 0,http_status,response_time
Safari,404.0,0.07
Iceweasel,,
Comodo Dragon,,
IE10,404.0,0.08
Chrome,200.0,0.02
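The "optional filling logic" and the column relabelling mentioned above can be sketched as follows. This is a minimal illustration, assuming the same browser DataFrame; the column name `user_agent` is made up for the example and does not exist in the original data.

```python
import pandas as pd

index = ["Firefox", "Chrome", "Safari", "IE10", "Konqueror"]
df = pd.DataFrame({"http_status": [200, 200, 404, 404, 301],
                   "response_time": [0.04, 0.02, 0.07, 0.08, 1.0]},
                  index=index)
new_index = ["Safari", "Iceweasel", "Comodo Dragon", "IE10", "Chrome"]

# fill_value is used instead of NaN for labels that were not in df
filled = df.reindex(new_index, fill_value=0)

# columns= conforms the column labels instead of the row labels;
# "user_agent" is a hypothetical column, so it comes back as all NaN
by_columns = df.reindex(columns=["http_status", "user_agent"])
```

Here `filled` has zeros in the `Iceweasel` and `Comodo Dragon` rows, while `by_columns` keeps the original five rows.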


### reset_index

`reset_index` resets the index to the default integer index. By default the old index is kept as a new column; pass `drop=True` to discard it instead.

In [None]:
df = pd.DataFrame([('bird', 389.0), ('bird', 24.0), ('mammal', 80.5), ('mammal', np.nan)],
                   index=['falcon', 'parrot', 'lion', 'monkey'],
                   columns=['class', 'max_speed'])
df

Unnamed: 0,class,max_speed
falcon,bird,389.0
parrot,bird,24.0
lion,mammal,80.5
monkey,mammal,


In [None]:
# converts index into a column
df.reset_index()

Unnamed: 0,index,class,max_speed
0,falcon,bird,389.0
1,parrot,bird,24.0
2,lion,mammal,80.5
3,monkey,mammal,


In [None]:
# removes the existing index and creates a new one
df.reset_index(drop=True)

Unnamed: 0,class,max_speed
0,bird,389.0
1,bird,24.0
2,mammal,80.5
3,mammal,


In [None]:
# reset_index returned new DataFrames above, so the original df is unchanged
df

Unnamed: 0,class,max_speed
falcon,bird,389.0
parrot,bird,24.0
lion,mammal,80.5
monkey,mammal,
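Since `reset_index` returns a new DataFrame, the result has to be assigned to be kept. The sketch below also shows one way to avoid the generic column name `index`, by naming the index first; the label `animal` is an assumption chosen for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([("bird", 389.0), ("bird", 24.0), ("mammal", 80.5), ("mammal", np.nan)],
                  index=["falcon", "parrot", "lion", "monkey"],
                  columns=["class", "max_speed"])

# Name the index first ("animal" is a made-up label) so that the
# column created by reset_index is called "animal" rather than "index"
named = df.rename_axis("animal").reset_index()

# Assign the result to keep it; df itself still has the animal names as index
flat = df.reset_index(drop=True)
```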


### sort_index

`sort_index` sorts a Series or DataFrame by its index labels rather than by its values.

In [None]:
df = pd.DataFrame({"Month": [1, 4, 7, 10],
                   "Year": [2012, 2014, 2013, 2014],
                   "Sales": [55, 40, 65, 55]}
                )
df

Unnamed: 0,Month,Year,Sales
0,1,2012,55
1,4,2014,40
2,7,2013,65
3,10,2014,55


In [None]:
# for comparison: sort_values orders by the data, and the index labels travel with the values
df["Sales"].sort_values()

1    40
0    55
3    55
2    65
Name: Sales, dtype: int64

In [None]:
# sort_index orders by the index labels (already 0..3 here, so the order is unchanged)
df["Sales"].sort_index()

0    55
1    40
2    65
3    55
Name: Sales, dtype: int64
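Because the index above is already in order, `sort_index` has no visible effect there. A small sketch of two cases where it does change the order, assuming the same sales data:

```python
import pandas as pd

df = pd.DataFrame({"Month": [1, 4, 7, 10],
                   "Year": [2012, 2014, 2013, 2014],
                   "Sales": [55, 40, 65, 55]})

# After sorting by value, the index labels are no longer in order
by_value = df["Sales"].sort_values()

# sort_index puts the labels back in ascending order: 0, 1, 2, 3
restored = by_value.sort_index()

# ascending=False reverses the row order of the whole DataFrame: 3, 2, 1, 0
descending = df.sort_index(ascending=False)
```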

### set_index

`set_index` makes one or more existing columns the index of the DataFrame.

In [None]:
df = pd.DataFrame({"Month": [1, 4, 7, 10],
                   "Year": [2012, 2014, 2013, 2014],
                   "Sales": [55, 40, 65, 55]}
                )
df

Unnamed: 0,Month,Year,Sales
0,1,2012,55
1,4,2014,40
2,7,2013,65
3,10,2014,55


In [None]:
# Let's set the Month column as the index
df.set_index('Month')

Month,Year,Sales
1,2012,55
4,2014,40
7,2013,65
10,2014,55
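Once a column is the index, its values can be used for label-based lookups with `.loc`. The sketch below illustrates this on the same data and also passes a list of columns to build a two-level index; it is an illustration, not the only way to do it.

```python
import pandas as pd

df = pd.DataFrame({"Month": [1, 4, 7, 10],
                   "Year": [2012, 2014, 2013, 2014],
                   "Sales": [55, 40, 65, 55]})

# set_index returns a new DataFrame; df keeps its Month column
by_month = df.set_index("Month")
july_sales = by_month.loc[7, "Sales"]

# A list of columns produces a MultiIndex with one level per column
by_year_month = df.set_index(["Year", "Month"])
april_2014_sales = by_year_month.loc[(2014, 4), "Sales"]
```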


## Replace function

In [None]:
df = pd.DataFrame({"A": ["bat","root","mat"],
                   "B": ["x", "y", "z"]})
df

Unnamed: 0,A,B
0,bat,x
1,root,y
2,mat,z


In [None]:
# replace whole-cell matches of "root" with "cat"
df["A"] = df["A"].replace("root", "cat")
df

Unnamed: 0,A,B
0,bat,x
1,cat,y
2,mat,z
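`replace` also accepts a dictionary for several substitutions at once, and `regex=True` for pattern matches inside a value. A brief sketch, assuming the original `bat`/`root`/`mat` data; the replacement strings are arbitrary:

```python
import pandas as pd

df = pd.DataFrame({"A": ["bat", "root", "mat"],
                   "B": ["x", "y", "z"]})

# A dict maps old value -> new value; only whole-cell matches are replaced
mapped = df["A"].replace({"bat": "cat", "root": "dog"})

# Without regex=True, "at" matches no complete cell, so nothing changes
no_match = df["A"].replace("at", "ug")

# With regex=True the pattern is applied inside each string
pattern = df["A"].replace(r"at$", "ug", regex=True)
```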


## stack and unstack

In [None]:
df = pd.DataFrame([[0, 1], [2, 3]],
                   index=['cat', 'dog'],
                   columns=['weight', 'height'])
df

Unnamed: 0,weight,height
cat,0,1
dog,2,3


In [None]:
# stack moves the column labels into an inner level of the row index, giving a Series
df.stack()

cat  weight    0
     height    1
dog  weight    2
     height    3
dtype: int64

In [None]:
# on a DataFrame with a single-level index, unstack returns a Series keyed by (column, row)
df.unstack()

weight  cat    0
        dog    2
height  cat    1
        dog    3
dtype: int64
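`unstack` is more commonly applied to a stacked result, where it moves the inner index level back into columns. A minimal round-trip sketch on the same cat/dog data:

```python
import pandas as pd

df = pd.DataFrame([[0, 1], [2, 3]],
                  index=["cat", "dog"],
                  columns=["weight", "height"])

# Series with a two-level index: (animal, measurement)
stacked = df.stack()
cat_height = stacked[("cat", "height")]

# unstack pivots the inner level (measurement) back into columns
round_trip = stacked.unstack()
dog_height = round_trip.loc["dog", "height"]
```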

## Melt function

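The `melt` function in pandas reshapes a DataFrame from wide format to long format. This is particularly useful for data visualization and analysis, where you might want all the values of a variable in a single column.

The columns listed in `id_vars` are kept as identifiers, and each column in `value_vars` is unpivoted into rows: its name goes into the `variable` column and its data into the `value` column. This is useful when you need the values for one particular variable, because you can then filter on `variable`.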

In [None]:
df = pd.DataFrame({"A": {0: "a", 1: "b", 2: "c"},
                   "B": {0: 1, 1: 3, 2: 5},
                   "C": {0: 2, 1: 4, 2: 6}})
df

Unnamed: 0,A,B,C
0,a,1,2
1,b,3,4
2,c,5,6


In [None]:
df.melt(id_vars=['A'], value_vars=['B'])

Unnamed: 0,A,variable,value
0,a,B,1
1,b,B,3
2,c,B,5


In [None]:
df.melt(id_vars=['A'], value_vars=['B', 'C'])

Unnamed: 0,A,variable,value
0,a,B,1
1,b,B,3
2,c,B,5
3,a,C,2
4,b,C,4
5,c,C,6
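The default output column names `variable` and `value` can be changed with `var_name` and `value_name`. In the sketch below the names `measure` and `reading` are arbitrary choices for illustration.

```python
import pandas as pd

df = pd.DataFrame({"A": {0: "a", 1: "b", 2: "c"},
                   "B": {0: 1, 1: 3, 2: 5},
                   "C": {0: 2, 1: 4, 2: 6}})

# Rename the two generated columns ("measure" and "reading" are made-up names)
long_df = df.melt(id_vars=["A"], value_vars=["B", "C"],
                  var_name="measure", value_name="reading")

# Keep only the rows that came from column C
only_c = long_df[long_df["measure"] == "C"]

# If value_vars is omitted, every column not in id_vars is melted
all_melted = df.melt(id_vars=["A"])
```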


## explode function

The `explode` function in pandas is used to transform each element of a list-like to a row, replicating the index values. 

In [None]:
# Example of using the explode function in pandas
df = pd.DataFrame({'A': [[1, 2, 3], [4, 5, 6], [7, 8, 9]], 'B': 1})
df

Unnamed: 0,A,B
0,"[1, 2, 3]",1
1,"[4, 5, 6]",1
2,"[7, 8, 9]",1


In [None]:
df_exploded = df.explode('A')
df_exploded

Unnamed: 0,A,B
0,1,1
0,2,1
0,3,1
1,4,1
1,5,1
1,6,1
2,7,1
2,8,1
2,9,1
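Two follow-up steps are often useful after `explode`: replacing the repeated index with a fresh one, and converting the exploded column, which comes back with `object` dtype, to a numeric type. A sketch assuming pandas 1.1 or later, where `ignore_index` is available:

```python
import pandas as pd

df = pd.DataFrame({"A": [[1, 2, 3], [4, 5, 6], [7, 8, 9]], "B": 1})

# The exploded column keeps object dtype even though it now holds plain integers
raw_dtype = df.explode("A")["A"].dtype

# ignore_index=True relabels the rows 0..8 instead of repeating 0, 1, 2
exploded = df.explode("A", ignore_index=True)

# Convert to a numeric dtype for arithmetic or aggregation
exploded["A"] = exploded["A"].astype("int64")
```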


## squeeze function

The `squeeze` function in pandas is used to convert a DataFrame with a single column or a single row into a Series. It can also convert a DataFrame with a single element into a scalar. This function is useful when you want to simplify the structure of your data.

In [None]:
df_single_column = pd.DataFrame({'A': [1, 2, 3]})
df_single_column

Unnamed: 0,A
0,1
1,2
2,3


In [None]:
df_single_column.squeeze()

0    1
1    2
2    3
Name: A, dtype: int64
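The single-row and single-element cases described above behave as follows. This is a small sketch with made-up frames; a DataFrame with more than one row and more than one column is returned unchanged.

```python
import pandas as pd

# One row, two columns -> Series indexed by the column labels
single_row = pd.DataFrame({"A": [1], "B": [2]})
row_series = single_row.squeeze()

# One row, one column -> scalar
single_cell = pd.DataFrame({"A": [1]})
scalar = single_cell.squeeze()

# Nothing to squeeze -> still a DataFrame
multi = pd.DataFrame({"A": [1, 2], "B": [3, 4]})
unchanged = multi.squeeze()
```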
