<h1><p style="text-align: center;">Pandas Lesson, Session - 4</p></h1>
    

# Data Frames

 - ### ``DataFrames`` are the workhorse of pandas and are directly inspired by the R programming language. We can think of a DataFrame as a bunch of Series objects put together to share the same index. Let's use pandas to explore this topic!
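The idea that a DataFrame is a set of Series sharing one index can be sketched as follows. The names `height`, `weight`, and `people` and their values are made up for illustration and are not part of the lesson's data.

```python
import pandas as pd

# Two Series whose indexes overlap but are not identical (illustrative data).
height = pd.Series([170, 165, 180], index=["ann", "bob", "cem"])
weight = pd.Series([60, 72], index=["ann", "cem"])

# Each Series becomes one column; rows are aligned on the shared index.
# "bob" has no weight, so pandas fills that cell with NaN.
people = pd.DataFrame({"height": height, "weight": weight})
print(people)

# Pulling a column back out gives a Series that carries the same index.
print(type(people["height"]))
print(people.index.tolist())
```

Building the frame from a dict of Series is only one of several construction routes; it is used here because it makes the index alignment visible.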

In [146]:
import numpy as np  
import pandas as pd

 ## Creating a DataFrame from ``list``s of data and columns

pd.DataFrame(data=None, index=None, columns=None, dtype=None, copy=False)

Two-dimensional, size-mutable, potentially heterogeneous tabular data.
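As a minimal sketch of what "size-mutable" and "potentially heterogeneous" mean in practice, the example below passes `data`, `index`, and `columns` explicitly. The frame `inventory` and its contents are hypothetical.

```python
import pandas as pd

# Each inner list is one row; the columns hold different types.
rows = [["apple", 3, 0.5], ["pear", 5, 0.25]]
inventory = pd.DataFrame(
    data=rows,
    index=["r1", "r2"],
    columns=["item", "count", "price"],
)

# Heterogeneous: every column keeps its own dtype.
print(inventory.dtypes)

# Size-mutable: a column can be added after construction.
inventory["in_stock"] = [True, False]
print(inventory.shape)
```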

In [147]:
data = [1, 3, 5, 7, 9, 18]
data

[1, 3, 5, 7, 9, 18]

In [148]:
pd.DataFrame(data=data)

Unnamed: 0,0
0,1
1,3
2,5
3,7
4,9
5,18


In [149]:
pd.Series(data=data)

0     1
1     3
2     5
3     7
4     9
5    18
dtype: int64

In [150]:
pd.DataFrame(data=data, columns=["col_1"])

Unnamed: 0,col_1
0,1
1,3
2,5
3,7
4,9
5,18


 ## Creating a DataFrame using a ``NumPy array``

In [151]:
me = np.arange(1,24,2).reshape(3,4)
me

array([[ 1,  3,  5,  7],
       [ 9, 11, 13, 15],
       [17, 19, 21, 23]])

In [152]:
df = pd.DataFrame(data=me, columns=["var1", "var2", "var3", "var4"])
df

Unnamed: 0,var1,var2,var3,var4
0,1,3,5,7
1,9,11,13,15
2,17,19,21,23


In [153]:
df.head(1)

Unnamed: 0,var1,var2,var3,var4
0,1,3,5,7


In [154]:
df.tail(2)

Unnamed: 0,var1,var2,var3,var4
1,9,11,13,15
2,17,19,21,23


In [155]:
df.sample(3)

Unnamed: 0,var1,var2,var3,var4
1,9,11,13,15
0,1,3,5,7
2,17,19,21,23


In [156]:
df.columns

Index(['var1', 'var2', 'var3', 'var4'], dtype='object')

In [157]:
for i in df.columns:
  print(i)

var1
var2
var3
var4


In [158]:
for i in df.columns:
  print(df[i].sum())

27
33
39
45


In [159]:
df.columns=["new1", "new2", "old1", "old2"]

In [160]:
df

Unnamed: 0,new1,new2,old1,old2
0,1,3,5,7
1,9,11,13,15
2,17,19,21,23


In [161]:
df.index=["A", "B", "C"]
df

Unnamed: 0,new1,new2,old1,old2
A,1,3,5,7
B,9,11,13,15
C,17,19,21,23


In [162]:
df.rename(columns={"new1":"a", "new2":"b"})

Unnamed: 0,a,b,old1,old2
A,1,3,5,7
B,9,11,13,15
C,17,19,21,23


In [163]:
df.rename(index={"A":"a", "B":"b"})

Unnamed: 0,new1,new2,old1,old2
a,1,3,5,7
b,9,11,13,15
C,17,19,21,23


In [164]:
df.shape

(3, 4)

In [165]:
df.shape[1]

4

In [166]:
df.size

12

In [167]:
len(df)

3

In [168]:
df.values

array([[ 1,  3,  5,  7],
       [ 9, 11, 13, 15],
       [17, 19, 21, 23]])

In [169]:
type(df)

pandas.core.frame.DataFrame

In [170]:
type(df.values)

numpy.ndarray

 ## Creating a DataFrame using a ``dict``

In [171]:
s1 = np.random.randint(2,10, size = 4)
s2 = np.random.randint(3,10, size = 4)
s3 = np.random.randint(4,15, size = 4)

In [172]:
s1

array([2, 9, 8, 8])

In [173]:
s2

array([4, 7, 5, 7])

In [174]:
s3

array([ 4,  8, 14,  5])

In [175]:
dictim = {"var1":s1, "var2":s2, "var3":s3}
dictim

{'var1': array([2, 9, 8, 8]),
 'var2': array([4, 7, 5, 7]),
 'var3': array([ 4,  8, 14,  5])}

In [176]:
df1= pd.DataFrame(dictim)

In [177]:
df1

Unnamed: 0,var1,var2,var3
0,2,4,4
1,9,7,8
2,8,5,14
3,8,7,5


In [178]:
df1.index

RangeIndex(start=0, stop=4, step=1)

In [179]:
[i for i in df1.index]

[0, 1, 2, 3]

In [180]:
"var2" in df1

True

### Now, let's examine the ***indexing, selection*** and ***slicing*** methods and several ***attributes*** again using a different DataFrame

In [181]:
from numpy.random import randn

In [182]:
randn(5,4)

array([[ 8.07705914e-01,  7.29596753e-02,  6.38787013e-01,
         3.29646299e-01],
       [-4.97104023e-01, -7.54069701e-01, -9.43406403e-01,
         4.84751647e-01],
       [-1.16773316e-01,  1.90175480e+00,  2.38126959e-01,
         1.99665229e+00],
       [-9.93263500e-01,  1.96799505e-01, -1.13664459e+00,
         3.66479606e-04],
       [ 1.02598415e+00, -1.56597904e-01, -3.15791439e-02,
         6.49825833e-01]])

In [264]:
np.random.seed(101)
df3 = pd.DataFrame(randn(5,4), index = 'A B C D E'.split(), columns = 'W X Y Z'.split())
df3

Unnamed: 0,W,X,Y,Z
A,2.70685,0.628133,0.907969,0.503826
B,0.651118,-0.319318,-0.848077,0.605965
C,-2.018168,0.740122,0.528813,-0.589001
D,0.188695,-0.758872,-0.933237,0.955057
E,0.190794,1.978757,2.605967,0.683509


## Selection and Indexing

Let's learn the various methods to grab data from a DataFrame

In [184]:
df3

Unnamed: 0,W,X,Y,Z
A,2.70685,0.628133,0.907969,0.503826
B,0.651118,-0.319318,-0.848077,0.605965
C,-2.018168,0.740122,0.528813,-0.589001
D,0.188695,-0.758872,-0.933237,0.955057
E,0.190794,1.978757,2.605967,0.683509


#### DataFrame Columns are just Series

In [185]:
df3

Unnamed: 0,W,X,Y,Z
A,2.70685,0.628133,0.907969,0.503826
B,0.651118,-0.319318,-0.848077,0.605965
C,-2.018168,0.740122,0.528813,-0.589001
D,0.188695,-0.758872,-0.933237,0.955057
E,0.190794,1.978757,2.605967,0.683509


In [186]:
df3["W"] # returns a Series; the usual way to select one column

A    2.706850
B    0.651118
C   -2.018168
D    0.188695
E    0.190794
Name: W, dtype: float64

In [187]:
df3[["W"]] # a list of column names returns a DataFrame

Unnamed: 0,W
A,2.70685
B,0.651118
C,-2.018168
D,0.188695
E,0.190794


In [188]:
df3.W    # attribute access; equivalent to df3["W"]

A    2.706850
B    0.651118
C   -2.018168
D    0.188695
E    0.190794
Name: W, dtype: float64

In [189]:
df3

Unnamed: 0,W,X,Y,Z
A,2.70685,0.628133,0.907969,0.503826
B,0.651118,-0.319318,-0.848077,0.605965
C,-2.018168,0.740122,0.528813,-0.589001
D,0.188695,-0.758872,-0.933237,0.955057
E,0.190794,1.978757,2.605967,0.683509


In [190]:
# df3["W""Z"]  # raises an error: the adjacent strings are joined into "WZ", which is not a column

In [191]:
df3[["W", "Z"]] # returns the two columns as a DataFrame

Unnamed: 0,W,Z
A,2.70685,0.503826
B,0.651118,0.605965
C,-2.018168,-0.589001
D,0.188695,0.955057
E,0.190794,0.683509


In [192]:
df3["W":"Z"] # a slice inside [] selects rows, not columns, so the result is empty

Unnamed: 0,W,X,Y,Z


In [193]:
df3["A":"C"]

Unnamed: 0,W,X,Y,Z
A,2.70685,0.628133,0.907969,0.503826
B,0.651118,-0.319318,-0.848077,0.605965
C,-2.018168,0.740122,0.528813,-0.589001


In [194]:
df3["A":"C"]["W"] # rows A through C, then column W (a Series)

A    2.706850
B    0.651118
C   -2.018168
Name: W, dtype: float64

In [195]:
df3["A":"C"][["W", "Z"]] # rows A through C, then columns W and Z (a DataFrame)

Unnamed: 0,W,Z
A,2.70685,0.503826
B,0.651118,0.605965
C,-2.018168,-0.589001


**Creating a new column:**

In [196]:
df3

Unnamed: 0,W,X,Y,Z
A,2.70685,0.628133,0.907969,0.503826
B,0.651118,-0.319318,-0.848077,0.605965
C,-2.018168,0.740122,0.528813,-0.589001
D,0.188695,-0.758872,-0.933237,0.955057
E,0.190794,1.978757,2.605967,0.683509


In [197]:
df3["new"]=df3["X"]*df3["Y"]
df3

Unnamed: 0,W,X,Y,Z,new
A,2.70685,0.628133,0.907969,0.503826,0.570325
B,0.651118,-0.319318,-0.848077,0.605965,0.270806
C,-2.018168,0.740122,0.528813,-0.589001,0.391387
D,0.188695,-0.758872,-0.933237,0.955057,0.708208
E,0.190794,1.978757,2.605967,0.683509,5.156577


In [198]:
df3["new2"]=[1, 2, 3, 4, 5]
df3

Unnamed: 0,W,X,Y,Z,new,new2
A,2.70685,0.628133,0.907969,0.503826,0.570325,1
B,0.651118,-0.319318,-0.848077,0.605965,0.270806,2
C,-2.018168,0.740122,0.528813,-0.589001,0.391387,3
D,0.188695,-0.758872,-0.933237,0.955057,0.708208,4
E,0.190794,1.978757,2.605967,0.683509,5.156577,5


In [199]:
df3 = df3[["new", "W", "Z", "new2", "Y", "X"]]
df3

Unnamed: 0,new,W,Z,new2,Y,X
A,0.570325,2.70685,0.503826,1,0.907969,0.628133
B,0.270806,0.651118,0.605965,2,-0.848077,-0.319318
C,0.391387,-2.018168,-0.589001,3,0.528813,0.740122
D,0.708208,0.188695,0.955057,4,-0.933237,-0.758872
E,5.156577,0.190794,0.683509,5,2.605967,1.978757


## [Removing Columns & Rows](http://localhost:8888/notebooks/pythonic/DAwPythonSessions/w3resource-pandas-dataframe-drop.ipynb)

 ### Removing Columns

In [200]:
df3.drop("new", axis=1) # returns a new DataFrame; df3 itself is not altered

Unnamed: 0,W,Z,new2,Y,X
A,2.70685,0.503826,1,0.907969,0.628133
B,0.651118,0.605965,2,-0.848077,-0.319318
C,-2.018168,-0.589001,3,0.528813,0.740122
D,0.188695,0.955057,4,-0.933237,-0.758872
E,0.190794,0.683509,5,2.605967,1.978757


In [201]:
df3

Unnamed: 0,new,W,Z,new2,Y,X
A,0.570325,2.70685,0.503826,1,0.907969,0.628133
B,0.270806,0.651118,0.605965,2,-0.848077,-0.319318
C,0.391387,-2.018168,-0.589001,3,0.528813,0.740122
D,0.708208,0.188695,0.955057,4,-0.933237,-0.758872
E,5.156577,0.190794,0.683509,5,2.605967,1.978757


In [202]:
df3.drop(["new", "X"], axis=1)

Unnamed: 0,W,Z,new2,Y
A,2.70685,0.503826,1,0.907969
B,0.651118,0.605965,2,-0.848077
C,-2.018168,-0.589001,3,0.528813
D,0.188695,0.955057,4,-0.933237
E,0.190794,0.683509,5,2.605967


In [203]:
df3.drop(columns= ["new", "X"])

Unnamed: 0,W,Z,new2,Y
A,2.70685,0.503826,1,0.907969
B,0.651118,0.605965,2,-0.848077
C,-2.018168,-0.589001,3,0.528813
D,0.188695,0.955057,4,-0.933237
E,0.190794,0.683509,5,2.605967


In [204]:
df3.drop(["new", "X"], axis=1, inplace=True)

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  errors=errors,
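The message above is the text of pandas' `SettingWithCopyWarning`. A likely cause here is that `df3` was earlier reassigned to a column selection (`df3[[...]]`), which some pandas versions track as a possible copy of another frame. One common way to sidestep it, sketched below with a made-up frame `base`, is to take an explicit `.copy()` and to reassign the result of `drop` rather than using `inplace=True`.

```python
import pandas as pd

# Hypothetical stand-in for the lesson's DataFrame.
base = pd.DataFrame({"new": [1, 2], "W": [3, 4], "X": [5, 6]}, index=["A", "B"])

# An explicit copy makes the selection an independent DataFrame.
subset = base[["new", "W", "X"]].copy()

# Reassigning the result of drop() avoids modifying anything in place.
subset = subset.drop(columns=["new", "X"])

print(subset.columns.tolist())
print(base.columns.tolist())  # the source frame is untouched
```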


In [205]:
df3

Unnamed: 0,W,Z,new2,Y
A,2.70685,0.503826,1,0.907969
B,0.651118,0.605965,2,-0.848077
C,-2.018168,-0.589001,3,0.528813
D,0.188695,0.955057,4,-0.933237
E,0.190794,0.683509,5,2.605967


 ### Removing rows

In [206]:
df3.drop("C", axis=0)

Unnamed: 0,W,Z,new2,Y
A,2.70685,0.503826,1,0.907969
B,0.651118,0.605965,2,-0.848077
D,0.188695,0.955057,4,-0.933237
E,0.190794,0.683509,5,2.605967


In [207]:
df3.drop(index=["B"])

Unnamed: 0,W,Z,new2,Y
A,2.70685,0.503826,1,0.907969
C,-2.018168,-0.589001,3,0.528813
D,0.188695,0.955057,4,-0.933237
E,0.190794,0.683509,5,2.605967


In [208]:
df4 = df3.drop(index=["B"])
df4

Unnamed: 0,W,Z,new2,Y
A,2.70685,0.503826,1,0.907969
C,-2.018168,-0.589001,3,0.528813
D,0.188695,0.955057,4,-0.933237
E,0.190794,0.683509,5,2.605967


In [209]:
df3

Unnamed: 0,W,Z,new2,Y
A,2.70685,0.503826,1,0.907969
B,0.651118,0.605965,2,-0.848077
C,-2.018168,-0.589001,3,0.528813
D,0.188695,0.955057,4,-0.933237
E,0.190794,0.683509,5,2.605967


## Selecting Rows

### First, let's take a quick look at [`.loc[]`](http://localhost:8888/notebooks/pythonic/DAwPythonSessions/w3resource-pandas-dataframe-loc.ipynb) | [`.iloc[]`](http://localhost:8888/notebooks/pythonic/DAwPythonSessions/w3resource-pandas-dataframe-iloc.ipynb)

#### `.loc[]` → allows us to select data using **labels** (names) of rows (index) & columns

#### `.iloc[]` → allows us to select data using the **integer positions** of rows (index) & columns. It works like classic positional indexing
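The difference between the two accessors shows up most clearly in slices: a `.loc[]` label slice includes its end label, while an `.iloc[]` positional slice excludes its stop position. A minimal sketch, using an invented frame `scores`:

```python
import pandas as pd

scores = pd.DataFrame(
    {"math": [90, 75, 60, 85], "art": [70, 88, 95, 65]},
    index=["a", "b", "c", "d"],
)

# Label slice: both endpoints "b" and "c" are included.
by_label = scores.loc["b":"c"]

# Positional slice: positions 1 and 2; the stop position 3 is excluded.
by_position = scores.iloc[1:3]

print(by_label.equals(by_position))

# The same single cell, addressed by label and by position.
print(scores.loc["c", "art"], scores.iloc[2, 1])
```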

In [210]:
m = np.random.randint(1,40, size=(8,4))
df4 = pd.DataFrame(m, columns = ["var1","var2","var3",'var4'])
df4

Unnamed: 0,var1,var2,var3,var4
0,8,11,39,10
1,19,8,16,1
2,13,18,12,16
3,34,30,25,37
4,20,36,31,11
5,21,28,9,23
6,27,24,38,23
7,10,3,19,29


In [211]:
df4.loc[4] 

var1    20
var2    36
var3    31
var4    11
Name: 4, dtype: int64

In [214]:
type(df4.loc[4])

pandas.core.series.Series

In [213]:
df4.loc[[4]]

Unnamed: 0,var1,var2,var3,var4
4,20,36,31,11


In [215]:
type(df4.loc[[4]])

pandas.core.frame.DataFrame

In [216]:
df4.loc[2:5]

Unnamed: 0,var1,var2,var3,var4
2,13,18,12,16
3,34,30,25,37
4,20,36,31,11
5,21,28,9,23


In [217]:
df4.index='a b c d e f g h'.split()
df4

Unnamed: 0,var1,var2,var3,var4
a,8,11,39,10
b,19,8,16,1
c,13,18,12,16
d,34,30,25,37
e,20,36,31,11
f,21,28,9,23
g,27,24,38,23
h,10,3,19,29


In [218]:
df4.iloc[1:4] # positional: rows at positions 1 to 3 (the stop position is excluded)

Unnamed: 0,var1,var2,var3,var4
b,19,8,16,1
c,13,18,12,16
d,34,30,25,37


In [220]:
# df4.loc[1:4] fails here: the index labels are now strings, so there are no labels 1 and 4

In [221]:
df4.loc["a":"d"] # label-based: both endpoints are included

Unnamed: 0,var1,var2,var3,var4
a,8,11,39,10
b,19,8,16,1
c,13,18,12,16
d,34,30,25,37


In [222]:
df4

Unnamed: 0,var1,var2,var3,var4
a,8,11,39,10
b,19,8,16,1
c,13,18,12,16
d,34,30,25,37
e,20,36,31,11
f,21,28,9,23
g,27,24,38,23
h,10,3,19,29


In [223]:
df4.iloc[3,1]

30

In [224]:
df4.loc["d", "var2"]

30

In [228]:
df4.loc["d":"g", "var3"]

d    25
e    31
f     9
g    38
Name: var3, dtype: int64

In [226]:
df4

Unnamed: 0,var1,var2,var3,var4
a,8,11,39,10
b,19,8,16,1
c,13,18,12,16
d,34,30,25,37
e,20,36,31,11
f,21,28,9,23
g,27,24,38,23
h,10,3,19,29


In [229]:
df4.loc["d":"g"]["var3"] # returns series

d    25
e    31
f     9
g    38
Name: var3, dtype: int64

In [232]:
df4.loc["d":"g"][["var3"]] # returns dataframe

Unnamed: 0,var3
d,25
e,31
f,9
g,38


In [235]:
df4.loc["d":"g", ["var3"]] # returns dataframe

Unnamed: 0,var3
d,25
e,31
f,9
g,38


In [236]:
df4.iloc[2:5,2]

c    12
d    25
e    31
Name: var3, dtype: int64

#### Let's continue to examine `.loc[]` and `.iloc[]` using ``df3`` again

In [237]:
df3

Unnamed: 0,W,Z,new2,Y
A,2.70685,0.503826,1,0.907969
B,0.651118,0.605965,2,-0.848077
C,-2.018168,-0.589001,3,0.528813
D,0.188695,0.955057,4,-0.933237
E,0.190794,0.683509,5,2.605967


In [239]:
df3.loc["C"]

W      -2.018168
Z      -0.589001
new2    3.000000
Y       0.528813
Name: C, dtype: float64

In [240]:
df3.iloc[2]

W      -2.018168
Z      -0.589001
new2    3.000000
Y       0.528813
Name: C, dtype: float64

In [241]:
df3.loc[["C"]]

Unnamed: 0,W,Z,new2,Y
C,-2.018168,-0.589001,3,0.528813


In [242]:
df3.iloc[[2]]

Unnamed: 0,W,Z,new2,Y
C,-2.018168,-0.589001,3,0.528813


In [243]:
df3.loc["C", "Z"]

-0.5890005332865824

### Selecting subset of rows and columns

In [249]:
df3.loc[["C","A"], ["Z", "Y"]]

Unnamed: 0,Z,Y
C,-0.589001,0.528813
A,0.503826,0.907969


In [248]:
df3.loc[["A","C"], ["Y", "Z"]]

Unnamed: 0,Y,Z
A,0.907969,0.503826
C,0.528813,-0.589001


In [244]:
df3.loc[["C"], ["Z"]]

Unnamed: 0,Z
C,-0.589001


In [247]:
df3

Unnamed: 0,W,Z,new2,Y
A,2.70685,0.503826,1,0.907969
B,0.651118,0.605965,2,-0.848077
C,-2.018168,-0.589001,3,0.528813
D,0.188695,0.955057,4,-0.933237
E,0.190794,0.683509,5,2.605967


 - ### `.loc[[row labels|names], [column labels|names]]`

 - ### `.iloc[[row index numbers], [column index numbers]]`
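The two patterns above can be put side by side. The sketch below uses a small made-up frame `grid` and selects the same cells once by label and once by position; `Index.get_loc` is used to translate a label into its integer position.

```python
import pandas as pd

grid = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    index=["A", "B", "C"],
    columns=["W", "X", "Y"],
)

# Rows and columns come back in the order they are listed.
picked_by_label = grid.loc[["C", "A"], ["Y", "W"]]
picked_by_position = grid.iloc[[2, 0], [2, 0]]

print(picked_by_label)
print(picked_by_label.equals(picked_by_position))

# get_loc maps a label to its integer position on that axis.
print(grid.index.get_loc("C"), grid.columns.get_loc("Y"))
```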

### Conditional Selection

An important feature of pandas is conditional selection using bracket notation, very similar to numpy:

In [250]:
df3

Unnamed: 0,W,Z,new2,Y
A,2.70685,0.503826,1,0.907969
B,0.651118,0.605965,2,-0.848077
C,-2.018168,-0.589001,3,0.528813
D,0.188695,0.955057,4,-0.933237
E,0.190794,0.683509,5,2.605967


In [251]:
df3>2

Unnamed: 0,W,Z,new2,Y
A,True,False,False,False
B,False,False,False,False
C,False,False,True,False
D,False,False,True,False
E,False,False,True,True


In [252]:
df3[df3>2]

Unnamed: 0,W,Z,new2,Y
A,2.70685,,,
B,,,,
C,,,3.0,
D,,,4.0,
E,,,5.0,2.605967


In [254]:
df3[df3["Z"]>0.5] # row C is filtered out because its Z value is not greater than 0.5

Unnamed: 0,W,Z,new2,Y
A,2.70685,0.503826,1,0.907969
B,0.651118,0.605965,2,-0.848077
D,0.188695,0.955057,4,-0.933237
E,0.190794,0.683509,5,2.605967


In [258]:
df3[df3["Z"]>0.5]["Y"] # row C is filtered out; returns a Series

A    0.907969
B   -0.848077
D   -0.933237
E    2.605967
Name: Y, dtype: float64

In [259]:
df3[df3["Z"]>0.5][["Y"]] # row C is filtered out; returns a DataFrame

Unnamed: 0,Y
A,0.907969
B,-0.848077
D,-0.933237
E,2.605967


In [265]:
df3

Unnamed: 0,W,X,Y,Z
A,2.70685,0.628133,0.907969,0.503826
B,0.651118,-0.319318,-0.848077,0.605965
C,-2.018168,0.740122,0.528813,-0.589001
D,0.188695,-0.758872,-0.933237,0.955057
E,0.190794,1.978757,2.605967,0.683509


In [266]:
df3[(df3['W']>0) & (df3['Y']<1)]

Unnamed: 0,W,X,Y,Z
A,2.70685,0.628133,0.907969,0.503826
B,0.651118,-0.319318,-0.848077,0.605965
D,0.188695,-0.758872,-0.933237,0.955057


In [267]:
df3[(df3['W']>0) & (df3['Y']<1)]=0

In [268]:
df3

Unnamed: 0,W,X,Y,Z
A,0.0,0.0,0.0,0.0
B,0.0,0.0,0.0,0.0
C,-2.018168,0.740122,0.528813,-0.589001
D,0.0,0.0,0.0,0.0
E,0.190794,1.978757,2.605967,0.683509


In [269]:
np.random.seed(101)
df3 = pd.DataFrame(randn(5,4), index = 'A B C D E'.split(), columns = 'W X Y Z'.split())
df3

Unnamed: 0,W,X,Y,Z
A,2.70685,0.628133,0.907969,0.503826
B,0.651118,-0.319318,-0.848077,0.605965
C,-2.018168,0.740122,0.528813,-0.589001
D,0.188695,-0.758872,-0.933237,0.955057
E,0.190794,1.978757,2.605967,0.683509


#### To combine two conditions, use **|** for `or` and **&** for `and`, with each condition wrapped in parentheses.
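A short sketch of combined conditions follows, using an invented frame `df`. The parentheses matter because `&` and `|` bind more tightly than comparison operators, and Python's own `and` / `or` keywords raise a `ValueError` when applied to a whole Series.

```python
import pandas as pd

df = pd.DataFrame(
    {"W": [2.7, 0.6, -2.0, 0.2], "Y": [0.9, -0.8, 0.5, 2.6]},
    index=["A", "B", "C", "D"],
)

# AND: both conditions must hold for a row to be kept.
both = df[(df["W"] > 0) & (df["Y"] < 1)]

# OR: at least one condition must hold.
either = df[(df["W"] > 1) | (df["Y"] > 2)]

# NOT: ~ inverts a boolean mask.
negated = df[~(df["W"] > 0)]

print(both.index.tolist())
print(either.index.tolist())
print(negated.index.tolist())
```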

### Conditional selection using ``.loc[]`` and ``.iloc[]``

In [270]:
df3

Unnamed: 0,W,X,Y,Z
A,2.70685,0.628133,0.907969,0.503826
B,0.651118,-0.319318,-0.848077,0.605965
C,-2.018168,0.740122,0.528813,-0.589001
D,0.188695,-0.758872,-0.933237,0.955057
E,0.190794,1.978757,2.605967,0.683509


In [272]:
df3.loc[(df3.X>0), ["X", "Y"]]

Unnamed: 0,X,Y
A,0.628133,0.907969
C,0.740122,0.528813
E,1.978757,2.605967


In [274]:
df3.loc[((df3.W>1) | (df3.Y<1)), ['Y','Z']]

Unnamed: 0,Y,Z
A,0.907969,0.503826
B,-0.848077,0.605965
C,0.528813,-0.589001
D,-0.933237,0.955057


## More Index Details

Let's discuss some more features of indexing, including resetting the index or setting it to something else. We'll also talk about index hierarchy!
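As a rough sketch of how the two operations relate, `set_index` moves a column into the index and `reset_index` moves it back. The frame `table` and its column names are hypothetical.

```python
import pandas as pd

table = pd.DataFrame({"code": ["x", "y"], "value": [10, 20]})

# The "code" column becomes the index; a new DataFrame is returned.
by_code = table.set_index("code")

# The index goes back to being an ordinary column.
restored = by_code.reset_index()

print(by_code)
print(restored.equals(table))
```

Neither call changes `table` itself; each returns a new object unless `inplace=True` is passed.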

In [276]:
df3

Unnamed: 0,W,X,Y,Z
A,2.70685,0.628133,0.907969,0.503826
B,0.651118,-0.319318,-0.848077,0.605965
C,-2.018168,0.740122,0.528813,-0.589001
D,0.188695,-0.758872,-0.933237,0.955057
E,0.190794,1.978757,2.605967,0.683509


In [279]:
df3.reset_index()

Unnamed: 0,index,W,X,Y,Z
0,A,2.70685,0.628133,0.907969,0.503826
1,B,0.651118,-0.319318,-0.848077,0.605965
2,C,-2.018168,0.740122,0.528813,-0.589001
3,D,0.188695,-0.758872,-0.933237,0.955057
4,E,0.190794,1.978757,2.605967,0.683509


In [281]:
df3.reset_index(drop=True) # discards the old index labels A to E instead of keeping them as a column

Unnamed: 0,W,X,Y,Z
0,2.70685,0.628133,0.907969,0.503826
1,0.651118,-0.319318,-0.848077,0.605965
2,-2.018168,0.740122,0.528813,-0.589001
3,0.188695,-0.758872,-0.933237,0.955057
4,0.190794,1.978757,2.605967,0.683509


In [283]:
df3.set_index("Z") # uses column Z as the index; returns a new DataFrame

Z,W,X,Y
0.503826,2.70685,0.628133,0.907969
0.605965,0.651118,-0.319318,-0.848077
-0.589001,-2.018168,0.740122,0.528813
0.955057,0.188695,-0.758872,-0.933237
0.683509,0.190794,1.978757,2.605967


In [284]:
df3

Unnamed: 0,W,X,Y,Z
A,2.70685,0.628133,0.907969,0.503826
B,0.651118,-0.319318,-0.848077,0.605965
C,-2.018168,0.740122,0.528813,-0.589001
D,0.188695,-0.758872,-0.933237,0.955057
E,0.190794,1.978757,2.605967,0.683509


In [285]:
df3.reset_index(drop=True, inplace=True)

In [286]:
df3

Unnamed: 0,W,X,Y,Z
0,2.70685,0.628133,0.907969,0.503826
1,0.651118,-0.319318,-0.848077,0.605965
2,-2.018168,0.740122,0.528813,-0.589001
3,0.188695,-0.758872,-0.933237,0.955057
4,0.190794,1.978757,2.605967,0.683509


## Multi-Index and Index Hierarchy

Let us go over how to work with a Multi-Index. A good first step is to see what a Multi-Indexed DataFrame looks like.

### Let's take a quick look at the [``.xs()``](http://localhost:8888/notebooks/pythonic/DAwPythonSessions/w3resource-pandas-dataframe-xs.ipynb)
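A minimal sketch of a Multi-Indexed DataFrame and of `.xs()` is given below. The level names `group` and `num`, the labels, and the values are all invented for illustration.

```python
import pandas as pd

# Two parallel lists define the outer and inner index levels.
outside = ["G1", "G1", "G2", "G2"]
inside = [1, 2, 1, 2]
hier_index = pd.MultiIndex.from_arrays([outside, inside], names=["group", "num"])

multi = pd.DataFrame(
    [[10, 11], [20, 21], [30, 31], [40, 41]],
    index=hier_index,
    columns=["A", "B"],
)
print(multi)

# .loc with an outer label returns every row under that label.
g1 = multi.loc["G1"]

# .xs takes a cross-section on an inner level: num == 1 in every group.
ones = multi.xs(1, level="num")

print(g1)
print(ones)
```

`.xs()` is convenient here because plain `.loc[]` addresses levels from the outside in, while `.xs()` can target an inner level by name.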

### Let's learn new functions/attributes/methods on "iris dataset" 

# End of the Session