___

<p style="text-align: center;"><img src="https://docs.google.com/uc?id=1lY0Uj5R04yMY3-ZppPWxqCr5pvBLYPnV" class="img-fluid" 
alt="CLRSWY"></p>

## <p style="background-color:#FDFEFE; font-family:newtimeroman; color:#9d4f8c; font-size:100%; text-align:center; border-radius:10px 10px;">WAY TO REINVENT YOURSELF</p>

<img src=https://i.ibb.co/6gCsHd6/1200px-Pandas-logo-svg.png width="700" height="200">

## <p style="background-color:#FDFEFE; font-family:newtimeroman; color:#060108; font-size:200%; text-align:center; border-radius:10px 10px;">Data Analysis with Python</p>

## <p style="background-color:#FDFEFE; font-family:newtimeroman; color:#060108; font-size:150%; text-align:center; border-radius:10px 10px;">Session - 04</p>

## <p style="background-color:#FDFEFE; font-family:newtimeroman; color:#4d77cf; font-size:200%; text-align:center; border-radius:10px 10px;">Pandas DataFrames</p>

<a id="toc"></a>

## <p style="background-color:#9d4f8c; font-family:newtimeroman; color:#FFF9ED; font-size:175%; text-align:center; border-radius:10px 10px;">Content</p>

* [IMPORTING LIBRARIES NEEDED IN THIS NOTEBOOK](#0)
* [DATA FRAMES](#1)
* [CREATING A DATA FRAME](#2)
    * [Creating a DataFrame Using the Lists of Data & Columns](#2.1)
    * [Creating a DataFrame Using NumPy Arrays](#2.2)
    * [Creating a DataFrame Using a Dictionary](#2.3)
    * [The Examination of Some Attributes on Data](#2.4)
* [INDEXING, SLICING & SELECTION](#3)    
* [CREATING A NEW COLUMN](#4)    
* [REMOVING COLUMNS](#5)
* [REMOVING ROWS](#6)
* [SELECTING ROWS & COLUMNS USING .loc[ ] & .iloc[ ] ](#7)
* [CONDITIONAL SELECTION](#8)
    * [One Conditional Statement](#8.1)
    * [Two or More Conditional Statements](#8.2)
    * [Conditional Selection Using .loc[ ]](#8.3)
* [reset_index() & set_index()](#9)
* [Multi-Index & Index Hierarchy](#10)
* [Some Other Useful Methods with Iris Dataset](#11)
* [THE END OF THE SESSION-04](#12)

## <p style="background-color:#9d4f8c; font-family:newtimeroman; color:#FFF9ED; font-size:175%; text-align:center; border-radius:10px 10px;">Importing Libraries Needed in This Notebook</p>

<a id="0"></a>
<a href="#toc" class="btn btn-primary btn-sm" role="button" aria-pressed="true" 
style="color:blue; background-color:#dfa8e4" data-toggle="popover">Content</a>

Once you've installed NumPy and pandas, you can import them as libraries:

In [1]:
import numpy as np
import pandas as pd

## <p style="background-color:#9d4f8c; font-family:newtimeroman; color:#FFF9ED; font-size:175%; text-align:center; border-radius:10px 10px;">Data Frames</p>

<a id="1"></a>
<a href="#toc" class="btn btn-primary btn-sm" role="button" aria-pressed="true" 
style="color:blue; background-color:#dfa8e4" data-toggle="popover">Content</a>

A DataFrame is a two-dimensional data container, similar to a matrix, but it can contain heterogeneous data, and its rows and columns can carry symbolic names. ``DataFrames`` are the workhorse of pandas and are directly inspired by the R programming language. We can think of a DataFrame as a bunch of Series objects put together to share the same index.
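
As a minimal sketch of the "Series sharing one index" idea (the Series names and values below are made up for illustration), two Series with the same index can be combined into one DataFrame:

```python
import pandas as pd

# Two hypothetical Series that share the same index labels
height = pd.Series([170, 182, 165], index=["ann", "bob", "cem"])
city = pd.Series(["Izmir", "Ankara", "Bursa"], index=["ann", "bob", "cem"])

# Each Series becomes one column; the shared index becomes the row labels
people = pd.DataFrame({"height": height, "city": city})
print(people)

# Selecting a single column gives back a Series
print(type(people["height"]))
```

Note that the two columns hold different kinds of data (numbers and text), which is what "heterogeneous" means here.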

### Why use Pandas?

Data scientists use pandas in Python for the **following advantages**:

- It handles missing data easily
- It uses Series for one-dimensional data and DataFrame for two-dimensional data
- It provides an efficient way to slice the data
- It provides a flexible way to merge, concatenate or reshape the data
- It includes a powerful time series tool to work with

In a nutshell, pandas is a useful library for data analysis. It can be used to perform data manipulation and analysis. Pandas provides powerful and easy-to-use data structures, as well as the means to quickly perform operations on these structures.

[SOURCE01](https://pandas.pydata.org/pandas-docs/stable/user_guide/dsintro.html), 
[SOURCE02](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html), 
[SOURCE03](https://morioh.com/p/2528ac775b1b), 
[SOURCE04](https://www.datacamp.com/community/tutorials/pandas-tutorial-dataframe-python), 
[SOURCE05](https://www.guru99.com/python-pandas-tutorial.html), 
[SOURCE06](https://www.tutorialspoint.com/python_pandas/python_pandas_dataframe.htm), 
[SOURCE07](https://realpython.com/pandas-dataframe/) &
[SOURCE08](https://towardsdatascience.com/a-simple-guide-to-pandas-dataframes-b125f64e1453)<br>
[VIDEO SOURCE01](https://www.youtube.com/watch?v=zmdjNSmRXF4), 
[VIDEO SOURCE02](https://www.youtube.com/watch?v=F6kmIpWWEdU) &
[VIDEO SOURCE03](https://towardsdatascience.com/pandas-dataframe-basics-3c16eb35c4f3)<br>

**Now let's use pandas to explore this topic!**

![image.png](attachment:image.png)

## <p style="background-color:#9d4f8c; font-family:newtimeroman; color:#FFF9ED; font-size:175%; text-align:center; border-radius:10px 10px;">Creating a DataFrame</p>

<a id="2"></a>
<a href="#toc" class="btn btn-primary btn-sm" role="button" aria-pressed="true" 
style="color:blue; background-color:#dfa8e4" data-toggle="popover">Content</a>

A **``DataFrame``** is a **two-dimensional collection of data**: a data structure that stores data in **tabular form**, arranged in rows and columns. A single DataFrame can hold several columns of different types. We can perform many operations on it, such as arithmetic, selecting rows and columns, and adding or removing rows and columns.

DataFrames can be loaded from external storage such as a SQL database, a CSV file, or an Excel file. They can also be built from Python objects such as lists, dictionaries, and lists of dictionaries.
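
As a rough sketch of two of these routes (the column names and values are invented, and `io.StringIO` stands in for a real CSV file path), a DataFrame can be built from a list of dictionaries or read from CSV text:

```python
import io
import pandas as pd

# From a list of dictionaries: each dict is one row, the keys become column names
rows = [{"name": "a", "score": 1}, {"name": "b", "score": 2}]
df_rows = pd.DataFrame(rows)

# From CSV text; io.StringIO is a stand-in for a file on disk (assumption)
csv_text = "name,score\na,1\nb,2\n"
df_csv = pd.read_csv(io.StringIO(csv_text))

print(df_rows)
print(df_csv)
```

With a real file, the path would be passed to `pd.read_csv` in place of the `StringIO` object.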

In this session, we will learn to create the DataFrame in multiple ways. Let's understand these different ways.

**``pandas.DataFrame(data=None, index=None, columns=None, dtype=None, copy=None)``**

### <p style="background-color:#9d4f8c; font-family:newtimeroman; color:#FFF9ED; font-size:150%; text-align:LEFT; border-radius:10px 10px;">Creating a DataFrame Using the Lists of Data & Columns</p>

<a id="2.1"></a>
<a href="#toc" class="btn btn-primary btn-sm" role="button" aria-pressed="true" 
style="color:blue; background-color:#dfa8e4" data-toggle="popover">Content</a>

In [2]:
data = [1,3,5,7,9]
data

[1, 3, 5, 7, 9]

In [3]:
pd.DataFrame(data)

Unnamed: 0,0
0,1
1,3
2,5
3,7
4,9


In [4]:
pd.Series(data)

0    1
1    3
2    5
3    7
4    9
dtype: int64

#### We say a DataFrame is formed by combining Series, but as shown above, a DataFrame can also have a single column.

In [5]:
pd.DataFrame(data=data , columns=["column_1"])

Unnamed: 0,column_1
0,1
1,3
2,5
3,7
4,9


### <p style="background-color:#9d4f8c; font-family:newtimeroman; color:#FFF9ED; font-size:150%; text-align:LEFT; border-radius:10px 10px;">Creating a DataFrame Using NumPy Arrays</p>

<a id="2.2"></a>
<a href="#toc" class="btn btn-primary btn-sm" role="button" aria-pressed="true" 
style="color:blue; background-color:#dfa8e4" data-toggle="popover">Content</a>

In [6]:
data  =  np.arange(1,24,2).reshape(3,4)
data

array([[ 1,  3,  5,  7],
       [ 9, 11, 13, 15],
       [17, 19, 21, 23]])

In [7]:
df = pd.DataFrame(data,columns=["var1","var2","var3","var4"])
df

Unnamed: 0,var1,var2,var3,var4
0,1,3,5,7
1,9,11,13,15
2,17,19,21,23


In [8]:
# df = pd.DataFrame(data,columns=["var1","var2","var3"])  
## this raises an error because the number of column names must equal the number of columns in data
# df
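
To see the error without stopping the notebook, the mismatched call can be wrapped in `try`/`except`. This is only a sketch; the variable `error_message` is introduced here for illustration:

```python
import numpy as np
import pandas as pd

data = np.arange(1, 24, 2).reshape(3, 4)   # 3 rows, 4 columns

error_message = None
try:
    # Only 3 names are supplied for 4 columns of data
    pd.DataFrame(data, columns=["var1", "var2", "var3"])
except ValueError as err:
    error_message = str(err)
    print("ValueError:", error_message)
```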

### <p style="background-color:#9d4f8c; font-family:newtimeroman; color:#FFF9ED; font-size:150%; text-align:LEFT; border-radius:10px 10px;">Creating a DataFrame Using a Dictionary</p>

<a id="2.3"></a>
<a href="#toc" class="btn btn-primary btn-sm" role="button" aria-pressed="true" 
style="color:blue; background-color:#dfa8e4" data-toggle="popover">Content</a>

In [9]:
s1 = np.random.randint(2, 10, size = 4)
s1

array([9, 7, 7, 7])

In [10]:
s1 = np.random.randint(2, 10, size = 4)
s2 = np.random.randint(3, 10, size = 4)
s3 = np.random.randint(4, 15, size = 4)

In [11]:
mydict = {"var1":s1,"var2":s2,"var3":s3}
mydict

{'var1': array([8, 6, 2, 7]),
 'var2': array([6, 6, 8, 6]),
 'var3': array([13, 12, 10,  8])}

In [12]:
df = pd.DataFrame(mydict)
df

Unnamed: 0,var1,var2,var3
0,8,6,13
1,6,6,12
2,2,8,10
3,7,6,8


### <p style="background-color:#9d4f8c; font-family:newtimeroman; color:#FFF9ED; font-size:150%; text-align:LEFT; border-radius:10px 10px;">The Examination of Some Attributes on Data</p>

<a id="2.4"></a>
<a href="#toc" class="btn btn-primary btn-sm" role="button" aria-pressed="true" 
style="color:blue; background-color:#dfa8e4" data-toggle="popover">Content</a>

In [13]:
df

Unnamed: 0,var1,var2,var3
0,8,6,13
1,6,6,12
2,2,8,10
3,7,6,8


In [14]:
df.head(2)

Unnamed: 0,var1,var2,var3
0,8,6,13
1,6,6,12


In [15]:
df.tail(2)

Unnamed: 0,var1,var2,var3
2,2,8,10
3,7,6,8


In [16]:
df.sample(2)

Unnamed: 0,var1,var2,var3
3,7,6,8
0,8,6,13


In [17]:
df.columns

Index(['var1', 'var2', 'var3'], dtype='object')

In [18]:
for col in df.columns:
    print(col)

var1
var2
var3


In [19]:
for col in df.columns:
    print(df[col].mean())

5.75
6.5
10.75


In [20]:
df.index

RangeIndex(start=0, stop=4, step=1)

In [21]:
[i for i in df.index]

[0, 1, 2, 3]

In [22]:
df.columns          # we can change these column names

Index(['var1', 'var2', 'var3'], dtype='object')

In [23]:
df.columns  = ["new1","new2","new3"]
df

Unnamed: 0,new1,new2,new3
0,8,6,13
1,6,6,12
2,2,8,10
3,7,6,8


In [24]:
df.index  # likewise, we can change the index labels

RangeIndex(start=0, stop=4, step=1)

In [25]:
df.index =[15,25,35,45]
df

Unnamed: 0,new1,new2,new3
15,8,6,13
25,6,6,12
35,2,8,10
45,7,6,8


## What if I have 10 columns and want to rename only 2 of them?

In [26]:
df.rename(columns={"new1":"aaa","new2":"bbb"})

Unnamed: 0,aaa,bbb,new3
15,8,6,13
25,6,6,12
35,2,8,10
45,7,6,8


In [27]:
df.rename(index={15: 1, 25: 2})

Unnamed: 0,new1,new2,new3
1,8,6,13
2,6,6,12
35,2,8,10
45,7,6,8


In [28]:
# we have not made a permanent change to df yet!

In [29]:
df

Unnamed: 0,new1,new2,new3
15,8,6,13
25,6,6,12
35,2,8,10
45,7,6,8


In [30]:
df = df.rename(index={15:"a",25:"b",35:"c",45:"d"})         # now the change is permanent
df

Unnamed: 0,new1,new2,new3
a,8,6,13
b,6,6,12
c,2,8,10
d,7,6,8
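
The behavior above can be summarized in a small sketch (using a cut-down, made-up frame named `df_demo`): `rename()` returns a new DataFrame and leaves the original untouched unless the result is assigned back.

```python
import pandas as pd

# Hypothetical two-row frame, just for illustration
df_demo = pd.DataFrame({"new1": [8, 6], "new2": [6, 6]}, index=[15, 25])

renamed = df_demo.rename(columns={"new1": "aaa"})   # returns a new DataFrame
print(list(df_demo.columns))   # the original is unchanged
print(list(renamed.columns))

df_demo = df_demo.rename(index={15: "a", 25: "b"})  # reassign to keep the change
print(list(df_demo.index))
```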


In [31]:
df.shape

(4, 3)

In [32]:
df.shape[0]

4

In [33]:
len(df)  # this also gives the number of rows, not the size

4

In [34]:
df.size

12

In [35]:
df.ndim

2

In [36]:
df.values

array([[ 8,  6, 13],
       [ 6,  6, 12],
       [ 2,  8, 10],
       [ 7,  6,  8]])

In [37]:
type(df.values)

numpy.ndarray

In [38]:
type(df)

pandas.core.frame.DataFrame

In [39]:
type(df["new1"])

pandas.core.series.Series

In [40]:
"new1" in df

True

In [58]:
df.isin(["new1"])   # isin() checks the values, not the column names

Unnamed: 0,new1,new2,new3
a,False,False,False
b,False,False,False
c,False,False,False
d,False,False,False
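
The difference between `in` and `isin()` can be sketched on a tiny, made-up frame: `in` tests the column labels, while `isin()` tests every cell value.

```python
import pandas as pd

# Hypothetical frame for illustration
df_demo = pd.DataFrame({"new1": [8, 6], "new2": [6, 12]})

print("new1" in df_demo)        # membership test on the column labels
print(8 in df_demo)             # 8 is a value, not a column label
print(df_demo.isin([8, 12]))    # element-wise test on the values

# Is any cell one of these values?
print(df_demo.isin([8, 12]).any().any())
```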


## <p style="background-color:#9d4f8c; font-family:newtimeroman; color:#FFF9ED; font-size:175%; text-align:center; border-radius:10px 10px;">Indexing, Slicing & Selection</p>

<a id="3"></a>
<a href="#toc" class="btn btn-primary btn-sm" role="button" aria-pressed="true" 
style="color:blue; background-color:#dfa8e4" data-toggle="popover">Content</a>

Let's learn a variety of methods to grab data from a DataFrame.

In [41]:
from numpy.random import randn

In [43]:
'A B C D E'.split()

['A', 'B', 'C', 'D', 'E']

In [42]:
# creating a DataFrame by "keyword arguments"

np.random.seed(101)
df = pd.DataFrame(randn(5, 4), index = 'A B C D E'.split(), columns = 'W X Y Z'.split())
df

Unnamed: 0,W,X,Y,Z
A,2.70685,0.628133,0.907969,0.503826
B,0.651118,-0.319318,-0.848077,0.605965
C,-2.018168,0.740122,0.528813,-0.589001
D,0.188695,-0.758872,-0.933237,0.955057
E,0.190794,1.978757,2.605967,0.683509


In [44]:
df["Y"]

A    0.907969
B   -0.848077
C    0.528813
D   -0.933237
E    2.605967
Name: Y, dtype: float64

In [45]:
df.Y

A    0.907969
B   -0.848077
C    0.528813
D   -0.933237
E    2.605967
Name: Y, dtype: float64

In [46]:
df[["Y"]]

Unnamed: 0,Y
A,0.907969
B,-0.848077
C,0.528813
D,-0.933237
E,2.605967


In [47]:
df

Unnamed: 0,W,X,Y,Z
A,2.70685,0.628133,0.907969,0.503826
B,0.651118,-0.319318,-0.848077,0.605965
C,-2.018168,0.740122,0.528813,-0.589001
D,0.188695,-0.758872,-0.933237,0.955057
E,0.190794,1.978757,2.605967,0.683509


In [49]:
df[["X","Y"]]    # single brackets return a Series, and a Series cannot hold two columns, so we pass a list inside a second pair of brackets

Unnamed: 0,X,Y
A,0.628133,0.907969
B,-0.319318,-0.848077
C,0.740122,0.528813
D,-0.758872,-0.933237
E,1.978757,2.605967


In [56]:
df["W":"Y"]         # a slice inside [] is applied to the ROW index, so this returns an empty frame; we will handle column slices with loc and iloc

Unnamed: 0,W,X,Y,Z


In [58]:
df["A":"C"]          # note that when slicing by label, the end label is included

Unnamed: 0,W,X,Y,Z
A,2.70685,0.628133,0.907969,0.503826
B,0.651118,-0.319318,-0.848077,0.605965
C,-2.018168,0.740122,0.528813,-0.589001


In [63]:
df["A":"C"][["Y","X"]]   # the column order in df does not matter; the columns come back in the order requested

Unnamed: 0,Y,X
A,0.907969,0.628133
B,-0.848077,-0.319318
C,0.528813,0.740122
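
A short sketch of the row-versus-column slicing behavior seen above, rebuilt on the same seeded frame (the names `rows`, `cols`, and `both` are introduced here only for illustration):

```python
import numpy as np
import pandas as pd

np.random.seed(101)
df_demo = pd.DataFrame(np.random.randn(5, 4),
                       index="A B C D E".split(), columns="W X Y Z".split())

rows = df_demo["A":"C"]                    # [] with a slice selects ROWS by label
cols = df_demo.loc[:, "W":"Y"]             # .loc slices COLUMNS by label, end included
both = df_demo.loc["A":"C", ["Y", "X"]]    # rows and columns in one step

print(rows.shape, cols.shape, both.shape)
```

Selecting rows and columns in a single `.loc[]` call avoids the chained `df[...][...]` form used above.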


## <p style="background-color:#9d4f8c; font-family:newtimeroman; color:#FFF9ED; font-size:175%; text-align:center; border-radius:10px 10px;">Creating a New Column</p>

<a id="4"></a>
<a href="#toc" class="btn btn-primary btn-sm" role="button" aria-pressed="true" 
style="color:blue; background-color:#dfa8e4" data-toggle="popover">Content</a>

In [64]:
df

Unnamed: 0,W,X,Y,Z
A,2.70685,0.628133,0.907969,0.503826
B,0.651118,-0.319318,-0.848077,0.605965
C,-2.018168,0.740122,0.528813,-0.589001
D,0.188695,-0.758872,-0.933237,0.955057
E,0.190794,1.978757,2.605967,0.683509


In [66]:
df["new1"] = df["W"] * df["X"]
df

Unnamed: 0,W,X,Y,Z,new1
A,2.70685,0.628133,0.907969,0.503826,1.700261
B,0.651118,-0.319318,-0.848077,0.605965,-0.207914
C,-2.018168,0.740122,0.528813,-0.589001,-1.493691
D,0.188695,-0.758872,-0.933237,0.955057,-0.143196
E,0.190794,1.978757,2.605967,0.683509,0.377536


In [68]:
# df["new2"] = [1,2,3]    # raises an error: the list has 3 values but df has 5 rows
# df
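
The length rule can be checked safely with `try`/`except`. This is a sketch on a made-up frame; `error_message` and the `flag` column are illustrative names:

```python
import numpy as np
import pandas as pd

# Hypothetical frame with 5 rows
df_demo = pd.DataFrame(np.zeros((5, 2)), columns=["W", "X"])

error_message = None
try:
    df_demo["new2"] = [1, 2, 3]             # 3 values for 5 rows
except ValueError as err:
    error_message = str(err)
    print("ValueError:", error_message)

df_demo["new2"] = np.arange(len(df_demo))   # one value per row works
df_demo["flag"] = 0                         # a scalar is repeated for every row
print(df_demo)
```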

In [69]:
df["new2"] = np.arange(5)
df

Unnamed: 0,W,X,Y,Z,new1,new2
A,2.70685,0.628133,0.907969,0.503826,1.700261,0
B,0.651118,-0.319318,-0.848077,0.605965,-0.207914,1
C,-2.018168,0.740122,0.528813,-0.589001,-1.493691,2
D,0.188695,-0.758872,-0.933237,0.955057,-0.143196,3
E,0.190794,1.978757,2.605967,0.683509,0.377536,4


## <p style="background-color:#9d4f8c; font-family:newtimeroman; color:#FFF9ED; font-size:175%; text-align:center; border-radius:10px 10px;">Removing Columns</p>

<a id="5"></a>
<a href="#toc" class="btn btn-primary btn-sm" role="button" aria-pressed="true" 
style="color:blue; background-color:#dfa8e4" data-toggle="popover">Content</a>

In [70]:
df.drop("new2",axis=1)

Unnamed: 0,W,X,Y,Z,new1
A,2.70685,0.628133,0.907969,0.503826,1.700261
B,0.651118,-0.319318,-0.848077,0.605965,-0.207914
C,-2.018168,0.740122,0.528813,-0.589001,-1.493691
D,0.188695,-0.758872,-0.933237,0.955057,-0.143196
E,0.190794,1.978757,2.605967,0.683509,0.377536


In [72]:
df    # note that df.drop() did not drop the column permanently

Unnamed: 0,W,X,Y,Z,new1,new2
A,2.70685,0.628133,0.907969,0.503826,1.700261,0
B,0.651118,-0.319318,-0.848077,0.605965,-0.207914,1
C,-2.018168,0.740122,0.528813,-0.589001,-1.493691,2
D,0.188695,-0.758872,-0.933237,0.955057,-0.143196,3
E,0.190794,1.978757,2.605967,0.683509,0.377536,4


### How do I drop more than one column? By passing them as a list.

In [73]:
df.drop(["new1","new2"],axis=1)

Unnamed: 0,W,X,Y,Z
A,2.70685,0.628133,0.907969,0.503826
B,0.651118,-0.319318,-0.848077,0.605965
C,-2.018168,0.740122,0.528813,-0.589001
D,0.188695,-0.758872,-0.933237,0.955057
E,0.190794,1.978757,2.605967,0.683509


In [74]:
df

Unnamed: 0,W,X,Y,Z,new1,new2
A,2.70685,0.628133,0.907969,0.503826,1.700261,0
B,0.651118,-0.319318,-0.848077,0.605965,-0.207914,1
C,-2.018168,0.740122,0.528813,-0.589001,-1.493691,2
D,0.188695,-0.758872,-0.933237,0.955057,-0.143196,3
E,0.190794,1.978757,2.605967,0.683509,0.377536,4


In [75]:
df.drop(["new1","new2"],axis=1,inplace=True)

In [77]:
df                            # instead of inplace=True, we could have reassigned with df = df.drop(...)

Unnamed: 0,W,X,Y,Z
A,2.70685,0.628133,0.907969,0.503826
B,0.651118,-0.319318,-0.848077,0.605965
C,-2.018168,0.740122,0.528813,-0.589001
D,0.188695,-0.758872,-0.933237,0.955057
E,0.190794,1.978757,2.605967,0.683509
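
The reassignment alternative to `inplace=True` can be sketched like this (a made-up three-column frame; `drop(columns=...)` is an equivalent spelling of `drop(..., axis=1)`):

```python
import pandas as pd

# Hypothetical frame for illustration
df_demo = pd.DataFrame({"W": [1, 2], "new1": [3, 4], "new2": [5, 6]})

dropped = df_demo.drop(columns=["new1", "new2"])   # same as drop([...], axis=1)
print(list(df_demo.columns))    # the original still has all three columns
print(list(dropped.columns))

df_demo = df_demo.drop("new2", axis=1)             # reassignment keeps the change
print(list(df_demo.columns))
```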


## <p style="background-color:#9d4f8c; font-family:newtimeroman; color:#FFF9ED; font-size:175%; text-align:center; border-radius:10px 10px;">Removing Rows</p>

<a id="6"></a>
<a href="#toc" class="btn btn-primary btn-sm" role="button" aria-pressed="true" 
style="color:blue; background-color:#dfa8e4" data-toggle="popover">Content</a>

In [78]:
df.drop('C', axis=0)

Unnamed: 0,W,X,Y,Z
A,2.70685,0.628133,0.907969,0.503826
B,0.651118,-0.319318,-0.848077,0.605965
D,0.188695,-0.758872,-0.933237,0.955057
E,0.190794,1.978757,2.605967,0.683509


In [80]:
df

Unnamed: 0,W,X,Y,Z
A,2.70685,0.628133,0.907969,0.503826
B,0.651118,-0.319318,-0.848077,0.605965
C,-2.018168,0.740122,0.528813,-0.589001
D,0.188695,-0.758872,-0.933237,0.955057
E,0.190794,1.978757,2.605967,0.683509


In [82]:
df.drop('E')       # axis=0 is the default

Unnamed: 0,W,X,Y,Z
A,2.70685,0.628133,0.907969,0.503826
B,0.651118,-0.319318,-0.848077,0.605965
C,-2.018168,0.740122,0.528813,-0.589001
D,0.188695,-0.758872,-0.933237,0.955057


## <p style="background-color:#9d4f8c; font-family:newtimeroman; color:#FFF9ED; font-size:175%; text-align:center; border-radius:10px 10px;">Selecting Rows and Columns using .loc[ ] and .iloc[ ]</p>

<a id="7"></a>
<a href="#toc" class="btn btn-primary btn-sm" role="button" aria-pressed="true" 
style="color:blue; background-color:#dfa8e4" data-toggle="popover">Content</a>

#### `.loc[]` → selects data using the **labels** (names) of rows (index) and columns

#### `.iloc[]` → selects data using the **integer positions** of rows and columns, like ordinary Python indexing
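
As a quick sketch of the difference before the walkthrough (a made-up one-column frame named `df_demo`), the same kind of slice means different things to the two accessors:

```python
import pandas as pd

# Hypothetical frame with string labels for illustration
df_demo = pd.DataFrame({"var1": [10, 20, 30, 40]}, index=["a", "b", "c", "d"])

by_label = df_demo.loc["b":"c"]     # labels, end label included
by_position = df_demo.iloc[1:2]     # positions, end position excluded

print(list(by_label.index))
print(list(by_position.index))

# The same cell reached by label and by position
print(df_demo.loc["c", "var1"], df_demo.iloc[2, 0])
```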

In [83]:
data = np.random.randint(1, 40, size=(8, 4))

df = pd.DataFrame(data, columns = ["var1", "var2", "var3", 'var4'])
df

Unnamed: 0,var1,var2,var3,var4
0,8,11,39,10
1,19,8,16,1
2,13,18,12,16
3,34,30,25,37
4,20,36,31,11
5,21,28,9,23
6,27,24,38,23
7,10,3,19,29


In [84]:
df.loc[4]

var1    20
var2    36
var3    31
var4    11
Name: 4, dtype: int32

In [85]:
df.loc[[4]]

Unnamed: 0,var1,var2,var3,var4
4,20,36,31,11


In [86]:
df

Unnamed: 0,var1,var2,var3,var4
0,8,11,39,10
1,19,8,16,1
2,13,18,12,16
3,34,30,25,37
4,20,36,31,11
5,21,28,9,23
6,27,24,38,23
7,10,3,19,29


In [91]:
df.loc[2:5]             # with loc the end label IS included; loc works with labels

Unnamed: 0,var1,var2,var3,var4
2,13,18,12,16
3,34,30,25,37
4,20,36,31,11
5,21,28,9,23


In [92]:
df.iloc[2:5]             # with iloc the end is excluded, because iloc works with the integer positions 0, 1, 2, ... and not with labels

Unnamed: 0,var1,var2,var3,var4
2,13,18,12,16
3,34,30,25,37
4,20,36,31,11


In [93]:
df.index = 'a b c d e f g h'.split()
df

Unnamed: 0,var1,var2,var3,var4
a,8,11,39,10
b,19,8,16,1
c,13,18,12,16
d,34,30,25,37
e,20,36,31,11
f,21,28,9,23
g,27,24,38,23
h,10,3,19,29


In [95]:
#  df.loc[1]   # KeyError: the index labels are now strings

In [96]:
df.iloc[1]

var1    19
var2     8
var3    16
var4     1
Name: b, dtype: int32

In [98]:
df.iloc[1:4]           # iloc ignores the index labels and uses the underlying integer positions

Unnamed: 0,var1,var2,var3,var4
b,19,8,16,1
c,13,18,12,16
d,34,30,25,37


In [100]:
df.loc["b":"f"] 

Unnamed: 0,var1,var2,var3,var4
b,19,8,16,1
c,13,18,12,16
d,34,30,25,37
e,20,36,31,11
f,21,28,9,23


In [None]:
#  df.loc[   rows     ,   columns   ]

In [101]:
df

Unnamed: 0,var1,var2,var3,var4
a,8,11,39,10
b,19,8,16,1
c,13,18,12,16
d,34,30,25,37
e,20,36,31,11
f,21,28,9,23
g,27,24,38,23
h,10,3,19,29


In [103]:
# df["d","var3"]   # KeyError: don't forget to use loc

In [105]:
df.loc["d","var3"]           # with loc we use the labels

25

In [106]:
df.loc[ "d":"g"    , "var2"   ]

d    30
e    36
f    28
g    24
Name: var2, dtype: int32

In [109]:
df.loc[ "d":"g"    , ["var2"  ] ]        # to get a DataFrame back, put the column inside a list (double brackets)

Unnamed: 0,var2
d,30
e,36
f,28
g,24


In [110]:
df.loc[ "d":"g"    , ["var2" ,"var3"]] 

Unnamed: 0,var2,var3
d,30,25
e,36,31
f,28,9
g,24,38


In [112]:
df

Unnamed: 0,var1,var2,var3,var4
a,8,11,39,10
b,19,8,16,1
c,13,18,12,16
d,34,30,25,37
e,20,36,31,11
f,21,28,9,23
g,27,24,38,23
h,10,3,19,29


In [113]:
df.iloc[2:5,2]

c    12
d    25
e    31
Name: var3, dtype: int32

In [114]:
df.iloc[2:5,[2]]

Unnamed: 0,var3
c,12
d,25
e,31


In [118]:
df.loc["a","var1"]

8

In [120]:
df.loc[["a"],["var1"]]

Unnamed: 0,var1
a,8


In [121]:
df.loc[["a","b"],["var1","var4"]]

Unnamed: 0,var1,var4
a,8,10
b,19,1


## <p style="background-color:#9d4f8c; font-family:newtimeroman; color:#FFF9ED; font-size:175%; text-align:center; border-radius:10px 10px;">Conditional Selection</p>

<a id="8"></a>
<a href="#toc" class="btn btn-primary btn-sm" role="button" aria-pressed="true" 
style="color:blue; background-color:#dfa8e4" data-toggle="popover">Content</a>

An important feature of pandas is conditional selection using bracket notation, very similar to NumPy:

In [122]:
df

Unnamed: 0,var1,var2,var3,var4
a,8,11,39,10
b,19,8,16,1
c,13,18,12,16
d,34,30,25,37
e,20,36,31,11
f,21,28,9,23
g,27,24,38,23
h,10,3,19,29


### <p style="background-color:#9d4f8c; font-family:newtimeroman; color:#FFF9ED; font-size:150%; text-align:LEFT; border-radius:10px 10px;">One Conditional Statement</p>

<a id="8.1"></a>
<a href="#toc" class="btn btn-primary btn-sm" role="button" aria-pressed="true" 
style="color:blue; background-color:#dfa8e4" data-toggle="popover">Content</a>

In [123]:
df > 10

Unnamed: 0,var1,var2,var3,var4
a,False,True,True,False
b,True,False,True,False
c,True,True,True,True
d,True,True,True,True
e,True,True,True,True
f,True,True,False,True
g,True,True,True,True
h,False,False,True,True


In [125]:
df[df > 10]          # values that meet the condition are kept, the others become NaN; NaN is a float, so the columns turn into float

Unnamed: 0,var1,var2,var3,var4
a,,11.0,39.0,
b,19.0,,16.0,
c,13.0,18.0,12.0,16.0
d,34.0,30.0,25.0,37.0
e,20.0,36.0,31.0,11.0
f,21.0,28.0,,23.0
g,27.0,24.0,38.0,23.0
h,,,19.0,29.0
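
The integer-to-float change can be checked directly on the dtypes. The sketch below uses a made-up 2x2 frame; `where()` with a fill value is shown as an alternative when NaN is not wanted:

```python
import pandas as pd

# Hypothetical integer frame for illustration
df_demo = pd.DataFrame({"var1": [8, 19], "var2": [11, 8]})

masked = df_demo[df_demo > 10]     # cells failing the test become NaN
print(df_demo.dtypes.tolist())     # integer columns
print(masked.dtypes.tolist())      # float columns, because NaN is a float

# where() keeps values meeting the condition; 'other' replaces the rest
filled = df_demo.where(df_demo > 10, other=0)
print(filled)
```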


In [127]:
df[df["var1"] > 10 ]    # filtered the rows by var1

Unnamed: 0,var1,var2,var3,var4
b,19,8,16,1
c,13,18,12,16
d,34,30,25,37
e,20,36,31,11
f,21,28,9,23
g,27,24,38,23


In [130]:
df[df["var1"] > 10 ]["var2"]          # wrote the condition on var1, then selected var2

b     8
c    18
d    30
e    36
f    28
g    24
Name: var2, dtype: int32

In [133]:
df[df["var1"] > 10 ][["var2"]]

Unnamed: 0,var2
b,8
c,18
d,30
e,36
f,28
g,24


In [134]:
df[df["var1"] > 10 ][["var2","var3"]]

Unnamed: 0,var2,var3
b,8,16
c,18,12
d,30,25
e,36,31
f,28,9
g,24,38


In [135]:
df[df["var1"] > 10 ][["var2","var3"]].mean()

var2    24.000000
var3    21.833333
dtype: float64

### <p style="background-color:#9d4f8c; font-family:newtimeroman; color:#FFF9ED; font-size:150%; text-align:LEFT; border-radius:10px 10px;">Two or More Conditional Statements</p>

<a id="8.2"></a>
<a href="#toc" class="btn btn-primary btn-sm" role="button" aria-pressed="true" 
style="color:blue; background-color:#dfa8e4" data-toggle="popover">Content</a>

**For two or more conditions, use `|` (or) and `&` (and), and wrap each condition in parentheses:**

In [136]:
df

Unnamed: 0,var1,var2,var3,var4
a,8,11,39,10
b,19,8,16,1
c,13,18,12,16
d,34,30,25,37
e,20,36,31,11
f,21,28,9,23
g,27,24,38,23
h,10,3,19,29


In [137]:
df[(df["var1"] > 10 )   &  (df["var2"] > 30 )  ]

Unnamed: 0,var1,var2,var3,var4
e,20,36,31,11
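
A sketch of the three element-wise operators on a cut-down, made-up frame. The parentheses matter because `&` and `|` bind more tightly than the comparison operators; the plain Python keywords `and` / `or` do not work on a Series and raise a `ValueError`:

```python
import pandas as pd

# Hypothetical subset of rows for illustration
df_demo = pd.DataFrame({"var1": [8, 19, 34, 20], "var2": [11, 8, 30, 36]},
                       index=list("abde"))

both = df_demo[(df_demo["var1"] > 10) & (df_demo["var2"] > 30)]        # and
either = df_demo[(df_demo["var1"] < 10) | (df_demo["var2"] > 30)]      # or
neither = df_demo[~((df_demo["var1"] < 10) | (df_demo["var2"] > 30))]  # not

print(list(both.index), list(either.index), list(neither.index))
```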


### <p style="background-color:#9d4f8c; font-family:newtimeroman; color:#FFF9ED; font-size:150%; text-align:LEFT; border-radius:10px 10px;">Conditional Selection Using .loc[ ] and .iloc[ ]</p>

<a id="8.3"></a>
<a href="#toc" class="btn btn-primary btn-sm" role="button" aria-pressed="true" 
style="color:blue; background-color:#dfa8e4" data-toggle="popover">Content</a>

In [138]:
df.loc[df["var1"] > 10 , ["var2","var3"]]             # pattern: df.loc[ row conditions , columns ]

Unnamed: 0,var2,var3
b,8,16
c,18,12
d,30,25
e,36,31
f,28,9
g,24,38


In [139]:
df

Unnamed: 0,var1,var2,var3,var4
a,8,11,39,10
b,19,8,16,1
c,13,18,12,16
d,34,30,25,37
e,20,36,31,11
f,21,28,9,23
g,27,24,38,23
h,10,3,19,29


In [140]:
df.loc[((df["var1"] < 10) | (df["var1"] > 30)), ['var2','var3']]

Unnamed: 0,var2,var3
a,11,39
d,30,25
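
The cells above use only `.loc[]`. As a hedged sketch of the `.iloc[]` counterpart (on a made-up frame): `.iloc[]` is position-based, so here the boolean mask is passed as a plain NumPy array via `to_numpy()`, which is a form it accepts, and the columns are given by position:

```python
import pandas as pd

# Hypothetical frame for illustration
df_demo = pd.DataFrame({"var1": [8, 19, 34], "var2": [11, 8, 30], "var3": [39, 16, 25]},
                       index=["a", "b", "d"])

mask = df_demo["var1"] > 10                          # boolean Series

with_loc = df_demo.loc[mask, ["var2", "var3"]]       # rows by condition, columns by label
with_iloc = df_demo.iloc[mask.to_numpy(), [1, 2]]    # same rows, columns by position

print(with_loc)
print(with_iloc)
```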


## <p style="background-color:#9d4f8c; font-family:newtimeroman; color:#FFF9ED; font-size:175%; text-align:center; border-radius:10px 10px;">reset_index() & set_index()</p>

<a id="9"></a>
<a href="#toc" class="btn btn-primary btn-sm" role="button" aria-pressed="true" 
style="color:blue; background-color:#dfa8e4" data-toggle="popover">Content</a>

Let's discuss some more features of indexing, including resetting the index or setting it to something else. We'll also talk about index hierarchy!
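
A minimal sketch of both methods, using a made-up frame with a hypothetical `code` column. Both return a new DataFrame and leave the original unchanged unless the result is assigned back:

```python
import pandas as pd

# Hypothetical frame for illustration
df_demo = pd.DataFrame({"var1": [8, 19, 13], "code": ["x", "y", "z"]},
                       index=["a", "b", "c"])

reset = df_demo.reset_index()             # old index becomes a column named "index"
dropped = df_demo.reset_index(drop=True)  # old index is discarded
by_code = df_demo.set_index("code")       # the "code" column becomes the index

print(list(reset.columns))
print(list(dropped.index))
print(list(by_code.index))
```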

## <p style="background-color:#9d4f8c; font-family:newtimeroman; color:#FFF9ED; font-size:175%; text-align:center; border-radius:10px 10px;">Multi-Index & Index Hierarchy</p>

<a id="10"></a>
<a href="#toc" class="btn btn-primary btn-sm" role="button" aria-pressed="true" 
style="color:blue; background-color:#dfa8e4" data-toggle="popover">Content</a>

Let us go over how to work with a Multi-Index. First, we'll create a quick example of what a Multi-Indexed DataFrame would look like:

**``Note``** that all of the MultiIndex constructors accept a names argument which stores string names for the levels themselves. If no names are provided, None will be assigned:
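
A small sketch of the `names` argument, using `MultiIndex.from_tuples` with made-up level values and the illustrative level names `group` and `num`:

```python
import pandas as pd

# Hypothetical (outer, inner) label pairs
pairs = [("G1", 1), ("G1", 2), ("G2", 1), ("G2", 2)]

unnamed = pd.MultiIndex.from_tuples(pairs)
named = pd.MultiIndex.from_tuples(pairs, names=["group", "num"])

print(list(unnamed.names))   # no names were given
print(list(named.names))
```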

For more information on indexing and selecting data, visit [**Pandas Official Documentation**](https://pandas.pydata.org/pandas-docs/version/0.13.0/indexing.html)

Now let's show how to index this! For an index hierarchy on the rows we use ``df.loc[]``; if the hierarchy were on the columns axis, you would use normal bracket notation ``df[]``. Calling one level of the index returns the sub-DataFrame:
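
A sketch of this on a made-up Multi-Indexed frame (the name `df_multi`, the level names, and the values are all assumptions for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical two-level row index: 2 groups x 3 numbers
index = pd.MultiIndex.from_product([["G1", "G2"], [1, 2, 3]], names=["group", "num"])
df_multi = pd.DataFrame(np.arange(12).reshape(6, 2), index=index, columns=["A", "B"])

sub = df_multi.loc["G1"]              # outer level -> sub-DataFrame indexed by "num"
row = df_multi.loc[("G2", 3)]         # both levels -> one row as a Series
cell = df_multi.loc[("G2", 3), "B"]   # both levels plus a column -> one value

print(sub)
print(row)
print(cell)
```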

For more information on MultiIndex and advanced indexing, visit [**Pandas Official Documentation**](https://pandas.pydata.org/docs/user_guide/advanced.html)

## <p style="background-color:#9d4f8c; font-family:newtimeroman; color:#FFF9ED; font-size:175%; text-align:center; border-radius:10px 10px;">Some Other Useful Methods with Iris Dataset</p>

<a id="11"></a>
<a href="#toc" class="btn btn-primary btn-sm" role="button" aria-pressed="true" 
style="color:blue; background-color:#dfa8e4" data-toggle="popover">Content</a>

### Let's apply the functions, attributes, and methods we have learned to the Iris dataset