# Commonly Used Functions and Methods in pandas

In this section we will learn some of the built-in functions and methods in pandas. What we cover here is only the tip of the iceberg; there are many more functions and methods than I can show. For the full list, see the [documentation](https://pandas.pydata.org/pandas-docs/stable/reference/index.html).
We will see more functions and methods in later sections. For now, we will cover:

* [apply() method](#apply_method)
* [apply() with a function](#apply_function)
* [apply() with a lambda expression](#apply_lambda)
* [apply() on multiple columns](#apply_multiple)
* [describe()](#describe)
* [sort_values()](#sort)
* [corr()](#corr)
* [idxmin and idxmax](#idx)
* [value_counts()](#v_c)
* [replace()](#replace)
* [unique and nunique()](#uni)
* [map()](#map)
* [duplicated and drop_duplicates()](#dup)
* [between()](#bet)
* [sample()](#sample)
* [nlargest()](#n)

In [1]:
import numpy as np
import pandas as pd

In [99]:
df = pd.read_csv('tips.csv')
df.head()

Unnamed: 0,total_bill,tip,sex,smoker,day,time,size,price_per_person,Payer Name,CC Number,Payment ID
0,16.99,1.01,Female,No,Sun,Dinner,2,8.49,Christy Cunningham,3560325168603410,Sun2959
1,10.34,1.66,Male,No,Sun,Dinner,3,3.45,Douglas Tucker,4478071379779230,Sun4608
2,21.01,3.5,Male,No,Sun,Dinner,3,7.0,Travis Walters,6011812112971322,Sun4458
3,23.68,3.31,Male,No,Sun,Dinner,2,11.84,Nathaniel Harris,4676137647685994,Sun5260
4,24.59,3.61,Female,No,Sun,Dinner,4,6.15,Tonya Carter,4832732618637221,Sun2251


## The .apply() Method

In [None]:
# The .apply() method applies a function to every value of a Series, or to the columns of a DataFrame, in a single call.
# Under the hood it runs a Python-level loop, so it is not very fast. For performance, vectorized operations are preferable.
# Vectorized operations run much faster and minimize the use of explicit for loops.
# On the other hand, .apply() is quite readable, easy to start with, and works in almost any situation, whereas a vectorized form is not always available.
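
A minimal sketch of the trade-off described above. The Series `s` and the helper `add_tax` are made up for illustration and are not part of the tips dataset. Both routes give the same result; the vectorized form simply avoids one Python function call per element.

```python
import pandas as pd

# Hypothetical data, only for illustration.
s = pd.Series([10.0, 20.0, 30.0])

def add_tax(x):
    # .apply() calls this once for every element of the Series.
    return x * 1.5

via_apply = s.apply(add_tax)   # element-by-element Python calls
via_vector = s * 1.5           # one vectorized operation on the whole Series

print(via_apply.equals(via_vector))
```

On a Series this small the speed difference is negligible; it tends to matter as the number of rows grows.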

In [None]:
# Goal: take the last 4 digits of the values in the CC Number column and update the column with them.

## Method 1

In [4]:
# def son4(x):
#   return str(x)[-4:]

In [8]:
# df['CC Number'] = df['CC Number'].apply(son4)

In [9]:
# df.head()

Unnamed: 0,total_bill,tip,sex,smoker,day,time,size,price_per_person,Payer Name,CC Number,Payment ID
0,16.99,1.01,Female,No,Sun,Dinner,2,8.49,Christy Cunningham,3410,Sun2959
1,10.34,1.66,Male,No,Sun,Dinner,3,3.45,Douglas Tucker,9230,Sun4608
2,21.01,3.5,Male,No,Sun,Dinner,3,7.0,Travis Walters,1322,Sun4458
3,23.68,3.31,Male,No,Sun,Dinner,2,11.84,Nathaniel Harris,5994,Sun5260
4,24.59,3.61,Female,No,Sun,Dinner,4,6.15,Tonya Carter,7221,Sun2251


## Method 2

In [12]:
# df.head()

Unnamed: 0,total_bill,tip,sex,smoker,day,time,size,price_per_person,Payer Name,CC Number,Payment ID
0,16.99,1.01,Female,No,Sun,Dinner,2,8.49,Christy Cunningham,3560325168603410,Sun2959
1,10.34,1.66,Male,No,Sun,Dinner,3,3.45,Douglas Tucker,4478071379779230,Sun4608
2,21.01,3.5,Male,No,Sun,Dinner,3,7.0,Travis Walters,6011812112971322,Sun4458
3,23.68,3.31,Male,No,Sun,Dinner,2,11.84,Nathaniel Harris,4676137647685994,Sun5260
4,24.59,3.61,Female,No,Sun,Dinner,4,6.15,Tonya Carter,4832732618637221,Sun2251


In [16]:
# df['CC Number'] = df['CC Number'].apply(lambda x : str(x)[-4:])

In [17]:
# df.head()

Unnamed: 0,total_bill,tip,sex,smoker,day,time,size,price_per_person,Payer Name,CC Number,Payment ID
0,16.99,1.01,Female,No,Sun,Dinner,2,8.49,Christy Cunningham,3410,Sun2959
1,10.34,1.66,Male,No,Sun,Dinner,3,3.45,Douglas Tucker,9230,Sun4608
2,21.01,3.5,Male,No,Sun,Dinner,3,7.0,Travis Walters,1322,Sun4458
3,23.68,3.31,Male,No,Sun,Dinner,2,11.84,Nathaniel Harris,5994,Sun5260
4,24.59,3.61,Female,No,Sun,Dinner,4,6.15,Tonya Carter,7221,Sun2251


## Method 3 (vectorized, the fastest)

In [19]:
df.head()

Unnamed: 0,total_bill,tip,sex,smoker,day,time,size,price_per_person,Payer Name,CC Number,Payment ID
0,16.99,1.01,Female,No,Sun,Dinner,2,8.49,Christy Cunningham,3560325168603410,Sun2959
1,10.34,1.66,Male,No,Sun,Dinner,3,3.45,Douglas Tucker,4478071379779230,Sun4608
2,21.01,3.5,Male,No,Sun,Dinner,3,7.0,Travis Walters,6011812112971322,Sun4458
3,23.68,3.31,Male,No,Sun,Dinner,2,11.84,Nathaniel Harris,4676137647685994,Sun5260
4,24.59,3.61,Female,No,Sun,Dinner,4,6.15,Tonya Carter,4832732618637221,Sun2251


In [26]:
df['CC Number'].astype(str).str[-4:] # Vectorized operation

0      3410
1      9230
2      1322
3      5994
4      7221
       ... 
239    2842
240    5404
241    7196
242    0950
243    8139
Name: CC Number, Length: 244, dtype: object
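
A small sketch of why the result is kept as strings (`dtype: object` in the output above). The two card numbers are copied from rows shown in this notebook; the variable names are illustrative. The `.str` accessor only works on string data, hence the `astype(str)` step, and converting the slice back to integers would drop leading zeros.

```python
import pandas as pd

# Two card numbers from the dataset, stored as integers.
cc = pd.Series([3560325168603410, 4375220550950])

last4 = cc.astype(str).str[-4:]   # convert to text first, then slice
print(last4.tolist())             # ['3410', '0950']

# Casting back to int loses the leading zero of '0950'.
as_int = last4.astype(int)
print(as_int.tolist())            # [3410, 950]
```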

In [None]:
# Goal: create a column named Fiyat_Kategorisi (price category) that contains $ if the bill is under 10 dollars,
# $$ if it is at least 10 but under 30 dollars, and $$$ if it is 30 dollars or more.

## Method 1

In [27]:
df.head()

Unnamed: 0,total_bill,tip,sex,smoker,day,time,size,price_per_person,Payer Name,CC Number,Payment ID
0,16.99,1.01,Female,No,Sun,Dinner,2,8.49,Christy Cunningham,3560325168603410,Sun2959
1,10.34,1.66,Male,No,Sun,Dinner,3,3.45,Douglas Tucker,4478071379779230,Sun4608
2,21.01,3.5,Male,No,Sun,Dinner,3,7.0,Travis Walters,6011812112971322,Sun4458
3,23.68,3.31,Male,No,Sun,Dinner,2,11.84,Nathaniel Harris,4676137647685994,Sun5260
4,24.59,3.61,Female,No,Sun,Dinner,4,6.15,Tonya Carter,4832732618637221,Sun2251


In [28]:
def fiyat_kategori(x):
  if x < 10 :
    return '$'
  elif 10 <= x < 30 :
    return '$$'
  else : 
    return '$$$'

In [32]:
df['Fiyat_Kategorisi'] = df['total_bill'].apply(fiyat_kategori)

In [35]:
df.head()

Unnamed: 0,total_bill,tip,sex,smoker,day,time,size,price_per_person,Payer Name,CC Number,Payment ID,Fiyat_Kategorisi
0,16.99,1.01,Female,No,Sun,Dinner,2,8.49,Christy Cunningham,3560325168603410,Sun2959,$$
1,10.34,1.66,Male,No,Sun,Dinner,3,3.45,Douglas Tucker,4478071379779230,Sun4608,$$
2,21.01,3.5,Male,No,Sun,Dinner,3,7.0,Travis Walters,6011812112971322,Sun4458,$$
3,23.68,3.31,Male,No,Sun,Dinner,2,11.84,Nathaniel Harris,4676137647685994,Sun5260,$$
4,24.59,3.61,Female,No,Sun,Dinner,4,6.15,Tonya Carter,4832732618637221,Sun2251,$$


## Method 2

In [37]:
df.head()

Unnamed: 0,total_bill,tip,sex,smoker,day,time,size,price_per_person,Payer Name,CC Number,Payment ID
0,16.99,1.01,Female,No,Sun,Dinner,2,8.49,Christy Cunningham,3560325168603410,Sun2959
1,10.34,1.66,Male,No,Sun,Dinner,3,3.45,Douglas Tucker,4478071379779230,Sun4608
2,21.01,3.5,Male,No,Sun,Dinner,3,7.0,Travis Walters,6011812112971322,Sun4458
3,23.68,3.31,Male,No,Sun,Dinner,2,11.84,Nathaniel Harris,4676137647685994,Sun5260
4,24.59,3.61,Female,No,Sun,Dinner,4,6.15,Tonya Carter,4832732618637221,Sun2251


In [39]:
df['Fiyat_Kategorisi'] = df['total_bill'].apply(lambda x : '$' if x < 10 else ('$$' if 10 <= x < 30 else '$$$'))

In [40]:
df.sample(10)

Unnamed: 0,total_bill,tip,sex,smoker,day,time,size,price_per_person,Payer Name,CC Number,Payment ID,Fiyat_Kategorisi
16,10.33,1.67,Female,No,Sun,Dinner,3,3.44,Elizabeth Foster,4240025044626033,Sun9715,$$
242,17.82,1.75,Male,No,Sat,Dinner,2,8.91,Dennis Dixon,4375220550950,Sat17,$$
52,34.81,5.2,Female,No,Sun,Dinner,4,8.7,Emily Daniel,4291280793094374,Sun6165,$$$
6,8.77,2.0,Male,No,Sun,Dinner,2,4.38,Kristopher Johnson,2223727524230344,Sun5985,$
198,13.0,2.0,Female,Yes,Thur,Lunch,2,6.5,Katherine Bond,4926725945192,Thur437,$$
217,11.59,1.5,Male,Yes,Sat,Dinner,2,5.8,Gary Orr,30324521283406,Sat8489,$$
63,18.29,3.76,Male,Yes,Sat,Dinner,4,4.57,Chad Hart,580171498976,Sat4178,$$
207,38.73,3.0,Male,Yes,Sat,Dinner,4,9.68,Ricky Ramirez,347817964484033,Sat4505,$$$
153,24.55,2.0,Male,No,Sun,Dinner,4,6.14,Todd Patterson,4416804908942159,Sun8670,$$
101,15.38,3.0,Female,Yes,Fri,Dinner,2,7.69,Tiffany Colon,6011012799432041,Fri8382,$$


## Method 3 (vectorized, the fastest)

In [41]:
df.head()

Unnamed: 0,total_bill,tip,sex,smoker,day,time,size,price_per_person,Payer Name,CC Number,Payment ID,Fiyat_Kategorisi
0,16.99,1.01,Female,No,Sun,Dinner,2,8.49,Christy Cunningham,3560325168603410,Sun2959,$$
1,10.34,1.66,Male,No,Sun,Dinner,3,3.45,Douglas Tucker,4478071379779230,Sun4608,$$
2,21.01,3.5,Male,No,Sun,Dinner,3,7.0,Travis Walters,6011812112971322,Sun4458,$$
3,23.68,3.31,Male,No,Sun,Dinner,2,11.84,Nathaniel Harris,4676137647685994,Sun5260,$$
4,24.59,3.61,Female,No,Sun,Dinner,4,6.15,Tonya Carter,4832732618637221,Sun2251,$$


In [42]:
df.drop('Fiyat_Kategorisi' , axis = 1 , inplace = True)

In [47]:
df['total_bill'].values

array([16.99, 10.34, 21.01, 23.68, 24.59, 25.29,  8.77, 26.88, 15.04,
       14.78, 10.27, 35.26, 15.42, 18.43, 14.83, 21.58, 10.33, 16.29,
       16.97, 20.65, 17.92, 20.29, 15.77, 39.42, 19.82, 17.81, 13.37,
       12.69, 21.7 , 19.65,  9.55, 18.35, 15.06, 20.69, 17.78, 24.06,
       16.31, 16.93, 18.69, 31.27, 16.04, 17.46, 13.94,  9.68, 30.4 ,
       18.29, 22.23, 32.4 , 28.55, 18.04, 12.54, 10.29, 34.81,  9.94,
       25.56, 19.49, 38.01, 26.41, 11.24, 48.27, 20.29, 13.81, 11.02,
       18.29, 17.59, 20.08, 16.45,  3.07, 20.23, 15.01, 12.02, 17.07,
       26.86, 25.28, 14.73, 10.51, 17.92, 27.2 , 22.76, 17.29, 19.44,
       16.66, 10.07, 32.68, 15.98, 34.83, 13.03, 18.28, 24.71, 21.16,
       28.97, 22.49,  5.75, 16.32, 22.75, 40.17, 27.28, 12.03, 21.01,
       12.46, 11.35, 15.38, 44.3 , 22.42, 20.92, 15.36, 20.49, 25.21,
       18.24, 14.31, 14.  ,  7.25, 38.07, 23.95, 25.71, 17.31, 29.93,
       10.65, 12.43, 24.08, 11.69, 13.42, 14.26, 15.95, 12.48, 29.8 ,
        8.52, 14.52,

In [43]:
df.head()

Unnamed: 0,total_bill,tip,sex,smoker,day,time,size,price_per_person,Payer Name,CC Number,Payment ID
0,16.99,1.01,Female,No,Sun,Dinner,2,8.49,Christy Cunningham,3560325168603410,Sun2959
1,10.34,1.66,Male,No,Sun,Dinner,3,3.45,Douglas Tucker,4478071379779230,Sun4608
2,21.01,3.5,Male,No,Sun,Dinner,3,7.0,Travis Walters,6011812112971322,Sun4458
3,23.68,3.31,Male,No,Sun,Dinner,2,11.84,Nathaniel Harris,4676137647685994,Sun5260
4,24.59,3.61,Female,No,Sun,Dinner,4,6.15,Tonya Carter,4832732618637221,Sun2251


In [49]:
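def fiyat_kategori_vectorized(bill):
  # Start with every entry set to '$'; dtype 'U3' leaves room for up to 3 characters.
  categories = np.full_like(bill , '$' , dtype= 'U3')

  # Overwrite entries in [10, 30) using a boolean mask.
  categories[(bill >= 10) & (bill < 30)] = '$$'

  # Overwrite entries that are 30 or more.
  categories[bill >= 30] = '$$$'

  return categories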

In [51]:
df['Fiyat_Kategorisi'] = fiyat_kategori_vectorized(df['total_bill'].values)
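
As an alternative sketch (not the approach used in this notebook), the same binning can be written with `pd.cut`. The sample values in `bills` are made up to hit the bin edges, and the example assumes bills are never negative, since values below the first edge would become `NaN`.

```python
import numpy as np
import pandas as pd

# Hypothetical bills chosen to include the boundary values 10 and 30.
bills = pd.Series([8.77, 10.0, 16.99, 30.0, 34.81])

# right=False makes each bin closed on the left: [0, 10), [10, 30), [30, inf)
categories = pd.cut(
    bills,
    bins=[0, 10, 30, np.inf],
    labels=['$', '$$', '$$$'],
    right=False,
)

print(categories.astype(str).tolist())
# ['$', '$$', '$$', '$$$', '$$$']
```

The result is a categorical Series, which also records the order of the labels.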

In [52]:
df.sample(10)

Unnamed: 0,total_bill,tip,sex,smoker,day,time,size,price_per_person,Payer Name,CC Number,Payment ID,Fiyat_Kategorisi
7,26.88,3.12,Male,No,Sun,Dinner,4,6.72,Robert Buck,3514785077705092,Sun8157,$$
159,16.49,2.0,Male,No,Sun,Dinner,4,4.12,Christopher Soto,30501814271434,Sun1781,$$
33,20.69,2.45,Female,No,Sat,Dinner,4,5.17,Amber Francis,377742985258914,Sat6649,$$
92,5.75,1.0,Female,Yes,Fri,Dinner,2,2.88,Leah Ramirez,3508911676966392,Fri3780,$
79,17.29,2.71,Male,No,Thur,Lunch,2,8.64,Brian Diaz,4759290988169738,Thur9501,$$
139,13.16,2.75,Female,No,Thur,Lunch,2,6.58,Lindsey Meyer,676239597203,Thur6245,$$
115,17.31,3.5,Female,No,Sun,Dinner,2,8.65,Kayla Stone,379494319310858,Sun8746,$$
60,20.29,3.21,Male,Yes,Sat,Dinner,2,10.14,Anthony Mclean,347614304015027,Sat2353,$$
137,14.15,2.0,Female,No,Thur,Lunch,2,7.08,Vanessa Morris,213189344156819,Thur3890,$$
104,20.92,4.08,Female,No,Sat,Dinner,2,10.46,Gabrielle Frederick,4013010878990106,Sat3194,$$


# .apply() can be applied to multiple columns at once

## Method 1

In [53]:
df.head()

Unnamed: 0,total_bill,tip,sex,smoker,day,time,size,price_per_person,Payer Name,CC Number,Payment ID,Fiyat_Kategorisi
0,16.99,1.01,Female,No,Sun,Dinner,2,8.49,Christy Cunningham,3560325168603410,Sun2959,$$
1,10.34,1.66,Male,No,Sun,Dinner,3,3.45,Douglas Tucker,4478071379779230,Sun4608,$$
2,21.01,3.5,Male,No,Sun,Dinner,3,7.0,Travis Walters,6011812112971322,Sun4458,$$
3,23.68,3.31,Male,No,Sun,Dinner,2,11.84,Nathaniel Harris,4676137647685994,Sun5260,$$
4,24.59,3.61,Female,No,Sun,Dinner,4,6.15,Tonya Carter,4832732618637221,Sun2251,$$


In [54]:
def kareAl(x):
  return x**2

In [56]:
# df[['total_bill' , 'tip']] = df[['total_bill' , 'tip']].apply(kareAl)
df[['total_bill' , 'tip']].apply(kareAl)

Unnamed: 0,total_bill,tip
0,288.6601,1.0201
1,106.9156,2.7556
2,441.4201,12.2500
3,560.7424,10.9561
4,604.6681,13.0321
...,...,...
239,842.7409,35.0464
240,738.7524,4.0000
241,513.9289,4.0000
242,317.5524,3.0625


## Method 2

In [58]:
# df[['total_bill' , 'tip']] = df[['total_bill' , 'tip']].apply(lambda x : x**2)
df[['total_bill' , 'tip']].apply(lambda x : x**2)

Unnamed: 0,total_bill,tip
0,288.6601,1.0201
1,106.9156,2.7556
2,441.4201,12.2500
3,560.7424,10.9561
4,604.6681,13.0321
...,...,...
239,842.7409,35.0464
240,738.7524,4.0000
241,513.9289,4.0000
242,317.5524,3.0625


## Method 3 (vectorized, the fastest)

In [61]:
# df[['total_bill' , 'tip']] = np.square(df[['total_bill' , 'tip']])
np.square(df[['total_bill' , 'tip']])

Unnamed: 0,total_bill,tip
0,288.6601,1.0201
1,106.9156,2.7556
2,441.4201,12.2500
3,560.7424,10.9561
4,604.6681,13.0321
...,...,...
239,842.7409,35.0464
240,738.7524,4.0000
241,513.9289,4.0000
242,317.5524,3.0625
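
A short sketch of what happens in the three approaches above, on a made-up two-row frame (`frame` is an illustrative name). When `.apply()` is called on a DataFrame, the function receives each column as a whole Series, not one value at a time, which is why `x**2` works on it directly. The plain `**` operator is a further vectorized option.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature version of the two numeric columns.
frame = pd.DataFrame({'total_bill': [16.99, 10.34], 'tip': [1.01, 1.66]})

squared_apply = frame.apply(lambda col: col ** 2)  # col is a whole column (Series)
squared_np = np.square(frame)                      # NumPy ufunc applied to the frame
squared_op = frame ** 2                            # plain operator, also vectorized

# The function passed to DataFrame.apply sees entire columns:
col_lengths = frame.apply(lambda col: len(col))
print(col_lengths.tolist())   # [2, 2]
```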


In [62]:
df.describe()

Unnamed: 0,total_bill,tip,size,price_per_person,CC Number
count,244.0,244.0,244.0,244.0,244.0
mean,19.785943,2.998279,2.569672,7.888197,2563496000000000.0
std,8.902412,1.383638,0.9511,2.914234,2369340000000000.0
min,3.07,1.0,1.0,2.88,60406790000.0
25%,13.3475,2.0,2.0,5.8,30407310000000.0
50%,17.795,2.9,2.0,7.255,3525318000000000.0
75%,24.1275,3.5625,3.0,9.39,4553675000000000.0
max,50.81,10.0,6.0,20.27,6596454000000000.0
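
The output above only lists numeric columns, which is the default behavior of `describe()` on a mixed frame. A minimal sketch on made-up data (`frame` is illustrative) of how `include='all'` also summarizes text columns:

```python
import pandas as pd

# Hypothetical frame with one numeric and one text column.
frame = pd.DataFrame({
    'total_bill': [16.99, 10.34, 21.01],
    'day': ['Sun', 'Sun', 'Sat'],
})

numeric_summary = frame.describe()             # numeric columns only
full_summary = frame.describe(include='all')   # adds unique, top, freq for text columns

print(list(numeric_summary.columns))    # ['total_bill']
print(full_summary.loc['top', 'day'])   # Sun
```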


In [63]:
df.head()

Unnamed: 0,total_bill,tip,sex,smoker,day,time,size,price_per_person,Payer Name,CC Number,Payment ID,Fiyat_Kategorisi
0,16.99,1.01,Female,No,Sun,Dinner,2,8.49,Christy Cunningham,3560325168603410,Sun2959,$$
1,10.34,1.66,Male,No,Sun,Dinner,3,3.45,Douglas Tucker,4478071379779230,Sun4608,$$
2,21.01,3.5,Male,No,Sun,Dinner,3,7.0,Travis Walters,6011812112971322,Sun4458,$$
3,23.68,3.31,Male,No,Sun,Dinner,2,11.84,Nathaniel Harris,4676137647685994,Sun5260,$$
4,24.59,3.61,Female,No,Sun,Dinner,4,6.15,Tonya Carter,4832732618637221,Sun2251,$$


In [64]:
df.sort_values(by= 'total_bill') # ascending=True is the default, so it sorts from smallest to largest.

Unnamed: 0,total_bill,tip,sex,smoker,day,time,size,price_per_person,Payer Name,CC Number,Payment ID,Fiyat_Kategorisi
67,3.07,1.00,Female,Yes,Sat,Dinner,1,3.07,Tiffany Brock,4359488526995267,Sat3455,$
92,5.75,1.00,Female,Yes,Fri,Dinner,2,2.88,Leah Ramirez,3508911676966392,Fri3780,$
111,7.25,1.00,Female,No,Sat,Dinner,1,7.25,Terri Jones,3559221007826887,Sat4801,$
172,7.25,5.15,Male,Yes,Sun,Dinner,2,3.62,Larry White,30432617123103,Sun9209,$
149,7.51,2.00,Male,No,Thur,Lunch,2,3.76,Daniel Robbins,4823139288341889,Thur6321,$
...,...,...,...,...,...,...,...,...,...,...,...,...
182,45.35,3.50,Male,Yes,Sun,Dinner,3,15.12,Jose Parsons,4112207559459910,Sun2337,$$$
156,48.17,5.00,Male,No,Sun,Dinner,6,8.03,Ryan Gonzales,3523151482063321,Sun7518,$$$
59,48.27,6.73,Male,No,Sat,Dinner,4,12.07,Brian Ortiz,6596453823950595,Sat8139,$$$
212,48.33,9.00,Male,No,Sat,Dinner,4,12.08,Alex Williamson,676218815212,Sat4590,$$$


In [66]:
df.sort_values(by= 'total_bill' , ascending = False , inplace = True)

In [67]:
df.head()

Unnamed: 0,total_bill,tip,sex,smoker,day,time,size,price_per_person,Payer Name,CC Number,Payment ID,Fiyat_Kategorisi
170,50.81,10.0,Male,Yes,Sat,Dinner,3,16.94,Gregory Clark,5473850968388236,Sat1954,$$$
212,48.33,9.0,Male,No,Sat,Dinner,4,12.08,Alex Williamson,676218815212,Sat4590,$$$
59,48.27,6.73,Male,No,Sat,Dinner,4,12.07,Brian Ortiz,6596453823950595,Sat8139,$$$
156,48.17,5.0,Male,No,Sun,Dinner,6,8.03,Ryan Gonzales,3523151482063321,Sun7518,$$$
182,45.35,3.5,Male,Yes,Sun,Dinner,3,15.12,Jose Parsons,4112207559459910,Sun2337,$$$


In [68]:
df.reset_index(inplace = True , drop = True)

In [69]:
df.head()

Unnamed: 0,total_bill,tip,sex,smoker,day,time,size,price_per_person,Payer Name,CC Number,Payment ID,Fiyat_Kategorisi
0,50.81,10.0,Male,Yes,Sat,Dinner,3,16.94,Gregory Clark,5473850968388236,Sat1954,$$$
1,48.33,9.0,Male,No,Sat,Dinner,4,12.08,Alex Williamson,676218815212,Sat4590,$$$
2,48.27,6.73,Male,No,Sat,Dinner,4,12.07,Brian Ortiz,6596453823950595,Sat8139,$$$
3,48.17,5.0,Male,No,Sun,Dinner,6,8.03,Ryan Gonzales,3523151482063321,Sun7518,$$$
4,45.35,3.5,Male,Yes,Sun,Dinner,3,15.12,Jose Parsons,4112207559459910,Sun2337,$$$


In [72]:
df.sort_values(['tip' , 'total_bill'] , ascending = False)

Unnamed: 0,total_bill,tip,sex,smoker,day,time,size,price_per_person,Payer Name,CC Number,Payment ID,Fiyat_Kategorisi
0,50.81,10.00,Male,Yes,Sat,Dinner,3,16.94,Gregory Clark,5473850968388236,Sat1954,$$$
1,48.33,9.00,Male,No,Sat,Dinner,4,12.08,Alex Williamson,676218815212,Sat4590,$$$
10,39.42,7.58,Male,No,Sat,Dinner,4,9.86,Lance Peterson,3542584061609808,Sat239,$$$
2,48.27,6.73,Male,No,Sat,Dinner,4,12.07,Brian Ortiz,6596453823950595,Sat8139,$$$
20,34.30,6.70,Male,No,Thur,Lunch,6,5.72,Steven Carlson,3526515703718508,Thur1025,$$$
...,...,...,...,...,...,...,...,...,...,...,...,...
131,16.99,1.01,Female,No,Sun,Dinner,2,8.49,Christy Cunningham,3560325168603410,Sun2959,$$
195,12.60,1.00,Male,Yes,Sat,Dinner,2,6.30,Matthew Myers,3543676378973965,Sat5032,$$
240,7.25,1.00,Female,No,Sat,Dinner,1,7.25,Terri Jones,3559221007826887,Sat4801,$
242,5.75,1.00,Female,Yes,Fri,Dinner,2,2.88,Leah Ramirez,3508911676966392,Fri3780,$
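
When sorting by several columns, `ascending` can also be a list with one flag per column. A small sketch on made-up rows (`frame` is illustrative): tips are sorted from largest to smallest, and ties are broken by the smaller bill first.

```python
import pandas as pd

# Hypothetical rows with tied tip values.
frame = pd.DataFrame({
    'tip': [1.0, 5.0, 1.0, 5.0],
    'total_bill': [7.25, 41.19, 5.75, 43.11],
})

# tip descending, then total_bill ascending within equal tips
ordered = frame.sort_values(['tip', 'total_bill'], ascending=[False, True])

print(ordered.index.tolist())   # [1, 3, 2, 0]
```

Without `inplace=True`, `sort_values` returns a new frame and leaves `frame` itself in its original order.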


In [76]:
df.head(10)

Unnamed: 0,total_bill,tip,sex,smoker,day,time,size,price_per_person,Payer Name,CC Number,Payment ID,Fiyat_Kategorisi
0,50.81,10.0,Male,Yes,Sat,Dinner,3,16.94,Gregory Clark,5473850968388236,Sat1954,$$$
1,48.33,9.0,Male,No,Sat,Dinner,4,12.08,Alex Williamson,676218815212,Sat4590,$$$
2,48.27,6.73,Male,No,Sat,Dinner,4,12.07,Brian Ortiz,6596453823950595,Sat8139,$$$
3,48.17,5.0,Male,No,Sun,Dinner,6,8.03,Ryan Gonzales,3523151482063321,Sun7518,$$$
4,45.35,3.5,Male,Yes,Sun,Dinner,3,15.12,Jose Parsons,4112207559459910,Sun2337,$$$
5,44.3,2.5,Female,Yes,Sat,Dinner,3,14.77,Heather Cohen,379771118886604,Sat6240,$$$
6,43.11,5.0,Female,Yes,Thur,Lunch,4,10.78,Brooke Soto,5544902205760175,Thur9313,$$$
7,41.19,5.0,Male,No,Thur,Lunch,5,8.24,Eric Andrews,4356531761046453,Thur3621,$$$
8,40.55,3.0,Male,Yes,Sun,Dinner,2,20.27,Stephen Cox,3547798222044029,Sun5140,$$$
9,40.17,4.73,Male,Yes,Fri,Dinner,4,10.04,Aaron Bentley,180026611638690,Fri9628,$$$


In [74]:
df['price_per_person'].max()

20.27

In [75]:
df['price_per_person'].argmax()

8

In [77]:
# df['price_per_person'].idxmax()

8
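
Both calls return 8 here only because the index was reset to 0, 1, 2, ... earlier. In general `argmax()` returns the integer position and `idxmax()` returns the index label. A minimal sketch with a made-up, string-labeled Series:

```python
import pandas as pd

# Hypothetical Series whose labels differ from its positions.
prices = pd.Series([8.49, 20.27, 3.45], index=['a', 'b', 'c'])

pos = prices.argmax()     # integer position of the maximum
label = prices.idxmax()   # index label of the maximum

print(pos, label)                              # 1 b
print(prices.iloc[pos] == prices.loc[label])   # True
```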

In [78]:
df.head()

Unnamed: 0,total_bill,tip,sex,smoker,day,time,size,price_per_person,Payer Name,CC Number,Payment ID,Fiyat_Kategorisi
0,50.81,10.0,Male,Yes,Sat,Dinner,3,16.94,Gregory Clark,5473850968388236,Sat1954,$$$
1,48.33,9.0,Male,No,Sat,Dinner,4,12.08,Alex Williamson,676218815212,Sat4590,$$$
2,48.27,6.73,Male,No,Sat,Dinner,4,12.07,Brian Ortiz,6596453823950595,Sat8139,$$$
3,48.17,5.0,Male,No,Sun,Dinner,6,8.03,Ryan Gonzales,3523151482063321,Sun7518,$$$
4,45.35,3.5,Male,Yes,Sun,Dinner,3,15.12,Jose Parsons,4112207559459910,Sun2337,$$$


In [79]:
df.corr() # Very important when building machine learning models.
          # Shows the linear relationships between the numeric columns.

Unnamed: 0,total_bill,tip,size,price_per_person,CC Number
total_bill,1.0,0.675734,0.598315,0.647554,0.104576
tip,0.675734,1.0,0.489299,0.347405,0.110857
size,0.598315,0.489299,1.0,-0.175359,-0.030239
price_per_person,0.647554,0.347405,-0.175359,1.0,0.13524
CC Number,0.104576,0.110857,-0.030239,0.13524,1.0
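
A version note, shown as a sketch on made-up data (`frame` is illustrative). In pandas 2.0 and later, `corr()` no longer drops text columns silently, so on a mixed frame like this one it is safer to pass `numeric_only=True`, a parameter available since pandas 1.5.

```python
import pandas as pd

# Hypothetical frame mixing numeric and text columns.
frame = pd.DataFrame({
    'total_bill': [10.0, 20.0, 30.0, 40.0],
    'tip': [1.0, 2.0, 3.0, 4.5],
    'day': ['Sun', 'Sat', 'Sun', 'Fri'],
})

# Restrict the calculation to numeric columns explicitly.
corr_matrix = frame.corr(numeric_only=True)

print(list(corr_matrix.columns))   # ['total_bill', 'tip']
```

The default method is the Pearson coefficient, which only captures linear relationships.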


In [81]:
df['sex'].value_counts()

Male      157
Female     87
Name: sex, dtype: int64

In [83]:
df.day.value_counts()

Sat     87
Sun     76
Thur    62
Fri     19
Name: day, dtype: int64
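
If shares are more useful than raw counts, `value_counts()` accepts `normalize=True`. A minimal sketch on a made-up Series (`days` is illustrative):

```python
import pandas as pd

# Hypothetical day labels.
days = pd.Series(['Sat', 'Sun', 'Sat', 'Fri'])

counts = days.value_counts()                 # absolute frequencies
shares = days.value_counts(normalize=True)   # relative frequencies, summing to 1

print(counts['Sat'])   # 2
print(shares['Sat'])   # 0.5
```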

In [84]:
df.day.unique()

array(['Sat', 'Sun', 'Thur', 'Fri'], dtype=object)

In [85]:
df.day.nunique()

4

In [86]:
# len(df.day.unique())

4
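
`nunique()` and `len(unique())` agree here because the `day` column has no missing values. A small sketch on made-up data shows where they differ: `nunique()` skips `NaN` by default, while `unique()` lists it.

```python
import numpy as np
import pandas as pd

# Hypothetical Series with one missing value.
days = pd.Series(['Sat', 'Sun', np.nan, 'Sat'])

n_with_len = len(days.unique())           # NaN is included in unique()
n_default = days.nunique()                # NaN is excluded by default
n_keep_nan = days.nunique(dropna=False)   # count NaN as its own value

print(n_with_len, n_default, n_keep_nan)   # 3 2 3
```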

In [87]:
# Goal: change 'Male' to 'M' and 'Female' to 'F' in the sex column.

## Method 1 (with .apply())

In [88]:
df.head()

Unnamed: 0,total_bill,tip,sex,smoker,day,time,size,price_per_person,Payer Name,CC Number,Payment ID,Fiyat_Kategorisi
0,50.81,10.0,Male,Yes,Sat,Dinner,3,16.94,Gregory Clark,5473850968388236,Sat1954,$$$
1,48.33,9.0,Male,No,Sat,Dinner,4,12.08,Alex Williamson,676218815212,Sat4590,$$$
2,48.27,6.73,Male,No,Sat,Dinner,4,12.07,Brian Ortiz,6596453823950595,Sat8139,$$$
3,48.17,5.0,Male,No,Sun,Dinner,6,8.03,Ryan Gonzales,3523151482063321,Sun7518,$$$
4,45.35,3.5,Male,Yes,Sun,Dinner,3,15.12,Jose Parsons,4112207559459910,Sun2337,$$$


In [91]:
df['sex'] = df['sex'].apply(lambda x : 'M' if x == 'Male' else 'F')

In [92]:
df.head()

Unnamed: 0,total_bill,tip,sex,smoker,day,time,size,price_per_person,Payer Name,CC Number,Payment ID,Fiyat_Kategorisi
0,50.81,10.0,M,Yes,Sat,Dinner,3,16.94,Gregory Clark,5473850968388236,Sat1954,$$$
1,48.33,9.0,M,No,Sat,Dinner,4,12.08,Alex Williamson,676218815212,Sat4590,$$$
2,48.27,6.73,M,No,Sat,Dinner,4,12.07,Brian Ortiz,6596453823950595,Sat8139,$$$
3,48.17,5.0,M,No,Sun,Dinner,6,8.03,Ryan Gonzales,3523151482063321,Sun7518,$$$
4,45.35,3.5,M,Yes,Sun,Dinner,3,15.12,Jose Parsons,4112207559459910,Sun2337,$$$


## Method 2 (with .replace())

In [94]:
df.head()

Unnamed: 0,total_bill,tip,sex,smoker,day,time,size,price_per_person,Payer Name,CC Number,Payment ID
0,16.99,1.01,Female,No,Sun,Dinner,2,8.49,Christy Cunningham,3560325168603410,Sun2959
1,10.34,1.66,Male,No,Sun,Dinner,3,3.45,Douglas Tucker,4478071379779230,Sun4608
2,21.01,3.5,Male,No,Sun,Dinner,3,7.0,Travis Walters,6011812112971322,Sun4458
3,23.68,3.31,Male,No,Sun,Dinner,2,11.84,Nathaniel Harris,4676137647685994,Sun5260
4,24.59,3.61,Female,No,Sun,Dinner,4,6.15,Tonya Carter,4832732618637221,Sun2251


In [97]:
# Assign the result back; inplace=True on a selected column is deprecated in recent pandas versions.
df['sex'] = df['sex'].replace('Female' , 'F')
df['sex'] = df['sex'].replace('Male' , 'M')

In [98]:
df.head()

Unnamed: 0,total_bill,tip,sex,smoker,day,time,size,price_per_person,Payer Name,CC Number,Payment ID
0,16.99,1.01,F,No,Sun,Dinner,2,8.49,Christy Cunningham,3560325168603410,Sun2959
1,10.34,1.66,M,No,Sun,Dinner,3,3.45,Douglas Tucker,4478071379779230,Sun4608
2,21.01,3.5,M,No,Sun,Dinner,3,7.0,Travis Walters,6011812112971322,Sun4458
3,23.68,3.31,M,No,Sun,Dinner,2,11.84,Nathaniel Harris,4676137647685994,Sun5260
4,24.59,3.61,F,No,Sun,Dinner,4,6.15,Tonya Carter,4832732618637221,Sun2251


In [100]:
df.head()

Unnamed: 0,total_bill,tip,sex,smoker,day,time,size,price_per_person,Payer Name,CC Number,Payment ID
0,16.99,1.01,Female,No,Sun,Dinner,2,8.49,Christy Cunningham,3560325168603410,Sun2959
1,10.34,1.66,Male,No,Sun,Dinner,3,3.45,Douglas Tucker,4478071379779230,Sun4608
2,21.01,3.5,Male,No,Sun,Dinner,3,7.0,Travis Walters,6011812112971322,Sun4458
3,23.68,3.31,Male,No,Sun,Dinner,2,11.84,Nathaniel Harris,4676137647685994,Sun5260
4,24.59,3.61,Female,No,Sun,Dinner,4,6.15,Tonya Carter,4832732618637221,Sun2251


In [101]:
df['sex'].replace(['Female' , 'Male'] , ['F' , 'M'])

0      F
1      M
2      M
3      M
4      F
      ..
239    M
240    F
241    M
242    M
243    F
Name: sex, Length: 244, dtype: object

## Method 3 (with .map())

In [102]:
df['sex'].map({'Female' : 'F' , 
               'Male' : 'M'})

0      F
1      M
2      M
3      M
4      F
      ..
239    M
240    F
241    M
242    M
243    F
Name: sex, Length: 244, dtype: object
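
`.replace()` and `.map()` give the same result above because every value has an entry in the dictionary. A minimal sketch with a made-up extra value (`'Other'` is not in the dataset) shows the difference: `.map()` turns unmapped values into `NaN`, while `.replace()` leaves them as they are.

```python
import pandas as pd

# Hypothetical Series containing a value that the mapping does not cover.
sex = pd.Series(['Female', 'Male', 'Other'])
mapping = {'Female': 'F', 'Male': 'M'}

mapped = sex.map(mapping)         # unmapped 'Other' becomes NaN
replaced = sex.replace(mapping)   # unmapped 'Other' is kept

print(mapped.isna().tolist())   # [False, False, True]
print(replaced.tolist())        # ['F', 'M', 'Other']
```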

In [None]:
# Done.