# PANDAS 

## Series
### Q: What is a Series in Pandas?
A: A Series is a one-dimensional labeled array that can hold values of any data type. Its row labels are called the index. The `pd.Series` constructor converts a list, tuple, or dictionary into a Series. A Series holds a single column of values; it cannot contain multiple columns.

### Practice
You can think of a pandas Series as a single column with labels: a Series holds one list of values together with its index, whereas a DataFrame can be made up of more than one Series.

In [1]:
import numpy as np

In [2]:
import pandas as pd

In [3]:
labels = ['a', 'b','c']
my_data = [10,20,30]
arr = np.array(my_data)
d = {'a':10, 'b':20, 'c':30}

In [4]:
pd.Series(data=my_data) # pd.Series accepts several parameters; for now we focus on data and index.
                        # Here pandas assigned a default integer index to the data automatically.

0    10
1    20
2    30
dtype: int64

In [322]:
pd.Series(data=my_data, index=labels) # here the labels list is passed as the index for the data

a    10
b    20
c    30
dtype: int64

In [323]:
pd.Series(my_data, labels) # the positional form also works: the first argument is the data, the second is the index

a    10
b    20
c    30
dtype: int64

In [324]:
pd.Series(arr)

0    10
1    20
2    30
dtype: int32

In [325]:
pd.Series(arr,labels)

a    10
b    20
c    30
dtype: int32

In [326]:
pd.Series(d)

a    10
b    20
c    30
dtype: int64

In [327]:
d

{'a': 10, 'b': 20, 'c': 30}

In [328]:
pd.Series(data=[sum,print,len])

0      <built-in function sum>
1    <built-in function print>
2      <built-in function len>
dtype: object

In [329]:
ser1 = pd.Series([1,2,3,4],['USA','Germany','USSR','Japan']) # the countries are the index, the numbers are the data

In [330]:
ser1

USA        1
Germany    2
USSR       3
Japan      4
dtype: int64

In [331]:
ser2 = pd.Series([1,2,3,4],['USA','Germany','Italy','Japan']) 

In [332]:
ser2

USA        1
Germany    2
Italy      3
Japan      4
dtype: int64

In [333]:
ser1['USA'] # to fetch a value from the Series, pass its index label in square brackets.
            # In practice the index is usually a number or a string.

1

In [334]:
labels

['a', 'b', 'c']

In [335]:
ser3 = pd.Series(data=labels)

In [336]:
ser3

0    a
1    b
2    c
dtype: object

In [337]:
ser3[0]

'a'

In [338]:
ser1

USA        1
Germany    2
USSR       3
Japan      4
dtype: int64

In [339]:
ser2

USA        1
Germany    2
Italy      3
Japan      4
dtype: int64

In [340]:
ser1 + ser2 # addition aligns on the index labels (the countries): the result lists the union of the labels in alphabetical order
            # and sums the values with matching labels; a label that is missing from one Series yields NaN.

Germany    4.0
Italy      NaN
Japan      8.0
USA        2.0
USSR       NaN
dtype: float64
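
When two Series do not share all labels, `+` yields `NaN` for the unmatched ones, as the output above shows. If treating a missing label as zero is acceptable for your data (an assumption, not something this notebook prescribes), a minimal sketch using `Series.add` with `fill_value` could look like this:

```python
import pandas as pd

ser1 = pd.Series([1, 2, 3, 4], index=["USA", "Germany", "USSR", "Japan"])
ser2 = pd.Series([1, 2, 3, 4], index=["USA", "Germany", "Italy", "Japan"])

# Plain addition aligns on labels; unmatched labels become NaN
plain = ser1 + ser2

# fill_value=0 substitutes 0 for a label missing from one side before adding
filled = ser1.add(ser2, fill_value=0)
print(filled)
```

Whether 0 is a sensible stand-in depends on what the values mean; for counts it often is, for measurements it may not be.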

## Pandas - DataFrames - Part 1

#### Q: What is a DataFrame?
A: A DataFrame is the most widely used pandas data structure: a two-dimensional array with labeled axes (rows and columns). It is the standard way to store tabular data and has two indexes, a row index and a column index. It has the following properties:

The columns can hold heterogeneous types, such as int and bool.
It can be seen as a dictionary of Series in which both the rows and the columns are indexed. The column labels are exposed as "columns" and the row labels as "index".
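
The "dictionary of Series" view can be sketched directly. The names `age`, `active`, and `people` and their values are made up for illustration:

```python
import pandas as pd

# Two Series that share the same row labels (hypothetical example data)
age = pd.Series([25, 32, 47], index=["ali", "veli", "gul"])
active = pd.Series([True, False, True], index=["ali", "veli", "gul"])

# Each dictionary key becomes a column; the shared labels become the row index
people = pd.DataFrame({"age": age, "active": active})

print(people.dtypes)         # one dtype per column
print(list(people.index))    # row labels
print(list(people.columns))  # column labels
```

Each column keeps its own dtype, which is what "heterogeneous columns" means in practice.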

In [341]:
import numpy as np
import pandas as pd

In [342]:
from numpy.random import randn

In [343]:
np.random.seed(101)

In [344]:
df = pd.DataFrame(randn(5,4),['A','B','C','D','E'], ['W','X','Y','Z']) # inspecting pd.DataFrame with Shift+Tab shows the parameter order:
                                                                    # the first argument is the data (a matrix with 5 rows and 4 columns),
                                                                    # the second argument ['A','B','C','D','E'] is the index, i.e. the row labels,
                                                                    # the third argument ['W','X','Y','Z'] is the column labels

In [345]:
df # Each column in this output is a pandas Series; a DataFrame is a set of Series that share the same index.
        # Here the Series 'W','X','Y','Z' share the index labels A,B,C,D,E.

Unnamed: 0,W,X,Y,Z
A,2.70685,0.628133,0.907969,0.503826
B,0.651118,-0.319318,-0.848077,0.605965
C,-2.018168,0.740122,0.528813,-0.589001
D,0.188695,-0.758872,-0.933237,0.955057
E,0.190794,1.978757,2.605967,0.683509


In [346]:
randn(5,4) # creates a matrix with 5 rows and 4 columns

array([[ 0.30266545,  1.69372293, -1.70608593, -1.15911942],
       [-0.13484072,  0.39052784,  0.16690464,  0.18450186],
       [ 0.80770591,  0.07295968,  0.63878701,  0.3296463 ],
       [-0.49710402, -0.7540697 , -0.9434064 ,  0.48475165],
       [-0.11677332,  1.9017548 ,  0.23812696,  1.99665229]])

In [347]:
df['W'] # here we select the Series W.
        # There are two ways to select a column from a DataFrame: bracket notation, df['W'], and attribute notation, df.W.

A    2.706850
B    0.651118
C   -2.018168
D    0.188695
E    0.190794
Name: W, dtype: float64

In [348]:
type(df['W']) # shows that each column is a Series

pandas.core.series.Series

In [349]:
type(df) # shows that df is a DataFrame

pandas.core.frame.DataFrame

In [350]:
df.W     # attribute notation, the second way to select a column from a DataFrame

A    2.706850
B    0.651118
C   -2.018168
D    0.188695
E    0.190794
Name: W, dtype: float64

In [351]:
df[['W','Z']] # to select several columns, pass a list of column names, which is why there is a second pair of square brackets

Unnamed: 0,W,Z
A,2.70685,0.503826
B,0.651118,0.605965
C,-2.018168,-0.589001
D,0.188695,0.955057
E,0.190794,0.683509


# Adding a new column to a DataFrame

In [352]:
df['new'] = df['W'] + df['Y'] # a new column can be created with arithmetic on existing columns

In [353]:
df

Unnamed: 0,W,X,Y,Z,new
A,2.70685,0.628133,0.907969,0.503826,3.614819
B,0.651118,-0.319318,-0.848077,0.605965,-0.196959
C,-2.018168,0.740122,0.528813,-0.589001,-1.489355
D,0.188695,-0.758872,-0.933237,0.955057,-0.744542
E,0.190794,1.978757,2.605967,0.683509,2.796762


## Removing a column from a DataFrame

In [354]:
df.drop('new', axis = 1, inplace=True) # Shift+Tab shows that the default is axis = 0 (rows);
                        # to drop a column, pass axis = 1. inplace=True applies the change to df itself.

In [355]:
df    # the 'new' column is gone because inplace=True was passed. With the default inplace=False,
     # drop returns a modified copy and leaves df unchanged, so pass inplace=True (or reassign the result) to keep the change.

Unnamed: 0,W,X,Y,Z
A,2.70685,0.628133,0.907969,0.503826
B,0.651118,-0.319318,-0.848077,0.605965
C,-2.018168,0.740122,0.528813,-0.589001
D,0.188695,-0.758872,-0.933237,0.955057
E,0.190794,1.978757,2.605967,0.683509


In [356]:
df.drop('E') # axis = 0 is the default, so it does not need to be specified when dropping a row

Unnamed: 0,W,X,Y,Z
A,2.70685,0.628133,0.907969,0.503826
B,0.651118,-0.319318,-0.848077,0.605965
C,-2.018168,0.740122,0.528813,-0.589001
D,0.188695,-0.758872,-0.933237,0.955057


In [357]:
df.shape # df is still a (5, 4) matrix: drop('E') returned a copy without row E and did not modify df

(5, 4)

In [358]:
df # row E is still present because inplace defaults to False

Unnamed: 0,W,X,Y,Z
A,2.70685,0.628133,0.907969,0.503826
B,0.651118,-0.319318,-0.848077,0.605965
C,-2.018168,0.740122,0.528813,-0.589001
D,0.188695,-0.758872,-0.933237,0.955057
E,0.190794,1.978757,2.605967,0.683509
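
The behaviour seen here, where `drop` leaves the original untouched, can be reproduced on a small frame. `df_demo` and its values are made up for illustration; reassigning the result is shown as one common alternative to `inplace=True`:

```python
import pandas as pd

df_demo = pd.DataFrame({"W": [1, 2, 3], "X": [4, 5, 6]}, index=["A", "B", "C"])

# drop returns a new DataFrame by default; df_demo itself is unchanged
without_c = df_demo.drop("C")
print(df_demo.shape)    # shape of the original
print(without_c.shape)  # shape of the returned copy

# Reassigning the result makes the change stick without inplace=True
df_demo = df_demo.drop("C")
print(list(df_demo.index))
```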


In [359]:
df[['Z','X']]

Unnamed: 0,Z,X
A,0.503826,0.628133
B,0.605965,-0.319318
C,-0.589001,0.740122
D,0.955057,-0.758872
E,0.683509,1.978757


# ROWS

In [360]:
# There are two ways to select rows from a DataFrame: .loc (by label) and .iloc (by integer position)

In [361]:
df

Unnamed: 0,W,X,Y,Z
A,2.70685,0.628133,0.907969,0.503826
B,0.651118,-0.319318,-0.848077,0.605965
C,-2.018168,0.740122,0.528813,-0.589001
D,0.188695,-0.758872,-0.933237,0.955057
E,0.190794,1.978757,2.605967,0.683509


In [362]:
df.loc['A'] # returns row A as a Series; the column headers become its index

W    2.706850
X    0.628133
Y    0.907969
Z    0.503826
Name: A, dtype: float64

In [363]:
#df.loc['A','B']

In [364]:
df.iloc[2] # positions start at 0, so position 2 is the third row, which is row C

W   -2.018168
X    0.740122
Y    0.528813
Z   -0.589001
Name: C, dtype: float64

In [365]:
df.loc['A', 'Y'] # returns the value at the intersection of the row label and the column label

0.9079694464765431

In [366]:
df

Unnamed: 0,W,X,Y,Z
A,2.70685,0.628133,0.907969,0.503826
B,0.651118,-0.319318,-0.848077,0.605965
C,-2.018168,0.740122,0.528813,-0.589001
D,0.188695,-0.758872,-0.933237,0.955057
E,0.190794,1.978757,2.605967,0.683509


In [367]:
df.loc[['A','B'],['W','Y']]

Unnamed: 0,W,Y
A,2.70685,0.907969
B,0.651118,-0.848077


In [368]:
# Creating a Pandas Series

In [369]:
import numpy as np
import pandas as pd

In [370]:
# Creating a Pandas Series With Basic Format

In [371]:
pd.Series([10,88,3,4,5])

0    10
1    88
2     3
3     4
4     5
dtype: int64

In [372]:
ser = pd.Series([10,88,3,4,5])

In [373]:
ser

0    10
1    88
2     3
3     4
4     5
dtype: int64

In [374]:
# Basic Attributes of Series
# Attributes are accessed without parentheses. Let's look at the attributes of a pandas Series.

In [375]:
type(ser)

pandas.core.series.Series

In [376]:
ser.dtype

dtype('int64')

In [377]:
ser.size

5

In [378]:
ser.ndim

1

In [379]:
ser.values

array([10, 88,  3,  4,  5], dtype=int64)

In [380]:
for i in ser.values: 
    print(i)

10
88
3
4
5


In [381]:
[i for i in ser.values] # the same loop written as a list comprehension

[10, 88, 3, 4, 5]

In [382]:
ser.head(3) # shows the first three elements of the Series

0    10
1    88
2     3
dtype: int64

In [383]:
ser.tail(3) # shows the last three elements

2    3
3    4
4    5
dtype: int64

In [384]:
string = "clarusway"
pd.Series([i for i in string])

0    c
1    l
2    a
3    r
4    u
5    s
6    w
7    a
8    y
dtype: object

In [385]:
# Creating Pandas Series by Using a list, numpy array or dictionary

In [386]:
label = ["a","b","c"]
my_lis= [10,20,30]
arr = np.array([10,20,30])
d = {"a":10, "b":20, "c":30}

In [387]:
pd.Series(data=my_lis)

0    10
1    20
2    30
dtype: int64

In [388]:
# Using NumPy Arrays

In [389]:
arr

array([10, 20, 30])

In [390]:
pd.Series(arr)

0    10
1    20
2    30
dtype: int32

In [391]:
pd.Series(arr,label)

a    10
b    20
c    30
dtype: int32

In [392]:
# Using Dictionary

In [393]:
d   # here a, b, c are the keys
    # and 10, 20, 30 are the values

{'a': 10, 'b': 20, 'c': 30}

In [394]:
pd.Series(d) # the keys become the index and the values become the data

a    10
b    20
c    30
dtype: int64

# Data in a Series

In [395]:
pd.Series(data=label)

0    a
1    b
2    c
dtype: object

In [396]:
pd.Series([sum, print, len]) # a Series can even hold arbitrary Python objects, such as functions

0      <built-in function sum>
1    <built-in function print>
2      <built-in function len>
dtype: object

In [397]:
mix_data = [1, "cat", True]

In [398]:
pd.Series(mix_data)

0       1
1     cat
2    True
dtype: object

In [399]:
# https://numpy.org/devdocs/user/quickstart.html

# Indexing Pandas Series

In [400]:
serr1 = pd.Series([1,2,3,4], index=["USA","Germany","USSR","Japan"])

In [401]:
serr1

USA        1
Germany    2
USSR       3
Japan      4
dtype: int64

In [402]:
serr2 = pd.Series([1,2,5,4], index=["USA","Germany","Italy","Japan"])

In [403]:
serr2

USA        1
Germany    2
Italy      5
Japan      4
dtype: int64

In [404]:
serr1["USA"]

1

In [405]:
serr3 = pd.Series(data = label)

In [406]:
serr3 

0    a
1    b
2    c
dtype: object

In [407]:
serr3[0]

'a'

In [408]:
serr1+serr2

Germany    4.0
Italy      NaN
Japan      8.0
USA        2.0
USSR       NaN
dtype: float64

In [409]:
# Indexing Examples

In [410]:
a = np.array([1,2,33,444,75])

In [411]:
a

array([  1,   2,  33, 444,  75])

In [412]:
panser = pd.Series(a)

In [413]:
panser

0      1
1      2
2     33
3    444
4     75
dtype: int32

In [414]:
panser[0]

1

In [415]:
panser[0:3]

0     1
1     2
2    33
dtype: int32

In [416]:
serr1

USA        1
Germany    2
USSR       3
Japan      4
dtype: int64

In [417]:
serr1["USA":"USSR"] # label-based slicing includes both endpoints, USA and USSR

USA        1
Germany    2
USSR       3
dtype: int64

In [418]:
# Pandas_Series[index] | pandas_Series[[indices,indices]]

In [419]:
panser = pd.Series([121,200,150,99], index = ["ali", "veli", "gul", "nur"])

In [420]:
panser

ali     121
veli    200
gul     150
nur      99
dtype: int64

In [421]:
panser["ali"]

121

In [422]:
panser[0]

121
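
`panser[0]` works here because this pandas version falls back to positional access when the index holds strings. Newer pandas releases are understood to deprecate that fallback, so the following is a hedged sketch of the more explicit spelling, with `.loc` for labels and `.iloc` for positions:

```python
import pandas as pd

panser = pd.Series([121, 200, 150, 99], index=["ali", "veli", "gul", "nur"])

by_label = panser.loc["ali"]    # label-based lookup
by_position = panser.iloc[0]    # position-based lookup, independent of the labels
first_three = panser.iloc[0:3]  # positional slice, end position excluded

print(by_label, by_position)
print(first_three)
```

Being explicit also avoids ambiguity when the index itself contains integers.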

In [423]:
panser[[ "veli", "nur" ]]

veli    200
nur      99
dtype: int64

In [424]:
panser[0:3]

ali     121
veli    200
gul     150
dtype: int64

In [425]:
panser["ali":"nur"]

ali     121
veli    200
gul     150
nur      99
dtype: int64

In [426]:
# Several Selecting Attributes

In [427]:
panser.index

Index(['ali', 'veli', 'gul', 'nur'], dtype='object')

In [428]:
panser.values

array([121, 200, 150,  99], dtype=int64)

In [429]:
panser.items  # without parentheses this is just a reference to the method, not its result

<bound method Series.items of ali     121
veli    200
gul     150
nur      99
dtype: int64>

In [430]:
panser.items()

<zip at 0x200ce5cecc0>

In [431]:
list(panser.items()) # items() returns an iterator of (index, value) pairs; wrapping it in list() materializes them

[('ali', 121), ('veli', 200), ('gul', 150), ('nur', 99)]

In [432]:
for index, value in panser.items():
    print(index, "-", value)

ali - 121
veli - 200
gul - 150
nur - 99


In [433]:
"mehmet" in panser

False

In [434]:
"ali" in panser

True

In [435]:
"gulnur" in panser

False

In [436]:
"gul" in panser

True

In [437]:
99 in panser.values

True

In [438]:
500 in panser.values

False

In [439]:
panser["veli"]

200

In [440]:
panser["veli"] = 571 # assign a new value to a single index label

In [441]:
panser

ali     121
veli    571
gul     150
nur      99
dtype: int64

In [442]:
panser > 130

ali     False
veli     True
gul      True
nur     False
dtype: bool

In [443]:
panser[panser > 130]

veli    571
gul     150
dtype: int64

In [444]:
## Pandas DataFrames

In [445]:
# Creating a DataFrame from a list of data and column names

In [446]:
datam = [1,2,39,67,90]

In [447]:
datam

[1, 2, 39, 67, 90]

In [448]:
pd.DataFrame(datam, columns = ["column_name"])

Unnamed: 0,column_name
0,1
1,2
2,39
3,67
4,90


In [449]:
# Creating a DataFrame from a NumPy array

In [450]:
m = np.arange(1,10).reshape((3,3))
m

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [451]:
# Creating a DataFrame from a NumPy array, with column names

In [452]:
pd.DataFrame(m,columns = ["var1","var2","var3"])

Unnamed: 0,var1,var2,var3
0,1,2,3
1,4,5,6
2,7,8,9


In [453]:
pd.DataFrame(data = m,columns = ["var1","var2","var3"])

Unnamed: 0,var1,var2,var3
0,1,2,3
1,4,5,6
2,7,8,9


In [454]:
df = pd.DataFrame(data = m,columns = ["var1","var2","var3"])
df

Unnamed: 0,var1,var2,var3
0,1,2,3
1,4,5,6
2,7,8,9


In [455]:
df.head(1)

Unnamed: 0,var1,var2,var3
0,1,2,3


In [456]:
df.head(3)

Unnamed: 0,var1,var2,var3
0,1,2,3
1,4,5,6
2,7,8,9


In [457]:
df.columns 

Index(['var1', 'var2', 'var3'], dtype='object')

In [458]:
for i in df.columns:
    print(i)

var1
var2
var3


In [459]:
df.columns = ["new1", "new2", "new3"]

In [460]:
df

Unnamed: 0,new1,new2,new3
0,1,2,3
1,4,5,6
2,7,8,9


In [461]:
type(df)

pandas.core.frame.DataFrame

In [462]:
df.shape

(3, 3)

In [463]:
df.ndim

2

In [464]:
df.size # 3 * 3 = 9 elements

9

In [465]:
df.values # returns the values in the DataFrame as a NumPy array

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

# creating a DataFrame using a dict

In [466]:
s1 = np.random.randint(10,size = 5)
s2 = np.random.randint(10,size = 5)
s3 = np.random.randint(10,size = 5)

In [467]:
s1

array([6, 5, 6, 9, 2])

In [468]:
s2

array([2, 1, 3, 3, 3])

In [469]:
s3

array([4, 5, 9, 5, 8])

In [470]:
myDict = {"var1":s1, "var2":s2, "var3" :s3}
myDict

{'var1': array([6, 5, 6, 9, 2]),
 'var2': array([2, 1, 3, 3, 3]),
 'var3': array([4, 5, 9, 5, 8])}

In [471]:
df1 = pd.DataFrame(myDict)
df1

Unnamed: 0,var1,var2,var3
0,6,2,4
1,5,1,5
2,6,3,9
3,9,3,5
4,2,3,8


In [472]:
pwd

'C:\\Users\\Mustafa\\Desktop\\MyWorkSpace\\Data_Science'

In [473]:
# ornekcsv.csv

In [474]:
df3 = pd.read_csv("ornekcsv.csv", delimiter = ";")

FileNotFoundError: [Errno 2] File ornekcsv.csv does not exist: 'ornekcsv.csv'

In [475]:
df3.head()

NameError: name 'df3' is not defined
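
The `FileNotFoundError` above only means that `ornekcsv.csv` was not in the working directory, and the `NameError` follows because `df3` was never created. Since the real file is not available, the sketch below reads semicolon-separated text from an in-memory buffer instead; the column names and values in `csv_text` are made up:

```python
import io
import pandas as pd

# Hypothetical file contents; the real ornekcsv.csv is not available here
csv_text = "var1;var2;var3\n1;2;3\n4;5;6\n"

# delimiter=";" tells the parser that fields are separated by semicolons
df3 = pd.read_csv(io.StringIO(csv_text), delimiter=";")
print(df3.head())
```

With a real file, the first argument would be its path, relative to the directory that `pwd` reports.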

In [476]:
df1

Unnamed: 0,var1,var2,var3
0,6,2,4
1,5,1,5
2,6,3,9
3,9,3,5
4,2,3,8


In [477]:
df1[1:3]

Unnamed: 0,var1,var2,var3
1,5,1,5
2,6,3,9


In [478]:
df1.index

RangeIndex(start=0, stop=5, step=1)

In [479]:
[i for i in df1.index]

[0, 1, 2, 3, 4]

In [480]:
df1.index = ["a","b","c","d","e"]

In [481]:
df1

Unnamed: 0,var1,var2,var3
a,6,2,4
b,5,1,5
c,6,3,9
d,9,3,5
e,2,3,8


In [482]:
df1["b":"d"] # slicing a DataFrame returns an object of the same type, a DataFrame

Unnamed: 0,var1,var2,var3
b,5,1,5
c,6,3,9
d,9,3,5


In [483]:
"var2" in df1 # checks whether the DataFrame has a column named var2

True

In [484]:
len("var2")

4

In [485]:
"Joseph" in df1

False

In [486]:
## Now, let's examine again the indexing; selection and slicing methods and several attributes using a different DataFrame

In [487]:
from numpy.random import randn
np.random.seed(101)

In [488]:
df4 = pd.DataFrame(randn(5,4), index="A B C D E".split(), columns = "W X Y Z".split())

In [489]:
df4

Unnamed: 0,W,X,Y,Z
A,2.70685,0.628133,0.907969,0.503826
B,0.651118,-0.319318,-0.848077,0.605965
C,-2.018168,0.740122,0.528813,-0.589001
D,0.188695,-0.758872,-0.933237,0.955057
E,0.190794,1.978757,2.605967,0.683509


In [490]:
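df4 = pd.DataFrame(index="A B C D E".split(), randn(5,4), columns = "W X Y Z".split()) # positional arguments must come before keyword arguments,
                                                                                      # so this call fails and df4 keeps its previous value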

SyntaxError: positional argument follows keyword argument (<ipython-input-490-e6a4fade62aa>, line 1)
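
The error arises because `randn(5,4)` is passed positionally after the keyword argument `index=`. One way around it, sketched here with placeholder values rather than random numbers, is to name every argument, after which the order no longer matters. `df_kw` is a name chosen for this example:

```python
import numpy as np
import pandas as pd

# Keyword arguments may appear in any order as long as every one is named
df_kw = pd.DataFrame(
    index="A B C D E".split(),
    data=np.arange(20).reshape(5, 4),  # placeholder values instead of randn
    columns="W X Y Z".split(),
)
print(df_kw.shape)
```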

In [491]:
df4

Unnamed: 0,W,X,Y,Z
A,2.70685,0.628133,0.907969,0.503826
B,0.651118,-0.319318,-0.848077,0.605965
C,-2.018168,0.740122,0.528813,-0.589001
D,0.188695,-0.758872,-0.933237,0.955057
E,0.190794,1.978757,2.605967,0.683509


In [492]:
df4["W"]

A    2.706850
B    0.651118
C   -2.018168
D    0.188695
E    0.190794
Name: W, dtype: float64

In [493]:
type(df4["W"])

pandas.core.series.Series

In [494]:
df4["W"].values

array([ 2.70684984,  0.65111795, -2.01816824,  0.18869531,  0.19079432])

In [495]:
df4[["W"]]

Unnamed: 0,W
A,2.70685
B,0.651118
C,-2.018168
D,0.188695
E,0.190794


In [496]:
type(df4[["W"]])

pandas.core.frame.DataFrame

In [497]:
istedigimstunlar = ["W", "Z"] # the columns I want

In [498]:
df4[istedigimstunlar]

Unnamed: 0,W,Z
A,2.70685,0.503826
B,0.651118,0.605965
C,-2.018168,-0.589001
D,0.188695,0.955057
E,0.190794,0.683509


In [499]:
WZ_df = df4[istedigimstunlar]
WZ_df

Unnamed: 0,W,Z
A,2.70685,0.503826
B,0.651118,0.605965
C,-2.018168,-0.589001
D,0.188695,0.955057
E,0.190794,0.683509


In [500]:
df4[["W", "Z"]]

Unnamed: 0,W,Z
A,2.70685,0.503826
B,0.651118,0.605965
C,-2.018168,-0.589001
D,0.188695,0.955057
E,0.190794,0.683509


In [501]:
df4["W"]
# df4.W

A    2.706850
B    0.651118
C   -2.018168
D    0.188695
E    0.190794
Name: W, dtype: float64

In [502]:
# df4["W"]
df4.W

A    2.706850
B    0.651118
C   -2.018168
D    0.188695
E    0.190794
Name: W, dtype: float64

In [503]:
df4["A":"C"] # slicing

Unnamed: 0,W,X,Y,Z
A,2.70685,0.628133,0.907969,0.503826
B,0.651118,-0.319318,-0.848077,0.605965
C,-2.018168,0.740122,0.528813,-0.589001


In [504]:
df4["C":"C"]

Unnamed: 0,W,X,Y,Z
C,-2.018168,0.740122,0.528813,-0.589001


In [505]:
df4["new"] = df4["W"]+df4["Y"] # create a new column

In [507]:
df4["new2"] = df4["W"]*df4["Y"]

In [None]:
## https://www.w3resource.com/python-exercises/pandas/index.php

In [513]:
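df4.drop(["new", axis = 1, inplace = True]) # the keyword arguments ended up inside the list brackets, which is invalid syntax;
                                            # the intended call is df4.drop("new", axis=1, inplace=True)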

SyntaxError: invalid syntax (<ipython-input-513-9586f21d8aba>, line 1)
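
The `SyntaxError` comes from placing `axis` and `inplace` inside the list brackets. A minimal sketch of two working spellings follows, on a small made-up frame (`df_cols` is a name chosen for this example); both return a new DataFrame and leave the original unchanged:

```python
import pandas as pd

df_cols = pd.DataFrame({"W": [1, 2], "Y": [3, 4]}, index=["A", "B"])
df_cols["new"] = df_cols["W"] + df_cols["Y"]

# Label plus axis=1 ...
dropped_a = df_cols.drop("new", axis=1)
# ... or the equivalent columns= keyword
dropped_b = df_cols.drop(columns=["new"])

print(dropped_a.equals(dropped_b))
print(list(df_cols.columns))  # the original still has the "new" column
```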

## Creating a new column

# Removing Columns & Rows
# Removing Columns

In [510]:
df4

Unnamed: 0,W,X,Y,Z,new,new2
A,2.70685,0.628133,0.907969,0.503826,3.614819,2.457737
B,0.651118,-0.319318,-0.848077,0.605965,-0.196959,-0.552198
C,-2.018168,0.740122,0.528813,-0.589001,-1.489355,-1.067235
D,0.188695,-0.758872,-0.933237,0.955057,-0.744542,-0.176097
E,0.190794,1.978757,2.605967,0.683509,2.796762,0.497204


In [511]:
df4.drop("E", axis=0)

Unnamed: 0,W,X,Y,Z,new,new2
A,2.70685,0.628133,0.907969,0.503826,3.614819,2.457737
B,0.651118,-0.319318,-0.848077,0.605965,-0.196959,-0.552198
C,-2.018168,0.740122,0.528813,-0.589001,-1.489355,-1.067235
D,0.188695,-0.758872,-0.933237,0.955057,-0.744542,-0.176097


In [None]:
df.loc['viper'] # .loc selects rows by label, not by integer position; use .iloc for positions

In [514]:
m = np.random.randint(1,30, size = (10,3))
df4 = pd.DataFrame(m, columns = ["var1","var2","var3"])
df4

Unnamed: 0,var1,var2,var3
0,8,11,21
1,25,8,7
2,15,10,21
3,19,24,8
4,8,16,13
5,1,21,11
6,13,18,25
7,12,20,16
8,2,25,5
9,29,20,19


In [515]:
df4.loc[1] # .loc selects rows by label; here it returns the row whose label is 1

var1    25
var2     8
var3     7
Name: 1, dtype: int32

In [516]:
df4.loc[1:4] # with labels, the slice returns the rows labeled 1, 2, 3 and 4 (the end label is included)

Unnamed: 0,var1,var2,var3
1,25,8,7
2,15,10,21
3,19,24,8
4,8,16,13


In [517]:
df4.iloc[1:4] # .iloc works with integer positions, so the end position 4 is excluded

Unnamed: 0,var1,var2,var3
1,25,8,7
2,15,10,21
3,19,24,8


In [527]:
df4.index = "a b c d e f g h i j".split()

In [528]:
df4

Unnamed: 0,var1,var2,var3
a,8,11,21
b,25,8,7
c,15,10,21
d,19,24,8
e,8,16,13
f,1,21,11
g,13,18,25
h,12,20,16
i,2,25,5
j,29,20,19


In [529]:
df4.loc[1:4] # the index now holds strings, so an integer label slice no longer works with .loc

TypeError: cannot do slice indexing on <class 'pandas.core.indexes.base.Index'> with these indexers [1] of <class 'int'>

In [530]:
df4.iloc[1:4]

Unnamed: 0,var1,var2,var3
b,25,8,7
c,15,10,21
d,19,24,8
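
The difference between the two slicing styles can be summarized on a small frame. `df_demo` and its values are made up for illustration; the point is that a `.loc` slice includes its end label while an `.iloc` slice excludes its end position:

```python
import pandas as pd

df_demo = pd.DataFrame({"var1": [8, 25, 15, 19, 8]}, index=list("abcde"))

by_label = df_demo.loc["b":"d"]   # label slice: both ends included
by_position = df_demo.iloc[1:3]   # position slice: end position excluded

print(list(by_label.index))
print(list(by_position.index))
```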


In [532]:
df4.loc["b":"e", "var2"]

b     8
c    10
d    24
e    16
Name: var2, dtype: int32

In [535]:
df4.loc["b":"e"]

Unnamed: 0,var1,var2,var3
b,25,8,7
c,15,10,21
d,19,24,8
e,8,16,13


In [536]:
df4.loc["b":"e"], ["var2"] # the comma is outside the brackets, so this builds a tuple of a DataFrame and a list

(   var1  var2  var3
 b    25     8     7
 c    15    10    21
 d    19    24     8
 e     8    16    13,
 ['var2'])

In [537]:
df4.loc["b":"e"], [["var2"]] # same mistake: the result is a tuple, not a column selection

(   var1  var2  var3
 b    25     8     7
 c    15    10    21
 d    19    24     8
 e     8    16    13,
 [['var2']])
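
The two outputs above are tuples because the comma sits outside the `.loc[...]` brackets. The sketch below makes that visible by checking the result types on a small made-up frame (`df_demo` is a name chosen for this example):

```python
import pandas as pd

df_demo = pd.DataFrame({"var1": [25, 15], "var2": [8, 10]}, index=["b", "c"])

# Comma outside the brackets: Python builds a tuple (DataFrame, list)
outside = df_demo.loc["b":"c"], ["var2"]
# Comma inside the brackets: .loc receives a row selector and a column selector
inside = df_demo.loc["b":"c", ["var2"]]

print(type(outside).__name__)
print(type(inside).__name__)
```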

In [538]:
df4.loc["b":"e", ["var2"]] # correct form: the row slice and the column list both go inside the brackets

Unnamed: 0,var2
b,8
c,10
d,24
e,16


In [539]:
df4.iloc[1:5,1]

b     8
c    10
d    24
e    16
Name: var2, dtype: int32

In [540]:
df4

Unnamed: 0,var1,var2,var3
a,8,11,21
b,25,8,7
c,15,10,21
d,19,24,8
e,8,16,13
f,1,21,11
g,13,18,25
h,12,20,16
i,2,25,5
j,29,20,19


In [542]:
df4.iloc[1:5] [["var2"]]

Unnamed: 0,var2
b,8
c,10
d,24
e,16


In [543]:
df4.iloc[1:5]

Unnamed: 0,var1,var2,var3
b,25,8,7
c,15,10,21
d,19,24,8
e,8,16,13


In [545]:
df4

Unnamed: 0,var1,var2,var3
a,8,11,21
b,25,8,7
c,15,10,21
d,19,24,8
e,8,16,13
f,1,21,11
g,13,18,25
h,12,20,16
i,2,25,5
j,29,20,19


In [547]:
df2 = pd.DataFrame(randn(5,4), index="A B C D E".split(), columns = "W X Y Z".split())

In [552]:
df2

Unnamed: 0,W,X,Y,Z
A,-0.497104,-0.75407,-0.943406,0.484752
B,-0.116773,1.901755,0.238127,1.996652
C,-0.993263,0.1968,-1.136645,0.000366
D,1.025984,-0.156598,-0.031579,0.649826
E,2.154846,-0.610259,-0.755325,-0.346419


In [553]:
df2.iloc[2]

W   -0.993263
X    0.196800
Y   -1.136645
Z    0.000366
Name: C, dtype: float64

In [556]:
df2.loc[["B"]]

Unnamed: 0,W,X,Y,Z
B,-0.116773,1.901755,0.238127,1.996652


In [557]:
df2.iloc[:,2]

A   -0.943406
B    0.238127
C   -1.136645
D   -0.031579
E   -0.755325
Name: Y, dtype: float64

In [560]:
df2.iloc[:,[2]]

Unnamed: 0,Y
A,-0.943406
B,0.238127
C,-1.136645
D,-0.031579
E,-0.755325


In [561]:
df2.Y

A   -0.943406
B    0.238127
C   -1.136645
D   -0.031579
E   -0.755325
Name: Y, dtype: float64

In [565]:
df2[["Y"]]

Unnamed: 0,Y
A,-0.943406
B,0.238127
C,-1.136645
D,-0.031579
E,-0.755325


In [566]:
df2["Y"]

A   -0.943406
B    0.238127
C   -1.136645
D   -0.031579
E   -0.755325
Name: Y, dtype: float64

In [None]:
## Selecting a subset of rows and columns

In [None]:
# .loc[[row]]

In [567]:
df2

Unnamed: 0,W,X,Y,Z
A,-0.497104,-0.75407,-0.943406,0.484752
B,-0.116773,1.901755,0.238127,1.996652
C,-0.993263,0.1968,-1.136645,0.000366
D,1.025984,-0.156598,-0.031579,0.649826
E,2.154846,-0.610259,-0.755325,-0.346419


In [568]:
df2.loc["B","Y"]

0.23812695876901832

In [571]:
df2.loc[["B"],["Y"]]

Unnamed: 0,Y
B,0.238127


In [574]:
df2.loc[["A", "B"],["W", "Y"]]

Unnamed: 0,W,Y
A,-0.497104,-0.943406
B,-0.116773,0.238127


In [575]:
df2

Unnamed: 0,W,X,Y,Z
A,-0.497104,-0.75407,-0.943406,0.484752
B,-0.116773,1.901755,0.238127,1.996652
C,-0.993263,0.1968,-1.136645,0.000366
D,1.025984,-0.156598,-0.031579,0.649826
E,2.154846,-0.610259,-0.755325,-0.346419


In [577]:
df2.iloc[[0,1],[0,2]] # row positions first, then column positions

Unnamed: 0,W,Y
A,-0.497104,-0.943406
B,-0.116773,0.238127


In [579]:
df2 > 0

Unnamed: 0,W,X,Y,Z
A,False,False,False,True
B,False,True,True,True
C,False,True,False,True
D,True,False,False,True
E,True,False,False,False


In [580]:
df2[df2 > 0]

Unnamed: 0,W,X,Y,Z
A,,,,0.484752
B,,1.901755,0.238127,1.996652
C,,0.1968,,0.000366
D,1.025984,,,0.649826
E,2.154846,,,
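
Indexing with a boolean DataFrame, as in `df2[df2 > 0]`, keeps the shape and writes `NaN` wherever the condition is False. If a replacement value is preferred, `DataFrame.where` accepts one as its second argument. The sketch below uses a small made-up frame, and replacing with 0.0 is only an illustrative choice:

```python
import pandas as pd

df_demo = pd.DataFrame({"W": [-0.5, 1.0], "X": [0.2, -0.3]}, index=["A", "B"])

masked = df_demo[df_demo > 0]               # failing cells become NaN
replaced = df_demo.where(df_demo > 0, 0.0)  # failing cells become 0.0 instead

print(masked)
print(replaced)
```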


In [581]:
df2["W"] > 0

A    False
B    False
C    False
D     True
E     True
Name: W, dtype: bool

In [None]:
df2[df2["W"] > 0] # keep only the rows where column W is positive

In [582]:
df2[[True,True,False,True,True]]

Unnamed: 0,W,X,Y,Z
A,-0.497104,-0.75407,-0.943406,0.484752
B,-0.116773,1.901755,0.238127,1.996652
D,1.025984,-0.156598,-0.031579,0.649826
E,2.154846,-0.610259,-0.755325,-0.346419


In [584]:
df2[df2["w"] > 0] [["Y"]] # column labels are case-sensitive: the column is "W", so "w" raises a KeyError

KeyError: 'w'
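
The `KeyError` is a reminder that column labels are case-sensitive. A small sketch of the intended filter follows, on made-up data (`df_demo` is a name chosen for this example), with a membership check that can be run before indexing:

```python
import pandas as pd

df_demo = pd.DataFrame(
    {"W": [-0.5, 1.0, 2.2], "Y": [-0.9, 0.0, -0.8]},
    index=["A", "D", "E"],
)

# Column labels are case-sensitive, so a membership check helps before indexing
print("w" in df_demo.columns)
print("W" in df_demo.columns)

# Keep rows where W is positive, then select Y as a one-column DataFrame
positive_w_y = df_demo[df_demo["W"] > 0][["Y"]]
print(positive_w_y)
```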

In [585]:
df2

Unnamed: 0,W,X,Y,Z
A,-0.497104,-0.75407,-0.943406,0.484752
B,-0.116773,1.901755,0.238127,1.996652
C,-0.993263,0.1968,-1.136645,0.000366
D,1.025984,-0.156598,-0.031579,0.649826
E,2.154846,-0.610259,-0.755325,-0.346419


In [586]:
df2[(df2["W"] >0) (df2["Y"]>1) # the & operator between the two conditions and the closing ] are missing

SyntaxError: unexpected EOF while parsing (<ipython-input-586-33b8472f5449>, line 1)
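
The cell above fails because the two conditions are not joined by an operator and the closing bracket is missing. A minimal sketch of combining conditions follows, on a small made-up frame (`df_demo` is a name chosen for this example). Each comparison is wrapped in its own parentheses because `&` and `|` bind more tightly than `>` and `<`:

```python
import pandas as pd

df_demo = pd.DataFrame(
    {"W": [-0.5, 1.0, 2.2, -2.5], "Y": [-0.9, 1.5, -0.8, 0.3]},
    index=["A", "B", "C", "D"],
)

# & is element-wise AND: rows where both conditions hold
both = df_demo[(df_demo["W"] > 0) & (df_demo["Y"] > 1)]

# | is element-wise OR: rows where at least one condition holds
either = df_demo.loc[(df_demo["W"] > 2) | (df_demo["W"] < -2), ["W", "Y"]]

print(list(both.index))
print(list(either.index))
```

Python's `and` and `or` keywords do not work element-wise on a Series, which is why the bitwise operators are used here.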

In [587]:
df2.loc[(df2.X > 0), ["X","Z"]]

Unnamed: 0,X,Z
B,1.901755,1.996652
C,0.1968,0.000366


In [588]:
df2.loc[((df2.W > 2) | (df2.W < -2)), ["W", "Y"]] # each comparison needs its own parentheses, because | binds more tightly than > and <

Unnamed: 0,W,Y
E,2.154846,-0.755325
