## loc, iloc and []
[link to recording](https://ithogskolan.sharepoint.com/:v:/s/AI23/ETPOC8bHsidEifllq7-prTgBFC1nEm7ozH_AlCJhdI065w?e=Aaw86O)
- Pandas supports three types of multi-axis indexing and object selection.
- **loc** is an indexer for label-based selection (it uses the index labels and column names)
- **iloc** is an indexer for integer-position-based selection (it uses the numerical positions along the index and columns)
- **[colname]** returns the column as a Pandas Series object
  
#### Slicing:
- Slicing one axis while selecting a single row or column on the other returns a Pandas Series object
- Slicing both axes returns a Pandas DataFrame object
- Slices work as in lists, e.g. [start:stop:step]
- The order is [row, column]
- Lists can be used as arguments in place of slices
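
The rules above can be checked with a minimal sketch. It rebuilds the same example DataFrame used in these notes so it runs on its own; the variable names (`cell`, `row_part`, and so on) are only illustrative.

```python
import pandas as pd

# Same example frame as in the notes: rows "A".."G", columns "Column1".."Column4"
column_values = [f"Column{i}" for i in range(1, 5)]
index_values = [chr(i) for i in range(65, 72)]
df = pd.DataFrame(
    [[f"{ind}{col}" for col in range(1, 5)] for ind in index_values],
    index=index_values,
    columns=column_values,
)

cell = df.iloc[1, 0]         # scalar, scalar -> a single value
row_part = df.iloc[1, :3]    # scalar, slice  -> Series
col_part = df.iloc[::2, 0]   # slice with step 2, scalar -> Series (rows A, C, E, G)
block = df.iloc[::2, :2]     # slice, slice   -> DataFrame
picked = df.iloc[[0, 3], 1]  # list, scalar   -> Series

for name, obj in [("cell", cell), ("row_part", row_part),
                  ("col_part", col_part), ("block", block), ("picked", picked)]:
    print(name, type(obj).__name__)
```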

In [2]:
import pandas as pd

In [3]:
column_values = [f"Column{i}" for i in range(1, 5)]
index_values = [chr(i) for i in range(65, 72)] # Unicode code points 65-71 are A to G; chr() returns the character for the given code point

df = pd.DataFrame([[f"{ind}{col}" for col in range(1,5)] for ind in index_values], index=index_values, columns=column_values)

df # display(df)

Unnamed: 0,Column1,Column2,Column3,Column4
A,A1,A2,A3,A4
B,B1,B2,B3,B4
C,C1,C2,C3,C4
D,D1,D2,D3,D4
E,E1,E2,E3,E4
F,F1,F2,F3,F4
G,G1,G2,G3,G4


### iloc
- Integer-position-based location
- With no slices (two scalar positions), it returns the value of a single cell in the DataFrame
- Negative indexing also works
- The order is [row, column]
- **Important note**: when slicing, the upper bound is included for loc but excluded for iloc
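
A small sketch of negative indexing and of the bound difference noted above. It rebuilds the example DataFrame so it is self-contained; the variable names are illustrative only.

```python
import pandas as pd

column_values = [f"Column{i}" for i in range(1, 5)]
index_values = [chr(i) for i in range(65, 72)]  # "A" to "G"
df = pd.DataFrame(
    [[f"{ind}{col}" for col in range(1, 5)] for ind in index_values],
    index=index_values,
    columns=column_values,
)

# Negative positions count from the end, as in lists
last_cell = df.iloc[-1, -1]      # last row, last column
last_two_rows = df.iloc[-2:, 0]  # rows F and G of Column1

# iloc excludes the upper bound: positions 0, 1, 2 only
by_position = df.iloc[0:3, 0]

# loc includes the upper bound: labels A, B, C and D
by_label = df.loc["A":"D", "Column1"]

print(last_cell)                        # G4
print(len(by_position), len(by_label))  # 3 4
```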


In [3]:
df.iloc[1, 0]

'B1'

In [8]:
# selecting one row of a DataFrame; note that the column names become the index of the resulting Series
df.iloc[1,:] # all of row 1 (the second row)

Column1    B1
Column2    B2
Column3    B3
Column4    B4
Name: B, dtype: object

In [10]:
df.iloc[:,0] # all of column 0

A    A1
B    B1
C    C1
D    D1
E    E1
F    F1
G    G1
Name: Column1, dtype: object

Slicing with iloc

In [11]:
# slice up to but not including row position 3, in column position 1 (Column2)
df.iloc[:3, 1]

A    A2
B    B2
C    C2
Name: Column2, dtype: object

### loc
- label based indexing

In [4]:
df.loc["B","Column1"]

'B1'

In [10]:
# slice up to and including row "C" in Column2
# loc: the slice includes the end label, which is specific to loc
df.loc[:"C", "Column2"]

A    A2
B    B2
C    C2
Name: Column2, dtype: object

Using loc and iloc to do the same thing in a DataFrame:
- Return a single element of a DataFrame:

In [9]:
#df.iloc[1, 0]
#df.loc["B","Column1"]


Return a row from a DataFrame:
- 'row 1/B and all columns'

In [None]:
#df.iloc[1,:]
#df.loc["B",:]

Return a column from a DataFrame:
- 'all rows for column 0'

In [None]:
#df.iloc[:,0] # all rows, column 0
#df.loc[:,"Column1"]

Return a slice:
- 'All rows up to and including the third row, for column position 1 (Column2)'

In [None]:
#df.iloc[:3, 1]
df.loc[:"C", "Column2"] # loc: the slice includes the end label, which is specific to loc
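
The loc/iloc pairs above can be compared directly. This is a minimal sketch that rebuilds the example DataFrame and uses `Series.equals` for the comparison. The last part uses `Index.get_loc` to turn a label into a position; that step is not in the original notes and is only there to show where the `+ 1` for the inclusive bound comes from.

```python
import pandas as pd

column_values = [f"Column{i}" for i in range(1, 5)]
index_values = [chr(i) for i in range(65, 72)]  # "A" to "G"
df = pd.DataFrame(
    [[f"{ind}{col}" for col in range(1, 5)] for ind in index_values],
    index=index_values,
    columns=column_values,
)

same_cell = df.iloc[1, 0] == df.loc["B", "Column1"]
same_row = df.iloc[1, :].equals(df.loc["B", :])
same_col = df.iloc[:, 0].equals(df.loc[:, "Column1"])
same_slice = df.iloc[:3, 1].equals(df.loc[:"C", "Column2"])
print(same_cell, same_row, same_col, same_slice)  # True True True True

# Translating labels to positions (illustrative, not from the notes)
row_pos = df.index.get_loc("C")          # position of row label "C"
col_pos = df.columns.get_loc("Column2")  # position of column "Column2"
translated = df.iloc[: row_pos + 1, col_pos]  # + 1 because loc includes "C"
print(translated.equals(df.loc[:"C", "Column2"]))  # True
```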

### Slicing on both axes returns a Pandas DataFrame object

In [16]:
df.iloc[:2,:2]

Unnamed: 0,Column1,Column2
A,A1,A2
B,B1,B2


The two slices below give the same result with loc and iloc.

In [21]:
#df.loc["C":"F","Column2":"Column3"]
df.iloc[2:6,1:3]

Unnamed: 0,Column2,Column3
C,C2,C3
D,D2,D3
E,E2,E3
F,F2,F3


You can call .head(), .info(), etc. on a slice.

In [22]:
df.iloc[2:6,1:3].info()

<class 'pandas.core.frame.DataFrame'>
Index: 4 entries, C to F
Data columns (total 2 columns):
 #   Column   Non-Null Count  Dtype 
---  ------   --------------  ----- 
 0   Column2  4 non-null      object
 1   Column3  4 non-null      object
dtypes: object(2)
memory usage: 96.0+ bytes


In [37]:
df.loc["C":"F","Column2":"Column3"]

Unnamed: 0,Column2,Column3
C,C2,C3
D,D2,D3
E,E2,E3
F,F2,F3


In [32]:
df.loc[:"B",:"Column2"]

Unnamed: 0,Column1,Column2
A,A1,A2
B,B1,B2


### Lists can be used instead of slices for either axis
- Valid for both loc and iloc

In [33]:
df.iloc[[0,3,5],[1,3]]

Unnamed: 0,Column2,Column4
A,A2,A4
D,D2,D4
F,F2,F4


In [35]:
df.loc[["A","D","F"],["Column2", "Column4"]]

Unnamed: 0,Column2,Column4
A,A2,A4
D,D2,D4
F,F2,F4
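
A short sketch of a few consequences of list-based selection, using the same example DataFrame (rebuilt here so the block runs on its own). The variable names are illustrative. It shows that the result follows the order of the lists, that a list on one axis can be combined with a slice on the other, and that a one-element list keeps the result a DataFrame while a scalar gives a Series.

```python
import pandas as pd

column_values = [f"Column{i}" for i in range(1, 5)]
index_values = [chr(i) for i in range(65, 72)]  # "A" to "G"
df = pd.DataFrame(
    [[f"{ind}{col}" for col in range(1, 5)] for ind in index_values],
    index=index_values,
    columns=column_values,
)

by_pos = df.iloc[[0, 3, 5], [1, 3]]
by_label = df.loc[["A", "D", "F"], ["Column2", "Column4"]]
print(by_pos.equals(by_label))  # True

# The result follows the order given in the lists
reordered = df.loc[["F", "A"], ["Column4", "Column2"]]

# A list on one axis can be combined with a slice on the other
mixed = df.loc[["A", "D"], "Column2":"Column3"]

# One-element list -> DataFrame; scalar -> Series
one_row_frame = df.loc[["B"], :]
one_row_series = df.loc["B", :]
print(type(one_row_frame).__name__, type(one_row_series).__name__)  # DataFrame Series
```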


In [38]:
# df[colname] returns a column as a Pandas Series object
df["Column2"]

A    A2
B    B2
C    C2
D    D2
E    E2
F    F2
G    G2
Name: Column2, dtype: object

In [40]:
# df[list of colnames] returns a new DataFrame with the given columns
df[["Column1","Column4","Column3","Column1"]]

Unnamed: 0,Column1,Column4,Column3,Column1
A,A1,A4,A3,A1
B,B1,B4,B3,B1
C,C1,C4,C3,C1
D,D1,D4,D3,D1
E,E1,E4,E3,E1
F,F1,F4,F3,F1
G,G1,G4,G3,G1
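
A minimal sketch of the difference between `df[colname]` and `df[list of colnames]`, with the example DataFrame rebuilt so it is self-contained. The variable names are illustrative. It also shows what happens to a column that is listed twice: the label is simply repeated in the result.

```python
import pandas as pd

column_values = [f"Column{i}" for i in range(1, 5)]
index_values = [chr(i) for i in range(65, 72)]  # "A" to "G"
df = pd.DataFrame(
    [[f"{ind}{col}" for col in range(1, 5)] for ind in index_values],
    index=index_values,
    columns=column_values,
)

as_series = df["Column2"]   # single name -> Series
as_frame = df[["Column2"]]  # list with one name -> DataFrame with one column
print(type(as_series).__name__, type(as_frame).__name__)  # Series DataFrame

# Listing a column twice repeats its label in the result
repeated = df[["Column1", "Column4", "Column3", "Column1"]]
print(list(repeated.columns))  # ['Column1', 'Column4', 'Column3', 'Column1']

# With a duplicated label, selecting that name returns both columns
both = repeated["Column1"]
print(both.shape)  # (7, 2)
```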


In [44]:
# Indexing a DF with a list (of columns) returns a DataFrame,
# so indexing again would be on the new DataFrame (but don't do this)
df[["Column1","Column4","Column3","Column1"]]["Column3"]
#df[["Column1","Column4","Column3","Column1"]]["Column3"]["E"]

A    A3
B    B3
C    C3
D    D3
E    E3
F    F3
G    G3
Name: Column3, dtype: object

In [41]:
# series[index] returns a scalar value
df["Column2"]["D"]

'D2'
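
As a sketch of why the chained form is discouraged: for reading, chained indexing and a single `loc` call give the same value, but `loc` does it in one step. For assignment, a single `loc` call is the form that reliably updates the DataFrame. Chained assignment is left out of the code on purpose, since depending on the pandas version it may emit a warning or not modify the original. The copy `df_changed` and the value `"changed"` are illustrative only.

```python
import pandas as pd

column_values = [f"Column{i}" for i in range(1, 5)]
index_values = [chr(i) for i in range(65, 72)]  # "A" to "G"
df = pd.DataFrame(
    [[f"{ind}{col}" for col in range(1, 5)] for ind in index_values],
    index=index_values,
    columns=column_values,
)

chained = df["Column2"]["D"]     # two steps: column Series first, then label "D"
direct = df.loc["D", "Column2"]  # one step: [row, column]
print(chained == direct)         # True

# Assigning through a single loc call, done on a copy so df stays untouched
df_changed = df.copy()
df_changed.loc["D", "Column2"] = "changed"
print(df_changed.loc["D", "Column2"])  # changed
print(df.loc["D", "Column2"])          # D2
```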