# Selecting columns 2: using `[]`
By the end of this lecture you will be able to:
- select a column or columns with `[]` indexing
- select rows and columns with `[]` indexing

In [1]:
import polars as pl

In [2]:
csv_path = '../Files/Sample_Superstore.csv'

In [9]:
df = pl.read_csv(csv_path)


Row_ID,Order_ID,Order_Date,Ship Date,Ship_Mode,Customer_ID,Customer_Name,Segment,Country,City,State,Postal_Code,Region,Product_ID,Category,Sub_Category,Product_Name,Sales,Quantity,Discount,Profit
i64,str,str,str,str,str,str,str,str,str,str,i64,str,str,str,str,str,f64,i64,f64,f64
1,"""CA-2016-152156""","""11/8/2016""","""11/11/2016""","""Second Class""","""CG-12520""","""Claire Gute""","""Consumer""","""United States""","""Henderson""","""Kentucky""",42420,"""South""","""FUR-BO-10001798""","""Furniture""","""Bookcases""","""Bush Somerset Collection Bookc…",261.96,2,0.0,41.9136
2,"""CA-2016-152156""","""11/8/2016""","""11/11/2016""","""Second Class""","""CG-12520""","""Claire Gute""","""Consumer""","""United States""","""Henderson""","""Kentucky""",42420,"""South""","""FUR-CH-10000454""","""Furniture""","""Chairs""","""Hon Deluxe Fabric Upholstered …",731.94,3,0.0,219.582
3,"""CA-2016-138688""","""6/12/2016""","""6/16/2016""","""Second Class""","""DV-13045""","""Darrin Van Huff""","""Corporate""","""United States""","""Los Angeles""","""California""",90036,"""West""","""OFF-LA-10000240""","""Office Supplies""","""Labels""","""Self-Adhesive Address Labels f…",14.62,2,0.0,6.8714


In [None]:
df.head(3)

## Choosing 2 columns with square brackets

We can choose multiple columns by passing a list of column names in `[]`

In [10]:
df[['Customer_Name','Profit']].head(3)

Customer_Name,Profit
str,f64
"""Claire Gute""",41.9136
"""Claire Gute""",219.582
"""Darrin Van Huff""",6.8714


## Choosing rows and columns with `[]`
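We can choose rows and columns together by passing a row selector followed by a column selector to `[]`. Here we select the first two rows by index and two columns by name.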

In [11]:
df[[0,1],["Customer_Name","Profit"]]

Customer_Name,Profit
str,f64
"""Claire Gute""",41.9136
"""Claire Gute""",219.582


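### Slice

We can select a range of rows with a slice in `[]`. As with standard Python slicing, the start index is included and the stop index is excluded, so `df[1:3]` returns the rows at index 1 and 2.
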
In [14]:
df[1:3]

Row_ID,Order_ID,Order_Date,Ship Date,Ship_Mode,Customer_ID,Customer_Name,Segment,Country,City,State,Postal_Code,Region,Product_ID,Category,Sub_Category,Product_Name,Sales,Quantity,Discount,Profit
i64,str,str,str,str,str,str,str,str,str,str,i64,str,str,str,str,str,f64,i64,f64,f64
2,"""CA-2016-152156""","""11/8/2016""","""11/11/2016""","""Second Class""","""CG-12520""","""Claire Gute""","""Consumer""","""United States""","""Henderson""","""Kentucky""",42420,"""South""","""FUR-CH-10000454""","""Furniture""","""Chairs""","""Hon Deluxe Fabric Upholstered …",731.94,3,0.0,219.582
3,"""CA-2016-138688""","""6/12/2016""","""6/16/2016""","""Second Class""","""DV-13045""","""Darrin Van Huff""","""Corporate""","""United States""","""Los Angeles""","""California""",90036,"""West""","""OFF-LA-10000240""","""Office Supplies""","""Labels""","""Self-Adhesive Address Labels f…",14.62,2,0.0,6.8714


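We can also choose a range of columns with a slice of column names. The slice follows the column order in `df.columns` and, unlike a row slice, includes both the start and the stop column.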

In [12]:
df[:2, "Customer_Name":"Profit"]

Customer_Name,Segment,Country,City,State,Postal_Code,Region,Product_ID,Category,Sub_Category,Product_Name,Sales,Quantity,Discount,Profit
str,str,str,str,str,i64,str,str,str,str,str,f64,i64,f64,f64
"""Claire Gute""","""Consumer""","""United States""","""Henderson""","""Kentucky""",42420,"""South""","""FUR-BO-10001798""","""Furniture""","""Bookcases""","""Bush Somerset Collection Bookc…",261.96,2,0.0,41.9136
"""Claire Gute""","""Consumer""","""United States""","""Henderson""","""Kentucky""",42420,"""South""","""FUR-CH-10000454""","""Furniture""","""Chairs""","""Hon Deluxe Fabric Upholstered …",731.94,3,0.0,219.582


### Selecting 2 columns with `select`

We can also select multiple columns by passing comma-separated strings to `select`

In [13]:
df.select('Customer_Name','Profit').head(3)

Customer_Name,Profit
str,f64
"""Claire Gute""",41.9136
"""Claire Gute""",219.582
"""Darrin Van Huff""",6.8714


Or we can pass a list of column names to `select`

In [16]:
df.select(['Customer_Name','Profit']).head(3)


Customer_Name,Profit
str,f64
"""Claire Gute""",41.9136
"""Claire Gute""",219.582
"""Darrin Van Huff""",6.8714


### Selecting 2 columns with a list of expressions

We can also pass 2 expressions separated by commas or in a `list`.

In this case we use the `alias` method to rename one of the columns in the output

In [17]:
df.select(pl.col('Customer_Name'), pl.col('Profit').round(0).alias('roundedProfit')).head(3)

Customer_Name,roundedProfit
str,f64
"""Claire Gute""",42.0
"""Claire Gute""",220.0
"""Darrin Van Huff""",7.0
