In [1]:
import pandas as pd

In [2]:
df = pd.DataFrame([["New York", 100], ["Chicago", 50]], columns=["city", "number"])

In [3]:
df

Unnamed: 0,city,number
0,New York,100
1,Chicago,50


# 1. Statements (the "sentence types" of Python)

## 1.1 Expressions

In [4]:
1 + 5

6

In [5]:
df

Unnamed: 0,city,number
0,New York,100
1,Chicago,50


In [6]:
print("You're great!")

You're great!


### Expressions can be wrapped in parentheses for grouping

In [8]:
(1 + 5) * 2

12

In [9]:
(df)

Unnamed: 0,city,number
0,New York,100
1,Chicago,50


In [11]:
(print("You're great!") or df)

You're great!


Unnamed: 0,city,number
0,New York,100
1,Chicago,50


### Inside parentheses, an expression can continue across multiple lines

In [12]:
(
    print("Long expression"),
    print("is easier to write"),
    print("on multiple lines"),
    100 + 50
)

Long expression
is easier to write
on multiple lines


(None, None, None, 150)

## 1.2 Assignment

Assignment is a statement of its own; a plain `=` cannot appear inside an expression

**Uses one `=` character**

In [35]:
a = 10

In [36]:
a, b = 10, 20

In [37]:
a, b = "Hey boss".split()
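The difference between assignment (`=`) and comparison (`==`), and the way unpacking matches names to values, can be sketched as follows. The variable names (`is_ten`, `first`, `second`, `unpack_error`) are illustrative only.

```python
# Plain assignment is a statement: it binds a name to a value.
a = 10

# Comparison uses two '=' characters and is an expression that yields a bool.
is_ten = (a == 10)

# Unpacking assigns several names at once; the number of names on the left
# must match the number of values on the right.
first, second = "Hey boss".split()

# With three words and only two names, Python raises ValueError.
try:
    x, y = "Hey there boss".split()
    unpack_error = None
except ValueError as exc:
    unpack_error = str(exc)

print(is_ten, first, second)
print(unpack_error)
```

If the number of words is not known in advance, it is usually safer to keep the result of `split()` as a list and check its length first.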

## 1.3 Conditional

In [7]:
if df.shape[0] > 10:
    print("Long dataframe")
else:
    print("Not so long dataframe")

Not so long dataframe


### Statements that depend on the condition _must_ be indented, and all by the same amount

In [13]:
if False:
    print("Inside condition")
print("Outside condition")

Outside condition


In [15]:
if False:
    print("Inside condition")
  print("Outside condition")

IndentationError: unindent does not match any outer indentation level (<tokenize>, line 3)

In [16]:
if False:
    print("Inside condition")
        print("Outside condition")

IndentationError: unexpected indent (2297102792.py, line 3)

## 1.4 Loops

In [17]:
for index, row in df.iterrows():
    print(row)

city      New York
number         100
Name: 0, dtype: object
city      Chicago
number         50
Name: 1, dtype: object


In [20]:
i = 0
while True:
    print(df.number[i])
    i += 1
    if i == df.shape[0]:
        break

100
50
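The `while True` loop above keeps its own counter and would fail on an empty table, because it reads the first element before checking the length. A minimal sketch of two common alternatives, using a plain list `numbers` as a stand-in for `df.number` (an assumption made to keep the example free of dependencies):

```python
numbers = [100, 50]  # stand-in for df.number

# Counter-based loop: the condition is checked before each access,
# so an empty list simply results in zero iterations.
by_index = []
i = 0
while i < len(numbers):
    by_index.append(numbers[i])
    i += 1

# Iterating directly over the values avoids the counter altogether.
by_value = []
for value in numbers:
    by_value.append(value)

print(by_index)
print(by_value)
```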


## 1.5 Exceptions

In [23]:
a = None
try:
    a = int("123")
except ValueError:
    print("Does not look like a number")
a

123

In [24]:
a = None
try:
    a = int("mmmm")
except ValueError:
    print("Does not look like a number")
a

Does not look like a number
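The two cells above follow the same pattern: try the conversion and fall back to `None` when it fails. One way to reuse that pattern is to wrap it in a function; `parse_int_or_none` is a hypothetical helper name, not something from the standard library.

```python
def parse_int_or_none(text):
    """Return int(text), or None when text does not look like an integer."""
    try:
        return int(text)
    except ValueError:
        # Only ValueError is caught; other errors (for example a TypeError
        # when text is a list) still propagate to the caller.
        return None


good = parse_int_or_none("123")
bad = parse_int_or_none("mmmm")
print(good, bad)
```

Catching only the specific exception that is expected keeps unrelated bugs visible instead of silently turning them into `None`.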


# 2. Expressions

## 2.1 Binary operators

In [25]:
(
    5 / 3,
    5 // 3,
    5 % 3
)

(1.6666666666666667, 1, 2)

In [26]:
False and True

False

In [27]:
False or True

True

In [28]:
False or "Look here!"

'Look here!'

In [30]:
True and "Look here!"

'Look here!'
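As the last two cells show, `and` and `or` return one of their operands rather than always returning `True` or `False`, and they stop evaluating as soon as the result is known. A small sketch of what that means in practice; the names `display_name`, `shown` and `record` are made up for the illustration.

```python
# 'or' returns the first truthy operand, which gives a simple default value.
name = ""
display_name = name or "Anonymous"   # "" is falsy, so the default is used

# Pitfall: 0 is also falsy, so a legitimate 0 gets replaced too.
count = 0
shown = count or 10

# Short-circuiting: the right-hand side is not evaluated when the
# left-hand side already decides the result.
calls = []

def record(value):
    calls.append(value)
    return value

result = record(False) and record("never evaluated")

print(display_name, shown, result, calls)
```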

In [32]:
"String " + "string"

'String string'

In [33]:
"String " - "string"

TypeError: unsupported operand type(s) for -: 'str' and 'str'

In [34]:
"String " * 2

'String String '

In [38]:
"A" + "b" == "Ab"

True

In [39]:
"A" + ("b" == "Ab")

TypeError: can only concatenate str (not "bool") to str

## 2.2 Square brackets

Square brackets (`[`...`]`) fetch an item from an object that holds several values. Only some types support them.

Such objects are either _sequences_ (list, tuple, string, which is a sequence of characters), indexed by position

Or _dictionaries_ (key-value containers), indexed by key
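A short sketch of how square brackets behave on different types, including what happens when the position or key does not exist or when the type does not support indexing at all. The values and variable names are illustrative.

```python
s = "Chicago"
first_char = s[0]    # strings are indexed by position, starting at 0
last_char = s[-1]    # negative positions count from the end

d = {"city": "Chicago"}
city = d["city"]     # dictionaries are indexed by key

errors = []

try:
    [1, 2, 10][5]    # position past the end of a sequence
except IndexError:
    errors.append("IndexError")

try:
    d["state"]       # key that is not in the dictionary
except KeyError:
    errors.append("KeyError")

try:
    n = 10
    n[0]             # integers do not support square brackets
except TypeError:
    errors.append("TypeError")

print(first_char, last_char, city, errors)
```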

### Getting one element

In [41]:
l = [1, 2, 10]
l[2]

10

In [42]:
t = (1, 2, 10)
t[2]

10

In [43]:
d = {1: 1, 2: "2", 10: 100}
d[2]

'2'

### Setting one element

In [45]:
l[2] = 100
l

[1, 2, 100]

In [46]:
t[2] = 100
t

TypeError: 'tuple' object does not support item assignment

In [47]:
d[2] = 100
d

{1: 1, 2: 100, 10: 100}

### Getting a slice (subsequence) of elements

In [48]:
l[1:3]

[2, 100]

In [49]:
l[:2]

[1, 2]

In [50]:
l[2:]

[100]

In [51]:
l[:]

[1, 2, 100]

In [52]:
l[::-1]

[100, 2, 1]

In [53]:
l[0:3:-1]

[]

In [54]:
l[3:0:-1]

[100, 2]

In [55]:
l[3:-1:-1]

[]

In [56]:
l[3::-1]

[100, 2, 1]
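The negative-step results above can look surprising, especially the empty `l[3:-1:-1]`. The sketch below restates them with comments and uses the built-in `slice.indices` method to show the concrete start, stop and step that Python resolves for a list of this length.

```python
l = [1, 2, 100]

# With a negative step the slice walks backwards from start towards stop;
# the element at stop itself is excluded.
back_to_one = l[3:0:-1]      # stops before index 0, so the first element is left out

# A stop of -1 means "the last element", not "one before index 0",
# so start and stop resolve to the same position and the slice is empty.
empty = l[3:-1:-1]

# Omitting stop lets the slice run all the way to the front.
full_reverse = l[3::-1]

# slice.indices(length) reports the resolved (start, stop, step).
resolved_empty = slice(3, -1, -1).indices(len(l))
resolved_full = slice(3, None, -1).indices(len(l))

print(back_to_one, empty, full_reverse)
print(resolved_empty, resolved_full)
```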

Slicing works the same way for **tuples**, but not for **dictionaries**

In [57]:
t[::-1]

(10, 2, 1)

In [58]:
d[:]

TypeError: unhashable type: 'slice'

Lists also support slice assignment: a slice can be replaced by a list of a different length

In [60]:
l[:2] = [10]
l

[10, 100]
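Because the replacement does not have to match the length of the slice, slice assignment can shrink a list, grow it, or delete from it. A brief sketch starting from the same list as above; the intermediate names (`after_shrink` and so on) only exist to make each step visible.

```python
l = [1, 2, 100]

# Replace two elements with one: the list shrinks.
l[:2] = [10]
after_shrink = list(l)

# Assign to an empty slice: the new elements are inserted at that position.
l[1:1] = [20, 30]
after_insert = list(l)

# Assign an empty list: the sliced elements are removed.
l[1:3] = []
after_delete = list(l)

print(after_shrink, after_insert, after_delete)
```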

# 3. Working with pandas data frames

To select a column or a row as a Series: use `loc` or `iloc` with exactly one of the two coordinates given as a single value

To select a single value: use `at` or `iat`, or `loc` or `iloc` with _both_ coordinates given as single values

To select a value from a Series: use `[`, `at`, `iat`, `loc` or `iloc`

To select a sub-table: use `loc` or `iloc` with both coordinates given as lists or slices

The _i_-indexers (`iat`, `iloc`) select by integer position, counted from 0 (`0` is the first row, `9` is the tenth column)

The regular indexers (`at`, `loc`) select by label (the index for rows, the column names for columns)
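The rules above can be sketched on a copy of the example table. To make the difference between labels and positions visible, this version gives the rows the string labels `"ny"` and `"chi"`; those labels are an assumption for the illustration and are not part of the original `df`.

```python
import pandas as pd

df = pd.DataFrame(
    [["New York", 100], ["Chicago", 50]],
    columns=["city", "number"],
    index=["ny", "chi"],  # hypothetical row labels
)

# One coordinate is a single value: the result is a Series.
row_by_label = df.loc["chi"]
row_by_position = df.iloc[1]
column = df.loc[:, "number"]

# Both coordinates are single values: the result is one value.
value_at = df.at["chi", "number"]
value_iat = df.iat[1, 1]
value_loc = df.loc["chi", "number"]

# Both coordinates are lists or slices: the result is a sub-table.
sub_by_label = df.loc[["ny"], ["city"]]
sub_by_position = df.iloc[0:1, 0:1]

# Selecting a value from a Series.
from_series_label = column["ny"]
from_series_position = column.iloc[0]

print(type(row_by_label).__name__, type(sub_by_label).__name__)
print(value_at, value_iat, value_loc, from_series_label, from_series_position)
```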

## 3.1 Selecting a column

In [63]:
df.loc[:, "city"]

0    New York
1     Chicago
Name: city, dtype: object

In [64]:
df["city"]

0    New York
1     Chicago
Name: city, dtype: object

In [65]:
df.city

0    New York
1     Chicago
Name: city, dtype: object

## 3.2 Selecting a sub-table

In [66]:
df.loc[:, ["city"]]

Unnamed: 0,city
0,New York
1,Chicago
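The only difference between selecting a column and selecting a one-column sub-table is whether the column coordinate is a single value or a list. A minimal sketch that checks the resulting types and shapes on the same example table:

```python
import pandas as pd

df = pd.DataFrame([["New York", 100], ["Chicago", 50]], columns=["city", "number"])

as_series = df.loc[:, "city"]     # single label: one-dimensional Series
as_frame = df.loc[:, ["city"]]    # list of labels: two-dimensional DataFrame

series_type = type(as_series).__name__
frame_type = type(as_frame).__name__

print(series_type, as_series.shape)
print(frame_type, as_frame.shape)
```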


## 3.3 Adding (or setting) a column

In [75]:
df["temperature"] = [80, 60]
df

Unnamed: 0,city,number,temperature
0,New York,100,80
1,Chicago,50,60
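Besides a list with one value per row, a column can be set from a single value or from a calculation on other columns. The sketch below shows both, plus the error raised when the list length does not match the number of rows. The column names `country`, `number_doubled` and `bad` are made up for the example.

```python
import pandas as pd

df = pd.DataFrame([["New York", 100], ["Chicago", 50]], columns=["city", "number"])

# A list must contain exactly one value per row.
df["temperature"] = [80, 60]

# A single value is repeated for every row.
df["country"] = "USA"

# A column can be computed from existing columns.
df["number_doubled"] = df["number"] * 2

# A list of the wrong length is rejected with ValueError.
try:
    df["bad"] = [1, 2, 3]
    length_error = None
except ValueError as exc:
    length_error = str(exc)

print(df)
print(length_error)
```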
