# McKinney Chapter 2 - Python Language Basics, IPython, and Jupyter Notebooks

## Introduction

We must understand the basics of Python before we can use it to analyze financial data.
Chapter 2 of @mckinney2022python provides a crash course in Python's syntax, and Chapter 3 provides a crash course in Python's built-in data structures.
This notebook focuses on the "Python Language Basics" in Section 2.3, which covers language semantics, scalar types, and control flow.

***Note:*** 
Indented block quotes are from @mckinney2022python unless otherwise indicated.
The section numbers here differ from @mckinney2022python because we will only discuss some topics.

## Language Semantics

### Indentation, not braces

> Python uses whitespace (tabs or spaces) to structure code instead of using braces as in many other languages like R, C++, Java, and Perl.

***Spaces are more than cosmetic in Python.***
Here is a `for` loop with an `if` statement that shows how Python uses indentation, rather than braces, to group code into blocks.

In [1]:
array = [1, 2, 3]
pivot = 2
less = []
greater = []

for x in array:
    if x < pivot:
        print(f'{x} is less than {pivot}')
        less.append(x)
    else:
        print(f'{x} is NOT less than {pivot}')
        greater.append(x)

1 is less than 2
2 is NOT less than 2
3 is NOT less than 2


In [2]:
less

[1]

In [3]:
greater

[2, 3]
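As a minimal sketch of why indentation matters, the two functions below differ only in the indentation of one line.
The function names `count_small_inside` and `count_small_outside` are hypothetical and used only for illustration.

```python
def count_small_inside(values, pivot):
    # The increment is indented under the `if`, so it runs only when x < pivot
    count = 0
    for x in values:
        if x < pivot:
            count = count + 1
    return count


def count_small_outside(values, pivot):
    # The increment is aligned with the `if`, so it runs on every iteration
    count = 0
    for x in values:
        if x < pivot:
            pass  # placeholder: nothing happens inside the `if` block
        count = count + 1
    return count


inside = count_small_inside([1, 2, 3], 2)    # counts only the value 1
outside = count_small_outside([1, 2, 3], 2)  # counts all three values
print(inside, outside)
```

The first function returns 1 and the second returns 3, even though the two functions contain nearly identical lines.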

### Comments

> Any text preceded by the hash mark (pound sign) # is ignored by the Python interpreter. This is often used to add comments to code. At times you may also want to exclude certain blocks of code without deleting them.

The Python interpreter ignores any code after a hash mark `#` on a given line.
We can quickly comment/un-comment lines of code with the `<Ctrl>-/` shortcut.

In [4]:
# We often use comments to leave notes for future us (or co-workers)
# 5 + 5

### Function and object method calls

> You call functions using parentheses and passing zero or more arguments, optionally assigning the returned value to a variable:
> ```python
>     result = f(x, y, z)
>     g()
> ```
> Almost every object in Python has attached functions, known as methods, that have access to the object's internal contents. You can call them using the following syntax:
> ```python
>     obj.some_method(x, y, z)
> ```
> Functions can take both positional and keyword arguments:
> ```python
>     result = f(a, b, c, d=5, e='foo')
> ```
> More on this later.

Here is a function named `add_numbers` that adds two numbers.

In [5]:
def add_numbers(a, b):
    return a + b

In [6]:
add_numbers(5, 5)

10

Here is a function named `add_strings` that adds or concatenates two strings separated by a space.

In [7]:
def add_strings(a, b):
    return a + ' ' + b

In [8]:
add_strings('5', '5')

'5 5'
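The quote above mentions keyword arguments, so here is a small sketch of one.
The function `add_strings_sep` and its `sep` argument are hypothetical extensions of `add_strings`, not something from the book.

```python
def add_strings_sep(a, b, sep=' '):
    # `sep` is a keyword argument with a default value of one space
    return a + sep + b


default_sep = add_strings_sep('5', '5')          # uses the default separator
custom_sep = add_strings_sep('5', '5', sep='-')  # overrides the default
by_name = add_strings_sep(b='7', a='5')          # keywords may appear in any order
print(default_sep, custom_sep, by_name)
```

Passing arguments by name makes calls easier to read and lets us skip arguments that have sensible defaults.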

What is the difference between `print()` and `return`?

- `print()` displays its argument on the console (or "standard output") and returns `None`
- `return` hands its argument back to the caller as an output we can assign to a variable

Please see the following example.

In [9]:
def add_strings_2(a, b):
    string_to_print = a + ' ' + b + ' (this is from the print statement)'
    string_to_return = a + ' ' + b + ' (this is from the return statement)'
    print(string_to_print)
    return string_to_return

In [10]:
returned = add_strings_2('5', '5')

5 5 (this is from the print statement)


In [11]:
returned

'5 5 (this is from the return statement)'
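A related sketch, using the hypothetical functions `print_sum` and `return_sum`, shows what we get back when a function only prints.

```python
def print_sum(a, b):
    # Displays the sum but has no return statement, so the call returns None
    print(a + b)


def return_sum(a, b):
    # Returns the sum so the caller can store and reuse it
    return a + b


from_print = print_sum(5, 5)    # displays 10, but from_print is None
from_return = return_sum(5, 5)  # displays nothing, but from_return is 10
print(from_print is None, from_return)
```

If we want to use a result in later calculations, the function must `return` it.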

### Variables and argument passing

> When assigning a variable (or name) in Python, you are creating a reference to the object on the righthand side of the equals sign.

In [12]:
a = [1, 2, 3]
b = a

If we assign `a` to a new variable `b`, both `a` and `b` refer to the *same* object, which is the list `[1, 2, 3]`.

In [13]:
a is b

True

***If we modify `a` by appending `4`, we also modify `b` because `a` and `b` refer to the same list.***

In [14]:
a.append(4)

In [15]:
a

[1, 2, 3, 4]

In [16]:
b

[1, 2, 3, 4]

***Likewise, if we modify `b` by appending `5`, we also modify `a`.***

In [17]:
b.append(5)

In [18]:
b

[1, 2, 3, 4, 5]

In [19]:
a

[1, 2, 3, 4, 5]
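If we want an independent list instead of a second name for the same list, one option is the list's `.copy()` method.
The sketch below uses the new names `original`, `alias`, and `copied` so it does not disturb `a` and `b` above.

```python
original = [1, 2, 3]
alias = original          # a second name for the same list
copied = original.copy()  # a new list with the same elements (a shallow copy)

original.append(4)        # changes the list that both original and alias refer to

print(alias is original, copied is original)
print(alias)
print(copied)
```

Here `alias` sees the appended `4`, but `copied` does not.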

### Dynamic references, strong types

> In contrast with many compiled languages, such as Java and C++, object references in Python have no type associated with them.

Python has *dynamic references*.
Therefore, we do not declare variable types, and we can reassign a variable name to an object of a different type.
This works because variables are names bound to objects, and the type belongs to the object rather than the name.

For example, above we assign `a` to a list, and below we can reassign it to an integer and then a string.

In [20]:
a

[1, 2, 3, 4, 5]

In [21]:
type(a)

list

In [22]:
a = 5
type(a)

int

In [23]:
a = 'foo'
type(a)

str

Python has *strong types.*
Therefore, Python typically will not implicitly convert object types.

For example, `'5' + 5` returns either `'55'` as a string or `10` as an integer in many programming languages.
However, below `'5' + 5` returns an error because Python will not implicitly convert the type of the string or integer.

In [24]:
# '5' + 5 #TypeError: can only concatenate str (not "int") to str

However, Python will implicitly convert integers to floats.

In [25]:
a = 4.5
b = 2
a / b

2.25
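The sketch below puts both behaviors side by side.
It uses `try` and `except`, which this notebook does not otherwise cover, only to catch the error so the remaining lines still run.

```python
# Mixing a string and an integer raises a TypeError instead of guessing our intent
try:
    mixed_types = '5' + 5
except TypeError:
    mixed_types = None
    print('Python refused to add a str and an int')

# Mixing an integer and a float is allowed, and the result is a float
int_plus_float = 2 + 4.5
print(int_plus_float, type(int_plus_float))
```

The only implicit conversion here is the numeric one, from `int` to `float`.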

### Attributes and methods

We can use tab completion to access attributes (characteristics stored inside objects) and methods (functions associated with objects).
Tab completion is a feature of the IPython and Jupyter environments.

In [26]:
a = 'foo'

In [27]:
a.capitalize()

'Foo'

In [28]:
a.upper().lower()

'foo'

In [29]:
a.count('o')

2
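Note that the string methods above return new objects and leave the original string unchanged.
Here is a short sketch with the illustrative names `word` and `shouted`; `hasattr()` is a built-in function that checks whether an object has a given attribute or method.

```python
word = 'foo'
shouted = word.upper()  # returns a new string

print(word)     # the original string is unchanged
print(shouted)

# hasattr() checks for an attribute or method by name
has_upper = hasattr(word, 'upper')
has_append = hasattr(word, 'append')  # append is a list method, not a string method
print(has_upper, has_append)
```

To keep the result of a method call like `.upper()`, we must assign it to a variable.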

### Binary operators and comparisons

Binary operators operate on two arguments.

In [30]:
5 - 7

-2

In [31]:
12 + 21.5

33.5

In [32]:
5 <= 2

False

***Table 2-1*** from @mckinney2022python summarizes the binary operators.

- `a + b` : Add a and b
- `a - b` : Subtract b from a
- `a * b` : Multiply a by b
- `a / b` : Divide a by b
- `a // b` : Floor-divide a by b, dropping any fractional remainder
- `a ** b` : Raise a to the b power
- `a & b` : True if both a and b are True; for integers, take the bitwise AND
- `a | b` : True if either a or b is True; for integers, take the bitwise OR
- `a ^ b` : For booleans, True if a or b is True , but not both; for integers, take the bitwise EXCLUSIVE-OR
- `a == b` : True if a equals b
- `a != b`: True if a is not equal to b
- `a < b, a <= b` : True if a is less than (less than or equal) to b
- `a > b, a >= b`: True if a is greater than (greater than or equal) to b
- `a is b` : True if a and b reference the same Python object
- `a is not b` : True if a and b reference different Python objects
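A few of these operators are easy to confuse, so here is a short sketch with illustrative values.
The names `x` and `y` are arbitrary.

```python
x = [1, 2, 3]
y = [1, 2, 3]

# == compares values, while `is` compares identity
same_value = x == y   # True: the lists hold equal elements
same_object = x is y  # False: they are two separate list objects

# / is true division, // is floor division, ** is exponentiation
true_div = 7 / 2      # 3.5
floor_div = 7 // 2    # 3
power = 2 ** 10       # 1024

# On integers, &, |, and ^ work bit by bit (6 is 110 and 3 is 011 in binary)
bit_and = 6 & 3       # 2 (binary 010)
bit_or = 6 | 3        # 7 (binary 111)
bit_xor = 6 ^ 3       # 5 (binary 101)

print(same_value, same_object, true_div, floor_div, power, bit_and, bit_or, bit_xor)
```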

### Mutable and immutable objects

> Most objects in Python, such as lists, dicts, NumPy arrays, and most user-defined types (classes), are mutable. This means that the object or values that they contain can be modified.

A list is a *mutable*, ordered collection of elements, which can be any data type.
*Because lists are mutable, we can modify them.*
Lists are defined using square brackets `[]` with elements separated by commas.
Lists support indexing, slicing, and various methods for adding, removing, and modifying elements.


In [33]:
a_list = ['foo', 2, [4, 5]]
a_list

['foo', 2, [4, 5]]

***Python is zero-indexed! The first element has a zero subscript `[0]`!***

In [34]:
a_list[0]

'foo'

In [35]:
a_list[2]

[4, 5]

In [36]:
a_list[2][0]

4

In [37]:
a_list[2] = (3, 4)
a_list

['foo', 2, (3, 4)]

A tuple is an *immutable*, ordered collection of elements, which can be any data type.
*Because tuples are immutable, we cannot modify them.*
Tuples are defined using optional but helpful parentheses `()`, with elements separated by commas.

In [38]:
a_tuple = (3, 5, (4, 5))
a_tuple

(3, 5, (4, 5))

The Python interpreter returns an error if we try to modify `a_tuple` because tuples are immutable.

In [39]:
# a_tuple[1] = 'four' # TypeError: 'tuple' object does not support item assignment

The parentheses `()` are optional for tuples.
However, parentheses `()` are helpful because they improve readability and remove ambiguity.

In [40]:
test = 1, 2, 3
type(test)

tuple
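Two details about tuples are worth a quick sketch.
First, a tuple's slots cannot be reassigned, but a mutable object stored in a slot can still be modified in place.
Second, a one-element tuple needs a trailing comma.
The variable names are illustrative.

```python
nested = (1, 2, [3, 4])
nested[2].append(5)  # allowed: we modify the list inside the tuple, not the tuple
print(nested)

not_a_tuple = (5)    # parentheses alone only group the expression, so this is an int
one_tuple = (5,)     # the trailing comma makes this a tuple
print(type(not_a_tuple), type(one_tuple))
```

So "immutable" means we cannot change which objects the tuple holds, not that those objects can never change.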

We will learn more about Python's built-in data structures in Chapter 3.

## Scalar Types

> Python along with its standard library has a small set of built-in types for handling numerical data, strings, boolean (True or False) values, and dates and time. These "single value" types are sometimes called scalar types and we refer to them in this book as scalars. See Table 2-2 for a list of the main scalar types. Date and time handling will be discussed separately, as these are provided by the datetime module in the standard library.

***Table 2-2*** from @mckinney2022python summarizes the standard scalar types.

- `None`: The Python "null" value (only one instance of the None object exists)
- `str`: String type; holds Unicode strings
- `bytes`: Raw ASCII bytes (or Unicode encoded as bytes)
- `float`: Double-precision (64-bit) floating-point number (note there is no separate double type)
- `bool`: A True or False value
- `int`: Arbitrary precision signed integer

### Numeric types

Integers are unbounded in Python.
The `**` binary operator raises the number on the left to the power on the right.

In [41]:
ival = 17239871
ival ** 6

26254519291092456596965462913230729701102721

Floats (decimal numbers) are 64-bit in Python.

In [42]:
fval = 7.243
type(fval)

float

Dividing integers with `/` always yields a float, even when the result is a whole number.

In [43]:
3 / 2

1.5

We use `//` if we want floor (integer) division.

In [44]:
3 // 2

1
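The sketch below shows a few edge cases of the two division operators with illustrative values.
Note that `//` rounds down toward negative infinity, which differs from truncation for negative numbers.

```python
whole = 4 / 2           # 2.0: true division returns a float even for whole results
floor_pos = 7 // 2      # 3
floor_neg = -7 // 2     # -4: floor division rounds down, not toward zero
floor_float = 7.0 // 2  # 3.0: the result is a float if either operand is a float

print(whole, floor_pos, floor_neg, floor_float)
```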

### Booleans

> The two Boolean values in Python are written as True and False. Comparisons and other conditional expressions evaluate to either True or False. Boolean values are combined with the and and or keywords.

We must type Booleans as `True` and `False` because Python is case sensitive.

In [45]:
True and True

True

In [46]:
(5 > 1) and (10 > 5)

True

In [47]:
False and True

False

In [48]:
False or True

True

In [49]:
(5 > 1) or (10 > 5)

True

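For Boolean values, `&` and `|` give the same results as `and` and `or`.
However, `&` and `|` are bitwise operators that bind more tightly than comparisons, so we must wrap comparisons in parentheses when we use them.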

In [50]:
True & True

True

In [51]:
False & True

False

In [52]:
False | True

True
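The keywords and the symbols are not fully interchangeable, as this sketch with illustrative values suggests.
The symbols `&` and `|` have higher precedence than comparisons, and only `and` and `or` stop evaluating as soon as the answer is known.

```python
x = 4

with_and = x > 1 and x < 3     # False, because 4 is not less than 3
with_amp = (x > 1) & (x < 3)   # False, the same answer thanks to the parentheses

# Without parentheses, Python reads this as x > (1 & x) < 3, which is 4 > 0 < 3
without_parens = x > 1 & x < 3  # True, which is not what we meant

# `and` skips the right-hand side when the left-hand side is already False
zero = 0
safe = (zero != 0) and (10 / zero > 1)  # False, and no division by zero occurs

print(with_and, with_amp, without_parens, safe)
```

For plain `True` and `False` values either spelling works, but `and` and `or` are the safer default in `if` statements.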

### Type casting

We can "recast" variables to change their types.

In [53]:
s = '3.14159'
type(s)

str

In [54]:
1 + float(s)

4.14159

In [55]:
fval = float(s)
type(fval)

float

In [56]:
int(fval)

3

We can recast a string `'5'` to an integer or an integer `5` to a string to prevent the `'5' + 5` error above.

In [57]:
5 + int('5')

10

In [58]:
str(5) + '5'

'55'
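Two casting details are easy to miss, so here is a small sketch with illustrative values.
`int()` truncates a float toward zero instead of rounding, and `int()` does not accept a string with a decimal point.

```python
truncated_pos = int(3.99)   # 3: the fractional part is dropped, not rounded
truncated_neg = int(-3.99)  # -3: truncation is toward zero
rounded = round(3.99)       # 4: round() rounds to the nearest integer

# int('3.14159') would raise a ValueError, so we convert to float first
from_string = int(float('3.14159'))

print(truncated_pos, truncated_neg, rounded, from_string)
```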

### None

`None` is null in Python.
`None` is like `#N/A` or `=na()` in Excel.

In [59]:
a = None
a is None

True

In [60]:
b = 5
b is not None

True

In [61]:
type(None)

NoneType
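A common use of `None` is as the default value for an optional function argument.
The sketch below assumes a hypothetical function `add_and_maybe_multiply` whose third argument is optional.

```python
def add_and_maybe_multiply(a, b, c=None):
    # c defaults to None, which signals "no multiplier was supplied"
    result = a + b
    if c is not None:
        result = result * c
    return result


no_multiplier = add_and_maybe_multiply(5, 3)        # 8
with_multiplier = add_and_maybe_multiply(5, 3, 10)  # 80
zero_multiplier = add_and_maybe_multiply(5, 3, 0)   # 0, because 0 is not None

print(no_multiplier, with_multiplier, zero_multiplier)
```

Testing `c is not None` instead of just `if c:` matters because a multiplier of `0` is a valid value that should still be applied.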

## Control Flow

> Python has several built-in keywords for conditional logic, loops, and other standard control flow concepts found in other programming languages.

If you understand Excel's `if()`, then you understand Python's `if`, `elif`, and `else`.

### if, elif, and else

In [62]:
x = -1
type(x)

int

In [63]:
if x < 0:
    print("It's negative")

It's negative


Single quotes and double quotes (`'` and `"`) are equivalent in Python.
However, in the preceding code cell, we use double quotes so that the apostrophe in `It's` does not end the string.

Python's `elif` avoids nested `if` statements.
`elif` allows another `if` condition that is tested only if the preceding `if` and `elif` conditions were not `True`.
An `else` runs if no other conditions are met.

In [64]:
x = 10
if x < 0:
    print("It's negative")
elif x == 0:
    print('Equal to zero')
elif 0 < x < 5:
    print('Positive but smaller than 5')
else:
    print('Positive and larger than or equal to 5')

Positive and larger than or equal to 5


We can combine comparisons with `and` and `or` (or with `&` and `|`, if each comparison is wrapped in parentheses).

In [65]:
a = 5
b = 7
c = 8
d = 4
if (a < b) or (c > d):
    print('Made it')

Made it
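To see every branch of an `if`, `elif`, and `else` chain run, we can wrap the chain in a function and call it with several values.
The function `describe` and the list `labels` are hypothetical names for this sketch.

```python
def describe(x):
    # Conditions are tested top to bottom, and only the first True branch runs
    if x < 0:
        return 'negative'
    elif x == 0:
        return 'zero'
    elif 0 < x < 5:  # a chained comparison, the same as (0 < x) and (x < 5)
        return 'small positive'
    else:
        return 'large positive'


labels = []
for value in [-1, 0, 3, 10]:
    labels.append(describe(value))

print(labels)
```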


### for loops

We use `for` loops to loop over collections, like lists or tuples.

The `continue` keyword skips the remainder of the current iteration of the `for` loop, moving to the next iteration.

The `+=` operator adds and assigns values with one operator.
That is, `a += 5` is an abbreviation for `a = a + 5`.
There are equivalent operators for subtraction, multiplication, and division (i.e., `-=`, `*=`, and `/=`).

In [66]:
sequence = [1, 2, None, 4, None, 5, 'Alex']
total = 0
for value in sequence:
    if value is None or isinstance(value, str):
        continue
    total += value # the += operator is equivalent to "total = total + value"

In [67]:
total

12

The `break` keyword exits the `for` loop, skipping the remainder of the current iteration and all remaining iterations.

In [68]:
sequence = [1, 2, 0, 4, 6, 5, 2, 1]
total_until_5 = 0
for value in sequence:
    if value == 5:
        break
    total_until_5 += value

In [69]:
total_until_5

13
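To compare `continue` and `break` directly, the sketch below runs both on the same list.
The names `numbers`, `total_skipping_5`, and `total_before_5` are illustrative.

```python
numbers = [1, 2, 0, 4, 6, 5, 2, 1]

# continue skips only the 5 and keeps looping
total_skipping_5 = 0
for value in numbers:
    if value == 5:
        continue
    total_skipping_5 += value

# break stops the loop at the 5, so the trailing 2 and 1 are never added
total_before_5 = 0
for value in numbers:
    if value == 5:
        break
    total_before_5 += value

print(total_skipping_5, total_before_5)
```

The first total is 16 and the second is 13; the difference is the `2` and `1` that follow the `5`.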

### range

> The range function returns an iterator that yields a sequence of evenly spaced integers.

The `range()` function quickly and efficiently generates iterators for `for` loops.

- With one argument, `range()` creates an iterator from 0 to that number *but excludes that number*, so `range(10)` is an iterator that starts at 0, stops at 9, with a length of 10
- With two arguments, the first argument is the *included* start value, and the second argument is the *excluded* stop value
- With three arguments, the third argument is the iterator step size

In [70]:
range(10)

range(0, 10)

We can cast a range to a list.

In [71]:
list(range(10))

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

Python intervals are "closed" (included) on the left and "open" (excluded) on the right.
The following is an empty list because we cannot count from 5 to 0 by steps of +1.

In [72]:
list(range(5, 0))

[]

However, we can count from 5 to 0 in steps of -1.

In [73]:
list(range(5, 0, -1))

[5, 4, 3, 2, 1]
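Here is a short sketch of the three-argument form and of `range()` inside a `for` loop.
The variable names are illustrative.

```python
evens = list(range(0, 10, 2))      # start at 0, stop before 10, step by 2
every_third = list(range(2, 10, 3))

# Sum the integers 1 through 5; the stop value 6 is excluded
total_1_to_5 = 0
for i in range(1, 6):
    total_1_to_5 += i

print(evens)
print(every_third)
print(total_1_to_5, len(range(10)))
```

Because the stop value is excluded, `range(10)` has exactly 10 elements.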

### Ternary expressions

A ternary expression puts a simple `if`-`else` that produces a value on one line.

In [74]:
x = -5
value = 'Non-negative' if x >= 0 else 'Negative'
value

'Negative'
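For comparison, here is a sketch of the same logic written as a multi-line `if` and `else` block.
The names `number`, `label_block`, and `label_ternary` are illustrative.

```python
number = -5

# Multi-line version
if number >= 0:
    label_block = 'Non-negative'
else:
    label_block = 'Negative'

# One-line ternary version: value_if_true if condition else value_if_false
label_ternary = 'Non-negative' if number >= 0 else 'Negative'

print(label_block, label_ternary)
```

Both versions assign the same value, so the choice comes down to readability.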