<h1><font color='blue'>Session 1 - Atomic Data Types</font></h1>

We will focus on the basic data types that you should know before proceeding further. They are called atomic because, within Python, you cannot get any more granular than these unless you go into machine code (1s and 0s). You can liken them to bricks in construction: if you break down a brick you get dust, which is not very useful.

The atomic data types are `int`, `float`, `bool` and `str`.

![data types](https://i.imgur.com/6cg2E9Q.png)

Before we begin: the code is run on Google Colaboratory, which is a Google-hosted version of Jupyter Notebook. Jupyter Notebook is an easy way to run Python code interactively through an interface of cells, much like cells in Excel. The architecture is shown below.

![](https://miro.medium.com/max/720/1*OSnF3tM2sXsVXrmM4qV2qw.png)

Since the kernel resides on Google's servers and not your own computer, reading and writing files is challenging and requires additional setup. It is recommended that you download the notebooks and run them on your own local Jupyter Notebook server.

And now, without further ado, let's learn a little bit of everything about Python!

----

# Section 1 - Numerical Data

Computers like numbers. They're easy to store in memory and operate on. We will further break down numerical data into integers (hereafter referred to as `int`) and floating point numbers (`float`). [Boring discourse here](https://en.wikipedia.org/wiki/IEEE_754).

For all numerical data you can perform arithmetic operations like addition, subtraction, multiplication and division, as well as exponentiation.

## 1a - `int`

An integer is a whole number with no fractional part. It is useful as a tool to index your data, e.g. the first element in a list or the third letter in a word.


### Quick note on variable assignment
Note: Here we will also teach you about variable assignment.

> A single equals sign (`=`) means assignment

i.e. binding a value to a variable.

> An operator followed by an equals sign (`+=` or `-=` or `*=` or `/=`) means operation and assignment in one step, i.e. if `a=1` and then you execute `a+=1`, Python will add 1 to `a` and assign the result back to `a`, so `a` is now 2.

> A double equals sign (`==`) means equivalence

If you want to compare whether two variables or expressions are equal, use two equals signs (`==`). This expression will return `True` or `False`, and `==` is known as a comparison operator. We will cover this in detail in the section on Boolean values.
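A minimal sketch of the three notations side by side. The names `counter` and `is_two` are made up for illustration.

```python
counter = 1              # single equals: bind the value 1 to the name counter
counter += 1             # operator plus equals: same as counter = counter + 1
is_two = counter == 2    # double equals: compare, producing True or False

print(counter)
print(is_two)
```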

### print() function
Used to display output on the console. It can print multiple inputs in a single function call.

In [1]:
a = 5
print(a)
print(type(a))

a += 1

print(a)
print(type(a))

5
<class 'int'>
6
<class 'int'>


In [2]:
b = 2
print(b)
print(type(b))

2
<class 'int'>


In [3]:
c = 3
print(c)
print(type(c))

3
<class 'int'>


Variables `a`, `b` and `c` are now stored in memory. You can now do things with them. The following are examples of what you can do with two variables.

- addition (`+`)
- subtraction (`-`)
- multiplication (`*`)
- division (`/`)
- exponentiation (`**`)
- modulo a.k.a. remainder (`%`)
- floor division a.k.a. quotient (`//`)

Note: As in regular mathematics, the BEDMAS rule applies.

In [4]:
a = 10
b = -5
c = 7

print("The sum of a and b is: ",a+b)
print("The difference between a and b is: ",a-b)
print("The product of a and b is: ",a*b)
print("The result of (a+b)*c is: ",(a+b)*c)
print("The result of a+b*c is: ",a+b*c)
print("a divided by b is: ",a/b)
print("a to the power of b is: ",a**b)
print("The remainder when a is divided by b is: ",a%b)
print("The quotient when a is divided by b is: ",a//b)

The sum of a and b is:  5
The difference between a and b is:  15
The product of a and b is:  -50
The result of (a+b)*c is:  35
The result of a+b*c is:  -25
a divided by b is:  -2.0
a to the power of b is:  1e-05
The remainder when a is divided by b is:  0
The quotient when a is divided by b is:  -2


The modulo (`%`) operator is useful for keeping a numerical quantity bounded within a range. E.g. if you want to select a card from a deck and you have a large random number, you can take that number modulo 52 and your answer will always be between 0 and 51 inclusive.
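The card example can be sketched as follows. The names `DECK_SIZE`, `big_number` and `card_index` are illustrative, and `random.randrange` simply stands in for whatever produces the large number.

```python
import random

DECK_SIZE = 52  # number of cards in a standard deck

big_number = random.randrange(10**9)   # stand-in for "a large random number"
card_index = big_number % DECK_SIZE    # remainder after dividing by 52

print(big_number, "->", card_index)
```

Because the remainder of a division by 52 can never reach 52, `card_index` is a valid zero-based position in a 52-card deck.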

Notice that there is a decimal point when you take a/b. The result is no longer an integer, but a......

## 1b - `float`

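When you add, subtract and multiply integers, you get integers back. But whenever you divide with `/`, you ALWAYS get a `float` back, even if the result is a whole number. (Floor division `//` of two integers still gives an integer.)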

In [5]:
type(a/b)

float
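A short sketch of how division behaves, plus one consequence of the IEEE 754 representation linked earlier: floats are binary approximations, so exact equality checks can surprise you. `math.isclose` is a standard library helper for comparing with a tolerance.

```python
import math

print(7 / 2)    # true division gives a float: 3.5
print(6 / 2)    # still a float even when the result is whole: 3.0
print(7 // 2)   # floor division of two ints stays an int: 3

# Floats are approximations, so compare with a tolerance rather than ==.
total = 0.1 + 0.2
print(total == 0.3)              # False
print(math.isclose(total, 0.3))  # True
```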

### Coercion

If your code requires your numerical input to be of a different type, you can coerce or cast the input to a different type.

Note: `int()` typecasts by removing the decimals (truncating toward zero), not rounding! See [w3schools](https://www.w3schools.com/python/python_casting.asp)

In [6]:
int(a/b)

-2

In [7]:
str(a/b)

'-2.0'
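To make the difference between truncating and rounding concrete, here is a small sketch comparing `int()` with the built-in `round()` and with floor division. The variable names are illustrative.

```python
truncated_pos = int(2.7)    # 2: the fractional part is dropped
truncated_neg = int(-2.7)   # -2: truncation moves toward zero, not down
rounded = round(2.7)        # 3: round() goes to the nearest integer
floored = -2.7 // 1         # -3.0: floor division rounds down instead

print(truncated_pos, truncated_neg, rounded, floored)
```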

## 1c - `bool`

Boolean values are simply `True` or `False`. In Python they behave like the integers 1 (`True`) and 0 (`False`). Whenever you use comparison operators, you get a Boolean value, e.g. when you ask Python whether `a` is larger than/smaller than/equal to `b`:

In [8]:
a=7
b=5
print("Is a greater than b?: ",a>b)
print("Is a less than b?: ",a<b)
print("Is a equal to b?: ",a==b)

Is a greater than b?:  True
Is a less than b?:  False
Is a equal to b?:  False


> Quick note: Now that we have assigned new values to a and b, the old values get discarded.

In [9]:
print(a)
print(b)

7
5
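The 0 and 1 representation mentioned at the start of this section is more than a convention in Python: `bool` is a subclass of `int`. A small sketch, using a hypothetical `scores` list, shows one practical use, counting how many comparisons are `True`.

```python
print(int(True), int(False))   # 1 0
print(True + True)             # 2, because True behaves like 1 in arithmetic
print(isinstance(True, int))   # True

# Counting how many comparisons hold by summing Booleans
scores = [55, 72, 90, 48]      # hypothetical data for illustration
passed = sum(score >= 60 for score in scores)
print(passed)                  # 2
```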


Boolean values are very useful for controlling the logical flow of your program. For example, you can find out whether a number is odd or even, and then print a different response for each case.

Note: We will cover flow control in detail in later sections. For now: the `if` keyword must be followed by an expression that evaluates to either `True` or `False`, and then a colon `:`. If the expression is `True`, Python executes the code that follows, which has to be indented.

If the `if` expression evaluates to `False`, Python executes the code in the `else` portion instead. Note that you can nest another `if-else` within an `if-else` statement for more complex logic.

In [10]:
number_to_check = 10

print(number_to_check % 2 == 0)

if number_to_check % 2 == 0: # If it can be divided by 2 with no remainder
    print("number_to_check is even")
else:
    print("number_to_check is odd")

True
number_to_check is even
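The nesting mentioned above might look like the following sketch. The variable `description` and the choice of `-7` are illustrative only.

```python
number_to_check = -7

if number_to_check % 2 == 0:
    description = "even"
else:
    # This inner if-else only runs when the outer condition is False.
    if number_to_check < 0:
        description = "odd and negative"
    else:
        description = "odd and positive"

print(description)  # odd and negative
```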


----

# Section 2 - Strings

Textual data in programming is called a string, or `str` for short. Textual data is far more complex than numerical data, and there are many ways to manipulate strings. We will focus on the absolute basics for this notebook. [See here for more in depth coverage](https://www.pythoncheatsheet.org/cheatsheet/manipulating-strings).

> **!!!IMPORTANT!!!** A string can be enclosed in single `'` or double quotes `"`. If you want to use single or double quotes in your string, you need to make sure the quotes used to enclose the text are different. I.e. if you want to use `'`, enclose with `"` and vice versa.

You can also use the escape character `\` (for example `\'` or `\"`) to tell Python to treat the quote that follows as a literal character rather than the end of the string. An [escape characters primer](https://www.scaler.com/topics/escape-sequence-in-python/) can be viewed here if interested.

In [11]:
first_name = 'Ian'
last_name = "Chong"
print(f"Hello {first_name}! Is your last name '{last_name}'?")
print(f"Actually, I'm not even sure if your first name is even \"{first_name}\"")

Hello Ian! Is your last name 'Chong'?
Actually, I'm not even sure if your first name is even "Ian"
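A few more escape sequences, as a rough sketch. The file path is a made-up example; the `r"..."` raw-string prefix is standard Python and turns off escape processing.

```python
quote_single = 'It\'s a string'        # \' keeps the single quote literal
quote_double = "She said \"hi\""       # \" does the same for double quotes
two_lines = "first\nsecond"            # \n is a newline
windows_path = "C:\\temp\\notes.txt"   # \\ is one literal backslash
raw_path = r"C:\temp\notes.txt"        # raw strings leave backslashes alone

print(quote_single)
print(quote_double)
print(two_lines)
print(windows_path == raw_path)        # True
```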


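A common source of confusion when basic Python code doesn't work is adding a string representation of a number to an actual number. Adding two strings concatenates them instead of summing them, and adding an `int` to a `str` raises a `TypeError`. Either coerce the values to the same type first, or handle the error with `try` and `except`.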

In [12]:
print(1+2) # this becomes int 3
print(1+2.0) # this becomes float 3.0
print('1'+'2') # this becomes string 12

try:
    1 + '2' # this results in a TypeError
except TypeError:
    print("Can't add int to string. Always check your data types!")

3
3.0
12
Can't add int to string. Always check your data types!


In [13]:
# To fix the above, have to coerce!
print(1+int('2'))
print(str(1)+'2')

3
12
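Coercion itself can fail when the text is not a valid number, in which case `int()` raises a `ValueError`. The helper `to_int` below is a hypothetical name, sketched here only to show how coercion and `try`/`except` combine.

```python
def to_int(text):
    """Return text as an int, or None when it is not a whole number."""
    try:
        return int(text)
    except ValueError:
        return None

print(to_int("42"))    # 42
print(to_int(" 7 "))   # 7, surrounding whitespace is ignored
print(to_int("4.2"))   # None, int() does not parse decimal strings
print(to_int("abc"))   # None
```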


## 2a - String formatting

Many a time, we want to build or change a string programmatically, i.e. by inserting values from variables rather than typing them out by hand. This is called string formatting. In Python the easiest way to do this is to prefix the string with an `f` and use curly braces to enclose the variables to inject into the string, as shown above.

For numbers, you can even change the displayed precision in place by adding a format specifier after the variable name, as shown below.

See this [cookbook](https://mkaz.blog/code/python-string-format-cookbook/) for more details!

In [14]:
from math import pi
print(pi)

print(f"Pi to 4 places is {pi:.4f}")

3.141592653589793
Pi to 4 places is 3.1416


For further reference, [see this cheatsheet](https://www.pythoncheatsheet.org/cheatsheet/string-formatting)
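A few other format specifiers that come up often, as a sketch. The values of `revenue` and `growth` are hypothetical and chosen only to make the output easy to read.

```python
revenue = 394328000000   # hypothetical figure for illustration
growth = 0.0779          # hypothetical growth rate

print(f"{revenue:,}")        # thousands separators: 394,328,000,000
print(f"{revenue:.2e}")      # scientific notation: 3.94e+11
print(f"{growth:.1%}")       # percentage: 7.8%
print(f"{'left':<10}|")      # pad to width 10, left aligned
print(f"{42:05d}")           # zero padded to width 5: 00042
```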

To store a multi-line string, use triple double-quotes `"""`.

In [15]:
my_string = """Pull on my
motherfucking
beads
"""

print(my_string)

Pull on my
motherfucking
beads



Otherwise, you will have to use the newline character `\n` within your string. Pressing Enter in the middle of a string enclosed in single `'` or double `"` quotes will cause a `SyntaxError`.

In [16]:
my_string2= "alskdjfkjsadfl\nsjdklfjaks\nldflkasjdkfjklasdj\nfasdjfkljasdfkjsakldjfl\nkasjdfklasjdklfjklasdfjasdflkjaskldjfklasjdf"

print(my_string2)

alskdjfkjsadfl
sjdklfjaks
ldflkasjdkfjklasdj
fasdjfkljasdfkjsakldjfl
kasjdfklasjdklfjklasdfjasdflkjaskldjfklasjdf


## 2b - String Operations

### 2bi) - Concatenation

You can add strings together, or `concatenate` them.

In [17]:
first_name = 'Richard'
last_name = 'Symonds'
full_name = first_name+last_name
print(full_name)

RichardSymonds


In [18]:
print(first_name+' '+last_name)

Richard Symonds


Note that Python will not automatically add spaces between your strings. To join a collection of strings with a specified separator between them, use the `.join()` method, which is called on the separator. In the following example, we join first_name and last_name on a single space. Functions and methods will be covered in detail in a later section.

In [19]:
full_name = " ".join([first_name,last_name])
print("My full name is "+full_name)

My full name is Richard Symonds


In [20]:
my_list = ["super","cali","fragilistic","expialidocious"]
print(" ".join(my_list))

super cali fragilistic expialidocious
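One detail worth knowing: `.join()` only accepts strings, so numbers have to be converted first. A minimal sketch, with made-up list contents:

```python
numbers = [1, 2, 3]

# .join() only accepts strings, so convert each item first.
csv_line = ",".join(str(n) for n in numbers)
print(csv_line)                                  # 1,2,3

# The separator can be any string, including an empty one.
glued = "".join(["a", "b", "c"])
print(glued)                                     # abc
print(" -> ".join(["start", "middle", "end"]))   # start -> middle -> end
```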


### 2bii) - Indexing

You can also subset strings using an indexer, which in Python is denoted by square brackets. This will segue into our next section on `lists`.

![](https://media.geeksforgeeks.org/wp-content/uploads/List-Slicing.jpg)

*Source: [GeeksForGeeks](https://www.geeksforgeeks.org/python-lists/)*

Python is a *zero-based indexing* language, i.e. the first element has index zero, the second has index one, and so on. So if you want to subset "`d`" in "Richard", you subset it using `[6]`.

Alternatively, you can use `[-1]` to select the last element in the string, i.e. the first from the end.

If you want to subset a range of elements, you need to specify where the range starts and where it ends. The end index is the position of the last element you want plus one, because the element at the end index itself is excluded.

Confusing? Think of it as having pointers segmenting a word. In "Richard", if you want to subset '`R`' and '`i`', the elements will be denoted as `[0:2]`. `0` will refer to the point before '`R`', and 2 will refer to the point just after '`i`'.

In [21]:
print("first_name:", first_name)
print("First element (zero-th character) in first_name: ",first_name[0])
print("Last element (character -1) in first_name: ",first_name[-1])

first_name: Richard
First element (zero-th character) in first_name:  R
Last element (character -1) in first_name:  d


In [22]:
first_name[0:2]

'Ri'
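The "pointers between characters" picture can be checked with a few slices. This sketch redefines the string locally as `name` so it runs on its own.

```python
name = "Richard"

print(name[0:2])    # Ri   (indices 0 and 1; index 2 is excluded)
print(name[:2])     # Ri   (an omitted start means 0)
print(name[2:])     # chard
print(name[-3:])    # ard  (last three characters)

# Because the stop index is excluded, splitting at any point loses nothing.
print(name[:2] + name[2:] == name)   # True
# The length of a slice is stop minus start.
print(len(name[1:4]))                # 3
```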

You can specify not only the start and end, but also the step of your selection, i.e. how many positions to advance between selected characters. The format is `[start:stop:step]`.

If you don't specify a start or stop, Python takes them to be the beginning and the end of the string, respectively. So to print everything from index 3 onward (all characters after the 3rd), use `[3:]`

In [23]:
print(first_name[3:])
print("Giggidy")

hard
Giggidy


To print every alternate letter in the string, use a step of 2.

In [24]:
print(first_name[::2])

Rcad


What happens when you print a step of -1?

In [25]:
print(first_name[::-1])

drahciR


In [26]:
my_string="Supercalifragilisticexpialidocious"

In [27]:
my_string[::2]

'Sprairglsiepaioiu'

In [28]:
my_string[1::3]

'urlriscploo'
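Start, stop and step can all be combined in one slice. The sketch below redefines the string as `word` so that it is self-contained.

```python
word = "Supercalifragilisticexpialidocious"

print(word[0:5])       # Super
print(word[0:10:3])    # Seaf  (indices 0, 3, 6 and 9)
print(word[4::-1])     # repuS (from index 4 back to the start)
print(word[::-1][:5])  # suoic (first five characters of the reversed string)
```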

### Common methods

Apart from `.join()` which we have used, the following methods are very common when you want to modify your strings. This guide will not cover all methods but rather will give you a taste of what can be done.

> Note: We will cover functions in greater detail later but for now, just think of a method as a function that's specific to the object you are operating on.

#### .upper() and .lower()

Used to change the case of a string. 

In [29]:
print(first_name.upper())
print(first_name.lower())

RICHARD
richard
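A typical use of these methods is comparing text without regard to case. In this sketch, `typed` and `stored` are hypothetical values.

```python
typed = "RiChArD"      # hypothetical user input
stored = "richard"

print(typed == stored)                  # False: comparison is case sensitive
print(typed.lower() == stored.lower())  # True

# Strings are immutable: the methods return new strings.
print(typed)                            # RiChArD, unchanged
```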


#### `.split(character)`

You can also split a string into multiple sub-strings at a specified character. This is useful when dealing with strings that follow a particular pattern, e.g. comma-delimited values. If no character is given, the string is split on whitespace (spaces, tabs and newlines).

In [30]:
my_string = "Why,hello there,Richard Symonds"
print(my_string.split())
print(my_string.split(","))
print(my_string.split("e"))

['Why,hello', 'there,Richard', 'Symonds']
['Why', 'hello there', 'Richard Symonds']
['Why,h', 'llo th', 'r', ',Richard Symonds']
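Here is a sketch of splitting a comma-delimited record into usable values. The record contents are made up; `maxsplit` is the optional second argument of `str.split`.

```python
record = "Richard,Symonds,42"   # hypothetical comma-delimited record

fields = record.split(",")
print(fields)                    # ['Richard', 'Symonds', '42']

first, last, age = fields        # unpack into separate names
age = int(age)                   # split always returns strings

# maxsplit limits the number of cuts; join reverses a split.
print(record.split(",", 1))        # ['Richard', 'Symonds,42']
print(",".join(fields) == record)  # True
```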


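You can also find the position of a sub-string within a string with `.find()` or `.index()`. Both return the index where the first match starts; they differ only when there is no match: `.find()` returns `-1`, whereas `.index()` raises a `ValueError`.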

In [31]:
my_string="This is my rifle. There are many like it, but this one is mine."
my_string.find("rifle")

11

Python will search from left to right and only return the first match. In the case below, that is the "is" in "This" (position 2).

In [32]:
my_string.index("is")

2
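The two methods behave differently when the sub-string is missing, which this sketch shows on a shortened, self-contained string.

```python
text = "This is my rifle."

print(text.find("rifle"))    # 11
print(text.find("pistol"))   # -1 signals "not found"

try:
    text.index("pistol")
except ValueError:
    print("index() raises ValueError when the substring is missing")

# For a simple yes/no check, the in operator is clearer.
print("rifle" in text)       # True
```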

## 2c - Regular Expressions

The following is an advanced topic and you are not expected to know how to code it from scratch. This is meant to illustrate the power of knowing how to work with string data. See [this Medium article](https://medium.com/factory-mind/regex-tutorial-a-simple-cheatsheet-by-examples-649dc1c3f285) for a more in depth introduction.

![](https://i0.wp.com/www.novixys.com/blog/wp-content/uploads/2018/02/regex.png?w=700&ssl=1)

*Note: Not required for basic Python.*

Regular Expression, or `regex` or `regexp` for short, is a powerful tool for searching and manipulating text strings, particularly when processing text files. It is invaluable in parsing standardized reports and scraping data from the web.

Regex is supported in most scripting languages (such as Perl, Python, PHP, and JavaScript), in general purpose programming languages such as Java, and even in word processors such as Word for searching text.

You can quickly prototype regex in many sites including [regex101](https://regex101.com/)

We will use Apple's fiscal Q4 2022 earnings press release as an example:

In [33]:
apple_earnings = """CUPERTINO, CALIFORNIA OCTOBER 27, 2022 Apple today announced financial results for its fiscal 2022 fourth quarter ended September 24, 2022. The Company posted a September quarter record revenue of $90.1 billion, up 8 percent year over year, and quarterly earnings per diluted share of $1.29, up 4 percent year over year. Annual revenue was $394.3 billion, up 8 percent year over year, and annual earnings per diluted share were $6.11, up 9 percent year over year.
“This quarter’s results reflect Apple’s commitment to our customers, to the pursuit of innovation, and to leaving the world better than we found it,” said Tim Cook, Apple’s CEO. “As we head into the holiday season with our most powerful lineup ever, we are leading with our values in every action we take and every decision we make. We are deeply committed to protecting the environment, to securing user privacy, to strengthening accessibility, and to creating products and services that can unlock humanity’s full creative potential.”
“Our record September quarter results continue to demonstrate our ability to execute effectively in spite of a challenging and volatile macroeconomic backdrop,” said Luca Maestri, Apple’s CFO. “We continued to invest in our long-term growth plans, generated over $24 billion in operating cash flow, and returned over $29 billion to our shareholders during the quarter. The strength of our ecosystem, unmatched customer loyalty, and record sales spurred our active installed base of devices to a new all-time high. This quarter capped another record-breaking year for Apple, with revenue growing over $28 billion and operating cash flow up $18 billion versus last year.”
Apple’s board of directors has declared a cash dividend of $0.23 per share of the Company’s common stock. The dividend is payable on November 10, 2022 to shareholders of record as of the close of business on November 7, 2022.
Apple will provide live streaming of its Q4 2022 financial results conference call beginning at 2:00 p.m. PT on October 27, 2022 at apple.com/investor/earnings-call. This webcast will be available for replay for approximately two weeks thereafter.
Apple periodically provides information for investors on its corporate website, apple.com, and its investor relations website, investor.apple.com. This includes press releases and other information about financial performance, reports filed or furnished with the SEC, information on corporate governance, and details related to its annual meeting of shareholders.
"""

The following code finds all dollar amounts in the text: a `$` sign followed by digits, with optional comma separators and an optional decimal part.

In [34]:
import re
pattern = r"\$\d+,*\d*\.*\d*"
regex = re.compile(pattern)
regex.findall(apple_earnings)

['$90.1', '$1.29', '$394.3', '$6.11', '$24', '$29', '$28', '$18', '$0.23']
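The matches above are still strings. As a rough sketch of a possible next step, the code below uses a slightly different pattern with capture groups to pull out the number and an optional "billion" unit, then converts to `float`. The `sample` sentence and the pattern are assumptions written for this illustration, not the notebook's own method.

```python
import re

sample = "Revenue of $90.1 billion and earnings per share of $1.29."

# Group 1 captures the number; group 2 optionally captures "billion".
pattern = re.compile(r"\$(\d+(?:\.\d+)?)(?: (billion))?")

amounts = []
for number, unit in pattern.findall(sample):
    value = float(number)
    if unit == "billion":
        value *= 1e9          # scale to plain dollars
    amounts.append(value)

print(amounts)
```

With more than one capture group, `findall` returns a list of tuples, one tuple per match, which is why the loop unpacks `number` and `unit`.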

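The following code finds the word that comes immediately after each "Apple", skipping any suffix attached to it (i.e. "Apple" or "Apple's" or "Apple,"). The lookbehind `(?<=Apple)` anchors the match just after "Apple", `.*?` skips as few characters as possible up to the next space, and `(\w+)` captures the word that follows.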

In [35]:
pattern = r"(?<=Apple).*? (\w+)"
regex = re.compile(pattern)
regex.findall(apple_earnings)

['today', 'commitment', 'CEO', 'CFO', 'with', 'board', 'will', 'periodically']
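To see how the pieces of that pattern behave, it can help to run it on a short string. The `sample` sentence here is made up, and `re.IGNORECASE` is a standard flag of the `re` module.

```python
import re

sample = "Apple today said Apple's board met. apple.com is lowercase."

# Same pattern as above: lookbehind, lazy skip to the next space, captured word.
pattern = re.compile(r"(?<=Apple).*? (\w+)")
words = pattern.findall(sample)
print(words)       # ['today', 'board']

# Matching is case sensitive by default; re.IGNORECASE also matches "apple".
pattern_ci = re.compile(r"(?<=Apple).*? (\w+)", re.IGNORECASE)
words_ci = pattern_ci.findall(sample)
print(words_ci)    # ['today', 'board', 'is']
```

Note that the lazy `.*?` skips everything up to the next space, so for "apple.com is" it skips ".com" and captures "is".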