# Python Bootcamp (oDCM)

*After installing Anaconda and going over the first 3 chapters of the [Introduction to Python](https://learn.datacamp.com/courses/intro-to-python-for-data-science) DataCamp course, you should have an understanding of variables, lists, and functions. We therefore assume you know how to open a Jupyter Notebook and perform basic operations in Python. In this tutorial, we fill in the remaining gaps in the knowledge required for web scraping and working with APIs.*

--- 

## Learning Objectives

Students will be able to: 
* Apply conditional logic using if-else statements
* Loop over a list of elements 
* Define and add items to a dictionary
* Write their own functions using parameters
* Handle common error messages and debug their code
* Read and write text files 

--- 

## Acknowledgements
This course draws on a variety of online resources which can be retrieved from the [course website](https://odcm.hannesdatta.com/#student-profile--prerequisites). 


--- 

## Support Needed?
For technical issues outside of scheduled classes, please check the [support section](https://odcm.hannesdatta.com/docs/course/support) on the course website.

---

## 1. Conditional Logic

### 1.1 Comparison Operators
**Importance**  
In one of the very first DataCamp exercises you were asked to assign the value `100` to a variable `savings` with the statement `savings = 100`. In practice, you often want to make a decision based on whether something is true or false. For example, if we reach a negative account balance (`balance < 0`), we want to transfer money from our savings account to our checking account. 

In Python we can make such comparisons with comparison operators, which you have seen before in your math classes: 

| Operator | Example | What it does | 
| :--- | :--- | :--- |
| > | `a > b` | `True` if **a** is greater than **b** |
| < | `a < b` | `True` if **a** is less than **b** |
| >= | `a >= b` | `True` if **a** is greater than or equal to **b** |
| <= | `a <= b` | `True` if **a** is less than or equal to **b** |



**Let's try it out!**

Now assume both `a` and `b` take on the value `1`. Before running the cell below, try to work out whether each of the four comparisons (`>`, `<`, `>=`, `<=`) evaluates to `True` or `False` (values also known as booleans). 

In [7]:
a = 1 
b = 1
print(a > b)
print(a < b)
print(a >= b)
print(a <= b)

False
False
True
True


Rather than looking for values greater or smaller than, we can also check whether items take on the same value (`a == b`) or a different value (`a != b`). Note the difference between variable assignment (single `=`) and the comparison operator (double `==`). For example, `savings = 100` means we create a new variable that we assign a value of `100`, whereas `savings == 100` checks whether the variable already contains a value of `100` (if not it will return `False`). 

In [9]:
savings = 100 # variable assignment
print(savings == 100) # comparison 
print(savings != 100) # comparison (False because it is 100!)

True
False


### 1.2 If-statements

**Importance**  
We can use these comparison operators as inputs for if-statements, which tell the computer to choose a different path based on the outcome of a comparison. 

Python first checks whether the first condition is met (`if some condition`). If not, it moves on to the next condition (`elif some condition`). If none of these conditions is true, it will `do something` according to the `else` clause.
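
The general pattern can be sketched as follows. The variable `temperature`, the thresholds, and the messages are made up for illustration only:

```python
# Hypothetical example: variable name and thresholds are invented for illustration
temperature = 25

if temperature < 0:          # checked first
    advice = "Wear a winter coat"
elif temperature < 15:       # only checked if the first condition is False
    advice = "Bring a jacket"
else:                        # runs if none of the conditions above is True
    advice = "A t-shirt will do"

print(advice)
```

Only one of the three branches runs, even if a later condition would also have been true.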

In the bank account example, our program could look like this:

In [16]:
if balance < 0: 
    print("You should top up your checking account to avoid paying interest")
elif balance == 0: 
    print("Your checking account balance is exactly €0.00, be careful when making new payments!")
else: 
    print("You have a positive balance")

You should top up your checking account to avoid paying interest


A few remarks:   
* Each condition ends with a colon (`:`). This tells Python that the condition is complete and that an indented block follows.  
* The statements below each `if`, `elif`, and `else` clause need to be indented (that is, a TAB or 4 spaces to the right). Indentation is not just for readability: Python uses it to determine which statements belong to which clause. 
* There can be multiple `elif` statements, for example if you want to display another message when the balance is positive but very low (e.g., `balance < 50`).
* Note how we can derive that the `balance` must be positive from the fact that it is neither negative (`balance < 0`) nor equal to zero (`balance == 0`).

**Let's try it out!**  
Add a variable `balance` to the top of the cell and assign it a value of `-10` and run the cell. Now do it again and change the value to `0` and `10`. Does the output match your expectations? 

**Exercise 1**  
Say that we want to develop a program that advises students on whether they can take the Online Data Collection and Management course. On the [course catalogue](https://catalogus.tilburguniversity.edu/osiris_student_tiuprd/OnderwijsCatalogusKiesCursus.do) page we find that the course is taught to Marketing Analytics students. Furthermore, Research Master students can audit this course upon approval of the instructor. 

Use conditional logic to write a program that checks the value of a variable `study` and prints one of the following statements: 
* *You satisfy the course requirements* (for Marketing Analytics students)
* *Please send an email with your motivation to enroll in the course to Hannes Datta* (for Research Master students)
* *You do not satisfy the course requirements. Please contact your educational officer if you want to enroll in the course.* (for all other studies, e.g. "Psychology")

In [None]:
# your code goes here!

In [18]:
# solution
study = "Marketing Analytics"  # also try "Research Master" or "Psychology"
if study == "Marketing Analytics":
    print("You satisfy the course requirements")
elif study == "Research Master":
    print("Please send an email with your motivation to enroll in the course to Hannes Datta")
else:
    print("You do not satisfy the course requirements. Please contact your educational officer if you want to enroll in the course.")

You satisfy the course requirements


### 1.3 And / Or Operators

**Importance**  
In reality, you may want to check for multiple conditions. For example, employees with a full-time job go to work every workday provided that it's not a holiday. In other words, both conditions must be met (`and`). Alternatively, you can check if at least one of the conditions is satisfied (`or`). For example, you go to bed if you're tired or if it's bedtime already. Lastly, you may require a value to be NOT true: for example, workers go to work on workdays, not during the weekend (`not`). 


| Operator | Example | What it does | 
| :--- | :--- | :--- |
| and | if workday and no_holiday: <br> &nbsp;&nbsp;&nbsp;&nbsp;print("Go to work!") | `True` if both `workday` and `no_holiday` are true |
| or | if tired or bed_time: <br> &nbsp;&nbsp;&nbsp;&nbsp;print("Go to sleep!") | `True` if either `tired` or `bed_time` is true (or both) |
| not | if not weekend: <br> &nbsp;&nbsp;&nbsp;&nbsp;print("Go to work!") | `True` if `weekend` is false (it flips the boolean) |

**Let's try it out!**  
Change the boolean values (from `True` to `False` and vice versa), and see how it affects the output!

In [20]:
workday = True
no_holiday = True

if workday and no_holiday:
    print("Go to work!")

Go to work!


In [22]:
tired = True
bed_time = False

if tired or bed_time:
    print("Go to sleep!")

Go to sleep!


In [21]:
weekend = False

if not weekend:
    print("Go to work!")

Go to work!


**Exercise 2**  
In addition to the study program, the oDCM [course catalogue](https://catalogus.tilburguniversity.edu/osiris_student_tiuprd/OnderwijsCatalogusKiesCursus.do) page also describes that students are expected to have acquired a working knowledge in Python. Extend your program of Exercise 1 such that it not only checks whether students have the right study program, but also the required `prior_knowledge` (boolean variable).

In [None]:
# solution (it's not necessary to check whether prior_knowledge == True)
study = "Marketing Analytics"  # also try "Research Master" or "Psychology"
prior_knowledge = True         # also try False
if study == "Marketing Analytics" and prior_knowledge:
    print("You satisfy the course requirements")
elif study == "Research Master" and prior_knowledge:
    print("Please send an email with your motivation to enroll in the course to Hannes Datta")
else:
    print("You do not satisfy the course requirements. Please contact your educational officer if you want to enroll in the course.")

Once you chain `and` and `or` statements things become more complex. Suppose that we want to calculate whether a student passed the course or not. As mentioned in the grading criteria, students pass the course if the total course grade is >= 5.5, and the exam is passed (>= 5.5). Since the student scored a 4.3 in her first attempt, she needed to take the resit for which she scored a 5.6. Still, her final grade was only a 5.4 because her team did not do really well in the team project. 

According to the grading criteria, she therefore did not pass the course. Yet the boolean expression below evaluates to `True`, why is that? 

In [28]:
final_grade = 5.4
exam_grade = 4.3
resit_grade = 5.6

final_grade >= 5.5 and exam_grade >= 5.5 or resit_grade >= 5.5

True

Python does not simply read this expression from left to right: `and` binds more tightly than `or`, so the `and` part is grouped first. The expression therefore means: 
* The final grade and the exam grade must both be greater than or equal to 5.5
* OR the resit grade must be greater than or equal to 5.5 (regardless of the final grade)

We can fix this by explicitly enforcing the structure of the comparisons with parentheses:

In [29]:
final_grade >= 5.5 and (exam_grade >= 5.5 or resit_grade >= 5.5)

False
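
To make the grouping visible, the sketch below compares the expression without parentheses to two explicitly grouped versions. The grades are the ones from the example; the variable names `implicit`, `explicit`, and `intended` are chosen here for illustration:

```python
# Grades taken from the example above
final_grade = 5.4
exam_grade = 4.3
resit_grade = 5.6

# Without parentheses, `and` is evaluated before `or` ...
implicit = final_grade >= 5.5 and exam_grade >= 5.5 or resit_grade >= 5.5
# ... which is the same as grouping the `and` part explicitly
explicit = (final_grade >= 5.5 and exam_grade >= 5.5) or resit_grade >= 5.5
# The grading criteria require the `or` part to be grouped instead
intended = final_grade >= 5.5 and (exam_grade >= 5.5 or resit_grade >= 5.5)

print(implicit, explicit, intended)
```

The first two values are identical, which shows that leaving out the parentheses is equivalent to grouping the `and` part.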

**Exercise 3**  
The minimum age for driving in the Netherlands is 17, but you cannot get a full license until the age of 18. In between you need to be accompanied by a coach (e.g., parent who sits next to you in the car). 

The code snippet below should reflect this policy but currently has some issues. Add parentheses to the conditional expressions below such that it prints the expected output.

In [38]:
driver_license = False
age = 17
coach = True

if driver_license and age >= 18 or age == 17 and coach: 
    print("You're allowed to drive!")    
else: 
    print("You're not allowed to drive!")    

You're allowed to drive!


In [39]:
# solution
if driver_license and (age >= 18 or age == 17 and coach): 
    print("You're allowed to drive!")    
else: 
    print("You're not allowed to drive!")      

You're not allowed to drive!






* We can use these comparisons as input for our if-statements

* If-statements represent different paths a program can take based on some type of comparison of input 
    * if some condition is True: 
        * do something
    * elif some other condition is True: 
        * do something
    * else: 
        * do something

    * There can be more than one elif
    * The colons (:) are required; they indicate that an indented block follows 
    * Note the indentation (TAB or 4 spaces); the snippet below gives an error because the `print` line is not indented:  
        if name == "abc":  
        print("gives an error")
           
 
* Logical operators
    * if bachelor == ... or bachelor == ... : (make a diagram of whether students are admitted to the program) 
    * price of a ticket depending on your age, student card, etc.
    * combining parentheses: (a and b) or c 
        * Exercise: put parentheses around different logical statements to make it easier to break down 



## 1. Jupyter Notebooks


### 1.1 Opening Files
* From local disk user directory
* From a download in user directory 

### 1.2 Code & Markdown
* Code 
    * `print("...")`
    * `print()` statements in Python files vs Jupyter Notebooks
    * Multiple operations below one another 
* Markdown 
    * Headers, regular text, italics, bold
    * New line (two trailing spaces) 
    * Bullet points and numbered lists
    * Inline code

* Run cells
* Add new cells or remove existing cells 
* Moving cells up or down


### 1.x Documentation
* Official documentation: https://docs.python.org/3.8/library/index.html
* Stack Overflow 
* Python 2 documentation has been around longer but is not always up to date because of syntax changes (e.g., `print X`)


## 2. Variables 
* Understand how to assign and use variables 
* Learn Python variable naming restrictions and conventions 
* Learn and use some of the different data types available in Python
* Understand how to convert data types (string to number) 

### 2.1 Introduction
* A variable in Python is like a variable in mathematics: it is a named symbol that holds a value 
    * Variable name on the left; value on the right of the equal sign
    * Reusability (a very long number that you don't want to type out every time) 
    * Dynamic data (it can take on different values) 
    * Variables must be assigned before they can be used
    * `input()` for user input 
    * page_num = 1
        * recall: page_num -> gives 1
        * page_num = page_num + 1 -> 2
        * reassign values: page_num = 10 (overwrites the value it held until then) 
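
The `page_num` notes above can be run as a short sketch; the helper name `after_increment` is added here only to keep the intermediate value visible:

```python
page_num = 1               # assign: name on the left, value on the right
page_num = page_num + 1    # reuse the current value to compute a new one
after_increment = page_num
page_num = 10              # reassigning overwrites the previous value

print(after_increment, page_num)
```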
    
    
### 2.2 Naming conventions
* Must start with a letter or underscore (so `24_opening_hours = "00:00 - 23:59"` is not allowed, but `my_1st_variable` is fine) 
* The rest of the name must consist of letters, numbers, or underscores (no `@`, so `odcm@uvt.nl` is not a valid name) 
* Names are case-sensitive (course_name != Course_Name) 

* Convention: guidelines/style - what most people should do
    * Variables should be snake_case (underscores between words) 
        * Exercise: rewrite code so that it matches Python conventions
    * Variables should also be lowercase (unless it really is a constant: PI = 3.14)
        * lower_snake_case (variables)
        * UpperCamelCase (refers to a class)
        * CAPITAL_SNAKE_CASE (constants)
        * lower-dash-case (not valid in Python, because `-` is the minus operator)
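
One way to check the naming rules without triggering a syntax error is the built-in string method `str.isidentifier()`. The sketch below applies it to the names mentioned in the notes; the dictionary `validity` is introduced here for illustration:

```python
# Candidate names taken from the notes above
candidates = ["24_opening_hours", "my_1st_variable", "odcm@uvt.nl", "course_name"]

# str.isidentifier() reports whether a string follows Python's naming rules
validity = {}
for name in candidates:
    validity[name] = name.isidentifier()
    print(name, "->", validity[name])
```

Note that `isidentifier()` only checks the characters; it does not tell you whether a name follows the snake_case convention.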
        
        
### 2.3 Comments
* Add comments to your code (shortcut: Cmd + /)
    * Markdown comments
    * Code comments
    * Multiline comments
    * Especially if you're starting out; code is not always self-explanatory
    * If you're working in a team 
    * Comment part of your code (rather than deleting it - to test other parts of the code) 

        
## 3. Data Types

| data type | description | 
| :--- | :---|
| bool | True or False values |
| int | an integer (1,2,3) |
| str | (string) a sequence of characters ("Roy") |
| list | an ordered sequence of values of other data types, e.g. [1,2,3] or ["a", "b", "c"] |
| dict | a collection of key-value pairs, e.g. {"first_name": "Colt"} |


* Strings
    * String literals can be declared with either single (') or double (") quotes
    * The two are equivalent; which one you use is a matter of style, but stick to the same convention throughout a file 
    * Quotes inside quotes: "he said "hello there!"" does not work because the inner quotes end the string
        * 'he said: "hello there!"'
        * "he said: \"hello there!\""
    * new_line = "hello \n world!" (is split over multiple lines; you only see this in a print statement)
    * type(8) vs type("8")
    * len("some text") 
    * String concatenation
        * = putting two strings together
        * str_one = "your", str_two = "face", str_three = str_one + " " + str_two 
        * print("Hello there and welcome to the game, " + username)
            * greeting = None
            * name = None
            * greet_name = greeting + " " + name
        * Different data types
            * 8 + "hello" -> TypeError
    * f-strings
        * x = 10 
        * formatted = f"I've told you {x} times already!"
        * takes what inside the {} and turns it into a string
    * Indexing
        * Indices always start at 0 in Python (contrary to R)
        * Index out of range
        * variable[len(variable) - 1] (the last character)
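
The string notes above can be tried out with the minimal sketch below. The value of `username` is a made-up example, and the `try`/`except` block is used only so that the deliberate `TypeError` does not stop the cell:

```python
username = "Daisy"  # hypothetical value for illustration
x = 10

# Concatenation glues strings together with +
welcome = "Hello there and welcome to the game, " + username

# Mixing an int and a str with + raises a TypeError
try:
    8 + "hello"
    mixed_types_ok = True
except TypeError:
    mixed_types_ok = False

# An f-string converts whatever is inside {} to a string
formatted = f"I've told you {x} times already!"

# Indexing starts at 0; the last character sits at position len(...) - 1
first_char = username[0]
last_char = username[len(username) - 1]

print(welcome)
print(formatted)
print(first_char, last_char, mixed_types_ok)
```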
        
        
* Integers
    * Understand difference between ints and floats
    * Decimal number takes up much more space
    * `type(9)` vs `type(9.0)`
    * 1 + 1.0 = 2.0 (int + float = float) 
    * `**` = exponentiation, `%` = modulo, `//` = integer division
    * order of operations 
    * PEMDAS (parentheses, exponents, multiplication, division, addition, subtraction)   
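
A short sketch of these number operations; all variable names are chosen here for illustration:

```python
print(type(9), type(9.0))    # int versus float

mixed_sum = 1 + 1.0          # int + float gives a float
power = 2 ** 3               # exponentiation
remainder = 9 % 2            # modulo: remainder after division
whole = 9 // 2               # integer division: rounds down
no_parens = 2 + 3 * 4        # multiplication before addition
with_parens = (2 + 3) * 4    # parentheses first

print(mixed_sum, power, remainder, whole, no_parens, with_parens)
```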

* None 
    * Not a string
    * Starts with a capital "N"
    * Way for Python to express the concept of "nothing"
    * Example: name = "Daisy", age = 30, child = None (rather than leaving the variable out entirely, it is better to set it to None explicitly, because the variable then exists as a container) 
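
As a minimal sketch of the example above: the variable `child` exists but holds "nothing". The name `has_child` is added here for illustration:

```python
name = "Daisy"
age = 30
child = None  # explicitly "nothing", but the variable exists

print(type(child))

# `is` / `is not` is the usual way to compare against None
has_child = child is not None
print(has_child)
```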
       

                
## Looping in Python
* Naturally when programming there are a lot of things that you want to repeat. 
* Print numbers 1 to 7 manually -> then automate it
* Webshop example: instead of writing this out for each object: print name, print price, etc. 
* "for" refers to the idea that you want to do something for every item in the list (or a number in a range) 
* for item in iterable_object: 
    do something with item 
    * iterable object is some kind of collection of items (e.g., list of numbers, a string of characters, a range)
    * item is a new variable that can be called whatever you want
    * item references the current position of our iterator within the iterable. It will iterate over (run through) every item of the collection and then go away when it has visited all items. 
* range is a way of quickly generating sequences of numbers 
    * first number is inclusive, last number is exclusive: range(1,8) is 1 to 7 
    * if you only provide 1 number it assumes you start at zero (range(8))
    * you don't see the numbers you generated until you wrap the range in `list()`
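
A minimal sketch of a `for` loop and of `range`, assuming a made-up list of webshop products:

```python
# Hypothetical webshop data for illustration
products = ["laptop", "phone", "tablet"]

# `product` takes on each item of the list in turn
shouted = []
for product in products:
    shouted.append(product.upper())
print(shouted)

# range(1, 8) runs from 1 up to and including 7
one_to_seven = list(range(1, 8))
print(one_to_seven)

# With a single argument the sequence starts at 0
zero_to_seven = list(range(8))
print(zero_to_seven)

# Add up the numbers 1 to 7
total = 0
for num in range(1, 8):
    total = total + num
print(total)
```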
    
* while loops
    * everything you can do with a for loop you can do with a while loop (but the while loop offers more options because it has a built-in if-statement)
        * while loops require more careful setup than for loops since you have to specify the termination condition manually 
    * starts with a boolean expression 
    * anything in the loop will run as long as the boolean expression is True (while loops continue to execute while a certain condition is truthy, and will end when it becomes falsy) 
    * while im_tired: #seek caffeine
    * user_response = None
      while user_response != "please": 
          user_response = input("Ah ah ah, you didn't say the magic word")
    * be careful! if the condition doesn't become false at some point, your loop will continue forever! (= infinite loop) 
    * msg = input("what's the secret password?")
      while msg != "bananas":
        print("WRONG!")
        msg = input("what's the secret password?")
      print("CORRECT!")
    * while loops are more flexible 
    * for loops are the better choice if you want to run something a set number of times, and they are shorter to write 
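
The password loop above waits for keyboard input. The sketch below follows the same logic but replaces `input()` with a made-up list of attempts, so it can run without interaction; `attempts`, `position`, and `wrong_guesses` are names introduced here for illustration:

```python
# Simulated user input (hypothetical); in a notebook you would call input() instead
attempts = ["apples", "pears", "bananas"]

position = 0
msg = attempts[position]
wrong_guesses = 0

# Keep looping as long as the password is wrong
while msg != "bananas":
    print("WRONG!")
    wrong_guesses += 1
    position += 1
    msg = attempts[position]   # moves towards the termination condition

print("CORRECT!")
```

Because the last attempt is correct, the condition eventually becomes false and the loop ends; without such a guarantee the loop would run forever or fail once the list is exhausted.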
      
* convert a for loop to a while loop
    * for num in range(1,11): print(num)
    * num=1, while num < 11: print(num) num += 1
    
* break keyword 
    * The keyword `break` gives us the ability to exit out of while loops whenever we want
    * while True: 
        command = input("Type exit to exit: ") 
        if command == "exit": 
            break 
    * it breaks out immediately 
        * for time in range(times): 
            break 
            print("AADSFASDF") # this line is never even executed
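
`break` works the same way in a `for` loop. In the sketch below, the list `numbers` and the threshold of 10 are assumptions made for illustration:

```python
numbers = [3, 8, 15, 4, 23]  # hypothetical data

first_large = None
checked = 0
for number in numbers:
    checked += 1
    if number > 10:
        first_large = number
        break              # leave the loop immediately
    print(number, "is not larger than 10")

print(first_large, checked)
```

The loop stops at the third item, so the remaining numbers are never inspected.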
    



## Lists


## Dictionaries

## Functions

## Modules

## Error Handling

## Reading Text Files




## Data Structures
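* Store other data types inside of them -> data structures

* List
    * An ordered sequence of values 

* Dictionary 
    * Values are looked up by key rather than by position (a dictionary has no numeric index like a list)
    * Pairs of keys and values 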
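
A minimal sketch of both data structures. The list values are made up for illustration; the dictionary reuses the `{"first_name": "Colt"}` example from the data types table:

```python
# A list keeps values in order; positions start at 0
courses = ["oDCM", "Marketing Analytics"]   # hypothetical values
courses.append("Python Bootcamp")           # add an item to the end
print(courses[0], len(courses))

# A dictionary stores key-value pairs
student = {"first_name": "Colt"}
student["study"] = "Marketing Analytics"    # add a new key-value pair
print(student["first_name"])
print(student)
```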

    
    

    
    

In [14]:
# print the following beautiful art using both a for loop and while loop
# if the emoji doesn't print on your machine use a hash or a star 
print("\U0001f600")
print("\U0001f600 \U0001f600")

😀
😀 😀


In [17]:
for num in range(1,11):
    print("\U0001f600" * num)

😀
😀😀
😀😀😀
😀😀😀😀
😀😀😀😀😀
😀😀😀😀😀😀
😀😀😀😀😀😀😀
😀😀😀😀😀😀😀😀
😀😀😀😀😀😀😀😀😀
😀😀😀😀😀😀😀😀😀😀


In [19]:
# while-loop version of the pattern above (with nested for loops you could print the whole pattern 3 times in a row)

num = 1

while num < 11: 
    print("\U0001f600" * num)
    num += 1

😀
😀😀
😀😀😀
😀😀😀😀
😀😀😀😀😀
😀😀😀😀😀😀
😀😀😀😀😀😀😀
😀😀😀😀😀😀😀😀
😀😀😀😀😀😀😀😀😀
😀😀😀😀😀😀😀😀😀😀


---

## 1. Numbers, Operators, and Comments





    


In [5]:
9//2

4

--- 

## 2. Jupyter Notebook

* Starting up Jupyter Notebook
    * Anaconda Navigator (but: starts up in user directory)
    * Via Anaconda Terminal: learn how to navigate the command line (directory structure, pwd, cd, ls, relative/absolute file paths)
  
* Installing Python packages
    * Anaconda Terminal (vs pip install)
    
* Using the Anaconda interface
    * Opening files 
        * From local disk (user directory)
        * From a download from the internet, in user directory
        * From another directory on the computer
    * Run cells with `Shift + Enter` (control flow from top to bottom)
    * Interactivity allows for instant feedback
    * Add new cells or remove existing cells 
    * Moving cells
    
* Writing markdown cells
    * Headers, regular text, italics, bold
    * Links / images
    * Bullet points and lists
    * Inline code
   
* Getting help
    * Stackoverflow
    * Help [...]


## 3. Python Fundamentals
* Data types
    * Define numeric, string, and boolean variables
        * Integers and floats
        * Double vs single quotes (strings)
        * Escape sequences
    * Write inline and multiline comments 
    * Convert variables into different type
    * Math operations (+, -, *, /, mod) 
    * String operations (concatenate, replace, join, .lower(), .upper()) 


* Data structures
    * Create a list (with different data types)
    * Slicing a list 
    * Iterating over lists
    * Extending list elements (append)
    * Create a dictionary 
    * Return values by passing keys
        * Nested dictionaries
        * .get() vs ['key'] 
    * Adding items to dictionary


* Conditional logic
    * If-else statements
    * Multiple elifs
    * Logical `and` / `or` (Python uses words rather than `&` / `|`)
    * Logical `not` (Python uses `not` rather than `!`)
    * Comparison (`==`) versus assignment (`=`)


* Looping
    * Loop using counter (range)
    * Loop over list elements
    * Looping over dictionary at different “levels” (keys and values)


* Functions 
    * Parameters
    * Return keyword
    * Scope 
    * Docstrings


* Modules
    * Installing modules
    * Importing modules 
    * Selective imports (from … import ... ) 
    * Aliasing modules (as …)
    * Look up documentation (help statement)


* HTTP requests
    * HTTP verbs (get, post)
    * Requesting JSON
    * JSON formatter plugin


* Debugging and error handling
    * Common types of errors
    * Try and except blocks
    * Else and finally


* Files
    * Open and read text files
    * With statements
    * Writing to text files

* https://github.com/kimfetti/Conferences/tree/master/PyCon_2020
* https://www.youtube.com/watch?v=RUQWPJ1T6Zc&t=190s
* https://github.com/hancush/web-scraping-with-python/blob/master/session/web-scraping-with-python.ipynb#HTML-basics
* https://www.udemy.com/course/the-modern-python3-bootcamp/learn/lecture/7991196#overview
* https://campus.datacamp.com/courses/web-scraping-with-python/introduction-to-html?ex=1
* https://realpython.com/python-web-scraping-practical-introduction/
* https://github.com/CU-ITSS/Web-Data-Scraping-S2019