# Introduction to Python

Python is a general-purpose, high-level programming language. It is well suited to developing desktop GUI applications, websites and web applications, and it is one of the bread-and-butter programming languages for data science. Its simple, easy-to-learn syntax makes it easier to keep the code base readable and the application maintainable.

Python is great because it:
- **_Simplifies complex software development_**: Because of its robust libraries, you can focus on design instead of low-level code details.
- **_Has many open source frameworks and tools_**: With a *huge* online developer community, there are open source packages for everything from web development to data science, machine learning and much more.
- **_Is easy to read and maintain_**: Python and the PEP 8 style guide encourage simple, readable code, which allows for easy upkeep and collaboration.
- **_Is compatible with major platforms and systems_**: Pretty much everything has a Python API that lets you interface with it, from operating systems through to service providers (e.g. AWS, GCP, Azure).

However, Python is not a panacea and it isn't so great for:
- **_Super fast, real-time functionality_**: Whilst many Python libraries use C under the hood to keep operations fast, this is not consistent across libraries, which can make it difficult to sustain real-time operations (e.g. Optimal Reality using AWS Panorama).
- **_Specific tasks_**: Python has a lot of breadth but lacks depth when it comes to some tasks like RPA and data storage, where bespoke languages are often better suited.



## Pre-Workshop: Understanding How Jupyter Notebooks work

One of the great tools created for Python is Jupyter Notebooks. These act as little Python environments where you can develop and run code in a live environment to get results and then explore/present the results in an easy to digest manner. 

But before we dive in, there are a few things to note about how notebooks work.

1. Each section runs as a cell. These cells can be:
  - **Markdown** - to present nicely formatted (same formatting as GitHub) text like this
  - **Code** - these sections are used to run code
2. Code cells run sequentially: On the left hand margin for each code section you'll see a set of brackets [ ]. When you run a cell, a number will be put in there to reflect the order that the code was run in. This means **your code remembers what has already run** but **does not remember what has not run yet**
3. Cells can be run individually or all at once.
 - To run cells individually, either click the Run button on the top menu, or press `Shift` + `Enter`
 - To run a cell and *not select* the next cell press `Ctrl` + `Enter`
 - To run all cells, either click the fast-forward button on the top menu or press `Ctrl` + `A` to select all cells and then `Ctrl` + `Enter` to run them
4. The notebook can be reset by restarting the kernel. Think of the kernel as the brain of your notebook - it has the memory of what you've done as well as knowing what language to read things in. We can:
 - Interrupt the kernel: when code is running, the kernel is directing what to do. If our code gets stuck (e.g. in an infinite loop) we may want to stop it. We can do this by clicking the Stop button on the top menu
 - Reset the kernel: by clicking the replay button on the top menu, which resets everything in our notebook (clears all variables and starts the notebook as if it had been opened for the first time)
 - *Advanced* Configure our kernel: if we want to change what language we are using in our notebook, we can change this by changing the kernel from the dropdown "Kernel" menu.
 
 
 **Tip: for quick information about a function, place the cursor on its name and press `Shift` + `Tab`**
 

The function below simply adds 1 to whatever input number, `my_num`, is provided. However, we need to define a function before we can use it. Try running each cell in order. Notice that the first cell shows an error because we haven't defined the function `Add_One` yet, but the third cell does run. If we keep running the third cell, we keep adding 1 to `test_number`.

#### Question:
What will `test_number` be if we re-run the first cell?

In [None]:
test_number = 1
test_number = Add_One(test_number)
print(test_number)

In [None]:
def Add_One(my_num):
    return my_num + 1

What happens if we run this cell once? What happens if we run it multiple times?

In [None]:
test_number = Add_One(test_number)
print(test_number)

## Section 1: Variables

Variables are how programming languages keep track of data values that need to be constantly updated. We saw from the code snippet above an example of a variable `test_number` which changed every time we ran a cell of code. Note that variables are only remembered for as long as you have your Python environment set up and get wiped when we reset the kernel.

- basic unit of storage for a program
- can be created or destroyed
- at a hardware level, a variable is a reference to a location in memory
- programs perform operations on variables

When in doubt as to what type a variable is, you can always check by entering: `type(<your_variable_here>)`
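
As a quick illustration of the check above, the sketch below calls `type()` on a few values. The variable names (`days_counted`, `rainfall_mm`, `city_name`) are made up for this example and are not used elsewhere in the notebook.

```python
# Hypothetical variables, invented purely for illustration
days_counted = 7          # a whole number
rainfall_mm = 12.5        # a number with a decimal point
city_name = "Melbourne"   # text

print(type(days_counted))  # <class 'int'>
print(type(rainfall_mm))   # <class 'float'>
print(type(city_name))     # <class 'str'>
```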

<font color='red'>As you progress through programming you will also come across objects. An object is just a collection of variables together with the operations that can be performed on them.</font>



### Best Practices for Variables
When working with variables, there are a couple of conventions that make life easier. These are:
- Remembering that **variables are case sensitive**: This means that `test_variable` is different to `Test_Variable`
- Give variables meaningful names: Whilst calling a variable `Blah` may be easy to do, it doesn't tell us what it is. For example if we were dealing with a temperature measuring program, maybe we'd store the current temperature as `Temp_Current`
- Variables can also include numbers: You can code variables to have a number to differentiate between them e.g. `Score_1` is a valid variable name and can be used to differentiate from `Score_2`.
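
To make the case-sensitivity point concrete, here is a minimal sketch. The names `score_1` and `Score_1` are invented for illustration; Python treats them as two entirely separate variables.

```python
# Hypothetical names for illustration: these are two different variables
score_1 = 10
Score_1 = 99

print(score_1)             # 10
print(Score_1)             # 99
print(score_1 == Score_1)  # False
```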


### Main datatypes: 
Numbers, strings, lists, tuples, sets, dictionaries, files.

### 1.1 Numerical Data:
Numerical data is the most basic form of variable. 

As we can see below, our variable `test_number` is an integer (whole number). 

Let's create some floating number variables and see how they interact.

In [None]:
type(test_number)

In [None]:
Melb_Temp = 26.5
Temp_Drop = 1.7
New_Temp = Melb_Temp - Temp_Drop
print(New_Temp)

#### Exercise 1 - Basic number operations

Now it's your turn. Create two variables (giving them meaningful names) and *multiply* them together. Then print the output to the console.

#### Challenge 
What happens to the data type of `test_number` if we multiply it by 1.0? Why is this the case?

### 1.2 Text Data

Text data is another useful type of data which is often used when doing things as simple as file manipulation (e.g. renaming files in bulk) or doing complex things like Natural Language Processing. 

Note that strings don't have to contain letters; we can also have numerical strings.

Let's explore how it works below.

In [None]:
string_1 = "Hello"
print(type(string_1))
print(string_1)

Strings can also be added together, just like numbers can, to concatenate them.

In [None]:
string_2 = " World!"
string_1 + string_2

One thing to note is that different data types don't necessarily play well together. Let's see what happens when we try to do '1' + 1.

In [None]:
string_number = '1'
string_number + 1

#### Challenge
We can fix this through a process called typecasting, which is where we convert from one type of data into another. Let's see how typecasting works below.

In [None]:
print(type(string_number))
print(type(int(string_number)))

Now that you know how to typecast, edit the following code to make 1+1=2
> `'1' + 1`

### 1.2.1 String Functions

Strings come equipped with a set of built-in functions (methods) that let you modify a string, check its properties, or derive new information from it. These make string manipulation more effective and efficient, and they are powerful tools in the Python toolbox.



---

**Table of Popular String Functions:**

| Function              | Description                                    | Example Usage                 | Result           |
|-----------------------|------------------------------------------------|-------------------------------|------------------|
| `str.upper()`         | Converts all characters in string to uppercase | `"hello".upper()`             | "HELLO"          |
| `str.lower()`         | Converts all characters in string to lowercase | `"HELLO".lower()`             | "hello"          |
| `str.capitalize()`    | Capitalizes the first character of the string  | `"hello world".capitalize()`  | "Hello world"    |
| `str.title()`         | Capitalizes the first character of each word   | `"hello world".title()`       | "Hello World"    |
| `str.startswith(sub)` | Checks if string starts with the substring     | `"hello".startswith("he")`    | True             |
| `str.endswith(sub)`   | Checks if string ends with the substring       | `"hello".endswith("lo")`      | True             |
| `str.find(sub)`       | Returns the index of the first occurrence of the substring (-1 if not found) | `"hello".find("l")`           | 2                |
| `str.replace(old, new)`| Replaces old substring with new one           | `"hello".replace("l", "r")`   | "herro"          |
| `str.split(sep)`      | Splits the string at each sep                  | `"hello world".split(" ")`    | ['hello', 'world']|
| `str.join(iterable)`  | Joins the iterable into a string using str     | `"-".join(["a", "b", "c"])`   | "a-b-c"          |
| `str.strip(chars)`    | Removes any character in chars from the start and end of the string (whitespace if chars is omitted) | `" hello ".strip()` | "hello" |
| `str.isdigit()`       | Checks if all characters in the string are digits | `"123".isdigit()`          | True             |
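
The sketch below chains several of the functions from the table to tidy up a file name. The file name itself is hypothetical, invented for this example.

```python
# Hypothetical file name, invented for this example
raw_name = "quarterly_sales_report.CSV"

# Split the name from its extension at the dot
parts = raw_name.split(".")      # ['quarterly_sales_report', 'CSV']
name_part = parts[0]
extension = parts[1].lower()     # 'csv'

# Build a human-readable title and a cleaned-up file name
readable_title = name_part.replace("_", " ").title()
clean_name = ".".join([name_part, extension])

print(readable_title)               # Quarterly Sales Report
print(clean_name)                   # quarterly_sales_report.csv
print(clean_name.endswith(".csv"))  # True
```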




**Challenge Question:**

You are given a raw dataset containing user reviews for various products. Before conducting any Natural Language Processing (NLP) tasks, it's crucial to pre-process and clean the data. Given the sample review below:

```python
raw_review = "   HEy! This is a SaMPLE product review. I absolutely love it, but the colour, not so much.    "
```

Perform the following pre-processing steps:

1. Convert the entire review to lowercase to maintain consistency.
2. Remove any leading or trailing white spaces from the review.
3. Split the review into individual sentences.

Use the Python string functions you've learned to achieve the above steps.

---

**Expected Output of each Step**

1. `"   hey! this is a sample product review. i absolutely love it, but the colour, not so much.    "`
2. `"hey! this is a sample product review. i absolutely love it, but the colour, not so much."`
3. `["hey! this is a sample product review", "i absolutely love it, but the colour, not so much."]` (splitting on `". "` drops the separator, so the first sentence loses its full stop)

**Please Provide Answer Below**

In [None]:
raw_review = "   HEy! This is a SaMPLE product review. I absolutely love it, but the colour, not so much.    "

## Start Answer Below

### 1.3 Sequence Data

Single variables are cool, but often we want to create groups of variables to represent different things or to create relations between variables. This is where sequence data comes into play. The key types of sequence data are:

- **Lists** `[1,2,4,6,"a",'llama']`: can be addressed by index (starting at 0), and can have elements added or removed. A list can hold any data type, and its elements don't all need to be the same type.

- **Tuples** `(1,2,4,6,"a",'llama')`: like a list but immutable (it cannot be changed once created), which protects the integrity of the data.

- **Sets** `{"Apricot","Nectarine","Peach","Plum"}`: An advanced collection type that behaves like a mathematical set (i.e. no repeating elements). This can be useful when trying to match data against criteria (i.e. is a bit of data part of a target set of data). E.g. we could test to see if a given fruit was a stone fruit by comparing it to our set of stone fruits (see above). **Note**: sets do not retain any order and do not allow duplicates.

- **Dictionaries**: covered later in this section.
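
The stone fruit test described above can be sketched with the `in` operator. The `basket` list is a hypothetical example added here to show that duplicates disappear when a list is converted to a set.

```python
stone_fruits = {"Apricot", "Nectarine", "Peach", "Plum"}

# Membership test: is a given fruit a stone fruit?
print("Peach" in stone_fruits)   # True
print("Apple" in stone_fruits)   # False

# Duplicates are dropped when a set is built from a list
basket = ["Plum", "Plum", "Peach"]   # hypothetical list for illustration
unique_count = len(set(basket))
print(unique_count)                  # 2
```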

![image.png](attachment:image.png)


![image-2.png](attachment:image-2.png)

Let's explore how it works below.

#### Lists

In [None]:
my_list = [1,2,4,6,6,6,'llama']

# Print the first list element
# Indexing starts at 0, so the first element is my_list[0]
print(my_list[0])

# Print the last list element - Note that to pick the n'th element from the end, use my_list[-n]
print(my_list[-1])

# Remove an element from the list
my_list.remove('llama')
print(my_list)

# Add an element to the end of the list
my_list.append('llama')
print(my_list)

# Add an element to the third position of the list
my_list.insert(2, 'llama')
print(my_list)

#### Exercise 2 - List manipulation 
Remove all the llamas from `my_list` and then pick the 4th last element

#### Tuples

In [None]:
# Creating a tuple with repeated elements
my_tuple = ("apple", "banana", "cherry", "apple", "cherry") # note the brackets

In [None]:
my_list[0]=23432
my_list

In [None]:
my_tuple[0]=2312323 # trying to change an element of a tuple will result in an error message!

In [None]:
my_tuple[0]

**Question: can you index a string? Is a string mutable?**

#### Sets
![image.png](attachment:image.png)

In [None]:
stone_fruits = {"Apricot","Nectarine","Peach","Plum"}
shop_fruits = {"Apricot","Pear","Cherry","Apple"}

# Find the intersection between the sets
print(stone_fruits.intersection(shop_fruits))

# Find the difference between the sets
print(stone_fruits.difference(shop_fruits))

# Add another element to the shop_fruits set
shop_fruits.add("Grapefruit")
print(shop_fruits)

# Remove Apricot from the stone_fruits set
stone_fruits.remove("Apricot")
print(stone_fruits)

# Combined sets
all_fruits = shop_fruits.union(stone_fruits)
print(all_fruits)

In [None]:
stone_fruits[0] # sets are unordered, so trying to index one will result in an error message!

### Can anyone think of any other use cases for sets? 

#### Dictionaries

**Dictionaries** `{"Name":"Grant", "Python Skill":"Legendary","Height":190}`: Similar to lists, but they create pairwise relationships between keys and values. In a normal dictionary, looking up a word gives you its definition. The same is true of Python dictionaries: if you look up a key, you get its value as the output. This can be used to create datasets of objects (e.g. a unique record for each person, like the dictionary above).


In [None]:
employee_1 = {
    "Name":"Borat", 
    "Python Skill":"Very Nice",
    "Height":190, 
    "Hobbies":["Ping pong", "Disco Dance", "Archery", "Running Code Workshops"]}

In [None]:
# Retrieve the value associated with a specific key
print(employee_1["Height"])
print(employee_1["Hobbies"])

In [None]:
# Update the dictionary to have a new key:value pair
employee_1["age"] = 31
employee_1

In [None]:
# Update the value associated with a key
employee_1["age"] = 21
employee_1


In [None]:
# Remove a key-value pair using the 'del' statement
del employee_1["Height"]
employee_1

In [None]:
# Check if a key exists in the dictionary
"job" in employee_1

In [None]:
# Retrieve all keys and all values (returned as view objects; wrap them in list() to get lists)
all_keys = employee_1.keys()
all_values = employee_1.values()
all_keys, all_values
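
Two further dictionary methods that often come in handy are `.get()` and `.items()`. The sketch below uses a hypothetical record, `employee_2`, so that it runs regardless of the changes made to `employee_1` above.

```python
# Hypothetical record, invented for this example
employee_2 = {"Name": "Ali", "Python Skill": "Beginner", "Height": 175}

# .get() returns a default value instead of raising a KeyError for a missing key
height = employee_2.get("Height")
job = employee_2.get("job", "Not recorded")
print(height)   # 175
print(job)      # Not recorded

# .items() yields each key together with its value
for key, value in employee_2.items():
    print(key, "->", value)
```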

#### Exercise 3 - Dictionary manipulation

Complete the following tasks:

1. Retrieve and print the height of employee_1 (re-run the cell that defines `employee_1` first, as the "Height" key was deleted above).
2. Update the "Python Skill" of employee_1 to "Moderate" and print the updated dictionary.
3. Add a new key-value pair, "Nationality": "Kazakh", to the dictionary.
4. Remove "Archery" from the list of hobbies.
5. Check if "Driving" is one of employee_1's hobbies and print the result.


## Conditional Statements

Data types are cool and all, but they're not that useful if we can't do anything with them. This is where we can use conditional statements to compare variables and do something based on that result.

**_Simple Logical Statements_**:
- Equal `==`
- Not Equal `!=`
- Greater than (or equal to) `>` `>=`
- Less than (or equal to) `<` `<=`

**_More Complex Logical Statements_**:
- If/Elif/Else statements
 - `if:` - If a statement is true, then do a thing
 - `elif:` - Else if another statement is true, then do a thing
 - `else:` - If all other statements are false, then do a thing
- `and` joins multiple logical statements together and returns True if both met, otherwise False
- `or` joins multiple logical statements together and returns True if either condition met, otherwise False
- `not` returns the opposite of a logical value (True becomes False, and False becomes True)
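
As a minimal sketch of how `and`, `or` and `not` combine comparisons, consider the example below. The variables `temperature` and `is_raining` are hypothetical, chosen only for illustration.

```python
# Hypothetical values for illustration
temperature = 31
is_raining = False

warm_and_dry = temperature > 25 and not is_raining   # both conditions must hold
cold_or_wet = temperature < 0 or is_raining          # either condition is enough

print(warm_and_dry)     # True
print(cold_or_wet)      # False
print(not is_raining)   # True
```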

In [None]:
my_number = 2

if my_number > 0 and my_number % 2 == 0:
    print('This is an even number')
elif my_number > 0 and my_number % 2 != 0:
    print('This is an odd number')
else:
    print('This example only classifies positive numbers')
    
# Note here we use the modulo operator %, which gives the remainder of whatever we divide by (e.g. 3 % 2 = 1, as 3/2 is 1 r 1)

#### Exercise 4.1 - Conditional statement manipulation

Background:
At Deloitte Consulting, clients are categorised into different tiers based on their annual spending. This categorisation helps in allocating resources and benefits:

- Bronze: Up to $50,000
- Silver: $50,001 to $200,000
- Gold: $200,001 to $500,000
- Platinum: Above $500,000

Task:
Write a Python program that takes an annual spending amount ($750,000) and determines the client's tier.

#### Exercise 4.2 - Project Success Analysis (time permitting)

Background:
Deloitte undertakes multiple projects throughout the year. For a project to be considered a success:

- The revenue should exceed the project cost by at least 20%.
- Client feedback rating (on a scale of 1 to 10) should be 7 or above.
- The project should not have exceeded the agreed-upon timeline.

Task:
Write a Python program that assesses if a project was successful based on the above criteria.

Tip:
You might need the following variables: `project_revenue`, `project_cost`, `client_feedback`, `project_duration`, and `agreed_duration`.

## Loops

Another thing we can do is to loop certain behaviours to do things a certain number of times, or keep doing something whilst a condition is true. In programming, we call these loops.

In Python, we run loops by starting with `for` or `while` and then (as was also the case with if statements) use tabs to indent our loop.

**_For Loops_**: For a certain number of iterations, do a thing

` for <item> in <list_variable>:
    <Run Code>`


**_While Loops_**: While a condition is True, do a thing

` while <condition>:
    <Run Code>`

**_Generate List Variables on the Fly_**:
Often it's tedious to create lists by hand, so Python lets us make our own on the fly.
`range(<start>,<stop>,<step>)`
Generates a sequence of numbers (by default it starts at 0 and counts up in steps of 1, stopping just before `<stop>`)
- Start: Optional (default 0 if left blank)
- Stop: Must include (the sequence stops just before this number)
- Step: Optional; counts in increments of this size (default 1 if left blank)

`for <index>, <item> in enumerate(<List_Variable>)`
Returns each element of a list **_and_** its position in the list
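
The effect of the start, stop and step arguments can be sketched as follows. Wrapping `range()` in `list()` is only done here so that the generated numbers can be printed in one go.

```python
only_stop = list(range(5))           # start defaults to 0, step to 1
start_stop = list(range(2, 6))       # stop itself is not included
with_step = list(range(0, 10, 3))    # count up in threes
counting_down = list(range(5, 0, -1))  # a negative step counts down

print(only_stop)       # [0, 1, 2, 3, 4]
print(start_stop)      # [2, 3, 4, 5]
print(with_step)       # [0, 3, 6, 9]
print(counting_down)   # [5, 4, 3, 2, 1]
```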


In [None]:
# print numbers from 0-9:
for number in range(10):
    print(number)       


In [None]:
# print elements from a list and what number they are:
animals = ['cat','dog','cow','llama','sloth']
for number, animal in enumerate(animals):
    print("Item %d in the list is %s" % (number,animal))


In [None]:
# print numbers up to 10:
num = 0

while num <= 10:
    print(num)
    num += 1

In [None]:
# Add to the meme while length < 10:
meme = 'skr'
num = 0

while num < 10:
    meme = meme + 'r'
    print("car goes " + meme)
    num += 1


#### Exercise 5:
Write code that prints the sum of a list of numbers, `z`, of unknown length.

In [None]:
z = [1,4564,3.14,85]



#### Challenge
Write a weekly pay calculator that calculates and prints pay given an hourly rate and number of hours worked. Any hours over 40 are paid as overtime at 1.5 times the hourly rate.

In [None]:
num_hour=[40,20,30,50,19,23]
pay_rate=[25,50,23.45,54,104,16]

In [None]:
total_num=len(num_hour)

for i in range(total_num):
    # your code here
    pass  # placeholder so the cell runs; replace with your solution

## Functions

Where we have things that we want to repeat over and over, it often makes more sense to put them into pre-written blocks of code which can be called by name. These are called functions, and they are great for 2 reasons:
- [x] Reduces the amount of repeated code you have to write
- [x] Makes your code *much* easier to read

We can either use the pre-built ones which are included within Python, for example:
- `len()` - Returns the length of a list or a string
- `min(<list>)` - Returns the smallest element in a list
- `max(<list>)` - Returns the largest element in a list

Or we can define our own functions using the `def` keyword

**When defining the function for the first time**

`def <YourFunctionName>(<input_1>,<input_2>,.....):
    <Your Code Here>
    return <Output_1>,<Output_2>,...`

**When calling your function**

`out_1, out_2 = YourFunctionName(variable_1,variable_2)`
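
A minimal sketch of this pattern, with two outputs returned and unpacked on the calling side, is shown below. The function `min_and_max` is a hypothetical helper written for this example.

```python
def min_and_max(numbers):
    """Return the smallest and largest values in a list (hypothetical helper)."""
    return min(numbers), max(numbers)

# The two returned values are unpacked into two variables
lowest, highest = min_and_max([4, 9, 1, 7])
print(lowest, highest)   # 1 9
```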



In [None]:
# let's take the odd/even logic we wrote before and put it into a function

def MyOddEven(my_number):
    if my_number > 0 and my_number % 2 == 0:
        out = "Even"
    elif my_number > 0 and my_number % 2 != 0:
        out = "Odd"
    else:
        out = "Error"
    return out

num_list = [1,2,3,4,8,-10]
odd_even = []

for number in num_list:
    odd_even.append(MyOddEven(number))
    
print(num_list)
print(odd_even)

#### Exercise 6:
Write a function named `Sum100()` that takes two number inputs, X and Y, and returns “Yes” if X and Y add to 100, and “No” otherwise.

In [None]:
value1=50
value2=30
value3=50

def Sum100(value1, value2):
    # your code here
    pass  # placeholder so the cell runs; replace with your solution

## File I/O

More often than not, you won't be generating your data in Python. Generally your client will have data in some sort of file format (whether it be an Excel spreadsheet, a csv, a series of images, a text file, etc.).

Therefore, we need a way to not only **read** data, but also to **write** our output to somewhere.

**Inbuilt Function**
Python's `open()` function returns a file object which can read data into the environment as a string. It can be used for both reading in and writing out data as a stream. However, because it imports everything as one long string, it's not the most convenient option (see below for all the steps involved).


**What you'll probably use in real life**
In the real world, there are packages that are built for reading data in and exporting data depending on what you're using (e.g. files, SQL databases, data streams etc) which are specialised and a lot easier to use.

![image.png](attachment:image.png)

Modes available are:

- Read ("r")
- Append ("a")
- Write ("w")
- Create ("x")

You can also choose to open the file in:

- Text mode ("t")
- Binary mode ("b")
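
The difference between the write, append and read modes can be sketched as below. The example works inside a temporary folder (via the standard `tempfile` module) so that no real files are touched, and the file name `notes.txt` is hypothetical.

```python
import os
import tempfile

# Work in a temporary folder so no real files are touched
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "notes.txt")   # hypothetical file name

    with open(path, "w") as f:    # "w" creates the file (or overwrites it)
        f.write("first line\n")

    with open(path, "a") as f:    # "a" adds to the end of the file
        f.write("second line\n")

    with open(path, "r") as f:    # "r" reads the file back
        file_text = f.read()

print(file_text)
```

Opening the same file with "w" a second time would have replaced the first line rather than adding to it.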

In [None]:
#Check that you're in the right directory - you should see TestData.csv in the list printed out
import os
os.listdir()

In [None]:
with open('TestData.csv','r') as file:
    contents = file.read()

print("Raw File Content:\n" + contents)
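
To give a feel for the extra steps involved when working from one long string, here is a rough sketch that turns raw CSV text into usable values by hand. The `sample_contents` string is a hypothetical stand-in for what `file.read()` returns; the real columns in `TestData.csv` may differ.

```python
# Hypothetical stand-in for the string returned by file.read()
sample_contents = "Name,Age\nAlex,34\nSam,28\n"

# Step 1: break the long string into lines, then each line into values
rows = []
for line in sample_contents.strip().split("\n"):
    rows.append(line.split(","))

# Step 2: separate the header row from the data rows
header = rows[0]
records = rows[1:]

# Step 3: convert values to the right type (everything starts out as a string)
ages = []
for record in records:
    ages.append(int(record[1]))

print(header)   # ['Name', 'Age']
print(ages)     # [34, 28]
```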

In [None]:
# writing strings to a .txt file below - check "readme.txt" to see this
lines = ['Readme', 'How to write text files in Python']
with open('readme.txt', 'w') as f:
    for line in lines:
        f.write(line)
        f.write('\n')

### MUCH EASIER... Pandas

In [None]:
import pandas as pd

# Reading a CSV into a DataFrame
df = pd.read_csv('TestData.csv')
df


## Packages

Instead of reinventing the wheel every time we want to do something, the Python developer community has developed packages which you can import into your Python environment to run a whole bunch of pre-written functions.

To install these packages, refer to the pre-reading on the correct use of `pip`. Generally speaking, this can be done by running `pip install <package_name>` in Command Prompt.

To import these into our environment we use the `import <package>` command. Most packages have rather extensive documentation, which you can find via their page on the [Python Package Index (PyPI)](https://pypi.org/) or by searching Google for `<Your Package> docs`.

There is also an extensive support network through [StackOverflow](https://stackoverflow.com/) with Q&A for almost anything you will ever need to do (quick debugging tip, if you can't find a StackOverflow post about it, generally your approach is not the best one to do what you're doing)

Let's run an example using the math package - i.e. `import math`

In [None]:
import math
math.sqrt(4)

In [None]:
# this is another way of importing, useful if the library is huge and you only need one function
from math import sqrt
sqrt(4)

In [None]:
!pip3 install pandas 

In [None]:
import pandas as pd  # Import pandas package and give it the handle 'pd'

In [None]:
test_data = pd.read_csv('TestData.csv') # Importation of data in one step
test_data # Pretty formatting of data using inbuilt functions in pandas

In [None]:
print(test_data['Age'].tolist()) # Print the Age column as a list

#### Exercise 7 - Using the random package
Use random to write a function named dice() which simulates a dice roll, returning a randomly chosen number from 1 to 6.

*Hint*: you can use round(X) to round a number X to the nearest integer.

# Wrap Up and What's Next

This concludes Python 101. You've done a great job picking all this up in 2 hours, and you can continue to use this notebook to play around with what you've learned.

The domain of Python use cases is immense and ever expanding as more technologies build it into their products. A small set of examples are below:
 - **Web Development and Cloud:**
  - Uber, Netflix, YouTube and Google all use Python in their back-ends
  - Most cloud platforms support computation and API interactions in Python
  - Web servers can be built in Flask, a Python package
 - **Machine learning and AI:**
  - Most popular language for developing prototype and production AI systems.
  - Used as a rapid prototyping tool by Tesla, and to build ML models in AWS Sagemaker
 - **Data Analysis**:
  - Numerous packages allow for the analysis of large datasets, and of unconventional data such as plain text.
  - Used widely in insurance, finance and banking industries.
  

## Resources 
### General Learning
[Codecademy](https://www.codecademy.com/learn/learn-python-3) has a good entry level course that covers similar content to this course.

[W3 Schools](https://www.w3schools.com/python/default.asp) provides a “Try it Yourself” terminal, allowing you to practice coding and receive feedback in real time. It delves into loops, functions, conditional logic and more.

### Best Practices
Code style is a thing, and if you write beautiful code, not only will your client love you, so will your team for how easily maintainable your code is. The following links are the primary resources for what good coding looks like:

[Python Code Quality: Tools & Best Practices – Real Python](https://realpython.com/python-code-quality/)

[PEP 8: The Style Guide for Python Code](https://pep8.org/)

#### StackOverflow and Google are always your best friends, and always just try! Don't worry, you won't do any real damage!

