# <u>Tut_1.1</u>

## Learning outcomes

* Understand the high-level structure of the course
* Set up and get familiar with the coding environment
* Learn and understand the **absolutely** necessary routines (*commands to be executed before and after each class*)
* **markdown** syntax
* Reflect on what 'big data' is and why we need it
* Why Python?
* Modules, libraries and frameworks
* 1<sup>st</sup> look at a Python program structure (how it is organised, formatting, reserved words and naming conventions)
* DRY principle
* Errors
* `print()` function


---

## Course Structure
* Before the **consolidation week**: Python fundamentals and essentials specific to **big data** applications.
    * 2 classes per week, 2 hours each class. <u>N.b. Each class will cover its own topic, i.e. the Thursday class is **not** a repetition of the content from Monday.</u>
* After the **consolidation week**: principles of Big Data / Machine Learning / Data Mining: 2 hours of tutorials, 2 hours of drop-in.
    * Image recognition using a Convolutional Neural Network (CNN)
    * Models for numerical data analysis (regression and classification tasks): Statistics, Linear Regression, Logistic Regression, Decision Tree, Ensemble Techniques
    * Time series
* If we have time (and energy) left: Model deployment using Flask, Streamlit, Heroku and/or Render.
<br>
<br>
In **tut_1.1**, the 1<sup>st</sup> number is the week number and the 2<sup>nd</sup> number is the lecture of the week. Each tutorial has a dedicated Jupyter notebook.



## Assessment
1. Set of exercises: **20%**
2. Course work: **80%**
<br>
<br>
Submission dates: **t.b.d.**


---
---

## <u>First things first: setting up the working/coding environment</u>
### **[https://github.com/](https://github.com/)**

## Create GitHub repository
* Give it a sensible name
* Make it public
* Add a README
* Add a .gitignore from the Python template
* A licence is not required

*You can customise your page later by adding a photo, bio etc.*

Do not log in using your Google profile. Create a proper password and memorise it or write it down. You will need this password to give third-party services access to your GitHub account.

---

## Create codespace (follow in-class instructions)

---

## `.gitignore` file

---

## Create and activate virtual environment (follow in-class instructions)

---

## Create `jupyter_notebooks` folder
In this folder, create your 1<sup>st</sup> Jupyter notebook. Use a sensible name. It is recommended to create one notebook per tutorial. **Important**: <u>the file name must have the extension **.ipynb**</u>

In the **File** menu of VS Code, select 'Auto Save'.

Select the kernel (follow in-class instructions).

When you execute a `Code` cell for the first time, you will be asked to install the Python kernel. Follow the in-class instructions (installation can take a while).

*The name of the **Jupyter** project is a blend of the names of three supported languages: <u>Julia, Python and R</u>. Jupyter notebooks can also be run via the Jupyter project itself, Anaconda or Google Colab.* VS Code is a general-purpose IDE that can handle many other programming languages, and it is the one we are going to use in this course.

---

## Markdown syntax

---

## What is 'Big Data' and why do we need it?

---

## Why Python?
* High-level programming language
* Very clear syntax
* Interpreted
* Dynamically typed
* Abundance of libraries for data handling, visualisation and modelling (we do not have much choice when it comes to data handling and machine learning)
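
To make "dynamically typed" concrete, here is a minimal sketch (the variable names are only illustrative): the same name can be bound to values of different types, and the type is checked while the program runs rather than declared in advance.

```python
# The same name can point to values of different types at runtime.
value = 5
first_type = type(value).__name__   # type of the integer 5

value = "five"
second_type = type(value).__name__  # type of the text "five"

print(first_type, second_type)
```

No type declaration is needed in either assignment; Python works out the type from the value itself.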

---

## Libraries, modules and frameworks
![Numpy](https://img.icons8.com/?size=100&id=aR9CXyMagKIS&format=png&color=000000)<br>
![Pandas](https://img.icons8.com/?size=100&id=xSkewUSqtErH&format=png&color=000000)<br>
![TensorFlow](https://img.icons8.com/?size=100&id=n3QRpDA7KZ7P&format=png&color=000000)<br>
![Django](https://img.icons8.com/?size=100&id=qV-JzWYl9dzP&format=png&color=000000)

---

## 1<sup>st</sup> look at Python code

In [None]:
import os  # import libraries

def remove_non_image_file(my_data_dir):
    """
    Checks files in defined folder and removes those which are not images.
    """
    image_extension = ('.png', '.jpg', '.jpeg')
    folders = os.listdir(my_data_dir)
    for folder in folders:
        files = os.listdir(os.path.join(my_data_dir, folder))
        non_image_count = 0  # number of removed non-image files
        image_count = 0  # number of image files kept
        for given_file in files:
            if not given_file.lower().endswith(image_extension):
                file_location = os.path.join(my_data_dir, folder, given_file)
                print(file_location)
                os.remove(file_location)  # remove non-image file
                non_image_count += 1
            else:
                image_count += 1
        print(f"Folder: {folder} - image files: {image_count}")
        print(f"Folder: {folder} - non-image files: {non_image_count}")

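* `import` statements always go at the top of the file
* **Indentation** sensitive: code blocks are defined by their indentation. Do not type four spaces by hand, just press **Tab** (and never mix tab characters and spaces in the same file)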
* naming_convention for variables:
    * no spaces
    * no reserved words
    * no numbers as a first character
* comments and docstrings: good practice to think of those who will read your code
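
The naming rules above can be checked programmatically. The sketch below is only an illustration: the candidate names are made up, and it relies on the standard-library `keyword` module and the built-in `str.isidentifier()` method.

```python
import keyword

# Hypothetical candidate names, some valid and some not.
candidates = ["week_day", "my name", "class", "1st_value", "value_1"]

results = {}
for name in candidates:
    # A usable variable name must be a valid identifier
    # (no spaces, no leading digit) and must not be a reserved word.
    results[name] = name.isidentifier() and not keyword.iskeyword(name)
    print(f"{name!r}: {results[name]}")
```

Here `"my name"` fails because of the space, `"class"` because it is a reserved word, and `"1st_value"` because it starts with a number.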

---

## 'End of session' routine
* `git add --all`
* `git commit -m "commit message"`
* `git push`

(Analogy: sending a parcel)

`git status` - to see commit status (check uncommitted changes of your project)


---

## DRY
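
DRY stands for "Don't Repeat Yourself": logic that is needed more than once should be written once and reused. A minimal sketch of the idea, using a hypothetical `greet` function and made-up student names:

```python
# Repetitive version: the same message pattern is typed out three times.
print("Hello, Anna! Welcome to the course.")
print("Hello, Ben! Welcome to the course.")
print("Hello, Chloe! Welcome to the course.")


# DRY version: the pattern lives in one place and is reused.
def greet(name):
    """Return the welcome message for one student."""
    return f"Hello, {name}! Welcome to the course."


for student in ["Anna", "Ben", "Chloe"]:
    print(greet(student))
```

If the wording of the message changes, the DRY version needs a single edit instead of three.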

---

## Errors
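
When something goes wrong, Python stops and reports a named error with a message. The sketch below triggers three common errors on purpose and catches each with `try`/`except` so that the cell keeps running. The examples are illustrative and are not taken from the course material.

```python
caught = []  # names of the errors we ran into

try:
    print(undefined_name)  # this name was never assigned
except NameError as error:
    caught.append(type(error).__name__)
    print("NameError:", error)

try:
    print("2" + 2)  # a string and a number cannot be added
except TypeError as error:
    caught.append(type(error).__name__)
    print("TypeError:", error)

try:
    int("ten")  # the text "ten" is not a valid integer literal
except ValueError as error:
    caught.append(type(error).__name__)
    print("ValueError:", error)
```

Reading the error name first and the message second is usually the quickest way to find the problem.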

---

### `print()` function

In [3]:
print("test message")  # Printing a string
print(2 + 2)  # Printing the result of an expression
week_day = "Monday"
print(f"Today is {week_day}")  # f-string (formatted string literal)

test message
4
Today is Monday


### Declaring variables
`=` assignment operator

In [4]:
a = 5  # think of it as a storage box
my_name = "Sergey"
my_object = {
    "flat": 20,
    "type": "residential"    
}
my_list = [3, 5, 6.4]

### Data type

![Data type](../assets/img/data_type.png)

###### Source: Python essentials by [Code Institute](https://codeinstitute.net/)

In [5]:
print(type("Hello, World!"))
print(type(42))
print(type(3.145))
print(type(1j))
print(type(["egg", "bacon", "spam"]))
print(type(("egg", "bacon", "spam")))
print(type(range(6)))
print(type({"name" : "John", "age" : 80}))
print(type({"egg", "bacon", "spam"}))
print(type(True))
print(isinstance(3.14, int))


<class 'str'>
<class 'int'>
<class 'float'>
<class 'complex'>
<class 'list'>
<class 'tuple'>
<class 'range'>
<class 'dict'>
<class 'set'>
<class 'bool'>
False


### String

In [7]:
print("Then Mike said 'What is that?'") # Note: usage of double and single quotes
print("It's a beautiful day")

Then Mike said 'What is that?'
It's a beautiful day


### Functions for converting between data types

In [None]:
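print(int("42"))  # Converts to an integer: 42
print(float("3.5"))  # Converts to a floating-point number: 3.5
print(hex(255))  # Converts an integer to a hexadecimal string: 0xff
print(oct(8))  # Converts an integer to an octal string: 0o10
print(tuple([1, 2]))  # Converts to a tuple: (1, 2)
print(set([1, 1, 2]))  # Converts to a set: {1, 2}
print(list("abc"))  # Converts to a list: ['a', 'b', 'c']
print(dict([("a", 1)]))  # Converts a sequence of key-value pairs into a dictionary: {'a': 1}
print(str(42))  # Converts a value into a string: 42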

### Python arithmetic operators

![Python arithmetic operators](../assets/img/arythmetic_operators.png)
###### Source: Python essentials by [Code Institute](https://codeinstitute.net/)
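
As a quick illustration of Python's arithmetic operators, the sketch below applies each of them to two example numbers (`a` and `b` are arbitrary values chosen for this example).

```python
a = 7
b = 2

results = {
    "a + b": a + b,    # addition
    "a - b": a - b,    # subtraction
    "a * b": a * b,    # multiplication
    "a / b": a / b,    # true division, always returns a float
    "a // b": a // b,  # floor division, drops the fractional part
    "a % b": a % b,    # modulus, the remainder of the division
    "a ** b": a ** b,  # exponentiation, a to the power of b
}

for expression, value in results.items():
    print(f"{expression} = {value}")
```

Note that `/` gives `3.5` here while `//` gives `3`: the two division operators behave differently.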

### String methods
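N.b. Note the difference in usage between a function and a method: a function is called on its own, e.g. `len(my_string)`, whereas a method is called on an object with a dot, e.g. `my_string.upper()`.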
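
A minimal sketch of that difference, using a made-up string: `len()` is a function that takes the string as an argument, while `strip()`, `upper()`, `lower()`, `replace()` and `split()` are methods called on the string itself.

```python
raw_text = "  Test Value  "  # example value with extra spaces

cleaned = raw_text.strip()                       # method: removes surrounding whitespace
length = len(cleaned)                            # function: takes the string as an argument
shouted = cleaned.upper()                        # method: returns a new upper-case string
replaced = cleaned.lower().replace("test", "new")  # methods can be chained
words = cleaned.split(" ")                       # method: returns a list of words

print(length)
print(shouted)
print(replaced)
print(words)
```

String methods never change the original string; each call returns a new value, which is why the results are stored in new variables.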

In [None]:
my_string = "test value"

my_string.capitalize()  # Capitalizes the first character of the string
my_string.center(20)  # Centers the string in a field of the given width
my_string.count("t")  # Returns a count of times a specified value occurs in the string
my_string.encode()  # Returns an encoded version of the string (use decode() to decode)
my_string.endswith("value")  # Returns True if the string ends with a specified suffix
my_string.expandtabs(4)  # Replaces tab characters with spaces using the given tab size
my_string.find("v")  # Returns the lowest index where a specified value is found, or -1 if not found
my_string.index("v")  # Same as find(), but raises an error if the value is not found
my_string.isalnum()  # Returns True if all characters are alphanumeric
my_string.isalpha()  # Returns True if all characters are alphabetic
my_string.isdigit()  # Returns True if all characters are digits
my_string.islower()  # Returns True if all characters are lower case
my_string.isspace()  # Returns True if all characters are whitespace
my_string.istitle()  # Returns True if the string is titlecased
my_string.isupper()  # Returns True if all characters in the string are upper case
my_string.join(["a", "b"])  # Joins the items of a list, using the string as the separator
my_string.ljust(20)  # Returns a left justified version of the string
my_string.lower()  # Converts a string into lower case
my_string.lstrip()  # Returns a left trim version of the string
my_string.partition(" ")  # Returns a tuple of three parts: text before the separator, the separator, text after it
my_string.replace("test", "new")  # Returns a string where an old value is replaced with a new value
my_string.rfind("e")  # Returns the highest index where a specified value is found, or -1 if not found
my_string.rindex("e")  # Same as rfind(), but raises an error if the value is not found
my_string.rjust(20)  # Returns a right justified version of the string
my_string.rpartition(" ")  # Same as partition(), but splits at the last occurrence of the separator
my_string.rsplit(" ")  # Splits the string at the specified separator, starting from the right, and returns a list
my_string.rstrip()  # Returns a right trim version of the string
my_string.split(" ")  # Splits the string at the specified separator, and returns a list
my_string.splitlines()  # Splits the string at line breaks and returns a list
my_string.startswith("test")  # Returns True if the string starts with the specified value
my_string.strip()  # Returns a trimmed version of the string
my_string.swapcase()  # Swaps cases, lower case becomes upper case and vice versa
my_string.title()  # Converts the first character of each word to upper case
my_string.translate(str.maketrans("t", "T"))  # Returns a string with characters replaced according to a translation table
my_string.upper()  # Converts a string into uppercase
my_string.zfill(15)  # Pads the string with 0s at the beginning until it reaches the specified length

### Comparison operators


In [11]:
print('Hello, World!' == 'Hello, World!')
print(2 != 2)
print([1, 2] < [1, 2, 3])
print(float(2) >= int(2))
print('a' < 'A')  # This is False as 'a' is Unicode 97 whereas 'A' is 65


True
False
True
True
False
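
The last comparison can be verified with the built-in `ord()` function, which returns the Unicode code point of a character. The sketch below also shows one common way to compare text regardless of case; the example words are arbitrary.

```python
code_lower_a = ord("a")  # Unicode code point of lower-case a
code_upper_a = ord("A")  # Unicode code point of upper-case A
print(code_lower_a, code_upper_a)

# Strings are compared character by character using these code points.
apple_before_banana = "apple" < "banana"
print(apple_before_banana)

# Converting both sides to lower case gives a case-insensitive comparison.
same_word = "Python".lower() == "PYTHON".lower()
print(same_word)
```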


### Logical operators
Priority (highest to lowest):
1. not
2. and
3. or
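
A short sketch of what this priority means in practice: without parentheses, `not` is applied first, then `and`, then `or`. Parentheses override that order. The variable names are only illustrative.

```python
# "and" binds tighter than "or": read as True or (False and False)
result_1 = True or False and False

# Parentheses force "or" to be evaluated first
result_2 = (True or False) and False

# "not" binds tighter than "or": read as (not True) or True
result_3 = not True or True

# Parentheses force "or" to be evaluated before "not"
result_4 = not (True or True)

print(result_1, result_2, result_3, result_4)
```

When in doubt, add parentheses: they cost nothing and make the intended order explicit.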

In [None]:
print(True and True)
print(True and False)
print(True or False)
print(not (4 < 5 and 4 < 10))

### If-Else statements

`if` and `else` statements check a Boolean condition and choose which block of code runs: **flow control by decision making**.

![if-else statement](../assets/img/if_else.png)

###### Source: Python essentials by [Code Institute](https://codeinstitute.net/)

In [14]:
number = int(input("Enter a number:"))  # input() function

if number == 10:
    print(f"{number} is equal to 10")
else:
    print(f"{number} is not equal to 10")
    
    # Note: the formatting (indentation) of the snippet
    # !! if-else statements can be nested !! This will be important for the in-class exercise.

10 is equal to 10


---

## In-class exercise: Movie ticket price calculator
#### Objective:
Practice using print statements, logical operators, string methods, data type conversion, and `if-else` statements.
#### Scenario:
The cinema charges different prices based on the customer's age:
* Children (0-5 years old): Free
* Kids (6-12 years old): $5
* Teens (13-17 years old): $8
* Adults (18-59 years old): $12
* Seniors (60+ years old): Free

Your task is to ask the user for their age, determine the correct ticket price, and display it.
#### Instructions:
1. Prompt the user to enter their age.
2. Convert the input to an integer (handle input correctly).
3. Use `if` and `else` statements to determine the ticket price.
4. Print a message displaying the user's age and the corresponding ticket price.
5. Optionally, use string methods (e.g., `.strip()`) to clean up user input.


### Solution

In [15]:
# Ask the user for their age
age = input("Enter your age: ").strip()

# Convert age to integer
age = int(age)

# Determine the ticket price using only if-else statements
if age <= 5:
    price = "free!"
else:
    if age <= 12:
        price = "$5"
    else:
        if age <= 17:
            price = "$8"
        else:
            if age <= 59:
                price = "$12"
            else:
                price = "free!"

# Print the result
print(f"Since you are {age} years old, your ticket price is {price}.")


Since you are 55 years old, your ticket price is $12.
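
The solution above deliberately uses only nested `if`-`else` statements. As an alternative sketch, the same decision chain can be flattened with `elif` and wrapped in a function so it can be checked without typing input. The function name `ticket_price` and the sample ages are assumptions made for this illustration.

```python
def ticket_price(age):
    """Return the ticket price label for a given age (hypothetical helper)."""
    if age <= 5:
        return "free!"
    elif age <= 12:
        return "$5"
    elif age <= 17:
        return "$8"
    elif age <= 59:
        return "$12"
    else:
        return "free!"


# Try one age from each price band.
for sample_age in [3, 10, 15, 55, 70]:
    print(f"Since you are {sample_age} years old, your ticket price is {ticket_price(sample_age)}.")
```

Each `elif` is only checked when all the conditions above it were false, which is exactly what the nested version expresses with extra indentation.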


---

## Homework
* Study the structure and options of [GitHub](https://github.com/)
* Get familiar with VS Code codespace
* Get familiar with Jupyter Notebook
* Get familiar with markdown syntax