<a href="https://colab.research.google.com/github/aartis83/Airbnb_Clone/blob/master/Python_ABI_Module_1_Fundamentals.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

<img src='https://cdn.ourcodeworld.com/public-media/articles/articleocw-5c65fbda1ea05.jpg' width='750'>

# What is Python and Why are we Learning it?

Python is a general-purpose programming language first released in 1991. It was originally developed to be easier to use than other languages at the time (C, ABC, etc.). In its early years, Python was mainly used for scripting and web development.

It wasn't until the late 2000s, with the addition of the NumPy and Pandas libraries (think of them like add-ons), that the dominant focus of Python shifted to data analysis. In the 2010s, Python became the dominant programming language for machine learning, with the releases of TensorFlow (Google) and PyTorch (Facebook). The top reasons to learn Python are:

1. It is easy to learn compared to other programming languages (huge community on StackOverflow, other sites).
2. It has hundreds of thousands of open source libraries (think add-ons) that do almost anything you can think of (mapping, machine learning, statistics, image processing, etc.)
3. It is a popular requirement on data jobs

Today, Python rivals R as a powerful, flexible data analysis programming language. It is often chosen over other tools for the following reasons:
- Python vs R: Machine learning, better integration with other software
- Python vs Excel: Easier to repeat analysis steps, automate tasks
- Python vs Power BI/Tableau: Open source, more advanced calculations, dirty data
- Python vs SQL: Python can connect directly to databases but is often used in combination with SQL

<img src='https://res.cloudinary.com/dyd911kmh/image/upload/f_auto,q_auto:best/v1603200011/Technologies_gesmfa.png' width='750'>

Although Python is considered the [most popular](https://www.tiobe.com/tiobe-index/) general-purpose programming language, it is not the best at any one thing. It is, however, second/third best at most things, and often the easiest option. That's what makes it a very good language to learn: it is truly a Swiss Army knife (developed by a Dutchman).

# Reasons NOT to use Python
- You're doing a simple, one-time analysis that can be done in a spreadsheet (Excel)
- You need to develop an interactive dashboard (Power BI/Tableau)
- Your dataset is already pretty 'clean'
- You don't like programming and are willing to learn advanced Excel or Power BI
- You prefer R (many people with math/statistics background prefer R)
- You are writing software that needs to run FAST (Scala, C)
- You are writing front end web software (JavaScript)
---

# How to Take this Course

In this course, we will be learning programming and data analysis. These subjects involve theory and practice. The ABI program focuses on PRACTICE.
- Save a copy of this Notebook in your Drive
- Write and edit code in your Notebook
- Ask as many questions as you have (this is an intro course, there are NO dumb questions)
- Bring your own problems and ask questions on how to approach them (data analysis problem at work or school)

If you do not do these things, you will not retain much from this course. [Studies have shown](https://www.psychotactics.com/art-retain-learning/) that you retain:

- 75% of what you learn through practice
- 50% of what you learn through discussion with others
- 30% of what you learn through a demonstration
- 10% of what you learn from reading
- 5% of what you learn from a lecture

### Recommendations

* Learning Python requires context. **Experiment** with your creativity when learning it and use any search engine (Google, Bing, etc.) to explore the concept further with different examples.

* **Practice** makes perfect. When exploring a programming concept, try to test it with different approaches to understand what works and what does not and why it does so.

* Do not be afraid to **test the limits**. You can create a backup copy of this file and if anything goes wrong, you can always use the backup copy.


## Do you want to retain 75% of this course or 30%? The choice is yours.

---



# Syllabus

## Module 1 - Fundamentals
- Jupyter Notebooks
- Variables
- Data Types
- Switching Data Types (Casting)
- Arithmetic Operators
- Comparison Operators
- Lists
- Dictionaries
- Conditionals (If-statements)
- Loops
- Functions
- Libraries

## Module 2 - Data Analysis
- Pandas
- Data cleaning

## Module 3 - Case Study
- Excel exercises repeated in Python


# Intro to Jupyter/Colab Notebook (how this interface works)

This document that you're currently reading is called a "Jupyter Notebook" or "Colab Notebook" (since it is hosted on Google Colab). It's similar to a text document, but you can do many things with it:

- Run code
- Show graphs
- Pull data from various databases
- Include HTML/Markdown content (text, images, links)

Jupyter Notebook is made up of modular units called cells that you can add, edit, delete and move up/down.

Cells can contain different types of contents, but we will mainly be working with two types:

- Code cells: you can type Python code into these cells and run them to see the output
- Markdown cells: you can insert HTML/Markdown content (text, images, links) to provide context and content to your notebook

Double-clicking a cell opens it in edit mode.

Once you've made your changes, you have to "run" the cell to display them.

There are many keyboard shortcuts that let you interact with your notebook, run code and make other changes. For example, there are two shortcuts to execute a cell:

shift + return: Run cell and advance to the next one

ctrl + return: Run cell and stay on the current one

## Make a copy of this Notebook
This Notebook is meant to be interactive. There will be exercises and questions to answer by writing code into the Notebook cells. It will not allow you to save changes, because it is owned by the instructor. Therefore, you must create your own copy of the Notebook to save your changes.

1. Under "File", click "Save a Copy in Drive".
2. Open that copy and go through the rest of this session.

In [None]:
# Change the text inside the quotes to your name,
# then hit the PLAY button on the left to run the code
print('Ahmed')

Ahmed


## Important notes



*   Jupyter notebook cells have a shared memory within a notebook. If you are running a piece of code "B", that depends on another piece of code "A", make sure to run Cell A first.
*   Lines of code that start with the hash symbol (#) will be ignored by the Python interpreter (they are meant to be comments)

*   Once you come across an error, try to understand what may have caused the error. You can always rely on search engines to help you out, even in the workplace!



## Understanding errors

Programming involves plenty of bugs and errors, and learning to work through them is part of getting good at it. No code is perfect, and even when code runs without errors, its logic can still be wrong.

Understanding errors will make your learning process smoother and quicker, and will help you debug issues that you may face while coding. If it is an error that you have never faced before, search engines and community websites like [GitHub](https://www.github.com) and [Stack Overflow](https://www.stackoverflow.com) are your best bets for resolving it.

**In Google Colab, there's a "SEARCH STACK OVERFLOW" button on every error message! Super cool!**

In [None]:
# Example: variable not defined error
print(f)

NameError: name 'f' is not defined

# Python Fundamentals

Python is an interpreted, dynamically typed programming language. This means:
- Interpreted: Your code is run statement by statement by an interpreter, without a separate compile step
- Dynamically typed: Python works out what kind of data you're working with (numbers, text, etc.) while the code runs, so you never declare types yourself

We're going to start with treating Python like a simple calculator and adding more and more complexity as we go through this session.

In [None]:
# Simple math
5+5

10

In [None]:
# Slightly more complicated
10 * (2+3) - 3 / (27**3) + 10

59.999847584209725

In [None]:
# Math with variables
x = 6.3224552434
y = 4.2221111

height = 180

x + y

10.5445663434

In [None]:
x+y+height

190.5445663434

In [None]:
print(6*5)
print(10*10)

30
100


In [None]:
6*5
10*10

100

**Question**

In [None]:
# What's the formula for weight in kilograms?

weight_in_lb = 200
factor = 0.45

weight_in_lb*factor


90.0

These are commonly used terms in Python:

* **Data**: Information stored and used by a computer to perform operations

* **Jupyter notebook**: The document editor you are currently using

* **Python**: The programming language you are currently learning

* **Syntax**: The rules for how code must be written so that Python can understand it

* **Code**: Instructions written in a programming language that tell the computer what to do with the data

* **Value**: A single piece of data, such as `5` or `'Ahmed'`

* **Input**: Data that goes into a piece of code

* **Output**: Data produced by a piece of code

* **Console**: The text area where Python displays output

***


You can use the print function to display the output of your code in Python.


In [None]:
print(6)
print(1+2)
print("Hello World")
"Hello World"

"number"
"hat"
"dog"

6
3
Hello World


'dog'

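Jupyter notebook will automatically display the result of the last line of a cell when that line is an expression. Expressions on earlier lines are evaluated, but their results are not displayed unless you print them.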

In [None]:
3+3  # evaluated, but the result is not displayed
print(7)
1+2

7


3

You can put comments in the code block to provide context or to skip execution of a line of code.

Comments begin with a hash (#) symbol. Anything written after it on the same line will be ignored by Python.


In [None]:
# Testing comments - Run this cell

# print('This is a comment, it will not be run')
print('This is not a comment, it will be run') # inline comment print('ah')

This is not a comment, it will be run


In Python, you can assign a data value to a variable using the equals (=) operator. During an assignment operation, the expression on the right side of the equals (=) sign is evaluated first, and the resulting value is then assigned to the variable on the left.

In [None]:
5

5

In [None]:
# Testing value assignment
x = 5
print(x)

5


In the code above, Python evaluates the value 5 on the right side first and then assigns the variable x with the evaluated value of 5.
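
Because the right side is evaluated first, the same variable can appear on both sides of an assignment. Here is a minimal sketch of that idea (the name `counter` is made up for this illustration):

```python
counter = 5            # the value 5 is evaluated, then stored in counter
counter = counter + 1  # right side first: 5 + 1 = 6, then 6 is stored back in counter
print(counter)         # 6
```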

## Variables
***

In programming, variables are like nicknames for information that give context and convenience. For example:

Nick's age is 5. How would we program that?

```
age_of_nick = 5
```

Today's date is July 20th. How would we program that?

```
todays_date = '20-Jul-2022'
```

There is no specific command in Python to define a variable. It is created when a value is assigned to it

In [None]:
# Variable assignment

a = 10
x = 15.5
y = 'abc'
z = True

When you use the [print()](https://www.w3schools.com/python/ref_func_print.asp) command on the variable, it prints out the data value it stores.

In [None]:
# Printing variables

print(a)
print(x)
print(y)
print(z)

10
15.5
abc
True


Python variables do not need to be declared with a specific data type; a variable takes on the data type of the value assigned to it. The data type of a variable can even change if a value of another type is assigned to it later. The [type()](https://www.w3schools.com/python/ref_func_type.asp) function can be used to check the data type of a variable.

In [None]:
# Printing variable types

print(type(a))
print(type(x))
print(type(y))
print(type(z))

<class 'int'>
<class 'float'>
<class 'str'>
<class 'bool'>
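
To illustrate that a variable's data type can change after assignment, here is a small sketch (the name `value` is hypothetical and only used for this example):

```python
value = 10            # value currently holds a whole number
print(type(value))    # <class 'int'>

value = 'ten'         # the same name now holds text instead
print(type(value))    # <class 'str'>
```

This flexibility is convenient, but reusing one name for different kinds of data can make code harder to follow, so it is usually worth picking a new name.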


Variable names are chosen by the programmer and should describe the value they contain. There are certain rules for naming variables:


*   Must start with a letter or the underscore character
*   Cannot start with a number
*   Can only contain alphanumeric characters and underscores (A-Z, a-z, 0-9, _)
*   Are case sensitive



Allowed examples:

In [None]:
# Assigning legal variables

model = 'Honda'
Model = 'Mazda'
model_EU = 'Ferrari'
_model_NA2 = 'GMC'
MODEL = 'Tesla'

print(model)
print(Model)
print(MODEL)

Honda
Mazda
Tesla


In [None]:
# Printing legal variables

print(model)
print(Model)
print(model_EU)
print(_model_NA2)
print(MODEL)

Honda
Mazda
Ferrari
GMC
Tesla


Invalid examples:

In [None]:
# Try to assign an illegal variable

2model = 'Buick'

SyntaxError: ignored

In [None]:
# Try to assign an illegal variable

model-NA = 'Ford'

SyntaxError: ignored

In [None]:
# Try to assign an illegal variable

model SEA = 'Nissan'

SyntaxError: ignored

**Practice - Assigning variables**

Assign the following data values to relevant variables and print them using Python. The first one has been shown as an example.


*   Canada
*   Ontario
*   Toronto
*   M5B1T3
*   6475884206

In [None]:
country = 'Canada'
print(country)

#Your code here --

country = 'Canada'
province = 'Ontario'
city = 'Toronto'
postal_code = 'M5B1T3'
phone_number = '6475884206' # number or string?

print(country)
print(province)
print(city)
print(postal_code)
print(phone_number)

print(province, city, country, postal_code, phone_number)

## Data types

So far you have seen variables store different kinds of information (numbers, text, True/False). These all represent different ***data types***.

Data types are a core component of any programming language. Data values can be of different types and have different uses. Since variables store data values, they take on the data type of the assigned values as seen in the example before.

Python has different built-in data types but the most commonly used ones are:

* Text:      `str`
* Number:    `int`, `float`
* Sequence:  `list`,`tuple`,`range`
* Dictionary:   `dict`
* Set:       `set`
* Boolean:   `bool`
* None:      `NoneType`

We will explore these data types later

In [None]:
# Assigning different types of data to variables

text_test = 'This is a string'
num_test = 123456
seq_test = [1,2,3,'abc', True]  # list
dict_test = {'Fruit': 'Apple', 'Color': 'Red', 'Price': 3.65}
dict_test2 = {'Hat': 'a thing you wear on your head sometimes', 'Cat': 'an animal that some people like.'}
set_test = {'chocolate', 'maple', 'coffee'}  # skip
bool_test = True  # true or false
none_test = None

**String** type data is assigned by wrapping the data with single quotes, double quotes or triple quotes. This data type is used to store text data.

In [None]:
# Printing string variable data and type
print(text_test)
print(type(text_test))

This is a string
<class 'str'>


**Numeric** type data can be decimals, whole numbers and complex numbers. We will focus on decimals and whole numbers. This data type is used to store numeric data. 

In [None]:
# Printing numeric variable data and type

print(num_test)
print(type(num_test))

123456
<class 'int'>


**Sequence** type data stores multiple data values in a sequence. The contained data values can be any of the valid data types of Python. This data type is used to store data values with a sequential index.

In [None]:
# Printing sequence variable data and type 

print(seq_test)
print(type(seq_test))

[1, 2, 3, 'abc', True]
<class 'list'>


In [None]:
# see section on lists

**Dictionary** type data stores multiple values, each labelled with a user defined index called a key. The data is stored as key:value pairs, and a value is looked up by its key.

In [None]:
# Printing dict variable data and type
print(dict_test)
print(type(dict_test))

{'Fruit': 'Apple', 'Color': 'Red', 'Price': 3.65}
<class 'dict'>


**Set** type data stores multiple unique values. If the same value is added more than once, the set keeps only one copy.

In [None]:
# Printing set variable data and type 
print(set_test)
print(type(set_test))

{'maple', 'chocolate', 'coffee'}
<class 'set'>


**Boolean** type data can store only one of two values: `True` or `False`. This data type is used for data with only two possible states.

Eg: yes/no, on/off (in calculations, `True` behaves like 1 and `False` like 0)

In [None]:
# Printing boolean variable data and type 

print(bool_test)
print(type(bool_test))

True
<class 'bool'>


The **None** data type represents the absence of a value. It is the Python equivalent of null, where no data is present. Note that text data, even a single whitespace (pressing the space bar), is a string and not None. Notice that we did not use quotes to define the None value.

A variable holds None when it has been defined but has not been given an actual value.

In [None]:
# Printing none variable data and type

print(none_test)
print(type(none_test))

None
<class 'NoneType'>


## Number types

We will focus on two different numeric types:

* int = whole numbers, positive or negative, without decimals
 * Example: 1, 10, 44, 256

* float = floating point numbers, positive or negative, with decimals
 * Example: 1.4, 52.53, 764.6343

In [None]:
# Testing int data type

x = 1123

print(x)
type(x)

1123


int

In [None]:
# Testing float data type

x = 121.123

print(x)
type(x)

121.123


float

In [None]:
x = 123.0

print(x)
type(x)

123.0


float

## Strings

A string is a data type that stores a data value as text. Strings are defined by placing the data value within single quotes, double quotes or triple quotes, depending on the scenario.

Regardless of which quotes you use, the resultant value will be the same.

In [None]:
# Single quote test

a = 'This is a string'
print(a)
type(a)

This is a string


str

In [None]:
# Double quote test

a = "This is a string"
print(a)
type(a)

This is a string


str

In [None]:
# Triple single quote test

a = '''This is a string'''
print(a)
type(a)

This is a string


str

### Where to use different quotes

**Single quotes**: Can be used anywhere as long as there is no apostrophe in the sentence.

In [None]:
# Testing single quote

test = 'This string has no apostrophes'
print(test)

This string has no apostrophes


In [None]:
# Testing error scenario

test2 = 'This string isn't valid with single quotes'
print(test2)

SyntaxError: ignored

**Double quotes**: Can be used anywhere, even with apostrophes but no double quotations

In [None]:
# Testing double quotes

test = "This string isn't going to throw an error"
print(test)

This string isn't going to throw an error


In [None]:
# Testing error scenario

test2 = "I said, "This string isn't valid with double quotations""
print(test2)

SyntaxError: ignored

**Triple quotes**: Can be used anywhere, even with apostrophes or double quotes. Can also be used for multiline strings.

In [None]:
# Testing triple quotes

test1 = '''"This string isn't going to throw an error", I said'''
print(test1)

"This string isn't going to throw an error", I said


In [None]:
# Testing triple quotes with multiline string

test2 = '''"My dog stepped on a bee",
she said.'''

print(test2)

# Notice how the line break occurs according to how you have defined it in the value

"My dog stepped on a bee",
she said.


In [None]:
# Testing triple double quotes with multiline string

test3 = """"My dog stepped on a bee",
she said."""
print(test3)

"My dog stepped on a bee",
she said.
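
Another way to mix quote characters, and the one used in the practice solution further down, is to escape a quote with a backslash (`\`). The sketch below shows the idea; the variable names are made up for illustration:

```python
# The backslash tells Python that the next quote is part of the text,
# not the end of the string
escaped_single = 'This string isn\'t a problem'
escaped_double = "I said, \"This works too\""

print(escaped_single)  # This string isn't a problem
print(escaped_double)  # I said, "This works too"
```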


**Practice** (Optional)

Assign the following values to appropriate variables as string and print them:



*   The weather today was hot
*   'The weather today was hot', he said
* "The weather's hot today", she said
* 30 (Temperature value)



In [None]:
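# Write your code here

weather1 = 'The weather today was hot'
weather2 = "'The weather today was hot', he said"
weather3 = '"The weather\'s hot today", she said'
temperature = '30'  # the quotes make this a string, not a number

print(weather1, weather2, weather3, temperature)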

## Switching Data Types

You may need to change the data types of variables depending on the requirement. For example, a price value might be stored as a text data type during entry but for calculations, you would need it to be in a numeric data type.

In Python, you can use **casting** to change the data type. Casting is performed by wrapping the data variable/value you want to change with the target data type casting function.

In [None]:
thisisanumber = '456'

print(thisisanumber*2)

456456


In [None]:
# Casting to string changes a given value to a text data type

test = 123
string_test = str(test)

print(string_test)
print(type(string_test))


123
<class 'str'>


In [None]:
# Casting to integer changes any given number to a whole number. 
# If the given number has decimal values, the decimals will be removed
# Strings can be converted to integers given that the string only contains whole numbers

test = '123'
int_test = int(test)

print(int_test)
print(type(int_test))


123
<class 'int'>
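
The comments in the cell above mention two details that are easy to miss: decimals are dropped rather than rounded, and a string must contain a whole number to be converted. A short sketch of both cases (the name `whole` is just for this example):

```python
print(int(7.9))      # 7   (the decimals are dropped, not rounded)
print(int(-7.9))     # -7  (truncates toward zero)

try:
    int('7.9')       # a string with a decimal point is not a whole number
except ValueError as error:
    print('Could not convert:', error)

whole = int(float('7.9'))  # convert the text to a float first, then to an int
print(whole)               # 7
```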


In [None]:
# Casting to float changes any given number to a float number.
# Note that a float number is different from a decimal number.
# Float numbers store approximate values. Use case includes scientific measurements.
# Decimal numbers store exact values. Use case includes monetary values.

test = '65.3'
float_test = float(test)

print(float_test)
print(type(float_test))

65.3
<class 'float'>
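
To see what "approximate" means for floats, compare them with exact decimal arithmetic. The sketch below uses Python's standard-library `decimal` module, which is one way to get exact decimal values; choosing it here is an assumption made for illustration, not something this course covers in depth.

```python
from decimal import Decimal

float_total = 0.1 + 0.2
print(float_total)          # 0.30000000000000004
print(float_total == 0.3)   # False

# Decimal values are built from strings so that no float rounding sneaks in
decimal_total = Decimal('0.1') + Decimal('0.2')
print(decimal_total)                     # 0.3
print(decimal_total == Decimal('0.3'))   # True
```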


In [None]:
# Why would I go from an integer to a float?
age = 6

half_my_age = int(age) / 2

print(half_my_age)
print(type(half_my_age))

3.0
<class 'float'>


## Operators

Operators are characters or words reserved by Python to perform a specific operation between variables and values. Operators and operands work together to produce a result value.


## Arithmetic Operators

| Operators   | Description |
| ----------- | ----------- |
| +           | Sum         |
| -           | Difference  |
| *           | Product     |
| /           | Division    |
| %           | Remainder   |
| **          | Power       |
| //          | Floor division |

### Sum

In [None]:
x = 4
y = 5

x + y

9

### Difference

In [None]:
x = 16
y = 8

x - y

8

### Product

In [None]:
x = 4
y = 5

x * y

20

### Division

In [None]:
x = 15
y = 4

x / y

3.75

### Remainder

In [None]:
x = 10
y = 4

x % y

2

### Power

In [None]:
x = 4
y = 2

x ** y # x^y

16

### Floor division

In [None]:
x = 15
y = 4

print(x / y)
print(x // y)

3.75
3
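
Floor division and remainder are often used together. As a small sketch (the minutes value is made up), converting minutes into hours and leftover minutes looks like this:

```python
total_minutes = 135   # hypothetical value for illustration

hours = total_minutes // 60     # 2 whole hours
minutes = total_minutes % 60    # 15 minutes left over

print(hours, minutes)                          # 2 15
print(hours * 60 + minutes == total_minutes)   # True
```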


## Assignment Operators

| Operators   | Example     | Equivalency |
| ----------- | ----------- |------------ | 
| =           | x = 5       | x = 5       |
| +=          | x += 5      | x = x + 5   | 
| -=          | x -= 5      | x = x - 5   |
| *=          | x *= 5      | x = x * 5   |
| /=          | x /= 5      | x = x / 5   |
| //=         | x //= 5     | x = x // 5  |
| **=         | x **= 5     | x = x ** 5  |

In [None]:
x = 15
print(x)

x *= 5

print(x)

15
75


## Comparison operators

| Operators   | Name     | Example |
| ----------- | ----------- |------------ | 
| ==           | Equal      | x == y       |
| !=          | Not equal      | x != y  | 
| >          | Greater than      | x > y   |
| <          | Less than      | x < y   |
| >=           | Greater than or equal to      | x >= y   |
| <=          | Less than or equal to    | x <= y  |

In [None]:
3 >= 4

False

In [None]:
10 == 5*2

True

In [None]:
30 > 0

True

In [None]:
# 'hat' is NOT the same type as 3
type('hat') != type(3)

True

In [None]:
type('hat') != type('cat')

False

**Practice** - Write a statement that is False using a comparison operator

In [None]:
# Your code here. i.e. (10 < 4)
(10%2) == 1
10<12
3!=3
10 ==2

False

## Logical Operators

| Operator   | Example            | Description  |
|---------   | ---------------    | ------------ |
| and        | x > 1 and x < 5    |  Returns True if both statements are True |
| or         | x > 1 or x < 5     |  Returns True if at least one of the statements is True |
| not        | not(x > 1 or x < 5)| Reverses the result: returns False if the result is True, and True if it is False |

In [None]:
x = 5

x > 1 and x < 4

False

In [None]:
x > 1 or x < 4

True
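
The `not` operator from the table can be tried the same way. A minimal sketch using the same value of `x`:

```python
x = 5

print(x > 1 and x < 4)        # False: the second comparison fails
print(not (x > 1 and x < 4))  # True: not flips False to True
print(not (x > 1 or x < 4))   # False: the or expression is True, and not flips it
```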

## Lists

Lists are sequence data types that store multiple elements of any data type within a variable. They are created with **square brackets**.

In [None]:
# Example of a list

list_colors = ['Red', 'Blue', 'Green', 'Black', 'White', 'Red']
print(list_colors)
type(list_colors)

['Red', 'Blue', 'Green', 'Black', 'White', 'Red']


list

In [None]:
list_colors[1]

'Blue'

Lists have the following properties, which distinguish them from the other three data types used to store multiple elements of data:

* **Indexed**: Lists have a sequential index for each element in it starting from 0 and incrementing by 1 for each element.

* **Ordered** : Lists store elements according to their index number.

* **Mutable**: Elements inside a list can be changed, added or removed

* **Allow duplicates**: Lists allow duplicates of any data value to be stored within the same list.
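
The "mutable" property can be sketched as follows. The list `colors` is a made-up example, and `append()` and `remove()` are built-in list methods:

```python
colors = ['Red', 'Blue', 'Green']   # hypothetical list for this sketch

colors[1] = 'Yellow'      # change the element at index 1
colors.append('Black')    # add an element to the end
colors.remove('Red')      # remove the first matching element

print(colors)             # ['Yellow', 'Green', 'Black']
```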

In [None]:
# Checking the length of the list
len(list_colors)

6

In [None]:
# Can contain different data types, even lists

nested_list = ['Earth', 23, True, 56.5, ['Nested list', 1]]

nested_list

['Earth', 23, True, 56.5, ['Nested list', 1]]

In [None]:
# Retrieving individual element by index

print(list_colors)
print(list_colors[2])
type(list_colors[2])

['Red', 'Blue', 'Green', 'Black', 'White', 'Red']
Green


str

In [None]:
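# Let's take the first three entries
list_colors[0:3]

Slicing: `list[start:stop]` returns the elements from index `start` up to, but not including, index `stop`. In other words, slicing is not inclusive on the right.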
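
A few more slices on the same list make the "not inclusive on the right" rule easier to see. This is only a sketch; the list is repeated here so the example runs on its own:

```python
list_colors = ['Red', 'Blue', 'Green', 'Black', 'White', 'Red']

print(list_colors[0:3])   # ['Red', 'Blue', 'Green']  (indices 0, 1 and 2)
print(list_colors[:3])    # same result: the start defaults to 0
print(list_colors[3:])    # ['Black', 'White', 'Red']  (index 3 to the end)
print(list_colors[-1])    # Red  (negative indices count from the end)
```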

In [None]:
# Exercise nested_list
nested_list[4][1]

1

## Tuples

Tuples are sequence data types that store multiple elements of any data type within a variable. They are created with **round brackets**.

In [None]:
# Example of a tuple

tuple_colors = ('Red', 'Blue','Green','Black','White', 'Red')
print(tuple_colors)
type(tuple_colors)

('Red', 'Blue', 'Green', 'Black', 'White', 'Red')


tuple

Tuples have the following properties, which distinguish them from the other three data types used to store multiple elements of data:

* **Indexed**: Tuples have a sequential index for each element in it starting from 0 and incrementing by 1 for each element.

* **Ordered** : Tuples store elements according to their index number.

* **Immutable**: Elements inside a tuple **CANNOT** be changed, added or removed

* **Allow duplicates**: Tuples allow duplicates of any data value to be stored within the same tuple.
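
What "immutable" means in practice can be sketched like this (the tuple `point` is hypothetical). Trying to change an element raises a `TypeError`, while assigning a brand new tuple to the same name is allowed:

```python
point = (3, 4)   # hypothetical tuple for illustration

try:
    point[0] = 10            # tuples do not support item assignment
except TypeError as error:
    print('Cannot change a tuple:', error)

point = (10, 4)   # creating a new tuple and reusing the name is fine
print(point)      # (10, 4)
```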

In [None]:
# Checking the length of the tuple

len(tuple_colors)

6

In [None]:
# Can contain different data types, even tuples

nested_tuple = ('Earth', 23, True, 56.5, ['Nested list', 1],('Tuple',2))

nested_tuple

('Earth', 23, True, 56.5, ['Nested list', 1], ('Tuple', 2))

In [None]:
# Retrieving individual element by index

print(nested_tuple[1])
type(nested_tuple[1])

23


int

## Sets

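Sets are collection data types that store multiple unique elements within a variable. The elements can be of different data types, as long as each element is itself unchangeable (for example numbers and strings). They are created with **curly brackets**.

In [None]:
# Example of a set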

set_colors = {'Red', 'Blue','Green','Black','White', 'Red'}
print(set_colors)
type(set_colors)

{'Black', 'White', 'Green', 'Blue', 'Red'}


set

Sets have the following properties, which distinguish them from the other three data types used to store multiple elements of data:

* **Not indexed**: Sets store values as a collection without any index

* **Unordered** : Since sets do not have indices, they are not ordered and do not have specific positions

* **Unchangeable but mutable**: An element inside a set **CANNOT** be changed in place, but elements can be added or removed

* **No duplicates**: Sets do not allow duplicates of any data value to be stored within the same set
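
These properties can be sketched with the built-in `add()` and `remove()` set methods. The names `flavours` and `unique_colors` are made up for this example:

```python
flavours = {'chocolate', 'maple', 'coffee'}

flavours.add('vanilla')     # add a new element
flavours.add('maple')       # already present, so nothing changes
flavours.remove('coffee')   # remove an element

print(len(flavours))        # 3
print('maple' in flavours)  # True

# A common use: dropping duplicates from a list
unique_colors = set(['Red', 'Blue', 'Red'])
print(len(unique_colors))   # 2
```

The example prints lengths and membership checks rather than the sets themselves, because the display order of a set is not guaranteed.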

In [None]:
# Checking the length of the set

len(set_colors)

5

In [None]:
# Can contain different data types, except sets, lists and dicts (tuples of simple values are allowed)

nested_set = {'Earth', 23, True, 56.5}

nested_set

{23, 56.5, 'Earth', True}

## Dictionaries

Dictionaries are data types that store multiple elements as key:value pairs within a variable. The values can be of any data type. They are created with **curly brackets**, with a **colon** between each key and its value.

<img src='https://programmathically.com/wp-content/uploads/2021/05/dictionary-1024x554.png' width=600>

In [None]:
# Example of a dictionary

dict_student = {'Name':'Ryan', 'DOB': 1990, 'Height': 1.8, 'Unit': 'm', 'Height': 1.7}
print(dict_student)
type(dict_student)

{'Name': 'Ryan', 'DOB': 1990, 'Height': 1.7, 'Unit': 'm'}


dict

Dictionaries have the following properties, which distinguish them from the other three data types used to store multiple elements of data:

* **Indexed**: Dictionaries are defined with keys to each value, so they have user defined indices.

* **Ordered** : Dictionaries keep their insertion order in Python 3.7 and above, but are unordered in Python 3.6 and earlier.

* **Mutable**: Elements inside a dictionary can be changed, added or removed

* **No duplicates**: Dictionaries do not allow duplicate keys. If the same key appears more than once, the later value overwrites the earlier one *(refer above to how height was overwritten from 1.8 to 1.7)*
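
The "mutable" property can be sketched as follows. The dictionary `student` is a made-up example, separate from `dict_student` above:

```python
student = {'Name': 'Ryan', 'DOB': 1990, 'Height': 1.7}

student['Height'] = 1.75        # change the value of an existing key
student['City'] = 'Toronto'     # add a new key:value pair
del student['DOB']              # remove a key and its value

print(student)   # {'Name': 'Ryan', 'Height': 1.75, 'City': 'Toronto'}
```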

In [None]:
# Checking the length of the dictionary
len(dict_student)

4

In [None]:
# Values can be of any data type; keys cannot be sets, lists or dicts (tuples of simple values are allowed)
nested_dict = {'First':'Earth', 2: 23, 'Third': True, 4: {'test',1}, True: {1:4,'Two':23}}

nested_dict

{'First': 'Earth',
 2: 23,
 'Third': True,
 4: {1, 'test'},
 True: {1: 4, 'Two': 23}}

In [None]:
# Retrieving individual element by key

print(dict_student['Name'])
type(dict_student['Name'])

Ryan


str

In [None]:
# Practice
# Retrieve the height of this student
print(dict_student['Height'])

1.7


In [None]:
# Exercise - get the second element from the True key in the nested dict
print(nested_dict[True]['Two'])

23


## Conditionals - If/Else

If/Else statements are used to apply conditional processing of the code depending on the logical condition. 
Extending on the comparison operators, you can apply logical conditions in Python.




| Operators   | Name     | Example |
| ----------- | ----------- |------------ | 
| ==           | Equal      | x == y       |
| !=          | Not equal      | x != y  | 
| >          | Greater than      | x > y   |
| <          | Less than      | x < y   |
| >=           | Greater than or equal to      | x >= y   |
| <=          | Less than or equal to    | x <= y  |

Combining an if/else statement along with these comparison operators allows you to define how the code should flow.

In [None]:
# Testing an if/else statement

num = 10

# Similar idea in Excel: IFS(num>10,"is greater than 10",num=10,"is equal to 10","is less than 10")
if num > 10:
    print(num, 'is greater than 10')
elif num == 10:
    print(num, 'is equal to 10')
else:
    print(num, 'is less than 10')

10 is equal to 10


In [None]:
# An if statement can also be used on its own (nothing prints here because 5 > 10 is False)

num = 5

if num > 10:
  print(num, 'is greater than 10')

In [None]:
# If statements can contain multiple pathways using elif (else if)
# Try changing the value of num to see what prints out

num = 2

if num < 5:
  print(num, 'is less than 5')
elif num < 10:
  print(num, 'is less than 10')
elif num == 10:
  print(num, 'is equal to 10')
else:
  print(num, 'is greater than 10')

2 is less than 5


**Practice** - Print the square of a number only if it is an even number

In [None]:
# Hint: use the remainder operator
number_test = 4

if number_test % 2 == 0:
    print(number_test**2)

16


In [None]:
# Bonus: Print the square of a number only if it is divisible by 13.
number_test = 26
if number_test%13 == 0:
    print(number_test ** 2)


676


## Loops

### For Loops

When working with a data type that has multiple elements in it (string, set, list, tuple, dictionary), you may have to perform a certain operation on each of the elements. `For` loops are used to iterate over the elements in a data sequence.

<img src='https://pynative.com/wp-content/uploads/2021/06/for-loop-in-python.png' width=600>


In [None]:
# variable is a sequence
# Iterating over a sequence of elements in a list

fruits = [24, 35, 22, 16, 26]

for x in fruits:
  print(x)

24
35
22
16
26


In [None]:
# Iterating over a sequence of elements in a dict

shopping_list = {'milk': 2, 'eggs': 6, 'bread': 1, 'cheese': 2}

# First Loop
x = 'milk'
print(shopping_list[x])
# Second Loop
x = 'eggs'
print(shopping_list[x])
# Third Loop
x = 'bread'
print(shopping_list[x])
# Fourth Loop
x = 'cheese'
print(shopping_list[x])


# OR just create a for-loop
print('')
print('Faster Way')
for x in shopping_list:
    print(shopping_list[x])

2
6
1
2

Faster Way
2
6
1
2


In [None]:
for x in shopping_list:
    print(x)

milk
eggs
bread
cheese


In [None]:
for key, value in shopping_list.items():
    print(key, value)

milk 2
eggs 6
bread 1
cheese 2


`For` loops can also be used to repeat a piece of code a user-defined number of times using the `range()` function.

The `range()` function returns a sequence of numbers that starts at 0 (default, can be changed), increases by 1 (default, can be changed) and stops just before a user-specified value.

Syntax = `range(start*, stop, increment*)`

 *= optional
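
As a quick check of the syntax above, wrapping `range()` in `list()` shows all of the generated numbers at once. The start, stop and increment values below are picked only for illustration.

```python
# range() produces its numbers one at a time, so list() is used to show them together

# Only stop given: starts at 0, increases by 1, stops before 5
only_stop = list(range(5))
print(only_stop)        # [0, 1, 2, 3, 4]

# start and stop given: starts at 2, stops before 6
start_and_stop = list(range(2, 6))
print(start_and_stop)   # [2, 3, 4, 5]

# start, stop and increment given: a negative increment counts down
counting_down = list(range(10, 0, -3))
print(counting_down)    # [10, 7, 4, 1]
```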

In [None]:
# Printing the elements generated by the range() function
for num in range(8):
  print(num)

0
1
2
3
4
5
6
7


In [None]:
# Note that the number sequence is not inclusive of the end value

for num in range(0,10,2):  # start, stop, step
  print(num)

0
2
4
6
8


In [None]:
# Repeating a piece of code with for loop

for x in range(0, 5):
  print('This is loop', x)

This is loop 0
This is loop 1
This is loop 2
This is loop 3
This is loop 4


In [None]:
list_of_numbers = [5, 2, 56, 22]

# I want to multiply every number in this list by 2.

list_of_numbers[0] = list_of_numbers[0] * 2
list_of_numbers[1] = list_of_numbers[1] * 2
list_of_numbers[2] = list_of_numbers[2] * 2
list_of_numbers[3] = list_of_numbers[3] * 2

print(list_of_numbers)

[10, 4, 112, 44]


In [None]:
list_of_numbers = [5, 2, 56, 22]

# I want to multiply every number in this list by 2.

for num in range(0, 4): # num is counting: 0, 1, 2, 3
    list_of_numbers[num] = list_of_numbers[num] * 2

print(list_of_numbers)

[10, 4, 112, 44]


**Practice: repeat the for-loop above but only multiply by 2 if the value is already greater than 10**

In [None]:
# Hint: if-statement inside of the for-loop
list_of_numbers = [5, 2, 56, 22]

# Version 1 (does not work: num is the index 0-3, not the list value, so it is never greater than 10)
for num in range(0, 4):
  if num > 10:
    list_of_numbers[num] = list_of_numbers[num] * 2

print(list_of_numbers)

[5, 2, 56, 22]


In [None]:
# Version 2
list_of_numbers = [5, 2, 56, 22]

for i in range(0,4):
  if list_of_numbers[i]>10:
    list_of_numbers[i] = list_of_numbers[i]*2
print(list_of_numbers)

[5, 2, 112, 44]


In [None]:
# Bonus: loop through the length of the list, not the number 4

# v1
list_of_numbers = [5, 2, 56, 22]

for num in range(len(list_of_numbers)): # num is counting: 0, 1, 2, 3
    list_of_numbers[num] = list_of_numbers[num] * 2

print(list_of_numbers)

[10, 4, 112, 44]


In [None]:
# v2
list_of_numbers = [5, 2, 56, 22]
for i in range(len(list_of_numbers)):
  if list_of_numbers[i]>10:
    list_of_numbers[i] = list_of_numbers[i]*2
print(list_of_numbers)

[5, 2, 112, 44]
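
The index-based loops above can also be written with the built-in `enumerate()` function, which hands back the index and the value together. This is only one possible alternative, sketched with the same sample list.

```python
list_of_numbers = [5, 2, 56, 22]

# enumerate() yields (index, value) pairs, so range(len(...)) is not needed
for i, value in enumerate(list_of_numbers):
    if value > 10:
        list_of_numbers[i] = value * 2

print(list_of_numbers)  # [5, 2, 112, 44]
```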



### While Loops

The `while` loop is another type of loop that repeats a piece of code as long as a specified condition is `True`. The condition is built from comparison operators and boolean logic, and it is checked again before every repetition.

**Keep in mind that this may lead to an infinite loop (the loop keeps on repeating) unless the condition is changed to `False` within the code or the `break` statement is used.** 
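
The two ways out of a loop mentioned above can be sketched as follows. The variable names `counter` and `attempts` and the limit of 3 are chosen only for this illustration.

```python
# Exit 1: the condition itself eventually becomes False
counter = 0
while counter < 3:
    counter += 1  # without this line the loop would never end

# Exit 2: the condition is always True, so break is the only way out
attempts = 0
while True:
    attempts += 1
    if attempts == 3:
        break

print(counter, attempts)  # 3 3
```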

In [None]:
# Example of a while loop

counter = 0

while counter < 5:
  print(counter)
  counter += 1

0
1
2
3
4


In [None]:
# If the counter is not incremented each loop, the condition stays True and code goes into an infinite loop
# In case you have run this, you can click the stop button to stop execution

counter = 0

while counter < 5:
    print(counter)

## Functions

A function is a block of code that runs only when it is called. If you have a piece of code that will be reused many times, you can define it as a function and call the function instead of retyping the code every time you want to use it.

In [None]:
print('hello world')

# Like y = f(x) in math: a function takes an input x and gives back an output y

In [None]:
# Adding two numbers and printing without function

a = 5
b = 10

sum_result = a + b
print('The sum is', sum_result)

a = 10
b = 20

sum_result = a + b
print('The sum is', sum_result)

In [None]:
# Defining a function to add two numbers and printing the sum

def addition(num1, num2):
  sum_result = num1 + num2
  print('The sum is', sum_result)

# Calling the function

a = 5
b = 10

addition(a,b)

a = 10
b = 20

addition(a,b)


    def some_function(input_value):
        # Put code here.
        some_value = ...  # replace ... with any expression
        print(input_value)
        return some_value

As you can see in the example above, we used a function instead of typing out the code to add two numbers and print the result a second time. The benefit of functions may not be evident with only a few lines of code, but when the same few lines are needed in hundreds of places, you can imagine how tedious retyping them would be.

Functions make code easy to reuse without cluttering your program, and they standardize what a piece of code does each time it runs.

You have already been using functions since you started this notebook. `print()`, `type()`, `len()` and `range()` are all examples of built-in functions in Python (please refer to the [documentation](https://docs.python.org/3/library/functions.html) for more details).


### Logic and structure of functions

Consider the following dummy function:

In [None]:
constant_value = 5

def function_name(input1, input2):
  
  # Do something with input1 and input2
  output = input1 + input2

  return output

<img src='https://assets.alexandria.raywenderlich.com/books/da/images/4d60cfe665878267599b825b56aa77f73ad6722073fd0a9e172e3a6c82429fc0/original.png' width=600>


**Functions may take in input variables/values, process any code within the function and output results.**

<img src='https://media.geeksforgeeks.org/wp-content/uploads/20220721172423/51.png' width=600>


### Process flow

When working with functions, there are two distinct phases. The first phase is to create (define) the function, giving it a name and the code logic of what it will do once executed. The second phase is to use the function by calling it.

*Please note that the term 'phases' is used here only to distinguish between creating a function and calling it.*

When a function is defined, its code is not executed. Python only stores the function under its name so that it can be called later. Functions can also be saved in a Python file called a module, which can then be imported. There are several advantages to using functions:
* Functions can be re-used in other programs once the module that contains them has been imported

* It is easy to share function modules without having to share the whole program

* You can use functions that other people have written by importing function libraries (we will look into libraries further)
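
As a small illustration of re-using a function that someone else wrote, the sketch below imports the `math` module that ships with Python and calls its `sqrt` function. The helper name `hypotenuse` is made up for this example.

```python
import math

def hypotenuse(side_a, side_b):
    # Hypothetical helper that re-uses math.sqrt instead of re-implementing it
    return math.sqrt(side_a ** 2 + side_b ** 2)

result = hypotenuse(3, 4)
print(result)  # 5.0
```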


#### Creating a function

In [None]:
# Define a function that greets the input name

student1 = 'John'

def greet(name):
  print(f'Hello there {name}!') # This is f-string formatting

In [None]:
def greet2(name):
  return f'Hello there {name}!'

Notice how there is no printed output even though the function contains a print statement. **This is because the function is only being defined and stored in memory, not executed.**

#### Calling a function

When a function has been defined or imported, it can now be called and executed. 
* Calling: The code that you type to invoke the function, for example `greet('John')`
* Executing: Python running the code inside the function to produce an output

Functions are called by typing out the function name followed by parentheses. 

In [None]:
# Calling the function and executing it

student1 = 'John'
student2 = 'Yahia'
student3 = 'Person'

greet(student1)
greet(student2)

In [None]:
greet2('Yahia')

In [None]:
greeting_result_firstattempt = greet('John')
greeting_result_secondattempt = greet2('Yahia')

# print('Printing')
# print(greeting_result_firstattempt)
# print(greeting_result_secondattempt)

In [None]:
# print() displays a value but returns None, so new_variable is None
a = 2
b = 4

new_variable = print(a)
newer_variable = a

print('Printing')
print(new_variable)
print(newer_variable)

#### Passing arguments

Functions can work with external information when it is passed in as arguments to the function's parameters. When defining a function, the parameters it accepts as input are listed within the parentheses after the function name.

```
def sum(input1,input2):
    output = input1 + input2
    return output
```
In the example above, the function defines two parameters, `input1` and `input2`.

* **Parameter**: Variable in a function definition, has no value until the function is called

* **Argument**: Value/variable passed while calling a function

```
a = 1
b = 1

sum(a,b)
```

We are passing *a* and *b* as arguments to the input parameters of the sum function. These arguments are then used by the function code to return an output.
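
The difference between parameters and arguments can be sketched as below. The name `add_numbers` is hypothetical and is used here instead of `sum` only to avoid replacing Python's built-in `sum()` function.

```python
def add_numbers(input1, input2):  # input1 and input2 are parameters
    return input1 + input2

a = 1
b = 1

total_from_variables = add_numbers(a, b)  # a and b are arguments
total_from_values = add_numbers(10, 20)   # plain values work as arguments too

print(total_from_variables)  # 2
print(total_from_values)     # 30

# Passing the wrong number of arguments raises a TypeError
try:
    add_numbers(1)
except TypeError as error:
    print('TypeError:', error)
```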

#### Returning vs printing **(Important)**

When we use the print() function, the value is displayed on the Python console but it is not stored anywhere. This is different from the return value of a function. The return value is the output of the function, which can be stored and used outside the function.

In [None]:
# Defining a function with print

def print_test():
  print(5)

# Calling the function and assigning its result to a variable

a = print_test()
print(a)
type(a)

In the example above, you can see that even though the function prints 5 when called, the value stored in `a` is `None` (of type NoneType) and not an int. This is because the function has no return statement, and print() only displays a value without returning it.

In [None]:
# Testing the data type of print values
type(print('test'))

Return values are the outputs of a function that can be used outside the function. The type of the return value is simply the data type of whatever is returned.

In [None]:
def print_test():
  return 10

a = 5 + print_test()
print(a)
type(a)

## Libraries

We often talk about Python in terms of two things: the standard library and the ecosystem.


### The Standard Library
The Python standard library is also known as "pure" Python. It is the part of Python maintained under the Python Software Foundation. When you install Python, it comes with the "standard" things that we have learned so far, such as lists, tuples and basic math. The standard library is a collection of modules that make programming easier because you do not have to write commonly used commands from scratch.
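
As a minimal sketch of what "comes with Python" means, the modules below can be imported without installing anything. The list of scores is made-up sample data.

```python
# These modules ship with Python, so no installation is needed
import math
import statistics

scores = [70, 85, 90, 95]  # made-up sample data

average_score = statistics.mean(scores)
square_root = math.sqrt(16)

print(average_score)
print(square_root)  # 4.0
```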

### The Ecosystem
Python's ecosystem is the reason it is popular in the data industry. When we talk about Python, we most often are talking about the "Python Ecosystem". The ecosystem is the standard library plus third-party open-source libraries. They are libraries, made by anyone around the world, that are like "add-ons" to Python that you can download and install for free.

Python's ecosystem is so big (hundreds of thousands of third party libraries) that some groups of people organized them into websites where you can easily download and install third party libraries. They are below:

- [Python Package Index (PyPI). Hundreds of thousands of libraries that do everything from mapping to stock trading](https://www.PyPi.org)
- [Anaconda Distribution. This is Python + popular third party scientific libraries](https://www.anaconda.com/products/distribution)
- [Conda](https://www.conda.io)
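
Third-party libraries are typically installed from the command line (for example with `pip install` followed by the library name). One way to check from Python whether a library is already available is sketched below. The helper name `is_installed` is made up for this example.

```python
import importlib.util

def is_installed(library_name):
    # Hypothetical helper: find_spec returns None when the library cannot be found
    return importlib.util.find_spec(library_name) is not None

print(is_installed('json'))   # True, json is part of the standard library
print(is_installed('numpy'))  # True only if NumPy has been installed
```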

### Common Libraries
- NumPy - for working with numbers
- Pandas - for data analysis
- Matplotlib - for making plots
- datetime - for working with dates
- requests - for working with APIs
- pdb - for debugging

**Importing a Library**

In [None]:
import numpy

numpy.array([1, 2, 3])

In [None]:
import numpy as np

np.array([1, 2, 3, 4, 5])

In [None]:
import datetime as dt

dt.datetime.now()

In [None]:
import datetime as dt

dt.date.today()
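
Besides importing a whole library, specific names can be imported with `from ... import ...`, so they can be used without the library prefix. The fixed date below is made up so that the result is the same every time the code runs.

```python
from datetime import date, timedelta

# A fixed, made-up date instead of date.today() keeps the output repeatable
start_date = date(2024, 1, 31)
next_day = start_date + timedelta(days=1)

print(next_day)  # 2024-02-01
```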

**Exercise** Find a library on [PyPI](https://www.pypi.org) that makes maps!

In [None]:
# Your code here


## Resources
- The world's most popular Python podcast: [Talk Python to Me](https://talkpython.fm)
- The definitive book on Pandas: [Python for Data Analysis](https://wesmckinney.com/book/)
- The BEST Introduction to Applied Statistics: [StatQuest - Joshua Starmer](https://www.youtube.com/playlist?list=PLblh5JKOoLUK0FLuzwntyYI10UQFUhsY9)