# Variables & Data Types

# Variables

* Variables are references to data in Python.
* Variables refer to objects.
* Objects in Python are instances of classes (built-in as well as user-defined) and have three attributes: ID, type and value.
* ID is the object's unique identifier; in CPython it is the object's memory address.
* Type refers to the class of the object.
* Value is the data the object holds.
* When a variable is assigned an initial value, an object is created with that value and the variable "points" to it.
* If a second variable is assigned from the first, both refer to the same object. If either variable is then assigned a different value, it points to a new object with the new value and no longer refers to the same object as the other variable.


In [None]:
message = "Hello World" # New object of string type created, message now points to it
alt_message = message # New variable created, but points to the same object in memory. To verify this, use the id() function
print(id(message) == id(alt_message))

alt_message = "Goodbye World" # New object of string type created, alt_message now points to it
print(id(message) == id(alt_message))

True
False
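
Rebinding a name and mutating a shared object behave differently, which is easiest to see with a mutable object such as a list. The following is a minimal sketch; the names `scores` and `alias` are illustrative and not part of the notes above.

```python
# Two names bound to the same list object (names are illustrative only)
scores = [1, 2, 3]
alias = scores
same_before = alias is scores        # both names refer to one object

alias.append(4)                      # mutation: visible through both names
print(scores)

alias = [0]                          # rebinding: alias now refers to a new object
same_after_rebind = alias is scores
print(same_before, same_after_rebind)
```

Mutation changes the one shared object, while assignment only changes which object a name refers to.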


## Python Variable Scope

Scope refers to the region of a program where a variable can be accessed. Python resolves names using the LEGB rule - local, enclosing, global, built-in.

* Local scope applies to variables that are declared inside a function and are only available for use inside the function.

* Enclosing scope refers to the fact that nested code blocks can access variables that are available outside of them, but the reverse is not possible.

* Global scope refers to variables that are available for use in the whole file.

* Built-in scope contains the names that Python makes available everywhere without an import, such as `print`, `len` and `range`.

* Scope is always resolved inside-out. It means that we always look for a variable in the smallest surrounding scope and then check for its availability in a broader scope. 

* The `global` and `nonlocal` keywords let a code block rebind variables from the global or enclosing scope instead of creating new local variables. Reading such variables needs no keyword.

In [10]:
global_variable = 1337
def outer():
    print(f"We are inside outer.")
    outer_num = 5 # a local variable
    # global_variable = 2448
    print(global_variable)
    def inner():
        inner_num = 10 # another local variable but only available in inner()
        print(f"We are inside inner.")
        print(f"outer_num is {outer_num}")
        print(f"inner_num is {inner_num}")
    inner()
    print(f"outer_num is {outer_num}")
    # print(f"inner_num is {inner_num}")
outer()


x = 'global x'

def outer_foo():
    global x # now we have access to the global x
    x = 'outer x' # so this assignment changes the global x instead of creating a new local variable
    def inner_foo():
        # x = 'inner_x'
        # nonlocal x # now we have access to the x in the outer_foo() function
        global x # now we have access to the global x
        print(x)
    inner_foo()
    print(x)
outer_foo()
print(x)

We are inside outer.
1337
We are inside inner.
outer_num is 5
inner_num is 10
outer_num is 5
outer x
outer x
outer x
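
The cell above leaves `nonlocal` commented out. As a small sketch of what it does, the hypothetical `make_counter` function below (not part of the original notes) uses `nonlocal` to rebind a variable that lives in the enclosing function.

```python
def make_counter():
    count = 0  # lives in the enclosing scope of increment()

    def increment():
        nonlocal count  # rebind the enclosing variable instead of creating a local
        count += 1
        return count

    return increment

counter = make_counter()
first = counter()
second = counter()
print(first, second)
```

Without the `nonlocal` line, `count += 1` would raise `UnboundLocalError`, because the assignment would make `count` local to `increment()`.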


## Data Types

* Data types supported in Python include-

| Type  | Description |
| ------|-------------|
| `int`   | Representation of integers in Python - whole numbers|
| `float` | Represents numeric values with fractional parts |
| `str`   | Represents strings - textual data - sequence of characters |
| `bool`  | Represents binary values denoting True/False |
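
To check which of these types a value has, the built-in `type()` and `isinstance()` functions can be used. A short sketch, with sample values chosen only for illustration:

```python
samples = [42, 3.14, "hello", True]
type_names = [type(value).__name__ for value in samples]
print(type_names)

# bool is a subclass of int, so True also passes an int check
bool_is_int = isinstance(True, int)
print(bool_is_int)
```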

## Strings

* Strings are sequences of characters and are used to store textual information.

In [None]:
message = "This is a simple message."
print(message)
print(message[0:5]) # Strings are index-able and slice-able
subsequent_message = "This is a subsequent message."
print(message + " " + subsequent_message)
print(message, subsequent_message, sep=" ", end=" P.S. - I forgot to tell you this.\n")
# sep sets the token that the strings are separated with
# end sets the token that should be added to the end of the whole string

lc_message = "This is a lower case message"
print(lc_message.lower())

uc_message = "This is an upper case message"
print(uc_message.upper())

print(message.count("a")) # returns the number of occurrences of "a"
print(message.find("a")) # returns the index of the first occurrence of "a" (-1 if not found)

This is a simple message.
This 
This is a simple message. This is a subsequent message.
This is a simple message. This is a subsequent message. P.S. - I forgot to tell you this.
this is a lower case message
THIS IS AN UPPER CASE MESSAGE
2
8


* f-strings (formatted string literals, available since Python 3.6) allow us to embed the values of expressions in a string literal.

In [None]:
greeting = "Greetings"
name = "traveller"
print(f"{greeting}, {name}!")

# An example of string formatting for output -

hour = 6
minute = 5
second = 10

print(f"The time is {hour:02}:{minute:02}:{second:02}")

# Here, the 2 denotes that placeholder for hour must be at least
# 2 characters wide. We write it as 02 to add leading zeros to make it
# 2 chars wide.

Greetings, traveller!
The time is 06:05:10


## Advanced String Formatting

* Strings can be formatted in many flexible ways

In [31]:
# Using the format() function

employee = {"name" : "Jimothy", "age" : 27}
sentence = "Hello I am {} and I am {} years old.".format(employee["name"], employee["age"])
print(sentence)

# Alternatively, we can just pass the entire dictionary as a field in the format function and use indices
# to access multiple fields from the same object.
sentence = "Hello I am {0[name]} and I am {0[age]} years old.".format(employee)
print(sentence)
# Inside a replacement field the key is written without quotes: str.format() treats a
# non-numeric index as a string key (and an all-digit index as an integer).
# For any other key type, extract the values manually like in the earlier example.


# The same works with object attributes. Pass a single object to the format() function and access its attributes
# using the dot(.) operator in the placeholder fields

# Using keyword arguments

sentence = "Hey {name}, I am {my_name} and I am {age} years old. Nice to meet you.".format(name="Jimothy", my_name="Pam", age="25")
print(sentence)

# Unpacking keyword arguments.

sentence = "Hey {name}, have you seen {co_worker}? He's a mess!".format(**employee, co_worker="Dwight")
print(sentence)

Hello I am Jimothy and I am 27 years old.
Hello I am Jimothy and I am 27 years old.
Hey Jimothy, I am Pam and I am 25 years old. Nice to meet you.
Hey Jimothy, have you seen Dwight? He's a mess!
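
The comment about object attributes in the cell above has no accompanying code. A minimal sketch, assuming a hypothetical `Person` class that is not part of the original notes:

```python
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

person = Person("Jimothy", 27)

# {0.name} reads the attribute `name` from the first positional argument
sentence = "Hello I am {0.name} and I am {0.age} years old.".format(person)
print(sentence)
```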


# Operations in Python

## Arithmetic

| Symbol | Operation |
|--------|-----------|
| +      | Addition  |
| -      | Subtraction  |
| *      | Multiplication  |
| /      | Division  |
| //     | Floor (integer) division  |
| %      | Modulus (remainder)  |
| **     | Exponentiation  |

* Python also has a built-in `math` module for more complex operations - factorial, combinations, permutations, rounding, trigonometry, etc.

In [None]:
import math
print(math.factorial(5))
print(math.sin(math.pi/2))

120
1.0
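
The operators in the table above can be tried out directly. The sketch below stores a few results in a dictionary (the name `results` is illustrative) to highlight the difference between true division and floor division.

```python
import math

results = {
    "7 / 2": 7 / 2,        # true division always returns a float
    "7 // 2": 7 // 2,      # floor division rounds down
    "-7 // 2": -7 // 2,    # rounds toward negative infinity, not toward zero
    "7 % 2": 7 % 2,        # remainder
    "2 ** 10": 2 ** 10,    # exponentiation
}
for expression, value in results.items():
    print(expression, "=", value)

# Rounding helpers from the math module
rounded_down = math.floor(3.7)
rounded_up = math.ceil(3.2)
print(rounded_down, rounded_up)
```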


## Logical

Python supports the following logical operations -

1. `and`
2. `or`
3. `not`
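
A short sketch of the three operators in use. The helper `is_positive` is a hypothetical function introduced only for this example.

```python
def is_positive(n):
    return n > 0

checks = [
    is_positive(3) and is_positive(-1),   # both sides must be true
    is_positive(3) or is_positive(-1),    # at least one side must be true
    not is_positive(-1),                  # negation
]
print(checks)

# `and` / `or` short-circuit and return one of their operands,
# so `or` is often used to supply a fallback value
default_name = "" or "anonymous"
print(default_name)
```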

## Comparisons

Operations that compare two values and evaluate to `True` or `False`.

| Symbol | Description|
| -------| -----------|
| == | is equal to |
| != | is not equal to |
| < | is less than |
| > | is greater than |
| <= | is less than or equal to |
| >= | is greater than or equal to |

In [None]:
operand_1 = 5
operand_2 = 5
operand_3 = 10

# combine operations to build complex expressions that evaluate to True/False
# using logical operators

print(operand_1 == operand_2 and operand_1 != operand_3)

## Identity Operators

* `x is y` - returns `True` if x and y are the same object (i.e. references to the same object in memory)

* `x is not y` - Inverse of above operator.

* `is` compares identity, not value. CPython caches small integers (-5 to 256), so two equal small integers are the same object and `x is y` returns `True`; equal integers outside that range are usually separate objects, so `x is y` can return `False`. This is an implementation detail, so use `==` to compare values.

In [None]:
x = 350
y = 350
print(x is y) # Should print False

y = x
print(x is y) # Should print True


x = 10
y = 10
print(x is y) # Prints True in CPython because small integers such as 10 are cached

y = x
print(x is y) # Should print True

False
True
True
True
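
The difference between equality and identity is clearer with mutable objects, where no caching is involved. A minimal sketch with illustrative names:

```python
first = [1, 2, 3]
second = [1, 2, 3]

equal_values = first == second      # compares contents
same_object = first is second       # compares identity
print(equal_values, same_object)

# `is` is the idiomatic check for the None singleton
result = None
is_none = result is None
print(is_none)
```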


## Membership Operator

* `x in y` - checks whether `x` is present in the container `y`
* `x not in y` - Inverse of above operator

In [None]:
nums = [1, 2, 3, 4, 5]
print(1 in nums)
print(10 not in nums)

True
True


# Control Flow

## Decision making

* Conditionals are implemented in the form of `if`-`elif`-`else` blocks.



In [None]:
a = 10
b = 20
if a > b:
  print("a is greater than b")
elif a < b:
  print("a is less than b")
else:
  print("a is equal to b")

a is less than b


* Python has no `switch-case` statement. An `if-elif-else` chain covers the same ground, and Python 3.10 and later also provide the `match` statement for structural pattern matching.
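
Besides an `if-elif-else` chain, a dictionary lookup is another common way to emulate switch-case when each case simply maps a value to a result. A sketch, using a hypothetical `describe_day` function:

```python
def describe_day(day_number):
    # Map each "case" to its result; .get() supplies the default branch
    names = {1: "Monday", 2: "Tuesday", 3: "Wednesday"}
    return names.get(day_number, "Unknown day")

print(describe_day(2))
print(describe_day(9))
```

This works on any Python 3 version and keeps the cases in one data structure, at the cost of only supporting simple value-to-result mappings.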

## Loops

Loops are used to implement iteration over a range or while a certain condition holds true.

### `while` loop

While loops repeatedly execute a set of instructions as long as a condition remains true.

### `for` loop

For loops are used to iterate over the elements of a sequence or any other iterable.

### `break` and `continue`

`break` and `continue` alter the execution of loops

* `break` exits the immediate loop
* `continue` ignores all subsequent instructions of the loop and goes to the next iteration.

### `loop-else`

The `else` clause of a loop executes a block of code when the loop terminates normally, i.e. without hitting a `break` statement.

**NOTE - IF WE BREAK, ELSE IS NOT EXECUTED.**

### `pass` statement

`pass` is a statement that does nothing. It acts as a placeholder where a statement is syntactically required, e.g. the body of a function or loop that will be implemented later.

In [None]:
a = 5
while a < 50:
  a = a * 2
  print(a)

# Iterating over the elements of a list
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
for number in numbers:
  print(number)

# An example of enumerate()
for idx, n in enumerate(numbers):
  print(idx, n)

# range-based for loop - range(start, end, step size)
for i in range(10, 0, -1):
  print(i)

for number in numbers:
  if number > 5:
    break
else:
  print("No number is greater than 5.")


for number in numbers:
  # For later implementation - do something here
  pass

10
20
40
80
1
2
3
4
5
6
7
8
9
10
0 1
1 2
2 3
3 4
4 5
5 6
6 7
7 8
8 9
9 10
10
9
8
7
6
5
4
3
2
1
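
The cell above produces no output from its `for`-`else` loop, because the loop breaks. The sketch below shows both outcomes of a loop `else`, plus `continue`; the function `first_divisor` is hypothetical and exists only for this illustration.

```python
def first_divisor(n):
    """Return the smallest divisor of n greater than 1, or None if there is none below n."""
    for candidate in range(2, n):
        if n % candidate == 0:
            found = candidate
            break            # leaves the loop, so the else block is skipped
    else:
        found = None         # runs only when the loop finished without break
    return found

print(first_divisor(15))
print(first_divisor(13))

odds = []
for i in range(1, 8):
    if i % 2 == 0:
        continue             # skip the rest of this iteration
    odds.append(i)
print(odds)
```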


# Containers

* Containers are objects in Python that can store or "group" other objects.

* There are four commonly used containers - lists, tuples, dictionaries and sets.

## Lists

* Lists are containers (think - dynamic arrays) that can store other objects inside them.

* They are mutable - they can be modified in place using list methods.

* They are stored sequentially in memory, and so can be indexed.

* Lists support "slicing" - using : to "extract" or "separate" subsequences from the data.

* Data can be accessed in reverse order using negative indices

In [None]:
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] # Declaring a list
# indices  0  1  2  3  4  5  6  7  8   9
# neg ind -10 -9 -8 -7 -6 -5 -4 -3 -2 -1
print(numbers[0]) # Printing the first element of the list
print(numbers[1]) # Printing the second element of the list

numbers[0] = 10 # Modifying the first element of the list
print(numbers)

numbers.append(11) # Adding an element to the end of the list
print(numbers)

numbers.pop() # Removes the last element of the list
print(numbers)

# numbers[start:end:step] - from index start to (end-1), taking steps of size step
print(numbers[5:8:2])
# absence of start means start from the beginning
# absence of end means till the end
# absence of step implies step size of 1


1
2
[10, 2, 3, 4, 5, 6, 7, 8, 9, 10]
[10, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
[10, 2, 3, 4, 5, 6, 7, 8, 9, 10]
[6, 8]


## Tuples

* Tuples, like lists, are ordered sequences that can be indexed, but they are immutable, i.e. they cannot be modified after creation.

In [None]:
numbers = (1, 2, 3, 4, 5)
print(numbers)
print(numbers[0])
numbers[2] = 10 # Should raise an error because we are trying to modify a tuple

(1, 2, 3, 4, 5)
1


TypeError: 'tuple' object does not support item assignment

## Dictionaries

* Dictionaries are Python's implementation of hash tables. Data in dictionaries are stored in the form of `(key, value)` pairs.

* All keys in a dictionary must be unique and hashable (immutable built-in types such as strings, numbers and tuples of immutables qualify).

* Values in dictionaries can be accessed using their associated keys.

In [None]:
person = {
    'name' : 'John Smith',
    'age' : 25,
    'designation' : 'Intern'
}

print(person)
# Add a key value pair
person["hobbies"] = "Climbing"
print(person)
# Remove a key-value pair
person.pop("hobbies")
print(person)
# Look up a key safely using get()
print(person.get("specialty")) # returns None because this key does not exist in the dictionary
# Access a view of all keys
print(person.keys())
# Access a view of all values
print(person.values())

{'name': 'John Smith', 'age': 25, 'designation': 'Intern'}
{'name': 'John Smith', 'age': 25, 'designation': 'Intern', 'hobbies': 'Climbing'}
{'name': 'John Smith', 'age': 25, 'designation': 'Intern'}
None
dict_keys(['name', 'age', 'designation'])
dict_values(['John Smith', 25, 'Intern'])
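
Keys and values are often needed together. A brief sketch using `items()` and the `in` operator, re-creating the `person` dictionary from the cell above so that the block is self-contained:

```python
person = {"name": "John Smith", "age": 25, "designation": "Intern"}

# items() yields (key, value) pairs in insertion order
lines = [f"{key}: {value}" for key, value in person.items()]
print(lines)

# `in` tests for the presence of a key without retrieving its value
has_age = "age" in person
has_specialty = "specialty" in person
print(has_age, has_specialty)
```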


## Sets

* Sets are like mathematical sets: unordered collections of unique elements.

* Can be used to perform set operations like `union`, `intersection`, `difference`, etc.

In [None]:
numbers = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
print(numbers)

# Adding a number
numbers.add(11)
print(numbers)

# Adding a duplicate
numbers.add(11)

# Removing a number
numbers.remove(11)
print(numbers)

A = {1, 2, 3, 4, 5}
B = {4, 5, 6, 7, 8}

# Union
print(A.union(B))
# Intersection
print(A.intersection(B))
# Difference - Present in A, absent in B
print(A.difference(B))

{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11}
{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
{1, 2, 3, 4, 5, 6, 7, 8}
{4, 5}
{1, 2, 3}
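
The same set operations are also available as operators. A short sketch reusing the sets `A` and `B` from the cell above:

```python
A = {1, 2, 3, 4, 5}
B = {4, 5, 6, 7, 8}

union = A | B                  # same as A.union(B)
intersection = A & B           # same as A.intersection(B)
difference = A - B             # same as A.difference(B)
symmetric = A ^ B              # elements in exactly one of the two sets
print(union, intersection, difference, symmetric)

# Unlike remove(), discard() does not raise if the element is missing
A.discard(100)
```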


## Comprehension

* A comprehension builds a new container from the elements of an existing iterable, optionally transforming or filtering them.

In [3]:
nums = [n for n in range(1, 11, 1)]
print(nums)

twice_nums = [2 * n for n in nums]
print(twice_nums)

square_nums = [n*n for n in nums]
print(square_nums)

# Incorporating conditionals with comprehension

even_nums = [n for n in nums if n % 2 == 0]
print(even_nums)

# Using comprehension with dictionaries

super_name = ["Batman", "Superman", "Aquaman", "Flash", "Arrow"]
real_name = ["Bruce Wayne", "Clark Kent", "Arthur Curry", "Barry Allen", "Oliver Queen"]

# key: value inside the braces builds a dict; (key, value) would build a set of tuples instead
superhero_lookup = {s_name: r_name for s_name, r_name in zip(super_name, real_name) if r_name != "Arthur Curry"}
print(superhero_lookup)

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
[2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
[2, 4, 6, 8, 10]
{'Batman': 'Bruce Wayne', 'Superman': 'Clark Kent', 'Flash': 'Barry Allen', 'Arrow': 'Oliver Queen'}


## Sorting

* Container elements can be sorted using the `list.sort()` method as well as the built-in `sorted()` function.

* `sort()` sorts the elements of a list in-place and returns `None`. It ONLY works on lists.

* `sorted()` returns a new sorted list and leaves the original untouched. It accepts any iterable.

* When used with a dictionary, `sorted()` sorts the keys.

* We can also sort by custom criteria by passing a function as the `key` parameter.

In [22]:
nums = [9, -6, 8, 2, 5, -17, 55]
sorted_nums = sorted(nums)
print("nums are =", nums)
print("sorted nums are =", sorted_nums)

# Custom sorting
# sorting by absolute value
sorted_nums = sorted(nums, key=abs)
print("sorted by absolute value =", sorted_nums)

# In case of user-defined classes

class Employee:
    def __init__(self, name, age, salary):
        self.name = name
        self.age = age
        self.salary = salary
    def __repr__(self):
#         return '({}, {}, ${})'.format(self.name, self.age, self.salary)
        return (f"({self.name}, {self.age}, ${self.salary:.2f})")

e1 = Employee("Jane", 35, 50000)
e2 = Employee("John", 23, 60000)
e3 = Employee("Jen", 44, 70000)

# define a function to use as key for sorting.
def e_sort(emp):
    # just return the attribute that we need to act as key for sorting
    return emp.salary

employees = [e1, e2, e3]
sorted_employees = sorted(employees, key=e_sort)
print(employees)
print(sorted_employees)

nums are = [9, -6, 8, 2, 5, -17, 55]
sorted nums are = [-17, -6, 2, 5, 8, 9, 55]
sorted by absolute value = [2, 5, -6, 8, 9, -17, 55]
[(Jane, 35, $50000.00), (John, 23, $60000.00), (Jen, 44, $70000.00)]
[(Jane, 35, $50000.00), (John, 23, $60000.00), (Jen, 44, $70000.00)]
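
The employees above happen to be in salary order already, so the sorted list looks identical to the original. The sketch below uses illustrative data where the order actually changes, and also shows `reverse=True` and the in-place `sort()` method.

```python
nums = [9, -6, 8, 2, 5, -17, 55]
descending = sorted(nums, reverse=True)   # new list, largest first
print(descending)

return_value = nums.sort()                # sorts in place and returns None
print(nums, return_value)

# (name, age) tuples sorted by age using a lambda as the key
staff = [("Jane", 35), ("John", 23), ("Jen", 44)]
by_age = sorted(staff, key=lambda record: record[1])
print(by_age)
```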


# Input/Output

* We have already seen output in Python to standard output using the `print()` function.

* Python can read input from standard input using the `input()` function, which always returns a string. The data must be cast to the required type before performing arithmetic on it.

In [4]:
operand_1 = input("Enter the first operand: ")
operand_2 = input("Enter the second operand: ")
print("Concatenated version = ", operand_1 + operand_2) # will print a concatenated version of the two input strings

print("Numerical addition = ", int(operand_1) + int(operand_2)) # will print the sum of the two input integers

Enter the first operand: 5
Enter the second operand: 10
Concatenated version =  510
Numerical addition =  15


## Reading/Writing to Files

* Files can be read from or written to in Python.

In [1]:
# Using file objects

f = open('random.txt', 'r') # open a file called random.txt to read from. If access mode is not specified, default is read
f.close() # Must always close a file after it's used

# Using context manager

with open('random.txt', 'r') as f: # at the end of this block, Python automatically closes the file for you
    # Each read below continues from where the previous one stopped, so in practice pick ONE approach.
    file_contents = f.readline() # reads a line. continues to read subsequent lines upon subsequent calls.
    file_contents = f.readlines() # reads all remaining lines, places them in a list with each line as a different element
    file_contents = f.read() # reads everything left in the file (here an empty string, since readlines() consumed it)
    # read() may take an argument n = number of characters to read (bytes in binary mode).

    f.seek(0) # move back to the start of the file before reading it again
    for line in f: # iterates over the lines lazily, one at a time,
        # so the whole file is never held in memory at once.
        print(line, end='') # each line already ends with '\n'
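
The cell above only reads from a file. As a sketch of the writing side, the block below writes to and appends to a file inside a temporary directory so that no real files are touched; the file name `notes.txt` is hypothetical.

```python
import os
import tempfile

# A temporary directory keeps the sketch from touching real files
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "notes.txt")   # hypothetical file name

    with open(path, "w") as f:      # 'w' creates the file or truncates an existing one
        f.write("first line\n")

    with open(path, "a") as f:      # 'a' appends to the end
        f.write("second line\n")

    with open(path, "r") as f:
        lines = [line.rstrip("\n") for line in f]

print(lines)
```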

## Exception Handling

* Used to handle errors instead of the program crashing.


In [10]:
try:
    f = open('random.txt', 'r')
    var = bad_var
except NameError:
    # Caught because bad_var is undefined, so the message below is misleading
    print('This file does not exist.')

This file does not exist.


An `except` block should name the specific exception that you expect the code in the `try` block to raise. For example,

In [12]:
try:
    f = open('rando.txt', 'r')
    var = bad_var
except FileNotFoundError:
    print('This file does not exist.')
except NameError as e:  # Can also print the exception itself instead of a custom message.
    print(e)
except Exception:
    print('Sorry. Something went wrong.')

This file does not exist.


Always list more specific exceptions before generic ones: the first matching `except` block wins.

We can also use the `else` block to execute code in case no exception is raised.

In [13]:
try:
    f = open('random.txt', 'r')
    # var = bad_var
except FileNotFoundError:
    print('This file does not exist.')
except NameError as e:  # Can also print the exception itself instead of a custom message.
    print(e)
except Exception:
    print('Sorry. Something went wrong.')
else:
    print(f.read())
    f.close()

Hello World



Use the `finally` block at the end to run code that should be executed no matter what happens. 

In [15]:
try:
    f = open('random.txt', 'r')
    var = bad_var
except FileNotFoundError:
    print('This file does not exist.')
except NameError as e:  # Can also print the exception itself instead of a custom message.
    print(e)
except Exception:
    print('Sorry. Something went wrong.')
else:
    print(f.read())
    f.close()
finally:
    print("In finally now.")

name 'bad_var' is not defined
In finally now.


It is also possible to raise exceptions ourselves using the `raise` statement.

In [16]:
try:
    file_name = 'currupt_file.txt'
    f = open(file_name, 'r')
    # var = bad_var
    
    if f.name == 'currupt_file.txt':
        raise Exception
    
except FileNotFoundError:
    print('This file does not exist.')
except NameError as e:  # Can also print the exception itself instead of a custom message.
    print(e)
except Exception as e:
    print('Sorry. Something went wrong.')
else:
    print(f.read())
    f.close()
finally:
    print("In finally now!")

Sorry. Something went wrong.
In finally now!
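
The cell above raises the generic `Exception`. A custom exception type is simply a class that inherits from `Exception`. The sketch below assumes a hypothetical `CorruptFileError` class and `load` function, neither of which is part of the original notes.

```python
class CorruptFileError(Exception):
    """Hypothetical exception type used only for this illustration."""

def load(file_name):
    # Pretend that any file whose name starts with "corrupt" cannot be read
    if file_name.startswith("corrupt"):
        raise CorruptFileError(f"{file_name} looks corrupt")
    return f"contents of {file_name}"

try:
    load("corrupt_file.txt")
except CorruptFileError as e:
    message = str(e)
else:
    message = "loaded"
print(message)
```

A dedicated exception class lets callers catch this one failure mode without also swallowing unrelated errors.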


# Functions

Functions are modular pieces of code that perform a specific task. They promote code reuse because a function, once defined, can be called multiple times.

* Functions can have parameters - names for the information that is "passed to them" when they are called.

* Functions may return one or more values when they finish executing, often the result of some computation. A function without a `return` statement returns `None`.

## Positional Arguments

Information passed to a function where the order of the arguments must match the order of the parameters.

In [None]:
# Write a function to compute the area of a triangle, given the lengths of its three sides

def triangle_area(a, b, c) -> float: # Parameters are written in parentheses.
                                    # The (expected) return type is float, denoted by the arrow and float
  if a + b < c \
    or a + c < b \
    or b + c < a:
    print("Invalid triangle")
    return

  s = (a + b + c) / 2
  area = (s * (s - a) * (s - b) * (s - c)) ** 0.5
  return area # return statement is used to transfer the control over to the calling function/statement.

# The following is a function with a default argument. In case no argument is
# passed when calling this function, the default specified value is used.
# Default/optional arguments always appear at the end of the list.
def print_message(message="Random message"):
  print(message)

## Arbitrary arguments `*args`

If you don't know how many positional arguments a function will receive, define it with an arbitrary-arguments parameter (`*args`). The arguments are then available inside the function as a tuple.

```python
def my_function(*shop_list):
  print(f"I need to buy {shop_list}")

my_function("apples", "doctors", "stethoscopes")
```

In [None]:
def my_function(*shop_list):
  print(f"I need to buy {shop_list}")

my_function("apples", "doctors", "stethoscopes")

I need to buy ('apples', 'doctors', 'stethoscopes')


## Keyword Arguments

Arguments can also be passed to functions using the syntax `key=value`. This way, the order in which the arguments are passed does not matter.


In [None]:
def triangle_area(a, b, c) -> float: # Parameters are written in parentheses.
                                    # The (expected) return type is float, denoted by the arrow and float
  if a + b < c \
    or a + c < b \
    or b + c < a:
    print("Invalid triangle")
    return

  s = (a + b + c) / 2
  area = (s * (s - a) * (s - b) * (s - c)) ** 0.5
  return area # return statement is used to transfer the control over to the calling function/statement.

# The following function calls all produce the same result:
print(triangle_area(3, 4, 5))
print(triangle_area(a=3, b=4, c=5))
print(triangle_area(c=5, a=3, b=4))

6.0
6.0
6.0


## Arbitrary Keyword Arguments

If the number of keyword arguments is unknown, use `**kwargs`. Arguments are then available for use inside the function in the form of a dictionary.

In [None]:
def go_shopping(**kwargs):
  print(f"I need to buy {kwargs['oranges']} oranges, {kwargs['apples']} apples and {kwargs['stethoscopes']} stethoscopes")

go_shopping(oranges=5, apples=10, stethoscopes=3)

I need to buy 5 oranges, 10 apples and 3 stethoscopes


## Default Parameter Value

We can add default values to be used for parameters, in case corresponding arguments are not passed to the function. Place default parameters after the required parameters.



In [None]:
# An example of default parameter values in functions

def greet(name, message="Hello"):
  """
  This function greets a person with a given message.
  If no message is provided, it defaults to "Hello".
  """
  print(f"{message}, {name}!")

# Call the function with a custom message
greet("Alice", "Good morning")

# Call the function with the default message
greet("Bob")


Good morning, Alice!
Hello, Bob!


## Positional-only arguments

A function can be specified to only accept positional arguments by placing `/` after them in the parameter list (Python 3.8+). Such arguments cannot be passed using the `key=value` syntax.



In [None]:
def calculate_sum(a, b, /):
  """
  This function calculates the sum of two numbers.
  'a' and 'b' are positional-only arguments.
  """
  return a + b

# Call the function with positional arguments
result = calculate_sum(5, 3)
print(result)  # Output: 8

# This will raise an error because we're trying to use keyword arguments
# result = calculate_sum(a=5, b=3)


8


## Keyword-only arguments

We can also make parameters keyword-only by placing them after a bare `*` in the parameter list.

In [None]:
def my_function(a, b, /, c, *, d):
  """
  This function demonstrates position-only and keyword-only arguments.

  Args:
    a: Positional-only argument.
    b: Positional-only argument.
    c: Can be passed positionally or as a keyword argument.
    d: Keyword-only argument.
  """
  print(f"a: {a}, b: {b}, c: {c}, d: {d}")

# Call the function with both positional and keyword arguments
my_function(1, 2, 3, d=4)
my_function(1, 2, c=3, d=4)


a: 1, b: 2, c: 3, d: 4
a: 1, b: 2, c: 3, d: 4


# Python Lambda

* Lambdas are small anonymous functions written in one line using the `lambda` keyword. They can accept any number of arguments, but consist of a single expression, whose value is returned.

* Think of a lambda as a shortcut for a function that is needed only briefly, e.g. as the `key` argument to `sorted()`.

Usage:

`lambda parameters : expression`

In [2]:
def double(x):
    return x * 2

print(double(5))

10


Alternatively, we can do the following -

In [4]:
double = lambda x : x * 2
print(double(5))

10


In [5]:
exp = lambda a, b : a ** b
print(exp(4, 5))

1024
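
Lambdas are most useful when passed directly to another function. A short sketch with illustrative data, using the built-in `sorted()`, `map()` and `filter()` functions:

```python
words = ["banana", "fig", "cherry"]

by_length = sorted(words, key=lambda w: len(w))          # sort by word length
squares = list(map(lambda n: n * n, [1, 2, 3]))          # apply to every element
evens = list(filter(lambda n: n % 2 == 0, range(10)))    # keep matching elements
print(by_length, squares, evens)
```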


# Advanced Python

## Iterators

Iterators are objects that produce the elements of a container (or any other sequence of values) one at a time. An **iterator object** must implement the iterator protocol, which consists of two special methods - `__iter__()` and `__next__()`.
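
The iterator protocol can be sketched with a small class. `Countdown` below is a hypothetical example, not part of the original notes: `__iter__()` returns the iterator itself and `__next__()` returns the next value or raises `StopIteration` when there is nothing left.

```python
class Countdown:
    """Iterates from `start` down to 1 (illustrative class)."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self              # the object is its own iterator

    def __next__(self):
        if self.current <= 0:
            raise StopIteration  # signals the end of the iteration
        value = self.current
        self.current -= 1
        return value

values = list(Countdown(3))
print(values)
```

A `for` loop calls these two methods behind the scenes, which is why any object implementing them can be looped over.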

## Generators

Generators are functions that return iterators that can be used to produce a sequence of values when iterated over. Generators help in saving memory because the sequence of values does not need to be stored in memory.


In [5]:
def square_numbers(n):
    result = []
    for i in range(1, n+1):
        result.append(i*i)
    return result

print(square_numbers(5))

# Equivalently

def square_numbers(n):
    for i in range(1, n+1):
        yield(i*i)

result = square_numbers(5)
for n in result:
    print(n)

[1, 4, 9, 16, 25]
1
4
9
16
25


With each iteration over `result`, the next value in the sequence is computed on demand and printed. This is very memory efficient: the entire sequence is never stored in memory, only the state needed to produce the next value.

We can also use comprehension to make a generator object.

In [7]:
upper_limit = 6
result = (n*n for n in range(1, upper_limit)) # Use parentheses using comprehension to make a generator
print(result)

# Then we can iterate over the generator object
for r in result:
    print(r)

<generator object <genexpr> at 0x000001EBBC8F5700>
1
4
9
16
25


In [3]:
import random
import time
import memory_profiler as mem_profile

names = ['Jim', 'Pam', 'Michael', 'Dwight', 'Angela', 'Kevin', 'Oscar', 'Meredith']
department = ['Quality Assurance', 'Sales', 'Reception', 'Customer Service', 'Packaging and Shipping', 'Accounting', 'Supplier Relations']

def people_list(num_people):
    result = []
    for n in range(num_people):
        person = {
            'id' : n,
            'name' : random.choice(names),
            'department': random.choice(department)
        }
        result.append(person)
    return result

def people_generator(num_people):
    for n in range(num_people):
        person = {
            'id' : n,
            'name' : random.choice(names),
            'department': random.choice(department)
        }
        yield(person)

print('Memory before listing: {}Mb'.format(mem_profile.memory_usage()))
print('Trying to generate list...')
start = time.time()
office_list = people_list(1000000)
end = time.time()
print('Memory after listing: {}Mb'.format(mem_profile.memory_usage()))
print('Making list took {} seconds'.format(end - start))

print('Memory before generating: {}Mb'.format(mem_profile.memory_usage()))
start = time.time()
office_gen = people_generator(10000000)
end = time.time()
print('Memory after generating: {}Mb'.format(mem_profile.memory_usage()))
print('Generating took {} seconds'.format(end - start))

Memory before listing: [305.40234375]Mb
Trying to generate list...
Memory after listing: [305.28515625]Mb
Making list took 1.0092153549194336 seconds
Memory before generating: [305.2890625]Mb
Memory after generating: [305.2890625]Mb
Generating took 0.0 seconds
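
Creating a generator is nearly instantaneous because no values are produced until it is iterated. As a rough, standard-library-only sketch of the same idea, `sys.getsizeof` can compare the container sizes of a list and a generator (it measures only the container object itself, not the elements it references):

```python
import sys

squares_list = [n * n for n in range(100_000)]   # all values computed up front
squares_gen = (n * n for n in range(100_000))    # nothing computed yet

list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)
print(list_size > gen_size)

# Values are produced one at a time, only when requested
first = next(squares_gen)
second = next(squares_gen)
print(first, second)
```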


# Classes & Objects

* Classes are blueprints for user-defined data types.

* Objects are instances of classes

In [23]:
class Employee:
    
    def __init__(self, first, last, pay):
        self.first = first
        self.last = last
        self.pay = pay
        self.email = f"{first}.{last}@company.com"

e1 = Employee("John", "Smith", 50000)
print(e1)
print(e1.email)

<__main__.Employee object at 0x10816b5e0>
John.Smith@company.com


Instance variables (attributes) define an object's properties (what it is), whereas methods define an object's behaviour (what it can do).

In [28]:
class Employee:
    
    def __init__(self, first, last, pay):
        self.first = first
        self.last = last
        self.pay = pay
        self.email = f"{first.lower()}.{last.lower()}@company.com"
    
    def __repr__(self):
        return f"{self.first} {self.last},\n{self.email},\n{self.pay}"
    
    def give_raise(self, bonus):
        self.pay += self.pay * (bonus / 100)

e1 = Employee("John", "Smith", 50000)
print(e1)

John Smith,
john.smith@company.com,
50000


Methods can be called in two ways. The more prevalent way uses the following syntax -

`<obj_name>.<method_name>()`

The second way does exactly the same thing but provides more insight into the `self` parameter in methods: the instance is passed explicitly as the first argument -

`<class_name>.<method_name>(<object_name>)`

In [32]:
class Employee:
    
    def __init__(self, first, last, pay):
        self.first = first
        self.last = last
        self.pay = pay
        self.email = f"{first.lower()}.{last.lower()}@company.com"
    
    def __repr__(self):
        return f"{self.first} {self.last},\n{self.email},\n{self.pay:.2f}"
    
    def give_raise(self, bonus):
        self.pay += self.pay * (bonus / 100)

e1 = Employee("John", "Smith", 50000)
print(e1)
# e1.give_raise(15)
# print(e1)

# Alternatively
Employee.give_raise(e1, 15)
print(e1)

John Smith,
john.smith@company.com,
50000.00
John Smith,
john.smith@company.com,
57500.00


## Class Variables

Variables defined inside the class constructor (attributes/instance variables) belong to the instance that they're associated with. Class variables are variables that belong to the class, and as such, can be accessed by all instances. Inside a method, however, they must be accessed through the class name or through an instance (`self`), not as bare names.

In [45]:
class Employee:
    
    raise_percent = 1.15
    
    def __init__(self, first, last, pay):
        self.first = first
        self.last = last
        self.pay = pay
        self.email = f"{first.lower()}.{last.lower()}@company.com"
    
    def __repr__(self):
        return f"{self.first} {self.last},\n{self.email},\n$ {self.pay:,.2f}"
    
    def give_raise(self):
        self.pay *= Employee.raise_percent
#         Also correct
#         self.pay *= self.raise_percent

e1 = Employee("John", "Smith", 50000)
print(e1)
e1.give_raise()
print(e1)


John Smith,
john.smith@company.com,
$ 50,000.00
John Smith,
john.smith@company.com,
$ 57,500.00


However, there are certain nuances to accessing a class variable through the class name and through an instance of the class. 

* When a class variable is accessed through an instance, Python first looks for an attribute with that name in the namespace of the instance (use the instance's `__dict__` attribute to inspect that namespace). If it can't find one there, it checks the class's namespace.

In [54]:
class Employee:
    
    raise_percent = 1.15
    
    def __init__(self, first, last, pay):
        self.first = first
        self.last = last
        self.pay = pay
        self.email = f"{first.lower()}.{last.lower()}@company.com"
    
    def __repr__(self):
        return f"{self.first} {self.last},\n{self.email},\n$ {self.pay:,.2f}"
    
    def give_raise(self):
        self.pay *= Employee.raise_percent
#         Also correct
#         self.pay *= self.raise_percent

e1 = Employee("John", "Smith", 50000)
e2 = Employee("Jane", "Smith", 100000)

# Trying to modify the raise_percent using class name

Employee.raise_percent = 1.04
print('Raise for Employee is {0.raise_percent:.2f}'.format(Employee))
print('Raise for {0.first} is {0.raise_percent:.2f}'.format(e1))
print('Raise for {0.first} is {0.raise_percent:.2f}'.format(e2))

# Trying to modify the raise_percent using instance name
e1.raise_percent = 1.10
print('Raise for Employee is {0.raise_percent:.2f}'.format(Employee))
print('Raise for {0.first} is {0.raise_percent:.2f}'.format(e1))
print('Raise for {0.first} is {0.raise_percent:.2f}'.format(e2))

Raise for Employee is 1.04
Raise for John is 1.04
Raise for Jane is 1.04
Raise for Employee is 1.04
Raise for John is 1.10
Raise for Jane is 1.04


When we changed the raise percent through employee 1, Python created a new instance variable named `raise_percent` on `e1` and assigned the value to it. From then on, `e1.raise_percent` refers to that instance variable and no longer to the class variable. Accessed through the class name or through any other instance, however, it is still the class variable.
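The lookup order can be checked directly with `__dict__`. The following is a minimal sketch that uses a trimmed-down `Employee` class (only the pieces needed here) and the same values as the cell above.

```python
class Employee:
    raise_percent = 1.04  # class variable, stored in the class namespace

    def __init__(self, first):
        self.first = first

e1 = Employee("John")
e2 = Employee("Jane")

# Before the assignment, the instance namespace has no raise_percent,
# so the lookup falls through to the class.
print("raise_percent" in e1.__dict__)   # False
print(e1.raise_percent)                 # 1.04

# Assigning through the instance creates an instance attribute
# that shadows the class variable for e1 only.
e1.raise_percent = 1.10
print(e1.__dict__)                      # {'first': 'John', 'raise_percent': 1.1}
print(e2.__dict__)                      # {'first': 'Jane'}
print(Employee.raise_percent)           # 1.04

# Deleting the instance attribute exposes the class variable again.
del e1.raise_percent
print(e1.raise_percent)                 # 1.04
```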

A basic rule of thumb is -

* If an attribute might change with different instances but stays mostly the same for most instances, access it through the instance `self`.

* If an attribute absolutely has to stay the same for ALL instances of the class, it must only be accessed through the class name.
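
A small sketch of this rule of thumb, under the assumption that we also want to count employees. The `num_employees` counter is a hypothetical addition for illustration and does not appear in the cells above.

```python
class Employee:
    raise_percent = 1.04   # may be overridden for individual employees
    num_employees = 0      # must be identical for every instance

    def __init__(self, first, pay):
        self.first = first
        self.pay = pay
        # Through the class name: every instance sees the same counter.
        Employee.num_employees += 1

    def give_raise(self):
        # Through self: picks up a per-instance override if one exists.
        self.pay *= self.raise_percent

e1 = Employee("John", 50000)
e2 = Employee("Jane", 100000)
e1.raise_percent = 1.10   # override for John only

e1.give_raise()
e2.give_raise()
print(round(e1.pay, 2))        # 55000.0
print(round(e2.pay, 2))        # 104000.0
print(Employee.num_employees)  # 2
```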

## Class Methods

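Class methods receive the class itself (conventionally named `cls`) as their first argument instead of an instance, and are declared with the `@classmethod` decorator. They are typically used to work with class variables or to provide alternative constructors.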
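
A minimal sketch of class methods on the running `Employee` example. The method names `set_raise_percent` and `from_string`, and the `"first-last-pay"` string format, are assumptions made for illustration only.

```python
class Employee:
    raise_percent = 1.04

    def __init__(self, first, last, pay):
        self.first = first
        self.last = last
        self.pay = pay

    @classmethod
    def set_raise_percent(cls, amount):
        # cls is the class itself, so this updates the class variable.
        cls.raise_percent = amount

    @classmethod
    def from_string(cls, emp_str):
        # Alternative constructor: build an instance from "first-last-pay".
        first, last, pay = emp_str.split("-")
        return cls(first, last, int(pay))

Employee.set_raise_percent(1.05)
e1 = Employee.from_string("John-Smith-50000")
print(Employee.raise_percent)       # 1.05
print(e1.first, e1.last, e1.pay)    # John Smith 50000
print(e1.raise_percent)             # 1.05
```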

## Inheritance

When we have several different classes which have mostly the same structure, i.e. similar attributes and behaviour, we can make a central class (the superclass) and make most of our other similar classes inherit attributes and behaviour from the superclass. This establishes a superclass - subclass hierarchy.

Usage:

`class <subclass_name> (<superclass_name>):`

In [62]:
class Pet:
    def __init__(self, name, age):
        self.name = name
        self.age = age
        
    def show(self):
        print("Hello, I am {0.name} and I am {0.age} years old.".format(self))

class Gato(Pet):
    def speak(self):
        print("{0.name} says meow.".format(self))

class Perro(Pet):
    def speak(self):
        print("{0.name} says bark.".format(self))

sprinkles = Gato("Sprinkles", 12)
scooby = Perro("Scooby Do", 4)

sprinkles.show()
scooby.show()
sprinkles.speak()
scooby.speak()

Hello, I am Sprinkles and I am 12 years old.
Hello, I am Scooby Do and I am 4 years old.
Sprinkles says meow.
Scooby Do says bark.


What happens if we want our subclasses to have some attributes unique to them, in addition to the attributes that they inherit from their superclass?

Let's assume that we now also want to include a colour for our cats, but not for the dogs. In this case, we need to write a separate constructor for the class `Gato`.

In [65]:
class Pet:
    def __init__(self, name, age):
        self.name = name
        self.age = age
        
    def show(self):
        print("Hello, I am {0.name} and I am {0.age} years old.".format(self))

class Gato(Pet):
    def __init__(self, name, age, colour): # Include the inherited properties, then the unique properties
        super().__init__(name, age) # Invoke the super constructor to set inherited properties
        self.colour = colour # set the attributes unique to the subclass
    
    def speak(self):
        print("{0.name} says meow.".format(self))

class Perro(Pet):
    def speak(self):
        print("{0.name} says bark.".format(self))

sprinkles = Gato("Sprinkles", 12, "Ginger")
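
As a quick check of the constructor chain above, the sketch below repeats the classes in trimmed form (the `show` and `speak` methods are left out) and inspects the resulting objects.

```python
class Pet:
    def __init__(self, name, age):
        self.name = name
        self.age = age

class Gato(Pet):
    def __init__(self, name, age, colour):
        super().__init__(name, age)   # sets name and age
        self.colour = colour          # unique to Gato

class Perro(Pet):
    pass  # no constructor of its own, so Pet.__init__ is used

sprinkles = Gato("Sprinkles", 12, "Ginger")
scooby = Perro("Scooby Do", 4)

print(sprinkles.name, sprinkles.age, sprinkles.colour)  # Sprinkles 12 Ginger
print(hasattr(scooby, "colour"))                        # False
print(isinstance(sprinkles, Pet))                       # True
# The method resolution order shows where attributes are looked up.
print([cls.__name__ for cls in Gato.__mro__])           # ['Gato', 'Pet', 'object']
```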

## Special/Dunder Methods

Special methods can be used to implement operator overloading. Their names always start and end with a double underscore `__` (hence "dunder"). For example, `__init__`, `__repr__` and `__str__`.

`repr()` is supposed to be for other developers - a readable representation that can help with debugging and logging.

`str()` is supposed to be more for the user.
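
The difference can be sketched as follows, assuming an `Employee` class that defines both methods. The exact strings returned here are illustrative choices, not a fixed convention.

```python
class Employee:
    def __init__(self, first, last, pay):
        self.first = first
        self.last = last
        self.pay = pay

    def __repr__(self):
        # Developer-facing: unambiguous, ideally looks like the constructor call.
        return f"Employee({self.first!r}, {self.last!r}, {self.pay!r})"

    def __str__(self):
        # User-facing: short and readable.
        return f"{self.first} {self.last}"

e1 = Employee("John", "Smith", 50000)
print(repr(e1))   # Employee('John', 'Smith', 50000)
print(str(e1))    # John Smith
print(e1)         # John Smith  (print() uses __str__ when it is defined)
print([e1])       # [Employee('John', 'Smith', 50000)]  (containers use __repr__)
```

If only `__repr__` is defined, `str()` and `print()` fall back to it, which is why the earlier cells could print an `Employee` with just `__repr__`.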

An example of how we can implement operator overloading using dunder methods-

In [1]:
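class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __add__(self, other):  # Called for point_a + point_b
        return Point(self.x + other.x, self.y + other.y)

    def __repr__(self):
        return f"Point({self.x}, {self.y})"

print(Point(1, 2) + Point(3, 4))

Point(4, 6)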

## Decorators

### Aside: Closures

A closure is an inner function that remembers the variables of the function it was defined in, even after that outer function has returned. This comes in handy when state has to travel with a function, for example when a module is imported elsewhere in a Python project. In the code below, let's assume that we have a function that adds numbers to an internally-maintained list.

In [None]:
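def add_numbers(n):
    nums = []  # A fresh, empty list is created on every call
    nums.append(n)
    return nums  # Always a one-element list, e.g. add_numbers(3) returns [3]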


Each time `add_numbers(n)` is called, a new list called `nums` is created and the number is added to it. There is no way to cumulatively add numbers to this list inside. We can use a global variable and keep appending data to it, but that means that if we were to import this module into another file, we would have to somehow also include the global list. 

A solution to this is a closure - essentially a nested function. The data (in this case the list) is declared in the outer function and all the functionality is provided in the inner, enclosed function. In Python, the inner function keeps track of and has access to the variables of the outer function that it uses, even after the outer function has finished executing!

In [10]:
def add_nums_wrapper():
    nums = []
    def add_nums(n):
        nums.append(n)
        print(nums)
    return add_nums

add_nums = add_nums_wrapper()
for n in range(10):
    add_nums(n)

[0]
[0, 1]
[0, 1, 2]
[0, 1, 2, 3]
[0, 1, 2, 3, 4]
[0, 1, 2, 3, 4, 5]
[0, 1, 2, 3, 4, 5, 6]
[0, 1, 2, 3, 4, 5, 6, 7]
[0, 1, 2, 3, 4, 5, 6, 7, 8]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]


A few things to note - the outer function must return a reference to the inner function. When actually calling the function,

1. We first need to call the outer function to get the returned reference to the inner function.
2. This reference can then be used to achieve our original functionality of creating a cumulative list.

We say that the inner function **closes over** the variables of the outer function (here, `nums`).

**A "Recipe" for a Closure**

We require three things - 
1. A containing outer function - something that can provide the basis for creating scope to keep track of the state variable.
2. Some state variables - something that we would like to build upon over further function calls.
3. Inner function - to implement the required functionality.
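
The three ingredients can be sketched with a counter. The names `make_counter` and `increment` are hypothetical, chosen for this illustration. Because the state variable here is rebound rather than mutated in place, the inner function needs the `nonlocal` keyword.

```python
def make_counter():          # 1. containing outer function
    count = 0                # 2. state variable
    def increment():         # 3. inner function
        nonlocal count       # needed because count is rebound, not just mutated
        count += 1
        return count
    return increment

counter_a = make_counter()
counter_b = make_counter()   # a separate call creates separate state

print(counter_a())   # 1
print(counter_a())   # 2
print(counter_b())   # 1
# The captured variable lives in the function's closure cells.
print(counter_a.__closure__[0].cell_contents)   # 2
```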

We can also convert this into a class.

In [15]:
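# The containing outer function becomes a class
class AddNumber:
    def __init__(self):
        # The state variable becomes an instance variable, so each object
        # gets its own list (a class-level list would be shared by all instances)
        self.nums = []

    def add_nums(self, n):
        self.nums.append(n)
        print(self.nums)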

    def pop(self):
        if self.nums:
            return self.nums.pop()
        return None
    
list1 = AddNumber()
print(list1.pop())
for n in range(10):
    list1.add_nums(n)
print(list1.pop())
print(list1.nums)

None
[0]
[0, 1]
[0, 1, 2]
[0, 1, 2, 3]
[0, 1, 2, 3, 4]
[0, 1, 2, 3, 4, 5]
[0, 1, 2, 3, 4, 5, 6]
[0, 1, 2, 3, 4, 5, 6, 7]
[0, 1, 2, 3, 4, 5, 6, 7, 8]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
9
[0, 1, 2, 3, 4, 5, 6, 7, 8]
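
One point to watch when moving closure state into a class: a list defined at class level is shared by every instance, whereas each call to the outer function of a closure creates a fresh list. A minimal sketch with two hypothetical classes, `SharedNums` and `OwnNums`:

```python
class SharedNums:
    nums = []                 # class variable: one list for the whole class

    def add(self, n):
        self.nums.append(n)   # mutates the shared list

class OwnNums:
    def __init__(self):
        self.nums = []        # instance variable: one list per object

    def add(self, n):
        self.nums.append(n)

a, b = SharedNums(), SharedNums()
a.add(1)
print(b.nums)   # [1]

c, d = OwnNums(), OwnNums()
c.add(1)
print(d.nums)   # []
```

Keeping the list on the instance matches the closure's behaviour most closely, since every new object then starts with its own empty list.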


# Virtual Environments

To manage different dependencies and to keep separate projects 'contained' in their own environments, we can use virtual environments.

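Let's say that an application runs using Django 5.0. If the shared, system-wide Django installation is later updated to, say, Django 5.1, the application might break!

To prevent this, give each project its own virtual environment: an isolated directory with its own installed packages, so that upgrading a dependency for one project cannot break another.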
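
Environments are usually created from the command line with `python -m venv <directory>`. The same can be sketched from Python with the standard-library `venv` module. The environment name `demo_env` and the temporary directory are assumptions made so that the example cleans up after itself.

```python
import tempfile
import venv
from pathlib import Path

# Create a throwaway environment inside a temporary directory.
# with_pip=False keeps this sketch fast; a real project would normally
# use with_pip=True so that packages can be installed into the environment.
with tempfile.TemporaryDirectory() as tmp:
    env_dir = Path(tmp) / "demo_env"   # hypothetical environment name
    venv.create(env_dir, with_pip=False)

    # Every environment gets its own pyvenv.cfg configuration file.
    cfg_exists = (env_dir / "pyvenv.cfg").exists()
    print(cfg_exists)
```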

# Working with CSV Data

Comma-separated values (CSV) files store tabular data as plain text, one row per line, and can be parsed in Python in several ways: by hand, with the built-in `csv` module, or with the `pandas` library. `pandas` is usually the most convenient, since it also handles many other kinds of data parsing and manipulation.

In [2]:
file_name = "./weather_data.csv"
data = []
with open(file_name, 'r') as f:
    data = f.readlines()
table = [row.rstrip('\n') for row in data]
print(table)

['day,temp,condition', 'Monday,12,Sunny', 'Tuesday,14,Rain', 'Wednesday,15,Rain', 'Thursday,14,Cloudy', 'Friday,21,Sunny', 'Saturday,22,Sunny', 'Sunday,24,Sunny']


In [5]:
# Using csv
import csv
file_name = "./weather_data.csv"

with open(file_name, 'r') as f:
    table = csv.reader(f)

    for row in table:
        print(row)

['day', 'temp', 'condition']
['Monday', '12', 'Sunny']
['Tuesday', '14', 'Rain']
['Wednesday', '15', 'Rain']
['Thursday', '14', 'Cloudy']
['Friday', '21', 'Sunny']
['Saturday', '22', 'Sunny']
['Sunday', '24', 'Sunny']
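
The same module also provides `csv.DictReader`, which maps each row to the column names in the header. The sketch below inlines the rows printed above through `io.StringIO` so that it runs without the file. With the real file, the open file object would be passed instead.

```python
import csv
import io

# The same rows as weather_data.csv, inlined so the example is self-contained.
csv_text = """day,temp,condition
Monday,12,Sunny
Tuesday,14,Rain
Wednesday,15,Rain
Thursday,14,Cloudy
Friday,21,Sunny
Saturday,22,Sunny
Sunday,24,Sunny
"""

reader = csv.DictReader(io.StringIO(csv_text))
rows = list(reader)

# The csv module yields every field as a string, so convert before doing maths.
temps = [int(row["temp"]) for row in rows]

print(rows[0])                  # {'day': 'Monday', 'temp': '12', 'condition': 'Sunny'}
print(temps)                    # [12, 14, 15, 14, 21, 22, 24]
print(sum(temps) / len(temps))  # 17.428571428571427
```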


In [3]:
# using pandas

import pandas
file_name = "./weather_data.csv"

data = pandas.read_csv(file_name)
print(data["temp"])

0    12
1    14
2    15
3    14
4    21
5    22
6    24
Name: temp, dtype: int64


In [10]:
import pandas
file_name = "./weather_data.csv"

table = pandas.read_csv(file_name)

table_dict = table.to_dict()
# print(table_dict)
print(table_dict)

table_list = table.temp.to_list()
print(table_list)

print(sum(table_list)/len(table_list))

{'day': {0: 'Monday', 1: 'Tuesday', 2: 'Wednesday', 3: 'Thursday', 4: 'Friday', 5: 'Saturday', 6: 'Sunday'}, 'temp': {0: 12, 1: 14, 2: 15, 3: 14, 4: 21, 5: 22, 6: 24}, 'condition': {0: 'Sunny', 1: 'Rain', 2: 'Rain', 3: 'Cloudy', 4: 'Sunny', 5: 'Sunny', 6: 'Sunny'}}
[12, 14, 15, 14, 21, 22, 24]
17.428571428571427


# Beautiful Soup - Web scraping

Web scraping is a way to extract information from web pages. The Beautiful Soup module helps with that.

A BeautifulSoup object contains the contents of an HTML document converted into a nested Python data structure.

In [16]:
from bs4 import BeautifulSoup

HTML_FILE = "./PyProjects/BeautySoup/index.html"

with open(HTML_FILE) as file:
    soup = BeautifulSoup(file, 'html.parser')
    a_list = soup.find_all(name="a")
    for a in a_list:
        print(a)

<a href="https://www.appbrewery.co/">The App Brewery</a>
<a href="https://angelabauer.github.io/cv/hobbies.html">My Hobbies</a>
<a href="https://angelabauer.github.io/cv/contact-me.html">Contact Me</a>


## Tag Objects

A `Tag` object corresponds to a tag in the HTML document.

It has the following attributes - 

* Name - accessed using the `name` property. It can also be modified, and the rest of the soup reflects the change.
* Attributes - accessed by treating the tag like a dictionary, e.g. `tag['href']`.

In [17]:
from bs4 import BeautifulSoup

HTML_FILE = "./PyProjects/BeautySoup/index.html"

with open(HTML_FILE) as file:
    soup = BeautifulSoup(file, 'html.parser')
    tag = soup.a # Only the first 'a' tag in the html doc is returned this way
    print(tag['href'])

https://www.appbrewery.co/
