In [None]:
# Preliminary: if running on Colab, get the data from GitHub
try:
  import google.colab
  print("Downloading data from GitHub...")
  !wget -nc -P'Data' https://github.com/miaambelez/UG_Python_Workshop/raw/main/Data/two_sums_20000.pkl
  !wget -nc -P'Data' https://github.com/miaambelez/UG_Python_Workshop/raw/main/Data/two_sums_200000.pkl
  print("...done!")
except ImportError:
  print("Running locally, data should be already on path!")

# Pythonic principles

### PEP 8: The Style Guide for Python Code
PEP 8 stands for Python Enhancement Proposal #8. It's essentially a set of rules and guidelines for formatting Python code. Adhering to PEP 8 helps make your code more readable and maintainable, not just for you but for others who may work on your code in the future. Some key aspects include naming conventions, indentation, line spacing, and avoiding overly complex expressions. Think of it as the etiquette for writing Python code, ensuring that it's clean, easy to read, and aesthetically pleasing.
You can find the full PEP 8 guide at: https://pep8.org/

As an example, the guidelines on naming conventions state:
- Function names should be lowercase, with words separated by underscores.
- Constants should be in all uppercase, with words separated by underscores.
- Variable names should be lowercase, with words separated by underscores for readability.
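
To make these three rules concrete, here is a minimal sketch. The names `MAX_RETRIES`, `count_words`, `word_list` and `word_count` are hypothetical and chosen only for illustration:

```python
# Hypothetical names, chosen only to illustrate the naming conventions.
MAX_RETRIES = 3  # constant: all uppercase, words separated by underscores


def count_words(text):  # function: lowercase, words separated by underscores
    word_list = text.split()  # variable: lowercase with underscores
    return len(word_list)


word_count = count_words("readability counts")
print(word_count)  # prints 2
```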


#### Exercise 1: Naming Conventions
PEP 8 advises using specific naming styles for different parts of your code like variables, functions, classes, and constants. This helps differentiate between the types of entities in your code at a glance.

**Task**: Correct the naming of the following entities according to PEP 8.



In [None]:
def calculateArea(radius):
    PI_CONSTANT = 3.14159
    area_of_circle = PI_CONSTANT * radius ** 2
    return area_of_circle

customerName = "John Doe"
listofemails = ['john@example.com', 'jane@example.com']

#### Exercise 2: Indentation and Line Length
PEP 8 recommends using 4 spaces per indentation level and keeping lines to 79 characters or less. This ensures that your code is visually aligned and can be easily read without scrolling horizontally.
- Use 4 spaces for each indentation level.
- Break down lines to ensure they don't exceed 79 characters.
- Separate logical sections of your code with blank lines to improve readability.
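
As a small illustration of these recommendations, the sketch below uses a hypothetical function `total_price` (not part of the exercise) to show 4-space indentation, blank lines between logical sections, and a long expression broken across lines inside parentheses:

```python
def total_price(unit_price, quantity, tax_rate, discount):
    """Hypothetical helper used only to illustrate layout."""
    subtotal = unit_price * quantity

    # Inside parentheses, an expression can continue on the next line,
    # which keeps every line well under 79 characters.
    total = (
        subtotal
        + subtotal * tax_rate
        - discount
    )
    return total


print(total_price(unit_price=10.0, quantity=3, tax_rate=0.25, discount=1.0))
```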

**Task**: Reformat the code below to meet PEP 8 indentation and line length recommendations.

In [None]:
def function(x, y): return x**2 + y**2, x + y
result = function(5, 7); print("The result is:", result)


### Data Structures in Python
Data structures are ways of organizing and storing data in a computer so that it can be accessed and modified efficiently. Python has several built-in data structures that are very useful and easy to use. Understanding these is crucial for solving various programming problems. The following are essential data structures, but please note that this is not an exhaustive list:

- Lists: Ordered and mutable collections of items. Lists can contain items of different types, including other lists.

- Tuples: Ordered and immutable collections of items. Tuples are used for data that should not change after its creation.

- Dictionaries: Collections of key-value pairs with unique keys (since Python 3.7 they preserve insertion order). They are fast because they use hashing, allowing you to quickly look up the value associated with a given key.

- Sets: Unordered collections of unique items. They are useful for membership testing, removing duplicates, and set operations like union, intersection, etc.
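
The four structures above can be sketched side by side as follows. All variable names and values here are hypothetical example data:

```python
# Hypothetical example data, used only for illustration.
fruits = ["apple", "banana"]        # list: ordered and mutable
fruits[1] = "blueberry"             # items can be replaced in place
first_fruit = fruits[0]             # access by position

point = (3, 4)                      # tuple: ordered and immutable
x, y = point                        # tuples can be unpacked

ages = {"Ada": 36, "Alan": 41}      # dictionary: key-value pairs
ada_age = ages["Ada"]               # look up a value by its key

unique_numbers = set([1, 2, 2, 3])  # set: duplicates are dropped
has_two = 2 in unique_numbers       # membership test

print(fruits, point, ada_age, unique_numbers, has_two)
```

Trying to assign to a tuple item, for example `point[0] = 5`, raises a `TypeError`, which is what "immutable" means in practice.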

#### Exercise 1: Basic list usage
**Task**: Create a list named `colors` containing three colors. Then add a new color to the list and remove one.

In [None]:
# Add your code here:


#### Exercise 2: List comprehensions
For those already familiar with lists and how they work, the next step is to understand and write list comprehensions. List comprehensions provide a concise way to create lists by integrating a loop, conditional logic, and assignment in a single, readable line, embodying the Pythonic principle of clear, efficient, and elegant coding. 
As an example, consider this list comprehension, which creates a list of squares for the numbers from 1 to 5:

In [None]:
# Example of list comprehensions:
squares = [x**2 for x in range(1, 6)]
print(squares)

Now consider the following code snippet. It iterates over a list of numbers, checks whether each number is even, and if so, multiplies it by 2 and adds it to a new list.

In [None]:
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

doubled_evens = []  # Initialize an empty list to hold the results

for num in numbers:
    if num % 2 == 0:  # Check if the number is even
        doubled_evens.append(num * 2)  # Multiply the number by 2 and add it to the list

print("Original list:", numbers)
print("Doubled even numbers:", doubled_evens)

The challenge in this exercise is to write a list comprehension that is able to do this in one line of code. Try your hand at it in the cell below:

In [None]:
# Answer: 


### Functions in Python
Functions are reusable blocks of code designed to perform a specific task. Defined using the def keyword, followed by the function name and parentheses containing any parameters, functions encapsulate code for easy modularity and reuse. Once defined, a function can be called anywhere in your code by its name, allowing for cleaner, more organized, and efficient programming. Functions can return values using the return statement, allowing the function to pass data back to the caller. They are fundamental in Python, helping to break down complex problems into smaller, manageable tasks, and promoting code reuse and maintainability.

The following is a template for a function in Python. It includes:
- function_name: The name of the function, which should be descriptive and follow Python naming conventions.
- parameter1, parameter2, ...: Parameters the function takes as input, which are variables used within the function.
- A docstring: A multi-line comment that explains the function's purpose, its parameters, and what it returns. This is important for documentation and readability.
- The function body: The code block where the function's operations are defined.
- A return statement: This is optional, depending on whether your function needs to return a value.

In [None]:
def function_name(parameter1, parameter2, parameter3):
    """
    Docstring explaining the function's purpose and usage.

    Parameters:
    parameter1: Description of parameter1.
    parameter2: Description of parameter2.
    parameter3: Description of parameter3.
    ...

    Returns:
    Description of what the function returns.
    """
    # Function body starts here
    # Perform operations using parameters
    if parameter3:
        result = parameter1 + parameter2  # Example operation
    else:
        result = parameter1 - parameter2

    # Return the result (if applicable)
    return result

### Type hinting

To improve code readability and maintainability (and spend less time debugging!), it is useful to add **type hints** to your functions. This means indicating the types of the values your function takes as arguments and the type of the value it returns.

For the function in the example above, `parameter1` and `parameter2` are numbers and `parameter3` is a boolean. The value the function returns is also a number. Then the code can be updated as follows with type hints:

In [None]:
def function_name(
        parameter1: float, 
        parameter2: float, 
        parameter3: bool,
        ) -> float:
    """
    Docstring explaining the function's purpose and usage.

    Parameters:
    parameter1: Description of parameter1.
    parameter2: Description of parameter2.
    parameter3: Description of parameter3.

    Returns:
    Description of what the function returns.
    """
    # Function body starts here
    # Perform operations using parameters
    if parameter3:
        result = parameter1 + parameter2  # Example operation
    else:
        result = parameter1 - parameter2

    # Return the result (if applicable)
    return result
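
One point worth knowing is that Python stores type hints but does not check them when the function is called. The sketch below uses a hypothetical function `add_numbers` to show this; catching such mismatches requires a separate static type checker (mypy is one example):

```python
def add_numbers(a: float, b: float) -> float:
    """Hypothetical example: return the sum of a and b."""
    return a + b


# The hints are stored on the function and can be inspected.
print(add_numbers.__annotations__)

# The hints are not enforced at runtime: passing strings does not raise
# an error, it simply concatenates them.
print(add_numbers("1", "2"))  # prints 12 (the string "12")
```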

#### Exercise 3: Function
Use the template to write a function that doubles even numbers, as in the previous exercise. If you want, also add type hints.

In [None]:
# Your function goes here: 


### Decorator functions
A decorator in Python is a function that takes another function as an argument, extends its behavior without explicitly modifying it, and returns the modified function, providing a flexible and reusable way to augment function functionality. Decorators are useful because they allow for the extension and modification of function behavior in a clean, readable, and maintainable way. They enable code reuse, reduce redundancy, and can add functionality to existing functions or methods without altering their core logic. This is particularly valuable in scenarios where you want to apply the same piece of code, like logging, authorization, or performance timing, across multiple functions or methods, thus adhering to the "Don't Repeat Yourself" (DRY) principle. By separating concerns, decorators enhance code modularity and clarity, making it easier to manage and extend.

In the example below, we create a decorator called `log_execution`, which prints a message before and after the decorated function runs. To apply a decorator, place `@` followed by the decorator's name directly above a function definition; the function is then automatically wrapped and its behavior modified.

In [None]:
def log_execution(func):
    def wrapper(*args, **kwargs):
        print(f"Executing {func.__name__}...")
        result = func(*args, **kwargs)
        print(f"{func.__name__} executed.")
        return result
    return wrapper

@log_execution
def double_even_numbers(numbers):
    return [num * 2 for num in numbers if num % 2 == 0]



In [None]:
# Testing the decorated function
numbers = [1, 2, 3, 4, 5, 6]
double_even_numbers(numbers)
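
One side effect of wrapping is that the decorated function takes on the name and docstring of the inner `wrapper`. The sketch below is an optional refinement, not part of the workshop's `log_execution`: a hypothetical variant named `log_execution_wrapped` that uses `functools.wraps` from the standard library to preserve the original function's metadata.

```python
import functools


def log_execution_wrapped(func):
    # Hypothetical variant of log_execution; functools.wraps copies the
    # name and docstring of func onto the wrapper.
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        print(f"Executing {func.__name__}...")
        result = func(*args, **kwargs)
        print(f"{func.__name__} executed.")
        return result
    return wrapper


@log_execution_wrapped
def triple(number):
    """Return three times the given number."""
    return number * 3


print(triple(4))
print(triple.__name__)  # "triple" rather than "wrapper"
print(triple.__doc__)
```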

#### Exercise 4: Add time measurement in a decorator

Write a decorator named `time_execution` that measures the time in nanoseconds that a function takes to execute and prints this duration.
Write your decorator and add it to your function below:

In [None]:
import time

def time_execution(func):
# Your code here:



Test your decorated function in the cell below:

In [None]:
# Your code here:

### Time complexity and space complexity

When you write Python code, at the beginning you will mainly focus on getting to the solutions you need. The more comfortable you get with Python data structures, functions and algorithms, the more you can focus on **optimizing your code**.

There are many different ways to optimize your code. Two questions are often relevant: *how long does it take for my code to run?* and *how much memory is required while my code is running?*

Time complexity and space complexity refer to, respectively, the amount of time and the amount of memory required by an algorithm to run **as a function of the input size**.
Ideally, you want to optimize both memory and runtime, while keeping readability and simplicity high. But in practice one often has to make compromises. 

Time complexity matters most when your input is big: waiting seconds, minutes, or even hours for your code to run while you are testing it makes a big difference!
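
To get a feeling for what "as a function of the input size" means, the sketch below counts loop steps instead of measuring time. The two functions are hypothetical illustrations: one does a single pass over `n` items (linear growth), the other runs a loop inside a loop (quadratic growth).

```python
def count_linear_steps(n):
    """Count the steps of a single loop over n items."""
    steps = 0
    for _ in range(n):
        steps += 1
    return steps


def count_quadratic_steps(n):
    """Count the steps of a loop nested inside another loop."""
    steps = 0
    for _ in range(n):
        for _ in range(n):
            steps += 1
    return steps


# Multiplying the input size by 10 multiplies the linear count by 10,
# but the quadratic count by 100.
for n in (10, 100, 1000):
    print(n, count_linear_steps(n), count_quadratic_steps(n))
```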

#### BONUS Exercise: adding time complexity constraints to your solutions

Given an exercise, let's first try to solve it however you feel like it, then we will check whether the solution can be improved in terms of runtime (to test this, we can use the decorator we wrote earlier!)

##### Problem statement:

Given an array of integers `nums` and an integer `target`, return the indices of the two numbers such that they add up to target. 

You may assume that each input would have exactly one solution, and you may not use the same element twice.

##### Run this cell to create a test function `test_funct` that we will use to test our algorithms


In [None]:
# run this cell to load the data and create a test function that, given a function as input, tests the function and prints the output
import pickle
with open("Data/two_sums_20000.pkl", "rb") as data_set:
    data = pickle.load(data_set) 
with open("Data/two_sums_200000.pkl", "rb") as data_set:
    bigger_data = pickle.load(data_set) 

def test_funct(funct, include_big_data=True):
    # Small hand-written cases plus the dataset loaded above
    test_sets = [
        {"nums": [2, 7, 11, 15], "target": 9},
        {"nums": [3, 2, 4], "target": 6},
        {"nums": [3, 3], "target": 6},
        {"nums": data, "target": 21},
    ]

    # Optionally add the larger dataset as a final test case
    if include_big_data:
        test_sets.append({"nums": bigger_data, "target": 20})

    for test_set in test_sets:
        nums, target = test_set["nums"], test_set["target"]
        indices = funct(nums,target)
        print(f"The indices of the two numbers that sum to {target} are: {indices} and the numbers are: {nums[indices[0]]} and {nums[indices[1]]}")

##### Part 1: just try to solve the problem!

In [None]:

def two_sum_brute_force(nums, target):
# Your code here:


In [None]:
# run this cell to test the function
test_funct(two_sum_brute_force, include_big_data=False)

#what happens if you run it and include the big data?

Reflect on the time complexity of your algorithm. How many times does your code have to iterate through an arbitrary input `nums` before finding the solution?

##### Part 2: Can you do better? 
The most intuitive way to solve Part 1 is by brute force (the solution in our answer sheet runs in quadratic time, or *O(n<sup>2</sup>)*, because in the worst case the algorithm makes on the order of `len(nums)**2` iterations). With this kind of algorithm, big inputs might not be manageable (as you might have noticed by running the code on the bigger dataset).

How to improve it? Can you take advantage of anything in the problem statement that would on average lower the runtime? Or can you think about a more efficient algorithm? 

We have thought of two alternative solutions: one that manipulates the input so that you can apply a search algorithm to it, and one that takes advantage of Python data structures that allow for fast retrieval. Can you think of these solutions? Can you think of other ways of solving this problem efficiently?

*NOTE: if you cannot think of another option, it might be useful to check the answer and to at least try to run the algorithms that we have prepared, to get a feeling of the difference in runtime!*

In [None]:
@time_execution
def two_sum_search_algorithm(nums, target):
#your code here:

In [None]:
#run this cell to test the function
test_funct(two_sum_search_algorithm)

In [None]:
@time_execution
def two_sum_dictionary(nums, target):
#your code here:

In [None]:
#run this cell to test the function
test_funct(two_sum_dictionary)