
#Check out the following links for more information:

https://realpython.com/python-concurrency/

https://hackernoon.com/threaded-asynchronous-magic-and-how-to-wield-it-bba9ed602c32

https://stackoverflow.com/questions/33047452/definitive-list-of-common-reasons-for-segmentation-faults

https://sites.google.com/a/case.edu/hpcc/home/important-notes-for-new-users/debugging-segmentation-faults



#Readable Python code on GitHub:

https://github.com/fogleman/Minecraft

https://github.com/cherrypy/cherrypy

https://github.com/pallets/flask

https://github.com/tornadoweb/tornado

https://github.com/gleitz/howdoi

https://github.com/bottlepy/bottle/blob/master/bottle.py

https://github.com/sqlalchemy/sqlalchemy

#More About Managing Resources

Check out the following links for more information:

https://realpython.com/python-concurrency/

https://hackernoon.com/threaded-asynchronous-magic-and-how-to-wield-it-bba9ed602c32

https://www.pluralsight.com/blog/tutorials/how-to-profile-memory-usage-in-python

https://www.linuxjournal.com/content/troubleshooting-network-problems

#Debugging and Solving Software Problems

Improve performance
Once you've debugged the issue, the program starts processing the file, but it takes a long time to complete: it works through the data slowly, line by line, instead of printing the report quickly. You need to find out why the program is slow and then fix it. In this section, you find the bottlenecks, improve the code, and make it finish faster.
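
One way to look for bottlenecks is Python's built-in cProfile module. The sketch below is only an illustration: slow_lookup() and run_queries() are made-up stand-ins for a function that rescans all of its data on every query, not part of the lab script.

```python
import cProfile
import io
import pstats


def slow_lookup(rows, wanted):
    """Hypothetical stand-in: rescans every row for each query."""
    return [row for row in rows if row == wanted]


def run_queries():
    """Hypothetical workload: 100 queries over the same 5000 rows."""
    rows = list(range(5000))
    return [slow_lookup(rows, wanted) for wanted in range(100)]


# Collect timing data while the workload runs.
profiler = cProfile.Profile()
profiler.enable()
run_queries()
profiler.disable()

# Write the five most expensive entries (by cumulative time) to a string.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

In the printed table, a function with a high call count and a large cumulative time is a good candidate for the kind of repeated work this section asks you to remove.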

The problem with the script is that it’s downloading the whole file and then going over it for each date. The current script takes almost 2 minutes to complete for 2019-01-01. An optimized script should generate reports for the same date within a few seconds.

To measure how long a script takes to run, prefix the command that runs it with "time".
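
The "time" prefix measures the whole run. If you want to time a single function from inside Python instead, a minimal sketch using time.perf_counter() could look like this. The timed() helper is a hypothetical name introduced here for illustration.

```python
import time


def timed(func, *args, **kwargs):
    """Run func once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


# Example: time the built-in sum() over a million integers.
total, seconds = timed(sum, range(1_000_000))
print("sum took {:.4f} seconds".format(seconds))
```

Timing the same function before and after a change gives a quick check that an optimization actually helped.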

This is a pretty challenging task that you have to complete by modifying the get_same_or_newer() function.

Here are a few hints to fix this issue:

Download the file only once from the URL.

Pre-process the data so that the same calculation doesn't need to be done over and over again. There are two ways to do this; choose either one and modify the script accordingly:

- Create a dictionary keyed by start date, then use the data in the dictionary instead of the complicated calculation.
- Sort the data by start_date, then go through it date by date.
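
The dictionary option can be sketched roughly as follows. This is one possible approach, not the lab's reference solution: build_index(), same_or_newer(), and the sample rows are made up for illustration, and the column layout (first name at index 0, last name at index 1, start date at index 3) is assumed from the script in this notebook.

```python
import bisect
import datetime


def build_index(rows):
    """Group employee names by start date, parsing each date only once."""
    by_date = {}
    for row in rows:
        start = datetime.datetime.strptime(row[3], "%Y-%m-%d")
        by_date.setdefault(start, []).append("{} {}".format(row[0], row[1]))
    # A sorted list of the distinct dates allows a fast binary search later.
    return by_date, sorted(by_date)


def same_or_newer(by_date, sorted_dates, start_date):
    """Return (date, employees) for the first date >= start_date, or None."""
    pos = bisect.bisect_left(sorted_dates, start_date)
    if pos == len(sorted_dates):
        return None
    date = sorted_dates[pos]
    return date, by_date[date]


# Made-up sample rows; index 2 is a placeholder for an unused column.
sample_rows = [
    ["Ana", "Lopez", "n/a", "2019-01-03"],
    ["Bo", "Chen", "n/a", "2019-01-03"],
    ["Cy", "Diaz", "n/a", "2018-05-20"],
]
by_date, sorted_dates = build_index(sample_rows)
print(same_or_newer(by_date, sorted_dates, datetime.datetime(2019, 1, 1)))
```

Building the index costs one pass over the data. After that, each query is a binary search plus a dictionary lookup rather than a scan of every row.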

Once you’ve finished modifying the Python script, save the file by pressing Ctrl-O, then the Enter key, then Ctrl-X.

import csv
import datetime
import requests

def download_file(url):
    """Returns the lines contained in the file at the given URL"""

    # Download the file over the internet, failing early on an HTTP error
    response = requests.get(url, stream=True)
    response.raise_for_status()

    # Decode each line from bytes to text as it arrives
    return [line.decode("UTF-8") for line in response.iter_lines()]

def get_start_date():
    """Interactively get the start date to query for."""

    print()
    print('Getting the first start date to query for.')
    print()
    print('The date must be greater than Jan 1st, 2018')
    year = int(input('Enter a value for the year: '))
    month = int(input('Enter a value for the month: '))
    day = int(input('Enter a value for the day: '))
    print()

    return datetime.datetime(year, month, day)

def get_same_or_newer(data, start_date):
    """Returns the employees that started on the given date, or the closest one."""
    # We want all employees that started at the same date or the closest newer
    # date. To calculate that, we go through all the data and find the
    # employees that started on the smallest date that's equal or bigger than
    # the given start date.
    min_date = datetime.datetime.today()
    min_date_employees = []
    for row in data: 
        row_date = datetime.datetime.strptime(row[3], '%Y-%m-%d')

        # If this date is smaller than the one we're looking for,
        # we skip this row
        if row_date < start_date:
            continue

        # If this date is smaller than the current minimum,
        # we pick it as the new minimum, resetting the list of
        # employees at the minimal date.
        if row_date < min_date:
            min_date = row_date
            min_date_employees = []

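        # If this date is the same as the current minimum,
        # we add the employee in this row to the list of
        # employees at the minimal date.
        if row_date == min_date:
            min_date_employees.append("{} {}".format(row[0], row[1]))
        else:
            # main() sorts the data by start date, so every remaining row
            # is newer than the current minimum and the scan can stop here.
            break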

    return min_date, min_date_employees

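def list_newer(data, start_date):
    """Prints the employees for each start date from start_date up to today."""
    while start_date < datetime.datetime.today():
        start_date, employees = get_same_or_newer(data, start_date)

        # An empty list means no employee started on or after start_date,
        # so there is nothing left to report.
        if not employees:
            break
        print("Started on {}: {}".format(start_date.strftime("%b %d, %Y"), employees))

        # Now move the date to the next one
        start_date = start_date + datetime.timedelta(days=1)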

def main():
    start_date = get_start_date()
    FILE_URL = "https://storage.googleapis.com/gwg-hol-assets/gic215/employees-with-date.csv"
    data = download_file(FILE_URL) # Hint 1. Download the file only once from the URL.
    reader = csv.reader(data[1:])
    data_list = list(reader)
    data_list.sort(key = lambda x: x[3]) # Hint 2. To sort the data by start_date and then go date by date.
    list_newer(data_list, start_date) 

if __name__ == "__main__":
    main()