SIADS 515 Week 3 Homework (HW3)
The notebook for this assignment is a little different. The existing code contains bugs and other problems that you need to fix in order to get the assert statements to pass. There are no hidden grader cells in this assignment. If you run the first cell under each question title, it will either raise errors or produce the wrong answer. Fix these to get the assert statements to pass. Since there are no hidden assert statements, there's no real benefit to submitting before you've finished your work.

Think of this assignment in the following way: it's your first day on the job and you've been given a notebook that was authored by someone who is no longer with your company. You've been asked to fix it. There are errors in it, and some of it was not completed by the original author. You're lucky, though, as there are assertions sprinkled throughout the notebook to help guide you along the way.

Top-level goal of notebook: Read a CSV file into a pandas DataFrame and add new columns to it. Each new column is produced by applying a function to one of the existing columns. The columns to add include:

    1. A datetime column that converts "Garmin time" to standard (Unix epoch) time. Note that Garmin doesn't use the standard epoch for its timestamps. Rather than counting the number of seconds that have elapsed since midnight on January 1, 1970, Garmin counts the number of seconds since midnight on December 31, 1989.

    2. A conversion of "semicircles" of latitude and longitude to two different formats: degrees, minutes, seconds 3-tuples and fractional degrees. For example, a latitude of 504719750 semicircles corresponds to a 3-tuple of degrees, minutes and seconds of (42, 18, 18.43) and 42.305121 degrees.

    3. A "normalized speed" column that contains the speed values with outliers replaced by upper and lower bounds and with the result normalized to z-values (i.e., by subtracting the mean from each value and dividing the result by the standard deviation).
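
As a rough illustration of item 3, the sketch below clips a speed series to lower and upper bounds and then converts the result to z-values. The function name `normalize_speed`, the quantile-based bounds (5th and 95th percentiles), and the sample data are assumptions made for this example; the notebook may define its bounds differently.

```python
import pandas as pd


def normalize_speed(speed, lower_q=0.05, upper_q=0.95):
    """Clip a speed Series to quantile bounds, then convert it to z-values.

    The quantile bounds are a hypothetical choice for illustration only.
    """
    lower = speed.quantile(lower_q)
    upper = speed.quantile(upper_q)
    # Replace values outside [lower, upper] with the nearest bound.
    clipped = speed.clip(lower=lower, upper=upper)
    # Subtract the mean and divide by the (sample) standard deviation.
    return (clipped - clipped.mean()) / clipped.std()


# Made-up speeds with one obvious outlier at the end.
speeds = pd.Series([4.0, 5.0, 5.5, 6.0, 6.5, 7.0, 30.0])
z = normalize_speed(speeds)
print(z.round(3).tolist())
```

Note that `Series.std()` uses the sample standard deviation (`ddof=1`) by default, so the resulting z-values have a sample standard deviation of 1.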

In addition, you will need to complete a function that looks at the difference between sequential rows to determine whether the cyclist is slowing down or not.
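
One possible way to compare sequential rows is pandas' `Series.diff()`, which subtracts the previous row from the current one. The sketch below is only an assumption about how such a check could look; the helper name `is_slowing` and the sample speeds are hypothetical and not taken from the notebook.

```python
import pandas as pd


def is_slowing(speed):
    """Return a boolean Series that is True where speed dropped
    relative to the previous row."""
    # diff() yields NaN for the first row; NaN < 0 evaluates to False,
    # so the first row is never flagged as slowing.
    return speed.diff() < 0


# Made-up speeds for illustration.
speeds = pd.Series([5.0, 6.0, 5.5, 5.5, 4.0])
flags = is_slowing(speeds)
print(flags.tolist())  # [False, False, True, False, True]
```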

Your task for this assignment is to debug this notebook to produce the desired results as shown in the assertions below.

Question 1

In [None]:
import pandas as pd
import numpy as np

ride = pd.read_csv('assets/ride_final2.csv')
ride.head()

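# Seconds from the Unix epoch (midnight UTC, January 1, 1970)
# to the Garmin epoch (midnight UTC, December 31, 1989).
GARMIN_EPOCH_OFFSET = 631065600


def garmin_time_to_datetime(series):
    """Convert Garmin FIT time to datetimes by adding the number of
    seconds from January 1, 1970 to December 31, 1989.
    """
    return pd.to_datetime(series + GARMIN_EPOCH_OFFSET, unit='s', utc=True)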
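
The number of seconds between the two epochs can be derived rather than typed in by hand, which makes the constant easy to check. The following standard-library sketch computes it; the names `GARMIN_EPOCH`, `UNIX_EPOCH`, `offset_seconds`, and `garmin_to_utc` are introduced here for illustration and are not part of the notebook.

```python
from datetime import datetime, timezone

# Both epochs as timezone-aware UTC datetimes.
GARMIN_EPOCH = datetime(1989, 12, 31, tzinfo=timezone.utc)
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Whole seconds separating the two epochs.
offset_seconds = int((GARMIN_EPOCH - UNIX_EPOCH).total_seconds())
print(offset_seconds)  # 631065600


def garmin_to_utc(garmin_seconds):
    """Convert a single Garmin timestamp (in seconds) to a UTC datetime."""
    return datetime.fromtimestamp(garmin_seconds + offset_seconds, tz=timezone.utc)


# A Garmin timestamp of 0 should land exactly on the Garmin epoch.
print(garmin_to_utc(0))
```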

Question 2

In [None]:
def semicircles_to_degrees(semicircles):
    '''
    Convert semicircles to degrees
    '''
    # 2**31 semicircles correspond to 180 degrees
    semicircles_per_180_degrees = 2**31
    return semicircles * (180 / semicircles_per_180_degrees)


def degrees_to_dms(degrees_fraction):
    ''' Convert degrees to degree, minute, second 3-tuples'''
    degrees = int(degrees_fraction)
    minutes_fraction = (degrees_fraction - degrees) * 60
    minutes = int(minutes_fraction)
    seconds = round((minutes_fraction - minutes) * 60, 5)
    return (degrees, abs(minutes), abs(seconds))


def dms_to_degrees(d, m, s):
    ''' Convert degrees, minutes, seconds to fractional degrees'''
    return d + m / 60 + s / 3600
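
The assertions in this notebook only round-trip a positive latitude. For negative values, `degrees_to_dms` returns a negative degree with non-negative minutes and seconds, so adding the three parts back together moves the result toward zero. The sketch below shows one possible sign-aware variant; the function names `degrees_to_signed_dms` and `signed_dms_to_degrees` and the explicit `sign` element are assumptions for illustration, not part of the assignment.

```python
import math


def degrees_to_signed_dms(degrees_fraction):
    """Split decimal degrees into (sign, degrees, minutes, seconds).

    The sign is carried separately, so values between -1 and 0
    do not lose their sign when the degree part is 0.
    """
    sign = -1 if degrees_fraction < 0 else 1
    magnitude = abs(degrees_fraction)
    degrees = int(magnitude)
    minutes_fraction = (magnitude - degrees) * 60
    minutes = int(minutes_fraction)
    seconds = round((minutes_fraction - minutes) * 60, 5)
    return sign, degrees, minutes, seconds


def signed_dms_to_degrees(sign, d, m, s):
    """Recombine the parts and apply the sign to the whole value."""
    return sign * (d + m / 60 + s / 3600)


parts = degrees_to_signed_dms(-83.739442)
restored = signed_dms_to_degrees(*parts)
print(parts, restored)
```

Because the seconds are rounded to five decimal places, the restored value is compared with a small tolerance (for example via `math.isclose`) rather than with `==`.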

Correcting the Conversion Function
The conversion should follow the Garmin convention, in which 2^31 semicircles correspond to 180 degrees:

degrees = semicircles × 180 / 2^31

A signed 32-bit semicircle value therefore maps onto the range from -180 degrees up to (but not including) 180 degrees.
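
As a quick check of the formula, the sketch below applies it to the example from the introduction (504719750 semicircles) and to the extremes of a signed 32-bit integer. The constant name `SEMICIRCLES_PER_180_DEGREES` is chosen here for illustration.

```python
# 2**31 semicircles correspond to 180 degrees.
SEMICIRCLES_PER_180_DEGREES = 2 ** 31


def semicircles_to_degrees(semicircles):
    """Convert a semicircle count to decimal degrees."""
    return semicircles * 180 / SEMICIRCLES_PER_180_DEGREES


# Example latitude from the assignment description.
example = round(semicircles_to_degrees(504719750), 6)
print(example)  # 42.305121

# Extremes of a signed 32-bit integer.
lowest = semicircles_to_degrees(-2 ** 31)
highest = semicircles_to_degrees(2 ** 31 - 1)
print(lowest, highest)
```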

In [None]:
# personal test cell

# Apply the conversions to the ride DataFrame 
ride['Latitude_degrees'] = ride['Latitude'].map(semicircles_to_degrees) 
ride['Longitude_degrees'] = ride['Longitude'].map(semicircles_to_degrees) 
ride['Latitude_dms'] = ride['Latitude_degrees'].map(degrees_to_dms) 
ride['Longitude_dms'] = ride['Longitude_degrees'].map(degrees_to_dms)

# Run the provided assertions 
dms = degrees_to_dms(42.2833333) 
assert dms[0] >= -180, "dms[0] must be greater than or equal to -180" 
assert dms[0] <= 180, "dms[0] must be less than or equal to 180" 
assert dms[1] >= 0, "dms[1] must be greater than or equal to 0" 
assert dms[1] < 60, "dms[1] must be less than 60" 
assert dms[2] >= 0, "dms[2] must be greater than or equal to 0" 
assert dms[2] < 60, "dms[2] must be less than 60" 
assert dms == (42, 16, 59.99988), "dms value is not correct" 
assert dms_to_degrees(dms[0], dms[1], dms[2]) == 42.2833333, "dms_to_degrees() conversion is not correct" 

# Debugging: Checking the original Latitude and Longitude values in semicircles for the last row 
last_row = ride.iloc[213] 
print("Original Latitude semicircles:", last_row.Latitude) 
print("Original Longitude semicircles:", last_row.Longitude) 

# Checking the converted Latitude and Longitude in degrees 
converted_latitude = semicircles_to_degrees(last_row.Latitude) 
converted_longitude = semicircles_to_degrees(last_row.Longitude) 
print("Converted Latitude degrees:", round(converted_latitude, 6)) 
print("Converted Longitude degrees:", round(converted_longitude, 6)) 

# Validate the expected converted values 
assert round(converted_latitude, 6) == 42.280569, \
    "Last row of ride does not have the correct Latitude_degrees value" 
assert round(converted_longitude, 6) == -83.739442, \
    "Last row of ride does not have the correct Longitude_degrees value"

print("All assertions passed. The notebook is successfully debugged.")

# Output
Original Latitude semicircles: 504426837
Original Longitude semicircles: -999050457
Converted Latitude degrees: 42.280569
Converted Longitude degrees: -83.739442
All assertions passed. The notebook is successfully debugged.
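
The round-trip assertion above compares floating-point values with `==`, which passes here but can be brittle when the arithmetic changes slightly. When writing your own checks, a tolerance-based comparison is one alternative; the sketch below uses `math.isclose` with an absolute tolerance of `1e-9`, a value chosen only for illustration.

```python
import math


def dms_to_degrees(d, m, s):
    """Convert degrees, minutes, seconds to fractional degrees."""
    return d + m / 60 + s / 3600


value = dms_to_degrees(42, 16, 59.99988)

# Accept any result within 1e-9 degrees of the expected value.
close_enough = math.isclose(value, 42.2833333, rel_tol=0, abs_tol=1e-9)
print(value, close_enough)
```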

In [None]:
# This is a read-only grader cell

dms = degrees_to_dms(42.2833333)
assert dms[0] >= -180, "dms[0] must be greater than or equal to -180"
assert dms[0] <= 180, "dms[0] must be less than or equal to 180"
assert dms[1] >= 0, "dms[1] must be greater than or equal to 0"
assert dms[1] < 60, "dms[1] must be less than 60"
assert dms[2] >= 0, "dms[2] must be greater than or equal to 0"
assert dms[2] < 60, "dms[2] must be less than 60"
assert dms == (42, 16, 59.99988), "dms value is not correct"
assert dms_to_degrees(dms[0], dms[1], dms[2]) == 42.2833333, "dms_to_degrees() conversion is not correct"

ride['Latitude_degrees'] = ride['Latitude'].map(semicircles_to_degrees)
ride['Longitude_degrees'] = ride['Longitude'].map(semicircles_to_degrees)
ride['Latitude_dms'] = ride['Latitude_degrees'].map(degrees_to_dms)
ride['Longitude_dms'] = ride['Longitude_degrees'].map(degrees_to_dms)

last_row = ride.iloc[213]
assert round(last_row.Latitude_degrees,6) == 42.280569, \
    "Last row of ride does not have the correct Latitude_degrees value"
assert round(last_row.Longitude_degrees,6) == -83.739442, \
    "Last row of ride does not have the correct Longitude_degrees value"