<a href='https://ai.meng.duke.edu'><img align="left" style="padding-top:10px;" src="https://storage.googleapis.com/aipi_datasets/Duke-AIPI-Logo.png"></a>

# In-class Warmup Exercise
### Question 1
Let's start with a warmup exercise using just Python.  We are going to perform some basic analysis of the stock price for a fictional company NewCo.  Let's suppose we receive some data on the last 10 days of NewCo's stock price.  The data is in the form of a dictionary 'newco_dict' where the keys are ['DATE','LOW','HIGH','CLOSE'].  'DATE' contains a list of string dates. 'LOW' and 'HIGH' each contain a list of the lowest price and highest price respectively of the stock during each day in the 10-day period. 'CLOSE' contains a list of the closing price of the stock at the end of each day during the 10-day period.

For example, newco_dict might look like the following:  

```
newco_dict = {'DATE':['06-14-21','06-15-21','06-16-21','06-17-21','06-18-21',
                      '06-21-21','06-22-21','06-23-21','06-24-21','06-25-21'],
              'LOW':[80,72,70,74,82,88,94,88,73,67],
              'HIGH':[95,81,76,88,90,96,102,94,90,72],
              'CLOSE':[90,75,75,86,90,93,100,91,75,68]}
```

Our goal in this assignment will be to find all the dates during the 10-day period when the closing price of the stock was equal to the mode (most frequent) value of the closing prices over the 10-day period.  We are going to tackle this in two parts.  First we will create a function to find the mode of the closing prices.  In part 2 we will then create a function to identify the dates on which the closing price was equal to the mode.


NOTE: For all parts of Question 1, you may not import any additional libraries - I would like you to solve these using only the Python you have learned so far and no helper libraries unless already imported for you. 

#### Question 1.1
Let's first define a function `most_freq_closing(price_dict)` which takes as input a dictionary of prices in the form of 'newco_dict' and returns the most frequent closing price (the mode of the closing prices) during the period as an integer.  You may assume that the dictionary is non-empty and that there is a single mode price (in real life, some datasets may have no mode, if all elements occur equally often, or may have more than one mode).

There are many ways to approach this problem.  One method (although there are others) might be to iterate through the set of the unique closing values and count how often each one appears during the 10 days.  The mode is the one that appears most frequently (has the highest count of appearances).  
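As a rough illustration of the counting idea described above, here is a minimal sketch on a small toy list (not the NewCo data). The names `prices`, `counts`, `best_price` and `best_count` are hypothetical and chosen only for this example. It assumes there is a single most frequent value.

```python
# Toy data for illustration only (not the NewCo closing prices)
prices = [4, 7, 4, 9, 7, 4]

# Count how often each unique value appears in the list
counts = {}
for price in set(prices):
    counts[price] = prices.count(price)

# Walk through the counts and keep the value with the highest count
best_price = None
best_count = 0
for price, count in counts.items():
    if count > best_count:
        best_price, best_count = price, count

print(best_price)  # 4, which appears three times
```

Calling `list.count` once per unique value rescans the list each time, which is fine for 10 days of data. For longer lists, a single pass that increments a dictionary entry per element does the same job with less work.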

How you structure your solution is up to you, although you must complete the `most_freq_closing` function below.  You may include all your solution code within that function or break sub-tasks out into separate smaller functions (which you can insert into the same cell or add a new cell).

In [5]:
def most_freq_closing(price_dict):
    '''
    Finds the most frequent closing price of a stock

    Inputs:
        price_dict(dict): Dictionary containing DATE, LOW, HIGH, CLOSE prices for an arbitrary number of days

    Returns:
        mode_val(int): Most frequent closing price
    '''
    ### BEGIN SOLUTION ###
    # Identifies the mode of the stock closing prices
    count_dict = {}
    # Iterate through all closing prices and count occurrences of each
    for val in price_dict['CLOSE']:
        if val in count_dict:
            count_dict[val] += 1
        else:
            count_dict[val] = 1
    max_count = max(count_dict.values())
    # Get the closing price corresponding to the max count
    for key in count_dict.keys():
        if count_dict[key] == max_count:
            mode_val = key
    return mode_val
    ### END SOLUTION ###

In [6]:
# `most_freq_closing()`: test cell
newco_dict = {'DATE':['06-14-21','06-15-21','06-16-21','06-17-21','06-18-21',
                      '06-21-21','06-22-21','06-23-21','06-24-21','06-25-21'],
              'LOW':[80,72,70,74,82,88,94,88,73,67],
              'HIGH':[95,81,76,88,90,96,102,94,90,72],
              'CLOSE':[90,75,75,86,90,93,100,91,75,68]}
mode_price = most_freq_closing(newco_dict)
print(f'Your function calculated the most frequent price as {mode_price}')
assert type(mode_price) == int, 'Your function did not return an integer'
assert mode_price == 75,'Incorrect, check your function and try again'
print('\n(Passed visible tests!)')

### BEGIN HIDDEN TESTS
newco_dict2 = {'DATE':['06-14-21','06-15-21','06-16-21','06-17-21','06-18-21',
                      '06-21-21','06-22-21','06-23-21','06-24-21','06-25-21'],
              'LOW':[80,72,70,74,82,88,94,88,73,67],
              'HIGH':[95,81,76,88,90,96,102,94,90,72],
              'CLOSE':[90,74,90,86,90,93,100,91,75,68]}
mode_price = most_freq_closing(newco_dict2)
assert type(mode_price) == int, 'Your function did not return an integer'
assert mode_price == 90,'Incorrect, check your function and try again'
print('\n(Passed hidden tests!)')
### END HIDDEN TESTS

Your function calculated the most frequent price as 75

(Passed visible tests!)

(Passed hidden tests!)


#### Question 1.2
Let's now make use of our new function `most_freq_closing` to find all dates on which the most frequent closing price occurred.  Write a new function `dates_most_freq_closing(price_dict)` which finds and returns the dates on which the most frequent (mode) closing price occurs.  The input to the function is a dictionary of the form of `newco_dict` and the output is a **list** of dates.  Your function should make use of the function `most_freq_closing()` that you just wrote to find the mode closing price, and then you should find all dates on which it occurs.
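One possible way to pick out entries from one list based on values in a parallel list is to walk through both lists together with `zip`. The sketch below uses toy data, and the names `days`, `closes`, `target` and `matching_days` are hypothetical. It is only one of several approaches that would work here.

```python
# Toy parallel lists: days[i] goes with closes[i]
days = ['Mon', 'Tue', 'Wed', 'Thu']
closes = [10, 12, 10, 11]
target = 10  # the value we want to match

# Pair each day with its closing value and keep the days that match
matching_days = []
for day, close in zip(days, closes):
    if close == target:
        matching_days.append(day)

print(matching_days)  # ['Mon', 'Wed']
```

An index-based approach with `enumerate` gives the same result. `zip` simply avoids handling the index explicitly.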

In [7]:
def dates_most_freq_closing(price_dict):
    '''
    Finds the dates on which the closing price is the mode of the time period of the data provided as an input

    Inputs:
        price_dict(dict): Dictionary containing DATE, LOW, HIGH, CLOSE prices for an arbitrary number of days

    Returns:
        dates(list): List of dates on which the close price is the mode
    '''
    # Returns a list of dates on which the stock closing price was equal to the mode
    ### BEGIN SOLUTION ### 
    # Get the mode of the closing prices
    mode_price = most_freq_closing(price_dict)
    # Find the index values of the list of closing prices where the closing prices equals mode
    idxs = [i for i,value in enumerate(price_dict['CLOSE']) if value==mode_price]
    # Use the index values to get the dates at those index values when closing price equals mode
    dates = [price_dict['DATE'][i] for i in idxs]
    return dates
    ### END SOLUTION ###

In [8]:
# `dates_most_freq_closing()`: test cell
newco_dict = {'DATE':['06-14-21','06-15-21','06-16-21','06-17-21','06-18-21',
                      '06-21-21','06-22-21','06-23-21','06-24-21','06-25-21'],
              'LOW':[80,72,70,74,82,88,94,88,73,67],
              'HIGH':[95,81,76,88,90,96,102,94,90,72],
              'CLOSE':[90,75,75,86,90,93,100,91,75,68]}

# Check that dates_most_freq_closing uses most_freq_closing
orig_most_freq_closing = most_freq_closing
del most_freq_closing
try:
    dates_most_freq_closing(newco_dict)
except NameError:
    pass
else:
    raise AssertionError("dates_most_freq_closing does not use most_freq_closing")
finally:
    most_freq_closing = orig_most_freq_closing
    
# Check output
dates = dates_most_freq_closing(newco_dict)
print(f'Your function returned the dates: {dates}')
assert type(dates) == list, 'Your function did not return a list of dates'
assert type(dates[0]) == str, 'The values in your list are not strings'
assert dates == ['06-15-21', '06-16-21', '06-24-21'], 'Incorrect, check your function and try again'
print('\n(Passed visible tests!)')

Your function returned the dates: ['06-15-21', '06-16-21', '06-24-21']

(Passed visible tests!)


### Question 2
#### Question 2.1
Let's now redefine the function `most_freq_closing(price_dict)`, this time using NumPy.  As in Question 1.1, it takes as input a dictionary of prices in the form of 'newco_dict' and returns the most frequent closing price (the mode of the closing prices) during the period as an integer.  You may assume that the dictionary is non-empty and that there is a single mode price (in real life, some datasets may have no mode, if all elements occur equally often, or may have more than one mode).

NOTE: for this problem you MUST use the NumPy library, which I have imported above (make sure you already have NumPy installed - if not, first follow the instructions in the setup.md document in the Module 2 Introduction section of Sakai).  **You should not use any loops (including list comprehensions) in your solution.**

One solution might be to count the occurrences of each unique closing value during the period, and then find the closing value with the highest count of occurrences.  You can do these steps using NumPy without any loops.  There are multiple ways to solve this.  You may include all your code in the `most_freq_closing` function below or you may choose to add additional functions for each sub-task which are then called by `most_freq_closing`.
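To make the loop-free counting idea concrete, here is a minimal sketch on toy arrays (not the NewCo data). The names `values`, `tied` and `all_modes` are hypothetical. It relies on `np.unique` with `return_counts=True`, which returns the sorted unique values together with how often each occurs, and on `np.argmax`, which returns the index of the first maximum.

```python
import numpy as np

# Toy data for illustration only
values = np.array([4, 7, 4, 9, 7, 4])

# Sorted unique values and the number of times each appears
uniq_vals, counts = np.unique(values, return_counts=True)
# uniq_vals -> [4 7 9], counts -> [3 2 1]

# Index of the largest count, then the value at that index
mode_val = int(uniq_vals[np.argmax(counts)])
print(mode_val)  # 4

# When several values tie, a Boolean comparison recovers all of them
tied = np.array([1, 1, 2, 2, 3])
t_vals, t_counts = np.unique(tied, return_counts=True)
all_modes = t_vals[t_counts == t_counts.max()].tolist()
print(all_modes)  # [1, 2]
```

Because `np.unique` returns sorted values and `np.argmax` picks the first maximum, a tie would resolve to the smallest tied value. That does not matter under the single-mode assumption stated above.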

In [11]:
import numpy as np

def most_freq_closing(price_dict):
    '''
    Finds the most frequent closing price of a stock

    Inputs:
        price_dict(dict): Dictionary containing DATE, LOW, HIGH, CLOSE prices for an arbitrary number of days

    Returns:
        mode_val(int): Most frequent closing price
    '''
    ### BEGIN SOLUTION ###
    # Convert closing prices to an array
    closing_array = np.array(price_dict['CLOSE'])
    # Get the unique values and value counts of each
    uniq_vals, counts = np.unique(closing_array,return_counts=True)
    mode_idx = np.argmax(counts)
    mode_val = uniq_vals[mode_idx]
    return int(mode_val)
    ### END SOLUTION ###

In [12]:
# `most_freq_closing()`: test cell
newco_dict = {'DATE':['06-14-21','06-15-21','06-16-21','06-17-21','06-18-21',
                      '06-21-21','06-22-21','06-23-21','06-24-21','06-25-21'],
              'LOW':[80,72,70,74,82,88,94,88,73,67],
              'HIGH':[95,81,76,88,90,96,102,94,90,72],
              'CLOSE':[90,75,75,86,90,93,100,91,75,68]}
mode_price = most_freq_closing(newco_dict)
print(f'Your function calculated the most frequent price as {mode_price}')
assert type(mode_price) == int, 'Your function did not return an integer'
assert mode_price == 75,'Incorrect, check your function and try again'
print('\n(Passed visible tests!)')

### BEGIN HIDDEN TESTS
newco_dict2 = {'DATE':['06-14-21','06-15-21','06-16-21','06-17-21','06-18-21',
                      '06-21-21','06-22-21','06-23-21','06-24-21','06-25-21'],
              'LOW':[80,72,70,74,82,88,94,88,73,67],
              'HIGH':[95,81,76,88,90,96,102,94,90,72],
              'CLOSE':[90,75,90,86,90,93,100,91,75,68]}
mode_price = most_freq_closing(newco_dict2)
print(f'Your function calculated the most frequent price as {mode_price}')
assert type(mode_price) == int, 'Your function did not return an integer'
assert mode_price == 90,'Incorrect, check your function and try again'
print('\n(Passed hidden tests!)')
### END HIDDEN TESTS

Your function calculated the most frequent price as 75

(Passed visible tests!)
Your function calculated the most frequent price as 90

(Passed hidden tests!)


#### Question 2.2
Let's now make use of our new function `most_freq_closing` to find all dates on which the most frequent closing price occurred.  Write a new function `dates_most_freq_closing(price_dict)` which finds and returns the dates on which the most frequent (mode) closing price occurs.  The input to the function is a dictionary of the form of `newco_dict` and the output is a **list** of dates.  Your function should make use of the function `most_freq_closing()` that you just wrote to find the mode closing price, and then you should find all dates on which it occurs.

NOTE: for this problem you MUST use the NumPy library, which I have imported above (make sure you already have NumPy installed - if not, first follow the instructions in the setup.md document in the Module 2 Introduction section of Sakai).  You MAY NOT use any loops (including list comprehensions) in your solution.

Hint (only read if you need it, otherwise skip!):  
- You might find it helpful to filter an array of the closing prices to those which match the most frequent price, and then use the resulting Boolean mask (see example in demo notebook) to filter the dates
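For readers who want a refresher on Boolean masks, the following is a small sketch on toy arrays (not the NewCo data). The names `labels`, `scores`, `mask` and `selected` are hypothetical. Comparing an array to a scalar yields an array of `True`/`False` values, and indexing a second array of the same length with that mask keeps only the positions where the mask is `True`.

```python
import numpy as np

# Toy parallel arrays: labels[i] goes with scores[i]
labels = np.array(['a', 'b', 'c', 'd'])
scores = np.array([10, 12, 10, 11])

# Element-wise comparison produces a Boolean mask
mask = scores == 10
print(mask)  # [ True False  True False]

# Indexing with the mask keeps only the entries where mask is True
selected = labels[mask].tolist()
print(selected)  # ['a', 'c']
```

`tolist()` converts the filtered array back into a plain Python list of strings, which is useful when a function is expected to return a list rather than an array.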

In [13]:
def dates_most_freq_closing(price_dict):
    '''
    Finds the dates on which the closing price is the mode of the time period of the data provided as an input

    Inputs:
        price_dict(dict): Dictionary containing DATE, LOW, HIGH, CLOSE prices for an arbitrary number of days

    Returns:
        dates(list): List of dates on which the close price is the mode
    '''
    ### BEGIN SOLUTION ###
    # Get the most frequent closing price
    mode_price = most_freq_closing(price_dict)
    # Create a boolean mask of when the closing price equals the mode
    bool_mask = np.array(price_dict['CLOSE']) == mode_price
    # Apply the boolean mask to the dates to get the dates on which the closing price equals mode
    dates_array = np.array(price_dict['DATE'])[bool_mask]
    
    # Return the values as a list
    return dates_array.tolist()
    ### END SOLUTION ###

In [14]:
# `dates_most_freq_closing()`: test cell
newco_dict = {'DATE':['06-14-21','06-15-21','06-16-21','06-17-21','06-18-21',
                      '06-21-21','06-22-21','06-23-21','06-24-21','06-25-21'],
              'LOW':[80,72,70,74,82,88,94,88,73,67],
              'HIGH':[95,81,76,88,90,96,102,94,90,72],
              'CLOSE':[90,75,75,86,90,93,100,91,75,68]}

# Check that dates_most_freq_closing uses most_freq_closing
orig_most_freq_closing = most_freq_closing
del most_freq_closing
try:
    dates_most_freq_closing(newco_dict)
except NameError:
    pass
else:
    raise AssertionError("dates_most_freq_closing does not use most_freq_closing")
finally:
    most_freq_closing = orig_most_freq_closing
    
# Check output
dates = dates_most_freq_closing(newco_dict)
print(f'Your function returned the dates: {dates}')
assert type(dates) == list, 'Your function did not return a list of dates'
assert type(dates[0]) == str, 'The values in your list are not strings'
assert dates == ['06-15-21', '06-16-21', '06-24-21'], 'Incorrect, check your function and try again'
print('\n(Passed visible tests!)')

Your function returned the dates: ['06-15-21', '06-16-21', '06-24-21']

(Passed visible tests!)
