# Week 11 In-Class Challenge

This week, we are doing an in-class exercise.  This will be worth 5 extra credit points for each team that creates a successful solution that follows the programming guidelines we've established this semester.  All the requirements for this programming challenge are described below.  If you complete them all successfully, you will receive 5 points.  If you do not, you will receive 0 points.

Work as a group.  You will all receive the same number of points.

## Requirements
1. Your code must be a function named `week11()` that takes no parameters
2. Your `week11()` function must read this CSV from the internet and use it as input: https://hds5210-data.s3.amazonaws.com/icd10.csv
   * This file has four columns: CODE, SHORT DESCRIPTION, LONG DESCRIPTION, and NF EXCL
   * The NF EXCL column indicates that this code is excluded from a "no fault" list related to workers' compensation insurance claims
3. Your `week11()` function must use Pandas functions to generate new columns and filter the dataframe using the following rules
   * Create a new column called "CODE TYPE" that contains only the first character of the CODE column. For example if CODE="A001" then CODE TYPE="A"
   * Create a new column called "CODE NUM" that contains only the numeric part of the CODE column and make it numeric. For example if CODE="A001" then CODE NUM=1
   * Some CODE NUM portions cannot be converted directly because they have an "X" in them.  Convert that "X" to a "." and then convert the CODE NUM to a numeric value.  For example if CODE="E1037X1" then CODE NUM=1037.1
   * Filter your results to only include those rows where NF EXCL="Y"
   * Sort your results in ascending order by CODE TYPE and then by CODE NUM
4. Use the "checker" in the last cell to confirm that your results are correct.  If the checker gives any errors, you will receive no credit.
---
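The rules in requirement 3 can be tried on a few rows before touching the real file. The following is a minimal sketch, not the graded solution: the rows in `sample` and the helper name `apply_rules` are made up for illustration, and only the column names and the example codes `A001` and `E1037X1` come from the requirements above.

```python
import pandas as pd

# Hypothetical sample rows that mimic the four columns of the real file.
sample = pd.DataFrame({
    "CODE": ["E1037X1", "A001", "B20", "A000"],
    "SHORT DESCRIPTION": ["short 1", "short 2", "short 3", "short 4"],
    "LONG DESCRIPTION": ["long 1", "long 2", "long 3", "long 4"],
    "NF EXCL": ["Y", "Y", None, "Y"],
})

def apply_rules(df):
    """Apply the filter, new-column, and sort rules to a small frame."""
    # Keep only rows flagged with NF EXCL == "Y".
    out = df[df["NF EXCL"] == "Y"].copy()
    # CODE TYPE is the first character of CODE.
    out["CODE TYPE"] = out["CODE"].str[0]
    # CODE NUM is the rest of CODE, with "X" treated as a decimal point.
    rest = out["CODE"].str[1:].str.replace("X", ".", regex=False)
    out["CODE NUM"] = pd.to_numeric(rest)
    # Sort ascending by CODE TYPE, then by CODE NUM.
    return out.sort_values(["CODE TYPE", "CODE NUM"])

result = apply_rules(sample)
print(result[["CODE", "CODE TYPE", "CODE NUM"]])
```

Working on a handful of rows makes it easy to confirm by eye that the "X" replacement and the sort order behave as the rules describe.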

In [None]:
# This is the checker I created

import json
import pandas as pd
import numpy as np

def check_result(submission):

    try:
        obj = json.loads(str(submission))
    except Exception:
        return "Your submission could not be interpreted as valid JSON. Try using https://jsonlint.com to validate your output: {}".format(str(submission))

    if not isinstance(obj, list):
        return "Your submission was not a <list>, it was a {}".format(type(obj))

    try:
        df = pd.DataFrame(obj)
    except Exception:
        return "Your submission could not be converted to a Pandas dataframe.  It looks like this: {}".format(json.dumps(obj, indent=4))

    # Check shape
    expected_columns = ['CODE', 'SHORT DESCRIPTION', 'LONG DESCRIPTION', 'NF EXCL', 'CODE TYPE', 'CODE NUM']

    # Updated to allow for 1098 or 1090
    if df.shape != (1090, 6) and df.shape != (1098,6):
        response = f"Your submission has the wrong shape.  It should have 1090 or 1098 rows and exactly 6 columns.  Yours is {df.shape}."
        if df.shape[0] > 1090:
            response += "\nMaybe you didn't filter the results or remove duplicates correctly?"
        if df.shape[1] > 6:
            response += "\nMaybe you have some extra columns?\nI want this: {}\nYou gave me {}".format(expected_columns, list(df.columns))
        if df.shape[1] < 6:
            response += "\nMaybe you removed some columns I was expecting.\nI want this: {}\nYou gave me {}".format(expected_columns, list(df.columns))
        return response

    # Check column names
    if list(df.columns) != expected_columns:
        return "It looks like you don't have the right columns.\nI want this: {}\nYou gave me this: {}".format(expected_columns, list(df.columns))

    # Check the CODE NUM is numeric
    if not (df['CODE NUM'].dtype == np.dtype('float64') or df['CODE NUM'].dtype == np.dtype('int64')):
        return "It looks like your CODE NUM column is not entirely numeric.  You can check that with df['CODE NUM'].dtype, which should be float64 or int64."

    # Check sort order
    cp = df.copy()
    cp.sort_values(['CODE TYPE','CODE NUM'], inplace=True)
    cp = cp.reset_index()

    df = df.reset_index()
    if not df[['CODE','CODE TYPE','CODE NUM']].eq(cp[['CODE','CODE TYPE','CODE NUM']]).all().all():
        print(df.eq(cp).all())
        return "It looks like you may not have sorted the data frame by CODE TYPE and CODE NUM as required. Sort by those values and try again."

    return "You did it!!"
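
The checker first parses the submission with `json.loads(str(submission))`, so it expects a JSON string holding a list of records rather than a DataFrame object. Below is a small sketch of that round trip, using a made-up two-row frame; the variable names are illustrative only.

```python
import json
import pandas as pd

# Hypothetical two-row frame standing in for the real result.
frame = pd.DataFrame({
    "CODE": ["A000", "A001"],
    "CODE TYPE": ["A", "A"],
    "CODE NUM": [0.0, 1.0],
})

# Serialize as a JSON list with one object per row.
payload = frame.to_json(orient="records")

# These two steps mirror what the checker does with the payload.
records = json.loads(payload)
rebuilt = pd.DataFrame(records)

print(type(records).__name__)
print(list(rebuilt.columns))
print(rebuilt["CODE NUM"].dtype)
```

Passing the DataFrame itself would fail the first check, because `str()` of a DataFrame is a printed table, not JSON.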


In [None]:
import pandas as pd

def week11():
    """() -> pd.DataFrame

    This function will process the file named in step 2 of the instructions above
    using the rules in step 3 above.  It will return a dataframe that contains
    the filtered, sorted, and enhanced results.

    For my tests, I will validate the shape to start with.
    If I have more time, I can figure out how to write tests for the other requirements.

    >>> week11().shape
    (1090, 6)
    """

    # Step 1: Read the file
    hospitals = pd.read_csv('https://hds5210-data.s3.amazonaws.com/Section111ValidICD10-Jan2024.csv')

    # Step 2: Filter the data frame to just those records with NF EXCL == 'Y'
    nf = hospitals[hospitals['NF EXCL'] == 'Y'].copy()

    # Step 3: Separate out the CODE TYPE
    nf['CODE TYPE'] = nf['CODE'].str[0]

    # Step 4: Get the CODE NUM as a numeric value
    nf['CODE NUM'] = nf['CODE'].str[1:].str.replace('X','.').astype(float)

    # Step 5: Sort the data frame
    nf.sort_values(['CODE TYPE','CODE NUM'], inplace=True)

    # It looks like I forgot to include this requirement, but here's
    # what I was looking for in terms of unique values
    nf.drop_duplicates(subset=['CODE TYPE','CODE NUM'], keep='first', inplace=True)

    # This is a dummy piece of code that just passes my one doctest.
    # Obviously, it won't pass the checker at the bottom.
    # You'll want to delete this before you try checking your answer.
    # final_data = pd.DataFrame([[x for x in range(6)] for x in range(1090)])

    return nf
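
The docstring in `week11()` includes a doctest for the shape of the result. One way to run a function's doctests from a notebook cell is `doctest.run_docstring_examples`. The sketch below uses a hypothetical helper, `first_char`, so that it runs without downloading the file.

```python
import doctest

def first_char(code):
    """(str) -> str

    Return the first character of a code string.

    >>> first_char("A001")
    'A'
    """
    return code[0]

# Runs the examples in the docstring; prints nothing when they all pass.
doctest.run_docstring_examples(first_char, {"first_char": first_char}, verbose=False)
```

For `week11()` itself, the equivalent call would be `doctest.run_docstring_examples(week11, globals())`, which downloads the CSV each time it runs.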


In [None]:
import requests

r = requests.post('https://rln3ys6dciybh6cydvapszesna0oxcyn.lambda-url.us-east-1.on.aws/',
                  headers={"content-type": "application/json"},
                  data=week11().to_json(orient='records'))

print(r.status_code)
print(r.text)

200
"You did it!!"
