# Challenge Setup and Walkthrough for [Easy Credit Check](https://github.com/JosephTLucas/HackThisAI/tree/main/challenge/easy_credit_check)

## Setup
Install the dependencies and start the container.

In [5]:
import os
import subprocess
import time
import requests

In [6]:
%%capture
!docker build --tag credit_check challenge/easy_credit_check/
# Note that this path assumes the notebook is located in the top-level HackThisAI/ directory. If you create a similar notebook elsewhere, this path needs to point to the directory containing the challenge Dockerfile.

In [7]:
%%capture
subprocess.call("docker run -p 5000:5000 credit_check:latest &", shell=True)

WARNING: Each time you run this command, you create a new container. It's up to you to stop them from your Docker dashboard.
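If you would rather clean up from the notebook than from the dashboard, something like the following may work. It is only a sketch: `stop_containers` is a hypothetical helper (not part of the challenge), and it assumes the `docker` CLI is on your path and that the containers were started from the `credit_check:latest` image built above.

```python
import subprocess


def stop_containers(image="credit_check:latest", run=subprocess.run):
    """Stop every running container started from `image` and return their IDs.

    `run` defaults to subprocess.run; it is a parameter only so the helper
    can be exercised without a Docker daemon.
    """
    # `docker ps -q --filter ancestor=IMAGE` prints one container ID per line.
    listing = run(
        ["docker", "ps", "-q", "--filter", f"ancestor={image}"],
        capture_output=True,
        text=True,
        check=True,
    )
    ids = listing.stdout.split()
    if ids:
        run(["docker", "stop", *ids], check=True)
    return ids
```

Calling `stop_containers()` after you finish the walkthrough should leave no `credit_check` containers running.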

In [8]:
print("Waiting to make sure container starts.")
time.sleep(10)

Waiting to make sure container starts.
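A fixed ten-second sleep can be too short on a slow machine and needlessly long on a fast one. As an alternative, here is a minimal sketch that polls the published port until it accepts a TCP connection. `wait_for_port` is a hypothetical helper, not part of the challenge. Note that a port published by Docker may accept connections before the application inside the container is ready, so a short extra pause or a retry on the first request may still be needed.

```python
import socket
import time


def wait_for_port(host="localhost", port=5000, timeout=30.0, interval=0.5):
    """Return True once host:port accepts a TCP connection, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Nothing is listening yet; wait briefly and try again.
            time.sleep(interval)
    return False
```

In the notebook, `wait_for_port()` could stand in for the `time.sleep(10)` call, with the returned boolean telling you whether the container came up in time.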


## Helper Function
With the container running, we can submit our `.csv` as an `HTTP POST` to `http://localhost:5000/check`. This is just a helper function to submit an answer to the challenge.

In [9]:
def submit_an_answer(file):
    # POST the named file from the challenge directory and print the server's reply.
    with open('challenge/easy_credit_check/'+file, 'rb') as f:
        r = requests.post('http://localhost:5000/check', files={'data_file': f})
        print(r.text)

## Working through the Challenge

Now that we have everything set up, we'll work through `easy_credit_check`.
***
**SPOILERS BELOW THIS POINT**.
***

Let's see what we get if we submit the expected training data.

In [10]:
submit_an_answer("Mall_Customers.csv")

You don't qualify. Try again.


Well, since that didn't work, let's make a copy of the file and start tampering with it.

In [11]:
!cp challenge/easy_credit_check/Mall_Customers.csv challenge/easy_credit_check/tamper.csv

In [12]:
# We're going to modify the existing dataset by appending another row.
# That row is for a 19 year old making $15k, but we're going to try and give them a Score of 100
with open('challenge/easy_credit_check/tamper.csv', 'a+') as f:
    f.write("\n0201,Female,19,15,100")
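Writing the raw string with a leading `\n` works as long as the file does not already end in a newline; if it does, the append leaves a blank line before the new row. A slightly more defensive way to append, sketched here with the standard `csv` module, checks for that first. `append_row` is a hypothetical helper, and whether the server tolerates a blank line is not something this walkthrough tests.

```python
import csv


def append_row(path, row):
    """Append one CSV row, adding a newline first only if the file lacks one."""
    with open(path, "rb") as f:
        data = f.read()
    needs_newline = bool(data) and not data.endswith(b"\n")
    # newline="" stops Python from translating the line endings we write.
    with open(path, "a", newline="") as f:
        if needs_newline:
            f.write("\n")
        csv.writer(f, lineterminator="\n").writerow(row)
```

For the row above, the call would be `append_row('challenge/easy_credit_check/tamper.csv', ["0201", "Female", "19", "15", "100"])`.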

In [13]:
submit_an_answer("tamper.csv")

You don't qualify. Try again.


Hmm, that still didn't work. How well do you think they clean their data before training?

In [14]:
# We're going to modify the existing dataset by appending another row.
# That row is for a 19 year old making $15k, but we're going to try and give them a Score of 100000
with open('challenge/easy_credit_check/tamper.csv', 'a+') as f:
    f.write("\n0201,Female,19,15,100000")

In [15]:
submit_an_answer("tamper.csv")

Your tampering was too flagrant. You were discovered.


Okay, so they did some kind of data comparison. We added two rows. Maybe that was too much? Let's start over and just try adding one row.

In [16]:
!cp challenge/easy_credit_check/Mall_Customers.csv challenge/easy_credit_check/tamper.csv

In [17]:
with open('challenge/easy_credit_check/tamper.csv', 'a+') as f:
    f.write("\n0201,Female,19,15,100000")

In [18]:
submit_an_answer("tamper.csv")

Enjoy your Platinum Mall Card: FLAG{HIGHROLLER}


## Conclusion

Looks like we win! They didn't sanitize or bound their target variable. Even though `score` should be bounded on `[0,100]`, they never checked that before training. We introduced a ridiculous value and significantly influenced their model with a single additional data point.
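To make the missing defense concrete, here is a minimal sketch of a bounds check on the target variable, together with a quick look at how far a single out-of-range value drags a simple statistic. The challenge's actual model and preprocessing are not shown in this walkthrough, so nothing here is the server's code: `split_by_bounds` is a hypothetical helper and the sample scores are made up rather than taken from `Mall_Customers.csv`.

```python
from statistics import mean

# Bounds for the target variable, as stated in the conclusion above.
SCORE_MIN, SCORE_MAX = 0, 100


def split_by_bounds(scores, low=SCORE_MIN, high=SCORE_MAX):
    """Return (kept, rejected): scores inside [low, high] and those outside."""
    kept = [s for s in scores if low <= s <= high]
    rejected = [s for s in scores if not (low <= s <= high)]
    return kept, rejected


# Made-up scores for illustration only.
clean_scores = [39, 81, 6, 77, 40]
poisoned_scores = clean_scores + [100000]

# One extreme value dominates the mean of the whole column.
print("mean before:", mean(clean_scores))
print("mean after: ", mean(poisoned_scores))

# Filtering before training would have discarded the injected row.
kept, rejected = split_by_bounds(poisoned_scores)
print("rejected:", rejected)
```

A check like this, run before training, would have discarded the injected row. It would not stop an attacker who stays within the valid range, so it is a first line of defense rather than a complete one.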