## Running FairNow's User Data Bias Testing

#### FairNow's User Data Testing evaluates a model for bias using real data. Please do not send any personally identifiable information (PII).

### Prerequisites:

#### To use this notebook, you'll need a `Client ID` and `Client Secret`. These will either have been provided to you, or you can generate them at https://app.fairnow.ai from the Admin menu. This notebook assumes you have these available to enter when prompted.

#### To run the simulation you will need an `application_id` and `application_version` for the specific model you want to test. Details of how to create and look up AI Applications can be found in the `Applications AI` notebook.

#### Finally, you'll need a `threshold` value: any score above this value is considered a passing score.
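
#### As a rough illustration of how a threshold splits scores into passing and non-passing, here is a minimal sketch. The `passes` helper is hypothetical and not part of the FairNow API; treating a score exactly equal to the threshold as non-passing is an assumption based on the word "above".

```python
def passes(score: float, threshold: float) -> bool:
    """Return True when the score is strictly above the threshold (assumed rule)."""
    return score > threshold


scores = [0.25, 0.5, 0.75]
threshold = 0.5

# Classify each example score against the threshold
outcomes = [passes(score, threshold) for score in scores]
print(outcomes)
```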

#### Set your `Client ID` in the following cell, then run it. It will prompt you for the `Client Secret` and create a `client` instance that can be used to communicate with the FairNow APIs.


In [None]:
import json
from time import sleep
from getpass import getpass
import httpx
from httpx_auth import OAuth2ClientCredentials

client_id = "{client_id}" # Replace with your Client Id
client_secret = getpass("Client Secret")
fairnow_token_endpoint = "https://auth.fairnow.ai/oauth2/token"

auth = OAuth2ClientCredentials(
    token_url=fairnow_token_endpoint,
    client_id=client_id,
    client_secret=client_secret,
)

fairnow_base_url = "https://api.fairnow.ai/v2"
client = httpx.Client(base_url=fairnow_base_url, auth=auth)

#### You will also need to prepare a CSV file containing the data to upload. The first row contains the column names.

#### The following columns are required:
* `TimeStamp` (ISO 8601 timestamp, e.g. `2023-12-14T16:26:05.898156Z`)
* `Score` (a number between 0 and 1)
* One column for each protected class (see the `Generating User Bias Test Data` notebook for how to look up the column names and an example of generating a CSV to upload)

#### Additional columns can be added to allow filtering of data, e.g. `Job Title`, `Location`, etc.

### Place the CSV file in the same directory as this notebook.

In [None]:
scores_file_name = 'scores.csv'  # Change the filename if different
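
#### Before uploading, it can help to check the file locally against the column requirements described above. The following is a minimal sketch using only the standard library. The `find_csv_problems` helper and the `Gender` column are hypothetical, and the service performs its own validation, which may differ.

```python
import csv
import io

REQUIRED_COLUMNS = ("TimeStamp", "Score")


def find_csv_problems(stream, protected_columns=()):
    """Return a list of human-readable problems found in a scores CSV."""
    reader = csv.DictReader(stream)
    header = reader.fieldnames or []
    expected = (*REQUIRED_COLUMNS, *protected_columns)
    missing = [name for name in expected if name not in header]
    if missing:
        return [f"missing columns: {', '.join(missing)}"]

    problems = []
    for line_number, row in enumerate(reader, start=2):  # line 1 is the header
        try:
            score = float(row["Score"])
        except (TypeError, ValueError):
            problems.append(f"line {line_number}: Score is not a number")
            continue
        if not 0.0 <= score <= 1.0:
            problems.append(f"line {line_number}: Score {score} is outside 0..1")
    return problems


# Hypothetical sample data; "Gender" stands in for a real protected class column
sample = io.StringIO(
    "TimeStamp,Score,Gender\n"
    "2023-12-14T16:26:05.898156Z,0.82,Female\n"
    "2023-12-14T16:27:10.000000Z,1.4,Male\n"
)
sample_problems = find_csv_problems(sample, protected_columns=["Gender"])
print(sample_problems)
```

#### To check the real file, open it with `open(scores_file_name, newline="")` and pass the file object to `find_csv_problems` together with your own protected class column names.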

#### Next we create a test. The test needs the `application_id` and `application_version`, along with a threshold value. The response will include the `test_id` and a pre-signed URL used to upload the CSV file.

In [None]:
start_test_route = "/tests/start/"

application_id = None               # Replace with your own Application Id
application_version = "1.0"         # Change if you are working with a different version
threshold = 0.5                     # Change to your own threshold setting

if threshold <= 0.0 or threshold >= 1.0:
    raise ValueError("Threshold must be between 0.0 and 1.0")

request_body = {
    "application_id": application_id,
    "application_version": application_version,
    "test_name": "API Client Testing",
    "test_description": "Testing User Bias",
    "test_type": "fairness_ml_user",
    "threshold": threshold
}

response = client.post(start_test_route, json=request_body, timeout=None)

if response.status_code == 200:
    print("New Test has been created:")
    print(json.dumps(response.json(), indent=4))
else:
    print(f"API Error Response: {response.status_code} - {response.text}")


#### Use the pre-signed upload URL to upload the scores file. Uploading the scores triggers the analysis job, which runs in the background and can take a few minutes.

In [None]:
response_body = response.json()
test_id = response_body["test_id"]
presigned_upload = response_body["presigned_url_scores_upload"]
upload_url = presigned_upload["url"]
upload_fields = presigned_upload["fields"]
upload_key = upload_fields["key"]

# Upload the scores CSV file; the context manager closes the file afterwards
with open(scores_file_name, 'rb') as scores_file:
    data_file = {'file': (upload_key, scores_file)}
    response = httpx.post(upload_url, data=upload_fields, files=data_file)

if response.status_code == 204:
    print("File has been uploaded.")
else:
    print(f"Error uploading file: {response.status_code} - {response.text}")

#### We'll poll the API until the analysis has finished.

In [None]:
test_route = "/tests/"
query_parameters = {
    "test_id": test_id,
    "application_id": application_id,
    "application_version": application_version,
}

response = client.get(test_route, params=query_parameters)
if response.status_code == 200:
    current_status = response.json()["status"]["id"]
    print(f"Current test status: {current_status}")
else:
    raise ValueError(f"API Error Response: {response.status_code} - {response.text}")

while current_status not in ['ready', 'error']:
    sleep(15)
    response = client.get(test_route, params=query_parameters)
    if response.status_code == 200:
        current_status = response.json()["status"]["id"]
        print(f"Current test status: {current_status}")
    else:
        raise ValueError(f"API Error Response: {response.status_code} - {response.text}")

if current_status == 'error':
    error_details = response.json()["file_validation_report"]
    print('Analysis encountered an error. Details:')
    print(error_details)
else:
    print('Analysis results ready to download.')
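
#### The polling loop above keeps going until the status becomes `ready` or `error`. If you prefer to give up after a fixed time, the same idea can be wrapped in a helper with a timeout. This is a sketch only: `poll_until` is a hypothetical helper, and the `processing` status used in the demonstration is a made-up intermediate value, not a documented FairNow status.

```python
import time


def poll_until(fetch_status, done_states=("ready", "error"), interval=15.0,
               timeout=900.0, sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() until it returns a state in done_states.

    Raises TimeoutError if no final state is seen within `timeout` seconds.
    """
    deadline = clock() + timeout
    while True:
        status = fetch_status()
        if status in done_states:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"Still '{status}' after {timeout} seconds")
        sleep(interval)


# Demonstration with a fake status source instead of the real API
fake_statuses = iter(["processing", "processing", "ready"])
final_status = poll_until(lambda: next(fake_statuses), interval=0.0)
print(final_status)
```

#### With the `client` from this notebook, `fetch_status` could be a function that returns `client.get(test_route, params=query_parameters).json()["status"]["id"]`.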


#### Now that the testing is complete, you can check the results in the FairNow application.

In [None]:
test_results_url = response.json()["results_url"]
print(f"Check your results in the FairNow application at '{test_results_url}'")

#### Optionally, you can download the raw analysis results using a pre-signed link. The output is a CSV file with the bias results by protected class.

In [None]:
presigned_test_results_url = response.json()["presigned_url_test_results"]

response = httpx.get(presigned_test_results_url)
response.raise_for_status()  # Stop here if the download failed

with open('results.csv', 'wb') as file:
    file.write(response.content)

#### Now read the results from the file.

In [None]:
!cat results.csv
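
#### The `!cat` command relies on a Unix-style shell. As a portable alternative, the results can be parsed with Python's `csv` module. This is a minimal sketch: `load_results` is a hypothetical helper, and the sample columns below are invented for the demonstration because the real column names come from `results.csv` itself.

```python
import csv
import io


def load_results(stream):
    """Parse CSV text from an open file or stream into a list of dicts."""
    return list(csv.DictReader(stream))


# Hypothetical sample; replace with open('results.csv', newline='') for real data
sample = io.StringIO("group,metric\nA,0.91\nB,0.87\n")
rows = load_results(sample)
for row in rows:
    print(row)
```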