# Using the AVID Public APIs

## Background 

AVID houses two types of records: *Reports* and *Vulnerabilities*. A vulnerability (vuln) is high-level evidence of an AI failure mode. A report is one example of a particular vulnerability occurring, supported by qualitative or quantitative evaluation. You can think of a report as an instance of a vulnerability.

As an example, we'll look at a vulnerability below about gender bias in a particular language model, `xlm-roberta-base`. That vuln is associated with two reports, each of which measured gender bias by a different method.

Only reports can be submitted to the API; vulnerabilities cannot be submitted. After a report is submitted, it becomes a "draft" and enters the AVID editorial queue. In the editorial process, human editors review and validate the report and determine whether it belongs to an existing vulnerability or represents a new vulnerability. Reports are published after successfully passing through the editorial process. 

In this notebook, we'll walk through the process of submitting a new report and checking on its status. We'll also cover how to retrieve published items from the database.

## TODO: Butters to add API docs

In [None]:
import json
from os import environ
import requests

In [None]:
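# Read the API key from the environment rather than hard-coding it in the notebook
api_key = environ.get('AVID_API_KEY')
# The key is sent as-is in the Authorization header of every request
headers = {"Authorization": api_key}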
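If `AVID_API_KEY` is not set, `environ.get` returns `None` and the requests below fail later with a less obvious error. A minimal sketch of a stricter setup step follows; the helper name `build_headers` is hypothetical and not part of any AVID tooling.

```python
import os


def build_headers(env=None):
    """Return request headers carrying the AVID API key.

    `env` defaults to os.environ; passing a plain dict makes the helper
    easy to exercise without touching the real environment.
    """
    env = os.environ if env is None else env
    api_key = env.get("AVID_API_KEY")
    if not api_key:
        # Fail early with a clear message instead of sending an empty header
        raise RuntimeError("AVID_API_KEY is not set; export it before running this notebook.")
    return {"Authorization": api_key}
```

Calling `build_headers()` with no arguments reads the real environment and could stand in for the two assignments in the cell above.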

# Basic API check
This is a quick check to make sure your API key is working and you're able to contact the endpoint.  
It should give the response "meow".

In [None]:
url = "https://g0ouofqb7j.execute-api.us-east-1.amazonaws.com/api/"
response = requests.get(url, headers=headers)
print(response, response.text)

# Submitting reports

## How to format your report

Reports should conform to the AVID data model, which is documented [here](https://avidml.org/avidtools/reference/report.html).   

The endpoint does not perform validation against this data model; it accepts any valid JSON. A validation endpoint will be available in the future.
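Because the endpoint does not validate submissions, a quick local check before submitting may catch obvious omissions. The sketch below is only an illustration: the `EXPECTED_KEYS` set is an assumption taken from the top-level fields of the sample report used later in this notebook, not an authoritative list from the AVID data model, and `missing_report_keys` is a hypothetical helper name.

```python
# Assumption: these keys mirror the sample report in this notebook.
# They are NOT an official list of required fields in the AVID data model.
EXPECTED_KEYS = {"data_type", "affects", "problemtype", "description", "reported_date"}


def missing_report_keys(report):
    """Return the expected top-level keys absent from `report`, sorted by name."""
    if not isinstance(report, dict):
        raise TypeError("report must be a dict (a parsed JSON object)")
    return sorted(EXPECTED_KEYS - report.keys())
```

An empty list means all of the assumed keys are present; it does not mean the report conforms to the data model.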

See an example of how to create a report [here](https://github.com/leondz/garak/blob/main/analyze/report_avid.py).

## Submit an object

Here we'll submit a draft report generated by [this HuggingFace space](https://huggingface.co/spaces/avid-ml/bias-detection), which reports gender bias in language models. The HuggingFace space produces the report as a string, so we'll start by parsing that string with `json.loads`, which gives us a Python dictionary.

In [None]:
new_report = json.loads('''{
  "data_type": "AVID",
  "data_version": null,
  "metadata": null,
  "affects": {
    "developer": [],
    "deployer": [
      "Hugging Face"
    ],
    "artifacts": [
      {
        "type": "Model",
        "name": "bert-base-cased"
      }
    ]
  },
  "problemtype": {
    "classof": "LLM Evaluation",
    "type": "Detection",
    "description": {
      "lang": "eng",
      "value": "Profession bias reinforcing gender stereotypes found in bert-base-cased, as measured on the Winobias dataset"
    }
  },
  "metrics": [
    {
      "name": "Winobias",
      "detection_method": {
        "type": "Significance Test",
        "name": "One-sample Z-test"
      },
      "results": {
        "feature": [
          "gender"
        ],
        "stat": [
          9
        ],
        "pvalue": [
          0
        ]
      }
    }
  ],
  "references": [
    {
      "label": "Winograd-schema dataset for detecting gender bias",
      "url": "https://uclanlp.github.io/corefBias/overview"
    },
    {
      "label": "bert-base-cased on Hugging Face",
      "url": "https://huggingface.co/bert-base-cased"
    }
  ],
  "description": {
    "lang": "eng",
    "value": "Filling in pronouns in sentences tagged with professions using bert-base-cased were found to be significantly biased on the Winobias dataset."
  },
  "impact": {
    "avid": {
      "risk_domain": [
        "Ethics"
      ],
      "sep_view": [
        "E0101: Group fairness"
      ],
      "lifecycle_view": [
        "L05: Evaluation"
      ],
      "taxonomy_version": "0.2"
    }
  },
  "credit": null,
  "reported_date": "2023-07-12"
}''')


In [None]:
url = "https://g0ouofqb7j.execute-api.us-east-1.amazonaws.com/api/submit"
response = requests.post(url, json=new_report, headers=headers)
uuid = response.json()
print(response, uuid)

The endpoint returns the UUID, a unique identifier for the submitted entry. The UUID can be used to track the status of a submission, as shown below.
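The submission cell above parses the response body without checking whether the request succeeded. A hedged sketch of a more defensive version is shown here. It assumes, as the cell above suggests, that a successful response body is the UUID itself; `submit_report` and its `post` parameter are hypothetical names introduced for illustration.

```python
SUBMIT_URL = "https://g0ouofqb7j.execute-api.us-east-1.amazonaws.com/api/submit"


def submit_report(report, headers, post=None, timeout=30):
    """POST a report and return the parsed response body (assumed to be the UUID).

    `post` defaults to requests.post; it is a parameter only so the helper
    can be exercised without network access.
    """
    if post is None:
        import requests  # imported lazily so the helper can be tested offline
        post = requests.post
    response = post(SUBMIT_URL, json=report, headers=headers, timeout=timeout)
    # Raise on 4xx/5xx responses instead of treating an error body as a UUID
    response.raise_for_status()
    return response.json()
```

With this helper, `uuid = submit_report(new_report, headers)` would play the same role as the cell above.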

## Retrieve the editorial status of a submitted report object

Now we'll check on the status of the report we just submitted above. The API should return the status as "draft."

In [None]:
url = f"https://g0ouofqb7j.execute-api.us-east-1.amazonaws.com/api/review/{uuid}/status"
response = requests.get(url, headers=headers)
print(response, response.json())
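Editorial review is done by people, so the status may stay at "draft" for a while. If you want to re-check periodically, something like the following sketch could work. `wait_for_status_change` is a hypothetical helper, and it assumes the caller supplies a `fetch_status` function that returns the status as a plain string; the exact shape of the status response is not specified here.

```python
import time


def wait_for_status_change(fetch_status, initial_status="draft",
                           interval_seconds=60.0, max_checks=10, sleep=time.sleep):
    """Call `fetch_status` until it stops returning `initial_status`.

    Gives up after `max_checks` calls and returns the last status seen.
    """
    status = fetch_status()
    checks = 1
    while status == initial_status and checks < max_checks:
        sleep(interval_seconds)  # wait before asking again
        status = fetch_status()
        checks += 1
    return status
```

Here `fetch_status` could be a small function wrapping the status request from the cell above. Keep the interval generous, since nothing changes until an editor acts on the report.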

# Retrieving reports and vulnerabilities from the database

## Get all published objects

Published objects include both reports and vulnerabilities. The cell below requests vulnerabilities with the `status=PUBLISHED` query parameter and prints the first three.

In [None]:
url = "https://g0ouofqb7j.execute-api.us-east-1.amazonaws.com/api/objects/vulnerability?status=PUBLISHED"

response = requests.get(url, headers=headers)
print(response)
for r in response.json()[:3]:
    print(json.dumps(r, indent=2))
    print("##########################################")

## Get published objects by AVID ID

Published reports and vulnerabilities have IDs of the form `AVID-2022-V001` for vulnerabilities and `AVID-2022-R0001` for reports. You can retrieve published objects by these IDs. 
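As a small illustration of the ID format, the sketch below splits an AVID ID into its parts. The pattern is inferred only from the two example IDs above (a four-digit year, then `V` or `R`, then a number), so treat it as an assumption rather than a formal specification; `parse_avid_id` is a hypothetical helper.

```python
import re

# Inferred from the examples AVID-2022-V001 and AVID-2022-R0001 (an assumption)
AVID_ID_PATTERN = re.compile(r"AVID-(\d{4})-([VR])(\d+)")


def parse_avid_id(avid_id):
    """Split an AVID ID into year, object type, and sequence number."""
    match = AVID_ID_PATTERN.fullmatch(avid_id)
    if match is None:
        raise ValueError(f"not a recognized AVID ID: {avid_id!r}")
    year, kind, number = match.groups()
    return {
        "year": int(year),
        "type": "vulnerability" if kind == "V" else "report",
        "number": int(number),
    }
```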

Here, we'll start by retrieving a vuln.

In [None]:
url = "https://g0ouofqb7j.execute-api.us-east-1.amazonaws.com/api/object/AVID-2022-V002"
response = requests.get(url, headers=headers)
print(response)
print(json.dumps(response.json(), indent=2))

In the example above, the vulnerability lists two associated reports. This illustrates the idea that reports are instances of vulnerabilities. The vulnerability here captures gender bias in `xlm-roberta-base`, as measured in two different tasks.

Now let's retrieve one of those reports.

In [None]:
url = "https://g0ouofqb7j.execute-api.us-east-1.amazonaws.com/api/object/AVID-2022-R0004"
response = requests.get(url, headers=headers)
print(response)
print(json.dumps(response.json(), indent=2))

## Get published objects by object type

When retrieving published objects, you can limit the results to just reports or just vulns by putting the object type in the URL path, as follows.

In [None]:
url = "https://g0ouofqb7j.execute-api.us-east-1.amazonaws.com/api/objects/report"
response = requests.get(url, headers=headers)
print(response)
print(response.json()[0])   # print the first one

In [None]:
url = "https://g0ouofqb7j.execute-api.us-east-1.amazonaws.com/api/objects/vulnerability"
response = requests.get(url, headers=headers)
print(response)
print(response.json()[0])   # print the first one
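The cells in this notebook repeat the same base URL many times. One way to keep them consistent is a handful of small URL builders, sketched below. The function names are hypothetical, the paths are copied from the cells above, and combining `status` with the `report` type is an assumption, since this notebook only shows that parameter with `vulnerability`.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://g0ouofqb7j.execute-api.us-east-1.amazonaws.com/api"


def status_url(submission_uuid):
    """URL for the editorial status of a submitted report."""
    return f"{BASE_URL}/review/{quote(submission_uuid, safe='')}/status"


def object_url(avid_id):
    """URL for a single published object, e.g. AVID-2022-V002."""
    return f"{BASE_URL}/object/{quote(avid_id, safe='')}"


def objects_url(object_type, status=None):
    """URL for listing objects of one type, optionally filtered by status."""
    if object_type not in ("report", "vulnerability"):
        raise ValueError("object_type must be 'report' or 'vulnerability'")
    url = f"{BASE_URL}/objects/{object_type}"
    if status is not None:
        url += "?" + urlencode({"status": status})
    return url
```

For example, `requests.get(objects_url("vulnerability", status="PUBLISHED"), headers=headers)` would issue the same request as the "Get all published objects" cell.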