# Block C: Responsible AI 

For details regarding ILO 3.1 and the use-case, please refer to the Assessment rubric in Microsoft Teams, and the [DataLab: Responsible AI](https://adsai.buas.nl/Study%20Content/Responsible%20and%20Explainable%20AI/UseCases.html) GitHub page.


## Use-case 1: Identifying and describing bias

#### Define the concept of bias in relation to the Imsitu dataset and a computer vision task.

In the context of the Imsitu dataset and a computer vision task, bias refers to systematic inaccuracy or unfairness introduced through the data or through the algorithm that carries out the task. As a result, certain groups or individuals may be underrepresented or treated unequally in the data or in the task outcomes. In the Imsitu dataset, bias can arise when preconceptions, assumptions, or prejudices influence how agents or entities in an image are labeled.

#### List and describe the type of bias that you identified in the Imsitu dataset.

I found an instance of bias in the Imsitu dataset when I searched for the verb "praying" and inspected the agent labels. In one image, which shows only a praying hand, the agent is labeled "woman". Nothing in the image clearly indicates that the hand belongs to a woman: the nail polish is not reliable evidence, since both men and women can wear it. The label therefore appears to rest on a gender stereotype rather than on visual evidence.
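One way to check whether such labels are systematically skewed, rather than a single questionable annotation, is to tally which labels were assigned to the agent role for a given verb. The sketch below is a minimal illustration on a hand-made dictionary that mimics the layout of `train.json` as it is used later in this notebook (a `verb` plus a list of `frames`). The `agent` role key, the sample entries, the plain-word labels (the real files use noun codes), and the helper name `count_role_labels` are all assumptions made for illustration.

```python
from collections import Counter

# Toy annotations mimicking the train.json layout (hypothetical entries).
sample_annotations = {
    "praying_1.jpg": {
        "verb": "praying",
        "frames": [{"agent": "woman"}, {"agent": "woman"}, {"agent": "person"}],
    },
    "praying_2.jpg": {
        "verb": "praying",
        "frames": [{"agent": "man"}, {"agent": "man"}, {"agent": "man"}],
    },
    "stapling_1.jpg": {
        "verb": "stapling",
        "frames": [{"agent": "person"}],
    },
}


def count_role_labels(annotations, verb, role="agent"):
    """Count how often each label was assigned to `role` for images of `verb`."""
    counts = Counter()
    for entry in annotations.values():
        if entry["verb"] != verb:
            continue
        for frame in entry["frames"]:
            label = frame.get(role)
            if label:  # skip frames where the role is missing or empty
                counts[label] += 1
    return counts


praying_counts = count_role_labels(sample_annotations, "praying")
print(praying_counts.most_common())
```

A strongly one-sided tally for a verb would be a reason to inspect the underlying images by hand, as was done for the praying example.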

#### Discuss the possible ramifications (e.g., harm) in terms of fairness of the identified bias instance: Why, and when, is this particular instance of bias undesirable? In other words, who might be disproportionately affected by this particular instance of bias, and when does this negative effect come into play?

The identified bias can undermine fairness and lead to unequal treatment when the Imsitu dataset is used in computer vision tasks. Labeling the agent as "woman" on the basis of nail polish reinforces a gender stereotype, and a computer vision system trained on such labels may make incorrect predictions as a result. The people most likely to be harmed are those who do not conform to the stereotype, for example men who wear nail polish, and the harm comes into play whenever the system's predictions are applied to them. Addressing bias in datasets and algorithms is therefore crucial to ensure fair and equitable outcomes.

## Use-case 2: Propose individual fairness method

#### Identify a sensitive/protected attribute in the Imsitu dataset.

Religion (praying)

#### Mitigate bias in the Imsitu dataset by applying the 'Fairness Through Unawareness' or 'Fairness Through Awareness' method to this sensitive/protected attribute.

The "Fairness Through Awareness" strategy includes the sensitive/protected attribute in the model and ensures that the model is trained to make fair decisions regardless of that attribute. Different religions often have distinct practices and rituals, including the way prayers are offered. Incorporating these distinct practices into the labeling makes the different types of prayer more distinguishable. Furthermore, recognizing and acknowledging the diversity of religious beliefs reduces the risk of causing offense by grouping certain images or symbols together.
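The idea commonly associated with "Fairness Through Awareness" is that individuals who are similar with respect to the task should receive similar outcomes. A minimal sketch of such a check is given below, assuming a task-specific distance between individuals that is compared with the gap between their model scores. The feature vectors, the scores, the Euclidean distance, and the function names are hypothetical and only illustrate the principle; they are not derived from the Imsitu dataset.

```python
from itertools import combinations
import math


def task_distance(a, b):
    """Euclidean distance between two feature vectors (assumed similarity metric)."""
    return math.dist(a, b)


def find_unfair_pairs(features, scores, lipschitz=1.0):
    """Return index pairs whose score gap exceeds lipschitz * task distance."""
    violations = []
    for i, j in combinations(range(len(features)), 2):
        score_gap = abs(scores[i] - scores[j])
        if score_gap > lipschitz * task_distance(features[i], features[j]):
            violations.append((i, j))
    return violations


# Hypothetical task-relevant features for three individuals.
features = [(0.10, 0.20), (0.12, 0.21), (0.90, 0.80)]
# Hypothetical model scores: individuals 0 and 1 are nearly identical
# in feature space but receive very different scores.
scores = [0.30, 0.85, 0.90]

print(find_unfair_pairs(features, scores))
```

The hard part in practice is choosing the distance: it has to encode which differences between individuals are legitimately relevant to the task, which is a design decision rather than something the code can settle.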

#### Elaborate on the individual fairness method that you applied, and why you think it is a good method to mitigate bias in the Imsitu dataset.

I explained this in the previous question.

## Use-case 3: Create a subset of images from the original dataset

### Create a training set that contains images from the selected classes.

In [35]:
import json

# Load the imsitu_space.json file (noun and verb definitions)
with open("data/imsitu_space.json") as f:
    imsitu_space = json.load(f)

nouns = imsitu_space["nouns"]
verbs = imsitu_space["verbs"]

# Print the code of every noun whose first gloss matches the given name
def get_agent_codes(agent="person"):
    for noun in nouns:
        if nouns[noun]['gloss'][0] == agent:
            print(f"{agent} found")
            print(noun)

# Get all noun codes for "cardboard"
get_agent_codes("cardboard")

cardboard found
n14799601


In [36]:
# Collect the images of a given verb whose 'item' role is one of the given noun codes
def get_verb_agent(json_file, verb_custom, agent_custom):
    with open(json_file) as f:
        train = json.load(f)
    verb_value = []
    agent_key = []
    agent_value = []
    file_path = []
    for i in train:
        verb = train[i]['verb']
        if verb != verb_custom:
            continue
        for frame in train[i]['frames']:
            value = frame.get('item')
            # Record each image only once, even if several frames match
            if value is not None and value in agent_custom and i not in file_path:
                agent_key.append('item')
                agent_value.append(value)
                file_path.append(i)
                verb_value.append(verb)
    # The last element is the number of matching images
    return file_path, verb_value, agent_key, agent_value, len(file_path)

get_verb_agent('data/train.json', 'stapling', ['n14799601'])

(['stapling_141.jpg',
  'stapling_199.jpg',
  'stapling_70.jpg',
  'stapling_138.jpg',
  'stapling_75.jpg'],
 ['stapling', 'stapling', 'stapling', 'stapling', 'stapling'],
 ['item', 'item', 'item', 'item', 'item'],
 ['n14799601', 'n14799601', 'n14799601', 'n14799601', 'n14799601'],
 5)

In [37]:
import os
import shutil

# Copy the selected images from the original folder to the destination folder
def img_to_folder(dirs_original, dirs_destination):
    image_list = get_verb_agent('data/train.json', 'stapling', ['n14799601'])[0]
    # Make sure the destination folder exists before copying
    os.makedirs(dirs_destination, exist_ok=True)
    for img in image_list:
        shutil.copy(os.path.join(dirs_original, img), os.path.join(dirs_destination, img))

img_to_folder("data/original/", "data/cardboard/train/")

FileNotFoundError: [Errno 2] No such file or directory: 'data/original/stapling_141.jpg'
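The `FileNotFoundError` above indicates that the image is not present under `data/original/`. A more defensive variant, sketched below, creates the destination folder if needed and reports missing source files instead of stopping at the first one. The function name `copy_existing_images` and the temporary folders in the usage example are assumptions made for illustration.

```python
import shutil
import tempfile
from pathlib import Path


def copy_existing_images(image_names, source_dir, destination_dir):
    """Copy the listed images that exist in source_dir; return (copied, missing)."""
    source_dir = Path(source_dir)
    destination_dir = Path(destination_dir)
    destination_dir.mkdir(parents=True, exist_ok=True)

    copied, missing = [], []
    for name in image_names:
        source_file = source_dir / name
        if source_file.is_file():
            shutil.copy(source_file, destination_dir / name)
            copied.append(name)
        else:
            missing.append(name)
    return copied, missing


# Usage on throwaway folders, with one existing and one missing file.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "original"
    source.mkdir()
    (source / "stapling_141.jpg").write_bytes(b"placeholder")

    copied, missing = copy_existing_images(
        ["stapling_141.jpg", "stapling_199.jpg"], source, Path(tmp) / "train"
    )
    print("copied:", copied)
    print("missing:", missing)
```

Returning the list of missing files makes it easier to see whether the whole source folder is absent or only a few images were not downloaded.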

In [None]:
import pandas as pd

# Save the file names, verbs, and noun codes of the selected images as a CSV file
def lists_to_df(dirs_destination, col1_name, col2_name, col3_name):
    # Call get_verb_agent once (same train.json path as in the cells above) and unpack the lists that are needed
    col1, col2, _, col3, _ = get_verb_agent('data/train.json', 'stapling', ['n14799601'])
    df = pd.DataFrame(list(zip(col1, col2, col3)), columns=[col1_name, col2_name, col3_name])
    df.to_csv(dirs_destination, index=False)
    return df

lists_to_df('./data/dusting/train/stapling_train.csv', 'file_name', 'verb', 'agent')

In [None]:
# Import libraries
import glob
import itertools
import os
import random
import urllib
from urllib.request import urlopen

import cv2
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
from imutils import paths
from sklearn.metrics import confusion_matrix, classification_report
from sklearn.utils import shuffle

In [None]:
# Import other dataset
dir_path = "data/dataset-resized"

In [None]:
target_size = (224, 224)
waste_labels = {"cardboard":0, "glass":1, "metal":2, "paper":3, "plastic":4, "trash":5}

In [None]:
def load_dataset(path):
  x = []
  labels = []
  image_paths = sorted(list(paths.list_images(path)))
  for image_path in image_paths:
    # Read each image and resize it to the target size
    img = cv2.imread(image_path)
    img = cv2.resize(img, target_size)
    x.append(img)
    # The parent folder name is the class label
    label = image_path.split(os.path.sep)[-2]
    labels.append(waste_labels[label])
  x, labels = shuffle(x, labels, random_state=42)
  # Image arrays have the shape (height, width, channels)
  input_shape = np.array(x[0]).shape
  print("X shape: ", np.array(x).shape)
  print(f"Number of Labels: {len(np.unique(labels))} , Number of Observation: {len(labels)}")
  print("Input Shape: ", input_shape)
  return x, labels, input_shape

In [None]:
x, labels, input_shape = load_dataset(dir_path)

X shape:  (2527, 224, 224, 3)
Number of Labels: 6 , Number of Observation: 2527
Input Shape:  (224, 224, 3)
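Because class imbalance is one common source of bias in an image classifier, it may be worth inspecting how many images each class contributes before splitting the data. The sketch below assumes integer labels encoded as in `waste_labels`; the helper name `class_distribution` is hypothetical, and the example label list is made up and does not reflect the real class counts.

```python
from collections import Counter

waste_labels = {"cardboard": 0, "glass": 1, "metal": 2, "paper": 3, "plastic": 4, "trash": 5}


def class_distribution(labels, label_map):
    """Return {class name: (count, share of dataset)} for integer-encoded labels."""
    names = {index: name for name, index in label_map.items()}
    counts = Counter(labels)
    total = len(labels)
    return {
        names[index]: (counts.get(index, 0), counts.get(index, 0) / total)
        for index in sorted(names)
    }


# Made-up labels for illustration only.
example_labels = [0, 0, 0, 1, 3, 3, 5, 2]
distribution = class_distribution(example_labels, waste_labels)
for name, (count, share) in distribution.items():
    print(f"{name:<10} {count:>3}  {share:.1%}")
```

In the notebook, the same function could be called with the `labels` list returned by `load_dataset`.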


In [None]:
from sklearn.model_selection import train_test_split

# Split the data into training and validation sets
train_data, val_data, train_labels, val_labels = train_test_split(x, labels, test_size=0.2, random_state=42)

## Use-case 4: Write Python functions; group fairness metrics

Write your text for use-case 4 here

In [None]:
#Write your Python code for use-case 4 here

## Use-case 5: Write Python function; group fairness taxonomy

Write your text for use-case 5 here

In [None]:
#Write your Python code for use-case 5 here

## Use-case 6: Apply one/multiple explainable AI method(s) to the image classifier

Write your text for use-case 6 here

In [None]:
#Write your Python code for use-case 6 here