<a href="https://colab.research.google.com/github/ronakkkk/Improved-Model-Remediaiton-on-CelebA-Dataset-/blob/main/MinDiff_Tuning_on_CelebA_Dataset.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>


# Model Remediation on CelebA Dataset

# Installation
This notebook was created in [Colaboratory](https://research.google.com/colaboratory/faq.html), connected to the Python 3 Google Compute Engine backend. If you wish to host this notebook in a different environment, then you should not experience any major issues provided you include all the required packages in the cells below.

Note that the very first time you run the pip installs, you may be asked to restart the runtime because of preinstalled, out-of-date packages. Once you do so, the correct packages will be used.

In [None]:
#@title Pip installs
!pip install -q -U pip==20.2

!pip install git+https://github.com/google-research/tensorflow_constrained_optimization
!pip install -q tensorflow-datasets tensorflow
!pip install fairness-indicators \
  "absl-py==0.12.0" \
  "apache-beam<3,>=2.36" \
  "avro-python3==1.9.1" \
  "pyzmq==17.0.0"
!pip install --upgrade tensorflow-model-remediation
!pip install --upgrade fairness-indicators

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting tensorflow-model-remediation
  Downloading tensorflow_model_remediation-0.1.7.1-py3-none-any.whl (142 kB)
Collecting mock
  Downloading mock-4.0.3-py3-none-any.whl (28 kB)
Installing collected packages: mock, tensorflow-model-remediation
Successfully installed mock-4.0.3 tensorflow-model-remediation-0.1.7.1
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Requirement already up-to-date: fairness-indicators in /usr/local/lib/python3.7/dist-packages (0.38.0)
Collecting pyarrow<6,>=1
  Downloading pyarrow-5.0.0-cp37-cp37m-manylinux2014_x86_64.whl (23.6 MB)
Collecting grpcio-gcp<1,>=0.2.2; extra == "gcp"
  Downloading grpcio_gcp-0.2.2-py2.py3-none-any.whl (9.4 kB)
Collecting google-cloud-spanner<2,>=1.13.0;

Note that depending on when you run the cell below, you may receive a warning about the default version of TensorFlow in Colab switching to TensorFlow 2.X soon. You can safely ignore that warning as this notebook was designed to be compatible with TensorFlow 1.X and 2.X.

In [None]:
#@title Import Modules
import os
import sys
import tempfile
import urllib.request  # the submodule must be imported explicitly for urlretrieve

import tensorflow as tf
from tensorflow import keras

import tensorflow_datasets as tfds
tfds.disable_progress_bar()

import numpy as np

from tensorflow_metadata.proto.v0 import schema_pb2
from tfx_bsl.tfxio import tensor_adapter
from tfx_bsl.tfxio import tf_example_record
import copy
import requests
import zipfile

import tensorflow_model_remediation.min_diff as md
from tensorflow_model_remediation.tools.tutorials_utils import min_diff_keras_utils
import tensorflow_hub as hub
from tensorflow_model_analysis.addons.fairness.view import widget_view
from keras.layers import Conv2D, MaxPooling2D, Activation, Flatten, Dropout, Dense, LeakyReLU, MaxPool2D
from keras.layers import BatchNormalization
import matplotlib.pyplot as plt

Additionally, we add a few imports that are specific to Fairness Indicators which we will use to evaluate and visualize the model's performance.

In [None]:
#@title Fairness Indicators related imports
import tensorflow_model_analysis as tfma
import fairness_indicators as fi
from google.protobuf import text_format
import apache_beam as beam

In [None]:
#@title Enable Eager Execution and Print Versions
if int(tf.__version__.split(".")[0]) < 2:  # compare the major version numerically, not as a string
  tf.compat.v1.enable_eager_execution()
  print("Eager execution enabled.")
else:
  print("Eager execution enabled by default.")

print("TensorFlow " + tf.__version__)
print("TFMA " + tfma.VERSION_STRING)
print("TFDS " + tfds.version.__version__)
print("FI " + fi.version.__version__)

Eager execution enabled by default.
TensorFlow 2.8.0
TFMA 0.38.0
TFDS 4.0.1
FI 0.38.0


# CelebA Dataset
[CelebA](http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html) is a large-scale face attributes dataset with more than 200,000 celebrity images, each with 40 attribute annotations (such as hair type, fashion accessories, facial features, etc.) and 5 landmark locations (eyes, mouth and nose positions). For more details take a look at [the paper](https://liuziwei7.github.io/projects/FaceAttributes.html).
With the permission of the owners, we have stored this dataset on Google Cloud Storage and mostly access it via [TensorFlow Datasets (`tfds`)](https://www.tensorflow.org/datasets).

In this notebook:
* Our model will attempt to classify whether the subject of the image is attractive, as represented by the "Attractive" attribute.
* Images will be resized from 218x178 to 128x128 to reduce execution time and memory use during training.
* Our model's performance will be evaluated across gender, using the binary "Male" attribute.
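
As a rough illustration of why the resize helps, the sketch below compares the per-image memory footprint before and after resizing. It assumes three color channels stored as float32 (4 bytes per value); `tensor_bytes` is a hypothetical helper and is not used elsewhere in this notebook.

```python
# Rough per-image memory footprint, assuming 3 channels stored as float32.
def tensor_bytes(height, width, channels=3, bytes_per_value=4):
    return height * width * channels * bytes_per_value

original = tensor_bytes(218, 178)  # native CelebA resolution
resized = tensor_bytes(128, 128)   # resolution used in this notebook

# Fraction of the original footprint that remains after resizing.
ratio = resized / original
print(original, resized, round(ratio, 2))
```

Under these assumptions, each resized image takes well under half the memory of the original.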





In [None]:
gcs_base_dir = "gs://celeb_a_dataset/"
celeb_a_builder = tfds.builder("celeb_a", data_dir=gcs_base_dir, version='2.0.0')

celeb_a_builder.download_and_prepare()

num_test_shards_dict = {'0.3.0': 4, '2.0.0': 2} # Used because we download the test dataset separately
version = str(celeb_a_builder.info.version)
print('Celeb_A dataset version: %s' % version)

Celeb_A dataset version: 2.0.0


In [None]:
#@title Test dataset helper functions
local_root = tempfile.mkdtemp(prefix='test-data')
def local_test_filename_base():
  return local_root

def local_test_file_full_prefix():
  return os.path.join(local_test_filename_base(), "celeb_a-test.tfrecord")

def copy_test_files_to_local():
  filename_base = local_test_file_full_prefix()
  num_test_shards = num_test_shards_dict[version]
  for shard in range(num_test_shards):
    url = "https://storage.googleapis.com/celeb_a_dataset/celeb_a/%s/celeb_a-test.tfrecord-0000%s-of-0000%s" % (version, shard, num_test_shards)
    filename = "%s-0000%s-of-0000%s" % (filename_base, shard, num_test_shards)
    res = urllib.request.urlretrieve(url, filename)
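
To make the shard naming explicit, here is a small sketch of how the file suffixes are built. `shard_suffixes` is a hypothetical helper; it assumes the five-digit, zero-padded naming seen in the URLs above, which the `0000%s` pattern reproduces only for single-digit shard counts.

```python
# Build shard suffixes of the form "00000-of-00002".
def shard_suffixes(num_shards):
    # Zero-pad to five digits so the pattern also holds for 10 or more shards.
    return ["%05d-of-%05d" % (shard, num_shards) for shard in range(num_shards)]

suffixes = shard_suffixes(2)
print(suffixes)
```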

# Setting Up Input Functions
The subsequent cells will help streamline the input pipeline as well as visualize performance.

First we define some data-related variables and define a requisite preprocessing function.

# Defining keys
Here we define the keys for the task described above: `LABEL_KEY` is the dataset's "Attractive" attribute and `GROUP_KEY` is its "Male" attribute.

In [None]:
#@title Define Variables
ATTR_KEY = "attributes"
IMAGE_KEY = "image"
LABEL_KEY = "Attractive"
GROUP_KEY = "Male"
IMAGE_SIZE = 128

In [None]:
#@title Define Preprocessing Functions
def preprocess_input_dict(feat_dict):
  # Separate out the image and target variable from the feature dictionary.
  print(feat_dict)
  image = feat_dict[IMAGE_KEY]
  label = feat_dict[ATTR_KEY][LABEL_KEY]
  group = feat_dict[ATTR_KEY][GROUP_KEY]
  print(image, label, group)
  # Resize and normalize image.
  image = tf.cast(image, tf.float32)
  image = tf.image.resize(image, [IMAGE_SIZE, IMAGE_SIZE])
  image /= 255.0

  # Cast label and group to float32.
  label = tf.cast(label, tf.float32)
  group = tf.cast(group, tf.float32)

  feat_dict[IMAGE_KEY] = image
  feat_dict[ATTR_KEY][LABEL_KEY] = label
  feat_dict[ATTR_KEY][GROUP_KEY] = group

  return feat_dict

get_image_and_label = lambda feat_dict: (feat_dict[IMAGE_KEY], feat_dict[ATTR_KEY][LABEL_KEY])
get_image_label_and_group = lambda feat_dict: (feat_dict[IMAGE_KEY], feat_dict[ATTR_KEY][LABEL_KEY], feat_dict[ATTR_KEY][GROUP_KEY])
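
The following plain-Python sketch mirrors the casting and scaling steps of the preprocessing on a toy example, without TensorFlow. It is only an illustration: the image is a flat list of three pixel values, the resize step is omitted, and `preprocess_example` is a hypothetical stand-in for `preprocess_input_dict`.

```python
# Plain-Python stand-in for the preprocessing above: scale 8-bit pixel values
# to [0, 1] and cast the boolean label and group attributes to floats.
def preprocess_example(example, label_key="Attractive", group_key="Male"):
    pixels = [value / 255.0 for value in example["image"]]
    attributes = dict(example["attributes"])  # copy so the input is not mutated
    attributes[label_key] = float(attributes[label_key])
    attributes[group_key] = float(attributes[group_key])
    return {"image": pixels, "attributes": attributes}

toy = {"image": [0, 51, 255], "attributes": {"Attractive": True, "Male": False}}
processed = preprocess_example(toy)

# Equivalent of get_image_and_label on the processed example.
image_and_label = (processed["image"], processed["attributes"]["Attractive"])
print(image_and_label)
```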



Then, we build out the data functions we need in the rest of the colab.

In [None]:
# Training data: yields (image, label) pairs after all required preprocessing.
def celeb_a_train_data_wo_group(batch_size):
  celeb_a_train_data = celeb_a_builder.as_dataset(split='train').shuffle(1024).repeat().batch(batch_size).map(preprocess_input_dict).map(get_image_and_label)
  return celeb_a_train_data
print(celeb_a_train_data_wo_group(32))
# Test data for the overall evaluation
celeb_a_test_data = celeb_a_builder.as_dataset(split='test').batch(1).map(preprocess_input_dict).map(get_image_label_and_group)
# Copy test data locally to be able to read it into tfma
copy_test_files_to_local()

{'attributes': {'5_o_Clock_Shadow': <tf.Tensor 'args_0:0' shape=(None,) dtype=bool>, 'Arched_Eyebrows': <tf.Tensor 'args_1:0' shape=(None,) dtype=bool>, 'Attractive': <tf.Tensor 'args_2:0' shape=(None,) dtype=bool>, 'Bags_Under_Eyes': <tf.Tensor 'args_3:0' shape=(None,) dtype=bool>, 'Bald': <tf.Tensor 'args_4:0' shape=(None,) dtype=bool>, 'Bangs': <tf.Tensor 'args_5:0' shape=(None,) dtype=bool>, 'Big_Lips': <tf.Tensor 'args_6:0' shape=(None,) dtype=bool>, 'Big_Nose': <tf.Tensor 'args_7:0' shape=(None,) dtype=bool>, 'Black_Hair': <tf.Tensor 'args_8:0' shape=(None,) dtype=bool>, 'Blond_Hair': <tf.Tensor 'args_9:0' shape=(None,) dtype=bool>, 'Blurry': <tf.Tensor 'args_10:0' shape=(None,) dtype=bool>, 'Brown_Hair': <tf.Tensor 'args_11:0' shape=(None,) dtype=bool>, 'Bushy_Eyebrows': <tf.Tensor 'args_12:0' shape=(None,) dtype=bool>, 'Chubby': <tf.Tensor 'args_13:0' shape=(None,) dtype=bool>, 'Double_Chin': <tf.Tensor 'args_14:0' shape=(None,) dtype=bool>, 'Eyeglasses': <tf.Tensor 'args_15:0'

# Build the model
Because this notebook focuses on applying MinDiff, we will assemble a simple `tf.keras.Sequential` model.

We may be able to greatly improve model performance by adding some complexity (e.g., more densely-connected layers, exploring different activation functions, increasing image size), but that may distract us from the goal of demonstrating the MinDiff model.

We also define a function to set seeds to ensure reproducible results. Note that this colab is meant as an educational tool and does not have the stability of a finely tuned production pipeline. Running without setting a seed may lead to varied results. 

In [None]:
def set_seeds():
  np.random.seed(121212)
  tf.compat.v1.set_random_seed(212121)
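
The effect of seeding can be illustrated with Python's standard `random` module. This sketch is independent of NumPy and TensorFlow; the `draw` helper is hypothetical, and the seed values are reused from above purely for illustration.

```python
import random

def draw(seed, n=3):
    # Re-seeding before drawing makes the sequence repeatable.
    random.seed(seed)
    return [random.random() for _ in range(n)]

first = draw(121212)
second = draw(121212)  # same seed, same sequence
other = draw(212121)   # different seed, different sequence

print(first == second, first == other)
```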

# Fairness Indicators Helper Functions
Before training our model, we define a number of helper functions that will allow us to evaluate the model's performance via Fairness Indicators.


First, we create a helper function to save our model once we train it.

In [None]:
def save_model(model, subdir):
  base_dir = tempfile.mkdtemp(prefix='saved_models')
  model_location = os.path.join(base_dir, subdir)
  model.save(model_location, save_format='tf')
  return model_location

Next, we define functions used to preprocess the data in order to correctly pass it through to TFMA.

In [None]:
#@title 
def tfds_filepattern_for_split(dataset_name, split):
  return f"{local_test_file_full_prefix()}*"

class PreprocessCelebA(object):
  """Class that deserializes, decodes and applies additional preprocessing for CelebA input."""
  def __init__(self, dataset_name):
    builder = tfds.builder(dataset_name)
    self.features = builder.info.features
    example_specs = self.features.get_serialized_info()
    self.parser = tfds.core.example_parser.ExampleParser(example_specs)

  def __call__(self, serialized_example):
    # Deserialize
    deserialized_example = self.parser.parse_example(serialized_example)
    # Decode
    decoded_example = self.features.decode_example(deserialized_example)
    # Additional preprocessing
    image = decoded_example[IMAGE_KEY]
    label = decoded_example[ATTR_KEY][LABEL_KEY]
    # Resize and scale image.
    image = tf.cast(image, tf.float32)
    image = tf.image.resize(image, [IMAGE_SIZE, IMAGE_SIZE])
    image /= 255.0
    image = tf.reshape(image, [-1])
    # Cast label to float32.
    label = tf.cast(label, tf.float32)

    group = decoded_example[ATTR_KEY][GROUP_KEY]
    
    output = tf.train.Example()
    output.features.feature[IMAGE_KEY].float_list.value.extend(image.numpy().tolist())
    output.features.feature[LABEL_KEY].float_list.value.append(label.numpy())
    output.features.feature[GROUP_KEY].bytes_list.value.append(b"Male" if group.numpy() else b'Female')
    return output.SerializeToString()

def tfds_as_pcollection(beam_pipeline, dataset_name, split):
  return (
      beam_pipeline
   | 'Read records' >> beam.io.ReadFromTFRecord(tfds_filepattern_for_split(dataset_name, split))
   | 'Preprocess' >> beam.Map(PreprocessCelebA(dataset_name))
  )

Finally, we define a function that evaluates the results in TFMA.

In [None]:
def get_eval_results(model_location, eval_subdir):
  base_dir = tempfile.mkdtemp(prefix='saved_eval_results')
  tfma_eval_result_path = os.path.join(base_dir, eval_subdir)

  eval_config_pbtxt = """
        model_specs {
          label_key: "%s"
        }
        metrics_specs {
          metrics {
            class_name: "FairnessIndicators"
            config: '{ "thresholds": [0.22, 0.5, 0.75] }'
          }
          metrics {
            class_name: "ExampleCount"
          }
        }
        slicing_specs {}
        slicing_specs { feature_keys: "%s" }
        options {
          compute_confidence_intervals { value: False }
          disabled_outputs{values: "analysis"}
        }
      """ % (LABEL_KEY, GROUP_KEY)
      
  eval_config = text_format.Parse(eval_config_pbtxt, tfma.EvalConfig())

  eval_shared_model = tfma.default_eval_shared_model(
        eval_saved_model_path=model_location, tags=[tf.saved_model.SERVING])

  schema_pbtxt = """
        tensor_representation_group {
          key: ""
          value {
            tensor_representation {
              key: "%s"
              value {
                dense_tensor {
                  column_name: "%s"
                  shape {
                    dim { size: 128 }
                    dim { size: 128 }
                    dim { size: 3 }
                  }
                }
              }
            }
          }
        }
        feature {
          name: "%s"
          type: FLOAT
        }
        feature {
          name: "%s"
          type: FLOAT
        }
        feature {
          name: "%s"
          type: BYTES
        }
        """ % (IMAGE_KEY, IMAGE_KEY, IMAGE_KEY, LABEL_KEY, GROUP_KEY)
  schema = text_format.Parse(schema_pbtxt, schema_pb2.Schema())
  coder = tf_example_record.TFExampleBeamRecord(
      physical_format='inmem', schema=schema,
      raw_record_column_name=tfma.ARROW_INPUT_COLUMN)
  tensor_adapter_config = tensor_adapter.TensorAdapterConfig(
    arrow_schema=coder.ArrowSchema(),
    tensor_representations=coder.TensorRepresentations())
  # Run the fairness evaluation.
  with beam.Pipeline() as pipeline:
    _ = (
          tfds_as_pcollection(pipeline, 'celeb_a', 'test')
          | 'ExamplesToRecordBatch' >> coder.BeamSource()
          | 'ExtractEvaluateAndWriteResults' >>
          tfma.ExtractEvaluateAndWriteResults(
              eval_config=eval_config,
              eval_shared_model=eval_shared_model,
              output_path=tfma_eval_result_path,
              tensor_adapter_config=tensor_adapter_config)
    )
  return tfma.load_eval_result(output_path=tfma_eval_result_path)
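
To clarify what the per-slice metrics at the thresholds `[0.22, 0.5, 0.75]` represent, the sketch below computes the false positive rate for each group on toy data. It is a simplified illustration, not how TFMA computes its metrics: the scores are made up, and counting a prediction as positive when the score is strictly greater than the threshold is an assumption.

```python
def false_positive_rate(scores, labels, threshold):
    # FPR = FP / (FP + TN), computed over examples whose true label is 0.
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not negatives:
        return float("nan")
    false_positives = sum(1 for s in negatives if s > threshold)
    return false_positives / len(negatives)

# Toy predictions: (score, label, group).
examples = [
    (0.9, 0, "Female"), (0.6, 0, "Female"), (0.3, 0, "Female"), (0.8, 1, "Female"),
    (0.7, 0, "Male"), (0.2, 0, "Male"), (0.1, 0, "Male"), (0.4, 1, "Male"),
]

# One FPR value per slice and threshold.
fpr_by_slice = {}
for group in ("Female", "Male"):
    scores = [s for s, _, g in examples if g == group]
    labels = [y for _, y, g in examples if g == group]
    fpr_by_slice[group] = {
        t: false_positive_rate(scores, labels, t) for t in (0.22, 0.5, 0.75)
    }
print(fpr_by_slice)
```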

# Train & Evaluate Model without MinDiff
With the input pipeline in place, we can now define and train our model. To cut back on execution time and memory, we train on small batches for only a few epochs.

In [None]:
def create_model():
    AlexNet = tf.keras.models.Sequential()

    # 1st Convolution Layer
    AlexNet.add(Conv2D(filters=96, input_shape=(128, 128, 3), kernel_size=(11, 11), strides=(4, 4), padding='same',name='image'))
    AlexNet.add(BatchNormalization())
    AlexNet.add(Activation('relu'))
    AlexNet.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2), padding='same'))

    # 2nd Convolutional Layer
    AlexNet.add(Conv2D(filters=256, kernel_size=(5, 5), strides=(1, 1), padding='same'))
    AlexNet.add(BatchNormalization())
    AlexNet.add(Activation('relu'))
    AlexNet.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2), padding='same'))

    # 3rd Convolutional Layer
    AlexNet.add(Conv2D(filters=384, kernel_size=(3, 3), strides=(1, 1), padding='same'))
    AlexNet.add(BatchNormalization())
    AlexNet.add(Activation('relu'))

    # 4th Convolutional Layer
    AlexNet.add(Conv2D(filters=384, kernel_size=(3, 3), strides=(1, 1), padding='same'))
    AlexNet.add(BatchNormalization())
    AlexNet.add(Activation('relu'))

    # 5th Convolutional Layer
    AlexNet.add(Conv2D(filters=512, kernel_size=(3, 3), strides=(1, 1), padding='same'))
    AlexNet.add(BatchNormalization())
    AlexNet.add(Activation('relu'))
    AlexNet.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2), padding='same'))

    # Passing it to a Fully Connected layer
    AlexNet.add(Flatten())

    # 1st Fully Connected Layer
    AlexNet.add(Dense(2048))  # input_shape is only meaningful on the first layer
    AlexNet.add(BatchNormalization())
    AlexNet.add(Activation('relu'))
    # Add Dropout to prevent overfitting
    AlexNet.add(Dropout(0.5))

    # 2nd Fully Connected Layer
    AlexNet.add(Dense(2048))
    AlexNet.add(BatchNormalization())
    AlexNet.add(Activation('relu'))

    # 3rd Fully Connected Layer
    AlexNet.add(Dense(1024))
    AlexNet.add(BatchNormalization())
    AlexNet.add(Activation('relu'))
    # Add Dropout
    AlexNet.add(Dropout(0.5))

    # Output layer: a single sigmoid unit for binary classification
    AlexNet.add(Dense(1))
    AlexNet.add(BatchNormalization())
    AlexNet.add(Activation('sigmoid'))

    return AlexNet
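
To see how the spatial size shrinks through the network, the sketch below traces the feature-map size layer by layer. It relies on the fact that, with `padding='same'`, the output size is `ceil(input / stride)`; the layer list is transcribed by hand from `create_model` above, and the helper name is hypothetical.

```python
import math

def same_padding_output(size, stride):
    # With padding='same', the output spatial size is ceil(input / stride).
    return math.ceil(size / stride)

# (layer description, stride) for each convolution and pooling layer.
layers = [
    ("conv1 11x11", 4), ("pool1", 2),
    ("conv2 5x5", 1), ("pool2", 2),
    ("conv3 3x3", 1), ("conv4 3x3", 1),
    ("conv5 3x3", 1), ("pool3", 2),
]

size = 128  # input images are 128x128
trace = []
for name, stride in layers:
    size = same_padding_output(size, stride)
    trace.append((name, size))

# The last convolution has 512 filters, so Flatten yields size * size * 512 values.
flattened = size * size * 512
print(trace, flattened)
```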


In [None]:
BATCH_SIZE = 32

# Set seeds to get reproducible results
set_seeds()
model = create_model()
opt = tf.keras.optimizers.Adam(0.001)
model.compile(loss="binary_crossentropy", optimizer=opt, metrics=['accuracy'])

model.fit(celeb_a_train_data_wo_group(BATCH_SIZE), epochs=5, steps_per_epoch=1000)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x7f48b6f4a790>

Evaluating the model on the test data results in a final accuracy score of 78.92%.

In [None]:
print('Overall Results')
celeb_a_test_data = celeb_a_builder.as_dataset(split='test').batch(1).map(preprocess_input_dict).map(get_image_label_and_group)
results = model.evaluate(celeb_a_test_data)

Overall Results


# Saving Model and Evaluating the Fairness

In [None]:
model_location = save_model(model, 'export')
eval_results = get_eval_results(model_location, 'results')

INFO:tensorflow:Assets written to: /tmp/saved_modelsclvucsrc/export/assets





Instructions for updating:
Use eager execution and: 
`tf.data.TFRecordDataset(path)`



In [None]:
tfma.addons.fairness.view.widget_view.render_fairness_indicator(eval_results)

From the evaluation results, select the false positive rate (FPR) metric with threshold 0.5. More female examples than male examples are falsely marked as attractive: the FPR is 0.115 for the Male group and 0.452 for the Female group, a difference of 0.337.
This indicates that the model is biased toward predicting female subjects as attractive.
To reduce this bias, we will apply MinDiff.

## Define and Train the MinDiff Model

Now, we’ll try to improve the FPR for the underperforming Female group. We’ll attempt to do so using [MinDiff](https://arxiv.org/abs/1910.11779), a remediation technique that seeks to balance error rates across slices of your data by penalizing disparities in performance during training. When we apply MinDiff, model performance may degrade slightly on other slices. As such, our goals with MinDiff will be:
*   Improved performance for underperforming groups
*   Limited degradation for other groups and overall performance
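
The idea of penalizing disparities can be sketched as follows. This is a simplified illustration under stated assumptions, not the library's implementation: it uses a biased, Gaussian-kernel estimate of the squared maximum mean discrepancy (MMD) between the predicted scores of two groups, and the kernel width `sigma`, the toy scores, and all function names are hypothetical.

```python
import math

def gaussian_kernel(a, b, sigma=0.1):
    return math.exp(-((a - b) ** 2) / (2 * sigma ** 2))

def mmd_squared(xs, ys, sigma=0.1):
    # Biased estimate: mean k(x, x') + mean k(y, y') - 2 * mean k(x, y).
    def mean_kernel(us, vs):
        total = sum(gaussian_kernel(u, v, sigma) for u in us for v in vs)
        return total / (len(us) * len(vs))
    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2 * mean_kernel(xs, ys)

def total_loss(primary_loss, xs, ys, min_diff_weight):
    # The penalty is added to the usual training loss, scaled by the weight.
    return primary_loss + min_diff_weight * mmd_squared(xs, ys)

# Toy predicted scores for the two groups being compared.
scores_group_a = [0.2, 0.3, 0.25]
scores_group_b = [0.6, 0.7, 0.65]

same = mmd_squared(scores_group_a, scores_group_a)     # identical score sets
shifted = mmd_squared(scores_group_a, scores_group_b)  # shifted score sets
combined = total_loss(0.5, scores_group_a, scores_group_b, 1.5)
print(same, shifted, combined)
```

The penalty vanishes when both groups receive the same distribution of scores and grows as the distributions drift apart, so a larger weight pushes training harder toward similar score distributions.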



### Prepare your data

To use MinDiff, we create two additional data splits:
* A split of attractive examples from the male group. We build it with the `filter()` method, dropping every example that is not male or not attractive. We name this dataset `dataset_train_sensitive`.
* A split of attractive examples from the female group. We build it with the `filter()` method as well, dropping every example that is male or not attractive. We name this dataset `dataset_train_nonsensitive`.
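
The filtering logic for the two splits can be sketched on plain Python records. The toy records and the helper name are hypothetical; in the notebook the same conditions are applied to `tf.data` datasets.

```python
# Toy records standing in for CelebA examples (label: Attractive, group: Male).
records = [
    {"id": 0, "Attractive": 1.0, "Male": 1.0},
    {"id": 1, "Attractive": 1.0, "Male": 0.0},
    {"id": 2, "Attractive": 0.0, "Male": 1.0},
    {"id": 3, "Attractive": 1.0, "Male": 0.0},
]

def is_attractive_and_in_group(record, group_value):
    # Keep only positively labeled examples that belong to the requested group.
    return record["Attractive"] == 1 and record["Male"] == group_value

sensitive = [r["id"] for r in records if is_attractive_and_in_group(r, 1)]
nonsensitive = [r["id"] for r in records if is_attractive_and_in_group(r, 0)]
print(sensitive, nonsensitive)
```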

In [None]:
def min_diff_data_wo_group(batch_size):
  dataset_train_sensitive=celeb_a_builder.as_dataset(split='train').batch(1).map(preprocess_input_dict).filter(lambda feat_dict: True if (feat_dict[ATTR_KEY][LABEL_KEY]==1 and feat_dict[ATTR_KEY][GROUP_KEY]==1)   else False).map(get_image_and_label)
  dataset_train_nonsensitive=celeb_a_builder.as_dataset(split='train').batch(1).map(preprocess_input_dict).filter(lambda feat_dict: True if (feat_dict[ATTR_KEY][LABEL_KEY]==1 and feat_dict[ATTR_KEY][GROUP_KEY]==0)   else False).map(get_image_and_label)  
  dataset_train_main = celeb_a_builder.as_dataset(split='train').shuffle(1024).batch(batch_size).map(preprocess_input_dict).map(get_image_and_label)
  dataset = md.keras.utils.input_utils.pack_min_diff_data(dataset_train_main, dataset_train_sensitive, dataset_train_nonsensitive)
  return dataset

In [None]:
min_diff_weight = 1.5
original_model = create_model()
min_diff_loss = md.losses.MMDLoss()
min_diff_model = md.keras.MinDiffModel(original_model,
                                         min_diff_loss,
                                         min_diff_weight)

Training accuracy of the MinDiff model with the AlexNet-style architecture: 79.35%.

In [None]:
opt = tf.keras.optimizers.Adam(0.001)
min_diff_model.compile(loss="binary_crossentropy", optimizer=opt, metrics=['accuracy'])
min_diff_model.fit(min_diff_data_wo_group(32), epochs=2)

Epoch 1/2
Epoch 2/2


<keras.callbacks.History at 0x7f0cd24a9990>

In [None]:
min_diff_model.save_original_model('/content/min_diff_model')
eval_results_min_diff = get_eval_results('/content/min_diff_model', 'eval_results')

INFO:tensorflow:Assets written to: /content/min_diff_model/assets



Instructions for updating:
Use eager execution and: 
`tf.data.TFRecordDataset(path)`



In [None]:
tfma.addons.fairness.view.widget_view.render_fairness_indicator(eval_results_min_diff)

We can clearly see that the gap between the FPR of the two groups has been reduced. The FPR is now 0.014 for the male group and 0.154 for the female group, a difference of 0.14, which is smaller than the 0.337 of the previous model. Depending on the product, further improvements may be necessary, but we have successfully reduced the bias of our model to some extent.

In [None]:

min_diff_weight = 1
min_diff_loss = md.losses.MMDLoss()
min_diff_model = md.keras.MinDiffModel(original_model,
                                         min_diff_loss,
                                         min_diff_weight)

opt = tf.keras.optimizers.Adam(0.001)
min_diff_model.compile(loss='binary_crossentropy', optimizer=opt, metrics=['accuracy'])
min_diff_model.fit(min_diff_data_wo_group(32), epochs=2)

Epoch 1/2
Epoch 2/2


<keras.callbacks.History at 0x7f0c2414a9d0>

In [None]:
min_diff_model.save_original_model('/content/min_diff_model')
eval_results_min_diff = get_eval_results('/content/min_diff_model', 'eval_results')

INFO:tensorflow:Assets written to: /content/min_diff_model/assets



Reducing the MinDiff weight from 1.5 to 1 increases the FPR difference between the male and female groups from 0.14 to 0.17.



In [None]:
tfma.addons.fairness.view.widget_view.render_fairness_indicator(eval_results_min_diff)
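
To put the three runs side by side, the sketch below computes how much each MinDiff configuration shrinks the FPR gap relative to the baseline. The gap values are the ones reported in this notebook at threshold 0.5; the dictionary keys are labels chosen here for illustration.

```python
# FPR gaps (Female FPR minus Male FPR at threshold 0.5) reported in this notebook.
fpr_gap = {
    "baseline": 0.337,
    "min_diff_weight=1.5": 0.14,
    "min_diff_weight=1": 0.17,
}

baseline = fpr_gap["baseline"]

# Percentage by which each MinDiff run reduces the baseline gap.
relative_reduction = {
    name: round(100 * (baseline - gap) / baseline, 1)
    for name, gap in fpr_gap.items()
    if name != "baseline"
}
print(relative_reduction)
```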