# Part 3: Interpretability and Accountability

In this part of the assignment, you will read about and consider two broader sociotechnical issues in machine learning: model interpretability and the ethics of facial recognition. No coding is required for this part.

**Learning objectives.** You will:
1. Consider the role of interpretability and explainability in machine learning models for computer vision tasks.
2. Consider the ethical implications surrounding bias and privacy for facial recognition software as one example of object detection in the real world.

## Task 1

Although deep learning models can achieve impressive performance, it can be very challenging to make sense of how they make predictions. One common approach to analyzing deep convolutional neural networks for image recognition tasks is to compute a saliency map: a heat map over the input image that highlights the regions contributing most to the activation of a given output unit of the model.

The [Bishop Deep Learning Book](https://www.bishopbook.com/), in Section 10.3.3, describes the Gradient-weighted Class Activation Mapping (Grad-CAM) method introduced by [Selvaraju et al. 2016](https://arxiv.org/abs/1610.02391). Given an image and label, this technique computes the derivatives of the score for the label's output unit with respect to the feature maps (activations) of, usually, the last convolutional layer. Averaging these derivatives over the spatial positions of each feature map gives one weight per channel. The weights are used to form a weighted sum of the feature maps, which is passed through a ReLU, upsampled, and overlaid as a heatmap on the original image. At a high level, the heatmap estimates which regions of the image contribute most to the activation of the given output unit.
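The core Grad-CAM arithmetic can be sketched in a few lines of NumPy. This is a minimal illustration, not the reference implementation: it assumes the feature maps and the gradients of the class score with respect to those feature maps have already been extracted from a network, and the function names `grad_cam` and `upsample_nearest`, the nearest-neighbor upsampling, and the rescaling to [0, 1] are choices made here for illustration only (libraries typically use smoother interpolation).

```python
import numpy as np


def grad_cam(activations, gradients):
    """Combine feature maps into a coarse class-activation map.

    activations: array of shape (K, H, W), the K feature maps of a conv layer.
    gradients:   array of shape (K, H, W), d(class score) / d(activations).
    Returns an (H, W) map rescaled to [0, 1] (rescaling is a display choice).
    """
    # One weight per channel: average the gradient over spatial positions.
    weights = gradients.mean(axis=(1, 2))                 # shape (K,)
    # Weighted sum of the feature maps over the channel axis.
    cam = np.tensordot(weights, activations, axes=1)      # shape (H, W)
    # ReLU keeps only regions with a positive influence on the class score.
    cam = np.maximum(cam, 0.0)
    peak = cam.max()
    if peak > 0:
        cam = cam / peak
    return cam


def upsample_nearest(cam, factor):
    """Hypothetical helper: repeat each cell factor x factor times."""
    return np.kron(cam, np.ones((factor, factor)))


# Toy example: two 2x2 feature maps with hand-picked gradients.
activations = np.array([[[1.0, 0.0], [0.0, 0.0]],
                        [[0.0, 0.0], [0.0, 2.0]]])
gradients = np.stack([np.ones((2, 2)), -np.ones((2, 2))])

heatmap = upsample_nearest(grad_cam(activations, gradients), factor=2)
print(heatmap)
```

In this toy case the first channel has a positive average gradient and the second a negative one, so only the top-left region, where the first feature map is active, survives the ReLU.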

The algorithm is implemented in the open source [`pytorch-grad-cam`](https://github.com/jacobgil/pytorch-grad-cam/tree/master) library, whose documentation shows how it can be used to visualize the activations for different classes on the same image.

Not everyone agrees that saliency maps are necessarily helpful for understanding machine learning models. Read at least Sections 1-2 (about 5 pages) of [Dr. Rudin](https://users.cs.duke.edu/~cynthia/)'s highly influential [2019 Nature Machine Intelligence perspective article](https://arxiv.org/abs/1811.10154). (Dr. Rudin is a Computer Science Professor here at Duke.)

Write 1-2 paragraphs in response to each of the following two questions.

1. Summarize the difference between **interpretable** and **explainable** machine learning models as described in the article.

    As I understand it, an interpretable machine learning model is one where the model itself is the explanation: its internal structure is transparent and directly communicates how predictions are made, so the decision-making process is visible in the model. Dr. Rudin prefers building models that are inherently interpretable because they "provide their own explanations, which are faithful to what the model actually computes." Explainable machine learning, in contrast, refers to black-box models that require post-hoc explanations to make sense of their predictions. The issue Dr. Rudin raises is that such "explanations are often not reliable, and can be misleading". In other words, the explanations are approximations of the model's reasoning rather than the reasoning itself. The main difference therefore lies in where the explanation comes from. With an interpretable model, the model is the explanation, so the explanation is direct, faithful, and transparent. With an explainable model, the explanation is separate from the model and can therefore be misleading.

2. Do you think that saliency maps such as from Grad-CAM are helpful for explaining image recognition models? Briefly explain why or why not.

    For the most part, I think that saliency maps such as Grad-CAM are helpful for explaining image recognition models. According to the Grad-CAM paper, the method produces "a coarse localization map highlighting the important regions in the image for predicting the concept". I see two advantages to this. First, it gives visual insight that helps people understand which regions drive a prediction. An example from the paper is the tiger cat class, where Grad-CAM highlighted the cat's stripes and ears rather than irrelevant background pixels. Second, saliency maps like Grad-CAM are useful for bias detection, since Grad-CAM has been used to uncover biases in training data. An example from the paper is a classifier for doctors and nurses: Grad-CAM revealed that the classifier was relying on gendered features such as face and hairstyle instead of medical context, which exposed the bias.

    Although Grad-CAM is helpful, we should be aware of its limitations and use it as a complement to other analysis rather than a substitute. The main issue is the distinction between where and why: Grad-CAM can highlight the correct object region, but it does not explain exactly which features are being used or how. Dr. Rudin makes this point directly, writing that a saliency map "does not explain anything except where the network is looking" and that "knowing where the network is looking within the image does not tell the user what it is doing with that part of the image". Another issue she points out is that heatmaps can look convincing while being misleading, since multiple classes might produce nearly identical saliency maps.

## Task 2

You may have heard of **facial recognition** software: These are essentially models for object detection and recognition that are trained to detect and recognize human faces. Such models have been deployed in law enforcement and, for example, [US Customs when entering the country](https://www.cbp.gov/travel/biometrics).

Such adoption has happened despite serious ethical objections concerning bias and privacy. For example, [Gender Shades by Buolamwini and Gebru, 2018](https://proceedings.mlr.press/v81/buolamwini18a.html) was a landmark audit study of commercially available facial recognition software that demonstrated substantial disparities in performance by gender and skin type.

In 2021, Marks described in a [Communications of the ACM article](https://cacm.acm.org/news/can-the-biases-in-facial-recognition-be-fixed-also-should-they/) cases of individuals wrongly accused of crimes on the basis of facial recognition software, including software developed using 2.8 billion images scraped from social media without user permission. This raises further questions about privacy.

However, others might argue that properly trained and debiased models might be used to build a safer society (by helping to find criminal suspects) or a more convenient one (by replacing, for example, paper passports). Walsh, in a [2022 article](https://cacm.acm.org/opinion/the-troubling-future-for-facial-recognition-software/), considers a variety of perspectives, including the threat of constant surveillance but also the potential for applications of societal benefit.

In 2-3 paragraphs, take a position on the question "Should governments use facial recognition software?" You might argue yes for some purposes but no for others, or that it depends on how the technology is developed or regulated; explain your view referencing any of the above articles or other sources of your own choosing.

______________________________________________________________________________________________________________________________

In my opinion, government use of facial recognition software should be restricted: it should be allowed only on a case-by-case basis under strict, enforceable limits, such as warrant-based searches for serious crimes, and it should not be used for applications like crowd scanning without safeguards. One major concern is that current facial recognition systems have been shown to disproportionately harm specific groups. The Gender Shades paper reports that "darker-skinned females are the most misclassified group": the commercial systems it audited had error rates of up to 34.7% for darker-skinned women, compared with at most 0.8% for lighter-skinned men. Similarly, Marks's article cites a large NIST study reporting that Asian and African faces received false positives 10 to 100 times more often than white faces. False positives at these rates can lead to false accusations against people of color. For example, Marks's article describes how Robert Williams and Michael Oliver, both African American men, were wrongfully arrested after Detroit police relied on facial recognition matches that turned out to be incorrect.

Another concern is privacy and the risk of mass surveillance. If facial recognition software is permitted for government use without limits, governments or large technology companies may scrape facial images without people's consent, which would undermine privacy and data rights in general.

Although there are many concerns regarding the use of facial recognition software, the technology also has benefits. For instance, Walsh's article mentions that facial recognition software in New Delhi was used to help "reunite nearly 3,000 children with their parents". I can also see benefits in operational efficiency, since the software can speed up suspect identification in investigations of serious crimes. Still, restrictions need to be in place to regulate its use. I would suggest four: banning real-time crowd scanning; requiring a judicial warrant for the specific cases where facial recognition is used; requiring independent bias audits of the software to guard against gender and racial discrimination; and prohibiting non-consensual scraping, because I believe people should give consent before their data is scraped or used.