# Ethics: NLP should not do harm

Last week, on December 2, 2020, Timnit Gebru, the **co-lead of Google’s ethical AI team** and a widely respected leader in **AI ethics research**, announced via Twitter that the company had forced her out. The pretext for this was her email to her colleagues in which she "pushed back against Google’s censorship of her (and her colleagues’) research, which focused on examining the **environmental and ethical implications of large-scale AI language models (LLMs)** used in many Google products".
(More on this: https://www.bbc.com/news/technology-55164324)

Gebru's paper, which was first accepted for publication but then unexpectedly withdrawn by Google without a clear explanation, discussed many ethical issues in NLP that we will try to cover (at least partially) in today's tutorial.

**"If we can't talk freely about the ethical challenges posed by AI systems, we will never build ethical systems."**

*DeepMind research scientist Iason Gabriel (Twitter)*



## What is ethics?

**Ethics** is the set of moral principles that govern a person's behaviour or the conduct of an activity.
When we discuss ethics, we think about **how to live one's life**, what is **good** and what is **bad**, and how to **act responsibly** with due consideration of the **outcomes** of one's actions.


Professionals deal with questions of ethics all the time.
Think about:
- Engineers who develop weapons - is it ethical? Is it "good"?
- Medical ethics (e.g. Nazi medical experiments; the Tuskegee Syphilis Study: https://en.wikipedia.org/wiki/Tuskegee_Syphilis_Study)


Optional reads (some codes of ethics that IT professionals can refer to in their daily practice):

- Google's Code of AI ethics: https://ai.google/principles
- IBM code of AI ethics: https://www.ibm.com/watson/assets/duo/pdf/everydayethics.pdf

## Questions to ask ourselves when developing NLP applications

The Belmont Report (https://www.hhs.gov/ohrp/sites/default/files/the-belmont-report-508c_FINAL.pdf) outlines some important ethical principles and guidelines for research involving **human subjects**. 

**REFLECT AND WRITE (1)**: Why are we talking about human subjects when we talk about Natural Language Processing? (1-2 sentences)

### Write your answer here

Below are some fundamental ethical principles from the Belmont Report that can and should be applied to NLP research, along with other AI and IT-related ethics principles:

 - **Respect for persons**: protecting the autonomy of all people and treating them with courtesy and respect and allowing for informed consent. Researchers must be truthful and conduct no deception.

Are we respecting the autonomy of the humans in the research (authors, labelers, other participants)?


 - **Beneficence**: the philosophy of "Do no harm" while maximizing benefits for the research project and minimizing risks to the research subjects.

Who could be harmed? By data or by prediction errors?

 - **Justice**: ensuring that reasonable, non-exploitative, and well-considered procedures are administered fairly and equally, with a fair distribution of costs and benefits among potential research participants.

Is the training data representative? Are we being fair to everyone?

Many NLP-specific studies emphasize the impact of NLP on **social justice** - equal opportunities for individuals and groups
(such as minorities) within society to access resources, get their voice heard, and be represented
in society. 

Two more important principles discussed in relation to AI in general and thus applicable to NLP:
- **transparency** ("AI should be designed for humans to easily perceive, detect, and understand its decision process" (IBM))
- **accountability** ("Every person involved in the creation of AI at any step is accountable for considering the system’s impact in the world, as are the companies invested in its development" (IBM))

Let's have a look at some cases and think about them in terms of ethics. Can these applications of language technology be harmful? In what ways and to whom? Can it be prevented and how?

## 1. Dangers of very large data in NLP

Language models (e.g. GPT-2, GPT-3, BERT, etc.) are often trained on huge amounts of data from the internet. The Common Crawl dataset is commonly used for training such models; it is “petabytes of data collected over 8 years of web crawling”.

First thought you may get: awesome! The more, the better! But think twice. 

When we produce language, what we say or write inevitably reflects our worldviews and biases.

Who posts on the Internet? Who uses Twitter and Reddit? What colour is their skin, what countries do they come from? Whose viewpoints are more represented and whose are less represented or ignored?
All this will eventually be encoded in your model, which will then be used for real-life applications that potentially impact lives.

We can start from the simple fact that Internet access is far from evenly distributed, so Internet data is skewed and overrepresents younger people from developed countries.
Models trained on "all-Web" data do not represent all of us equally and thus may make the world's inequality worse.

Let's see how worldviews and biases can be implicit in the data and as a result encoded in the models.

## 2. Gender bias encoded in NLP models

a. **Biased word embeddings**

Word embedding models are known to contain gender and racial bias. This is concerning because the use of biased models amplifies the biases already present in society. What society do we want to live in?
The word2vec model trained on Google News (a very widely used model!) produces reasonable analogies like "Paris to France = Tokyo to Japan" and extremely biased ones like:
**"computer programmer to man = homemaker to woman"**.
The male professions, according to the model, are
- maestro
- skipper
- protege
- philosopher
- captain
- architect
- financier
- warrior
- broadcaster
- magician

while female occupations are "homemaker", "nurse", "receptionist", "librarian". In this model, men do "carpentry" while women do "sewing". 
These embeddings reflect biases present in the broader society. 
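
To make the analogy test concrete, here is a minimal sketch of how such a query is typically answered with vector arithmetic and cosine similarity. The two-dimensional vectors are hypothetical toy values invented for illustration (they are not taken from the Google News word2vec model), and the bias is deliberately built into them so the mechanism is easy to see.

```python
import math

# Hypothetical 2-D toy vectors (NOT real word2vec values): axis 0 is a
# made-up "gender" direction, axis 1 a made-up "occupation" direction.
toy_vectors = {
    "man": (1.0, 0.0),
    "woman": (-1.0, 0.0),
    "programmer": (0.9, 1.0),
    "homemaker": (-0.9, 1.0),
    "teacher": (0.0, 1.0),
}

def cosine(u, v):
    """Cosine similarity between two 2-D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

def analogy(a, b, c, vectors):
    """Return the word closest to vec(b) - vec(a) + vec(c), excluding a, b and c."""
    target = tuple(
        vb - va + vc for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])
    )
    candidates = (w for w in vectors if w not in {a, b, c})
    return max(candidates, key=lambda w: cosine(vectors[w], target))

# "man is to programmer as woman is to ...?"
print(analogy("man", "programmer", "woman", toy_vectors))
```

Because the toy "programmer" and "homemaker" vectors lean towards opposite ends of the made-up gender axis, the query returns the stereotyped answer rather than the neutral "teacher". Real embeddings pick up comparable directions from the text they are trained on.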


**b. Biases in machine translation**



Gender bias arises in machine translation when the training data contains more examples featuring males than females, which results in better translations for male-featuring sentences.

Example: 
"The **doctor** told the nurse that **she** had been busy".

Think for a moment: who is "SHE", from the sentence structure? Doctor or nurse?

A human translator carrying out coreference resolution would understand that **she** refers to the doctor,
and correctly translate the entity to German as **die Ärztin**. An NMT model trained on a biased dataset in which most doctors are male might incorrectly default to the masculine form, **der Arzt**.

(Source: https://arxiv.org/pdf/2004.04498.pdf)

Another example: **The doctor had been busy** - this would likely be translated with a masculine
entity according to the model's bias.

**TRY IT OUT AND WRITE THE RESULTS (1-2 sentences) (2)**: Use Google Translate (https://translate.google.com/) to translate some sentences (make up a couple of your own) into a language of your choice, using the occupation names listed earlier as "masculine" or "feminine" (according to the word2vec model), and see whether they are translated with a masculine or feminine word.

### write your answer here

## 3. Racial bias in NLP models and data

### Case 1. Spell-checker and Black names

![](https://kottke.org/plus/misc/images/deborah-roberts-pluralism.jpg)

The piece above is part of a series called Pluralism by artist Deborah Roberts — it’s a collage of dozens of Black names marked as misspelled by Microsoft Word’s built-in spell checker.  
It is meant to make us think about the "neutrality of technology, how software is built, who builds it, and for whom it is designed".

https://kottke.org/20/10/the-spell-checkers-agenda

### Case 2: Twitter bot Tay

![](https://i.guim.co.uk/img/media/59900576343e3eb9c228925499c3d03a76b3a7cd/16_0_973_584/master/973.jpg?width=700&quality=85&auto=format&fit=max&s=66da60a6ab8773cb32b849774d9f0ff1)

![](https://cbsnews2.cbsistatic.com/hub/i/2016/03/24/82a90c5d-1ce6-4fa2-8649-9ed7e5186529/1812195fd16a02cac0cd9f0e7223730b/taytweet-resized.jpg)

In 2016, less than a day after Microsoft launched its new **AI bot Tay**, she had to be suspended from tweeting after posting a series of racist statements, including "Hitler was right I hate the jews."  The bot was designed to communicate with "18 to 24 year olds in the U.S" and "experiment with and conduct research on conversational understanding." It appears some of her racist replies were simply regurgitating the statements trolls tweeted at her. What happened? The ML system was learning from the conversation it had with people, and its vocabulary and the worldview would develop based on these conversations. 

Some of the infamous Tay tweets and their "learning" evolution are enacted in this video: https://www.youtube.com/embed/pc0rd_K22w8

And the whole story: https://www.youtube.com/watch?v=HsLup7yy-6I

When asked for comment, Microsoft sent this statement: "The AI chatbot Tay is a machine learning project, designed for human engagement. It is as much a social and cultural experiment, as it is technical. Unfortunately, within the first 24 hours of coming online, we became aware of a coordinated effort by some users to abuse Tay's commenting skills to have Tay respond in inappropriate ways. As a result, we have taken Tay offline and are making adjustments."



Why was it released to the public without a mechanism that would have protected the bot from such abuse, blacklisting contentious language? 
Could the situation have been prevented if Microsoft had filtered words like the n-word and "holocaust", for example?

**REFLECT AND WRITE (3):** What strategy can you think of that Microsoft could have taken to prevent such a disastrous situation?

### Write your answer here

## 4. Green NLP

![](https://miro.medium.com/max/2412/0*usrjviODaRsgJVFg.png)
With the compute and energy demands of many modern ML methods growing
exponentially, ML systems have the potential to significantly contribute to **carbon
emissions**: https://www.technologyreview.com/2019/06/06/239031/training-a-single-ai-model-can-emit-as-much-carbon-as-five-cars-in-their-lifetimes/

An important step for the NLP community: **calculate the energy consumption and carbon emissions and report them in papers and experiment reports**.

A calculator has been developed, including a generator producing a text to insert in a report as a **push for more transparency and accountability**: https://mlco2.github.io/impact/#home
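
As a rough illustration of the arithmetic behind such estimates, the sketch below multiplies hardware power draw by running time to get energy, then by the carbon intensity of the electricity grid. The function name, the optional `pue` (data-centre overhead) factor and all numbers are assumptions made up for this example; the linked calculator may use a different or more detailed method.

```python
def estimate_emissions_kg(power_watts, hours, carbon_intensity_kg_per_kwh, pue=1.0):
    """Estimate CO2-equivalent emissions (kg) for a training run.

    energy (kWh)         = power (kW) * hours * PUE
    emissions (kg CO2eq) = energy (kWh) * carbon intensity (kg CO2eq / kWh)
    """
    energy_kwh = (power_watts / 1000.0) * hours * pue
    return energy_kwh * carbon_intensity_kg_per_kwh

# Hypothetical run: one 250 W GPU for 100 hours on a grid that emits
# 0.4 kg CO2eq per kWh (all three values are invented for illustration).
print(estimate_emissions_kg(250, 100, 0.4))
```

The same run on a lower-carbon grid gives a proportionally smaller figure, which is why the choice of compute region matters as much as the hardware.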

**REFLECT AND WRITE (4) (1-3 sentences)**: Check out this webpage with the calculator and calculate carbon emissions for your imaginary ML project: https://mlco2.github.io/impact/#compute 
Scroll down to the "What you can do" section, read through the options, and let us know if you had thought about these aspects of ML before coming to today's tutorial :)

### write your answer here

## 5. Privacy

There is software marketed as a system “that can mitigate the effects of the COVID-19 pandemic across jail and prison facilities” by **alerting prison authorities to sickness-related conversations between inmates and the outside world**.
It scans prisoners' phone conversations, searching for relevant keywords. “It **automatically downloads, analyzes, and transcribes all recorded inmate calls**, proactively flagging them for review,” explains the product brochure, which also claims this “near real-time intelligence” can be used to identify sick inmates, help allocate personnel in understaffed prisons, and even prevent “COVID-19 related murder.”

Think about the following aspects of the technology:

 - Can it also be used to suppress news of the inmates' sickness? Or to retaliate against those raising alarms about prison conditions?
 - Does talking about COVID-19 on the phone necessarily mean that a person is infected with COVID-19?
 - How about non-COVID-related conversations (e.g. with inmates' attorneys)? Are there privacy issues?
 - Is voice recognition technology 100% accurate? Absolutely non-biased?


If you are interested, read more: https://theintercept.com/2020/04/21/prisons-inmates-coronavirus-monitoring-surveillance-verus/

## 6. Crowdsourcing ethics in NLP annotation

It is not uncommon to hear NLP researchers say something like "It cost me less than one hundred bucks to annotate this using Amazon Mechanical Turk!" But is this something that a researcher can be proud of?

**Amazon Mechanical Turk (MTurk)** is a crowdsourcing marketplace that makes it easier for individuals and businesses to outsource their processes and jobs to a distributed workforce who can perform these tasks virtually.
Using Turkers, one can produce language resources at a fraction of the standard price (e.g. US$0.05 to translate a sentence).

The mean wage of a Turker is below US$2/hour. There is a common assumption that the only people who would accept such a low-paid job are stay-at-home moms or US students with plenty of time to kill. However, research suggests that this is not the case: for example, 20% of Indian Turkers said they do the work to meet basic needs. For many Turkers it is a workplace, but this labour market does not offer a workers' union or any other protection from employers' wrongdoings.

Related paper (optional read): https://www.aclweb.org/anthology/J11-2010.pdf

## 7. Text generation ethics (GPT-2 & GPT-3)


You may have heard about GPT-2 and the more recent GPT-3, models trained on massive amounts of web text (GPT-3's training data includes a filtered version of Common Crawl) that generate text so well that it is often believed to be written by a human.

Try GPT-2 here: https://transformer.huggingface.co/doc/gpt2-large

Check out a piece written by GPT-3 (not a graded exercise but try generating something to check if it has implicit racist/gender bias!): https://drive.google.com/file/d/1qtPa1cGgzTCaGHULvZIQMC03bk2G-YVB/view

If a model can generate human-like text, who would be interested in using it?
Without ethics, one could easily:

- create high-quality spam, fake news, disinformation, propaganda
- generate content for trolling and abusive bots on social media

The creators of GPT-2 were hesitant to publicly release the model because they considered it "too dangerous". GPT-3 has not been released to the public.

Those who had access to the model report that it is biased (gender/race): it 
- tends "to associate occupations requiring higher levels of education (banker, professor, legislator) and those requiring more physical labor (mason, millwright, etc.) with males and occupations such as nurse, receptionist, midwife, and housekeeper with females".
- makes  "concerning associations with gender and adjectives"
![](https://matthewpburruss.com/assets/resources_GTP_2/gender_table2.png)
- propagates stereotypes existing in the society:
![](https://matthewpburruss.com/assets/resources_GTP_2/religion_table.png)
(Source: https://matthewpburruss.com/post/the-unethical-story-of-gpt-3-openais-million-dollar-model/)

**REFLECT AND WRITE (5)**: What's wrong with releasing GPT-3 to the public? Why are such models viewed as potentially dangerous?

### write your answer here

## 8. Unethical chatbots

In addition to the racist Tay, here is another case: https://www.theguardian.com/technology/2015/feb/12/randomly-generated-tweet-by-bot-investigation-dutch-police

"When Twitter user @jeffrybooks tweeted saying “I seriously want to kill people” at a fashion and cosmetics convention happening at Amsterdam, the Dutch police took the threat seriously."
The bot was a Markov chain generator, which uses a simple algorithm to create vaguely coherent sentences from a corpus of text. Oops!
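
To see why such a bot can produce alarming sentences without "meaning" anything, here is a minimal sketch of a word-level Markov chain generator. It is not the code of the bot in the story; the corpus, function names and random seed are invented for illustration.

```python
import random
from collections import defaultdict

def build_chain(text):
    """Map each word to the list of words that follow it in the corpus."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain

def generate(chain, start, max_words, rng):
    """Walk the chain from `start`, picking each next word at random."""
    output = [start]
    # Stop at the length limit or when the last word has no recorded successor.
    while len(output) < max_words and chain.get(output[-1]):
        output.append(rng.choice(chain[output[-1]]))
    return " ".join(output)

# Hypothetical toy corpus; a real bot would use a much larger text collection.
corpus = "I want to go home and I want to read and I seriously need to sleep"
chain = build_chain(corpus)
print(generate(chain, "I", 8, random.Random(0)))
```

Each word is chosen only from the words that followed the previous word somewhere in the corpus, so the output is locally plausible but has no intent behind it. Fragments from unrelated sentences can be stitched into a statement nobody ever wrote.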

# How can we use NLP for common good?

 - Fact-Checking/Fake News Detection
 - Studying Propaganda and Political Misinformation
 - Detecting bias (inc. law and justice applications)
 - Identifying and fighting toxicity/hate/abuse/suicidal behaviour (e.g. in social media and online conversations)

**Example 1**. Creating a dataset of anti-Asian hate speech on Twitter related to COVID and conducting a longitudinal analysis, which found that
- hate is contagious: nodes are highly likely to become hateful after being exposed to hateful content
- counterhate messages can discourage users from turning hateful in the first place
- hateful bots are more successful in attracting followers than counterhate bots

**Example 2**. Facebook has developed algorithms that spot warning signs in users' posts and the comments their friends leave in response. After confirmation by Facebook's human review team, the company contacts those thought to be at risk of self-harm to suggest ways they can seek help.

**REFLECT AND WRITE (6)**: Give one real-life or imaginary example of an ethical application of NLP.

### write your answer here


# Debiasing language data for training models

As an example of a debiasing strategy, we will have a look at several ways to eliminate gender bias in language data.

https://arxiv.org/pdf/1906.08976.pdf

1. **Data augmentation**

When a dataset contains a disproportionate number of references to one gender, we can create an augmented
data set that is identical to the original but biased towards the opposite gender, and then train a model
on the union of the original and gender-swapped sets.

The augmented data set is created using gender-swapping. The goal of data
augmentation is to debias predictions by training the model on a gender-balanced data set.

- for every sentence in the original data set, create that sentence’s gender-swapped equivalent (e.g. “He went to the park” vs “She went to the park”).
- apply name anonymization to every original sentence and its gender-swapped equivalent: replace all named entities with anonymized entities such as “E1” (e.g. after gender-swapping and anonymization, "Mary likes her mother Jan" becomes "E1 likes his father E2"). This removes gender associations with named entities in sentences.
- train the model on the union of the original data set with name-anonymization and the augmented data set.
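
The steps above can be sketched in a few lines. This is a toy illustration under strong assumptions: the swap table is tiny and hypothetical, names are supplied by hand instead of being found by a named-entity recogniser, and ambiguous words (for instance "her", which can correspond to "his" or "him") are mapped to a single counterpart.

```python
import re

# Tiny, hypothetical swap table; a realistic one needs many more pairs
# and a way to handle ambiguous words such as "her" (his / him).
SWAPS = {"he": "she", "she": "he", "his": "her", "her": "his",
         "mother": "father", "father": "mother"}

def gender_swap(sentence):
    """Replace each gendered word with its counterpart, keeping capitalisation."""
    def swap(match):
        word = match.group(0)
        swapped = SWAPS.get(word.lower(), word)
        return swapped.capitalize() if word[0].isupper() else swapped
    return re.sub(r"[A-Za-z]+", swap, sentence)

def anonymize(sentence, names):
    """Replace each known name with E1, E2, ... in order of first appearance."""
    mapping = {}
    def replace(match):
        word = match.group(0)
        if word in names:
            mapping.setdefault(word, f"E{len(mapping) + 1}")
            return mapping[word]
        return word
    return re.sub(r"[A-Za-z]+", replace, sentence)

def augment(dataset, names):
    """Return anonymized originals followed by their anonymized gender-swapped copies."""
    originals = [anonymize(s, names) for s in dataset]
    swapped = [anonymize(gender_swap(s), names) for s in dataset]
    return originals + swapped

for line in augment(["Mary likes her mother Jan"], {"Mary", "Jan"}):
    print(line)
```

A model would then be trained on the list returned by `augment`, which contains each sentence once in its original gender and once in the swapped gender.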

2. **Gender tagging**

Consider Machine Translation where mixing up the gender of the speaker can lead to inaccurate predictions. 
Current MT models predict the source to be male a disproportionate amount of the time: training sets are dominated by male-sourced
data points, so the models learn skewed statistical relationships and are thus more likely to predict the speaker to be male when the gender of
the source is ambiguous.

**Gender tagging** mitigates this by adding a tag indicating the gender of the source of the data
point to the beginning of every data point. 

E.g. “I’m happy” would change to “MALE I’m happy.” In theory, encoding gender information in
sentences could improve translations in which the gender of the speaker affects the translation (i.e.
“I am happy” could translate to “Je suis heureux” [M] or “Je suis heureuse” [F]), since English does
not mark the gender of the speaker in this case.
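
The tagging step itself is simple, as this minimal sketch shows. The function name and the "FEMALE" tag are assumptions for illustration (the text above only shows "MALE"), and the speaker's gender is assumed to be available as metadata for each sentence.

```python
def tag_with_gender(sentence, speaker_gender):
    """Prepend a speaker-gender tag to a source sentence."""
    # Hypothetical tag vocabulary; a real system defines its own tags.
    tags = {"male": "MALE", "female": "FEMALE"}
    return f"{tags[speaker_gender]} {sentence}"

# Hypothetical (sentence, speaker gender) pairs taken from metadata.
examples = [("I'm happy", "male"), ("I'm happy", "female")]
for sentence, gender in examples:
    print(tag_with_gender(sentence, gender))
```

The two tagged inputs now differ, so a translation model trained on tagged data has the information it needs to choose between "heureux" and "heureuse".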

Problems:

- can be expensive: knowing the gender of the source of a data point requires meta-information, and obtaining
this could be costly in terms of memory usage and
time. 
- MT models may need to be
redesigned to correctly parse the gender tags.

3. **Debiasing word embeddings**

https://www.youtube.com/watch?v=fg8ijSPHyx0 (11 min)