# ED 0: Introduction

Data science has the potential to be both beneficial and detrimental to individuals and/or the wider public. To help eliminate or mitigate any adverse effects, we must seek to understand the potential impact of our work and consider any opportunities that may deliver benefits for the public.

While there is no single definition of data science, it can be broadly thought of as the scientific, computational, and analytical methods used to process data and extract information, knowledge, and insights from it to inform decision-making (or to act in an automatic way). Data science is the study of extracting value from data – value in the form of:
- **insights**: A hypothesis, testable with more data 
- **conclusions**:
    - Prediction of a consequence
    - Recommendation of a useful action
    - Clustering that groups similar elements
    - Classification that labels elements in groupings
    - Transformation that converts data to a more useful form
    - Optimization that moves a system to a better state

Insights and conclusions often arise from **models**, which are abstractions of the real world. 
- The chain is: data in the world is represented by a model, the model yields a model conclusion, and the model conclusion corresponds to a world conclusion. Thus, data in the world results in a world conclusion.

Models that generate these conclusions may be:
- **clear box**: the model's logic is available for inspection by others
- **black box**: the model's logic is not available
- **opaque box**: a model whose operation is not comprehensible

As data science methods become more common within different fields, there are both opportunities and challenges for individuals working in data science. For example, managing **privacy**, **fairness**, and **bias** when working with people’s data can be difficult and complex when using algorithmic methods.




## Stages of responsible model development, debugging, understanding and deployment:

- **Data**: Consider whether data of sufficient integrity, size, quality, and manageability exists or could be obtained. 
- **Approach**: Consider whether there is a technical approach grounded in data, such as an analysis, a model, or an interactive visualization, that can achieve the desired result.
- **Dependability**: privacy protections and security.
- **Understandability**: the need to detail the causal chain underlying its conclusions, transparency, and reproducibility.
- **Focus**: well-specified objectives 
- **Tolerance**: possible unintended side effects if the objective is not quite right and the possible damage from failing to meet objectives
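- **ELSI** (ethical, legal, and social implications): consider the application holistically with regard to legality, risk, and ethical considerations. Many of these issues overlap with earlier items such as Dependability or clear objectives (Focus).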

### Some examples:


Some Data Science problems that may have ethical challenges: 
- Analyzing Speech Recognition: 
    - There is a **privacy** issue if speech recordings or transcripts are transmitted or stored. Therefore, many applications do speech recognition on-device without retaining recordings.
    - Machine-learned speech recognition models may perform poorly for subpopulations. Supporting these subpopulations is beneficial to all, and the balance of effort to do so is a **fairness** issue that must be considered.
- Music Recommendation: 
    - There are many ethical issues relating to the type of recommendations made and their impact on individual listeners, their community, and the creators/artists whose success may be at the mercy of these algorithms.
- Analyzing Protein Folding: 
    - ELSI issues related to protein folding are minimal, though applying that knowledge (e.g., in diagnosing or treating disease) will result in many challenges.
- Analyzing Healthcare Records: 
    - Health-related data is significantly regulated, as are study designs involving patient health records.
- More examples, with tables, are in the lecture slides.
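
The fairness point in the speech recognition example can be made concrete by measuring performance separately for each subpopulation. The snippet below is a minimal sketch, not a method from the course: the function name `accuracy_by_group`, the toy labels, and the group names `g1` and `g2` are all hypothetical, and plain accuracy stands in for the error metrics a real speech system would use.

```python
from collections import defaultdict

def accuracy_by_group(y_true, y_pred, groups):
    """Return the fraction of correct predictions for each group label."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    return {group: correct[group] / total[group] for group in total}

# Hypothetical toy data: four samples from group g1, two from group g2.
y_true = ["yes", "no", "yes", "no", "yes", "no"]
y_pred = ["yes", "no", "yes", "no", "no", "no"]
groups = ["g1", "g1", "g1", "g1", "g2", "g2"]

per_group = accuracy_by_group(y_true, y_pred, groups)
# Gap between the best-served and worst-served group.
gap = max(per_group.values()) - min(per_group.values())
print(per_group, gap)
```

A large gap between groups is a signal that the model serves some subpopulations worse than others; how much effort to spend closing it is the fairness question raised above.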


## The role of technology in society

The course of **human history** can be grouped into time periods separated by "revolutions":
- **Cognitive Revolution**: Humans developed language.
- **Agricultural Revolution**: Humans no longer needed to keep moving around; they could settle, stabilize, and create civilizations.
- **Scientific Revolution**: Changed the process of developing things.
- **Digital Revolution**: Concepts such as big data, machine learning, artificial intelligence, and data science are making a new revolution possible, the digital one, whose consequences may be as deep as or deeper than those of the previous ones. The digital revolution will change the way we live and our environment, and will be the most profound of all these revolutions, each of which changed life in its own time.

A Revolution is associated with a change, often of a technological nature, that causes the human species to change its way of life.

**Kranzberg's First Law**: Technology is neither good nor bad; nor is it neutral.
- By which he means that “technology’s **interaction** with the social ecology is such that technical developments frequently have environmental, social, and human **consequences that go far beyond the immediate purposes** of the technical devices and practices themselves, and the same technology can have quite **different results** when introduced into **different contexts** or under different circumstances.”

**Technologies are not ethically ‘neutral’**, for they
reflect the values that we ‘bake in’ to them with
our design choices, as well as the values which
guide our distribution and use of them. Technologies both reveal and shape what
humans value, what we think is ‘good’ in life and
worth seeking. 

Not only does technology greatly impact our
opportunities for living a **good** life, but its
**positive and negative impacts are often
distributed unevenly** among individuals and
groups. Technologies can create widely disparate
impacts, creating **‘winners’ and ‘losers’** in the
social lottery or magnifying existing inequalities:
- All revolutions have winners and losers, and we cannot change that. However, we can mitigate the impact on the losers of a revolution.

How do we ensure that access to the
enormous benefits promised by new
technologies, and exposure to their risks, are
distributed in the right way? This is a matter
of **ethics**.

## The role of Big Tech

The **three main corporate and industry logics** in **Big Tech** are **meritocracy, technological solutionism**, and **market fundamentalism**:
- **Meritocracy**: is an ideological framework that **legitimizes unequal distributions of wealth and power as arising from differences in individual abilities**
    - This has defined the modern subject: as autonomous and responsible for perpetual self-improvement.
    - The tech industry was founded on the myth that it is a **meritocratic segment where talents should be rewarded handsomely**. This meritocratic belief manifests in the idea that engineers are best at solving ethical issues that their products might create. 
    - Similarly, meritocratic logics place a strong emphasis on individual ethics rather than regulation and legislation.
    - Companies and teams try to come up with their own codes of ethics to stave off legislation.
- **Technological solutionism**: is the **belief that technology can solve social problems**, a belief reinforced by the financial rewards the industry has gained for producing technology that it believes solves those problems.
    - This logic leads to the creation of checklists, procedures, or evaluative metrics to ensure the design and implementation of ethical products.
    - The authors, however, point out that this approach is limited and problematic because it centers ethics in the practices of technologists, and not in the social worlds wherein technical systems are created.
- **Market Fundamentalism**: or market logics, refers to the idea that **companies are there to make money, and if ethics initiatives cut into the bottom line, companies should not pursue them.**
    - There is a belief that ethical initiatives are often costly and antithetical to corporate profits. Furthermore, if other companies across the industry do not apply similar ethical considerations to their products, one's own company should not either.
    - In the absence of a legal framework, implementing ethics initiatives might be a business problem rather than a solution. In other words, **the work of ethics owners in practice is constrained by what the market can allow**.

## Ethics and Algorithms

We will consider algorithms that are used to:
1. Turn data into evidence for a given outcome, which is used to
2. Trigger and motivate an action that may have ethical consequences

Steps 1 and 2 may be performed by semi-autonomous algorithms, such as ML, and this complicates the attribution of responsibility for the effects of the actions that an algorithm may trigger.

There are at least 5 types of **ethical concerns**:
- Epistemic factors: the relevance of the quality and accuracy of the data to the justifiability of the conclusions that algorithms reach and which may shape morally-loaded decisions affecting individuals, societies, and the environment.
    1. Inconclusive evidence
    2. Inscrutable evidence
    3. Misguided evidence
- Normative concerns: ethical impact of the algorithmically-driven actions and decisions, including lack of transparency of algorithmic processes, unfair outcomes, and unintended consequences. 
    4. Unfair outcomes
    5. Transformative effects

## Applied ethics Problems

DS/AI ethics concerns can be divided into 3 different time frames/areas:
- Short term/organization: What is the impact of [privacy, transparency, fairness] in my application?
- Medium term/society: How will the use of these applications [military use, medical care, justice, education] change the way we are organized as a society?
- Long term/humans: What are the ethical goals of these technologies?

# ED1

**Industry self-regulation** is the process whereby members of an industry, trade or sector of the economy monitor their own adherence to legal, ethical, or safety standards, rather than have an outside, independent agency such as a third party entity or governmental regulator monitor and enforce those standards.

## Data and Ethics

The combination of data analytics, a data-saturated and poorly regulated commercial environment, and the absence of widespread, well-designed standards for data practice in industry, university, non-profit, and government sectors has created a ‘perfect storm’ of **ethical risks**.

Thus no single set of ethical rules or guidelines will fit all data circumstances; ethical insights in data practice must be adapted to the needs of many kinds of data practitioners operating in different contexts.

In the context of data practice, the potential harms and benefits are real and ethically significant. But due to the more complex, abstract, and often widely distributed nature of data practices, as well as the interplay of technical, social, and individual forces in data contexts, the harms and benefits of data can be harder to see and anticipate.

In this respect, then, data has a broader ethical sweep than engineering of bridges and airplanes. Data practitioners must confront a far more complex ethical landscape than many other kinds of technical professionals...

## Ethical benefits of Data Practices:

- **Human Understanding**: Because data and its associated practices can uncover previously unrecognized correlations and patterns in the world, **data can greatly enrich our understanding of ethically significant relationships — in nature, society, and our personal lives.**
- **Social, institutional, and economic efficiency:** Once we have a more accurate picture of how the world works, **we can design or intervene in its systems to improve their functioning**. This reduces wasted effort and resources and improves the alignment between a social system or institution’s policies/processes and our goals
- **Predictive accuracy and personalization**: Not only can good data practices help to make social systems work more efficiently, but they can also be used to more precisely **tailor actions to be effective in achieving good outcomes for specific individuals, groups, and circumstances**, and to be more responsive to user input in (approximately) real time.

## Ethical HARMS of Data Practices:

- **Harms to Privacy and Security**: Thanks to the ocean of personal data that humans are generating today (or, to use a better metaphor, the many different lakes, springs, and rivers of personal data that are pooling and flowing across the digital landscape), most of us do not realize how **exposed our lives are**, or can be, by common data practices.
- **Harms to Fairness and Justice**: We all have a **significant interest in being judged and treated fairly**, whether it involves how we are treated by law enforcement and the criminal and civil court systems, how we are evaluated by our employers and teachers, the quality of health care and other services we receive, or how financial institutions and insurers treat us.
- **Harms to Transparency and Autonomy**: transparency is the **ability to see how a given social system or institution works**, and to be able to inquire about the basis of life-affecting decisions made within that system or institution. 

### Europe's GDPR: General Data Protection Regulation

The GDPR can be summarised in the following points:
1. It concerns **“Personal Data”**: Name, address, location, online identifier, health information, income, cultural profile, ...
2. Communication: Who gets the data, why, for how long? (No use for other ‘incompatible’ purposes. Use as long as necessary.)
3. Consent: Get clear informed consent
4. Access: Provide access to my data.
5. Right to be forgotten (not for research).
6. Right to explanation for contracts (& right to have a person decide).
7. Marketing: Right to opt out.
8. Legal: Maintain EU legislation when transferring data out.
9. Need for a “data protection officer” in your organisation.
10. Impact assessment prior to high-risk processing (new technology, personal information, surveillance, sensitive data).
 

## What is Ethics

**Ethics is the process of questioning, discovering, and defending your values, principles, and purposes in order to be able to decide what is right and what is wrong.**

> Interesting diagrams on slides 26/27

In order to make a good individual decision we take into account:
1. **Our knowledge**: our vision of the world
    - **Society: Moral** (our environment)
        - **Morality refers to an informal social framework of values, beliefs, principles, customs and ways of living**
        - Moral systems provide a set of answers to general ethical questions.
        - Morality is, in most of the cases, inherited (unconsciously) from family, community or culture.
            - Examples: Christianity, Stoicism, Buddhism, ...
        - Morality is applied as a matter of habit, without having to think.
    - **Philosophy: Law**
        - Laws are formal rules that govern how we behave as members of a society. 
        - They specify what we must do, and more frequently, what we must not do.
        - They create an enforceable standard of behavior.
        - Laws can be just or unjust, because they are subject to ethical assessment.
    - Science: our knowledge
    - Beliefs about reality
2. Our beliefs: necessary ingredients of a good individual decision. The most common approach is to consider that, rather than grounding our beliefs on a solid foundation, we assemble a collection of beliefs that hold together under a mutual (maybe unstable) knowledge attraction.
    - Purpose: our reason for being:
        - Leaving the world better than we found it
    - Principles: Lines we'll never cross
        - Treat other people the way you would like to be treated
    - Values: The things that are good:
        - Justice, knowledge, equality
        
 The role of ethics is not to be a soft version of the
 law, even if laws are based on ethical principles.
 The real application of ethics lies in
 challenging the status quo, seeking its deficits
 and blind spots.
 
You can take decisions exclusively based on **laws and morality**, but this should not be enough.
Ethics is a process of **reflection** that aims to answer this question: What should I do?
The answer is based on our **values, principles and purposes** rather than social conventions.
An ethical decision is based on conscious, rational reflection.

### Traditional Normative Ethics




Normative ethics studies how we ought to act, that is, what makes an action right or wrong. There are three traditional theories of what it means to be ethical:

- **Utilitarianism**: (J. Bentham)
    - Allows breaking the rules, depending on the **consequences**, if doing so is for the good of society.
    - From a utilitarian point of view, some discrimination against a minority can be justified.
- **Deontology**: (Kant) **Follows the rules** to the very end, even if that implies doing something that does not benefit society (e.g. the **Golden Rule: ‘Treat others how you want to be treated’**). An action is judged by whether the action itself is right or wrong under a set of rules, rather than by its consequences (**beliefs**). The rules are clear, e.g. *don't lie*, *don't kill*.
- **Virtue ethics**: (Aristotle) Does an action contribute to virtue? (**justice, honesty, responsibility, care**, etc.)

Pragmatic ethics comes when we do not follow one of these ideas strictly, and instead we understand them and combine them depending on the situation. 

#### Example:

*I, Robot* (Isaac Asimov) defines three laws intended to protect humanity from robots. This follows the **deontological** approach to ethics. Nevertheless, even when strictly following these rules, robots could hurt humanity. So there is no perfect approach to ethics.

1. A robot may not injure a human being or, through inaction, allow a human being to come to harm.
2. A robot must obey orders given it by human beings except where such orders would conflict with the First Law.
3. A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.


#### Traditional Ethics: Example 2

Suppose it is obvious that someone in need should be helped.
- A utilitarian will point to the fact that the consequences of doing so will maximize **well-being.**
- A deontologist will point to the fact that, in doing so the agent will be acting in accordance with a **moral rule** such as “Do unto others as you would be done by”.
- A virtue ethicist will point to the fact that helping the person would be charitable or **benevolent**.

### Political Philosophy

There are several theories about what is right and what is wrong in society.

- John Rawls (**Rawlsians**): when we make a decision, we must make it from **behind a veil of ignorance (and not from a personal point of view)**, i.e. as if we did not know which person in society we would turn out to be. Like **utilitarianism**, this reasoning looks at the **consequences** of actions, although Rawls proposed it as an alternative to utilitarian theory.
    - The assurance of basic necessities and the opportunity to do better would form the foundation for social and political justice and provide the ability for people to assert themselves.
- John Locke (**Libertarians**): individuality at the extreme, individualistic point of view. 
    -  A man had a right to live for himself and an **individual’s happiness cannot be prescribed by another man or any number of other men**
    -  Libertarianism holds that the basic moral concepts are **individual rights** and that the rights to be respected are noninterference rights. These generally fall under the heading of rights to life, to **liberty or to property**.
    -  For libertarianism, the only proper limit to one person's enjoyment of these rights is his or her duty to respect the similar rights of others.
- John Stuart Mill (**Utilitarians**): you should always make decisions from the point of view of the **greatest good for the greatest number**:
    - Utilitarian calculus opens up the possibility that in situations such as a pandemic, some people might justly be sacrificed for the greater good. It would benefit society to accept casualties.
- Michael Sandel (**Communitarians**): Everyone derives their identity from the broader community. **Individual rights count, but not more than community norms**. Justice cannot be determined in a vacuum or behind a veil of ignorance, but must be rooted in society (common good). **First the community, and then the individuals**.
- Marx: radical way of thinking about decisions and society. Emancipation is a key concept.


### Only west-centric values?

#### Example: Buddhism

Buddhism proposes a way of thinking about ethics based on the assumption that **all sentient beings want to avoid pain**. Thus, the Buddha teaches that an action is good if it leads to **freedom from suffering.**

Another key concept in Buddhism is compassion, or the desire and commitment to eliminate suffering in others. 

> The idea that we can change the world by doing things is an illusion. Buddhism holds that the only effective thing we can do is to be good personally; by serving as a reference or example, we lead other people to be better too. The only way to do good is by the power of example.



#### Canonical views of AI ethics?

Interesting diagram on slide 45

### Ethics approaches

- The **normative approach to ethics focuses on how the world should be**. 
- The **positive approach to ethics describes the world as it is**.

### An alternative approach to ethics

#### Positive approach

The **positive approach** to ethics describes the world as it is. It is about **how humans judge situations and decisions in different scenarios**. When we judge decisions, we follow different criteria when considering people and machines:
- For instance, empirical work has shown that people exhibit **algorithmic aversion**, a bias where people tend to reject algorithms even when they are more accurate than humans.

In recent decades, psychologists have discovered **five moral dimensions** that humans consider when judging situations:
- **Harm**, which can be physical or psychological
- **Fairness/liberty**, which is about biases in processes and procedures
- **Loyalty**, which ranges from supporting a group to betraying a country
- **Authority**, which involves disrespecting elders or superiors, or breaking rules
- **Purity**, which involves concepts as varied as the sanctity of religion or personal hygiene.

These five dimensions define a space where we, humans, decide what is right and what is wrong.

When judging situations we consider these factors. The weight we give to each of them changes depending on whether we are judging a person or a machine. 
- Findings suggest that people judge machines based on the observed outcome, but judge humans based on a combination of outcome and intention.
- Humans are judged more positively than machines in autonomous driving scenarios.
- Humans are judged more harshly than machines in other scenarios (e.g. plagiarism).



# Exam:

We should know the general terms: normative, utilitarian, deontological

## Assignment 1

Clearly the given dataset is biased. What we would like to have is a balanced dataset. To balance class A and class B, we oversample class B to get 6-3 additional samples. Oversampling can be as simple as repeating samples from the dataset; alternatively, we can reweight the dataset (this is the more interesting way to do it). There are several ways to do this implemented in Python (reweighting works better).
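
The two options can be sketched in plain Python. This is only an illustration under assumptions, not the assignment's solution: the class sizes (6 samples of A, 3 of B) are read off the "6-3" above, the helper names `oversample` and `class_weights` are hypothetical, and the weighting formula is the inverse-frequency heuristic `n_samples / (n_classes * class_count)`, the same one scikit-learn uses for `class_weight="balanced"`.

```python
import random
from collections import Counter

def oversample(rows, labels, seed=0):
    """Repeat randomly chosen minority-class rows until every class
    has as many samples as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, lab in zip(rows, labels) if lab == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels

def class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * c) for cls, c in counts.items()}

# Assumed toy dataset: rows 0-5 belong to class A, rows 6-8 to class B.
labels = ["A"] * 6 + ["B"] * 3
rows = list(range(9))

new_rows, new_labels = oversample(rows, labels)
weights = class_weights(labels)
print(Counter(new_labels))
print(weights)
```

With reweighting the dataset itself is left unchanged; each sample simply counts in proportion to its class weight, so both classes contribute the same total weight during training.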

To select the columns, we need an expert who knows what information each column provides.

This is all regarding statistical methods. 