Paper Summary - Causality for Machine Learning #33

NicolaBernini opened this issue Apr 18, 2020 · 1 comment

NicolaBernini commented Apr 18, 2020

Overview

Causality for Machine Learning

arXiv: https://arxiv.org/abs/1911.10500


| Key | Value |
| --- | --- |
| Type of Contribution | Theory |
| Objective | Learning Causal Relationships instead of just Correlations |

NicolaBernini commented Apr 18, 2020

Key Points

1. Intelligence

  • Key problem: defining intelligence
  • There is no unique definition, even for humans, let alone for AI
  • Some key aspects:
  1. Intelligence as the ability to generalize
  • from seen to unseen data
  • from one task to another task
  2. Intelligence as the ability to act in an imagined space (the definition of thinking, according to Konrad Lorenz)
  • Implicitly, in order to act in an imagined space it is necessary to learn to predict in that space, which under the hood means learning causal relationships

2. Data Driven Machine Learning

2.1 IID Assumption

  • Data Driven ML consists of learning models from data

  • How is the data generated, or more explicitly, what are the assumptions on the data?

  • Typically, ML methods rely on the assumption that data samples are Independent and Identically Distributed (IID), which means the following (both conditions are made concrete in the first sketch after this list)

    • they all come from the same PDF (Identically Distributed)
    • the sampling process is without memory (Independent)
  • What happens when these assumptions do not hold?

  • Typically performance drops, but sometimes the drop can be very sharp: consider, for example, Adversarial Attacks

  • Adversarial Attacks can be seen as the result of a violation of the "Identically Distributed" assumption: the PDF they come from is too distant from the training one (domain gap)

  • They can also be seen as a failure of the model to generalize properly (not intelligent enough)

  • They can also be seen as the result of model instability at certain points of the input space, where a small variation in the input causes a huge variation in the output (see the FGSM-style sketch after this list)

  • Adding the temporal dimension as well, Adversarial Attacks can be seen as a violation of the "Independent" assumption, since an attacker can resubmit the same sample over and over

  • Furthermore, considering training: since it is an iterative process, the weights at a given iteration depend both on the data observed at that iteration and on the weights from the previous iterations, so failing to properly shuffle the samples in the dataset could make the NN learn spurious correlations tied to the sample ordering (see the shuffling sketch after this list)
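To make the two halves of the IID assumption concrete, here is a minimal NumPy sketch (toy distributions chosen for illustration, not taken from the paper): one sequence satisfies both conditions, one violates independence because the draws have memory, and one violates identical distribution because the generating PDF drifts over time.

```python
import numpy as np

rng = np.random.default_rng(0)

# IID: every sample comes from the same distribution (identically distributed)
# and each draw ignores all previous ones (independent).
iid_samples = rng.normal(loc=0.0, scale=1.0, size=1000)

# Non-independent: an AR(1) process, where each sample depends on the previous
# one, so the sampling process has memory.
ar1 = np.zeros(1000)
for t in range(1, 1000):
    ar1[t] = 0.9 * ar1[t - 1] + rng.normal(scale=1.0)

# Non-identically distributed: the generating distribution drifts over time
# (a simple form of domain gap between early and late samples).
drift = rng.normal(loc=np.linspace(0.0, 3.0, 1000), scale=1.0)

# Lag-1 autocorrelation is close to 0 only for the IID sequence.
for name, x in [("iid", iid_samples), ("ar1", ar1), ("drift", drift)]:
    print(name, np.corrcoef(x[:-1], x[1:])[0, 1])
```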
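The instability reading above can be illustrated with the standard FGSM perturbation (Goodfellow et al.), a generic technique rather than anything proposed in this paper; `model`, `x`, `y`, and `epsilon` are placeholders for a trained classifier and an input/label pair.

```python
import torch
import torch.nn.functional as F

def fgsm_perturb(model, x, y, epsilon=0.03):
    """Return x plus a small perturbation aligned with the loss gradient."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    # Step in the direction that increases the loss the most, bounded by epsilon.
    x_adv = x + epsilon * x.grad.sign()
    return x_adv.detach()

# Usage sketch (assuming a trained classifier and a correctly classified sample):
# x_adv = fgsm_perturb(model, x, y)
# model(x) and model(x_adv) can disagree even though ||x_adv - x||_inf <= epsilon.
```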
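Finally, a minimal PyTorch sketch of the shuffling point: with `shuffle=True` the `DataLoader` re-permutes the sample order at every epoch, so the iterative weight updates do not repeatedly see the samples in one arbitrary fixed order; the dataset here is synthetic and purely illustrative.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic dataset, only for illustration.
X = torch.randn(1024, 10)
y = torch.randint(0, 2, (1024,))
dataset = TensorDataset(X, y)

# shuffle=True re-permutes the sample order at every epoch; with shuffle=False
# the weights at iteration t always see the samples in the same fixed order,
# which can bake an ordering artifact (e.g. data sorted by class) into training.
loader = DataLoader(dataset, batch_size=32, shuffle=True)

for epoch in range(2):
    for xb, yb in loader:
        pass  # forward / backward / optimizer step would go here
```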
