# Activity 1 - core probability concepts

**GOALS [NOTION]**:
* familiarity with core probability lingo
  * mathematical notation
  * events and probability distributions
  * sum rule (_a.k.a._ union of events)
  * conditional probability
  * product rule (_a.k.a._ chain rule)
  * independence
* random variables [r.v.]
  * probability mass function
  * cumulative distribution function
  * expected value and variance
  * discrete vs continuous r.v.'s
  * joint probability
  * marginal probability

## 1.1 there is reason behind!

Probability theory arises as the _optimal_ way to measure plausibility quantitatively. There - we said it.

When we talk about the probability of an event, there's always a set of assumptions that go with it - and speaking of probabilities wouldn't ever make sense otherwise.

So let's consider the setting for this (sub-)activity, and the assumptions that are implicit in it - making them _explicit enough_:
* you have a deck of cards standing on top of a table;
* the deck is clearly wasted, and it is the first time you take a look at it;
* you know - or some friend just told you - that the deck is full: it has the 52 cards;
* you cannot resist peeking at the first one.

Now:
1. What probability _are you willing_ to attribute to picking the ace of spades? (the implicit _pretense_ here is that _you_ are willing to be _rational_ and have no reason to believe one card is _more likely_ to show up than any other!)
2. Let's hold on to that pretense; say $A$ represents the event that you pick an ace - any ace - what's $P(A)$?
3. Say $B$ represents the event that you pick a spades card - any spade... - what's $P(B)$?
4. What's the probability that you either get an ace or a spade? - or, _in other words_, what's $P(A\cup B)$?
5. Write down the sum rule's equation for the events $A$ and $B$, and use it to arrive at the same result.
6. Now say your friend peeked at the card and didn't show it to you; instead, she told you it is indeed a spades card (our suspicion was right all along!) - what's the probability that it is a number card?
7. Say $N$ represents the event that you pick a number card (and that neither face cards nor aces count as numbers) - how would you represent the _expression_ implicit in the last question with some beautiful probability lingo/jargon/symbols/...hieroglyphs?
8. If your friend had instead told you it was an even number, what would you then _assign_ for the probability of it being a spade?
9. Is it any different than the probability of being a spade without any differentiating prior knowledge about the card?
10. Are the two events - $E$ for drawing an even number, and $B$ for drawing a spade - independent?
11. Write down the chain rule's equation for the events $E$ and $B$ and use it to support your answer.
12. What's the value of $P(A|E)$?
13. Is $A$ independent from $E$?
14. What's the value of $P(E\cap B|N)$?
15. How would you make an _equivalent_ description in English for that mathematical expression?
16. Now say you not only want to peek at the first card, but at the first two; say also that $N_1$ and $N_2$ represent the events that the first and second cards are numbers, respectively. What's $P(N_1, N_2)$?
17. Following the same "subscripting nomenclature" (- let's call it that), what's $P(E_2)$ - the probability of finding an even number card on the second draw?
18. What's $P(E_2,N_1)$? - is it the same as $P(N_1, E_2)$? - what about $P(N_2, E_1)$? (- try first to _interpret_ what that statement means in English)
19. What's $P(N_1|E_2)$? - fun (use the rules - it's sometimes better than thinking-itself! - cof... cof!)
20. Tricky question alert: why is it fun? - is it the question; is it the result? - compare it with $P(N_2|E_1)$.
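
For checking answers like the ones above, a minimal sketch of "probability by counting" may help. It assumes the 52 cards are equally likely; the names `DECK`, `prob`, `is_king` and `is_heart` are made up for this illustration, and the events used (kings and hearts) are deliberately not the ones from the questions.

```python
from fractions import Fraction
from itertools import product

# Assumed encoding: 2-10 are number cards, J/Q/K are the face cards (figures), A is the ace.
RANKS = ["A"] + [str(n) for n in range(2, 11)] + ["J", "Q", "K"]
SUITS = ["spades", "hearts", "diamonds", "clubs"]
DECK = list(product(RANKS, SUITS))  # 52 (rank, suit) pairs, assumed equally likely


def prob(event, given=lambda card: True):
    """P(event | given), computed by counting cards in the deck."""
    conditioned = [card for card in DECK if given(card)]
    favourable = [card for card in conditioned if event(card)]
    return Fraction(len(favourable), len(conditioned))


def is_king(card):
    return card[0] == "K"


def is_heart(card):
    return card[1] == "hearts"


# Sum rule: P(K or H) = P(K) + P(H) - P(K and H)
union = prob(lambda c: is_king(c) or is_heart(c))
both = prob(lambda c: is_king(c) and is_heart(c))
print(union, prob(is_king) + prob(is_heart) - both)

# Product rule: P(K and H) = P(K | H) * P(H)
print(both, prob(is_king, given=is_heart) * prob(is_heart))
```

Defining your own predicates for $A$, $B$, $N$ and $E$ and passing them to `prob` should let you verify the single-card questions; the two-card questions need pairs of cards instead of single ones.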

## 1.2 there might be a simulation!

Let's now consider an event of a more numerical nature - on the land of random variables (henceforth _r.v._).

Say you have a 6-faced dice:
* it seems very well balanced - perfectly and symmetrically polished (over all of its _main_ axes);
* (surprisingly enough -) you are concerned with the number of dots on the face that ends up upwards (you could instead be concerned with the number of times it would hit the floor!);
* with enough patience, you can throw it any number of times you wish, and register the result.

Since _we_ don't actually have a dice, or at least nowhere to throw it together, let's simulate this one with `python`'s `random` module.
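
As a starting point, a single throw could be sketched like this. The function name `throw_once` and the seed are arbitrary choices for the illustration; `random.Random(...).randint(a, b)` returns an integer between `a` and `b`, both included.

```python
import random

rng = random.Random(42)  # seeded generator, only so that reruns give the same values


def throw_once():
    """One throw of a 6-faced dice: an integer from 1 to 6, each assumed equally likely."""
    return rng.randint(1, 6)


print([throw_once() for _ in range(5)])
```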

1. Just for (re)starts: what probability would you attribute to each outcome of the dice (- each face landing upwards)?
2. Say $X$ is the r.v. associated with the number of dots on the face of the dice landing upwards, what's its sample space?
3. What's the probability mass function of $X$?
4. What's $P(X>4)$?
5. What's the cumulative distribution function of $X$?
6. What's $P(X>4|X \text{ is prime})$? (- say your same friend has thrown it; didn't show you yet again; and tells you it's a prime)
7. Say you are to make multiple throws, and $X_1$ represents the outcome of the first one, $X_2$ the outcome of the second, and so on, and so on. Do you have _reason_ to _believe_ $X_1$ is independent of $X_2$?
8. What would then be $P(X_2 = 3|X_1 = 3)$?
9. What about $P(X_2 = 3, X_1 = 3)$?
10. Do you have reason to believe that they - $X_1$ and $X_2$ (or, for that matter, $X_3$ and $X_i$) - are _identically distributed_?
11. Say you always answered in the affirmative (- welcome friend!) - use `python` to simulate multiple throws of our dice: make a function that, given a number of throws `n`, returns a random sequence of outcomes of the `n` throws - or, _in other words_, the observed values of $(X_1, ..., X_n)$, _a.k.a._ the sample $(x_1, ..., x_n)$. Then, check that it works.
12. Use that function to plot the absolute frequencies of each outcome on sequences of 10, 100, and 1000 throws (that's 3 plots).
13. Describe (in English) what you observe on the plots as `n` gets bigger.
14. Does it match what you expected it to be? (- the sweet yes or no)
15. What's the expected value for the distribution associated with $X$, $\mathbb{E}[X]$?
16. What's its variance, $\mathbb{V}[X]$? - you may always use `python` as a calculator!
17. Compare the answers of the last 2 questions with the _sample average_ and _sample variance_ (respectively) of some simulated 100 throws.
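
One possible way to approach the simulation questions above is sketched below, under the assumption of independent and identically distributed throws of a fair dice. The helper names (`throw_dice`, `absolute_frequencies`) are made up here, and the plotting itself is left out: only the counts that a bar plot would show are computed.

```python
import random
from collections import Counter
from fractions import Fraction
from statistics import mean, variance

FACES = range(1, 7)


def throw_dice(n, rng=random):
    """Outcomes of n independent throws of a fair 6-faced dice."""
    return [rng.randint(1, 6) for _ in range(n)]


def absolute_frequencies(sample):
    """Number of times each face shows up (0 for faces never seen)."""
    counts = Counter(sample)
    return {face: counts[face] for face in FACES}


# Exact moments from the definitions, with P(X = x) = 1/6 for every face.
p = Fraction(1, 6)
expected = sum(x * p for x in FACES)
exact_var = sum((x - expected) ** 2 * p for x in FACES)

rng = random.Random(1)  # arbitrary seed, for reproducibility only
for n in (10, 100, 1000):
    print(n, absolute_frequencies(throw_dice(n, rng)))

sample = throw_dice(100, rng)
print("exact :", float(expected), float(exact_var))
print("sample:", mean(sample), variance(sample))  # variance() divides by n - 1
```

If `matplotlib` is available, each frequency dictionary `freqs` can be drawn with `plt.bar(list(freqs), list(freqs.values()))`.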

## 1.3 or a continuous one!

Now, let's consider an event of a more continuous (and still - numerical) nature - the land of continuous r.v.'s.

Consider a casual setting for a dystopian (yet rational!) dream:
* there is only one way to travel from Lisbon to Porto: it is a bus that passes once a day;
* the bus is known to depart from Lisbon at any given instant of the day - no one knows in advance;
* there is no evidence that favours a specific time-period for departure - it's _uniformly random_.

For ease of computation, say we represent the time of the day at which the bus departs with a r.v. $X$ with sample space $[0, 1[$.
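
To make that representation concrete, here is a small sketch of the mapping between clock times and fractions of the day. The helper names `day_fraction` and `clock_time` are hypothetical; `random.random()` draws a value uniformly from $[0, 1[$.

```python
import random


def day_fraction(hours, minutes=0):
    """Map a clock time to the fraction of the day elapsed, a value in [0, 1[."""
    return (hours * 60 + minutes) / (24 * 60)


def clock_time(x):
    """Inverse mapping: fraction of the day -> (hours, minutes), rounded down."""
    total_minutes = int(x * 24 * 60)
    return divmod(total_minutes, 60)


print(day_fraction(12))    # noon
print(clock_time(0.75))    # three quarters of the day

x = random.random()        # one simulated departure, uniform on [0, 1[
print(x, clock_time(x))
```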

1. What's the probability that the bus departs after noon? - or, _in our words_, what's $P(X>0.5)$?
2. What's the probability that the bus departs _exactly_ at 12h00, 0 seconds, 0 nanoseconds (- and 0-on until the very ultimate atomic unit of time!, whatever _that_ may be)? - or, to be more succinct, what's $P(X=0.5)$?
3. What's the relation between $P(X>0.5)$ and $P(X\geq 0.5)$? - _i.e._, are they the same? - if not, which is higher?
4. What's the probability that the bus departs between 8h00 and 18h00?
5. What about $P(\frac{1}{3} < X < \frac{3}{4} | X > \frac{1}{2})$? - use both the intuition on the interpretation of the mathematical expression, and the product rule, to arrive at the (same!) result.
6. What are the expected value and variance of $X$?
7. Use `python`'s `random` to simulate a sequence of actual departures of the bus in a given year (with 365 days, say).
8. Make a histogram of the sample results with 24 bins.
9. How do you interpret the visualization? - are the (descriptive) statistics represented in line with your expectations?
10. Now assume that you knew for a fact (you are dreaming - and with all your certainties!) that you would fall asleep - at some point on the trip - with 80% probability if the bus departed between 00h00 and 6h00, and with 30% probability otherwise; say $S$ represents whether you do fall asleep or not (1 if so, 0 otherwise); we could write this whole statement as $P(S=1| X < \frac{1}{4}) = 0.8$, and $P(S=1|X\geq \frac{1}{4}) = 0.3$. What's the probability that you fall asleep, $P(S=1)$ (if you were to sit at the station waiting for the bus to depart)?
11. What's the probability that you do not fall asleep, if the bus departs before 8h00 (and again, you're sure to catch it)?
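
The simulation questions above could be sketched as follows, assuming departures are independent from day to day. Only the 24 bin counts are computed (one bin per hour), not the plot, and the Monte Carlo estimate of $P(S=1)$ is meant as a check against whatever you derive by hand. The seed, the number of trials and the name `falls_asleep` are arbitrary choices.

```python
import random

rng = random.Random(2024)  # arbitrary seed, for reproducibility only

# One departure per day, as a fraction of the day in [0, 1[.
departures = [rng.random() for _ in range(365)]

# Counts for a 24-bin histogram: bin h collects the departures within hour h.
counts = [0] * 24
for x in departures:
    counts[int(x * 24)] += 1
print(counts)


def falls_asleep(x):
    """Draw S given the departure time x, using the two stated conditional probabilities."""
    p_sleep = 0.8 if x < 0.25 else 0.3
    return 1 if rng.random() < p_sleep else 0


trials = 100_000
estimate = sum(falls_asleep(rng.random()) for _ in range(trials)) / trials
print(estimate)  # Monte Carlo estimate of P(S = 1)
```

The estimate can be compared with the law of total probability, $P(S=1) = P(S=1|X<\frac{1}{4})P(X<\frac{1}{4}) + P(S=1|X\geq\frac{1}{4})P(X\geq\frac{1}{4})$. With `matplotlib`, `plt.hist(departures, bins=24, range=(0, 1))` draws the histogram directly.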

# Activity 2 - prevailing distributions

**GOALS [NOTION]**:
* for each distribution,
  * get acquainted with its typical use-cases
  * know the parameters of each distribution (their sufficient statistics)
  * how to make use of it to model events and r.v.
* use `python`'s `scipy.stats` module to work with the distributions
* identify which distribution to use to model a specific event

Uff... so, no matter whether you managed to do all of the previous exercises (already) - we now go to the zoo:
the zoo of probability distributions - the famous ones, the classics, the special types,
the ones people recognize, talk about, and use for the most varied purposes: more concretely, for modelling real world phenomena.



We'll explore specific use-cases of some of the most famous distributions (not necessarily the most prevalent!),
and it is important that you understand each one's typical use-case for modelling.

1. What's the name of the type of distribution that we used on the previous activity for the dice and the bus?

## 2.1 bernoulli - the success!

The classic bernoulli use-case is an event that can only have two outcomes,
one is then considered a success, the other one a failure (with hashtag no judgments).

Consider an imagined traffic light in front of your imagined door-step:
* you know it always spends 10 seconds on green light and 15 on the red one - with no other colors;
* you _know_ that because you're obsessed with it and have measured it twenty thousand times - and now have no reason to _believe_ otherwise, despite past skepticism;
* say you walk out of your house and rush to the traffic light to observe its color at the first instant you see it.
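
The setting above can be simulated with a short sketch. It assumes that the instant at which you first see the light is uniformly distributed over one 25-second cycle, and (arbitrarily) that each cycle starts on green; `observe_light` is a made-up name.

```python
import random

GREEN_SECONDS = 10
RED_SECONDS = 15
CYCLE = GREEN_SECONDS + RED_SECONDS


def observe_light(rng):
    """Return 1 if the light is green at a uniformly random instant of the cycle, else 0."""
    instant = rng.uniform(0, CYCLE)              # assumption: uniform arrival over the cycle
    return 1 if instant < GREEN_SECONDS else 0   # assumption: the cycle starts on green


rng = random.Random(7)  # arbitrary seed
observations = [observe_light(rng) for _ in range(10_000)]
print(sum(observations) / len(observations))  # relative frequency of green
```

With `scipy` installed, `scipy.stats.bernoulli.rvs(p, size=10_000)` draws the same kind of 0/1 sample directly from a chosen `p`.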

1. What's the probability that you will see a red light?
2. Say we want to model this event with a Bernoulli r.v., $X$, with the success representing seeing the green light and the failure the red one - thus writing $X\sim \text{Bernoulli}(p)$. What's then the value of $p$?
3. Is the expected value of $X$ a possible outcome of the r.v.?
4. 

## 2.2 binomial - the many successes!

* introduce quantiles

## 2.3 geometric - at last, a success...

## 2.4 poisson - counting on a given time

https://towardsdatascience.com/poisson-distribution-intuition-and-derivation-1059aeab90d

## 2.5 exponential - waiting for a given time

https://towardsdatascience.com/what-is-exponential-distribution-7bdd08590e2a

## 2.6 normal - the very special one

## 2.7 all-in-one

Let's mix things up a little. Make sure to use `scipy.stats` profusely, and take note of which distributions are appropriate for each question.

Consider your future-self with even more `python` skills on your hands:
* your future-self grabbed the data on all of your (then) past emails, both the ones flagged spam and the ones not;
* yours truly realized that 15% of the emails you will have received were flagged (by your _provider_) as spam - either fairly or not (!);
* to then realize that the average number of emails you'll have received per hour was 4.2;
* and that the average number of characters on the title was 19.4, with a standard deviation of 3.5.
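
To have something to experiment with, the findings above can be turned into a hypothetical inbox generator. This is only a sketch under assumptions that the text does not state: emails arrive as a Poisson process (exponential waiting times), spam flags are independent of everything else, and title lengths are roughly normal. The name `simulate_inbox` is made up.

```python
import random

P_SPAM = 0.15                      # share of emails flagged as spam
EMAILS_PER_HOUR = 4.2              # average arrival rate
TITLE_MEAN, TITLE_STD = 19.4, 3.5  # title length, in characters


def simulate_inbox(hours, rng):
    """Return a list of (arrival_hour, is_spam, title_length) tuples."""
    emails, clock = [], 0.0
    while True:
        clock += rng.expovariate(EMAILS_PER_HOUR)  # assumed exponential waiting time
        if clock >= hours:
            return emails
        is_spam = rng.random() < P_SPAM            # assumed independent flags
        title_length = max(1, round(rng.gauss(TITLE_MEAN, TITLE_STD)))
        emails.append((clock, is_spam, title_length))


rng = random.Random(3)  # arbitrary seed
hours = 24 * 30         # thirty days of hypothetical email
inbox = simulate_inbox(hours, rng)
print(len(inbox) / hours)                              # emails per hour
print(sum(spam for _, spam, _ in inbox) / len(inbox))  # share of spam
```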

Your future-self is then very interested in a couple of questions concerning the future's future:
1. What would you say is the probability that the next email is [flagged as] spam?
2. With what you know so far, is it _reasonable to assume_ that, for the next emails, one being spam or not is independent of what the others are? Argue it out.
3. Say you assume so, and consider the next 20 emails. What is the probability that none of them is spam?
4. What's the probability that at most 3 of them are spam? - which probability distribution would you use to help answer that question?
5. Say $X$ represents the number of spam emails that you get out of the 20. Write down its expression and parameters' values.
6. What's the probability that you'll get between 2 and 5 spam emails?
7. What's the most probable number of spam emails to get?
8. What's the _expected number_ of them? (allowing for a small abuse of language for basically asking what's $\mathbb{E}[X]$) - is it a number of emails that you can actually get - _i.e._, a _natural number_?
9. Let's now forget about considering only 20 emails. What's the probability that the first spam email you get comes after 5 non-spam ones?
10. What's the most probable number of non-spam emails to get until you get the first spam one?
11. Considering your future-self's initial findings, what probability would you attribute to getting 20 emails in the next week? - which probability distribution would you use to help answer that question? - do you need to make any further assumptions in order to use it?
12. What's the probability that you get fewer than 2 spam emails in the next day?
13. What's the probability that, in the next 3 days, you get more than 10 spam emails?
14. What's the probability of going a whole day without emails?
15. What's the probability of going more than 3 days without spam emails?
16. What's the probability that you wait less than an hour for a non-spam email?
17. Now say we are concerned with how long a title may be.
18. ...
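
As a cross-check for whatever you compute with `scipy.stats`, the textbook formulas of the distributions in this activity fit in a few lines of standard-library `python`. The function names below are made up, the `scipy.stats` counterparts are named in the docstrings, and the parameters in the example are deliberately not the ones from the questions.

```python
from math import comb, exp, factorial
from statistics import NormalDist


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p); cf. scipy.stats.binom.pmf(k, n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def geometric_pmf(k, p):
    """P(first success on trial k), k = 1, 2, ...; cf. scipy.stats.geom.pmf(k, p)."""
    return (1 - p) ** (k - 1) * p


def poisson_pmf(k, mu):
    """P(X = k) for X ~ Poisson(mu); cf. scipy.stats.poisson.pmf(k, mu)."""
    return exp(-mu) * mu**k / factorial(k)


def exponential_cdf(t, rate):
    """P(T <= t) for T ~ Exponential(rate); cf. scipy.stats.expon.cdf(t, scale=1 / rate)."""
    return 1 - exp(-rate * t)


def normal_cdf(x, mu, sigma):
    """P(X <= x) for X ~ Normal(mu, sigma); cf. scipy.stats.norm.cdf(x, loc=mu, scale=sigma)."""
    return NormalDist(mu, sigma).cdf(x)


# Illustration with made-up parameters.
n, p = 10, 0.5
print(sum(binomial_pmf(k, n, p) for k in range(4)))  # P(X <= 3): the cdf evaluated at 3
print(geometric_pmf(3, 0.25))                        # two failures, then a success
print(poisson_pmf(0, 2.0))                           # no events when 2 are expected
print(exponential_cdf(1.0, 2.0))                     # waiting at most 1 time unit
print(normal_cdf(0.0, 0.0, 1.0))
```

Note that `scipy.stats.geom` counts trials up to and including the first success, so "5 non-spam emails before the first spam one" corresponds to `k = 6` in that convention.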



# Activity 3 - the central (limit) theorem

simulate some emails on whether they are spam or not - and get the average - see where it leads
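
A minimal sketch of that plan, assuming independent spam flags with the 15% rate from activity 2: average `n` simulated flags, repeat many times, and look at how the averages spread as `n` grows. The number of repetitions and the seed are arbitrary.

```python
import random
from statistics import mean, stdev

P_SPAM = 0.15


def sample_mean(n, rng):
    """Average of n simulated spam flags (1 = spam, 0 = not spam)."""
    return sum(1 if rng.random() < P_SPAM else 0 for _ in range(n)) / n


rng = random.Random(11)  # arbitrary seed
for n in (10, 100, 1000):
    means = [sample_mean(n, rng) for _ in range(1000)]
    theory = (P_SPAM * (1 - P_SPAM) / n) ** 0.5  # standard deviation of the sample mean
    print(n, round(mean(means), 4), round(stdev(means), 4), round(theory, 4))
```

A histogram of `means` for each `n` should look more and more bell-shaped, which is where the central limit theorem comes in.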

# Activity 4 - testing hypotheses

have some past data generated - test whether spam emails have longer titles
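
One way this could look is sketched below with a permutation test, which is just one option (a two-sample t-test would be another). The data are entirely hypothetical: the means of 21 and 19 characters and the group sizes are invented for the sketch, not taken from anywhere.

```python
import random

rng = random.Random(5)  # arbitrary seed

# Hypothetical past data: title lengths for spam and non-spam emails (invented parameters).
spam_titles = [rng.gauss(21.0, 3.5) for _ in range(150)]
ham_titles = [rng.gauss(19.0, 3.5) for _ in range(850)]


def average(values):
    return sum(values) / len(values)


observed = average(spam_titles) - average(ham_titles)


def permutation_p_value(a, b, observed_diff, rounds, rng):
    """One-sided p-value: share of label shuffles with a difference >= the observed one."""
    pooled = a + b  # new list, so the original samples are left untouched
    hits = 0
    for _ in range(rounds):
        rng.shuffle(pooled)
        diff = average(pooled[:len(a)]) - average(pooled[len(a):])
        if diff >= observed_diff:
            hits += 1
    return hits / rounds


p_value = permutation_p_value(spam_titles, ham_titles, observed, 2000, rng)
print(round(observed, 3), p_value)
```

The null hypothesis here is that the spam label carries no information about title length, so shuffling the labels should produce differences as large as the observed one fairly often; a small `p_value` speaks against that.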

# Activity 5 - A/B testing

say you are a marketing company making publicity emails - 
trying to test how to better fool the email provider into thinking your publicity is not spam
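
A possible skeleton for such an experiment is a two-proportion z-test on the share of emails that were not flagged, for two variants of the same campaign. Everything numerical below is hypothetical (the 80% and 85% rates and the sample size are invented), and `two_proportion_z_test` is a made-up helper implementing the usual pooled-variance formula.

```python
import random
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Two-sided z-test for equal proportions, using the pooled estimate."""
    p_a, p_b = successes_a / n_a, successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical campaign: each email is "not flagged" with an invented probability.
rng = random.Random(9)  # arbitrary seed
n = 2000
not_flagged_a = sum(rng.random() < 0.80 for _ in range(n))  # variant A
not_flagged_b = sum(rng.random() < 0.85 for _ in range(n))  # variant B
print(two_proportion_z_test(not_flagged_a, n, not_flagged_b, n))
```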