<table align="left" style="border-style: hidden" class="table"> <tr> <td class="col-md-2"><img style="float" src="http://prob140.org/assets/icon256.png" alt="Prob140 Logo" style="width: 120px;"/></td><td><div align="left"><h3 style="margin-top: 0;">Probability for Data Science</h3><h4 style="margin-top: 20px;">UC Berkeley, Spring 2018</h4><p>Ani Adhikari</p></div></td></tr></table><!-- not in pdf -->

# Homework 5 #

Your homework has two components: A (written work only) and B (also involving code). Each question or subpart is labeled accordingly. Written work should be completed on paper, and coding questions should be done in the notebook. It is your responsibility to ensure that your homework is submitted completely and properly to Gradescope. Refer to the bottom of the notebook for submission instructions.

#### Rules for Written Homework ####

- Every answer should contain a calculation or reasoning. For example, a calculation such as $(1/3)(0.8) + (2/3)(0.7)$ is fine without further explanation or simplification. If we want you to simplify, we'll ask you to. But just ${5 \choose 2}$ is not fine; write "we need 2 out of the 5 frogs and they can appear in any order" or whatever reasoning you used. Reasoning can be brief and abbreviated, e.g. "product rule" or "not mut. excl."
- You may consult others but you must write up your own answers using your own words, notation, and sequence of steps.
- In the interest of saving trees, you do not need to *solve* each question on a new piece of paper. Folding the paper to show just the relevant problem will suffice. To ensure the correct page size, we recommend placing the folded part on a blank page before scanning, or adjusting the page settings on your phone scanning app.
- You will submit a scanned PDF to Gradescope. **Each question should *start* on a new PDF page. No page should contain two questions.**

#### Rules for Coding ####

- Do not share, copy, or allow others to copy your code. You may discuss your approach and relevant methods or functions to use.
- A code cell (which may contain starter code) is provided for each question or subpart that requires coding. You are free to add additional cells as needed.
- You will submit a PDF to Gradescope. See the bottom of the notebook for more instructions.
- Here is the Prob140 documentation [guide](https://probability.gitlab.io/prob140/index.html) for your reference.

In [2]:
# Run this cell to set up your notebook

import numpy as np
from scipy import stats
from datascience import *
from prob140 import *

# These lines do some fancy plotting magic
import matplotlib
%matplotlib inline
import matplotlib.pyplot as plt
plt.style.use('fivethirtyeight')

# These lines suppress FutureWarning messages
import warnings
warnings.simplefilter('ignore', FutureWarning)

### 1. Panda's Problem ###

Every day, Panda the black-and-white cat comes to our house for food.
Assume that every day:

- We put the food out at the front door or at the back door, according to whether a $p$-coin lands heads or tails.

- Panda arrives at the door at which it found the food the previous day; if the food is not there, Panda is disappointed and trudges to the other door to eat.
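
One way to sanity-check a written answer numerically is to simulate the routine described above. The sketch below is not part of the assignment. The function name `simulate_panda`, the `seed` argument, and the door Panda tries on the first day are assumptions made for illustration (the problem does not say how the process starts), and heads is taken to mean the front door.

```python
import random

def simulate_panda(p, days, seed=0, first_door="front"):
    """Estimate the fraction of days on which Panda is disappointed.

    p: chance that the food is put at the front door on any given day
    days: number of days to simulate
    first_door: door Panda tries on day 1 (an assumption, see the text above)
    """
    rng = random.Random(seed)
    panda_door = first_door
    disappointed = 0
    for _ in range(days):
        food_door = "front" if rng.random() < p else "back"
        if food_door != panda_door:
            disappointed += 1
        # Panda always ends up eating at food_door and returns there tomorrow.
        panda_door = food_door
    return disappointed / days

print(simulate_panda(0.5, 100_000))
```

A simulated fraction only approximates the long run proportion, so it can support a calculation but does not replace the reasoning asked for in the parts below.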

**a)** Set up a four-state Markov Chain and find the long run expected proportion of days when Panda is disappointed.

**b)** Suppose that yesterday Panda arrived at the front door and was not disappointed. What is the chance of the same thing happening today? What is your best guess for the chance of the same thing happening (Panda arriving at the front door and not being disappointed) one year from now, assuming that the process continues as described?

**c)** Panda's strategy is to remember where the food was the previous day, and go to that door. Here are three other strategies that Panda might use:

- Always go to the front door.

- Always go to the back door.

- Remember where the food was the previous day, and go to the other door.

Compare each of these strategies to the strategy Panda uses: for what values of $p$ do these result in a lower expected proportion of days of disappointment?

#newpage

### 2. Jump Up, Fall Down ###
Consider a Markov Chain with state space $0, 1, 2, \ldots, 12$ and transition behavior given by:
- For $0 \le i \le 11$, the distribution of $X_{n+1}$ given $X_n = i$ is uniform on $i+1, i+2, \ldots , 12$.
- $P(12, 0) = 1$.
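
To get a feel for the chain before building its transition matrix, it may help to generate a sample path by hand. The sketch below uses only the Python standard library rather than `prob140`; the helper names `next_state` and `sample_path` are hypothetical and chosen for illustration.

```python
import random

def next_state(i, rng):
    """One step of the chain: uniform on i+1, ..., 12 for i <= 11, and 12 -> 0."""
    if i == 12:
        return 0
    return rng.randint(i + 1, 12)  # randint includes both endpoints

def sample_path(start, steps, seed=0):
    """Return the list [X_0, X_1, ..., X_steps] with X_0 = start."""
    rng = random.Random(seed)
    path = [start]
    for _ in range(steps):
        path.append(next_state(path[-1], rng))
    return path

print(sample_path(0, 15))
```

Each printed path climbs until it reaches 12 and then drops back to 0, which is the "jump up, fall down" behavior in the title.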

**a) [CODE]** Complete the cell below to construct the transition matrix of this chain and assign it to the name `jump_fall`.

In [None]:
#Answer to 2a

s = np.arange(...)
def transition_probs(i, j):
    ...
jump_fall = MarkovChain....
jump_fall

**b)** Explain why this chain is irreducible and aperiodic.

**c) [CODE]** Write code that uses `jump_fall` and `prob140` methods so that the final line evaluates to the long run expected proportion of time the chain spends at 0.

In [None]:
#Answer to 2c
...

### 3. Switching Chain ###

Consider a Markov Chain $X_0, X_1, \ldots $ with the transition matrix given below, for some $0 < p < 1$ and $q = 1-p$.

|     | $~~0~~$ | $~~1~~$ |
|-----|-----|-----|
| $~~0~~$ | $~~p~~$ | $~~q~~$ |
| $~~1~~$ | $~~q~~$ | $~~p~~$ |

**a)** For $n \ge 1$, let $C_n$ be the number of *switches* up to time $n$. That is, $C_n$ is the number of times the chain changes state up to and including time $n$. For example, if the path is 0 0 0 1 0 0 0 1 1, then $C_8 = 3$ (remember that the path starts at $X_0$). What is the distribution of $C_n$, and why?
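
The definition of a switch can be checked against the example path with a few lines of Python. This is only an illustrative sketch; the function name `count_switches` is hypothetical.

```python
def count_switches(path):
    """Number of times consecutive states differ; path[0] is X_0."""
    return sum(1 for prev, cur in zip(path, path[1:]) if prev != cur)

example_path = [0, 0, 0, 1, 0, 0, 0, 1, 1]  # X_0 through X_8
print(count_switches(example_path))          # 3, matching C_8 = 3 above
```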

**b)** Fill in the blank with a word:

For $n \ge 1$, 
$$
P_n(0, 0) ~ = ~ P(C_n \text{ is } \underline{ ~~~~~~~~~~~~~~~ })
$$

**c)** Now find $P_n(0, 0)$ using Part **b**. [Hint: Compare the expansions of $(p+q)^n$ and $(p-q)^n$. How can you use both of them to get just the terms that you need?] 

**d)** Use Part **c** (not the balance equations) to find the stationary distribution of the chain.

### 4. Doubly Stochastic Matrices ###

As you know, transition matrices are stochastic: each row sums to 1. A transition matrix is *doubly stochastic* if each of its columns also sums to 1. That is, for every $j$ in the state space $S$,

$$
\sum_{i \in S} P(i, j) = 1
$$
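
As a quick illustration of the definition, the sketch below checks row and column sums for a matrix given as a list of rows. The function name `is_doubly_stochastic` and the tolerance `tol` are assumptions made for this example, and the value $p = 0.3$ is arbitrary. The function assumes a square matrix with non-negative entries and does not verify either condition.

```python
import math

def is_doubly_stochastic(matrix, tol=1e-9):
    """True if every row and every column of a square matrix sums to 1."""
    n = len(matrix)
    rows_ok = all(math.isclose(sum(row), 1.0, abs_tol=tol) for row in matrix)
    cols_ok = all(
        math.isclose(sum(matrix[i][j] for i in range(n)), 1.0, abs_tol=tol)
        for j in range(n)
    )
    return rows_ok and cols_ok

p = 0.3        # arbitrary illustrative value
q = 1 - p
switching = [[p, q], [q, p]]   # the matrix from Exercise 3

print(is_doubly_stochastic(switching))                  # True
print(is_doubly_stochastic([[0.5, 0.5], [0.9, 0.1]]))   # False: columns sum to 1.4 and 0.6
```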

**a)** An irreducible, aperiodic Markov Chain with state space $0, 1, 2, \ldots, N$ has a doubly stochastic transition matrix. What is the stationary distribution of the chain? Prove your answer. [Hint: Look at the transition matrix in Exercise **3** and also the answer to **3d**.]

**b)** For $n \ge 1$, let $S_n$ be the total number of spots on $n$ rolls of a die. For large $n$, approximately what is the probability that $S_n$ is a multiple of 7? [Think about remainders.]

### 5. Wet Professor ###

A professor has two umbrellas, each of which could either be in her office or in her car. The professor walks from her car to her office; she also walks from her office to her car. Assume that on each of these walks:

- It rains with probability 0.7, independently of all other walks. 

- If it is not raining, the professor ignores the umbrellas.

- If it is raining, she uses an umbrella if there is one, and gets wet if there isn't.
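
The rules above can be simulated directly, which gives a numerical check on a written answer. The sketch below is one possible reading of the setup: the walks alternate between the two locations, and both umbrellas start where the professor is. That starting position, the function name `simulate_walks`, and its parameters are assumptions made for illustration.

```python
import random

def simulate_walks(rain_prob, walks, seed=0, umbrellas_here=2, total=2):
    """Fraction of walks on which the professor gets wet.

    umbrellas_here: umbrellas at her starting location (assumed: both)
    total: total number of umbrellas she owns
    """
    rng = random.Random(seed)
    wet = 0
    for _ in range(walks):
        umbrellas_there = total - umbrellas_here
        if rng.random() < rain_prob:
            if umbrellas_here > 0:
                umbrellas_there += 1   # she carries one umbrella across
            else:
                wet += 1               # raining and no umbrella at hand
        # After the walk, the destination becomes her current location.
        umbrellas_here = umbrellas_there
    return wet / walks

print(simulate_walks(0.7, 100_000))
```

Tracking only the number of umbrellas at her current location is a design choice that keeps the state small; the number at the other location is always `total` minus that count.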

**a)** In the long run, what is the expected proportion of walks on which the professor gets wet?

#newpage

### 6. Bernoulli-Laplace Model for Diffusion ###

There are 4 black balls and 6 white balls distributed in two containers such that each container always has 5 balls. Suppose that the balls move back and forth between containers as follows:

- At every step, one ball is selected at random from each container, independently of the other choice; then these two balls are switched. That is, each chosen ball is taken out of its container and placed in the other one.

For $n \ge 0$, let $X_n$ be the number of black balls in Container 1 at time $n$. Then $X_0, X_1, \ldots $ is a Markov Chain.
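
A single step of the swap can be sketched in plain Python to make the mechanics concrete. This is an illustration only and does not use `prob140`; the names `bl_step`, `bl_path`, `BLACK_TOTAL`, and `PER_CONTAINER` are hypothetical.

```python
import random

BLACK_TOTAL = 4      # black balls overall
PER_CONTAINER = 5    # balls in each container

def bl_step(x, rng):
    """One swap, where x is the number of black balls in Container 1."""
    # Container 1 holds x black balls out of 5.
    from_1_is_black = rng.random() < x / PER_CONTAINER
    # Container 2 holds the remaining 4 - x black balls out of 5.
    from_2_is_black = rng.random() < (BLACK_TOTAL - x) / PER_CONTAINER
    # Container 1 loses the ball drawn from it and gains the one from Container 2.
    return x - int(from_1_is_black) + int(from_2_is_black)

def bl_path(start, steps, seed=0):
    """Return the list [X_0, X_1, ..., X_steps] with X_0 = start."""
    rng = random.Random(seed)
    path = [start]
    for _ in range(steps):
        path.append(bl_step(path[-1], rng))
    return path

print(bl_path(0, 20))
```

The two draws use separate random numbers because the balls are chosen independently, one from each container.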

**a) [CODE]** Create a `MarkovChain` object `BL` whose transition matrix is that of the chain, and use it to find the long run expected proportion of steps in which all of the black balls are in Container 1.

**b) [CODE]** For every step $n$ at which $X_n$ is an odd number, a gambler wins $\$X_n$. That is, whenever $X_n = 1$ the gambler wins $\$1$ and whenever $X_n = 3$ the gambler wins $\$3$. When $X_n$ is not an odd number, the gambler doesn't win anything. Find the gambler's long run expected average winnings per step.

**c)** Is the chain reversible? Explain your answer.

In [None]:
#Answer to 6a

In [None]:
#Answer to 6b

## Checklist

Your submission should have the following parts:

#### Part A (Written)

- 1a, 1b, 1c
- 2b
- 3a, 3b, 3c, 3d
- 4a, 4b
- 5a
- 6c

#### Part B (Code)

- 2a, 2c
- 6a, 6b

## Submission Instructions


#### Logistics 

1. Use a scanner or scanning app (such as CamScanner) to digitize your written assignments. Do not take pictures using your phone's camera app.
2. For code portions, examine the generated pdf before uploading to make sure that it contains all of your work.
3. When submitting to Gradescope, select the pages of your upload corresponding to each question. 
4. If you encounter any difficulties when submitting or exporting your assignment, please make a private Piazza post **before the deadline**. 

### **We will not grade assignments which do not have pages selected for each question, are illegibly scanned, or are submitted after 8PM.** 


#### Part A (Written)
- Make sure you have at least 6 pages of homework. Each problem should start on a new page; for example, Problem 1 on page 1, Problem 2 on page 2, etc.
- Scan all the pages into a PDF. **Make sure the PDF page size is 8.5 x 11 inches**. It is your responsibility to check that all the work on the scanned pages is legible. You can use any scanner, or a phone with a scanning app such as CamScanner. Save the PDF.
- Upload the scanned PDF of your work onto Gradescope for the assignment "HW_05A". 
Refer to [this guide](http://gradescope-static-assets.s3-us-west-2.amazonaws.com/help/submitting_hw_guide.pdf) for detailed instructions about scanning and submitting, or consult course staff.

#### Part B (Code)

1. **Save your notebook using File > Save and Checkpoint.**
2. Run the cell below to generate a pdf file.
3. Download the pdf file and confirm that none of your work is missing or cut off.
4. Submit the assignment to "HW_05B" on Gradescope. Use the entry code "9GEKKD" if you haven't already joined the class.

In [None]:
import gsExport
gsExport.generateSubmission("hw05.ipynb")