Make sure you fill in any place that says `YOUR CODE HERE` or "YOUR ANSWER HERE". As a reminder, there is **NO COLLABORATION** whatsoever on the final.

---
## Part A (3 points)

Chomsky's argument from the poverty of the stimulus is one of the most
famous and controversial claims in cognitive science. Many researchers
view formal results on learnability as supporting the argument.

<div class="alert alert-success">In one paragraph, briefly summarize the poverty of the stimulus
argument, and outline the implications of the argument for language
learning.</div>

YOUR ANSWER HERE

---
## Part B (2 points)

In analyses of language learnability, cases where one language is a subset of another are particularly challenging. Imagine that a learner is trying to decide between two languages. Language $L_1$ consists of all the grammatical sentences in English. Language $L_2$ consists of all of the grammatical sentences in English, plus "Furiously sleep ideas green colorless." We can express this relationship between the languages graphically:
<br />

![](images/subsetProblem.png)


<div class="alert alert-success">Assume that the learner only receives positive examples—sentences that are valid examples of $L_1$—and wants to choose between $L_1$ and $L_2$. Explain why this creates a problem for approaches to learning that view a language as a set and attempt to logically deduce that set from examples.</div>

YOUR ANSWER HERE

---
## Part C (2 points)

Now, assume that the learner views $L_1$ and $L_2$ as probability distributions over sentences. Under $L_1$, sentences are generated from the probability distribution observed in English. Under $L_2$, $c$ is the probability that the specific sentence "Furiously sleep ideas green colorless" is generated, and $1-c$ is the probability that a sentence is generated from the probability distribution observed in English. Thus, the probability of any particular sentence in English in $L_2$ is $(1-c)$ times the probability of that sentence in $L_1$ (in other words, if the sentence in English is $s$, then $P(s|L_2) = P(s|L_1)(1-c)$). The problem of learning a language can now be formulated as a problem of Bayesian inference. The hypotheses $h$ are languages, and the data $d$ are a set of sentences. The likelihood $P(d|h)$ is the probability of a set of sentences given a language. 
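The relationship between the two distributions can be made concrete with a toy example. The sketch below is only an illustration: the three-sentence dictionary `p_L1` and the helper `make_L2` are hypothetical stand-ins, not part of the exam code, and real English is of course not a three-sentence distribution. It shows that scaling every English sentence by $(1-c)$ and giving the extra sentence probability $c$ yields a valid distribution.

```python
# Hypothetical toy stand-in for the distribution over English sentences (L1).
p_L1 = {
    "the dog sleeps": 0.5,
    "ideas matter": 0.3,
    "green ideas sleep": 0.2,
}
ODD_SENTENCE = "Furiously sleep ideas green colorless"


def make_L2(p_L1, c):
    """Build L2: every L1 sentence scaled by (1 - c), plus the odd sentence with probability c."""
    p_L2 = {s: (1 - c) * p for s, p in p_L1.items()}
    p_L2[ODD_SENTENCE] = c
    return p_L2


p_L2 = make_L2(p_L1, c=0.001)

# Each English sentence is slightly less probable under L2 than under L1.
for s in p_L1:
    print(s, p_L1[s], p_L2[s])

# The probabilities under L2 still sum to 1 (up to floating-point error).
print(sum(p_L2.values()))
```

Because $L_2$ reserves some probability mass for a sentence that $L_1$ never produces, every sentence the two languages share is a little less likely under $L_2$.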

In this case we are interested in the <i>posterior odds ratio</i>, or how many times more likely $L_1$ is than $L_2$ given the data. Recall that the posterior odds are given by the equation:

$$\frac{P(L_1~|~d)}{P(L_2~|~d)} = \frac{P(d~|~L_1)P(L_1)}{P(d~|~L_2)P(L_2)}$$
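As a generic illustration of this equation, the following sketch computes posterior odds for any two hypotheses from their likelihoods and priors. The function name `posterior_odds` and the numbers used are made up for illustration and are unrelated to the languages above. The computation is done in log space, which is one common way to avoid underflow when likelihoods are very small.

```python
import math


def posterior_odds(lik_1, lik_2, prior_1=0.5, prior_2=0.5):
    """Posterior odds of hypothesis 1 over hypothesis 2.

    lik_1, lik_2     : P(d | h1) and P(d | h2), both assumed > 0
    prior_1, prior_2 : P(h1) and P(h2), both assumed > 0
    """
    # log[(lik_1 * prior_1) / (lik_2 * prior_2)], computed as a sum of logs
    log_odds = (math.log(lik_1) + math.log(prior_1)) - (
        math.log(lik_2) + math.log(prior_2)
    )
    return math.exp(log_odds)


# With equal priors, the posterior odds equal the likelihood ratio.
print(posterior_odds(0.02, 0.01))

# Unequal priors shift the odds toward the hypothesis favored a priori.
print(posterior_odds(0.02, 0.01, prior_1=0.2, prior_2=0.8))
```

Note that the normalizing constant $P(d)$ never appears: it is the same for both hypotheses and cancels in the ratio.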

<div class="alert alert-success"> Complete the function `getPostOdds` so that it computes the posterior odds ratio of $L_1$ over $L_2$ after observing $n$ consecutive English sentences. Assume that the sentences are generated independently from the distributions associated with the two languages, and that the probability of generating 'Furiously sleep ideas green colorless' under $L_2$, given by the argument $c$, is .001 by default. Assume that the two languages are equally likely <i>a priori</i>, i.e., $P(L_1) = P(L_2)$. </div>

In [None]:
def getPostOdds(n, c=.001):    
    """Compute the posterior odds ratio of L1 (all grammatical sentences 
    in English) / L2 (all grammatical sentences in English + 'Furiously 
    sleep ideas green colorless')
    
    Parameters
    ----------
    n : integer
        The number of observed sentences
    c : float
        The probability that 'Furiously sleep ideas green colorless' 
        is generated under L2. By default, c = .001
        
    Returns
    -------
    float
        The posterior odds ratio of L1 / L2 after observing n sentences
    
    """    
    # YOUR CODE HERE
    raise NotImplementedError()

In [None]:
print(getPostOdds(10))
print(getPostOdds(100))
print(getPostOdds(1000))
print(getPostOdds(10000))

In [None]:
# add your own test cases here!

In [None]:
from numpy.testing import assert_allclose

assert_allclose(getPostOdds(10, c=.01), 1.1057273553218807)
assert_allclose(getPostOdds(100, c=.05), 168.90381970677726)
assert_allclose(getPostOdds(1000, c=.005), 150.28625220462635)
assert_allclose(getPostOdds(1, c=.005), 1.0050251256281406)
assert_allclose(getPostOdds(10, c=.3), 35.40133174641438)

print("Success!")

---
## Part D (1 point)

<div class="alert alert-success">Look at the posterior odds ratio after 10, 100, 1000, and 10000 sentences. Which language were these sentences more likely to be drawn from? How does the posterior odds ratio change with the observation of additional sentences?</div>

In [None]:
print(getPostOdds(10))
print(getPostOdds(100))
print(getPostOdds(1000))
print(getPostOdds(10000))

YOUR ANSWER HERE

---
## Part E (2 points)

This is just one example of how the assumption that a language is a probability distribution and learners are provided with samples from that distribution can make it possible to learn the correct language in cases where other approaches fail. In contrast to other analyses of learnability, it is possible to show that under these assumptions even complex languages (such as those represented by probabilistic context-free grammars) can be learned from sufficiently large amounts of data.

<div class="alert alert-success">Briefly discuss how the probabilistic approach above relates to the poverty of the stimulus argument. In particular, how does this approach handle the subset problem? How does this differ from other formal approaches to learnability (e.g., Gold's Game)?</div>

YOUR ANSWER HERE

---

Before turning this problem in, remember to do the following steps:

1. **Restart the kernel** (Kernel$\rightarrow$Restart)
2. **Run all cells** (Cell$\rightarrow$Run All)
3. **Save** (File$\rightarrow$Save and Checkpoint)

<div class="alert alert-danger">After you have completed these three steps, ensure that the following cell has printed "No errors". If it has <b>not</b> printed "No errors", then your code has a bug in it and has thrown an error! Make sure you fix this error before turning in your exam.</div>

In [None]:
print("No errors!")