In [None]:
import numpy as np

Context-free grammars are one way that a small set of rules can produce a large number of sentences, and provide a rough approximation to the structure of human languages. You can generate strings in accordance with a context-free grammar in Python reasonably easily, by defining a function corresponding to each rule in the grammar. For example, the grammar beginning with the start symbol $S$ and defined by the rules

$$
\begin{eqnarray*}
S & \rightarrow & {\tt adjective\_phrase} \\
{\tt adjective\_phrase} & \rightarrow & nearly \ {\tt adjective\_phrase }\\
& | & nearly \ {\tt adjective} \\
{\tt adjective} & \rightarrow & finished
\end{eqnarray*}
$$

produces sentences of the form $(nearly)^n\ finished$ for any integer $n>0$, meaning sentences in which $nearly$ is repeated $n$ times, followed by $finished$.
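
For instance, taking $n = 3$, one possible derivation applies the recursive rule twice before terminating. The $\Rightarrow$ symbol for a single rewriting step is not used elsewhere in this notebook; it is introduced here only for illustration.

$$
\begin{eqnarray*}
S & \Rightarrow & {\tt adjective\_phrase} \\
& \Rightarrow & nearly \ {\tt adjective\_phrase} \\
& \Rightarrow & nearly \ nearly \ {\tt adjective\_phrase} \\
& \Rightarrow & nearly \ nearly \ nearly \ {\tt adjective} \\
& \Rightarrow & nearly \ nearly \ nearly \ finished
\end{eqnarray*}
$$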

The function `how_close` generates random sentences from this grammar, choosing at random when multiple rules could be applied. It uses the helper functions `adjective_phrase` and `noun` to represent the corresponding rules in the grammar (`noun` implements the ${\tt adjective}$ rule):

In [None]:
def how_close():
    # S --> adjective_phrase
    S = adjective_phrase()
    return S

def adjective_phrase():
    # generates a random number uniformly between 0 and 1
    p = np.random.rand()
    if p < 0.7:
        # adjective_phrase --> nearly adjective_phrase
        return "nearly " + adjective_phrase()
    else:
        # adjective_phrase --> nearly adjective (noun() implements the adjective rule)
        return "nearly " + noun()

def noun():
    # adjective --> finished
    return "finished"

You can try running this function a few times and seeing what it produces:

In [None]:
for i in range(10):
    print(how_close())
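
Because the recursive rule is chosen with probability 0.7 at every step, the number of times `nearly` appears follows a geometric distribution with mean $1/0.3 \approx 3.33$. The sketch below is one way to check this empirically. It is self-contained, so it uses a loop-based generator named `how_close_iterative` (a hypothetical name, not part of the assignment) that is intended to mirror `how_close`, together with a regular expression for the form $(nearly)^n\ finished$.

```python
import re
import numpy as np

def how_close_iterative(p_continue=0.7, rng=None):
    # Hypothetical loop-based counterpart of how_close(): every sentence
    # has at least one "nearly", and each further one is added with
    # probability p_continue.
    rng = np.random.default_rng() if rng is None else rng
    words = ["nearly"]
    while rng.random() < p_continue:
        words.append("nearly")
    words.append("finished")
    return " ".join(words)

# one or more "nearly ", followed by "finished"
pattern = re.compile(r"(nearly )+finished")

rng = np.random.default_rng(0)
samples = [how_close_iterative(rng=rng) for _ in range(1000)]

# every sample should have the form (nearly)^n finished with n > 0
all_match = all(pattern.fullmatch(s) is not None for s in samples)

# the average n should be close to 1 / 0.3
mean_n = np.mean([s.count("nearly") for s in samples])
print(all_match, mean_n)
```

The same regular-expression check can be applied to the output of `how_close` itself.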

----

## Part A (1 point)

Consider the following grammar beginning with the start symbol $S$ and with the rules

$$
\begin{eqnarray*}
  S & \rightarrow & {\tt phrase\_token} \\
  {\tt phrase\_token} & \rightarrow & {\tt phrase\_token} \ tic \ toc \\
  & | &  tic \ toc 
\end{eqnarray*}
$$

<div class="alert alert-success">
What is the set of sentences that is generated by this grammar? Express the "language" generated by this grammar as succinctly as
possible. Your answer should be in the same kind of form as our description of the sentences generated by the "nearly" grammar described above, which was $(nearly)^n\ finished$.
</div>

YOUR ANSWER HERE

----

## Part B (1 point)

In this problem, you are going to generate random sentences from a grammar. To generate random sentences, you'll need to use a random function from the `np.random` module:

In [None]:
np.random.rand?

This function returns a value that is uniformly distributed over the half-open interval $[0, 1)$:

In [None]:
np.random.rand()

For an example of how to use random numbers in generating random sentences from a grammar, see the `adjective_phrase` function defined above.
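
As a minimal sketch of the general pattern, comparing the value returned by `np.random.rand()` against a threshold selects one of two alternatives with a chosen probability. The helper name `choose_rule` and the labels `"first"` and `"second"` are hypothetical and serve only as an illustration:

```python
import numpy as np

def choose_rule(p_first):
    # Hypothetical helper: returns "first" with probability p_first,
    # otherwise "second".
    if np.random.rand() < p_first:
        return "first"
    else:
        return "second"

np.random.seed(0)  # fixed seed so the experiment is repeatable
choices = [choose_rule(0.7) for _ in range(10000)]

# the fraction of "first" choices should be close to 0.7
frac_first = choices.count("first") / len(choices)
print(frac_first)
```

In a grammar, each alternative would correspond to one of the rules that can expand the same symbol.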

<div class="alert alert-success">
Edit the function <code>tic</code> to generate <b>random</b> sentences from the grammar defined in Part A.
</div>

In [None]:
def tic():
    """
    Generates random sentences using the grammar in Part A. Your answer
    should include a single space after each `tic` and `toc`.

    Hint: your solution can be done in 5 lines of code (or fewer),
    including the return statement.

    Returns
    -------
    a string representing a sentence from the grammar in Part A

    """
    # YOUR CODE HERE
    raise NotImplementedError()

Test your function to check that it produces sentences which conform to the grammar above:

In [None]:
for i in range(10):
    print(tic())

In [None]:
# add your own test cases here!


In [None]:
"""Check that `tic` produces expected output."""
from nose.tools import assert_equal

outputs = []
for i in range(10):
    output = tic()
    outputs.append(output)

    # check that the output only consists of ' ', t, i, c, and o
    assert_equal(sorted(set(output)), [' ', 'c', 'i', 'o', 't'])

    # check that the output is a repeated string: splitting on the unit
    # 'tic toc ' (see the docstring of `tic`) must leave identical pieces
    assert len(set(output.split('tic toc '))) == 1

# check that the function doesn't produce the same thing every time
assert len(set(outputs)) > 1

print("Success!")

----

## Part C (1 point)

<div class="alert alert-success">
Edit the function <code>abncdn</code> to generate sentences of the form $(ab)^{n}(cd)^{n}$.
</div>

In [None]:
def abncdn(n):
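    """
    Generates the sentence of the form (ab)^n(cd)^n for a given n.

    Hint: your solution can be done in 1 line of code,
    including the return statement.

    Parameters
    ----------
    n : a positive integer, the number of repetitions of `ab` and of `cd`

    Returns
    -------
    a string consisting of `ab` repeated n times followed by `cd`
    repeated n times

    """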
    # YOUR CODE HERE
    raise NotImplementedError()

Test your function on a few different inputs to make sure it produces the expected output:

In [None]:
for i in range(1, 10):
    print(abncdn(i))

In [None]:
# add your own test cases here!
abncdn(2)

In [None]:
"""Check that `abncdn` produces the correct output."""
assert_equal(abncdn(1), "abcd")
assert_equal(abncdn(2), "ababcdcd")
assert_equal(abncdn(3), "abababcdcdcd")
assert_equal(abncdn(9), "abababababababababcdcdcdcdcdcdcdcdcd")
print("Success!")

In [None]:
"""Check that `abncdn` produces different length strings for different inputs."""
for i in range(1, 10):
    output = abncdn(i)
    assert_equal(len(output), i*4)

print("Success!")

----

## Part D (2 points)

<div class="alert alert-success">
Give a definition for each of the following terms, and explain what types of sentences can exist within each type of language:

<ul>
<li>regular language</li>
<li>context-free language</li>
<li>context-sensitive language</li>
</ul>

Please limit your response to 1-2 sentences per term.
</div>

YOUR ANSWER HERE