# Programming Exercise: Posterior Probabilities

Just as we did dice_posterior right after dice_sample, now we'll be doing
site_posterior after site_sample.  If you haven't done site_sample, go do it!

The experimental setup is the same in this as in site_sample -- one of two models
(motif or background) is selected with a fixed but unknown probability. The
model is then used to generate a word (a sequence of a fixed length) and that
word is presented to you, the observer. Many words are generated in this way.
You must write code that calculates the posterior probability that each model
was used to generate each word, given the parameters and prior probabilities of
each model.

Before starting any coding, sit down with a pen and paper and
write the formula for the posterior probability of each sequence type as a
function of its prior probability, the probability distribution on bases for
each sequence type, and the bases actually observed.  Since there are only two 
possible models, the posteriors of these two must sum to one. To get the 
posteriors, you will use Bayes' rule. This will give you something of the form 
$\frac{x}{x + y}$.
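
As a reminder of where that form comes from, here is Bayes' rule for two competing models, written generically. The symbols $M_1$, $M_2$ (the two models) and $D$ (the observed data) are notation introduced here for illustration, not names from the assignment; filling in the likelihood terms for the motif and background models is the pen-and-paper step described above.

$$
P(M_1 \mid D) = \frac{P(D \mid M_1)\,P(M_1)}{P(D \mid M_1)\,P(M_1) + P(D \mid M_2)\,P(M_2)}
$$

Here $x = P(D \mid M_1)\,P(M_1)$ and $y = P(D \mid M_2)\,P(M_2)$, and swapping the roles of $M_1$ and $M_2$ gives $\frac{y}{x + y}$, so the two posteriors sum to one.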

# Input and Output

The first line of your function should be:

```python
def site_posterior(sequence: list[int],
                   sequence_model: SequenceModel) -> float:
```

**sequence** will be the list of bases observed in a draw, expressed in
integers, with the mapping as described in site_sample:

```raw
A <-> 0
C <-> 1
G <-> 2
T <-> 3
```
so a list of `[0,2,1,3]` would correspond to `["A", "G", "C", "T"]`.  The length of
`sequence` must be the same as the length of `site_base_probs`, which is an 
attribute of `sequence_model`.

Hint: Since the nucleotides are encoded as the integers 0-3, you can use them
as indices into the probabilities in `sequence_model`.
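
A minimal sketch of that indexing idea, using a plain nested list rather than a `SequenceModel` object (the variable names `per_position` and the loop structure are illustrative assumptions, and the numbers are the ones from the example call in this README):

```python
# Rows are positions in the site; columns are the bases A, C, G, T (0-3).
site_base_probs = [[.1, .4, .4, .1], [.1, .4, .4, .1], [.8, .1, .1, 0]]

# An observed word: C, G, A encoded as integers.
sequence = [1, 2, 0]

# Each integer base doubles as the column index at its position.
per_position = [site_base_probs[i][base] for i, base in enumerate(sequence)]

print(per_position)  # [0.4, 0.4, 0.8]
```

This only looks up the per-position probabilities; how they are combined with the priors is the part you work out yourself.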

**sequence_model** will be a SequenceModel object. 

<!--
You can reference the 
[SequenceModel object documentation](https://cse587a.github.io/cse587Autils/SequenceObjects/API/SequenceModel.html), and 
see [usage examples](https://cse587a.github.io/cse587Autils/SequenceObjects/Usage/SequenceModel.html),
in the cse587Autils documentation. 
-->

A call to `site_posterior()` will look like:

```python
>>> site_base_probs = [[.1, .4, .4, .1], [.1, .4, .4, .1], [.8, .1, .1, 0]]
>>> background_base_probs = [1/4, 1/4, 1/4, 1/4]
>>> sm = SequenceModel(0.35, site_base_probs, background_base_probs)
>>> site_posterior([1, 2, 0], sm)
0.8151939042420107
```

The return value should be a single number representing the posterior
probability that the sequence we drew was from a site bound by a transcription
factor.

# Instructions and Grading

As in dice_posterior, you should:

1. Work out the math for what you want to do first!  Seriously, it's so much
   easier that way.
1. Make sure you account for cases where the site or background prior is 0,
   or where some base in the sequence has a probability of 0.
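
To see why the second point matters, here is a small sketch of what happens to a naive $\frac{x}{x + y}$ computation when terms are zero. The function `naive_ratio` is a hypothetical helper written for this illustration, not part of the assignment code, and the sketch deliberately does not say what your function should return in the undefined case; check the tests for that.

```python
def naive_ratio(x: float, y: float) -> float:
    """Compute x / (x + y) with no guard against a zero denominator."""
    return x / (x + y)


# If only x is zero (e.g. a site prior of 0), the ratio is well defined.
print(naive_ratio(0.0, 0.01))  # 0.0

# If both x and y are zero, the ratio is undefined and Python raises.
try:
    naive_ratio(0.0, 0.0)
    both_zero_raised = False
except ZeroDivisionError:
    both_zero_raised = True

print(both_zero_raised)  # True
```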

As in the rest of the python modules, you are filling in code in the 
[assignment/assignment.py](assignment.py) file. Your code
will be evaluated against the tests in 
[assignment/test_assignment.py](test_assignment.py). Note
that when the tests are run for grading purposes, it is from a clean version of 
the test file. Any changes you make to the test file will not be reflected 
in the grading test run. Upload the entire repository to Gradescope to 
submit.