# Data Structures and Files
In this section, we will review the basics of Python, including loops, strings, file processing, and data structures.

For each problem in this handout, you'll implement the spec inside the function provided for that problem. You may assume we will only pass parameters of the specified type and that they are not None, but otherwise you should make no assumptions about the parameters other than what we specify in the problem description.

Solutions are in the section slides, which you can view after section.

## DNA Match Score
Write a function called `dna_match_score` that takes two strings representing a DNA sequence alignment and returns the score of the alignment. A string of DNA is composed of the characters `"A"`, `"C"`, `"G"`, and `"T"`, plus `"-"` to represent a gap in the alignment. The two strings will be the same length. You are guaranteed that the strings will be valid DNA alignments and that the two strings never both have `"-"` at the same position.

We will use a simpler scoring function than what is actually used in practice. Compare the characters that appear at the same index in both strings, score each index using the following rules, and add up the scores:
- If both characters match and are one of `"A"`, `"C"`, `"G"`, `"T"`, the score is +2.

- If both characters are one of `"A"`, `"C"`, `"G"`, `"T"` but they don’t match, the score is -1.

- If one character is one of `"A"`, `"C"`, `"G"`, `"T"` and the other is a gap `"-"`, the score is -2.

For example, consider that the sequences given are `"-ATGC"` and `"CATGT"`. In order to calculate the score, we need to match up the strings and compute the score for each pair of characters at matching positions between the two strings. The table below shows the score for each index in the strings. 

| Index | 0 | 1 | 2 | 3 | 4 |
| --- | --- | --- | --- | --- | --- |
| seq1 | - | A | T | G | C |
| seq2 | C | A | T | G | T |
| score | -2 | +2 | +2 | +2 | -1 |

At index 0, a character is matched with a gap, so the score is -2. For indices 1, 2, and 3, the score is +2 each, as A matches A, T matches T, and G matches G. For index 4, the score is -1, as the letters do not match and neither is a `"-"`. The score for the overall alignment comes from adding the index scores together, so we end up with -2 + 2 + 2 + 2 - 1 = 3.
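
As a hint for getting started, the sketch below walks the two example strings in lockstep with `zip` and labels each position according to the three rules. The helper name `classify_pair` is hypothetical (it is not part of the handout), and turning the labels into scores and summing them is left to you.

```python
def classify_pair(a, b):
    """Hypothetical helper: label one pair of aligned characters."""
    # The problem guarantees both characters are never "-" at once,
    # so a "-" on either side means exactly one gap.
    if a == "-" or b == "-":
        return "gap"
    if a == b:
        return "match"
    return "mismatch"


# zip pairs up the characters that sit at the same index in both strings.
labels = [classify_pair(a, b) for a, b in zip("-ATGC", "CATGT")]
print(labels)  # ['gap', 'match', 'match', 'match', 'mismatch']
```

The labels line up with the table: one gap at index 0, three matches, and one mismatch at index 4.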

In [None]:
def dna_match_score(seq1, seq2):
    """
    Returns the alignment score of two DNA sequences of equal length, where the score is based off
    of the number of matching (+2 points), non-matching (-1 points), and missing characters (-2 points). 
    
    >>> dna_match_score(
    ...    '-ATGC',
    ...    'CATGT'
    ...    )
    3
    >>> dna_match_score(
    ...    'ATGC',
    ...    'ATGC'
    ...    )
    8
    >>> dna_match_score(
    ...    '-AT',
    ...    'C-T'
    ...    )
    -2
    """
    ...


## Words by Letter
Write a function called `words_by_letter` that takes a string parameter, `file_name`, and returns a dictionary in which each key is a letter and each value is the number of words in the file that start with that letter. You should normalize the first letter of each word to be lowercase. If the file is empty, you should return an empty dictionary, `{}`.
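
As a hint, here is a minimal sketch of the file-processing half of the problem: reading a file line by line, splitting each line into words, and lowercasing each word's first letter. It assumes words are separated by whitespace, which the problem statement does not spell out. The file contents are invented for this sketch and are not one of the handout's test files. Building the dictionary of counts is left to you.

```python
import os
import tempfile

# Hypothetical file contents, made up for this sketch only.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
    f.write("The quick brown fox\njumps over the lazy dog\n")
    demo_path = f.name

first_letters = []
with open(demo_path) as f:
    for line in f:
        # split() with no arguments splits on any run of whitespace,
        # so blank lines simply produce no words.
        for word in line.split():
            first_letters.append(word[0].lower())

os.remove(demo_path)  # clean up the temporary file
print(first_letters)  # ['t', 'q', 'b', 'f', 'j', 'o', 't', 'l', 'd']
```

Note that "The" and "the" both contribute `'t'`, which is what the lowercase normalization is for.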

In [None]:
def words_by_letter(file_name):
    """
    Returns a dictionary containing letter-count pairs, where each count represents the number
    of words starting with a given letter in file_name.

    >>> words_by_letter(
    ...    "simple.txt"
    ...    )
    {'t': 3, 's': 2, 'i': 1}
    >>> words_by_letter(
    ...    "twister.txt"
    ...    )
    {'p': 24, 'a': 3, 'o': 4, 'i': 1, 'w': 1, 't': 1}
    """
    ...


## Count Divisible Digits
Write a function `count_divisible_digits` that takes two integers `n` and `m` as arguments and returns the number of digits in `n` that are divisible by `m`. If `m` is 0, then `count_divisible_digits` should return 0. For this problem, any digit in `n` that is 0 counts as divisible by any number. You may assume `m` is a single non-negative digit (0 ≤ `m` < 10).

**Do not** use `str` in any way to solve any part of this problem. Instead, you should solve this problem by manipulating the number itself using integer division and remainder. Here are some hints:

- `n // 10` evaluates to all but the last digit of `n` (when `n` is not negative).

- `n % 10` evaluates to the last digit of `n` (when `n` is not negative).
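
To see the two hints working together, the sketch below peels the digits off an integer one at a time. The helper name `digits_of` is hypothetical, not part of the handout. It calls `abs` first because Python's `//` rounds toward negative infinity and `%` takes the sign of the divisor, so `-204 // 10` is `-21` and `-204 % 10` is `6`. Treating `0` as a number with the single digit `0` is an assumption of this sketch.

```python
def digits_of(n):
    """Hypothetical helper: the digits of n, last digit first, without using str."""
    n = abs(n)  # drop the sign so // and % behave as the hints describe
    if n == 0:
        return [0]  # assumption: 0 has one digit, namely 0
    digits = []
    while n > 0:
        digits.append(n % 10)  # last digit
        n = n // 10            # everything except the last digit
    return digits


print(digits_of(650899))  # [9, 9, 8, 0, 5, 6]
print(digits_of(-204))    # [4, 0, 2]
```

Deciding which of those digits count as divisible by `m` is the remaining part of the problem.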

In [None]:
def count_divisible_digits(n, m):
    """
    Returns the number of digits in n that are divisible by m. If m is 0, then return 0.
    Any digit in n that is 0 counts as divisible by all numbers.

    >>> count_divisible_digits(
    ...    650899,
    ...    3
    ...    )
    4
    >>> count_divisible_digits(
    ...    -204,
    ...    5
    ...    )
    1
    >>> count_divisible_digits(
    ...    10,
    ...    0
    ...    )
    0
    """
    ...

## Testing
Run all the tests and ensure your code is working by running the following code block.

In [None]:
import doctest
doctest.testmod()
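
When every test passes, `doctest.testmod()` prints nothing and returns a `TestResults(failed, attempted)` summary. If you would rather check one function at a time while you work, `doctest.run_docstring_examples` runs the examples from a single docstring. A minimal sketch, using a made-up function `double` that is not part of the handout:

```python
import doctest


def double(x):
    """
    Hypothetical example function, used only to demonstrate doctest.

    >>> double(4)
    8
    >>> double(-1)
    -2
    """
    return 2 * x


# Run only the examples in double's docstring. The second argument is the
# namespace the examples execute in. With verbose=True, doctest prints each
# example as it is tried; with verbose=False it prints only failures.
doctest.run_docstring_examples(double, {"double": double}, verbose=True, name="double")
```

In the notebook you could pass one of your own functions together with `globals()` in the same way.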