

# Assignment 5

This assignment builds on the previous Python content. You will write two fairly nontrivial programs that:
1. implement the English-language game of Pig Latin
2. count all instances of each letter in a text file

-----

## Problem 1.  Pig Latin.

In [Pig Latin](https://en.wikipedia.org/wiki/Pig_Latin), English words are transformed according to the following rules:

* If the word begins with one or more consonants, those consonants are moved to the end of the word, followed by "ay":

  * pig -> igpay
  * Latin -> Atinlay
  * trash -> ashtray

* If the word begins with "qu", both of these letters are moved to the end of the word, followed by "ay":

  * quarter -> arterquay
  
* If the word begins with a vowel, it is followed by "yay":

  * apple -> appleyay
  * out -> outyay
  

Note that 
* more than one consonant may be moved to the end of the word. 
* "y" functions as a consonant at the start of words. 
* capitalization should be preserved after manipulating the word. 


### Problem 1(a). Implementation.

🎯 Write Python code that takes a word and converts it to Pig Latin.  If the input consists of multiple words or contains punctuation, your code should print a suitable error message. 

We want you to solve this problem from "first principles," using what you learned about strings in lesson 5.  Do **not** use the **re** module or other regular expressions in this problem.

###### Tips

* If you write your Pig Latin code as a [function](https://www.tutorialspoint.com/python/python_functions.htm), then you'll be able to reuse it in part (b)!  This is not required, just suggested, since it reduces code duplication and improves readability.

In [1]:
# Function to translate a single word into Pig Latin
def Translate(word):
    # Vowels to check against ("y" counts as a consonant at the start of a word)
    vowels = "aeiou"
    # Reject empty input, multiple words, digits, and punctuation
    if not word.isalpha():
        return "Error: Use only letters"
    # Remember whether the first character is a capital letter
    upperCheck = word[0].isupper()
    # Work in lowercase so that "Apple" and "apple" follow the same rule
    word = word.lower()
    # Words that begin with a vowel just get "yay" appended
    if word[0] in vowels:
        word = word + "yay"
    # Words that begin with "qu" move both letters to the end
    elif word.startswith("qu"):
        word = word[2:] + "quay"
    # For all other cases, move the leading consonants to the end
    else:
        split = 0
        # Stop at the end of the word so that vowel-less input cannot loop forever
        while split < len(word) and word[split] not in vowels:
            split += 1
        word = word[split:] + word[:split] + "ay"
    # Restore the leading capital if the original word had one
    if upperCheck:
        word = word.capitalize()
    return word

### Problem 1(b). Test Suite.

🎯 Test your code on the following words, and be sure your output matches what is shown on the right-hand side of each arrow. Print the results of each test. If your output does not match, then fix your code in 1(a). 

    * orange -> orangeyay
    * yellow -> ellowyay
    * Strip -> Ipstray
    * quarter -> arterquay
    * schmooze -> oozeschmay
    * a -> ayay
    * Pig Latin -> (should produce an error message, `sys.exit` is forbidden)
    * Ke$ha -> (should produce an error message)
    
If you wrote a function for 1(a), you can just call it on these test strings.  The `assert` statement can be used to help do the checks.
    

In [2]:
# Ran out of time to make this its own function
assert Translate("orange") == "orangeyay"
assert Translate("yellow") == "ellowyay"
assert Translate("Strip") == "Ipstray"
assert Translate("quarter") == "arterquay"
assert Translate("schmooze") == "oozeschmay"
assert Translate("a") == "ayay"
assert Translate("Pig Latin") == "Error: Use only letters"
assert Translate("Ke$ha") == "Error: Use only letters"
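
The task above also asks for the result of each test to be printed, which bare `assert` statements do not do. One possible way to get both is a small table-driven helper. This is only a sketch: the name `report_tests` and its parameters are hypothetical and not part of the assignment.

```python
def report_tests(translate, cases):
    """Print one line per test case and return the number of failures.

    `translate` is any function that maps a string to a string, and
    `cases` is a list of (input, expected) pairs. Both names are
    illustrative choices, not something the assignment prescribes.
    """
    failures = 0
    for word, expected in cases:
        actual = translate(word)
        if actual == expected:
            status = "ok"
        else:
            status = "FAIL"
            failures += 1
        # repr() makes stray spaces or punctuation in the input visible
        print(f"{status}: {word!r} -> {actual!r} (expected {expected!r})")
    return failures
```

Assuming the function from 1(a) is named `Translate`, a call such as `report_tests(Translate, [("orange", "orangeyay"), ("Ke$ha", "Error: Use only letters")])` would print one line per pair and return the number of mismatches.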

---

## Problem 2.  Letter Frequencies.


The files ```encryptedA.txt``` and ```encryptedB.txt``` contain two different encrypted messages on similar topics.  One message was originally in English and one was in Welsh.  

<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/d/d5/English_letter_frequency_%28alphabetic%29.svg/600px-English_letter_frequency_%28alphabetic%29.svg.png" width="300">


### Problem 2(a).  Letter Frequencies.

🎯 Write Python code that reads a text file into memory and creates a `dict` object with a frequency count for each letter.  For example, for ```encryptedA.txt```, your output should contain the key:value pairs ```'a': 78``` and ```'b': 31 ```.  

###### Notes
* Do not distinguish between uppercase and lowercase letters.
* Ignore punctuation.  Punctuation counts must not appear in your `dict`.
* If a given letter does not appear in the text, there must be a key:value pair with value 0. 

🎯 Use Python to determine which letter has the highest frequency in each text file, and print the result.

In [3]:
# Open my text files; lowercase the text so upper- and lowercase letters are counted together
with open("encryptedA.txt") as fileA:
    encA = fileA.read().lower()
with open("encryptedB.txt") as fileB:
    encB = fileB.read().lower()
# Getting the hang of comprehensions, wow they're cool!
ADict = {i:encA.count(i) for i in encA if i.isalpha()}
BDict = {i:encB.count(i) for i in encB if i.isalpha()}
print("ADict: ",ADict,'\n',"BDict: ",BDict)

# The lack of order is bugging me, but from what I can tell Python dicts are unordered and I would need to import a
# special library in order to order it. Is this correct?

ADict:  {'v': 27, 'c': 88, 'b': 31, 'd': 28, 'w': 76, 'q': 41, 'a': 78, 'r': 114, 'x': 72, 'z': 16, 'u': 70, 'j': 36, 'm': 76, 'g': 78, 'k': 22, 'y': 40, 't': 19, 'l': 32, 'i': 7, 'f': 18} 
 BDict:  {'k': 83, 'c': 40, 'n': 79, 'y': 90, 'd': 29, 'z': 61, 'x': 93, 'e': 28, 'q': 16, 'o': 48, 'u': 16, 'h': 48, 'p': 23, 's': 61, 'v': 122, 'g': 51, 't': 31, 'l': 11, 'w': 41, 'j': 19, 'i': 6, 'b': 3, 'a': 7, 'r': 2, 'f': 1}
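
The notes for 2(a) also ask for a zero entry for every letter that never appears, and for the most frequent letter to be reported. A minimal sketch of one way to cover both is shown below. It assumes only the 26 ASCII letters matter, and the helper names `letter_counts` and `most_frequent` are made up for illustration. Because the keys are generated from `string.ascii_lowercase`, and dicts preserve insertion order in Python 3.7 and later, the result also comes out in alphabetical order without any extra library.

```python
import string

def letter_counts(text):
    """Return a dict with one key per letter a-z, including zero counts."""
    lowered = text.lower()  # treat uppercase and lowercase as the same letter
    return {letter: lowered.count(letter) for letter in string.ascii_lowercase}

def most_frequent(counts):
    """Return the key with the largest count (the first key wins a tie)."""
    return max(counts, key=counts.get)

sample = "Hello, World!"  # stand-in for the contents of a text file
counts = letter_counts(sample)
print(counts["l"], counts["z"])  # prints: 3 0
print(most_frequent(counts))     # prints: l
```

To apply this to the assignment files, `sample` would be replaced by the string read from `encryptedA.txt` or `encryptedB.txt`.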


### Problem 2(b).  Formatting for R.

🎯 Write your two dictionaries with frequency counts from 2(a) to a pair of suitably named `.csv` files, with one column for the key and one column for the frequency counted.

In [4]:
import csv

with open("encryptedA.csv", "w", newline="") as encA:
    writer = csv.writer(encA)
    writer.writerows(ADict.items())
with open("encryptedB.csv", "w", newline="") as encB:
    writer = csv.writer(encB)
    writer.writerows(BDict.items())
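
Since the files are meant to be read in R, a header row and a stable row order can make them easier to work with, because R's `read.csv` treats the first row as column names by default. The following is a sketch under those assumptions: the helper name `write_counts` and the column names `letter` and `count` are illustrative choices, not requirements of the assignment.

```python
import csv

def write_counts(counts, path):
    """Write a letter-count dict to `path` as a two-column CSV with a header."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["letter", "count"])      # assumed column names
        writer.writerows(sorted(counts.items()))  # rows in alphabetical order
```

With the dictionaries from 2(a), this could be called as `write_counts(ADict, "encryptedA.csv")` and `write_counts(BDict, "encryptedB.csv")`.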