# Graded: x of 8 correct
- [x] Read the text file
- [x] Remove punctuation
- [x] Convert to lowercase
- [x] Split into words
- [x] Create a `set` of unique words
- [x] Dictionary of word counts
- [x] Display unique words and frequencies
- [x] Code organization and comments

Comments: 


## Unit 2 Programming Assignment
* The objective of this assignment is to write code that reads a text file and computes statistics on the text.
* Your program should read a text file, clean the text, and compute word statistics.
* In particular, the code should:
    1. Read the content of the text file `Unit2_Python_learning_journey.txt`.
        * The text in this file is from the free Python textbook, [Python for Everybody](https://www.py4e.com/book.php).
    2. Remove punctuation and convert all words to lowercase.
    3. Split the text into individual words.
    4. Use a set to find all unique words.
    5. Use a dictionary to count the frequency of each unique word.
    6. Display each unique word along with its frequency.
* You should organize your code appropriately to show a clean and thoughtful design.
    * Use functions as needed.
    * Break the code up into cells so smaller pieces can be tested easily.
    * Add the appropriate documentation to make your code comprehensible.
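
The numbered steps above can be tried out on a short in-memory string before touching the file. This is only an illustrative sketch: the sample sentence and all variable names are assumptions, not part of the assignment.

```python
import string

# Hypothetical sample text standing in for the file contents (step 1).
sample = "The cat saw the dog. The dog ran!"

# Step 2: remove ASCII punctuation, then convert to lowercase.
cleaned = sample.translate(str.maketrans("", "", string.punctuation)).lower()

# Step 3: split on whitespace into individual words.
words = cleaned.split()

# Step 4: a set keeps one copy of each word.
unique_words = set(words)

# Step 5: count how often each unique word occurs.
counts = {word: words.count(word) for word in unique_words}

# Step 6: display each unique word with its frequency, alphabetically.
for word in sorted(unique_words):
    print(f"{word}: {counts[word]}")
```

Calling `words.count` once per unique word rescans the list each time, which is fine for a short text; a single pass over the list that updates the dictionary scales better for longer files.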

```python
# Import library
import string

def read_text_file(path):
    """
    Reads the content of the file specified by path.

    Parameters:
        path (str): The path to the text file.

    Returns:
        str: The content of the file.
    """
    with open(path, 'r') as file:
        return file.read()

def text_cleaning(text):
    """
    Sanitizes the text by removing punctuation and converting to lowercase.

    Parameters:
        text (str): The raw text to be sanitized.

    Returns:
        str: The sanitized text.
    """
    # Remove punctuation using str.translate and str.maketrans
    translator = str.maketrans('', '', string.punctuation)
    sanitized_text = text.translate(translator)
    # Convert to lowercase
    sanitized_text = sanitized_text.lower()
    return sanitized_text

def split_into_words(text):
    """
    Splits the sanitized text into individual words.

    Parameters:
        text (str): The sanitized text.

    Returns:
        list: A list of words.
    """
    return text.split()

def tally_word_frequencies(words):
    """
    Tallies the frequency of each word in the list.

    Parameters:
        words (list): A list of words.

    Returns:
        dict: A dictionary with words as keys and their frequencies as values.
    """
    word_freq = {}
    for word in words:
        if word in word_freq:
            word_freq[word] += 1
        else:
            word_freq[word] = 1
    return word_freq

def show_word_frequencies(word_freq):
    """
    Displays the frequency of each word in the dictionary.

    Parameters:
        word_freq (dict): A dictionary with words and their frequencies.
    """
    for word, freq in word_freq.items():
        print(f"{word}: {freq}")

# Main function to process the text file and compute word statistics
def process_text_file(path):
    """
    Reads a text file, sanitizes the text, and computes word statistics.

    Parameters:
        path (str): The path to the text file.
    """
    # Step 1: Read the content of the file
    text_content = read_text_file(path)

    # Step 2: Sanitize the text
    sanitized_text = text_cleaning(text_content)

    # Step 3: Split the text into individual words
    word_list = split_into_words(sanitized_text)

    # Step 4: Tally the frequency of each unique word
    word_frequencies = tally_word_frequencies(word_list)

    # Step 5: Display each unique word along with its frequency
    show_word_frequencies(word_frequencies)

# File path to the text file
file_path = r'C:\Users\Nguyen\Downloads\Unit2_Python_learning_journey.txt'

# Run the main function
process_text_file(file_path)
```
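
The assignment lists "use a set to find all unique words" as its own step, while the code above obtains the unique words through the dictionary keys. One way to make the set explicit is sketched below. `find_unique_words` and `show_sorted_frequencies` are hypothetical helpers, not part of the submitted code, and the short word list is assumed sample data standing in for the result of `split_into_words`.

```python
def find_unique_words(words):
    """Return the set of distinct words in the list (hypothetical helper)."""
    return set(words)


def show_sorted_frequencies(unique_words, word_freq):
    """Print each unique word with its count, alphabetically; return the printed lines."""
    lines = [f"{word}: {word_freq[word]}" for word in sorted(unique_words)]
    for line in lines:
        print(line)
    return lines


# Stand-in for the list produced by split_into_words (assumed sample data).
word_list = ["you", "learn", "as", "you", "progress"]
unique_words = find_unique_words(word_list)

# Same counting logic as tally_word_frequencies, written with dict.get.
word_freq = {}
for word in word_list:
    word_freq[word] = word_freq.get(word, 0) + 1

printed = show_sorted_frequencies(unique_words, word_freq)
```

The set and the dictionary keys should always hold the same words, so `set(word_freq) == unique_words` can serve as a quick consistency check.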



```
as: 2
you: 20
progress: 2
through: 1
the: 15
rest: 1
of: 5
book: 3
don’t: 3
be: 3
afraid: 1
if: 5
concepts: 2
seem: 1
to: 18
fit: 1
together: 1
well: 1
first: 3
time: 4
when: 2
were: 1
learning: 3
speak: 1
it: 12
was: 3
not: 1
a: 17
problem: 2
for: 2
your: 4
few: 4
years: 3
that: 9
just: 1
made: 1
cute: 1
gurgling: 1
noises: 1
and: 17
ok: 1
took: 3
six: 1
months: 1
move: 2
from: 3
simple: 2
vocabulary: 1
sentences: 2
56: 1
more: 4
paragraphs: 1
able: 1
write: 1
an: 1
interesting: 1
complete: 1
short: 1
story: 1
on: 1
own: 1
we: 4
want: 1
learn: 2
python: 1
much: 1
rapidly: 1
so: 1
teach: 1
all: 3
at: 4
same: 1
over: 1
next: 1
chapters: 1
but: 1
is: 3
like: 1
new: 1
language: 2
takes: 1
absorb: 2
understand: 1
before: 1
feels: 1
natural: 1
leads: 1
some: 2
confusion: 1
visit: 1
revisit: 1
topics: 1
try: 1
get: 2
see: 3
big: 2
picture: 2
while: 2
are: 7
defining: 1
tiny: 1
fragments: 1
make: 1
up: 3
written: 1
linearly: 1
taking: 1
course: 1
will: 3
in: 4
linear: 1
fashion: 1
hesitate: 1
```
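
The entry `don’t: 3` in the output keeps its apostrophe because `string.punctuation` contains only ASCII characters, and the typographic apostrophe (U+2019) is not among them. If such marks should be removed too, one possible approach is sketched below using the standard `unicodedata` module. `strip_all_punctuation` is a hypothetical name, and whether `don’t` should become `dont` is a design choice the assignment does not settle.

```python
import string
import unicodedata


def strip_all_punctuation(text):
    """Remove ASCII punctuation plus any character in a Unicode punctuation category."""
    ascii_punct = set(string.punctuation)
    return "".join(
        ch for ch in text
        # Unicode punctuation categories all start with "P" (Po, Pd, Pi, Pf, ...).
        if ch not in ascii_punct and not unicodedata.category(ch).startswith("P")
    )


# Curly quotes and the typographic apostrophe are removed along with "," and "!".
print(strip_all_punctuation("Don’t stop, “now”!").lower())  # prints: dont stop now
```

The check against `string.punctuation` is kept because some of its characters, such as `+` and `=`, fall under Unicode symbol categories rather than punctuation categories.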