<a href="https://colab.research.google.com/github/alerods-ds/python-for-everybody-colab/blob/main/notebooks/chapter_10.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 📘 Chapter 10: Tuples - Exercises

This notebook contains the solutions to the exercises from Chapter 10 of *Python for Everybody* by Charles Severance.

## 🧠 Exercise 1
### Revise a previous program as follows: Read and parse the “From” lines and pull out the addresses from the line. Count the number of messages from each person using a dictionary.

### After all the data has been read, print the person with the most commits by creating a list of (count, email) tuples from the dictionary. Then sort the list in reverse order and print out the person who has the most commits.
```
Sample Line:
From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008

Enter a file name: mbox-short.txt
cwen@iupui.edu 5

Enter a file name: mbox.txt
zqian@umich.edu 195
```

✅ Answer:

In [5]:
input_file = input('Enter a file name: ')

file_name = f"/content/drive/My Drive/python-for-everybody/data/{input_file}"

fhand = open(file_name, 'r')

email_addresses = {}

for line in fhand:
    words = line.split()
    if len(words) < 2 or words[0] != 'From': continue
    if words[1] in email_addresses:
        email_addresses[words[1]] += 1
    else:
        email_addresses[words[1]] = 1

commits_list = []

for key, val in email_addresses.items():
    commits_list.append((val, key))

commits_list.sort(reverse=True)

commits, person = commits_list[0]

print(f'{person} {commits}')

Enter a file name: mbox.txt
zqian@umich.edu 195


💡 Explanation:

This program counts how many messages were sent from each email address, then identifies the address with the most messages (the exercise text calls these "commits"). It first builds a dictionary that maps each address to its count, then converts the dictionary to a list of `(count, email)` tuples. Because tuples compare element by element, sorting the list in reverse order puts the entry with the highest count first, so the most active sender can be read from index 0.
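The same count-then-sort idea can be tried without a file. This is a minimal sketch, assuming a small hand-written list of lines (`sample_lines` and the `example.com` addresses are made up for illustration and are not from `mbox.txt`):

```python
# Hypothetical in-memory sample; the notebook reads these lines from a file.
sample_lines = [
    "From alice@example.com Sat Jan  5 09:14:16 2008",
    "From bob@example.com Sat Jan  5 09:15:00 2008",
    "From: alice@example.com",  # header line, skipped because words[0] is 'From:'
    "From alice@example.com Sat Jan  5 10:00:00 2008",
]

counts = {}
for line in sample_lines:
    words = line.split()
    if len(words) < 2 or words[0] != "From":
        continue
    counts[words[1]] = counts.get(words[1], 0) + 1

# Build (count, email) tuples so that sorting compares the count first.
pairs = sorted(((count, email) for email, count in counts.items()), reverse=True)
top_count, top_email = pairs[0]
print(top_email, top_count)
```

On this sample, `alice@example.com` has two matching lines and `bob@example.com` has one, so the first tuple after sorting belongs to Alice.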

## 🧠 Exercise 2
### This program counts the distribution of the hour of the day for each of the messages. You can pull the hour from the “From” line by finding the time string and then splitting that string into parts using the colon character. Once you have accumulated the counts for each hour, print out the counts, one per line, sorted by hour as shown below.
```
python timeofday.py
Enter a file name: mbox-short.txt
04 3
06 1
07 1
09 2
10 3
11 6
14 1
15 2
16 4
17 2
18 1
19 1
```

✅ Answer:

In [7]:
input_file = input('Enter a file name: ')

file_name = f"/content/drive/My Drive/python-for-everybody/data/{input_file}"

fhand = open(file_name, 'r')

hours = {}

for line in fhand:
    words = line.split()
    if len(words) < 6 or words[0] != 'From': continue

    time = words[5].split(':')
    if time[0] in hours:
        hours[time[0]] += 1
    else:
        hours[time[0]] = 1

hours_list = []

for key, val in hours.items():
    hours_list.append((key, val))

hours_list.sort()

for key, value in hours_list:
    print(f'{key} {value}')

Enter a file name: mbox-short.txt
04 3
06 1
07 1
09 2
10 3
11 6
14 1
15 2
16 4
17 2
18 1
19 1


💡 Explanation:

This program counts the distribution of messages by hour based on the "From" lines in an email log. It extracts the time string from each line, splits it at the colon to isolate the hour, and uses a dictionary to count how many times each hour appears. The dictionary items are then copied into a list of `(hour, count)` tuples, which is sorted by hour, and the counts are printed in order. Because the hours are zero-padded two-character strings, sorting them as strings gives the same order as sorting them as numbers. This demonstrates how to extract structured information from strings and summarize it using dictionaries and sorting.
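As a small illustration of the sorting step, the sketch below uses a hand-written `hours` dictionary (hypothetical values, not read from a file) and also shows why the zero padding in the time string matters:

```python
# Hypothetical counts, in the shape the notebook's `hours` dictionary has.
hours = {"14": 1, "09": 2, "04": 3, "11": 6}

# sorted() on .items() orders the (hour, count) tuples by hour first.
sorted_hours = sorted(hours.items())
for hour, count in sorted_hours:
    print(hour, count)

# Without zero padding, string order and numeric order disagree.
unpadded = sorted(["9", "10", "4"])
print(unpadded)
```

If the hours were not zero-padded, converting them with `int()` before sorting would be one way to keep the numeric order.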

## 🧠 Exercise 3
### Write a program that reads a file and prints the letters in decreasing order of frequency. Your program should convert all the input to lower case and only count the letters a-z. Your program should not count spaces, digits, punctuation, or anything other than the letters a-z. Find text samples from several different languages and see how letter frequency varies between languages. Compare your results with the tables at https://wikipedia.org/wiki/Letter_frequencies.

✅ Answer:

In [8]:
import string

In [17]:
input_file = input('Enter a file name: ')

file_name = f"/content/drive/My Drive/python-for-everybody/data/{input_file}"

fhand = open(file_name, 'r')

frequency = {}

for line in fhand:
    line = line.translate(str.maketrans('','',string.punctuation+'1234567890'))
    line = line.lower()
    words = line.split()
    for word in words:
        for letter in word:
            # Count only a-z; skip accented letters and any other leftover symbols
            if letter not in string.ascii_lowercase: continue
            if letter in frequency:
                frequency[letter] += 1
            else:
                frequency[letter] = 1

frequency_list = []

for key, value in frequency.items():
    frequency_list.append((value, key))

frequency_list.sort(reverse=True)

for count, letter in frequency_list:
    print(f'{letter}: {count}')

Enter a file name: romeo.txt
i: 14
t: 12
e: 12
s: 11
a: 11
n: 9
h: 9
o: 8
r: 7
u: 6
l: 6
d: 6
w: 5
k: 3
g: 3
f: 3
y: 2
b: 2
v: 1
p: 1
m: 1
j: 1
c: 1


💡 Explanation:

This program reads a text file and calculates the frequency of each letter from a–z, ignoring digits, punctuation, and whitespace. It strips ASCII punctuation and digits with `str.translate()`, lowercases each line, and then tallies the characters of every word in a dictionary. The counts are stored as `(count, letter)` tuples, sorted in descending order, and printed, so letters with the same count appear in reverse alphabetical order. This exercise demonstrates how to combine string processing, dictionaries, and sorting to perform character-level text analysis.
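One way to make the a-z restriction explicit is to test each character against `string.ascii_lowercase`. The sketch below is an illustration under that assumption; the helper name `letter_frequencies` and the sample text are hypothetical and not part of the notebook:

```python
import string

def letter_frequencies(text):
    """Return (count, letter) tuples for a-z only, highest count first."""
    frequency = {}
    for letter in text.lower():
        # Anything outside a-z (spaces, digits, punctuation, accented
        # letters) is skipped rather than counted.
        if letter not in string.ascii_lowercase:
            continue
        frequency[letter] = frequency.get(letter, 0) + 1
    return sorted(((count, letter) for letter, count in frequency.items()),
                  reverse=True)

result = letter_frequencies("Hello, Wörld! 123")
for count, letter in result:
    print(f"{letter}: {count}")
```

Here the `ö` is dropped along with the comma, digits, and spaces, so only `l` (three times) and six single-occurrence letters are reported.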

# 📚 Summary – What I Learned from These Exercises

In this chapter, I learned how to use tuples in combination with dictionaries to sort and analyze data. I practiced creating lists of tuples to represent key-value pairs, reversing the usual `(key, value)` structure to `(value, key)` in order to sort by count. I also worked with text processing tasks, extracting specific parts of strings and organizing data by frequency.

These exercises reinforced important skills such as using `.items()` to loop through dictionaries, controlling sort order through the position of elements in a tuple, and applying string manipulation techniques to real-world text data. Together, they illustrated how tuples can enhance the flexibility and power of dictionary-based data analysis.
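For comparison, the `(value, key)` reversal used in these exercises can be placed next to a `key=` function passed to `sorted()`. The `key=` form is not used in the solutions above; it is shown here only as a sketch, with a hypothetical `counts` dictionary:

```python
# Hypothetical counts used only to compare the two approaches.
counts = {"a": 2, "b": 3, "c": 2}

# Approach from the exercises: swap to (value, key) and sort in reverse.
by_tuple = sorted(((value, key) for key, value in counts.items()), reverse=True)

# Alternative: keep (key, value) and tell sorted() which element to compare.
by_key = sorted(counts.items(), key=lambda pair: pair[1], reverse=True)

print(by_tuple)
print(by_key)
```

The two approaches agree on which entry comes first but treat ties differently: the tuple version falls back to comparing the keys, while the `key=` version keeps tied entries in their original dictionary order because Python's sort is stable.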