Python File Parsing Exercises from PY4E

This repository contains a collection of beginner-friendly Python scripts based on the Python for Everybody (PY4E) course. The programs demonstrate how to read files, parse text, and use lists, dictionaries, and string methods to extract useful information.

All programs work with sample text files (mbox-short.txt and romeo.txt), which can be downloaded from:

Programs Included

1. Extract Email Addresses from "From" Lines

Reads mbox-short.txt line by line.
Finds lines starting with "From ".
Prints the sender’s email address (second word).
Prints a final count of such lines.
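
The steps above could be sketched roughly as follows. This is not necessarily the code in this repository: the function name `from_addresses` and the sample lines (using `example.com` addresses) are made up for illustration.

```python
def from_addresses(lines):
    """Return the second word of every line that starts with 'From '."""
    addresses = []
    for line in lines:
        # 'From ' with a trailing space skips the 'From:' header lines
        if not line.startswith("From "):
            continue
        words = line.split()
        if len(words) > 1:  # guard against a bare 'From ' line
            addresses.append(words[1])
    return addresses


# Hypothetical sample lines in the mbox 'From ' format (not taken from the real file)
sample = [
    "From alice@example.com Sat Jan  5 09:14:16 2008\n",
    "From: alice@example.com\n",
    "From bob@example.com Fri Jan  4 18:10:48 2008\n",
]

senders = from_addresses(sample)
for address in senders:
    print(address)
print("Lines starting with 'From ':", len(senders))
```

With the real data, an open file handle such as `open("mbox-short.txt")` can be passed in place of `sample`, since the function only iterates over lines.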

2. Distribution of Emails by Hour

Reads mbox-short.txt.
Extracts the hour from the timestamp in each "From " line.
Builds a histogram of how many messages were sent during each hour.
Prints the distribution sorted by hour.
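
One possible way to build the histogram is shown below. It is a sketch under assumptions: the name `hour_histogram` and the sample lines are hypothetical, and the timestamp is assumed to be the sixth word of each "From " line.

```python
def hour_histogram(lines):
    """Count 'From ' lines per hour, keyed by the two-digit hour string."""
    counts = {}
    for line in lines:
        if not line.startswith("From "):
            continue
        words = line.split()
        if len(words) < 6:  # skip lines without a timestamp field
            continue
        # words[5] looks like '09:14:16'; the hour is the part before the first colon
        hour = words[5].split(":")[0]
        counts[hour] = counts.get(hour, 0) + 1
    return counts


# Hypothetical sample lines in the mbox 'From ' format
sample = [
    "From alice@example.com Sat Jan  5 09:14:16 2008\n",
    "From bob@example.com Fri Jan  4 18:10:48 2008\n",
    "From carol@example.com Fri Jan  4 09:02:01 2008\n",
]

# Zero-padded hour strings sort in the same order as the numbers they represent
for hour, count in sorted(hour_histogram(sample).items()):
    print(hour, count)
```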

3. Extract Floating-Point Number Using String Slicing

Finds a line like:
```
X-DSPAM-Confidence:    0.8475
```
Uses find() and slicing to extract the number.
Converts it to a float and prints it.
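
A minimal sketch of the find-and-slice approach, using the example line above. The function name `extract_confidence` is an assumption made for this illustration.

```python
def extract_confidence(text):
    """Return the number after the first colon in text as a float."""
    colon = text.find(":")
    if colon == -1:
        raise ValueError("no colon found in line")
    # Slice everything after the colon; float() ignores surrounding whitespace
    return float(text[colon + 1:])


line = "X-DSPAM-Confidence:    0.8475"
print(extract_confidence(line))  # prints 0.8475
```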

4. Find the Most Prolific Email Sender

Reads mbox-short.txt.
Builds a dictionary mapping sender emails to their message counts.
Finds and prints the email address with the highest count.
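
The counting and "max loop" could look something like this. The name `most_prolific` and the sample lines are hypothetical, and ties are resolved in favour of the sender seen first, which is a choice made here rather than something the exercise specifies.

```python
def most_prolific(lines):
    """Return (sender, count) for the address with the most 'From ' lines."""
    counts = {}
    for line in lines:
        if line.startswith("From "):
            words = line.split()
            if len(words) > 1:
                counts[words[1]] = counts.get(words[1], 0) + 1

    # Max loop: remember the largest count seen so far
    top_sender, top_count = None, 0
    for sender, count in counts.items():
        if count > top_count:
            top_sender, top_count = sender, count
    return top_sender, top_count


# Hypothetical sample lines in the mbox 'From ' format
sample = [
    "From alice@example.com Sat Jan  5 09:14:16 2008\n",
    "From bob@example.com Fri Jan  4 18:10:48 2008\n",
    "From bob@example.com Fri Jan  4 16:10:39 2008\n",
]

sender, count = most_prolific(sample)
print(sender, count)  # prints: bob@example.com 2
```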

5. Unique Word List from `romeo.txt`

Reads romeo.txt line by line.
Splits lines into words and builds a list of unique words.
Sorts the list alphabetically using Python’s sort() method.
Prints the sorted list.
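
As a rough illustration of the unique-word logic (the function name `unique_sorted_words` and the sample text are made up, not taken from romeo.txt):

```python
def unique_sorted_words(lines):
    """Collect each distinct word once, then sort the list in place."""
    words = []
    for line in lines:
        for word in line.split():
            if word not in words:  # only add words not seen before
                words.append(word)
    words.sort()
    return words


# Hypothetical sample lines standing in for the contents of a text file
sample = [
    "the quick brown fox\n",
    "jumps over the lazy dog\n",
]

print(unique_sorted_words(sample))
```

Note that sort() compares strings case-sensitively, so capitalized words sort before lowercase ones.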

6. Compute Average Spam Confidence

Prompts for a file name (e.g., mbox-short.txt).
Reads the file and extracts floating-point values from lines starting with:
```
X-DSPAM-Confidence:
```
Computes and prints the average value (without using sum() or a variable named sum).
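
One way to compute the average with a running total and a counter is sketched below. The name `average_confidence` and the sample values are assumptions for illustration; with a real file, the name returned by input() would be passed to open() and the resulting file handle given to the function.

```python
def average_confidence(lines):
    """Average the values on X-DSPAM-Confidence lines, or None if there are none."""
    total = 0.0  # running total, deliberately not named 'sum'
    count = 0
    for line in lines:
        if not line.startswith("X-DSPAM-Confidence:"):
            continue
        # Everything after the colon is the number; float() ignores whitespace
        total += float(line[line.find(":") + 1:])
        count += 1
    if count == 0:
        return None  # avoid dividing by zero when no lines match
    return total / count


# Hypothetical sample lines (values chosen for illustration only)
sample = [
    "X-DSPAM-Confidence: 0.5\n",
    "Subject: hello\n",
    "X-DSPAM-Confidence: 0.75\n",
]

print("Average spam confidence:", average_confidence(sample))
```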

How to Run

Clone this repository:

git clone https://github.com/shaazejahan/python-file-parsing.git
cd python-file-parsing

Download the sample files from the links above and place them in the same directory as the Python scripts.
Run any program with:
```
python script_name.py
```

Skills Practiced

File handling (open(), readline(), loops)
String manipulation (split(), find(), slicing)
Lists and dictionaries
Counting and histogramming
Simple algorithms (max loop, average calculation, sorting)

Author

👩‍💻 Shaaz E Jahan
