shaazejahan/Python-file-parsing

Python File Parsing Exercises from PY4E

This repository contains a collection of beginner-friendly Python scripts based on the Python for Everybody (PY4E) course. The programs demonstrate how to read files, parse text, and use lists, dictionaries, and string methods to extract useful information.

All programs work with sample text files (mbox-short.txt and romeo.txt), which can be downloaded from:


Programs Included

1. Extract Email Addresses from "From" Lines

  • Reads mbox-short.txt line by line.
  • Finds lines starting with "From ".
  • Prints the sender’s email address (second word).
  • Prints a final count of such lines.
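The steps above can be sketched roughly as follows. This is a minimal illustration, not the script from this repository: the function name `sender_addresses` and the sample lines are made up for the example, and the real program reads its lines from mbox-short.txt.

```python
def sender_addresses(lines):
    """Return the second word of every line that starts with 'From '."""
    addresses = []
    for line in lines:
        # The trailing space in "From " skips the "From:" header lines.
        if not line.startswith("From "):
            continue
        words = line.split()
        if len(words) > 1:
            addresses.append(words[1])
    return addresses


# Made-up sample lines in the mbox layout (assumption, not real data).
sample = [
    "From alice@example.com Sat Jan  5 09:14:16 2008\n",
    "From: alice@example.com\n",
    "From bob@example.com Fri Jan  4 18:10:48 2008\n",
]

found = sender_addresses(sample)
for address in found:
    print(address)
print("Lines starting with 'From ':", len(found))
```

To run it against the real file, pass an open file handle instead of the list, for example `with open("mbox-short.txt") as handle: sender_addresses(handle)`.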

2. Distribution of Emails by Hour

  • Reads mbox-short.txt.
  • Extracts the hour from the timestamp in each "From " line.
  • Builds a histogram of how many messages were sent during each hour.
  • Prints the distribution sorted by hour.
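One possible shape for the histogram logic is shown below. It assumes the timestamp is the sixth word of each "From " line, as in the made-up sample lines; the function name `hour_histogram` is hypothetical.

```python
def hour_histogram(lines):
    """Count 'From ' lines per hour, keyed by the two-digit hour string."""
    counts = {}
    for line in lines:
        if not line.startswith("From "):
            continue
        words = line.split()
        if len(words) < 6:
            continue  # Skip malformed lines that have no timestamp field.
        hour = words[5].split(":")[0]  # "09:14:16" -> "09"
        counts[hour] = counts.get(hour, 0) + 1
    return counts


# Made-up sample lines in the mbox layout (assumption, not real data).
sample = [
    "From alice@example.com Sat Jan  5 09:14:16 2008\n",
    "From bob@example.com Fri Jan  4 18:10:48 2008\n",
    "From carol@example.com Fri Jan  4 09:05:31 2008\n",
]

# Zero-padded hour strings sort correctly as plain text.
for hour, count in sorted(hour_histogram(sample).items()):
    print(hour, count)
```

Using `dict.get(hour, 0)` avoids a separate check for whether the hour has been seen before.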

3. Extract Floating-Point Number Using String Slicing

  • Finds a line like:

    X-DSPAM-Confidence:    0.8475
    
  • Uses find() and slicing to extract the number.

  • Converts it to a float and prints it.
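A small sketch of the find-and-slice approach, using the example line shown above. The helper name `extract_confidence` is made up for illustration.

```python
def extract_confidence(line):
    """Return the number that follows the first colon in the line."""
    colon = line.find(":")  # Index of the first colon, or -1 if absent.
    if colon == -1:
        raise ValueError("no colon found in line")
    # Slice everything after the colon and drop the surrounding whitespace.
    return float(line[colon + 1:].strip())


line = "X-DSPAM-Confidence:    0.8475"
print(extract_confidence(line))
```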


4. Find the Most Prolific Email Sender

  • Reads mbox-short.txt.
  • Builds a dictionary mapping sender emails to their message counts.
  • Finds and prints the email address with the highest count.
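The counting and maximum-finding steps might look like the sketch below. The function name `most_prolific` and the sample lines are assumptions for illustration; the repository's script may be structured differently.

```python
def most_prolific(lines):
    """Return (sender, count) for the address with the most 'From ' lines."""
    counts = {}
    for line in lines:
        if line.startswith("From "):
            words = line.split()
            if len(words) > 1:
                counts[words[1]] = counts.get(words[1], 0) + 1

    # Plain maximum loop: remember the largest count seen so far.
    best_sender, best_count = None, 0
    for sender, count in counts.items():
        if count > best_count:
            best_sender, best_count = sender, count
    return best_sender, best_count


# Made-up sample lines in the mbox layout (assumption, not real data).
sample = [
    "From alice@example.com Sat Jan  5 09:14:16 2008\n",
    "From bob@example.com Fri Jan  4 18:10:48 2008\n",
    "From alice@example.com Fri Jan  4 16:10:39 2008\n",
]

sender, count = most_prolific(sample)
print(sender, count)
```

If two senders tie, this version keeps whichever one was counted first.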

5. Unique Word List from romeo.txt

  • Reads romeo.txt line by line.
  • Splits lines into words and builds a list of unique words.
  • Sorts the list alphabetically using Python’s sort() method.
  • Prints the sorted list.
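As a rough sketch of the unique-word logic (the function name `unique_sorted_words` is hypothetical, and only two lines of text are used as sample input):

```python
def unique_sorted_words(lines):
    """Collect each distinct word once, then sort the list in place."""
    words = []
    for line in lines:
        for word in line.split():
            if word not in words:  # Keep only the first occurrence.
                words.append(word)
    words.sort()
    return words


sample = [
    "But soft what light\n",
    "It is the east and Juliet is the sun\n",
]

print(unique_sorted_words(sample))
```

Note that `sort()` compares strings by character code, so capitalized words such as "But" come before all lowercase words.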

6. Compute Average Spam Confidence

  • Prompts for a file name (e.g., mbox-short.txt).

  • Reads the file and extracts floating-point values from lines starting with:

    X-DSPAM-Confidence:
    
  • Computes and prints the average value (without using sum() or a variable named sum).
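The averaging step can be sketched with a running total and a counter, which avoids both `sum()` and a variable named `sum`. The function name `average_confidence` and the sample values are made up for illustration; the actual script prompts for a file name with `input()` and reads the lines from that file.

```python
def average_confidence(lines):
    """Average the values on 'X-DSPAM-Confidence:' lines, or None if none exist."""
    prefix = "X-DSPAM-Confidence:"
    total = 0.0
    count = 0
    for line in lines:
        if not line.startswith(prefix):
            continue
        # float() ignores the whitespace left around the number.
        total += float(line[len(prefix):])
        count += 1
    if count == 0:
        return None  # Avoid dividing by zero when no lines match.
    return total / count


# Made-up sample lines (assumption, not values from mbox-short.txt).
sample = [
    "X-DSPAM-Confidence: 0.5\n",
    "X-DSPAM-Probability: 0.0000\n",
    "X-DSPAM-Confidence:    0.75\n",
]

print("Average spam confidence:", average_confidence(sample))
```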


How to Run

  1. Clone this repository:

    git clone https://github.com/shaazejahan/python-file-parsing.git
    cd python-file-parsing
  2. Download the sample files from the links above and place them in the same directory as the Python scripts.

  3. Run any program with:

    python script_name.py

Skills Practiced

  • File handling (open, readline, loops)
  • String manipulation (split(), find(), slicing)
  • Lists and dictionaries
  • Counting and histogramming
  • Simple algorithms (max loop, average calculation, sorting)

Author

👩‍💻 Shaaz E Jahan


About

This is my first project on GitHub.
