Log File Line Generator

Exercise: Log File Line Generator

You are a DevOps engineer tasked with creating a memory-efficient tool for parsing large log files. A full-featured log parsing library might be too heavy, and loading a multi-gigabyte file into memory would be inefficient.

Your goal is to write a Python generator function that reads a log file line by line, yielding only the valid log entries.


Requirements:

    The function must be named read_log_lines and accept a single argument: filepath (a string).

    It must open and read the file specified by filepath.

    It must yield each valid log line. A valid log line is any line that is not empty and not a comment.

    Comment lines are lines whose first non-whitespace character is #. Whitespace before the # should be ignored (for example, a line such as "       # this is a comment" is still a comment).

    Empty lines or lines containing only whitespace should be skipped.

    Each yielded line must have any leading or trailing whitespace removed.

    Your function must be a generator. It must use the yield keyword and should not return a list or any other collection.
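
To illustrate the generator requirement, here is a minimal sketch built around a hypothetical toy function, count_up, which is not part of the exercise. It shows that calling a generator function returns a generator object without running the body, and that values are then produced one at a time on demand.

```python
import types


def count_up(limit):
    """Hypothetical toy generator, used only to illustrate `yield`."""
    for n in range(limit):
        yield n  # execution pauses here until the next value is requested


gen = count_up(3)

# Calling the function returns a generator object; the body has not run yet.
is_generator = isinstance(gen, types.GeneratorType)

first = next(gen)  # runs the body up to the first yield
rest = list(gen)   # consumes the remaining values

print(is_generator, first, rest)  # True 0 [1, 2]
```

Because values are produced lazily, a generator that reads a file this way only needs to hold one line in memory at a time.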

Note: For this exercise, you can assume the filepath provided will always be a valid, existing file path.

Example:

Given a file named sample.log with the following content:


    [INFO] Application starting...
    [DEBUG] Connecting to database.
     
    # Configuration section
      # A nested comment
    [WARN] Deprecated feature used.
     
    [ERROR] Failed to process request id: 123


Iterating through your generator should produce the following lines in order:


    [INFO] Application starting...
    [DEBUG] Connecting to database.
    [WARN] Deprecated feature used.
    [ERROR] Failed to process request id: 123


How Your Solution Will Be Tested:

Your read_log_lines function will be tested against several scenarios:

    A standard log file containing a mix of valid entries, comments, and empty lines.

    An empty file, which should produce no output.

    A file containing only comments and blank lines.

    A file where comment lines have leading whitespace.

Solution:

def read_log_lines(filepath):
    """
    Creates a generator that reads a log file, yielding valid, non-comment lines.

    Args:
        filepath (str): The path to the log file.

    Yields:
        str: A stripped, non-empty, non-comment line from the file.
    """

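    # Iterating over the file object reads one line at a time,
    # so the whole file is never loaded into memory.
    with open(filepath, 'r') as f:
        for line in f:
            # Remove leading/trailing whitespace, including the newline.
            stripped_line = line.strip()

            # Skip blank lines and comments (lines whose first
            # non-whitespace character is '#').
            if stripped_line and not stripped_line.startswith('#'):
                yield stripped_line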


log_content = """[INFO] Application starting...
[DEBUG] Connecting to database.

# Configuration section
  # A nested comment
[WARN] Deprecated feature used.

[ERROR] Failed to process request id: 123
"""

import os

file_name = "sample.log"

# Write the sample content to disk so the generator has a file to read.
with open(file_name, "w") as f:
    f.write(log_content)

print(f"--- Reading from {file_name} ---")
for log_entry in read_log_lines(file_name):
    print(log_entry)

# Remove the temporary sample file.
os.remove(file_name)


Output:

--- Reading from sample.log ---
[INFO] Application starting...
[DEBUG] Connecting to database.
[WARN] Deprecated feature used.
[ERROR] Failed to process request id: 123
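
The four test scenarios listed earlier can be sketched as a small self-check. This is an illustrative harness, not the actual grader: the helper name collect_lines and the use of tempfile are assumptions, and read_log_lines is repeated here only so the block runs on its own.

```python
import os
import tempfile


def read_log_lines(filepath):
    """Same logic as the solution above, repeated so this block is self-contained."""
    with open(filepath, "r") as f:
        for line in f:
            stripped_line = line.strip()
            if stripped_line and not stripped_line.startswith("#"):
                yield stripped_line


def collect_lines(content):
    """Hypothetical helper: write content to a temp file and return the yielded lines."""
    with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False) as tmp:
        tmp.write(content)
        path = tmp.name
    try:
        # list() exhausts the generator, which also closes the file it opened.
        return list(read_log_lines(path))
    finally:
        os.remove(path)


# One call per scenario described in the exercise.
standard = collect_lines("[INFO] Start\n\n# note\n[ERROR] Stop\n")
empty = collect_lines("")
only_comments = collect_lines("# a\n\n   \n# b\n")
indented = collect_lines("   # indented comment\n  [WARN] Kept  \n")

print(standard)       # ['[INFO] Start', '[ERROR] Stop']
print(empty)          # []
print(only_comments)  # []
print(indented)       # ['[WARN] Kept']
```

Wrapping the generator in list() is convenient for comparing results in a check like this, but it gives up the memory benefit, so it is best kept to small test inputs.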
