
# Lab 9: Build a Log Aggregator
## Alexander Land and Maggie Tapia

In this lab, you will create your own log generator and then build a command-line utility that scans log files, summarizes their contents, and provides insight into system behavior. You will use data structures to track log message levels such as `INFO`, `WARNING`, `ERROR`, and `CRITICAL`.

This lab reinforces:
- File I/O
- Pattern recognition (regex)
- Dictionaries and counters
- Functions and modularity
- Optional: CLI arguments, logging



## Part 1: Create Log files (20%)
Using the example log format below, create a **Python file** that logs errors in a structured tree format.

You will find examples in the folder called Logs that you can use to build your program.

Remember that each set of logs should have varied levels of log entries (`INFO`, `WARNING`, `ERROR`, `CRITICAL`) and message types tailored to different service components.
You must create 5 structured logs. Here are some examples:

    sqldb
    ui
    frontend.js
    backend.js
    frontend.flask
    backend.flask

You may use ChatGPT to create sample messages, NOT THE LOGS. For example:

    System failure
    Database corruption
    Disk failure detected
    Database corruption


In [None]:
# Paste your python file here 
# don't forget to upload it with your submission

import logging
import logging.handlers
import random

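# One format string shared by the console output and the file handlers.
# %(asctime)s renders as e.g. 2025-04-11 23:20:36,913 by default.
# Note: there is no space before %(levelname)s, so lines read '... |INFO | ...'
LOG_FORMAT = '%(asctime)s | %(name)s |%(levelname)s | %(message)s'

logging.basicConfig(level=logging.DEBUG, format=LOG_FORMAT)

formatter = logging.Formatter(fmt=LOG_FORMAT)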

# loggers
sqldb_Log = logging.getLogger("sql_logger")
sqldb_Log.setLevel(logging.INFO)

frontend_Log = logging.getLogger("frontend")
frontend_Log.setLevel(logging.INFO)

frontend_js_Log = logging.getLogger("frontend.js")
frontend_js_Log.setLevel(logging.INFO)

frontend_flask_Log = logging.getLogger("frontend.flask")
frontend_flask_Log.setLevel(logging.INFO)

frontend_flask_layer_Log = logging.getLogger("frontend.flask.layer")
frontend_flask_layer_Log.setLevel(logging.INFO)

# handlers

sql_handler = logging.handlers.TimedRotatingFileHandler(
    filename= "sql.log",
    when= "D",
    backupCount= 1
)

frontend_handler = logging.handlers.TimedRotatingFileHandler(
    filename= "frontend.log",
    when= "D",
    backupCount= 1
)

frontend_js_handler = logging.handlers.TimedRotatingFileHandler(
    filename= "frontend_js.log",
    when= "D",
    backupCount= 1
)

frontend_flask_handler = logging.handlers.TimedRotatingFileHandler(
    filename= "frontend_flask.log",
    when= "D",
    backupCount= 1
)

frontend_flask_layer_handler = logging.handlers.TimedRotatingFileHandler(
    filename= "frontend_flask_layer.log",
    when= "D",
    backupCount= 1
)

# add the handlers to the loggers
sqldb_Log.addHandler(sql_handler)
frontend_Log.addHandler(frontend_handler)
frontend_js_Log.addHandler(frontend_js_handler)
frontend_flask_Log.addHandler(frontend_flask_handler)
frontend_flask_layer_Log.addHandler(frontend_flask_layer_handler)

# add formatter to handlers
sql_handler.setFormatter(formatter)
frontend_handler.setFormatter(formatter)
frontend_js_handler.setFormatter(formatter)
frontend_flask_handler.setFormatter(formatter)
frontend_flask_layer_handler.setFormatter(formatter)



# make the log messages

#I asked chat for comedic error messages
list_of_potential_errors = [
    "Something went wrong. Probably your fault.",
    "Uncaught exception in user behavior.",
    "The system has given up. Please proceed manually.",
    "Confidence.exe unexpectedly closed.",
    "Sanity module missing.",
    "Task completion not found.",
    "You\'ve been idle for 3 hours. We assumed you were crying.",
    "Tried to operate before coffee. Not permitted.",
    "Thought process overflowed into dream state.",
    "Recollection of why you walked into the room not available.",
    "No idea what you\'re doing.",
    "Too many browser tabs. One is now sentient.",
    "Too many recursive thoughts.",
    "You can\'t just do that because you feel like it.",
    "Cannot install new personality. Files corrupted.",
    "This seemed like a good idea 3 hours ago.",
    "You argued with someone on the internet.",
    "Connection lost to reality.",
    "Your ambition has been archived or deleted.",
    "Yes, everyone saw that. No, you can\'t undo it."
]

def random_error_level(logger, message):
    """Log the message through the given logger at a randomly chosen level."""
    match random.randint(0,4):
        case 0: logger.critical(message)
        case 1: logger.error(message)
        case 2: logger.warning(message)
        case 3: logger.info(message)
        # every logger above is set to INFO, so DEBUG messages are filtered out
        case 4: logger.debug(message)


# start of main code
for _ in range(50):
    random_error_message = random.choice(list_of_potential_errors)

    match random.randint(0,4):
        # picks a random logger and passes it to random_error_level, giving a randomized error source, level, and message
        case 0: random_error_level(sqldb_Log, random_error_message)
        case 1: random_error_level(frontend_Log, random_error_message)
        case 2: random_error_level(frontend_js_Log, random_error_message)
        case 3: random_error_level(frontend_flask_Log, random_error_message)
        case 4: random_error_level(frontend_flask_layer_Log, random_error_message)
            
    




### Example Log Format

You will work with logs that follow this simplified structure:

```
2025-04-11 23:20:36,913 | my_app | INFO | Request completed
2025-04-11 23:20:36,914 | my_app.utils | ERROR | Unhandled exception
2025-04-11 23:20:36,914 | my_app.utils.db | CRITICAL | Disk failure detected
```
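
A minimal sketch of how Python's `logging` module can produce lines in this layout, assuming one handler on the parent logger `my_app`. The helper name `build_demo_logger` and the in-memory `buffer` are illustrative choices, not part of the lab; the child loggers (`my_app.utils`, `my_app.utils.db`) reach the parent's handler through propagation, which is what gives the log its tree structure.

```python
import io
import logging

# Same layout as the example lines: timestamp | logger name | level | message
LOG_FORMAT = "%(asctime)s | %(name)s | %(levelname)s | %(message)s"


def build_demo_logger(stream):
    """Attach one handler to the parent logger; child loggers inherit it by propagation."""
    parent = logging.getLogger("my_app")
    parent.setLevel(logging.INFO)
    parent.propagate = False  # keep the demo output out of the root logger
    parent.handlers.clear()   # avoid duplicate handlers if this runs twice
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter(LOG_FORMAT))
    parent.addHandler(handler)
    return parent


buffer = io.StringIO()  # stands in for a log file in this sketch
build_demo_logger(buffer)

# Dotted names form the hierarchy: my_app -> my_app.utils -> my_app.utils.db
logging.getLogger("my_app").info("Request completed")
logging.getLogger("my_app.utils").error("Unhandled exception")
logging.getLogger("my_app.utils.db").critical("Disk failure detected")

print(buffer.getvalue())
```

Swapping the `StreamHandler` for a file handler would write the same lines to disk.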


## Part 2: Logging the Log File (40%)
    New File
### Part 2a: Read the Log File (see lab 7) (10%)


Write a function to read the contents of a log file into a list of lines. Handle file errors gracefully.
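
One possible shape for such a reader, as a sketch: the name `read_log_lines` and the choice to return an empty list on failure are assumptions, not requirements of the lab.

```python
def read_log_lines(path):
    """Return the lines of a log file, or an empty list if the file cannot be read."""
    try:
        # errors="replace" keeps going if the file contains undecodable bytes
        with open(path, encoding="utf-8", errors="replace") as log_file:
            return [line.rstrip("\n") for line in log_file]
    except OSError as exc:  # covers FileNotFoundError, PermissionError, and similar
        print(f"Could not read {path}: {exc}")
        return []
```

Returning an empty list lets the caller continue with its normal loop instead of checking for a special error value.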

### Part 2b: Parse Log Lines (see code below if you get stuck) (10%)

Use a regular expression to extract:
- Timestamp
- Log name
- Log level
- Message
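
The four fields listed above could be captured with named groups, roughly as follows. This is a sketch that assumes the `timestamp | name | level | message` layout from the example log format; `LOG_LINE` and `parse_line` are illustrative names.

```python
import re

# \s*\|\s* tolerates missing or extra spaces around each pipe separator
LOG_LINE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})"
    r"\s*\|\s*(?P<name>[\w.]+)"      # dotted logger names such as my_app.utils.db
    r"\s*\|\s*(?P<level>[A-Z]+)"
    r"\s*\|\s*(?P<message>.*)$"
)


def parse_line(line):
    """Return a dict with timestamp, name, level, and message, or None if the line does not match."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None


entry = parse_line("2025-04-11 23:20:36,914 | my_app.utils.db | CRITICAL | Disk failure detected")
print(entry)
```

Named groups keep the calling code readable (`entry["level"]`) and make it harder to mix up the field order.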

### Part 2c: Count Log Levels (20%)

Create a function to count how many times each log level appears. Store the results in a dictionary, then output it as a JSON file.
You may pick your own format, but here is an example.
```python
{
    "INFO": 
    {
        "Request completed": 42, 
        "Heartbeat OK": 7
    },

    "WARNING":
    {
        ...
    }
}

```
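
A nested count of this shape can be built with `collections.Counter`. The sketch below assumes the log lines have already been parsed into `(level, message)` pairs; `count_messages` and `sample` are illustrative names and data, not part of the lab.

```python
import json
from collections import Counter, defaultdict


def count_messages(entries):
    """Count messages per level; entries is an iterable of (level, message) pairs."""
    counts = defaultdict(Counter)
    for level, message in entries:
        counts[level][message] += 1
    # Convert to plain dicts so the result serializes as ordinary JSON objects
    return {level: dict(messages) for level, messages in counts.items()}


sample = [
    ("INFO", "Request completed"),
    ("INFO", "Request completed"),
    ("ERROR", "Unhandled exception"),
]
summary = count_messages(sample)
print(json.dumps(summary, indent=4))
```

`json.dump(summary, file_handle, indent=4)` writes the same structure to a file.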


In [None]:
# Paste your python file here; don't forget to upload it with your submission
import re
import json

def readlog(logfilename):
    """Read and parse a log file, returning a dict with the number of occurrences of each message at each log level.
    With a small modification it can also return a list of every parsed log entry."""
    # list_of_logs = [] # unused list that would also hold each log entry
    dict_of_log = {}
    # regex pattern that pulls out the timestamp, log_name, log_level, and message. Chat made the regex but I made everything else based on NGCP code I made
    # "\| ?" accepts the log level with or without a space after the pipe
    pattern = r'^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) \| ([^|]+) \| ?([A-Z]+) \| (.+)$'

    try:
        with open(logfilename) as file:
            for line in file:
                match = re.search(pattern, line) # checks line by line for the regex pattern

                if match:
                    timestamp, log_name, log_level, message = match.groups() # on a match, unpack the captured groups into their respective variables
                    # list_of_logs.append({"timestamp": timestamp, "log name": log_name, "log level":log_level, "message": message})  # unused list it could output

                    if log_level not in dict_of_log: # adds the log level if it's not in the dictionary
                        dict_of_log[log_level] = {}

                    if message not in dict_of_log[log_level]: # adds the message if it's not in the dictionary at that log level
                        dict_of_log[log_level][message] = 0

                    dict_of_log[log_level][message] += 1 # the level and message now exist in the dict, so this increments the count every time that message shows up
    except Exception as e: # prints the error instead of crashing if something goes wrong while opening or reading the file
        print(f"error {e}")

    return dict_of_log


def save_to_json(dictionary,filename): 
    """takes dict and dumps it in json file"""
    with open(filename, 'w') as json_file:
        json.dump(dictionary, json_file, indent=4)

def loader(filename):
    """opens json file, prints it readably, then returns it as a dict"""
    with open(filename) as file:
        data = json.load(file)
        neater_data = json.dumps(data, indent=2)
        print(neater_data)
    return data


save_to_json(readlog("frontend.log"), "frontend.json")
loader("frontend.json")



In [None]:
# Paste your python file here 
# don't forget to upload it with your submission


## Step 3: Generate Summary Report (40%)
    New File
### Step 3a (20%):
 Develop a function that continuously monitors your JSON file(s) and prints a real-time summary of log activity. It should keep a count of the messages grouped by log level (INFO, WARNING, ERROR, CRITICAL) and display only the critical messages (i.e., when new data comes in, the summary changes and any new critical message is printed).
 - note: do not reprocess the entire file on each update.  
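
One way to avoid reprocessing is to remember how far into the file the previous read got and start from there. The sketch below assumes the monitored file is append-only with one record per line (for example the raw log, or JSON written one object per line); `read_new_lines` is a hypothetical helper, not something the lab supplies.

```python
def read_new_lines(path, offset=0):
    """Return (new_lines, new_offset), reading only the bytes appended after offset."""
    with open(path, "rb") as handle:
        handle.seek(offset)
        chunk = handle.read()
    # Consume complete lines only; a partially written last line waits for the next call
    end = chunk.rfind(b"\n") + 1
    lines = chunk[:end].decode("utf-8", errors="replace").splitlines()
    return lines, offset + end
```

The caller keeps the returned offset between polls and passes it back in, so each poll costs only as much as the newly appended data.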

### Step 3b: Use Matplotlib (Lecture 10) (20%)
Develop a function that continuously monitors your JSON file(s) and graphs, in real time, a bar or pie plot of the errors (one graph for each log level).
- The graph should show the distribution of log messages by level (INFO, WARNING, ERROR, CRITICAL)


### Critical notes:
- Your code must use daemon threads (Lecture 14)
- 3a and 3b do not need to run at the same time. 
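
A daemon thread for periodic monitoring might be set up as in the sketch below. `start_monitor` and its `poll` callback are illustrative names; the callback stands in for whatever function checks the file and updates the summary or graph.

```python
import threading
import time


def start_monitor(poll, interval=0.1):
    """Run poll() repeatedly on a daemon thread; return (thread, stop_event)."""
    stop_event = threading.Event()

    def loop():
        while not stop_event.is_set():
            poll()
            stop_event.wait(interval)  # sleeps, but wakes early once stop is requested

    # daemon=True means this thread will not keep the program alive at exit
    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return thread, stop_event


ticks = []
thread, stop_event = start_monitor(lambda: ticks.append(time.time()), interval=0.05)
time.sleep(0.3)           # the main thread is free to do other work here
stop_event.set()          # ask the monitor to finish
thread.join(timeout=1)
print(f"poll ran {len(ticks)} times")
```

The `Event` gives a clean way to stop the loop; without it, the daemon thread simply ends when the main program exits.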


In [None]:
# Paste your python file here 
# don't forget to upload it with your submission
import matplotlib.pyplot as plt
import json

#loader from log reader
def loader(filename):
    """opens json file, prints it readably, then returns it as a dict"""
    with open(filename) as file:
        data = json.load(file)
        neater_data = json.dumps(data, indent=2)
        print(neater_data)
    return data

def log_json_to_graph(filename):
    """opens json log file and graphs the number of errors at each error level"""
    data = loader(filename)

    # builds a dictionary keyed by error level that holds the total number of errors at that level
    count_of_errors = {}
    for log_level in data:
        # data[log_level] maps each message to its count, so summing the values gives that level's total
        count_of_errors[log_level] = sum(data[log_level].values())

    # the keys of count_of_errors become the categories and the totals become the values
    categories = list(count_of_errors.keys())
    values = list(count_of_errors.values())

    # plots a bar graph for the data
    plt.bar(categories, values)
    plt.title(f'Number of errors in: {filename}')
    plt.xlabel('Error Levels')
    plt.ylabel('# of errors')

    plt.show()

log_json_to_graph("frontend.json")




In [None]:
# Here is a sample regex that parses a log file and extracts relevant information. 
# you will need to modify it. Review Lecture 11
import re

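def parse_log_line(line):
    pattern = r"^(.*?)\s\|\s(\w+)\s\|\s(\w+)\s\|\s(.*)$"
    match = re.match(pattern, line)
    # returns (timestamp, name, level, message), or None when the line does not match
    return match.groups() if match else None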
