# INFO 153 Homework Assignment 1
Problem : Validation of the opening and closing tags balance in an HTML document<br>
Author : Seth Coward<br>
Drexel ID : sac484<br>
Date : 04/23/2025

## 1. The Stack Abstract Data Type and Data Structure Implementation
The stack abstract data type suits this problem because closing tags must appear in the reverse order of their opening tags. A stack's last-in, first-out (LIFO) behavior keeps the most recently opened tag on top, so each closing tag can be compared directly against the tag it is supposed to close.
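
As a minimal sketch of that last-in, first-out behavior, the snippet below uses a plain Python list as a stand-in for the stack (an assumption made only so the example runs on its own). The tags come back out in the order in which they would need to be closed.

```python
# A plain Python list stands in for the stack in this sketch.
opened = []

# Push three opening tags in the order they appear in a document.
for tag in ["<html>", "<body>", "<h1>"]:
    opened.append(tag)

# Pop them back off: the most recently opened tag comes out first,
# which is the order in which the closing tags must appear.
closing_order = []
while opened:
    closing_order.append(opened.pop())

print(closing_order)  # ['<h1>', '<body>', '<html>']
```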

### Create Class for Stack Data Structure

In [1]:
# Implementation of stack data structure
class Stack:
    # sac484
    # This class is for implementing a stack abstract data type as a data structure
    # It contains methods:
    #     push(item): Adds an item to the top of the stack
    #     pop(): Removes and returns the item at the top of the stack
    #     peek(): Returns the item at the top of the stack
    #     is_empty(): Checks if the stack is empty and returns a boolean accordingly
    #     size(): Returns the size of the stack

    # Constructor for stack
    def __init__(self):
        self.items = []

    # Adds an item to the top of the stack
    def push(self, item):
        self.items.append(item)

    # Removes and returns the item at the top of the stack
    def pop(self):
        # If stack is empty, there is nothing to pop, so return None
        if self.is_empty():
            return None
        return self.items.pop()

    # Returns the item at the top of the stack
    def peek(self):
        # If stack is empty, there is nothing to peek, so return None
        if self.is_empty():
            return None
        return self.items[-1]

    # Checks if the stack is empty and returns a boolean accordingly
    def is_empty(self):
        return len(self.items) == 0

    # Returns the size of the stack
    def size(self):
        return len(self.items)

## 2. "html_checker" Function Definition

> Since we're using a standardized format for the HTML, I decided just to remove the slash from the closing tags instead of removing all symbols from both the opening and closing tags. To me, it's more efficient and makes the code easier to understand.<br>

> I documented the function with single-line comments because the multi-line comment caused an error when running the code.
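
The slash-removal idea can be sketched on its own. `to_opening_tag` is a hypothetical helper name, not part of the assignment code, and it assumes the standardized format in which every tag sits alone on its line with no attributes. It drops only the slash that follows `<`, so the result can be compared directly with a stored opening tag.

```python
def to_opening_tag(closing_tag):
    # Hypothetical helper (not part of the assignment code): drop only the
    # slash that follows "<", so "</h1>" becomes "<h1>".
    if not closing_tag.startswith("</"):
        raise ValueError(f"not a closing tag: {closing_tag!r}")
    return "<" + closing_tag[2:]


print(to_opening_tag("</html>"))          # <html>
print(to_opening_tag("</h1>") == "<h1>")  # True
```

Comparing whole tags this way avoids stripping `<`, `>`, and `/` from both sides before every comparison.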

In [2]:
def html_checker(html_string):
    # sac484
    # This function checks a given string of HTML for balanced opening and closing tags
    #
    # Parameters:
    # html_string (string): the HTML document
    #
    # Returns:
    # True if the HTML document has balanced opening and closing tags
    # False if otherwise

    # Create an empty stack
    stack = Stack()

    # Boolean for end result of whether the document is balanced or not
    balanced = True

    # Split HTML document based on new line characters
    html_lines = html_string.split("\n")

    # Iterate over each line
    for line in html_lines:
        # If the line doesn't start with the less than symbol, indicating 
        # that it is not an HTML tag, skip it
        if not line.startswith("<"):
            continue

        # Check whether the tag starts with "</", indicating that it is
        # a closing tag if it does and an opening tag if it does not.
        # Using startswith avoids an IndexError on a line that is only "<"
        if not line.startswith("</"):
            # Add tag to stack if it is an opening tag
            stack.push(line)
        else:
            # Retrieve most recently stored opening tag
            last_open_tag = stack.peek()
            
            # Remove the first slash from a closing tag for easier comparison to opening tags
            # Ex.
            #    </html> -> <html>
            #    <html> == <html>
            line = line.replace("/", "", 1)
            
            if stack.is_empty() or line != last_open_tag:
                # If the closing tag does not have an equivalent opening tag at the top of the stack,
                # then the document is not balanced
                balanced = False
                break
            else:
                # Remove the last opening tag from the stack since it has a closing tag
                stack.pop()

    # The balanced variable must be true and the stack must be empty
    # for the document to be truly balanced
    return balanced and stack.is_empty()

## 3. Calling the Function

### html_string_1 creation with function call and result output

In [3]:
html_string_1 = """<html>
<head>
<title>
An example of simple balanced HTML document
</title>
</head>
<body>
<h1>
Hello, Jupyter Notebook!
</h1>
</body>
</html>"""

html_string_1_result = html_checker(html_string_1)
print(f"Does the HTML document represented by html_string_1 have balanced tags? - {html_string_1_result}")

Does the HTML document represented by html_string_1 have balanced tags? - True


### html_string_2 creation with function call and result output

In [4]:
html_string_2 = """<html>
<head>
<title>
An example of simple unbalanced HTML document
</title>
</head>
<body>
<h1>
Hello, Jupyter Notebook!
</h1>
</html>"""

html_string_2_result = html_checker(html_string_2)
print(f"Does the HTML document represented by html_string_2 have balanced tags? - {html_string_2_result}")

Does the HTML document represented by html_string_2 have balanced tags? - False
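
The two documents above cover a balanced case and a missing closing tag. As a rough sketch of two further cases, tags closed in the wrong order and a closing tag that was never opened, the snippet below uses `check_tags`, a hypothetical condensed stand-in for `html_checker` that keeps a plain list as its stack so the example runs on its own. It assumes the same one-tag-per-line format.

```python
def check_tags(html_string):
    # Hypothetical condensed stand-in for html_checker; a plain list
    # plays the role of the Stack class so the sketch runs on its own.
    stack = []
    for line in html_string.split("\n"):
        if not line.startswith("<"):
            continue  # not a tag line
        if not line.startswith("</"):
            stack.append(line)  # opening tag
        elif not stack or stack.pop() != "<" + line[2:]:
            return False  # closing tag without a matching opener on top
    return not stack  # leftover openers mean the document is unbalanced


# Tags closed in the wrong order
crossed = "<body>\n<h1>\nHello\n</body>\n</h1>"
# A closing tag that was never opened
stray_close = "<h1>\nHello\n</h1>\n</body>"

print(check_tags(crossed))      # False
print(check_tags(stray_close))  # False
```

A simple count of opening and closing tags would accept the first document, which is why the comparison against the top of the stack matters.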


## 5. User Input Implementation

In [5]:
# Empty list to capture user input
user_lines = []

# Continuously capture user input until they enter a blank line
print("Please Enter Your HTML Below:")
while True:
    # Formatting input
    line = input("> ")

    # Stop the loop when user enters a blank line
    if line == "":
        break

    # Add each line to the user_lines list
    user_lines.append(line)

# Join each line of user input with a new line character for
# compatibility with the html_checker function
user_input = "\n".join(user_lines)

# Print what the user wrote
print(f"\nYour Input:\n{user_input}\n")

# Capture balance result from user input
user_input_result = html_checker(user_input)

# Display an appropriate message depending on if the HTML from the user input
# was balanced or not
if user_input_result:
    print("The HTML document you just entered has balanced tags. Good job!")
else:
    print("The HTML document you just entered has unbalanced tags. Please debug!")

Please Enter Your HTML Below:


>  <html>
>  <head>
>  <title>
>  Test
>  </title>
>  </head>
>  <body>
>  <div>
>  <h1>
>  Test Heading
>  </h1>
>  </div>
>  </body>
>  </html>
>  



Your Input:
<html>
<head>
<title>
Test
</title>
</head>
<body>
<div>
<h1>
Test Heading
</h1>
</div>
</body>
</html>

The HTML document you just entered has balanced tags. Good job!
