---
title: "Connected Sinks and Sources"
author: "Vahram Poghosyan"
date: "2022-01-23"
categories: ["Leetcode", "Algorithms", "Data Structures"]
image: "leetcode.png"
format:
  html:
    code-fold: true
jupyter: python3
include-after-body:
  text: |
    <script type="application/javascript" src="../../javascript/light-dark.js"></script>
---

# Connected Sinks and Sources

## Problem Statement

We are given a pipe system represented by a 2D rectangular grid of cells. Each cell is either empty or holds exactly one item, and there are three types of items:

* **Source:** There is 1 source in the system. It is represented by the asterisk character `*`.
* **Sinks:** There are an arbitrary number of sinks in the system. They are each represented by a different uppercase letter (`A`, `B`, `C`, etc.).
* **Pipes:** There are 10 different shapes of pipes, represented by the following characters: `═`, `║`, `╔`, `╗`, `╚`, `╝`, `╠`, `╣`, `╦`, `╩`

Note that: 

* Each pipe has openings on 2 or 3 sides of its cell.
* Two adjacent cells are connected if both have a pipe opening at their shared edge.
* We should treat the source and sinks as having pipe openings at all of their edges. For example, the two cells `A╦` are connected through their shared edge, but the two cells `B╔` are not.
* A sink may be connected to the source through another sink. For example, in the simple pipe system `*A═B═C`, all three sinks are connected to the source.
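
The connection rule above can be sketched as a small lookup table. This is only an illustration, not part of the solution developed later: the names `OPENINGS`, `openings`, and `connected_left_right` are hypothetical, and the table assumes the usual reading of the box-drawing characters.

```python
# Sides on which each pipe character is open: T(op), R(ight), D(own), L(eft).
OPENINGS = {
    "═": "RL", "║": "TD",
    "╔": "RD", "╗": "DL", "╚": "TR", "╝": "TL",
    "╠": "TRD", "╣": "TDL", "╦": "RDL", "╩": "TRL",
}

def openings(cell: str) -> str:
    """Return the open sides of a cell; the source and sinks are open on all four."""
    if cell == "*" or cell.isalpha():
        return "TRDL"
    return OPENINGS[cell]

def connected_left_right(left_cell: str, right_cell: str) -> bool:
    """True if two horizontally adjacent cells both open onto their shared edge."""
    return "R" in openings(left_cell) and "L" in openings(right_cell)

print(connected_left_right("A", "╦"))  # True
print(connected_left_right("B", "╔"))  # False
```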

Our objective is to write a function that determines which sinks are connected to the source in a given pipe system.

As an example, consider the following pipe system:

```
*╗ ╦═A
 ╠═╝
 C ╚═B
```
In this pipe system, the source `*` is connected to sinks `A` and `C` but not `B`.

Such a system is specified by an input text file in the following format.

```
*02
C10
╠11
╗12
═21
╚30
╝31
╦32
═40
═42
B50
A52
```

Note that each item is followed by its coordinates, `x` then `y`. The origin of the coordinate system, (`0`,`0`), is the lower left corner, which happens to be an empty cell in this system.
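
To make the file format concrete, here is a minimal sketch that splits one line into its item and coordinates. It assumes, as in the example, that each coordinate is a single digit, with `x` first and `y` second. The helper name `split_line` is hypothetical.

```python
def split_line(line: str) -> tuple[str, int, int]:
    """Split a line such as '╠11' into (item, x, y).

    Assumes single-digit coordinates, x first, then y.
    """
    item, x, y = line[0], int(line[1]), int(line[2])
    return item, x, y

print(split_line("*02"))  # ('*', 0, 2): column 0, row 2 (the top row of the example)
print(split_line("A52"))  # ('A', 5, 2)
```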

## Black-Boxing The Solution

Let's define the inputs and outputs of this program.

* **Input:** File path as a string
* **Output:** The sinks connected to the source as a string

For example, for the given pipe system in the example, the solution should be `AC`.

Let's define a black box:

In [1]:
#| code-fold: false
def get_connected_sinks(filePath: str) -> str | None:
    """
    Returns sinks connected to the source.

    Parameters:
    - filePath (str): Path to the input file which describes the pipe system.

    Returns:
    - str: A string containing the connected sinks.
    """
    pass


In the black box above, we use the union operator `|` (available in type hints since Python 3.10) so that the function may also return `None`. This way the annotation stays accurate while the body of the black box is still just `pass`.

Now that we have `get_connected_sinks` which takes the input file path as `filePath`, we can start thinking about what this function should do. Here's a broad breakdown of the function.

1. Load the text file
2. Parse the input
3. Using parsed input, determine which sinks are connected to the source and return the answer

Note that step 3 is already black-boxed by `get_connected_sinks`.

Let's create sub-functions for each of the remaining tasks:

1. `load_file`: should read the file (as a string or a related representation) into the program
2. `parse_input`: should return a programmatic representation of the pipe system 

We can write `load_file` without much deliberation, as it'll just be a simple wrapper for [file.readlines](https://docs.python.org/3/tutorial/inputoutput.html#methods-of-file-objects), which returns a list of strings, each containing one line from the file.

In [2]:
#| code-fold: false
def load_file(filePath: str) -> list[str]:
    """
    Loads the input file and returns the contents as a list of strings.

    Parameters:
    - filePath (str): Path to the input file which describes the pipe system.

    Returns:
    - list: A list of strings containing the contents of the input file.
    """

    with open(filePath, 'r') as file:
        return file.readlines()

::: {.callout-tip title="Note" appearance="minimal" collapse="false"}
We use the `with` statement rather than `try`-`finally` so that the file is closed automatically when the block exits, even if an exception is raised.
:::
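
For comparison, here is roughly what the `with` statement spares us from writing by hand. This is a sketch: `load_file_manually` is a hypothetical name, and the temporary file exists only to make the example self-contained.

```python
import os
import tempfile

def load_file_manually(file_path: str) -> list[str]:
    """Same result as a `with` block, but with the close spelled out."""
    file = open(file_path, "r", encoding="utf-8")
    try:
        return file.readlines()
    finally:
        file.close()  # Runs even if readlines() raises

# Write a tiny sample file so the example can run anywhere.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False, encoding="utf-8") as tmp:
    tmp.write("*02\nC10\n")
    sample_path = tmp.name

lines = load_file_manually(sample_path)
os.remove(sample_path)
print(lines)  # ['*02\n', 'C10\n']
```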

Let's save the output of `load_file` as the `input` we'll provide to `parse_input`. Running `load_file` for the example file above gives:

In [3]:
#| code-fold: false
input = load_file("./problem_assets/connected_sinks_and_sources/example.txt")
print(input)

['*02\n', 'C10\n', '╠11\n', '╗12\n', '═21\n', '╚30\n', '╝31\n', '╦32\n', '═40\n', '═42\n', 'B50\n', 'A52']


As we can see, the lines include the newline character. Let's clean up `input` by using `map` and [rstrip](https://docs.python.org/3/library/stdtypes.html#str.rstrip).

In [4]:
#| code-fold: false
input = list(map(lambda x: x.rstrip("\n"), input))
print(input)

['*02', 'C10', '╠11', '╗12', '═21', '╚30', '╝31', '╦32', '═40', '═42', 'B50', 'A52']


::: {.callout-tip title="Note" appearance="minimal" collapse="false"}
Note that we need to wrap `map` in `list` because Python 3's `map` is evaluated *lazily* rather than *strictly*: it returns an iterator that computes each value only when it is requested, as opposed to all at once, which is what happens when evaluating a simple expression like, say, `sum=1+2`.
:::
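
A quick way to see the laziness: `map` returns an iterator, which yields nothing until it is consumed and can only be consumed once. The list comprehension at the end is an equivalent, eagerly evaluated alternative. The variable names here are illustrative.

```python
raw_lines = ["*02\n", "C10\n", "A52"]

lazy = map(lambda x: x.rstrip("\n"), raw_lines)
first_pass = list(lazy)   # Consumes the iterator
second_pass = list(lazy)  # Nothing left to consume

print(first_pass)   # ['*02', 'C10', 'A52']
print(second_pass)  # []

# Equivalent list comprehension, evaluated eagerly.
cleaned = [line.rstrip("\n") for line in raw_lines]
print(cleaned)      # ['*02', 'C10', 'A52']
```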

But we're still dealing with simple string representations. That means it's not immediately clear how to obtain the **open edges** of, say, the `╚` character programmatically. Let's parse this list into instances of a data structure we will call `Item`.

`Item` needs a Boolean flag for each of its four edges:

* `left`
* `right`
* `top`
* `down`

Each flag indicates whether the corresponding edge is open (`True`) or closed (`False`). In the implementation below, the four flags are stored together in a single list, `edges`, ordered `[top, right, down, left]`.

It will also store the item's `x` and `y` coordinates (in `coords`). An additional `type` instance variable with possible values in $\{$`Source`,`Pipe`,`Sink`$\}$ may be helpful for checking terminal conditions.

In [28]:
#| code-fold: false
class Item:
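    def __init__(
            self, type: str = "Pipe",
            edges: list[bool] | None = None,
            coords: list[int] | None = None
        ):
        self.type = type
        # Build fresh lists per instance: a mutable default argument would be shared by every Item
        self.edges = [True, True, True, True] if edges is None else edges
        self.coords = [0, 0] if coords is None else coords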
    
    def __repr__(self): # String representation of object for logging
        return (f"-----{type(self).__name__}-----\n"
                f"  type: {self.type}\n"
                f"  edges: {self.edges}\n"
                f"  coordinates: {self.coords}\n")

Unfortunately, there's no way to get around hard-coding the mapping from box-drawing pipe characters to `Item`s without adding unwarranted complexity to the problem.

Here's an implementation of `map_to_items` with the hard-coding mentioned above. Feel free to expand and examine the implementation (at your own peril).

In [29]:
def map_to_items(input: list[str]) -> list[Item]: # Converts the input to a list of Items
    def to_item(line: str) -> Item: # Converts a single line in the input to an Item
        objectToReturn = Item()
        objectToReturn.coords = line[1:] # The coordinates are the rest of the line, kept as a string (e.g. "02")
        objectToReturn.edges = [True, True, True, True]
        match line[0]: # Match the first character of the line to determine the type of object
            case "*":
                # A default Item already has all four edges open, so only the type needs to change
                objectToReturn.type = "Source"
            case "═":
                # Note: edges are ordered [top, right, down, left]
                objectToReturn.edges[0] = False
                objectToReturn.edges[2] = False
            case "║":
                objectToReturn.edges[1] = False
                objectToReturn.edges[3] = False
            case "╔":
                objectToReturn.edges[0] = False
                objectToReturn.edges[3] = False
            case "╗":
                objectToReturn.edges[0] = False
                objectToReturn.edges[1] = False
            case "╚":
                objectToReturn.edges[2] = False
                objectToReturn.edges[3] = False
            case "╝":
                objectToReturn.edges[2] = False
                objectToReturn.edges[1] = False
            case "╠":
                objectToReturn.edges[3] = False
            case "╣":
                objectToReturn.edges[1] = False
            case "╦":
                objectToReturn.edges[0] = False
            case "╩":
                objectToReturn.edges[2] = False
            case other: # The case of a Sink
                if not other.isalpha(): # Check if the first character is a letter at all...
                    raise ValueError("The first character of a Sink must be a letter.")
                objectToReturn.type = "Sink"
        return objectToReturn
            
    return list(map(to_item, input))
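
If the `match` statement feels long, the same hard-coded knowledge can be stored as data. The following is a sketch of that alternative, not a replacement for the implementation above: `CLOSED_EDGES` and `edges_for` are hypothetical names, and edges are ordered `[top, right, down, left]` as before.

```python
# Indices of the closed edges for each pipe character, in [top, right, down, left] order.
CLOSED_EDGES = {
    "═": (0, 2), "║": (1, 3),
    "╔": (0, 3), "╗": (0, 1), "╚": (2, 3), "╝": (1, 2),
    "╠": (3,), "╣": (1,), "╦": (0,), "╩": (2,),
}

def edges_for(symbol: str) -> list[bool]:
    """Return the open/closed flags for a symbol.

    Any symbol missing from the table (the source and the sinks) is left fully open.
    """
    edges = [True, True, True, True]
    for index in CLOSED_EDGES.get(symbol, ()):
        edges[index] = False
    return edges

print(edges_for("╠"))  # [True, True, True, False]
print(edges_for("╗"))  # [False, False, True, True]
print(edges_for("*"))  # [True, True, True, True]
```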

Let's run `input` through `map_to_items` to obtain our programmatic representation of the pipe system. In the interest of brevity, we show only the first five items of the output.

In [30]:
#| code-fold: false
parsed_input = map_to_items(input)
for obj in parsed_input[:5]:
    print(obj)

-----Item-----
  type: Source
  edges: [True, True, True, True]
  coordinates: 02

-----Item-----
  type: Sink
  edges: [True, True, True, True]
  coordinates: 10

-----Item-----
  type: Pipe
  edges: [True, True, True, False]
  coordinates: 11

-----Item-----
  type: Pipe
  edges: [False, False, True, True]
  coordinates: 12

-----Item-----
  type: Pipe
  edges: [False, True, False, True]
  coordinates: 21



## Path Finding (DFS)

This is a path finding problem, so we're likely going to use either BFS or DFS. Since the given data is far easier to represent as a 2D array than as a graph, we will implement the iterative version of whichever traversal algorithm we end up choosing. The recursive version is a more natural fit when the data is already represented as a graph.

Here's the graph representation of the pipe system above, for comparison (although we won't be using graphs).

```{mermaid}
   graph TD;
      A["*02"] --> B["╗12"];
      B-->C["╠11"];
      C-->D["C10"];
      C-->E["═21"]
      E-->F["╝31"]
      F-->G["╦32"]
      G-->H["═42"]
      H-->K["A52"]
      L["╚30"]-->M["═40"]
      M-->N["B50"]
```

<br></br>

So the example pipe system corresponds to two disjoint acyclic graphs. Ignore the arrowheads in the [Mermaid](https://mermaid.js.org/intro/) diagram: the graph is undirected, because connectedness is a symmetric relationship. If one item is connected to another, as in the pipe system `═╦`, then the other item is certainly also connected to the first.

In general, because BFS traverses a graph one level at a time, we tend to use it when looking for the shortest path between two nodes: the first time BFS reaches the target node, it has found a shortest path to that node (or one of several, if there is a tie). DFS is better suited for just finding a *valid* path, or *all* valid paths (in which case we could throw in backtracking as well). Since we only care whether a valid path exists between the source `*` and each sink, and not whether it is the shortest such path, we use DFS (possibly with backtracking) for this problem.
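
In iterative form, the difference between the two traversals comes down to the container holding the frontier: a stack for DFS, a queue for BFS. The sketch below runs both on a hand-written adjacency list of the example system, transcribed from the diagram above with undirected edges. `EDGES`, `GRAPH`, `reachable`, and the `use_stack` flag are illustrative names.

```python
from collections import deque

# Undirected edges of the example pipe system, keyed by item + coordinates.
EDGES = [
    ("*02", "╗12"), ("╗12", "╠11"), ("╠11", "C10"), ("╠11", "═21"),
    ("═21", "╝31"), ("╝31", "╦32"), ("╦32", "═42"), ("═42", "A52"),
    ("╚30", "═40"), ("═40", "B50"),
]
GRAPH: dict[str, list[str]] = {}
for u, v in EDGES:
    GRAPH.setdefault(u, []).append(v)
    GRAPH.setdefault(v, []).append(u)

def reachable(start: str, use_stack: bool) -> set[str]:
    """Iterative traversal: DFS when use_stack is True, BFS otherwise."""
    frontier = deque([start])
    visited: set[str] = set()
    while frontier:
        # The only difference: take the newest entry (stack) or the oldest (queue).
        node = frontier.pop() if use_stack else frontier.popleft()
        if node in visited:
            continue
        visited.add(node)
        frontier.extend(n for n in GRAPH[node] if n not in visited)
    return visited

dfs_sinks = sorted(n[0] for n in reachable("*02", use_stack=True) if n[0].isalpha())
bfs_sinks = sorted(n[0] for n in reachable("*02", use_stack=False) if n[0].isalpha())
print(dfs_sinks, bfs_sinks)  # ['A', 'C'] ['A', 'C']
```

Both traversals visit the same set of nodes. They differ only in the order of the visits, which does not matter when the question is plain reachability.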

Note that DFS, like BFS, can be implemented on a 2D array just as easily as on a graph (in practice, a graph is often stored as nothing more than an [adjacency list](https://en.wikipedia.org/wiki/Adjacency_list)). So, let's see how to implement DFS on a 2D array. But before that, let's expand the given input into a 2D array of `Item`s.
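
One way to expand the coordinate list into a 2D array is sketched below. For brevity it stores the raw characters rather than `Item` objects, and it assumes single-digit coordinates with the origin in the lower left corner. `build_grid` is a hypothetical helper.

```python
def build_grid(lines: list[str]) -> list[list[str]]:
    """Place each item at grid[y][x]; empty cells hold a space."""
    cells = [(line[0], int(line[1]), int(line[2])) for line in lines]
    width = max(x for _, x, _ in cells) + 1
    height = max(y for _, _, y in cells) + 1
    grid = [[" "] * width for _ in range(height)]
    for symbol, x, y in cells:
        grid[y][x] = symbol
    return grid

lines = ['*02', 'C10', '╠11', '╗12', '═21', '╚30', '╝31', '╦32', '═40', '═42', 'B50', 'A52']
grid = build_grid(lines)

# Row 0 is the bottom row, so print the rows in reverse to match the drawing.
for row in reversed(grid):
    print("".join(row))
```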


In [None]:
class PipeSystem:
    def __init__(self):
        pass


def parse_input(input: list[str]) -> PipeSystem:
    """
    Parses the input into a programmatic representation of the pipe system.

    Parameters:
    - input (list): A list of strings containing the contents of the input file.

    Returns:
    - PipeSystem: The parsed pipe system.
    """
    pass
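
To give a sense of where this is heading, here is a self-contained sketch of an iterative DFS over the example system. It is one possible approach under stated assumptions, not the finished `parse_input` or `get_connected_sinks`. It keys cells by `(x, y)` in a dictionary rather than using `PipeSystem`, assumes single-digit coordinates, and returns the sinks in alphabetical order. The names `OPEN`, `STEP`, and `connected_sinks` are hypothetical.

```python
# Open sides per pipe character; T/R/D/L = top/right/down/left.
OPEN = {
    "═": "RL", "║": "TD", "╔": "RD", "╗": "DL", "╚": "TR", "╝": "TL",
    "╠": "TRD", "╣": "TDL", "╦": "RDL", "╩": "TRL",
}
# For each side: (dx, dy, side the neighbour must have open). y grows upwards.
STEP = {"T": (0, 1, "D"), "R": (1, 0, "L"), "D": (0, -1, "T"), "L": (-1, 0, "R")}

def connected_sinks(lines: list[str]) -> str:
    """Return the sinks reachable from the source, sorted alphabetically."""
    cells = {(int(line[1]), int(line[2])): line[0] for line in lines}

    def sides(symbol: str) -> str:
        # The source and sinks are not in OPEN and are open on all four sides.
        return OPEN.get(symbol, "TRDL")

    source = next(pos for pos, symbol in cells.items() if symbol == "*")
    stack, visited, sinks = [source], set(), []
    while stack:
        pos = stack.pop()
        if pos in visited:
            continue
        visited.add(pos)
        symbol = cells[pos]
        if symbol.isalpha():
            sinks.append(symbol)
        for side in sides(symbol):
            dx, dy, opposite = STEP[side]
            neighbour = (pos[0] + dx, pos[1] + dy)
            # Connected only if the neighbour exists and opens onto the shared edge.
            if neighbour in cells and opposite in sides(cells[neighbour]):
                stack.append(neighbour)
    return "".join(sorted(sinks))

example = ['*02', 'C10', '╠11', '╗12', '═21', '╚30', '╝31', '╦32', '═40', '═42', 'B50', 'A52']
print(connected_sinks(example))  # AC
```

The `visited` set is what keeps the traversal from looping: each cell is expanded at most once, so no explicit backtracking is needed to answer the reachability question.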