
In [7]:
### Red Scare

### Abstract:

#   -Input: A graph G with vertex set V(G) and edge set E(G); the graph can be directed or undirected; 
#               - no multiple edges between any pair of vertices and unweighted;  
#               - every graph comes with two specified vertices s, t ∈ V(G) called start and end vertices and a subset R ⊆ V(G) of red vertices; R can include s and t;
#               - an s,t-path is a sequence of DISTINCT vertices v1, ... vl such that v1 = s, vl = t and (vi, vi+1) ∈ E(G) for all i = 1, ..., l − 1 := AKA simple path;

#           Every input file is of the form: 	n m r
#	                                            s t
#	                                            <vertices>
#	                                            <edges>        # with n vertices, m edges and r cardinality of R(how many red vertices are there)
#                                                              # each vertex name is a string from [_a-z0-9]+
#                                                              # the names of vertices in R are followed by *; Ex.: 7 *       
#                                                              # edges of the form : u -- v for undirected edge , u --> v for directed arc 



#    Sub-tasks we want to solve for each problem:

#             - None: Return the length of a shortest s,t-path internally avoiding R (red vertices), or -1 if no such path exists; * if the edge (s, t) exists, it is itself an s,t-path with l = 2 vertices, i.e. length 1;

#             - Some: Return True if there is a path from s to t that includes at least one vertex from R

#             - Many: Return the maximum number of red vertices on any path from s to t; if no path return -1

#             - Few: Return minimum number of red vertices on any path from s to t; if no path, return -1

#             - Alternate: Return true if there is a path from s to t that alternates between red and non-red vertices, false otherwise
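
A minimal sketch of the Few sub-task, assuming the graph is available as a plain adjacency dict `adj` (vertex name -> list of neighbour names) and a set `red` of red vertex names. These names and the function `few` are hypothetical and not part of the notebook's classes. Entering a red vertex costs 1 and entering a non-red vertex costs 0, so a 0-1 BFS finds the cheapest s,t-walk; a cheapest walk can always be shortcut to a simple path without raising its cost, so the result also holds for paths.

```python
from collections import deque


def few(adj, red, s, t):
    """Minimum number of red vertices on any s,t-path, or -1 if t is unreachable."""
    inf = float("inf")
    # The start vertex counts if it is red.
    dist = {s: 1 if s in red else 0}
    dq = deque([s])
    while dq:
        u = dq.popleft()
        for v in adj.get(u, ()):
            cost = 1 if v in red else 0
            candidate = dist[u] + cost
            if candidate < dist.get(v, inf):
                dist[v] = candidate
                # 0-1 BFS: free moves go to the front, costly moves to the back.
                if cost == 0:
                    dq.appendleft(v)
                else:
                    dq.append(v)
    return dist.get(t, -1)


# Hypothetical toy instance: two routes from s to t, one through a red vertex.
toy_adj = {"s": ["a", "b"], "a": ["t"], "b": ["c"], "c": ["t"], "t": []}
print(few(toy_adj, {"a"}, "s", "t"))
```

For a directed instance `adj` holds only outgoing arcs; for an undirected one each edge has to appear in both adjacency lists.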



#    Requirements:

#            - Hint: For 3 of the sub-tasks, we should be able to handle all instances; for the other 2, roughly 50% of the instances;
#            - The algorithms should run in polynomial time; if an algorithm is not polynomial and runs for more than 1 h, report that;
#            - Hint to tackle: For those 2 sub-tasks, we will not be able to write one algorithm that works for all graphs; for 1 of the 2 we should be able to argue computational hardness with a simple reduction; mistify 2
#            - Universality: the algo must run in polynomial time on a well-defined class of graphs:
#                            - Well-defined classes:  * all graphs, * directed graphs, * undirected graphs, * bipartite graphs,
#                                                     * acyclic graphs, * graphs of bounded treewidth, * planar graphs, * expanders, * combination of these;
#            - Allowed:  if(isBipartite(G)) then
#                             # run the Strumpf-Chosa algorithm
#                        else print('!') # problem is NP-hard for non-bipartite graph
#            - Not allowed:  if (filename == 'rusty-I-17') then print(14)  # solved by hand
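
The "Allowed" pattern above can be sketched as a structural test followed by a dispatch. This is only an illustration: `is_bipartite` and `solve` are hypothetical names, the returned strings are placeholders for a real algorithm, and `adj` is assumed to be a symmetric adjacency dict that lists every vertex as a key.

```python
from collections import deque


def is_bipartite(adj):
    """Two-colour every component with BFS; fail on an edge joining equal colours."""
    colour = {}
    for root in adj:
        if root in colour:
            continue
        colour[root] = 0
        queue = deque([root])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in colour:
                    colour[v] = 1 - colour[u]
                    queue.append(v)
                elif colour[v] == colour[u]:
                    return False
    return True


def solve(adj):
    # Decide from the graph's structure, never from the file name.
    if is_bipartite(adj):
        return "run the polynomial-time algorithm for bipartite graphs"
    return "!"  # report that this class is not handled


square = {"a": ["b", "d"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "a"]}
triangle = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"]}
print(solve(square), "|", solve(triangle))
```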


#            Libraries:

#            - Focus is on choosing between algorithms, not implementing them; not required to write them from scratch;
#            - Allowed: implementation can be either reusing code, built-in, books, external;


#     Deliverables:

#            1. A report; follow the skeleton in doc/report.pdf.
#            2. A text file results.txt with all the results, as specified in report.
#            3. Scripts and a README file that explains how to recreate results.txt by running your programs.




In [None]:

#     Steps:


#          Keywords, concepts, tests: - * We have to build the graphs for all the instances/files;
#                                     - * Graph tests: the algorithm must run on well-defined classes of graphs, e.g. directed, undirected, bipartite; the graph is connected, so maybe specify this;
#                                     - The tests should tell us which algorithm to use for a specific graph without knowing its type in advance (blind graph);
#                                     - For some problems the red vertices appear randomly, for others they are fixed; different rules for checking if a vertex is red: coloring the vertex red as we build the graph vs. building first and then checking for red vertices while searching for paths;
#                                     - Remember that for each sub-task we check if there is a path from s to t; s and t can be red;
#                                     - * A path from s to t has distinct vertices, starts at s and ends at t; the number of edges hints at the type of graph, so each problem might respond to tests; Ex.: if #edges == 3 then the problem is Individual graphs, if #edges == N^2 then the problem is Grids;
#                                     - Object implementation vs functional implementation; 


#          Problems:

#                     1. Individual Graphs:
#                                           * Small graph, 3 vertices, and an all-red dodecahedron; good to test the parser
#                                           * T: Can be directed or undirected, no tree;
#                     2. Word Graphs
#                                           * Each vertex represents a 5-letter word; 
#                                           * An edge (u,v) if the corresponding words are anagrams or differ in exactly k positions, k ∈ {1, 2};
#                                           * T: has distinctive name for the vertices;
#                     3. Grids
#                                           * Consists of N^2 vertices 
#                                           * Each vertex (x, y) is connected to (x-1, y), (x, y-1), (x-1, y-1) if they exist;
#                                           * Every second row is red, except for the top- or bottom-most vertex, alternatingly;
#                                           * T: consists of exactly N^2 edges, can be both directed and undirected;
#                     4. Walls
#                                           * Family consisting of N overlapping 8-cycles called bricks; the bricks are laid in a wall of height 2 with various intervals of overlap;
#                                           * Each wall has a single red vertex w, the rightmost vertex of the same brick as vertex 0;
#                                           * T: Contains cycles of length 8 with just one red vertex, can be both directed and undirected; 
#                     5. Sky
#                                           * Tree; in each level move down one step, either left or right;
#                                           * "Get from the start to the goal, avoiding the trees" --> avoid red vertices but maybe also avoid using a tree
#                                           * T: Directed, no cycles;
#                     6. Increasing numbers
#                                           * Each Increasing graph is generated from a sequence val_1, ..., val_n of unique ints with 0 < val_i <= 2n;
#                                           * The random process: Pick a subset of size n from {1, ..., 2n} and arrange them randomly;
#                                           * s = val_1, t = val_n; Odd numbers are red; Edge (val_i, val_j) if i < j and val_i < val_j;
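
A small sketch of how an Increasing numbers instance could be generated, reading the edge rule as "an arc from val_i to val_j whenever i < j and val_i < val_j". The function names `edges_from_sequence` and `increasing_instance` are hypothetical, and the seeded `random.Random` is only there to make the illustration repeatable.

```python
import random


def edges_from_sequence(vals):
    """Arcs (vals[i], vals[j]) for every i < j with vals[i] < vals[j]."""
    n = len(vals)
    return [
        (vals[i], vals[j])
        for i in range(n)
        for j in range(i + 1, n)
        if vals[i] < vals[j]
    ]


def increasing_instance(n, seed=None):
    """Pick n distinct values from {1, ..., 2n} in random order and build the graph."""
    rng = random.Random(seed)
    vals = rng.sample(range(1, 2 * n + 1), n)
    red = {v for v in vals if v % 2 == 1}  # odd numbers are red
    return vals, edges_from_sequence(vals), red, vals[0], vals[-1]


print(edges_from_sequence([3, 1, 4, 2]))
```

Every arc points forward in the sequence, so a graph built this way is always a directed acyclic graph.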

#          Algorithms:
#                      * Maximum independent set 
#                      * Spanning tree, BFS, DFS, Prim, Dijkstra
#                      * Greedy
#                      * Divide and conquer --> Grids 
#                      * Dynamic programming, backtracking
#                      * Network flow
#                      * NP-hardness
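
As one illustration of where dynamic programming from the list above could apply: on a directed acyclic graph every walk is already a simple path, so the Many sub-task reduces to a longest-path style DP over a topological order. The sketch below assumes an adjacency dict `adj` of outgoing arcs and a set `red`; the name `many_on_dag` is hypothetical and the approach is not claimed to work on graphs with cycles.

```python
from collections import deque


def many_on_dag(adj, red, s, t):
    """Maximum number of red vertices on any s,t-path in a DAG, or -1 if none exists."""
    nodes = set(adj) | {v for targets in adj.values() for v in targets}
    indegree = {u: 0 for u in nodes}
    for u in adj:
        for v in adj[u]:
            indegree[v] += 1

    # Kahn's algorithm: repeatedly remove vertices with no remaining incoming arcs.
    order = []
    queue = deque(u for u in nodes if indegree[u] == 0)
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in adj.get(u, ()):
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    if len(order) != len(nodes):
        raise ValueError("graph has a cycle; this DP only applies to DAGs")

    # best[v] = most red vertices on any path from s to v (s itself included).
    best = {s: 1 if s in red else 0}
    for u in order:
        if u not in best:
            continue  # u is not reachable from s
        for v in adj.get(u, ()):
            candidate = best[u] + (1 if v in red else 0)
            if candidate > best.get(v, -1):
                best[v] = candidate
    return best.get(t, -1)


dag = {"s": ["a", "b"], "a": ["t"], "b": ["c"], "c": ["t"]}
print(many_on_dag(dag, {"b", "c"}, "s", "t"))
```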

#          Tests:
#                      * Number of edges, vertices, ratio vertices/edges --> Individual graphs, Grid, Tree;
#                         - as you check graphs and gather info on ratio, collect it and update along for each type of problem, do majority voting for tests; outlier detection to establish range for edges/vertices ratio;
#                      * Complete graph  --> Individual graphs
#                      * Tree  --> Sky
#                      * Dense graph --> Grids
#                      * Sparse graph --> Increasing numbers
#                      * Based on input format, we color the red vertices - 5 * is a red vertex; source and target are set and can be red so check them; also check if the graph is directed or undirected; also if name is string or int;
#

#             Majority voting of the tests: - if 3/5 | 2/3 tests say that the graph is a tree, then we assume that the graph is a tree; 
#                      - connected graph --> all problems
#                      - directed vs undirected --> all problems - if directed, then Sky and Increasing numbers, but neither Grids nor Individual graphs
#                      - number of edges --> all problems  - if #edges = 3 -> Individual graphs, if #edges = N^2 -> Grids
#                      - check if there are overlapping cycles of length 8 --> Walls
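
The tests and the majority vote described above could look roughly like the following. All three helpers (`graph_stats`, `is_undirected_tree`, `majority_vote`) are hypothetical names, the labels passed to the vote are illustrative, and no thresholds for the ratios are suggested here since they would have to come from the actual data.

```python
from collections import Counter


def graph_stats(n, edges, directed):
    """Edge count, density and edge/vertex ratio for a simple graph."""
    m = len(edges)
    max_edges = n * (n - 1) if directed else n * (n - 1) // 2
    return {
        "n": n,
        "m": m,
        "density": m / max_edges if max_edges else 0.0,
        "edges_per_vertex": m / n if n else 0.0,
    }


def is_undirected_tree(n, edges):
    """A graph on n vertices is a tree iff it has n - 1 edges and no cycle."""
    if len(edges) != n - 1:
        return False
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u == root_v:
            return False  # this edge would close a cycle
        parent[root_u] = root_v
    return True


def majority_vote(votes):
    """Label backed by more than half of the tests, or None if there is no majority."""
    label, count = Counter(votes).most_common(1)[0]
    return label if count * 2 > len(votes) else None


print(majority_vote(["tree", "tree", "grid"]))
```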


### Read the files

In [158]:
class Node:
    def __init__(self, id, name, source=False, sink=False, red=False):
        self.id = id
        self.name = name
        self.red = red
        self.source = source
        self.sink = sink
    
    def __str__(self):
        return f'Name:{self.name}, Red:{self.red}'
    
        
class Edge:
    def __init__(self, start, end, directed=False):
        self.start = start
        self.end = end
        self.directed = directed
    
    def __str__(self):
        return f'To:{self.end} Directed:{self.directed}'


class Graph:
    def __init__(self):
        self.nodes = {}  # maps node name -> Node
        self.network = {}  # maps node name -> list of outgoing Edge objects
        self.directed = None
        self.source = None
        self.sink = None

    def getNode(self, name):
        # Returns None when no node with this name exists
        return self.nodes.get(name)

    def getEdges(self):
        allEdges = []
        for node in self.network:
            for edge in self.network[node]:
                allEdges.append(edge)
        return allEdges

    def getNodes(self):
        return self.nodes

    def addNode(self, id, name, source=False, sink=False, red=False):
        newNode = Node(id, name, source, sink, red)
        self.nodes[newNode.name] = newNode
        self.network[newNode.name] = []

    def addEdge(self, start, end, directed=False):
        # Create the edge; note that it is stored only under `start`,
        # so an undirected edge is not mirrored under `end`
        newEdge = Edge(start, end, directed)

        # add edge to the start node's list of edges
        self.network[start].append(newEdge)

    def __str__(self):
        result = "Graph Information:\n"

        for node_name, node in self.nodes.items():
            edges = [edge for edge in self.network[node_name]]
            edge_info = ", ".join([f"To:{edge.end} Directed:{edge.directed}" for edge in edges])
            node_info = f"Name:{node.name}, Red:{node.red}, Source:{node.source}, Sink:{node.sink}"
            
            result += f"Node {node_name}:\n"
            result += f"{node_info}\n"
            result += f"{len(edges)} Edges: {edge_info}\n"
            result += "\n"

        return result
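
The `addEdge` method above files each edge under its `start` vertex only. If a search needs to cross an undirected edge from either endpoint, one option is to mirror it when the adjacency structure is built. The sketch below shows that idea on a plain dict; `build_adjacency` and the `(u, v, directed)` triple format are assumptions made for this illustration and are not part of the `Graph` class.

```python
def build_adjacency(edges):
    """Adjacency dict from (u, v, directed) triples, mirroring undirected edges."""
    adj = {}
    for u, v, directed in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, [])  # make sure vertices without outgoing edges appear too
        if not directed:
            adj[v].append(u)   # an undirected edge can be walked in both directions
    return adj


example = build_adjacency([("0", "1", False), ("1", "2", True)])
print(example)
```

With mirroring in place, any count of edges taken from the adjacency lists sees each undirected edge twice, which matters for edge-count based tests.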
        
             

In [159]:
### taking the input; only files ending in 'G-ex.txt' are parsed, and `g` holds the last graph read

import warnings
from tqdm import tqdm
import os
import networkx

warnings.filterwarnings('ignore')
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'


PATH = "data/"

files_list = os.listdir(PATH)

###  count number of graphs parsed from the folder
count_graphs = 0

for file in tqdm(files_list):
    if file.endswith('G-ex.txt'):
        full_path = os.path.join(PATH, file)
        if os.path.isfile(full_path):

            with open(full_path, 'r') as f:
                g = Graph()
                count_graphs += 1

                # Header: n vertices, m edges, r red vertices; then the names of s and t
                n, m, r = map(int, f.readline().split())
                s, t = f.readline().split()

                for i in range(n):
                    # A vertex line is "<name>" or "<name> *"; the trailing * marks a red vertex
                    name = f.readline().split()
                    is_red = len(name) > 1

                    g.addNode(i, name[0], source=(name[0] == s), sink=(name[0] == t), red=is_red)
                    if name[0] == s:
                        g.source = g.getNode(name[0])
                    if name[0] == t:
                        g.sink = g.getNode(name[0])

                for j in range(m):
                    # An edge line is "u -- v" for an undirected edge;
                    # any other separator is treated as a directed arc
                    start, directed, end = f.readline().split()

                    if directed == '--':
                        g.addEdge(start, end)
                    else:
                        g.directed = True
                        g.addEdge(start, end, directed=True)
                        

100%|██████████| 155/155 [00:00<00:00, 66589.89it/s]
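
To exercise the parsing logic without touching the `data/` folder, the same line-by-line reading can be run on an in-memory string. The sketch below is self-contained and does not use the `Graph` class; `parse_instance` is a hypothetical helper and the three-vertex sample is made up for illustration, not taken from the real instances.

```python
import io


def parse_instance(stream):
    """Parse one instance: header 'n m r', line 's t', n vertex lines, m edge lines."""
    n, m, r = map(int, stream.readline().split())
    s, t = stream.readline().split()

    vertices, red = [], set()
    for _ in range(n):
        parts = stream.readline().split()
        vertices.append(parts[0])
        if len(parts) > 1 and parts[1] == "*":  # a trailing * marks a red vertex
            red.add(parts[0])

    edges = []
    for _ in range(m):
        u, separator, v = stream.readline().split()
        edges.append((u, v, separator != "--"))  # third field: is the edge directed?

    if len(red) != r:
        raise ValueError("header r disagrees with the number of starred vertices")
    return vertices, red, edges, s, t


# Made-up sample in the input format described in the abstract.
sample = """3 2 1
a c
a
b *
c
a -- b
b -- c
"""
vertices, red, edges, s, t = parse_instance(io.StringIO(sample))
print(vertices, red, edges, s, t)
```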


In [160]:
print(g)

Graph Information:
Node 0:
Name:0, Red:False, Source:True, Sink:False
3 Edges: To:1 Directed:False, To:4 Directed:False, To:5 Directed:False

Node 1:
Name:1, Red:False, Source:False, Sink:False
1 Edges: To:2 Directed:False

Node 2:
Name:2, Red:False, Source:False, Sink:False
1 Edges: To:3 Directed:False

Node 3:
Name:3, Red:False, Source:False, Sink:True
0 Edges: 

Node 4:
Name:4, Red:True, Source:False, Sink:False
1 Edges: To:3 Directed:False

Node 5:
Name:5, Red:True, Source:False, Sink:False
1 Edges: To:6 Directed:False

Node 6:
Name:6, Red:False, Source:False, Sink:False
1 Edges: To:7 Directed:False

Node 7:
Name:7, Red:True, Source:False, Sink:False
1 Edges: To:3 Directed:False




In [161]:
from collections import deque


class pathFinding:
    def __init__(self, G):
        self.G = G
        self.s = self.G.source
        self.t = self.G.sink

    def BFS(self):
        """Shortest s,t-path that avoids red vertices internally (sub-task None).

        Returns the path as a list of node names, or None if no such path exists.
        """
        queue = deque([self.s])    # Node objects waiting to be expanded
        visited = {self.s.name}    # names of nodes already discovered
        parent = {}                # child name -> name of the node it was discovered from

        while queue:
            current_node = queue.popleft()

            # Reached the sink: rebuild the path (just [s] when source == sink)
            if current_node == self.t:
                return self.reconstructPath(parent, current_node.name)

            # A red node may only be an endpoint of the path, so never expand it internally
            if current_node.red and current_node != self.s:
                continue

            # Explore neighbors along the outgoing edges
            for edge in self.G.network[current_node.name]:
                neighbor = edge.end  # name of the neighboring node
                if neighbor not in visited:
                    visited.add(neighbor)
                    parent[neighbor] = current_node.name
                    queue.append(self.G.getNode(neighbor))

        # The queue emptied without reaching the sink: no valid path
        return None

    def reconstructPath(self, parent, current_node):
        # Follow parent links back from the sink, then reverse into source -> sink order
        path = [current_node]
        while current_node in parent:
            current_node = parent[current_node]
            path.append(current_node)
        path.reverse()
        return path

In [162]:
pathfinding_g = pathFinding(g)
paths = pathfinding_g.BFS()
if paths:
    path_length = len(paths) - 1  # number of edges = number of vertices on the path - 1
    path_nodes = ' --> '.join([str(node) for node in paths])
    print(f'Path Length = {path_length}\nWith Path: {path_nodes}')
else:
    print(-1)


Path Length = 3
With Path: 0 --> 1 --> 2 --> 3
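
The Alternate sub-task from the abstract can reuse the same BFS idea: only follow an edge when its two endpoints have different colours. The sketch below is self-contained and works on a plain adjacency dict `adj` and a set `red` rather than on the `Graph` class; the name `alternate` is hypothetical. Because BFS reaches `t` along a shortest walk in the filtered graph, the walk it finds never repeats a vertex and is therefore a valid path.

```python
from collections import deque


def alternate(adj, red, s, t):
    """True if some s,t-path alternates between red and non-red vertices."""
    seen = {s}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        if u == t:
            return True
        for v in adj.get(u, ()):
            # Keep the edge only if exactly one of its endpoints is red.
            if v not in seen and ((u in red) != (v in red)):
                seen.add(v)
                queue.append(v)
    return False


line = {"s": ["a"], "a": ["t"], "t": []}
print(alternate(line, {"a"}, "s", "t"), alternate(line, set(), "s", "t"))
```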


### Individual Graphs

### WORD GRAPHS

In [146]:
for i in range(10):
    
    if i == 5:
        continue
    print(i)

0
1
2
3
4
6
7
8
9
