## Graph

Here’s a comparison table between graphs and trees, along with their common use cases:

| **Aspect**              | **Graph**                              | **Tree**                              |
|-------------------------|----------------------------------------|---------------------------------------|
| **Definition**           | A collection of nodes (vertices) connected by edges. Graphs can have cycles and multiple edges between nodes. | A type of graph with a hierarchical structure, where there are no cycles, and there is exactly one path between any two nodes. |
| **Structure**            | Can have multiple connected components and cycles. | A single connected component with a hierarchy (parent-child relationship). |
| **Edges**                | Edges can be directed or undirected; multiple edges between two nodes (multi-edges) are allowed. | In graph theory a tree is undirected; in a rooted tree each edge is read as pointing from parent to child. There is at most one edge between any two nodes. |
| **Cycles**               | Can contain cycles (e.g., a node can be part of a cycle). | No cycles. A rooted tree with edges directed away from the root is a special case of a directed acyclic graph (DAG), but most DAGs are not trees. |
| **Degree**               | Nodes can have any number of neighbors (edges). | Each node has at most one parent but potentially many children. |
| **Connectivity**         | Can be disconnected (not all vertices are reachable from others). | Always connected (except in a forest, which is a collection of disjoint trees). |
| **Root**                 | No specific root node (although it can be directed). | Has a single root node from which all other nodes descend. |
| **Traversal Methods**    | DFS (Depth-First Search), BFS (Breadth-First Search), etc. | DFS, BFS, Pre-order, In-order, Post-order. |
| **Storage Complexity**   | An adjacency list takes O(V + E) space, so dense graphs (E close to V²) need noticeably more memory. | A tree with n nodes has exactly n - 1 edges, so storage grows linearly with the number of nodes. |
| **Flexibility**          | Highly flexible for representing various relationships (e.g., social networks, routing algorithms). | More constrained, designed to represent hierarchical structures. |
| **Example Representation** | Social networks, road networks, dependency graphs. | File system hierarchy, organizational chart, decision trees. |
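
The traversal methods named in the table can be sketched on a plain adjacency dictionary. This is a minimal illustration, not part of the original notebook; the function names `bfs` and `dfs` and the sample graph are assumptions made for the example.

```python
from collections import deque

def bfs(adj, start):
    """Return the vertices reachable from start in breadth-first order."""
    order = [start]
    seen = {start}
    queue = deque([start])
    while queue:
        vertex = queue.popleft()
        for neighbor in adj.get(vertex, []):
            if neighbor not in seen:
                seen.add(neighbor)
                order.append(neighbor)
                queue.append(neighbor)
    return order

def dfs(adj, start, seen=None):
    """Return the vertices reachable from start in depth-first (pre-order) order."""
    if seen is None:
        seen = set()
    seen.add(start)
    order = [start]
    for neighbor in adj.get(start, []):
        if neighbor not in seen:
            order.extend(dfs(adj, neighbor, seen))
    return order

# A small undirected graph containing the cycle A-B-D-C-A
adj = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C"],
}
print(bfs(adj, "A"))  # ['A', 'B', 'C', 'D']
print(dfs(adj, "A"))  # ['A', 'B', 'D', 'C']
```

The `seen` set is what makes both traversals safe on graphs with cycles; on a tree it is never needed because no vertex can be reached twice.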

### Use Cases

#### **Graphs**:
1. **Social Networks**: Graphs can represent users as nodes and connections (e.g., friendships) as edges. Algorithms like PageRank or community detection can be applied.
2. **Routing Algorithms**: In networking or transportation systems, graphs represent paths and intersections, and algorithms like Dijkstra’s help find the shortest path.
3. **Recommendation Systems**: In collaborative filtering, users and items are connected, and a graph structure helps find similarities or recommend items.
4. **Dependency Graphs**: Used in build systems or package managers to represent dependencies between software modules.
5. **Web Crawling**: Web pages as nodes, and hyperlinks as edges to traverse the internet or index pages.
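
As one illustration of the dependency-graph use case, a build order can be derived with Kahn's algorithm for topological sorting. This is a sketch under assumptions: the function name `topological_order` and the module names are hypothetical, and ties are broken alphabetically only to make the output deterministic.

```python
from collections import deque

def topological_order(dependencies):
    """Kahn's algorithm. dependencies maps each module to the modules it needs."""
    # Collect every module, including ones that only appear as a dependency
    modules = set(dependencies)
    for needed in dependencies.values():
        modules.update(needed)

    # remaining[m] = number of dependencies of m that are not built yet
    remaining = {m: len(set(dependencies.get(m, ()))) for m in modules}
    dependents = {m: [] for m in modules}
    for module, needed in dependencies.items():
        for dep in set(needed):
            dependents[dep].append(module)

    ready = deque(sorted(m for m in modules if remaining[m] == 0))
    order = []
    while ready:
        module = ready.popleft()
        order.append(module)
        for dependent in sorted(dependents[module]):
            remaining[dependent] -= 1
            if remaining[dependent] == 0:
                ready.append(dependent)

    # Any module left over is part of (or blocked by) a cycle
    if len(order) != len(modules):
        raise ValueError("dependency cycle detected")
    return order

deps = {
    "app": ["lib", "utils"],
    "lib": ["utils"],
    "utils": [],
}
print(topological_order(deps))  # ['utils', 'lib', 'app']
```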

#### **Trees**:
1. **File Systems**: The structure of directories and files is hierarchical, with the root directory at the top and files as leaf nodes.
2. **Database Indexing**: B-trees and binary search trees are used to store sorted data for fast retrieval in databases.
3. **Decision Trees**: Used in machine learning for classification and regression tasks where each internal node represents a decision, and the leaves represent outcomes.
4. **XML/JSON Parsing**: Represents data with a clear hierarchy where elements are nodes, and child nodes are related to the parent node (e.g., DOM tree).
5. **Game Trees**: In artificial intelligence, used to represent possible moves in games (e.g., chess, tic-tac-toe), where each node represents a game state.
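
To make the indexing use case concrete, here is a minimal binary search tree sketch showing why an in-order traversal returns keys in sorted order. The `Node` class and helper names are assumptions for illustration; real database indexes use balanced structures such as B-trees rather than this unbalanced tree.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def insert(root, key):
    """Insert key into the BST rooted at root and return the root."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root  # equal keys are ignored

def pre_order(root):
    if root is None:
        return []
    return [root.key] + pre_order(root.left) + pre_order(root.right)

def in_order(root):
    if root is None:
        return []
    return in_order(root.left) + [root.key] + in_order(root.right)

def post_order(root):
    if root is None:
        return []
    return post_order(root.left) + post_order(root.right) + [root.key]

root = None
for key in [50, 30, 70, 20, 40, 60, 80]:
    root = insert(root, key)

print(in_order(root))    # [20, 30, 40, 50, 60, 70, 80]
print(pre_order(root))   # [50, 30, 20, 40, 70, 60, 80]
print(post_order(root))  # [20, 40, 30, 60, 80, 70, 50]
```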

In summary:
- **Graphs** are ideal for representing complex and flexible relationships, where cycles and multiple paths between nodes are common.
- **Trees** are better suited for hierarchical structures where relationships are acyclic, and there's a clear parent-child structure.

The main difference between a **graph** and a **tree** lies in their structure and properties:

- **Graph**: A graph is a general structure that consists of **nodes (vertices)** connected by **edges**. Graphs can have any number of edges between nodes, can be **directed or undirected**, and may contain **cycles** (paths that start and end at the same node). There is no inherent hierarchical structure, and a node may have any number of connections.

- **Tree**: A tree is a **special type of graph** that is **connected, acyclic**, and has a **hierarchical structure**. It has a single **root node** from which every other node descends. In a tree:
  - There are **no cycles**.
  - Each node, except the root, has exactly **one parent**.
  - There is exactly **one path** between any two nodes.
  - Trees are always **connected**, meaning all nodes are reachable from the root.

### Key Differences:
1. **Cycles**:
   - **Graph**: Can contain cycles (a node can be part of a cycle).
   - **Tree**: Cannot contain cycles (it is acyclic).
   
2. **Hierarchy**:
   - **Graph**: No inherent hierarchy.
   - **Tree**: Has a **hierarchical structure** with one root node and parent-child relationships.

3. **Connectivity**:
   - **Graph**: May be disconnected (not all nodes are reachable from others).
   - **Tree**: Always connected (except in the case of a forest, which is a collection of disjoint trees).

4. **Number of Paths**:
   - **Graph**: Multiple paths between nodes are allowed.
   - **Tree**: There is only **one path** between any two nodes.

In summary, a **tree** is a **special type of graph** that has strict rules about connectivity, acyclicity, and hierarchy. All trees are graphs, but not all graphs are trees.
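
The rules above can be turned into a check: an undirected graph with n vertices is a tree exactly when it is connected and has n - 1 edges. The sketch below assumes a simple graph (no self-loops or multi-edges) stored as a symmetric adjacency dictionary; the name `is_tree` is hypothetical.

```python
def is_tree(adj):
    """Return True if the undirected graph {vertex: neighbors} is a tree."""
    if not adj:
        return False  # convention assumed here: the empty graph is not a tree
    # Each undirected edge appears twice, once in each endpoint's list
    edge_count = sum(len(neighbors) for neighbors in adj.values()) // 2
    if edge_count != len(adj) - 1:
        return False
    # With exactly n - 1 edges, being connected also implies being acyclic
    start = next(iter(adj))
    seen = {start}
    stack = [start]
    while stack:
        vertex = stack.pop()
        for neighbor in adj[vertex]:
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return len(seen) == len(adj)

tree = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A"], "D": ["B"]}
triangle = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B"]}
# Right edge count (3 edges, 4 vertices) but D is unreachable
triangle_plus_isolated = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B"], "D": []}

print(is_tree(tree))                    # True
print(is_tree(triangle))                # False
print(is_tree(triangle_plus_isolated))  # False
```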

In [1]:
class Graph:
    def __init__(self, directed=False):
        self.graph = {}
        self.directed = directed

    def add_vertex(self, vertex):
        if vertex not in self.graph:
            self.graph[vertex] = []

    def add_edge(self,src, dest):
        # Check and initialize vertex
        if src not in self.graph:
            self.add_vertex(src)
        if dest not in self.graph:
            self.add_vertex(dest)
        # Create edge between src, dest
        self.graph[src].append(dest)
        # If graph is directed
        # Create edge from dst, src
        if not self.directed:
            self.graph[dest].append(src)
    
    def remove_edge(self, src, dest):
        if src in self.graph:
            if dest in self.graph[src]:
                self.graph[src].remove(dest)
        if not self.directed:
            if dest in self.graph and src in self.graph[dest]:
                self.graph[dest].remove(src)

    def remove_vertex(self, vertex):
        # Remove any edges from other vertices to this one
        if vertex in self.graph:
            for adj in list(self.graph):
                if vertex in self.graph[adj]:
                    self.graph[adj].remove(vertex)
            # Remove the vertex entry
            del self.graph[vertex]

    def get_adjacent_vertices(self, vertex):
        if vertex in self.graph:
            if vertext in self.graph:
                return self.graph[vertex]
        else:
            return []

    def __str__(self):
        return str(self.graph)

In [2]:
g = Graph(directed=True)
g.add_vertex("A")
g.add_vertex("B")
g.add_edge("A","B")
g.add_edge("A","C")

print(g)

{'A': ['B', 'C'], 'B': [], 'C': []}


**Prompt** : *You are an expert software developer or site reliability engineer, and your job is to ensure that this code runs effectively, quickly, at scale and securely. Please profile it and find any issues that need to be fixed or updated*

**GPT**

This code implements a `Graph` class in Python. Below are some profiling observations, potential issues, and suggestions for improvement:

---

### 1. **Error Handling and Validation**
   - **Issue**: The `get_adjacent_vertices` method contains a typo (`vertext`) and redundant checks for vertex existence. Also, the method does not handle invalid input gracefully.
   - **Fix**:
     - Correct the typo and simplify the logic.
     - Add validation for invalid input across the class methods, e.g., raising exceptions when required.

   **Updated `get_adjacent_vertices`**:
   ```python
   def get_adjacent_vertices(self, vertex):
       if vertex not in self.graph:
           raise ValueError(f"Vertex '{vertex}' does not exist in the graph.")
       return self.graph[vertex]
   ```

---

### 2. **Edge Addition Logic**
   - **Issue**: The `add_edge` method assumes that vertices not already in the graph should be automatically added. This behavior might not always be desirable and could lead to unintended consequences.
   - **Fix**: Consider making vertex addition explicit or providing a flag to toggle this behavior.

   **Alternative**:
   ```python
   def add_edge(self, src, dest, auto_add_vertices=True):
       if auto_add_vertices:
           self.add_vertex(src)
           self.add_vertex(dest)
       elif src not in self.graph or dest not in self.graph:
           raise ValueError("Both vertices must exist to add an edge.")
       self.graph[src].append(dest)
       if not self.directed:
           self.graph[dest].append(src)
   ```

---

### 3. **Performance Considerations**
   - **Issue**: Removing a vertex is inefficient because it scans every adjacency list to remove edges to the vertex. With list-based adjacency this is \(O(V+E)\), which might be slow for large graphs.
   - **Fix**: Store neighbors in a set instead of a list, so membership tests and removals are \(O(1)\) on average. `defaultdict(set)` is a convenient way to create those sets on first use. Note that `remove_vertex` still visits every vertex, so it drops to \(O(V)\) rather than \(O(1)\).

   **Updated data structure**:
   ```python
   from collections import defaultdict

   class Graph:
       def __init__(self, directed=False):
           self.graph = defaultdict(set)
           self.directed = directed
   ```
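
If vertex removal is a hot path, one further option is to track incoming edges as well, so that removal only touches the vertices that actually share an edge with the one being removed. The sketch below is an assumption-laden illustration for directed graphs; the class name `IndexedDiGraph` and its attributes are hypothetical and not part of the reviewed code. The trade-off is roughly double the memory for edges.

```python
class IndexedDiGraph:
    """Directed graph that stores both outgoing and incoming edges."""

    def __init__(self):
        self.successors = {}    # vertex -> set of vertices it points to
        self.predecessors = {}  # vertex -> set of vertices pointing to it

    def add_vertex(self, vertex):
        self.successors.setdefault(vertex, set())
        self.predecessors.setdefault(vertex, set())

    def add_edge(self, src, dest):
        self.add_vertex(src)
        self.add_vertex(dest)
        self.successors[src].add(dest)
        self.predecessors[dest].add(src)

    def remove_vertex(self, vertex):
        if vertex not in self.successors:
            raise KeyError(vertex)
        # Only visit the neighbors of `vertex`, not every vertex in the graph
        for pred in self.predecessors[vertex]:
            self.successors[pred].discard(vertex)
        for succ in self.successors[vertex]:
            self.predecessors[succ].discard(vertex)
        del self.successors[vertex]
        del self.predecessors[vertex]

g = IndexedDiGraph()
g.add_edge("A", "B")
g.add_edge("B", "C")
g.add_edge("C", "A")
g.remove_vertex("B")
print(g.successors)    # {'A': set(), 'C': {'A'}}
print(g.predecessors)  # {'A': {'C'}, 'C': set()}
```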

---

### 4. **Security and Integrity**
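   - **Issue**: The class does not validate vertex types. Vertices are used as dictionary keys, so unhashable objects such as lists fail with a low-level `TypeError`, and mutable objects with custom hashing can behave unpredictably.
   - **Fix**: Validate vertices up front. The whitelist below is deliberately strict and also rejects other hashable types such as tuples; widen it if your use case needs them.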

   **Type Enforcement**:
   ```python
   def add_vertex(self, vertex):
       if not isinstance(vertex, (int, str, float)):  # Adjust based on use case
           raise TypeError("Vertex must be a hashable type (int, str, float).")
       if vertex not in self.graph:
           self.graph[vertex] = set()
   ```

---

### 5. **Documentation and Usability**
   - **Issue**: The class lacks docstrings, making it hard for users to understand its behavior without inspecting the code.
   - **Fix**: Add docstrings for each method to clarify functionality and expected inputs/outputs.

   **Example**:
   ```python
   class Graph:
       """
       Represents a graph using an adjacency list.

       Attributes:
           graph: A dictionary where keys are vertices and values are sets of adjacent vertices.
           directed: Boolean indicating whether the graph is directed.
       """
       def __init__(self, directed=False):
           """
           Initializes a Graph instance.

           Args:
               directed (bool): Whether the graph is directed.
           """
           self.graph = defaultdict(set)
           self.directed = directed
   ```

---

### 6. **Readability and Code Quality**
   - **Issue**: Comments and spacing are inconsistent (e.g., `def add_edge(self,src, dest)`), and one comment is wrong: "If graph is directed / Create edge from dst, src" sits above the `if not self.directed:` branch, which runs for undirected graphs.
   - **Fix**: Use consistent, clear comments and maintain proper formatting.

---

### 7. **Edge Case Handling**
   - **Issue**: Removing a vertex or edge does not handle cases where the vertex/edge does not exist gracefully.
   - **Fix**: Provide meaningful error messages or handle these cases silently based on the desired behavior.

   **Example**:
   ```python
   def remove_vertex(self, vertex):
       if vertex not in self.graph:
           raise ValueError(f"Vertex '{vertex}' does not exist.")
       for adj in list(self.graph):
           self.graph[adj].discard(vertex)  # Use set for efficient removal
       del self.graph[vertex]
   ```

---

### 8. **Testing and Debugging**
   - **Issue**: No tests are provided to verify correctness.
   - **Fix**: Write unit tests for all methods, covering edge cases such as:
     - Adding duplicate vertices/edges.
     - Removing non-existent vertices/edges.
     - Directed vs undirected behavior.
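
The cases listed above could be covered with `unittest` roughly as follows. This is a sketch: the compact `Graph` class here is a stand-in that assumes set-based adjacency and auto-creation of vertices in `add_edge`, so adjust the expectations to whichever variant you adopt.

```python
import unittest

class Graph:
    """Compact stand-in for the class under test (assumed behavior)."""

    def __init__(self, directed=False):
        self.graph = {}
        self.directed = directed

    def add_vertex(self, vertex):
        self.graph.setdefault(vertex, set())

    def add_edge(self, src, dest):
        self.add_vertex(src)
        self.add_vertex(dest)
        self.graph[src].add(dest)
        if not self.directed:
            self.graph[dest].add(src)

    def remove_edge(self, src, dest):
        if src in self.graph:
            self.graph[src].discard(dest)
        if not self.directed and dest in self.graph:
            self.graph[dest].discard(src)

    def remove_vertex(self, vertex):
        if vertex not in self.graph:
            raise ValueError(f"Vertex '{vertex}' does not exist.")
        for neighbors in self.graph.values():
            neighbors.discard(vertex)
        del self.graph[vertex]


class GraphTests(unittest.TestCase):
    def test_duplicate_vertices_and_edges_are_ignored(self):
        g = Graph()
        g.add_vertex("A")
        g.add_vertex("A")
        g.add_edge("A", "B")
        g.add_edge("A", "B")
        self.assertEqual(g.graph, {"A": {"B"}, "B": {"A"}})

    def test_removing_missing_edge_is_a_no_op(self):
        g = Graph()
        g.add_edge("A", "B")
        g.remove_edge("A", "Z")
        self.assertEqual(g.graph, {"A": {"B"}, "B": {"A"}})

    def test_removing_missing_vertex_raises(self):
        with self.assertRaises(ValueError):
            Graph().remove_vertex("missing")

    def test_directed_edges_are_one_way(self):
        g = Graph(directed=True)
        g.add_edge("A", "B")
        self.assertEqual(g.graph, {"A": {"B"}, "B": set()})


# Run the suite programmatically so this also works inside a notebook cell
suite = unittest.defaultTestLoader.loadTestsFromTestCase(GraphTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```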

---

### Final Code (Updated Version)
```python
from collections import defaultdict

class Graph:
    """
    Represents a graph using an adjacency list.

    Attributes:
        graph: A dictionary where keys are vertices and values are sets of adjacent vertices.
        directed: Boolean indicating whether the graph is directed.
    """

    def __init__(self, directed=False):
        self.graph = defaultdict(set)
        self.directed = directed

    def add_vertex(self, vertex):
        if not isinstance(vertex, (int, str, float)):
            raise TypeError("Vertex must be a hashable type (int, str, float).")
        self.graph[vertex]  # defaultdict ensures this initializes

    def add_edge(self, src, dest):
        if src not in self.graph or dest not in self.graph:
            raise ValueError("Both vertices must exist to add an edge.")
        self.graph[src].add(dest)
        if not self.directed:
            self.graph[dest].add(src)

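    def remove_edge(self, src, dest):
        # Membership checks stop defaultdict from silently creating missing vertices
        if src in self.graph:
            self.graph[src].discard(dest)
        if not self.directed and dest in self.graph:
            self.graph[dest].discard(src)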

    def remove_vertex(self, vertex):
        if vertex not in self.graph:
            raise ValueError(f"Vertex '{vertex}' does not exist.")
        for adj in self.graph:
            self.graph[adj].discard(vertex)
        del self.graph[vertex]

    def get_adjacent_vertices(self, vertex):
        if vertex not in self.graph:
            raise ValueError(f"Vertex '{vertex}' does not exist.")
        return self.graph[vertex]

    def __str__(self):
        return str(dict(self.graph))
```

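This revised version addresses the performance, validation, and code-quality issues listed above. One behavior change to be aware of: `add_edge` no longer creates missing vertices, so the earlier example must call `g.add_vertex("C")` before `g.add_edge("A", "C")`.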

### Problem 1: Shortest route between nodes

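Dijkstra's Algorithm

Dijkstra's algorithm computes the shortest distance from a start vertex to every other vertex, provided all edge weights are non-negative. The implementation below uses a binary heap (`heapq`) as the priority queue.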

In [11]:
import heapq  # Importing the heapq module for priority queue functionality

class Graph:
    def __init__(self, directed=False):
        self.graph = {}
        self.directed = directed

    def add_vertex(self, vertex):
        if vertex not in self.graph:
            self.graph[vertex] = []

    def add_edge(self, src, dest, weight=1):
        # Check and initialize vertex
        if src not in self.graph:
            self.add_vertex(src)
        if dest not in self.graph:
            self.add_vertex(dest)
        # Create edge between src, dest with weight
        self.graph[src].append((dest, weight))
        if not self.directed:
            self.graph[dest].append((src, weight))
    
    def remove_edge(self, src, dest):
        if src in self.graph:
            self.graph[src] = [edge for edge in self.graph[src] if edge[0] != dest]
        if not self.directed:
            if dest in self.graph:
                self.graph[dest] = [edge for edge in self.graph[dest] if edge[0] != src]

    def remove_vertex(self, vertex):
        if vertex in self.graph:
            for adj in list(self.graph):
                self.graph[adj] = [edge for edge in self.graph[adj] if edge[0] != vertex]
            del self.graph[vertex]

    def get_adjacent_vertices(self, vertex):
        if vertex in self.graph:
            return [edge[0] for edge in self.graph[vertex]]
        else:
            return []

    def dijkstra(self, start):
        # Dictionary holding the shortest known distance to each vertex
        shortest_paths = {vertex: float('inf') for vertex in self.graph}
        shortest_paths[start] = 0
        
        # Priority queue to hold vertices to be processed
        priority_queue = [(0, start)]  # (distance, vertex)
        
        while priority_queue:
            current_distance, current_vertex = heapq.heappop(priority_queue)
            
            # If the current distance is already greater than the recorded one, skip processing
            if current_distance > shortest_paths[current_vertex]:
                continue
            
            # Explore neighbors of the current vertex
            for neighbor, weight in self.graph[current_vertex]:
                distance = current_distance + weight
                if distance < shortest_paths[neighbor]:
                    shortest_paths[neighbor] = distance
                    heapq.heappush(priority_queue, (distance, neighbor))
        
        return shortest_paths

    def shortest_path(self, start, end):
        # Despite the name, this returns the shortest distance, not the route
        if start not in self.graph or end not in self.graph:
            return None
        shortest_paths = self.dijkstra(start)
        
        # Return None when end is unreachable from start
        return shortest_paths[end] if shortest_paths[end] != float('inf') else None

    def __str__(self):
        return str(self.graph)

In [12]:
# Example Usage
g = Graph()

# Adding vertices and edges (with weights)
g.add_edge('A', 'B', 4)
g.add_edge('A', 'C', 2)
g.add_edge('B', 'C', 5)
g.add_edge('B', 'D', 10)
g.add_edge('C', 'D', 3)
g.add_edge('D', 'E', 7)
g.add_edge('C', 'E', 8)

# Finding the shortest distance from 'A' to 'D'
print(f"Shortest distance from A to D: {g.shortest_path('A', 'D')}")


Shortest distance from A to D: 5
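
The result above is a distance only. To recover the route as well, one common approach is to record each vertex's predecessor while relaxing edges and then walk those links backwards. The following is a standalone sketch, not part of the notebook's class: the function name `dijkstra_path` is hypothetical, and it takes an adjacency dictionary of `(neighbor, weight)` pairs in the same shape as `Graph.graph`.

```python
import heapq

def dijkstra_path(adj, start, end):
    """Return (distance, path) from start to end, or (None, []) if unreachable.

    Assumes non-negative edge weights.
    """
    inf = float("inf")
    distances = {start: 0}
    previous = {}
    queue = [(0, start)]
    while queue:
        distance, vertex = heapq.heappop(queue)
        if vertex == end:
            break  # the first time end is popped, its distance is final
        if distance > distances.get(vertex, inf):
            continue  # stale queue entry
        for neighbor, weight in adj.get(vertex, []):
            candidate = distance + weight
            if candidate < distances.get(neighbor, inf):
                distances[neighbor] = candidate
                previous[neighbor] = vertex
                heapq.heappush(queue, (candidate, neighbor))

    if end not in distances:
        return None, []
    # Walk the predecessor links back from end to start
    path = [end]
    while path[-1] != start:
        path.append(previous[path[-1]])
    path.reverse()
    return distances[end], path

# Same undirected example graph as in the cell above
edges = [("A", "B", 4), ("A", "C", 2), ("B", "C", 5), ("B", "D", 10),
         ("C", "D", 3), ("D", "E", 7), ("C", "E", 8)]
adj = {}
for src, dest, weight in edges:
    adj.setdefault(src, []).append((dest, weight))
    adj.setdefault(dest, []).append((src, weight))

print(dijkstra_path(adj, "A", "D"))  # (5, ['A', 'C', 'D'])
```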


### Problem 2: Traveling Salesman

In [13]:
from itertools import permutations

class Graph:
    def __init__(self, directed=False):
        self.graph = {}
        self.directed = directed

    def add_vertex(self, vertex):
        if vertex not in self.graph:
            self.graph[vertex] = {}

    def add_edge(self, src, dest, weight=1):
        if src not in self.graph:
            self.add_vertex(src)
        if dest not in self.graph:
            self.add_vertex(dest)
        self.graph[src][dest] = weight
        if not self.directed:
            self.graph[dest][src] = weight
    
    def remove_edge(self, src, dest):
        if src in self.graph and dest in self.graph[src]:
            del self.graph[src][dest]
        if not self.directed:
            if dest in self.graph and src in self.graph[dest]:
                del self.graph[dest][src]

    def remove_vertex(self, vertex):
        if vertex in self.graph:
            del self.graph[vertex]
        for v in self.graph:
            if vertex in self.graph[v]:
                del self.graph[v][vertex]

    def get_adjacent_vertices(self, vertex):
        if vertex in self.graph:
            return list(self.graph[vertex].keys())
        return []

    def get_edge_weight(self, src, dest):
        if src in self.graph and dest in self.graph[src]:
            return self.graph[src][dest]
        return float('inf')  # Return infinity if no edge exists

    def __str__(self):
        return str(self.graph)



In [15]:
class TSP:
    def __init__(self, graph):
        self.graph = graph

    def solve(self, start_vertex):
        vertices = list(self.graph.graph.keys())
        vertices.remove(start_vertex)
        
        min_path = None
        min_distance = float('inf')

        # Brute force: try every ordering of the remaining vertices, i.e. (n-1)! tours
        for perm in permutations(vertices):
            path = [start_vertex] + list(perm) + [start_vertex]
            distance = self.calculate_distance(path)
            
            if distance < min_distance:
                min_distance = distance
                min_path = path

        return min_path, min_distance

    def calculate_distance(self, path):
        distance = 0
        for i in range(len(path) - 1):
            distance += self.graph.get_edge_weight(path[i], path[i + 1])
        return distance

In [16]:
# Example usage
graph = Graph(directed=False)

# Add vertices and edges with weights
graph.add_edge('A', 'B', 1)
graph.add_edge('B', 'C', 2)
graph.add_edge('C', 'A', 2)
graph.add_edge('A', 'D', 4)
graph.add_edge('D', 'B', 3)
graph.add_edge('D', 'C', 1)

# Solve TSP
tsp = TSP(graph)
path, distance = tsp.solve('A')

print("Optimal path:", path)
print("Minimum distance:", distance)


Optimal path: ['A', 'B', 'D', 'C', 'A']
Minimum distance: 7
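
The brute-force search above becomes impractical beyond roughly ten vertices. A different exact technique, the Held-Karp dynamic program, reduces the cost to about O(n² · 2ⁿ) by reusing the best route for each subset of visited vertices. The sketch below is not the notebook's method; the function name `held_karp` is hypothetical, and it accepts a nested weight dictionary in the same shape as `Graph.graph`.

```python
from itertools import combinations

def held_karp(weights, start):
    """Exact TSP via dynamic programming over subsets. Returns (path, distance).

    weights is {u: {v: weight}}; missing edges count as infinite.
    """
    inf = float("inf")
    others = [v for v in weights if v != start]
    n = len(others)
    if n == 0:
        return [start], 0

    def w(u, v):
        return weights.get(u, {}).get(v, inf)

    # best[(mask, j)] = (cost, prev): cheapest route from start that visits
    # exactly the vertices in mask and ends at others[j]; prev is the index
    # visited just before j (None when j is the first stop)
    best = {(1 << j, j): (w(start, others[j]), None) for j in range(n)}

    for size in range(2, n + 1):
        for subset in combinations(range(n), size):
            mask = sum(1 << j for j in subset)
            for j in subset:
                prev_mask = mask & ~(1 << j)
                best[(mask, j)] = min(
                    ((best[(prev_mask, k)][0] + w(others[k], others[j]), k)
                     for k in subset if k != j),
                    key=lambda option: option[0],
                )

    full = (1 << n) - 1
    distance, last = min(
        ((best[(full, j)][0] + w(others[j], start), j) for j in range(n)),
        key=lambda option: option[0],
    )
    if distance == inf:
        return None, inf  # no tour visits every vertex

    # Rebuild the tour by following the stored predecessors backwards
    path, mask = [start], full
    while last is not None:
        path.append(others[last])
        mask, last = mask & ~(1 << last), best[(mask, last)][1]
    path.append(start)
    path.reverse()
    return path, distance

weights = {
    "A": {"B": 1, "C": 2, "D": 4},
    "B": {"A": 1, "C": 2, "D": 3},
    "C": {"B": 2, "A": 2, "D": 1},
    "D": {"A": 4, "B": 3, "C": 1},
}
path, distance = held_karp(weights, "A")
print(distance)  # 7
```

On an undirected graph the tour may come back in either direction, so compare distances rather than exact paths when checking it against the brute-force result.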
