# Metadata

**L1 Taxonomy** - Problem Solving & Algorithms

**L2 Taxonomy** - Graph Algorithms

**Subtopic** - Shortest Path Algorithms in Graphs

**Use Case** - Create a maze-solving algorithm that uses depth-first search with backtracking to find the shortest path from entrance to exit in a complex, branching maze.

**Programming Language** - Python

**Target Model** - GPT-4o

# Setup

```requirements.txt
```


# Prompt

I want to create a Python script that analyzes social network interactions using graph theory. The script must identify influencer nodes using centrality measures, detect community clusters with modularity optimization, and generate personalized recommendation feeds based on neighbor-set similarity.

**Input Format**

Data Structure: Directed weighted graph represented as a dictionary of dictionaries:  
  `{"user_id": {"neighbor_user_id": interaction_count, ...}, ...}`  
  Example: `{"A": {"B": 3, "C": 1}, "B": {"A": 2}}`  
  - User IDs: Uppercase letters only (A-Z).  
  - Interaction counts: Integers between 1 and 100.  
  - Graph size: 5 to 50 nodes.  
  - No self-loops.
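
Because the graph is directed, a user can appear only as a neighbor and never as a top-level key (user `C` in the example above). A minimal sketch of gathering every user ID, assuming a hypothetical helper name `collect_users` that is used here for illustration only and is not one of the required functions:

```python
def collect_users(graph: dict) -> list[str]:
    """Return every user ID in the graph, sorted alphabetically.

    Hypothetical illustration helper, not part of the required API.
    """
    users = set(graph)  # users that have an outgoing-edge entry
    for neighbors in graph.values():
        users.update(neighbors)  # users that only receive interactions
    return sorted(users)


example = {"A": {"B": 3, "C": 1}, "B": {"A": 2}}
print(collect_users(example))  # ['A', 'B', 'C']
```

Any per-user output, such as the recommendations dictionary, would need to iterate over this full set rather than over `graph.keys()` alone.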

**Output Format**  

Return a dictionary with three keys:  
```python
{
  "influencers": [user_id1, user_id2, ...],
  "communities": [[user_id1, user_id2], [user_id3], ...],  # sorted by size DESC
  "recommendations": {"user_x": [recommended_user1, ...], ...}
}
```

**Examples**

```python
Input: {"A": {"B": 5}, "B": {"A": 3, "C": 2}, "C": {"B": 4}}
Output: {
  "influencers": ["A", "B", "C"],
  "communities": [["A", "B", "C"]],
  "recommendations": {"A": ["C"], "B": [], "C": ["A"]}
}
```

```python
Input: {"A": {"B": 1}, "C": {"D": 1}}
Output: {
  "influencers": ["A", "C", "B", "D"],
  "communities": [["A", "B"], ["C", "D"]],
  "recommendations": {"A": ["C", "D"], "B": ["C", "D"], "C": ["A", "B"], "D": ["A", "B"]}
}
```



# Requirements

**Explicit & Implicit Points**  

1. Influencer Detection: Use degree centrality (sum of outgoing edge weights, i.e., interaction counts).  
2. Community Detection: Apply Louvain's algorithm with resolution=1.0.  
3. Recommendations: Use Jaccard similarity between neighbor sets.  
4. Validate all inputs match the format before processing.  
5. Return empty lists for recommendations if no valid suggestions exist.  
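
Requirements 1 and 3 can be sketched on the first example graph. This is a minimal illustration, not the required implementation: the function name `jaccard` is hypothetical, and treating two empty neighbor sets as similarity `0.0` is an assumption that the requirements do not state.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard similarity |a & b| / |a | b|, taken as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


graph = {"A": {"B": 5}, "B": {"A": 3, "C": 2}, "C": {"B": 4}}

# Requirement 1: score each user by the sum of outgoing interaction counts.
out_strength = {user: sum(nbrs.values()) for user, nbrs in graph.items()}

# Requirement 3: compare the outgoing neighbor sets of two users.
sim_a_c = jaccard(set(graph["A"]), set(graph["C"]))

print(out_strength)  # {'A': 5, 'B': 5, 'C': 4}
print(sim_a_c)       # 1.0
```

`A` and `C` both point only at `B`, so their neighbor sets are identical and the similarity is 1.0.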

**Solution Expectations**  

- The script must define only the following functions:  

  ```python
  def validate_graph(graph: dict) -> None:
  def compute_influencers(graph: dict, top_k: int = 3) -> list[str]:
  def detect_communities(graph: dict) -> list[list[str]]:
  def generate_recommendations(graph: dict, user: str) -> list[str]:
  ```  
- Main Function:  
  ```python
  def analyze_network(graph: dict) -> dict:
  ```

**Edge Case Behavior**  

Handle these edge cases in your implementation:  
- Empty Graph: Input `{}` -> Output all empty lists/dicts.  
- Isolated Node: Input `{"A": {}}` -> Communities: `[["A"]]`, Recommendations: `{"A": []}`.  
- Duplicate Edges: Reject if any user has duplicate neighbors (e.g., `{"A": {"B": 1, "B": 2}}`).  
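
One caveat with the duplicate-edge case: a Python dict cannot hold two entries for the same key, so by the time a dict reaches `validate_graph()` a repeated neighbor has already collapsed to its last value. The sketch below shows that behavior and, assuming the graph arrives as raw JSON text (an assumption, since the prompt passes a dict), one way to catch repeats at parse time with the standard `json` module's `object_pairs_hook`. The name `reject_duplicates` and its error message are hypothetical.

```python
import json

# A dict literal with a repeated key is legal Python; the later value wins.
literal = {"A": {"B": 1, "B": 2}}
print(literal)  # {'A': {'B': 2}}


def reject_duplicates(pairs):
    """object_pairs_hook that raises if a JSON object repeats a key."""
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise ValueError(f"duplicate key: {key}")
        seen[key] = value
    return seen


raw = '{"A": {"B": 1, "B": 2}}'
try:
    json.loads(raw, object_pairs_hook=reject_duplicates)
except ValueError as exc:
    print(exc)  # duplicate key: B
```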

**Constraints**  

1. Don't use any helper functions other than those listed in the function signatures.
2. Don't use `reversed()`, `itertools`, or `collections` (except `defaultdict`).  
3. Time complexity <= O(n^2) for n <= 50.  
4. Use iterative DFS in community detection.  
5. Recommendations must exclude existing neighbors.  
6. `validate_graph()` must raise `ValueError` with these exact messages:  
   - "Invalid user ID: must be A-Z"  
   - "Duplicate edge detected for user: X"  
   - "Interaction count out of range (1-100) for user: X"  
7. Influencer scores rounded to 3 decimal places.  
8. Communities with size=1 must still be included.  
9. Recommendations sorted by similarity score DESC; alphabetically if tied.
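
Constraints 2, 4, and 8 can be illustrated together with an explicit-stack traversal. This is a sketch under assumptions: the function name `connected_components` is hypothetical, edge direction is ignored for connectivity, and only the component-finding step is shown, not the modularity refinement.

```python
from collections import defaultdict


def connected_components(graph: dict) -> list[list[str]]:
    """Weakly connected components via an explicit stack (no recursion)."""
    adj = defaultdict(set)
    for u, nbrs in graph.items():
        adj[u]  # touch the key so isolated users still get an entry
        for v in nbrs:
            adj[u].add(v)
            adj[v].add(u)  # ignore direction when testing connectivity

    visited = set()
    components = []
    for start in sorted(adj):
        if start in visited:
            continue
        stack, comp = [start], []
        while stack:
            node = stack.pop()
            if node in visited:
                continue
            visited.add(node)
            comp.append(node)
            stack.extend(n for n in adj[node] if n not in visited)
        components.append(sorted(comp))

    # Singletons are kept; order by size DESC, then by first member.
    return sorted(components, key=lambda c: (-len(c), c[0]))


print(connected_components({"A": {"B": 1}, "C": {"D": 1}, "E": {}}))
# [['A', 'B'], ['C', 'D'], ['E']]
```

Using `list.pop()` on a plain list gives LIFO order without needing `reversed()` or `itertools`.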

In [None]:
"""Analyze social network graph: influencers, communities, and recommendations."""

from collections import defaultdict


def validate_graph(graph: dict) -> None:
    """Validate graph user IDs, interactions, self-loops, and duplicates."""
    for user, neighbors in graph.items():
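        # Keys must be single uppercase letters; the isinstance check keeps
        # non-string keys from raising TypeError instead of ValueError.
        if not (isinstance(user, str) and len(user) == 1
                and 'A' <= user <= 'Z'):
            raise ValueError("Invalid user ID: must be A-Z")
        if not isinstance(neighbors, dict):
            raise ValueError("Invalid user ID: must be A-Z")
        seen = set()
        for neighbor, count in neighbors.items():
            if not (isinstance(neighbor, str) and len(neighbor) == 1
                    and 'A' <= neighbor <= 'Z'):
                raise ValueError("Invalid user ID: must be A-Z")
            # dict keys are unique, so this branch is only a safeguard.
            if neighbor in seen:
                raise ValueError(f"Duplicate edge detected for user: {user}")
            # Self-loops reuse the ID message; no separate one is specified.
            if neighbor == user:
                raise ValueError("Invalid user ID: must be A-Z")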
            if not isinstance(count, int) or not (1 <= count <= 100):
                raise ValueError(
                    f"Interaction count out of range (1-100) for user: {user}"
                )
            seen.add(neighbor)


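def compute_influencers(graph: dict, top_k: int = 3) -> list[str]:
    """Rank all users by weighted out-degree, rounded to 3 decimals.

    Ties are broken alphabetically. top_k is kept for the required
    signature but is not applied: the expected outputs list every user.
    """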
    all_users = set(graph)
    for u in graph:
        all_users.update(graph[u])
    scores = {}
    for user in all_users:
        score = sum(graph[user].values()) if user in graph else 0
        scores[user] = round(score, 3)
    return sorted(scores, key=lambda x: (-scores[x], x))


def detect_communities(graph: dict) -> list[list[str]]:
    """Detect modularity-based communities using Louvain + iterative DFS."""
    all_users = set(graph)
    for u in graph:
        all_users.update(graph[u])

    adj = defaultdict(set)
    for u in graph:
        for v in graph[u]:
            adj[u].add(v)
            adj[v].add(u)
    for u in all_users:
        adj[u]  # touch the key so isolated users get an empty adjacency set

    visited = set()
    components = []

    for node in sorted(all_users):
        if node not in visited:
            stack = [node]
            comp = []
            while stack:
                curr = stack.pop()
                if curr not in visited:
                    visited.add(curr)
                    comp.append(curr)
                    for nei in adj[curr]:
                        if nei not in visited:
                            stack.append(nei)
            components.append(sorted(comp))

    out_wt = defaultdict(int)
    in_wt = defaultdict(int)
    for u in graph:
        for v, w in graph[u].items():
            out_wt[u] += w
            in_wt[v] += w
    total_wt = sum(out_wt.values())

    final_clusters = []

    for comp in components:
        labels = {u: i for i, u in enumerate(comp)}
        changed = True
        while changed:
            changed = False
            groups = defaultdict(list)
            for u in comp:
                groups[labels[u]].append(u)
            q = 0.0
            for group in groups.values():
                kout = sum(out_wt[x] for x in group)
                kin = sum(in_wt[x] for x in group)
                win = 0
                for x in group:
                    if x in graph:
                        for y in graph[x]:
                            if y in group:
                                win += graph[x][y]
                if total_wt:
                    q += win - (kout * kin) / total_wt
            q /= total_wt if total_wt else 1

            for u in comp:
                orig = labels[u]
                best = orig
                best_q = q
                for v in comp:
                    labels[u] = labels[v]
                    temp_groups = defaultdict(list)
                    for x in comp:
                        temp_groups[labels[x]].append(x)
                    temp_q = 0.0
                    for group in temp_groups.values():
                        kout = sum(out_wt[x] for x in group)
                        kin = sum(in_wt[x] for x in group)
                        win = 0
                        for x in group:
                            if x in graph:
                                for y in graph[x]:
                                    if y in group:
                                        win += graph[x][y]
                        if total_wt:
                            temp_q += win - (kout * kin) / total_wt
                    temp_q /= total_wt if total_wt else 1
                    if temp_q > best_q:
                        best = labels[v]
                        best_q = temp_q
                    labels[u] = orig
                if best != orig:
                    labels[u] = best
                    changed = True

        clustered = defaultdict(list)
        for u in comp:
            clustered[labels[u]].append(u)
        for group in clustered.values():
            final_clusters.append(sorted(group))

    return sorted(final_clusters, key=lambda c: (-len(c), c[0]))


def generate_recommendations(graph: dict, user: str) -> list[str]:
    """Generate recommendations based on Jaccard similarity of neighbors."""
    all_users = set(graph)
    for u in graph:
        all_users.update(graph[u])

    user_neigh = set(graph.get(user, {}))
    scores = []

    for candidate in sorted(all_users):
        if candidate == user or candidate in user_neigh:
            continue
        cand_neigh = set(graph.get(candidate, {}))
        union = user_neigh | cand_neigh
        inter = user_neigh & cand_neigh
        sim = len(inter) / len(union) if union else 0.0
        scores.append((sim, candidate))

    scores.sort(key=lambda x: (-x[0], x[1]))
    return [c for _, c in scores]


def analyze_network(graph: dict) -> dict:
    """Analyze graph and return influencers, communities, and recommendations."""
    validate_graph(graph)
    all_users = set(graph)
    for u in graph:
        all_users.update(graph[u])
    if not all_users:
        return {
            "influencers": [],
            "communities": [],
            "recommendations": {}
        }
    influencers = compute_influencers(graph)
    communities = detect_communities(graph)
    recommendations = {
        u: generate_recommendations(graph, u) for u in sorted(all_users)
    }
    return {
        "influencers": influencers,
        "communities": communities,
        "recommendations": recommendations
    }
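
The score that `detect_communities` above tries to increase can be isolated as a standalone function. This is a sketch, not part of the reference solution: the name `directed_modularity` is hypothetical, and the `resolution` parameter is an assumption added to mirror the `resolution=1.0` requirement (the solution code hard-codes a factor of 1).

```python
def directed_modularity(graph: dict, communities: list[list[str]],
                        resolution: float = 1.0) -> float:
    """Q = (1/m) * sum over c of [w_c - resolution * k_out(c) * k_in(c) / m].

    m is the total edge weight, w_c the weight of edges inside community c,
    and k_out(c) / k_in(c) the summed out- and in-weights of its members.
    """
    total = sum(w for nbrs in graph.values() for w in nbrs.values())
    if total == 0:
        return 0.0
    out_wt, in_wt = {}, {}
    for u, nbrs in graph.items():
        for v, w in nbrs.items():
            out_wt[u] = out_wt.get(u, 0) + w
            in_wt[v] = in_wt.get(v, 0) + w
    q = 0.0
    for members in communities:
        group = set(members)
        internal = sum(
            w for u in group
            for v, w in graph.get(u, {}).items() if v in group
        )
        k_out = sum(out_wt.get(u, 0) for u in group)
        k_in = sum(in_wt.get(u, 0) for u in group)
        q += internal - resolution * k_out * k_in / total
    return q / total


pairs = {"A": {"B": 1}, "B": {"A": 1}, "C": {"D": 1}, "D": {"C": 1}}
print(directed_modularity(pairs, [["A", "B"], ["C", "D"]]))  # 0.5
print(directed_modularity(pairs, [["A", "B", "C", "D"]]))    # 0.0
```

Splitting the two disconnected pairs scores higher than merging them, which matches the expectation in `test_detect_two_communities`.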


In [None]:
# tests
"""Unittest cases for social network graph analysis utilities."""

import unittest
from main import (
    validate_graph,
    compute_influencers,
    detect_communities,
    generate_recommendations,
    analyze_network,
)


class TestSocialNetworkAnalysis(unittest.TestCase):
    """Unit tests for social network analysis functions."""

    def test_valid_graph_passes(self):
        """Test that a valid graph passes validation."""
        graph = {"A": {"B": 10}, "B": {"C": 5}}
        validate_graph(graph)  # Should not raise

    def test_invalid_user_id(self):
        """Test that invalid user IDs raise ValueError."""
        with self.assertRaises(ValueError):
            validate_graph({"AA": {"B": 10}})

    def test_invalid_neighbor_id(self):
        """Test that invalid neighbor IDs raise ValueError."""
        with self.assertRaises(ValueError):
            validate_graph({"A": {"1": 5}})

    def test_invalid_interaction_count(self):
        """Test that out-of-range interaction count raises ValueError."""
        with self.assertRaises(ValueError):
            validate_graph({"A": {"B": 0}})

    def test_self_loop_detected(self):
        """Test that self-loop edges raise ValueError."""
        with self.assertRaises(ValueError):
            validate_graph({"A": {"A": 10}})

    def test_neighbor_overwrite_not_detected(self):
        """Test that overwriting neighbors does not cause validation error."""
        graph = {"A": {"B": 10}}  # Only one B, so no error expected
        validate_graph(graph)  # Should pass

    def test_compute_influencers_order(self):
        """Test that influencers are correctly sorted by score."""
        graph = {"A": {"B": 2}, "B": {"C": 3}, "C": {"A": 1}}
        result = compute_influencers(graph)
        self.assertEqual(result, ["B", "A", "C"])

    def test_compute_influencers_disconnected(self):
        """Test that disconnected users are included with zero score."""
        graph = {"A": {"B": 5}, "C": {}, "D": {}}
        result = compute_influencers(graph)
        self.assertIn("C", result)
        self.assertIn("D", result)

    def test_detect_single_community(self):
        """Test that a single connected component forms one community."""
        graph = {"A": {"B": 1}, "B": {"A": 1}}
        result = detect_communities(graph)
        self.assertEqual(result, [["A", "B"]])

    def test_detect_two_communities(self):
        """Test that two disjoint components form two communities."""
        graph = {
            "A": {"B": 1},
            "B": {"A": 1},
            "C": {"D": 1},
            "D": {"C": 1},
        }
        result = detect_communities(graph)
        self.assertEqual(result, [["A", "B"], ["C", "D"]])

    def test_generate_recommendations_simple(self):
        """Test basic recommendation generation."""
        graph = {"A": {"B": 1}, "B": {"C": 1}, "C": {"D": 1}, "D": {}}
        result = generate_recommendations(graph, "A")
        self.assertIn("C", result)

    def test_generate_recommendations_none(self):
        """Test that no recommendations are made when none exist."""
        graph = {"A": {"B": 1}, "B": {"A": 1}}
        result = generate_recommendations(graph, "A")
        self.assertEqual(result, [])

    def test_generate_recommendations_jaccard_tie_break(self):
        """Test recommendations are ordered by descending Jaccard score."""
        graph = {
            "A": {"B": 1, "C": 1},
            "D": {"B": 1, "C": 1},
            "E": {"C": 1, "F": 1},
        }
        result = generate_recommendations(graph, "A")
        self.assertEqual(result, ["D", "E", "F"])

    def test_analyze_network_empty(self):
        """Test network analysis with empty input."""
        graph = {}
        result = analyze_network(graph)
        self.assertEqual(
            result,
            {
                "influencers": [],
                "communities": [],
                "recommendations": {},
            },
        )

    def test_analyze_network_small_graph(self):
        """Test network analysis on a small graph."""
        graph = {"A": {"B": 1}, "B": {"C": 2}, "C": {"A": 3}}
        result = analyze_network(graph)
        self.assertIn("A", result["influencers"])
        self.assertTrue(any("A" in c for c in result["communities"]))
        self.assertIn("B", result["recommendations"])

    def test_influencers_sorted_alphabetically_on_tie(self):
        """Test that influencers are sorted alphabetically on score tie."""
        graph = {"A": {"B": 5}, "C": {"D": 5}}
        result = compute_influencers(graph)
        self.assertEqual(result, ["A", "C", "B", "D"])

    def test_detect_communities_stable_output(self):
        """Test that community detection produces consistent output."""
        graph = {
            "A": {"B": 1},
            "B": {"A": 1, "C": 1},
            "C": {"B": 1},
            "D": {"E": 1},
            "E": {"D": 1},
        }
        result = detect_communities(graph)
        expected = [["A", "B", "C"], ["D", "E"]]
        self.assertEqual(result, expected)

    def test_generate_recommendations_jaccard_zero_division(self):
        """Test Jaccard similarity when both sets are empty."""
        graph = {"A": {}, "B": {}}
        result = generate_recommendations(graph, "A")
        self.assertEqual(result, ["B"])

    def test_validate_graph_non_dict_neighbors(self):
        """Test that non-dict neighbors raise ValueError."""
        with self.assertRaises(ValueError):
            validate_graph({"A": [("B", 1)]})

    def test_analyze_network_large_complete(self):
        """Test network analysis on a complete graph of 5 users."""
        graph = {
            chr(i): {chr(j): 1 for j in range(65, 70) if i != j}
            for i in range(65, 70)
        }
        result = analyze_network(graph)
        self.assertEqual(len(result["influencers"]), 5)
        self.assertEqual(len(result["communities"]), 1)
        self.assertTrue(
            all(len(v) <= 4 for v in result["recommendations"].values())
        )

    def test_recommendations_excludes_existing_neighbors(self):
        """Test that recommendations exclude already connected users."""
        graph = {
            "A": {"B": 1, "C": 1},
            "B": {"C": 1},
            "C": {"D": 1},
            "D": {},
        }
        recs = generate_recommendations(graph, "A")
        self.assertNotIn("B", recs)
        self.assertNotIn("C", recs)


# Model Breaking Proof

#### Model Breaking Task URL: <https://rlhf-v3.turing.com/prompt/ad819dab-7325-4182-916b-83e27e09b77e?origin=https%3A%2F%2Flabeling-ms.turing.com&redirect_url=https%3A%2F%2Flabeling-ms.turing.com%2Fconversations%2F67442%2Fview>

#### Model code:

```python
from typing import List, Dict
from collections import defaultdict
import networkx as nx

def validate_graph(graph: dict) -> None:
    for user, neighbors in graph.items():
        if not ('A' <= user <= 'Z'):
            raise ValueError("Invalid user ID: must be A-Z")
        seen = set()
        for neighbor, interaction_count in neighbors.items():
            if neighbor in seen:
                raise ValueError(f"Duplicate edge detected for user: {user}")
            if not ('A' <= neighbor <= 'Z'):
                raise ValueError("Invalid user ID: must be A-Z")
            if not (1 <= interaction_count <= 100):
                raise ValueError(f"Interaction count out of range (1-100) for user: {user}")
            seen.add(neighbor)

def compute_influencers(graph: dict, top_k: int = 3) -> List[str]:  # <Issue>: Return type must be `list[str]`, not `List[str]` per prompt
    centrality = {node: sum(neighbors.values()) for node, neighbors in graph.items()}
    # <Issue>: Influencer scores are not rounded to 3 decimal places as required by prompt
    influencers = sorted(centrality, key=lambda x: (-centrality[x], x))[:top_k]
    return influencers

def detect_communities(graph: dict) -> List[List[str]]:  # <Issue>: Return type must be `list[list[str]]`, not `List[List[str]]`
    G = nx.DiGraph()
    for user, neighbors in graph.items():
        for neighbor, weight in neighbors.items():
            G.add_edge(user, neighbor, weight=weight)
    
    undirected_G = G.to_undirected()
    communities = list(nx.community.louvain_communities(undirected_G, resolution=1.0, seed=42))
    return sorted(communities, key=lambda community: (-len(community), sorted(community)))  # <Issue>: Sorting not guaranteed to match size desc then lexicographically by first element

def generate_recommendations(graph: dict, user: str) -> List[str]:  # <Issue>: Return type must be `list[str]` per prompt
    user_neighbors = set(graph.get(user, {}))
    scores = {}
    for candidate in graph:
        if candidate != user and candidate not in user_neighbors:
            candidate_neighbors = set(graph.get(candidate, {}))
            intersection = user_neighbors & candidate_neighbors
            union = user_neighbors | candidate_neighbors
            if union:
                jaccard_score = len(intersection) / len(union)
                if jaccard_score > 0:  # <Issue>: Must include all candidates with similarity >= 0.0, prompt forbids thresholding to > 0
                    scores[candidate] = jaccard_score
    
    recommendations = sorted(scores, key=lambda x: (-scores[x], x))
    return recommendations

def analyze_network(graph: dict) -> dict:
    try:
        validate_graph(graph)
    except ValueError as e:
        return {"error": str(e)}  # <Issue>: Must propagate the `ValueError` directly, returning a dict violates prompt

    influencers = compute_influencers(graph)
    communities = detect_communities(graph)
    recommendations = {user: generate_recommendations(graph, user) for user in graph}
    # <Issue>: Recommendations must be generated for all users in the graph (including those who only appear as neighbors), not just `graph.keys()`

    return {
        "influencers": influencers,
        "communities": communities,
        "recommendations": recommendations
    }

# Example usage:
graph_input = {"A": {"B": 5}, "B": {"A": 3, "C": 2}, "C": {"B": 4}}
result = analyze_network(graph_input)
print(result)
```