In [4]:
import requests
from datetime import datetime, timedelta
from typing import List
import json

from optimization import improve_graph

def get_task_goals(
    base_url: str = "https://stackvm.tidb.ai",
    last_hours: int = 2
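) -> List[str]:
    """Fetch tasks from the evaluation endpoint for the last `last_hours` hours
    and return their non-empty goals. Returns an empty list on any error."""
    try: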
        end_time = datetime.utcnow()
        start_time = end_time - timedelta(hours=last_hours)
        
        # Strip any trailing slash so the path does not start with "//".
        url = f"{base_url.rstrip('/')}/api/tasks/evaluation"
        params = {
            "start_time": start_time.isoformat(),
            "end_time": end_time.isoformat()
        }
        # A timeout keeps the polling loop from hanging on a stalled connection.
        response = requests.get(url, params=params, timeout=30)
        response.raise_for_status()
        
        tasks = response.json()
        
        # Keep only tasks that carry a non-empty goal.
        goals = [task["goal"] for task in tasks if task.get("goal")]
        
        return goals
        
    except requests.exceptions.RequestException as e:
        print(f"Error fetching task goals: {e}")
        return []
    except json.JSONDecodeError as e:
        print(f"Error parsing JSON: {e}")
        return []
    except Exception as e:
        print(f"Unknown error: {e}")
        return []
import time

while True:
    goals = get_task_goals()
    for goal in goals:
        improve_graph(goal)
    
    # Sleep for 10 minutes before next iteration
    print(f"Sleeping for 10 minutes... Current time: {datetime.now()}")
    time.sleep(600)  # 600 seconds = 10 minutes
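
The loop above polls every 10 minutes but asks for a 2-hour window, so a goal can be returned by many consecutive iterations and passed to `improve_graph` each time. If that reprocessing is not intended, a minimal sketch of one way to skip goals already handled in this session follows. The helper `process_new_goals` and the in-memory `seen` set are hypothetical names introduced for illustration and are not part of the original notebook.

```python
from typing import Callable, Iterable, List, Set


def process_new_goals(
    goals: Iterable[str],
    seen: Set[str],
    handler: Callable[[str], None],
) -> List[str]:
    """Call `handler` once per goal not already in `seen`.

    Returns the goals that were handled in this call.
    """
    handled: List[str] = []
    for goal in goals:
        if goal in seen:
            continue  # already handled earlier in this session
        handler(goal)
        seen.add(goal)  # recorded only after the handler returned without raising
        handled.append(goal)
    return handled
```

Inside the loop this would be called as `process_new_goals(get_task_goals(), seen, improve_graph)` with `seen = set()` created once before `while True:`. The set lives only in memory, so a kernel restart starts from an empty history.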

2025-05-24 13:38:47 - root - INFO - AFC is enabled with max remote calls: 10.


Found new issues 0, total issues 134
processing issue 0 for gemini-2.5-pro-critic redundancy_entity affected entities: [33, 42849, 362784] 0.0


2025-05-24 13:39:29 - root - INFO - AFC is enabled with max remote calls: 10.


```json
{
  "is_valid": true,
  "critique": "The issue identified is valid. The entities `DEADLOCKS` (id: 33924), `INFORMATION_SCHEMA.DEADLOCKS` (id: 42849), and `DEADLOCKS Table` (id: 362784) are redundant as they all refer to the exact same real-world concept: the system table within TiDB's `INFORMATION_SCHEMA` that stores deadlock information for a specific TiDB node.\n\nHere's a detailed analysis based on the provided graph data and guidelines:\n\n1.  **Names and Aliases**: \n    *   Entity 33924: `\"name\": \"DEADLOCKS\"` - This is a common, concise way to refer to the table.\n    *   Entity 42849: `\"name\": \"INFORMATION_SCHEMA.DEADLOCKS\"` - This is the fully qualified SQL name for the table, clearly identifying it within the `INFORMATION_SCHEMA`.\n    *   Entity 362784: `\"name\": \"DEADLOCKS Table\"` - This is a descriptive name that explicitly calls it a table.\n    These names, while different in form, all point to the same database object.\n\n2.  **Descriptions**: \n    * 

2025-05-24 13:41:01 - root - INFO - AFC is enabled with max remote calls: 10.


```json
{
  "is_valid": true,
  "critique": "The issue claims redundancy among a set of relationships connecting 'TiDB' and 'Pessimistic Transaction Mode'. After careful analysis, the issue is deemed valid, as a specific form of redundancy exists within the identified set, although the reasoning provided is somewhat imprecise for the entire group of relationships.\n\n**Detailed Analysis:**\n\n1.  **Correction of Affected Relationships:** The provided list of affected relationships is `[242589, 32911, 60213, 33552, 92635, 46481]`. Relationship `242589` is incorrectly included. Its definition is:\n    *   `242589`: Source: \"TiDB\" (1590025), Target: \"autocommit statement retry\" (242699), Relationship: \"TiDB supports autocommit statement retry to automatically retry statements in case of failure.\"\n    This relationship does not involve \"Pessimistic Transaction Mode\" (ID 1320209) as the target entity. Therefore, it's not relevant to the core claim about redundancy between 'TiDB' an

2025-05-24 13:42:07 - root - INFO - AFC is enabled with max remote calls: 10.


```json
{
  "is_valid": true,
  "critique": "The issue identifies relationships [35611, 45178, 46478] as affected by redundancy, reasoning that they 'convey essentially the same connection' related to 'TiDB's support for the optimistic transaction model.' This assessment is largely valid, particularly concerning relationships 45178 and 46478.\n\n**Analysis of Affected Relationships:**\n\n1.  **Target Entity Redundancy**: \n    *   Relationship 46478 points to \"Optimistic Transaction Model\" (ID: 690060).\n    *   Relationship 45178 points to \"Optimistic Transaction Mode\" (ID: 1320043).\n    The entities 690060 (\"Optimistic Transaction Model\") and 1320043 (\"Optimistic Transaction Mode\") are highly likely redundant duplicates. Their names are near-identical, and their descriptions both define the concept of optimistic transactions in TiDB, albeit with slightly different details and focuses. The 'Redundant Relationships' guideline allows for identifying redundancy when target entit

2025-05-24 13:43:12 - root - INFO - AFC is enabled with max remote calls: 10.


```json
{
  "is_valid": true,
  "critique": "The issue identified for entity 'locks' (id: 31082) is valid. The reasoning states that its description, 'Locks held by transactions in TiKV', is too vague and doesn't provide enough context to distinguish it from other lock-related concepts in the graph. This aligns with the 'Ambiguous Definition/Description' subtype of an entity quality issue.\n\nHere's a detailed analysis:\n\n1.  **Ambiguity regarding Lock Types in TiKV**: TiKV, as the storage layer for TiDB, supports different transaction models, notably pessimistic and optimistic. These models employ distinct locking mechanisms or primitives:\n    *   **Pessimistic transactions** involve acquiring explicit locks on data keys in TiKV.\n    *   **Optimistic transactions** involve writing 'lock' information (e.g., during the prewrite phase of 2PC) to key-value pairs, which serve a similar purpose but have different semantics and lifecycles than pessimistic locks.\n    The current name 'loc

2025-05-24 13:43:59 - root - INFO - AFC is enabled with max remote calls: 10.


```json
{
  "is_valid": true,
  "critique": "The issue identified for entity `42413` ('resolve lock') is valid. The description provided, 'A process in TiDB related to lock resolution,' aligns with the 'Meaningless or Fundamentally Vague Description' subtype of an `entity_quality_issue`.\n\nHere's a detailed analysis:\n\n1.  **Lack of Specificity**: The description is extremely high-level. While it tells us the entity is a 'process' within 'TiDB' and concerns 'lock resolution,' it omits crucial details. For instance, it doesn't specify:\n    *   The types of locks this process resolves (e.g., row locks, table locks, transactional locks, metadata locks).\n    *   The context or trigger for this process (e.g., does it occur after transaction completion, deadlock detection, or to clean up orphaned locks?).\n    *   The actual mechanism or outcome of the 'resolution' (e.g., releasing locks, notifying waiting transactions, rolling back operations).\n    *   Its specific role or component wi

2025-05-24 13:45:20 - root - INFO - AFC is enabled with max remote calls: 10.


```json
{
  "is_valid": true,
  "critique": "The issue correctly identifies that relationship `60904` presents a 'Contradictory Definitions' problem. This is primarily due to an internal conflict within the attributes of relationship `60904` itself.\n\nRelationship `60904` is defined as:\n- `id: 60904`\n- `source_entity: \"TiDB\"` (id: 1590025)\n- `target_entity: \"Pessimistic Transaction Mode\"` (id: 1320209)\n- `relationship` (description): `\"TiDB uses the optimistic transaction model.\"`\n\nThe core contradiction lies in the fact that the relationship's description explicitly refers to the \"optimistic transaction model,\" while its `target_entity` is \"Pessimistic Transaction Mode.\" This creates a conflicting definition within the relationship, as the description does not align with the target it's supposed to characterize or connect to. This internal inconsistency fundamentally obscures the purpose and meaning of this specific connection, fitting the 'Contradictory Definitions' 

2025-05-24 14:00:32 - httpx - INFO - HTTP Request: POST http://192.168.206.252:1234/v1/chat/completions "HTTP/1.1 200 OK"


updated entity {'name': 'locks', 'description': "Locks held by a specific transaction (start_ts: 442918429687808001) in TiKV Region 3121, resulting in 480,000 locks. These locks were identified as the root cause of stalled resolved-ts updates and hindered Stale Read operations. The transaction corresponds to an unexpected large-scale UPDATE statement ('update t set b = b + 1') that processed 10 million keys, creating excessive contention in the region.", 'meta': {'region': '3121', 'lock_count': '480000', 'start_ts': '442918429687808001', 'transaction_id': '2826881778407440457', 'keys_affected': '[74800000000000006A5F7280000000000405F6, ... , 74800000000000006A5F72800000000000EFF6, 74800000000000006A5F7280000000000721D9, 74800000000000006A5F72800000000002F691]', 'resolver_status': 'Resolver tracked index (2477) matched applied index, indicating resolver was the bottleneck', 'problem_cause': "Uncontrolled large transaction processing 10 million keys via 'update t set b = b + 1' statement

2025-05-24 14:00:33 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-05-24 14:00:34 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"


Success update entity(3) 31082 to {'name': 'locks', 'description': "Locks held by a specific transaction (start_ts: 442918429687808001) in TiKV Region 3121, resulting in 480,000 locks. These locks were identified as the root cause of stalled resolved-ts updates and hindered Stale Read operations. The transaction corresponds to an unexpected large-scale UPDATE statement ('update t set b = b + 1') that processed 10 million keys, creating excessive contention in the region.", 'meta': {'region': '3121', 'lock_count': '480000', 'start_ts': '442918429687808001', 'transaction_id': '2826881778407440457', 'keys_affected': '[74800000000000006A5F7280000000000405F6, ... , 74800000000000006A5F72800000000000EFF6, 74800000000000006A5F7280000000000721D9, 74800000000000006A5F72800000000002F691]', 'resolver_status': 'Resolver tracked index (2477) matched applied index, indicating resolver was the bottleneck', 'problem_cause': "Uncontrolled large transaction processing 10 million keys via 'update t set b

2025-05-24 14:01:21 - httpx - INFO - HTTP Request: POST http://192.168.206.252:1234/v1/chat/completions "HTTP/1.1 200 OK"


updated entity {'name': 'resolve lock', 'description': "A process in TiDB's internal lock management system responsible for resolving and releasing locks held by transactions to maintain data consistency. Specifically, it handles the cleanup of pessimistic transaction locks during garbage collection (GC). A known issue (fixed in TiDB 6.1.7) occurred when this process might hang indefinitely if there was a sudden change in the PD (Placement Driver) component's time, disrupting normal operation.", 'meta': {'related_issue': '44822', 'topic': 'TiDB internal', 'affected_component': 'Lock Management / Garbage Collection', 'fix_version': 'TiDB 6.1.7'}}


2025-05-24 14:01:22 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-05-24 14:01:23 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"


Success update entity(4) 42413 to {'name': 'resolve lock', 'description': "A process in TiDB's internal lock management system responsible for resolving and releasing locks held by transactions to maintain data consistency. Specifically, it handles the cleanup of pessimistic transaction locks during garbage collection (GC). A known issue (fixed in TiDB 6.1.7) occurred when this process might hang indefinitely if there was a sudden change in the PD (Placement Driver) component's time, disrupting normal operation.", 'meta': {'related_issue': '44822', 'topic': 'TiDB internal', 'affected_component': 'Lock Management / Garbage Collection', 'fix_version': 'TiDB 6.1.7'}}
Success to resolve entity 3
Success to resolve entity 4
pendding redundancy entity number 2
start to merge entity(('redundancy_entity', (36192, 361046, 420035, 1320059))) for {'issue_type': 'redundancy_entity', 'affected_ids': [36192, 420035, 1320059, 361046], 'reasoning': 'These entities (36192, 1320059, 361046, 420035) all 