In [1]:
from run import run_knowledge_agents
from model_ops import ModelOperation, ModelProvider

query = """
<temporal_context>
- Immediate Events (past 24 hours): Capture real-time developments
- Recent Patterns (past 72 hours): Identify emerging narratives
- Weekly Trends (past 7 days): Track pattern formation
- Monthly Context (past 30 days): Establish baseline trends
</temporal_context>

<focus_areas>
1. Technology & Innovation:
   - Breakthrough announcements and validation metrics
   - Product launches with market reception data
   - Research developments with citation patterns
   - Industry shifts with quantifiable impacts

2. Economic & Market Dynamics:
   - Market movements with volatility indicators
   - Economic indicators with historical comparisons
   - Policy impacts with measured outcomes
   - Industry transformations with adoption rates

3. Social & Cultural Impact:
   - Public discourse with sentiment metrics
   - Social movements with participation trends
   - Cultural phenomena with diffusion patterns
   - Community responses with engagement data

4. Policy & Governance:
   - Regulatory changes with compliance metrics
   - Political developments with stability indicators
   - International relations with diplomatic indices
   - Policy implementations with effectiveness measures

5. Narrative Analysis:
   - Information flow patterns
   - Discourse evolution metrics
   - Cross-domain influence measures
   - Stability indicators
</focus_areas>

<analysis_requirements>
1. Temporal Analysis:
   - Extract time-series data points (tx, vx)
   - Document temporal dependencies
   - Map causal chains

2. Pattern Recognition:
   - Identify cyclical patterns
   - Document correlation strengths
   - Flag regime changes

3. Contextual Integration:
   - Map narrative networks
   - Track information propagation
   - Measure cross-domain effects

4. Forecast Requirements:
   - Generate confidence intervals
   - Project trend trajectories
   - Identify stability thresholds
</analysis_requirements>
"""
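# Route each pipeline operation to a model provider.
providers = {
    ModelOperation.EMBEDDING: ModelProvider.OPENAI,
    ModelOperation.CHUNK_GENERATION: ModelProvider.OPENAI,
    ModelOperation.SUMMARIZATION: ModelProvider.GROK,
}

# With process_new=True the run reloads the source data and rebuilds the
# embeddings before answering (see "Processing new data..." in the log output).
chunks, summary = run_knowledge_agents(
    query=query,
    process_new=True,
    providers=providers,
)
print(summary)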
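
The `providers` mapping assigns one provider to each pipeline operation. The following is a minimal sketch of how such a mapping could be checked for gaps before a run. It assumes `ModelOperation` and `ModelProvider` behave like plain `Enum` classes; the stand-in enums and the `check_providers` helper are hypothetical and are not part of `model_ops`.

```python
from enum import Enum


class ModelOperation(Enum):  # hypothetical stand-in for model_ops.ModelOperation
    EMBEDDING = "embedding"
    CHUNK_GENERATION = "chunk_generation"
    SUMMARIZATION = "summarization"


class ModelProvider(Enum):  # hypothetical stand-in for model_ops.ModelProvider
    OPENAI = "openai"
    GROK = "grok"
    VENICE = "venice"


def check_providers(providers):
    """Return the operations that have no provider assigned."""
    return [op for op in ModelOperation if op not in providers]


providers = {
    ModelOperation.EMBEDDING: ModelProvider.OPENAI,
    ModelOperation.CHUNK_GENERATION: ModelProvider.OPENAI,
    ModelOperation.SUMMARIZATION: ModelProvider.GROK,
}

missing = check_providers(providers)
print("missing operations:", [op.name for op in missing])
```

Whether `run_knowledge_agents` falls back to defaults for missing operations is not shown in this notebook, so a check like this only reports gaps and does not fill them.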

INFO:root:OpenAI client initialized successfully
INFO:root:Grok client initialized successfully
INFO:root:Venice client initialized successfully
INFO:root:OpenAI client initialized successfully
INFO:root:Grok client initialized successfully
INFO:root:Venice client initialized successfully
Configuration loaded successfully
INFO:run:Configuration loaded successfully
Processing new data...
INFO:run:Processing new data...
INFO:root:Loading all CSV data from S3 bucket: rolling-data, prefix: data


All files in directory './data' have been removed.
Directory './data' already exists.
Directory './data/stratified/' created successfully.


INFO:root:Latest date processed: 2024-12-26 00:00:00+00:00
INFO:root:Combined data contains 6229940 rows.


Time column: posted_date_time, Strata column: None


Using openai for embeddings
INFO:run:Using openai for embeddings
INFO:embedding_ops:Creating knowledge base with embeddings...
Loading CSV files: 100%|██████████| 12/12 [00:00<00:00, 75.80it/s]
Processing articles:   0%|          | 0/48 [00:00<?, ?it/s]INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
Processing articles:   2%|▏         | 1/48 [00:03<02:33,  3.26s/it]INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
Processing articles:   4%|▍         | 2/48 [00:06<02:33,  3.33s/it]INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
Processing articles:   6%|▋         | 3/48 [00:11<03:05,  4.12s/it]INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
Processing articles:   8%|▊         | 4/48 [00:14<02:43,  3.71s/it]INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
Processing articles:  10%|█         | 5/48 [00:17<02:2
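
The `Processing articles` lines in the log carry a completed/total count and a seconds-per-item rate. As a rough sketch, those fields can be pulled out of a captured log line to estimate the remaining embedding time. The helper names are hypothetical, and the pattern only handles the `s/it` form seen in this log, not the `it/s` form that appears when items finish in under a second.

```python
import re

# Matches e.g. "2/48 [00:06<02:33,  3.33s/it]" inside a captured log line.
PROGRESS = re.compile(r"(\d+)/(\d+) \[[^\]]*?,\s*([\d.]+)s/it\]")


def parse_progress(line):
    """Return (done, total, seconds_per_item), or None if the line has no rate."""
    match = PROGRESS.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2)), float(match.group(3))


def estimate_remaining_seconds(line):
    """Estimate remaining time as (total - done) * seconds_per_item."""
    parsed = parse_progress(line)
    if parsed is None:
        return None
    done, total, seconds_per_item = parsed
    return (total - done) * seconds_per_item


sample = (
    "Processing articles:   4%|▍         | 2/48 [00:06<02:33,  3.33s/it]"
    'INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"'
)
print(parse_progress(sample), estimate_remaining_seconds(sample))
```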

### 1. Temporal Overview
**Time Range:** 2024-12-27 01:47:07 to 2024-12-29 11:19:11

**Active Periods:**
- **High Activity Spike:** The thread discussing the unusually high number of "twitter-react" threads (event4) showed significant activity on 2024-12-27T06:52:12Z, indicating a surge in discussions related to Twitter reactions.
- **Moderate Activity:** Threads related to political unrest in India (event1) and the potential impact of Elon Musk's actions (event10) showed consistent engagement over the period, with notable spikes on 2024-12-27T13:36:55Z and 2024-12-27T10:24:55Z respectively.
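
The summary does not say how activity spikes were identified. One simple way to reproduce this kind of check is to bucket post timestamps by hour and flag hours whose count is well above average. The sketch below assumes ISO 8601 timestamps such as those in the `posted_date_time` column; the helper names, the mean-plus-`k`-standard-deviations rule, and the sample data are illustrative assumptions, not the method used by the pipeline.

```python
from collections import Counter
from datetime import datetime
from statistics import mean, pstdev


def hourly_counts(timestamps):
    """Count posts per hour from ISO 8601 strings (a trailing 'Z' is accepted)."""
    counts = Counter()
    for ts in timestamps:
        dt = datetime.fromisoformat(ts.replace("Z", "+00:00"))
        counts[dt.replace(minute=0, second=0, microsecond=0)] += 1
    return counts


def find_spikes(counts, k=2.0):
    """Return hours whose count exceeds mean + k * population std deviation."""
    values = list(counts.values())
    if len(values) < 2:
        return []
    threshold = mean(values) + k * pstdev(values)
    return sorted(hour for hour, n in counts.items() if n > threshold)


# Synthetic data for illustration only: one post per hour, then a burst at 06:00.
timestamps = [f"2024-12-27T0{h}:15:00Z" for h in range(5)]
timestamps += ["2024-12-27T06:52:12Z"] * 10
for hour in find_spikes(hourly_counts(timestamps)):
    print("spike at", hour.isoformat())
```

With only a few buckets a fixed `k` is a blunt instrument, so on real data a rolling baseline may be a better fit.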

**Thread Distribution:**
- Threads were distributed across various topics, with a notable concentration on geopolitical issues, cryptocurrency speculation, and cultural critiques. The distribution suggests a diverse range of interests among users, with some threads gaining more traction due to their controversial nature or involvement of high-profile figures.

**Key Correlations:**
- There is a co

In [2]:
query = """<temporal_context>
- Real-time Market Dynamics (past 24 hours): Capture price movements, volume patterns, and sentiment shifts
- Short-term Trends (past 72 hours): Track momentum and narrative development
- Medium-term Patterns (past 7 days): Identify market regime changes and sector rotations
- Long-term Context (past 30 days): Establish baseline market conditions and macro trends
</temporal_context>

<focus_areas>
1. Market Structure Analysis:
   - Liquidity conditions and market depth
   - Institutional vs retail flow patterns
   - Cross-asset correlations
   - Market microstructure changes

2. Risk-Return Dynamics:
   - Volatility term structure
   - Risk premium evolution
   - Systematic risk factors
   - Market inefficiencies and anomalies

3. Sentiment & Positioning:
   - Crowd sentiment metrics
   - Positioning extremes
   - Narrative momentum
   - Consensus divergence

4. Macro Environment:
   - Monetary policy expectations
   - Fiscal policy impacts
   - Global capital flows
   - Regulatory landscape shifts

5. Innovation & Disruption:
   - Technology adoption curves
   - Business model evolution
   - Industry value chain shifts
   - Competitive dynamics
</focus_areas>

<analysis_requirements>
1. Signal Extraction:
   - Price-volume relationships
   - Sentiment-flow correlations
   - Cross-sectional patterns
   - Regime change indicators

2. Risk Assessment:
   - Tail risk probabilities
   - Correlation breakdowns
   - Liquidity conditions
   - Systemic risk factors

3. Opportunity Identification:
   - Value-momentum divergence
   - Narrative-price disconnects
   - Structural inefficiencies
   - Mean reversion potential

4. Forecast Parameters:
   - Confidence intervals by timeframe
   - Scenario probabilities
   - Position sizing signals
   - Risk-adjusted return expectations
</analysis_requirements>"""

# Embeddings stay on OpenAI; chunk generation and summarization use Venice.
providers = {
    ModelOperation.EMBEDDING: ModelProvider.OPENAI,
    ModelOperation.CHUNK_GENERATION: ModelProvider.VENICE,
    ModelOperation.SUMMARIZATION: ModelProvider.VENICE,
}

chunks, summary = run_knowledge_agents(
    query=query,
    process_new=True,
    providers=providers,
)
print(summary)

Configuration loaded successfully
INFO:run:Configuration loaded successfully
Processing new data...
INFO:run:Processing new data...
INFO:root:Loading all CSV data from S3 bucket: rolling-data, prefix: data


All files in directory './data' have been removed.
Directory './data' already exists.
Directory './data/stratified/' created successfully.


INFO:root:Latest date processed: 2024-12-26 00:00:00+00:00
INFO:root:Combined data contains 6234454 rows.


Time column: posted_date_time, Strata column: None


Using openai for embeddings
INFO:run:Using openai for embeddings
INFO:embedding_ops:Creating knowledge base with embeddings...
Loading CSV files: 100%|██████████| 10/10 [00:00<00:00, 60.11it/s]
Processing articles:   0%|          | 0/25 [00:00<?, ?it/s]INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
Processing articles:   4%|▍         | 1/25 [00:07<03:11,  7.97s/it]INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
Processing articles:   8%|▊         | 2/25 [00:16<03:04,  8.02s/it]INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
Processing articles:  12%|█▏        | 3/25 [00:28<03:38,  9.95s/it]INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
Processing articles:  16%|█▌        | 4/25 [00:36<03:14,  9.28s/it]INFO:httpx:HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
Processing articles:  20%|██        | 5/25 [00:42<02:4

### 1. Temporal Overview
Time Range: 2024-12-27 03:18:30 to 2024-12-29 12:15:44
Active Periods: 
- Key activity spikes were observed on December 27, 06:57:27, with the sharing of a list of URLs, possibly indicating sources of information or tools related to mapping and tracking.
- Another spike occurred on December 28, 05:20:57, with a post discussing Elon Musk's alleged intentional change of discourse to favor Indian immigration using leftist tactics.
- A third spike was observed on December 28, 14:35:24, with a post questioning the credibility of skilled foreign labor from China.
Thread Distribution: Threads were primarily focused on topics such as immigration policies, foreign labor credibility, and potential biases in hiring practices.
Key Correlations: 
- The discussion around Elon Musk's alleged support for Indian immigration was correlated with conversations about racism and sexism in hiring practices.
- The topic of foreign labor credibility was linked to concerns about the qua