# 1. Why LLM output is non-deterministic

## Hardware-based stochasticity
Floating-point addition is not associative: every time we add the same floating-point numbers in a different order, rounding of the intermediate sums can give a different result.

In [None]:
import random

vals = [1e-10, 1e-5, 1e-2, 1]
vals = vals + [-v for v in vals]

results = []
random.seed(42)
for _ in range(10000):
    random.shuffle(vals)
    results.append(sum(vals))

results = sorted(set(results))
print(f"There are {len(results)} unique results: {results}")

# Output:
# There are 102 unique results: [-8.326672684688674e-17, -7.45931094670027e-17, ..., 8.326672684688674e-17]


## Sampling and logits

### inference-time logits

`My favorite color is ...`

<img src="https://huyenchip.com/assets/pics/sampling/2-logits.png" style="width:40%;">


### sampling

- Temperature
- TopK
- TopP

<img src="https://huyenchip.com/assets/pics/sampling/4-logprobs.png" style="width:75%;">
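The three sampling controls listed above can be sketched in plain Python. This is a minimal illustration, not how any particular inference engine implements them; the vocabulary, the logit values, and the helper names (`softmax`, `top_k_filter`, `top_p_filter`, `sample`) are all assumptions made up for the example.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Convert logits to probabilities; a lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_filter(probs, k):
    """Keep the k most probable tokens and renormalize."""
    keep = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    total = sum(probs[i] for i in keep)
    return {i: probs[i] / total for i in keep}

def top_p_filter(probs, p):
    """Keep the smallest set of tokens whose cumulative probability reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cumulative = [], 0.0
    for i in order:
        keep.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    total = sum(probs[i] for i in keep)
    return {i: probs[i] / total for i in keep}

def sample(dist, rng):
    """Draw one token index from a {index: probability} mapping."""
    indices = list(dist)
    return rng.choices(indices, weights=[dist[i] for i in indices], k=1)[0]

# Hypothetical vocabulary and logits for the prompt "My favorite color is ..."
vocab = ["green", "red", "blue", "the", "a"]
logits = [2.0, 1.5, 1.0, -1.0, -2.0]

rng = random.Random(0)  # fixed seed so the draw is repeatable
probs = softmax(logits, temperature=0.7)
top2 = top_k_filter(probs, k=2)        # only the two most likely tokens survive
nucleus = top_p_filter(probs, p=0.9)   # smallest set covering 90% of the mass
token = vocab[sample(nucleus, rng)]
```

Even with a fixed seed and identical logits, the token is still a random draw; determinism at this level only holds if the logits themselves are bit-identical between runs.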

In [None]:
# print inference logit table


## Real cause of nondeterminism: batch size variation
- [Defeating Nondeterminism in LLM Inference](https://thinkingmachines.ai/blog/defeating-nondeterminism-in-llm-inference/)
    - batch size variation

In [None]:
import torch
torch.set_default_device('mps')

B = 2048
D = 4096
a = torch.linspace(-1000, 1000, B*D).reshape(B, D)
b = torch.linspace(-1000, 1000, D*D).reshape(D, D)
# Doing a matrix vector multiplication by taking
# the first element of the batch
out1 = torch.mm(a[:1], b)
# Doing a matrix matrix multiplication and then taking
# the first element of the batch
out2 = torch.mm(a, b)[:1]
print((out1 - out2).abs().max()) # tensor(1669.2500, device='mps')


In [None]:
a.shape

In [None]:
b.shape

In [None]:
out1.shape

In [None]:
out2.shape

# 2. Bringing order to randomness

## prompt and pray

## structured output

### trust the API

In [None]:
# function calling

In [None]:
# json output

### trust the math: constrained output

<img src="https://huyenchip.com/assets/pics/sampling/7-guidance.png" style="width:60%;">

#### structured outputs

<img src="https://huyenchip.com/assets/pics/sampling/8-finetuning-classifier.png" style="width:60%;">

#### Constraint sampling

<img src="https://huyenchip.com/assets/pics/sampling/9-constrained-sampling.png" style="width:60%;">
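A minimal sketch of the idea, assuming a toy vocabulary: before sampling, the logits of every token outside the allowed set are masked to negative infinity, so those tokens get probability zero. The function name `constrained_sample`, the vocabulary, and the logits are hypothetical; real libraries derive the allowed set at each step from a grammar or schema rather than a fixed set.

```python
import math
import random

def constrained_sample(logits, vocab, allowed, rng):
    """Sample one token, but only from the `allowed` subset of the vocabulary."""
    # Mask disallowed tokens by setting their logit to -inf (probability 0).
    masked = [x if tok in allowed else -math.inf for tok, x in zip(vocab, logits)]
    m = max(masked)
    if m == -math.inf:
        raise ValueError("no allowed token is present in the vocabulary")
    # Unnormalized softmax weights; exp(-inf) evaluates to 0.0.
    weights = [math.exp(x - m) for x in masked]
    return rng.choices(vocab, weights=weights, k=1)[0]

# Hypothetical example: the model prefers "banana", but the task is yes/no.
vocab = ["yes", "no", "maybe", "banana"]
logits = [1.0, 0.5, 2.0, 3.0]
allowed = {"yes", "no"}

rng = random.Random(0)
samples = [constrained_sample(logits, vocab, allowed, rng) for _ in range(1000)]
```

The output is still random between "yes" and "no", but it can no longer leave the permitted set, whatever the unconstrained distribution looks like.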


In [None]:
# restrict tokens: outlines 



# 3. Grounding inference

## bloating context
Adding raw source information to the inference context:
- context sizes
- ignoring middle of the context
- garbage in garbage out

## semantic retrieval: RAG

when it works:

when it does not work:

## tree-based retrieval (DAG): Traversing the Abstract Syntax Tree
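A minimal sketch of tree-based retrieval over code, assuming Python source and the standard-library `ast` module. It walks the syntax tree and lists each class and function with its nesting depth and line number, the kind of outline a retriever could hand to the model instead of raw file text. The `source` snippet and the `outline` helper are made up for illustration.

```python
import ast

# Hypothetical source file to index.
source = '''
def load(path):
    return open(path).read()

class Index:
    def search(self, query):
        return [query]
'''

tree = ast.parse(source)

def outline(node, depth=0):
    """Yield (depth, kind, name, line) for each function or class definition."""
    for child in ast.iter_child_nodes(node):
        if isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            yield depth, type(child).__name__, child.name, child.lineno
            # Definitions nested inside this one sit one level deeper.
            yield from outline(child, depth + 1)
        else:
            # Other nodes (statements, expressions) do not add a level.
            yield from outline(child, depth)

symbols = list(outline(tree))
for depth, kind, name, line in symbols:
    print("  " * depth + f"{kind} {name} (line {line})")
```

Because the traversal follows the structure of the code rather than text similarity, the same input always yields the same outline.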

In [None]:
# AST example


In [None]:
# why not repo map example

## graph-based retrieval (including cyclic and undirected graphs): GraphRAG
- [From Local to Global: A Graph RAG Approach to Query-Focused Summarization](https://arxiv.org/abs/2404.16130)
- [github graphrag](https://github.com/microsoft/graphrag)
- <img src="assets/graphrag_image1.png" style="width:60%;">

# 4. Knowledge Graph

## Creating Basic Knowledge Graph (Neo4j)

### Core Components

1. **Nodes (entities):** Represent data objects with labels and properties[1][2]
2. **Relationships:** Connect nodes with directional connections and properties[2]
3. **Properties:** Key-value pairs attached to nodes or relationships[2]

### Data Model Structure

**Nodes:**
- Securities (stocks, bonds, derivatives)
- Companies/Issuers
- Exchanges
- Economic indicators
- News articles
- Analysts[3][4]

**Relationships:**
- Company -[:TRADES_ON]-> Exchange
- Security -[:ISSUED_BY]-> Company
- Company -[:OWNS]-> Company (ownership stakes)
- Security -[:CORRELATES_WITH]-> Security
- News -[:MENTIONS]-> Company[3]

### Transformation Steps

1. **Extract identifiers:** Convert tickers, ISINs, CUSIPs to Security nodes with standardized properties
2. **Map hierarchies:** Model corporate structures, sector classifications, and geographic relationships as graph relationships
3. **Connect market data:** Link securities to exchanges, pricing data, and trading volumes with temporal properties
4. **Integrate metadata:** Transform metadata fields into node properties and create relationships for derived data (correlations, risk factors)
5. **Add lineage:** Track data provenance and transformations for compliance
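The first steps above can be sketched as a small function that turns one flat record into a parameterized Cypher statement. This is an illustration under assumptions: the record fields, the property names (`isin`, `code`), and the placeholder identifier values are hypothetical, and the statement is only built here, not sent to a database.

```python
# Hypothetical required fields of an incoming security record.
REQUIRED = ("isin", "ticker", "name", "issuer", "exchange")

# Parameterized statement using the node labels and relationship
# types from the data model above. MERGE keeps the load idempotent.
QUERY = (
    "MERGE (s:Security {isin: $isin}) "
    "SET s.ticker = $ticker, s.name = $name "
    "MERGE (c:Company {name: $issuer}) "
    "MERGE (e:Exchange {code: $exchange}) "
    "MERGE (s)-[:ISSUED_BY]->(c) "
    "MERGE (c)-[:TRADES_ON]->(e)"
)

def security_to_cypher(record):
    """Return (query, params) for one security record, or raise on missing fields."""
    missing = [key for key in REQUIRED if not record.get(key)]
    if missing:
        raise ValueError(f"record is missing fields: {missing}")
    # Standardize properties: trim whitespace, upper-case the ticker.
    params = {key: str(record[key]).strip() for key in REQUIRED}
    params["ticker"] = params["ticker"].upper()
    return QUERY, params

# Placeholder values for illustration only.
record = {
    "isin": "XX0000000001",
    "ticker": " nvda ",
    "name": "NVIDIA Corporation",
    "issuer": "NVIDIA Corporation",
    "exchange": "EXCH1",
}
query, params = security_to_cypher(record)
```

Passing values as parameters instead of formatting them into the query string avoids quoting problems and lets the database reuse the query plan.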

## Domain Graph

<img src="https://graphrag.com/_astro/domain-graph.DOzUcy6K_YkdvU.svg" style="width:90%;">

## Using LLM to Update Knowledge Graph from Unstructured Text

### Extraction Pipeline

1. **Chunk and embed text:** Split articles/papers into smaller text chunks (500-1000 tokens), generate an embedding for each chunk, and store the results as Document and Chunk nodes[1][2]

2. **Entity extraction with schema:** Use an LLM with a prompt that specifies the allowed node types from the Bloomberg model (Company, Security, Exchange, Economic_Indicator, Pattern), and extract entities with their properties[3][2][4]

3. **Relationship extraction:** Prompt the LLM to identify relationships between the extracted entities and return them as structured JSON[1][4]

4. **Entity disambiguation:** Use the LLM to merge duplicate entities (e.g., "NVIDIA" vs "Nvidia Corp" vs "NVDA") by matching on identifiers and properties[1]
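For the disambiguation step, a deterministic pre-pass can resolve the easy cases before any LLM is involved. The sketch below swaps in a plain lookup table for the LLM: the `ALIASES` table, the `normalize` rules, and the sample mentions are assumptions for illustration, and anything the table cannot resolve is returned separately so it can still go to the LLM.

```python
import re

# Hypothetical alias table: normalized surface form -> canonical ticker.
ALIASES = {
    "nvidia": "NVDA",
    "nvidia corp": "NVDA",
    "nvidia corporation": "NVDA",
    "nvda": "NVDA",
    "microsoft": "MSFT",
    "msft": "MSFT",
}

def normalize(name):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", name.lower())
    return " ".join(cleaned.split())

def merge_entities(mentions):
    """Group raw mentions by canonical ticker; unknown names are kept separately."""
    merged, unresolved = {}, []
    for mention in mentions:
        ticker = ALIASES.get(normalize(mention))
        if ticker is None:
            unresolved.append(mention)
        else:
            merged.setdefault(ticker, []).append(mention)
    return merged, unresolved

merged, unresolved = merge_entities(
    ["NVIDIA", "Nvidia Corp", "NVDA", "Microsoft", "Acme Inc."]
)
```

Here the three NVIDIA spellings collapse onto one key, and only "Acme Inc." is left for a more expensive resolution step.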

### LLM Prompt Example

**For tech company article extraction:**
```
Extract entities and relationships from this text about technology companies.

Entity types: Company, Security, Product, Pattern, Market_Trend
Relationship types: PRODUCES, COMPETES_WITH, EXHIBITS_PATTERN, CORRELATES_WITH

Text: "NVIDIA's GPU shipments showed 40% growth correlation with Microsoft Azure 
expansion. The pattern suggests cloud infrastructure demand drives semiconductor sales."

Output as JSON with: entity_type, name, properties, relationships
```
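Since the model's reply is not guaranteed to follow the prompt, it helps to validate it against the allowed types before writing to the graph. A minimal sketch, assuming the reply is a JSON list of objects with the four requested keys and that each relationship has `type` and `target` fields; that exact shape, the function name, and the sample reply are assumptions, not something the prompt enforces.

```python
import json

ALLOWED_ENTITY_TYPES = {"Company", "Security", "Product", "Pattern", "Market_Trend"}
ALLOWED_RELATIONSHIP_TYPES = {
    "PRODUCES", "COMPETES_WITH", "EXHIBITS_PATTERN", "CORRELATES_WITH",
}

def validate_extraction(raw):
    """Parse model output and keep only entities and relationships that match the schema."""
    data = json.loads(raw)  # raises json.JSONDecodeError on malformed output
    entities = []
    for item in data:
        if item.get("entity_type") not in ALLOWED_ENTITY_TYPES or not item.get("name"):
            continue  # drop entities with an unknown type or no name
        relationships = [
            rel for rel in item.get("relationships", [])
            if rel.get("type") in ALLOWED_RELATIONSHIP_TYPES and rel.get("target")
        ]
        entities.append({
            "entity_type": item["entity_type"],
            "name": item["name"],
            "properties": item.get("properties", {}),
            "relationships": relationships,
        })
    return entities

# Hypothetical model reply with one invalid relationship and one invalid entity.
raw_reply = """[
  {"entity_type": "Company", "name": "NVIDIA", "properties": {},
   "relationships": [
     {"type": "EXHIBITS_PATTERN", "target": "gpu_cloud_correlation"},
     {"type": "ACQUIRED", "target": "Microsoft"}
   ]},
  {"entity_type": "Person", "name": "Someone", "properties": {}, "relationships": []}
]"""

entities = validate_extraction(raw_reply)
```

Rejected items could also be logged and fed back to the model for a retry instead of being silently dropped.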

### Updating Existing Graph (Cypher Examples)

**Merge extracted companies:**
```cypher
MERGE (c:Company {ticker: 'NVDA'})
ON CREATE SET c.name = 'NVIDIA Corporation', c.sector = 'Technology'
ON MATCH SET c.lastUpdated = datetime()
```

**Add observed patterns:**
```cypher
// Match existing companies first: Cypher requires WITH between CREATE and a following MATCH
MATCH (nvidia:Company {ticker: 'NVDA'})
MATCH (msft:Company {ticker: 'MSFT'})
CREATE (p:Pattern {
  id: 'gpu_cloud_correlation',
  description: '40% growth correlation',
  confidence: 0.85,
  source: 'academic_paper_123'
})
CREATE (nvidia)-[:EXHIBITS_PATTERN]->(p)
CREATE (p)-[:CORRELATES_WITH]->(msft)
```

**Link to source documents:**
```cypher
MATCH (doc:Document {id: 'article_456'})
MATCH (p:Pattern {id: 'gpu_cloud_correlation'})
CREATE (p)-[:EXTRACTED_FROM {extractedAt: datetime()}]->(doc)
```

### Bloomberg Model Extensions

**New node types for unstructured content:**
- Pattern (observed trends/behaviors)
- Research_Finding
- Market_Signal
- Competitive_Dynamic[5][1]

**New relationships:**
- Company -[:EXHIBITS_PATTERN]-> Pattern
- Pattern -[:CORRELATES_WITH]-> Company
- Company -[:COMPETES_WITH {intensity: float}]-> Company
- Security -[:SHOWS_SIGNAL]-> Market_Signal
- Pattern -[:EXTRACTED_FROM]-> Document


## Lexical Graph
<img src="https://graphrag.com/_astro/knowledge-graph-lexical-graph-extracted-entities.BsKeTZFb_ZxxPUk.svg" style="width:50%;">

# 5. LLM + KG


## basic pattern:

<img src="assets/graphrag-diagram.gzIXlJ0V_Z1168rD.svg" style="width:100%;">

### Retrieval patterns

- Basic Retriever
- Cypher Templates
- Dynamic Cypher Generation
- Global Community Summary Retriever
- Graph-Enhanced Vector Search
- Hypothetical Question Retriever
- Local Retriever
- Metadata Filtering
- Parent-Child Retriever
- Pattern Matching
- Text2Cypher
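As one illustration from the list above, the Cypher Templates pattern can be sketched as a lookup of pre-written, parameterized queries: the LLM (or a classifier) only chooses the intent and fills in the parameters, and never writes Cypher itself. The intent names, the `TEMPLATES` table, and `build_query` are hypothetical; the labels and relationship types follow the data model described in section 4.

```python
import re

# Hypothetical library of vetted queries, keyed by question intent.
TEMPLATES = {
    "company_patterns": (
        "MATCH (c:Company {ticker: $ticker})-[:EXHIBITS_PATTERN]->(p:Pattern) "
        "RETURN p.id AS id, p.description AS description"
    ),
    "competitors": (
        "MATCH (c:Company {ticker: $ticker})-[:COMPETES_WITH]-(other:Company) "
        "RETURN other.ticker AS ticker"
    ),
}

def build_query(intent, **params):
    """Return (query, params) for a known intent, checking the parameter names."""
    if intent not in TEMPLATES:
        raise KeyError(f"unknown intent: {intent}")
    query = TEMPLATES[intent]
    expected = set(re.findall(r"\$(\w+)", query))  # $name placeholders in the template
    if expected != set(params):
        raise ValueError(f"expected parameters {sorted(expected)}, got {sorted(params)}")
    return query, params

query, params = build_query("company_patterns", ticker="NVDA")
```

Compared with Text2Cypher or dynamic generation, the set of possible queries is fixed in advance, which trades coverage for predictable and reviewable behavior.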