### Document

In [27]:
from langchain_core.documents import Document

In [28]:
# Document 1: HFT
documents=[Document(
        page_content="""--- SECTION 1: HIGH-FREQUENCY TRADING (HFT) ---
IDENTIFIER: HFT-REGIME-001
MARKET_STATE: Microstructure Imbalance
OBSERVATION: On 2026-01-12, a 'Liquidity Vacuum' occurred in NVDA at 14:02:01.456Z. 
The Bid-Ask spread widened from $0.02 to $0.45 within 12 milliseconds. 
PREDICTIVE_SIGNAL: Lead-lag relationships showed that NASDAQ:NVDA price movements 
were preceded by 40ms by price shifts in the NVDA February $120 Put options.
ACTION_PROTOCOL: In 'Vacuum' regimes, the Alpha-Seeker algorithm must switch from 
'Market Making' to 'Passive Rebate Capture' to avoid toxic flow.""",
        metadata={"source": "market_data", "domain": "HFT"}
    ),

    # Document 2: Algo Trading
    Document(
        page_content="""--- SECTION 2: ALGORITHMIC TRADING STRATEGIES ---
STRATEGY_ID: ALPHA-SENT-V4
LOGIC: Cross-asset sentiment correlation.
FACT: There is a statistically significant correlation (p < 0.05) between the 
'Agricultural Sentiment Index' (ASI) and the 'EUR/USD' exchange rate with a 
3-day look-forward bias. 
THRESHOLD: When ASI drops below 30 points, the algorithm triggers a 'Mean Reversion' 
trade on Wheat Futures (ZW), targeting a 1.2% profit margin with a 
Stop-Loss at 0.5%.
DIVERGENCE: If the Federal Reserve mentions 'Transitory' in FOMC minutes, 
the ASI correlation breaks, and the model enters 'Neutral' mode.""",
        metadata={"source": "strategy_manual", "domain": "AlgoTrade"}
    ),

    # Document 3: Comp Bio
    Document(
        page_content="""--- SECTION 3: COMPUTATIONAL BIOLOGY ---
GENE_ENTRY: BRCA1-VAR-22
SEQUENCE_FRAGMENT: ATGGATTTATCTGCTCTTCGCGTTGAAGAAGTACAAAATGTCATTAATGCTATGCAGAAA
PROTEIN_STRUCTURE: The BRCA1 protein contains a RING finger domain at the 
N-terminus (residues 1–109) and two BRCT domains at the C-terminus.
FUNCTIONAL_IMPACT: Mutation at residue C61G (Cysteine to Glycine) disrupts 
zinc-binding in the RING domain, leading to a loss of E3 ubiquitin ligase activity.
THERAPEUTIC_NOTE: PARP inhibitors (e.g., Olaparib) show high efficacy in 
treating tumors exhibiting this specific RING domain dysfunction.""",
        metadata={"source": "genomics_db", "domain": "Bio"}
    )
]

In [29]:
documents

[Document(metadata={'source': 'market_data', 'domain': 'HFT'}, page_content="--- SECTION 1: HIGH-FREQUENCY TRADING (HFT) ---\nIDENTIFIER: HFT-REGIME-001\nMARKET_STATE: Microstructure Imbalance\nOBSERVATION: On 2026-01-12, a 'Liquidity Vacuum' occurred in NVDA at 14:02:01.456Z. \nThe Bid-Ask spread widened from $0.02 to $0.45 within 12 milliseconds. \nPREDICTIVE_SIGNAL: Lead-lag relationships showed that NASDAQ:NVDA price movements \nwere preceded by 40ms by price shifts in the NVDA February $120 Put options.\nACTION_PROTOCOL: In 'Vacuum' regimes, the Alpha-Seeker algorithm must switch from \n'Market Making' to 'Passive Rebate Capture' to avoid toxic flow."),
 Document(metadata={'source': 'strategy_manual', 'domain': 'AlgoTrade'}, page_content="--- SECTION 2: ALGORITHMIC TRADING STRATEGIES ---\nSTRATEGY_ID: ALPHA-SENT-V4\nLOGIC: Cross-asset sentiment correlation.\nFACT: There is a statistically significant correlation (p < 0.05) between the \n'Agricultural Sentiment Index' (ASI) and the

In [30]:
from langchain_chroma import Chroma

In [31]:
chroma_db=Chroma()

In [32]:
from langchain_ollama import OllamaEmbeddings,ChatOllama

In [33]:
# Embeddings are computed by a local Ollama server; the model must already be pulled.
embedding_model = OllamaEmbeddings(model="qwen3-embedding:8b")

In [34]:
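# Embed the documents and index them in a Chroma collection.
# Note: re-running this cell can append the same documents again under new IDs.
vector_store = Chroma.from_documents(documents, embedding=embedding_model)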

In [35]:
vector_store.similarity_search("HFT")

[Document(id='a314e5b1-8047-4bf8-bea6-dfdcafd0e973', metadata={'source': 'market_data', 'domain': 'HFT'}, page_content="--- SECTION 1: HIGH-FREQUENCY TRADING (HFT) ---\nIDENTIFIER: HFT-REGIME-001\nMARKET_STATE: Microstructure Imbalance\nOBSERVATION: On 2026-01-12, a 'Liquidity Vacuum' occurred in NVDA at 14:02:01.456Z. \nThe Bid-Ask spread widened from $0.02 to $0.45 within 12 milliseconds. \nPREDICTIVE_SIGNAL: Lead-lag relationships showed that NASDAQ:NVDA price movements \nwere preceded by 40ms by price shifts in the NVDA February $120 Put options.\nACTION_PROTOCOL: In 'Vacuum' regimes, the Alpha-Seeker algorithm must switch from \n'Market Making' to 'Passive Rebate Capture' to avoid toxic flow."),
 Document(id='4f609a8d-fbad-450c-8a1d-87cbb1f1eebe', metadata={'source': 'market_data', 'domain': 'HFT'}, page_content="--- SECTION 1: HIGH-FREQUENCY TRADING (HFT) ---\nIDENTIFIER: HFT-REGIME-001\nMARKET_STATE: Microstructure Imbalance\nOBSERVATION: On 2026-01-12, a 'Liquidity Vacuum' occu
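
The two hits above carry identical content under different IDs, which suggests the documents were indexed more than once (for example by re-running the `from_documents` cell). One way to make indexing repeatable is to derive IDs from the content. The sketch below is only an illustration: `stable_ids` is a hypothetical helper, and the commented-out call assumes `Chroma.from_documents` accepts an `ids` argument and overwrites entries whose ID already exists.

```python
import hashlib


def stable_ids(texts):
    """Derive a deterministic ID per text so repeated indexing reuses the same IDs."""
    return [hashlib.sha256(t.encode("utf-8")).hexdigest()[:32] for t in texts]


# Stand-ins for each document's page_content.
sample_texts = ["section one text", "section two text"]
ids = stable_ids(sample_texts)

# Hypothetical use with the notebook's objects (assumption, not executed here):
# vector_store = Chroma.from_documents(
#     documents,
#     embedding=embedding_model,
#     ids=stable_ids([d.page_content for d in documents]),
# )
```

Because the IDs depend only on the text, indexing the same documents a second time would target the same entries rather than create new ones.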

### Async query

In [36]:
await vector_store.asimilarity_search("hft")

[Document(id='a314e5b1-8047-4bf8-bea6-dfdcafd0e973', metadata={'source': 'market_data', 'domain': 'HFT'}, page_content="--- SECTION 1: HIGH-FREQUENCY TRADING (HFT) ---\nIDENTIFIER: HFT-REGIME-001\nMARKET_STATE: Microstructure Imbalance\nOBSERVATION: On 2026-01-12, a 'Liquidity Vacuum' occurred in NVDA at 14:02:01.456Z. \nThe Bid-Ask spread widened from $0.02 to $0.45 within 12 milliseconds. \nPREDICTIVE_SIGNAL: Lead-lag relationships showed that NASDAQ:NVDA price movements \nwere preceded by 40ms by price shifts in the NVDA February $120 Put options.\nACTION_PROTOCOL: In 'Vacuum' regimes, the Alpha-Seeker algorithm must switch from \n'Market Making' to 'Passive Rebate Capture' to avoid toxic flow."),
 Document(id='4f609a8d-fbad-450c-8a1d-87cbb1f1eebe', metadata={'source': 'market_data', 'domain': 'HFT'}, page_content="--- SECTION 1: HIGH-FREQUENCY TRADING (HFT) ---\nIDENTIFIER: HFT-REGIME-001\nMARKET_STATE: Microstructure Imbalance\nOBSERVATION: On 2026-01-12, a 'Liquidity Vacuum' occu

In [37]:
await vector_store.asimilarity_search_with_score("bio")

[(Document(id='bdecbe9d-70b5-4a3e-84b5-d94b1835ff66', metadata={'domain': 'Bio', 'source': 'genomics_db'}, page_content='--- SECTION 3: COMPUTATIONAL BIOLOGY ---\nGENE_ENTRY: BRCA1-VAR-22\nSEQUENCE_FRAGMENT: ATGGATTTATCTGCTCTTCGCGTTGAAGAAGTACAAAATGTCATTAATGCTATGCAGAAA\nPROTEIN_STRUCTURE: The BRCA1 protein contains a RING finger domain at the \nN-terminus (residues 1–109) and two BRCT domains at the C-terminus.\nFUNCTIONAL_IMPACT: Mutation at residue C61G (Cysteine to Glycine) disrupts \nzinc-binding in the RING domain, leading to a loss of E3 ubiquitin ligase activity.\nTHERAPEUTIC_NOTE: PARP inhibitors (e.g., Olaparib) show high efficacy in \ntreating tumors exhibiting this specific RING domain dysfunction.'),
  1.4636110067367554),
 (Document(id='3d49ca58-828d-499e-95ff-671fc13f9aa7', metadata={'domain': 'Bio', 'source': 'genomics_db'}, page_content='--- SECTION 3: COMPUTATIONAL BIOLOGY ---\nGENE_ENTRY: BRCA1-VAR-22\nSEQUENCE_FRAGMENT: ATGGATTTATCTGCTCTTCGCGTTGAAGAAGTACAAAATGTCATTAAT
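
The number returned next to each document is a distance, so lower means closer. Assuming the collection uses Chroma's default metric (squared L2) and that the embedding vectors are unit length (an assumption about the embedding model, not something the output confirms), the distance relates to cosine similarity as `d = 2 - 2*cos`; under those assumptions a score of about 1.46 would correspond to a cosine similarity of roughly 0.27. The toy vectors below are made up purely to show the relationship.

```python
import math


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.hypot(*v)
    return [x / norm for x in v]


# Made-up vectors standing in for a query and two document embeddings.
query = normalize([1.0, 2.0, 0.5])
near = normalize([1.0, 1.8, 0.6])
far = normalize([-0.5, 0.2, 2.0])

for name, vec in [("near", near), ("far", far)]:
    d = squared_l2(query, vec)
    c = cosine_similarity(query, vec)
    print(f"{name}: squared_l2={d:.4f} cosine={c:.4f} 2-2cos={2 - 2 * c:.4f}")
```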

### Retriever

In [38]:
from langchain_core.runnables import RunnableLambda

In [40]:
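# Wrap the search function as a Runnable; bind(k=1) fixes the keyword argument so each query returns one document.
retriver = RunnableLambda(vector_store.similarity_search).bind(k=1)
# batch() runs one search per input and returns the results in input order.
retriver.batch(["hft", "bio"])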

[[Document(id='4f609a8d-fbad-450c-8a1d-87cbb1f1eebe', metadata={'domain': 'HFT', 'source': 'market_data'}, page_content="--- SECTION 1: HIGH-FREQUENCY TRADING (HFT) ---\nIDENTIFIER: HFT-REGIME-001\nMARKET_STATE: Microstructure Imbalance\nOBSERVATION: On 2026-01-12, a 'Liquidity Vacuum' occurred in NVDA at 14:02:01.456Z. \nThe Bid-Ask spread widened from $0.02 to $0.45 within 12 milliseconds. \nPREDICTIVE_SIGNAL: Lead-lag relationships showed that NASDAQ:NVDA price movements \nwere preceded by 40ms by price shifts in the NVDA February $120 Put options.\nACTION_PROTOCOL: In 'Vacuum' regimes, the Alpha-Seeker algorithm must switch from \n'Market Making' to 'Passive Rebate Capture' to avoid toxic flow.")],
 [Document(id='3d49ca58-828d-499e-95ff-671fc13f9aa7', metadata={'domain': 'Bio', 'source': 'genomics_db'}, page_content='--- SECTION 3: COMPUTATIONAL BIOLOGY ---\nGENE_ENTRY: BRCA1-VAR-22\nSEQUENCE_FRAGMENT: ATGGATTTATCTGCTCTTCGCGTTGAAGAAGTACAAAATGTCATTAATGCTATGCAGAAA\nPROTEIN_STRUCTURE:
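
As a rough mental model of the cell above, `.bind(k=1)` behaves like pre-filling a keyword argument and `.batch(...)` behaves like mapping the function over the inputs (by default in a thread pool) while preserving input order. The sketch below mimics that with the standard library only; `toy_search`, `CORPUS`, and `batch` are hypothetical stand-ins, not LangChain APIs.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import partial

# Hypothetical miniature corpus standing in for the indexed documents.
CORPUS = [
    "HFT: liquidity vacuum in NVDA",
    "AlgoTrade: ASI and EUR/USD correlation",
    "Bio: BRCA1 RING finger domain",
]


def toy_search(query, k=4):
    """Rank texts that contain the query (case-insensitive) first, then keep the top k."""
    ranked = sorted(CORPUS, key=lambda text: query.lower() in text.lower(), reverse=True)
    return ranked[:k]


# Analogous to .bind(k=1): the keyword argument is fixed ahead of time.
search_top1 = partial(toy_search, k=1)


def batch(func, inputs):
    """Analogous to .batch(): run func on every input concurrently, keeping input order."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(func, inputs))


results = batch(search_top1, ["hft", "bio"])
print(results)
```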

### Using the vector store's `as_retriever` method

In [41]:
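# Build a retriever from the vector store itself; k=1 returns only the top match.
# The result must be assigned, otherwise the new retriever is discarded.
retriever = vector_store.as_retriever(
    search_type="similarity",
    search_kwargs={"k": 1},
)
retriever.batch(["HFT"])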

[[Document(id='4f609a8d-fbad-450c-8a1d-87cbb1f1eebe', metadata={'domain': 'HFT', 'source': 'market_data'}, page_content="--- SECTION 1: HIGH-FREQUENCY TRADING (HFT) ---\nIDENTIFIER: HFT-REGIME-001\nMARKET_STATE: Microstructure Imbalance\nOBSERVATION: On 2026-01-12, a 'Liquidity Vacuum' occurred in NVDA at 14:02:01.456Z. \nThe Bid-Ask spread widened from $0.02 to $0.45 within 12 milliseconds. \nPREDICTIVE_SIGNAL: Lead-lag relationships showed that NASDAQ:NVDA price movements \nwere preceded by 40ms by price shifts in the NVDA February $120 Put options.\nACTION_PROTOCOL: In 'Vacuum' regimes, the Alpha-Seeker algorithm must switch from \n'Market Making' to 'Passive Rebate Capture' to avoid toxic flow.")]]

In [52]:
from langchain_ollama import ChatOllama
llm=ChatOllama(model="gemma3:4b")

In [53]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
message="""
Answer the question using the provided context only
{question}
context:
{context}
"""

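# Each message is a (role, template) tuple; a bare list of strings would create two separate human messages.
prompt = ChatPromptTemplate.from_messages([("human", message)])
# With no search_kwargs the retriever returns the default number of matches (4).
retriever = vector_store.as_retriever()
# The retriever fills {context}; the raw question passes through to {question}.
rag_chain = {"context": retriever, "question": RunnablePassthrough()} | prompt | llm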

In [54]:
result=rag_chain.invoke("tell me about hft")
print(result.content)

Based on the provided documents, High-Frequency Trading (HFT) involves observing “Liquidity Vacuums” like the one in NVDA on 2026-01-12, where the Bid-Ask spread widens significantly. In these situations, the “Alpha-Seeker” algorithm must switch from “Market Making” to “Passive Rebate Capture” to avoid “toxic flow.” Additionally, there’s an algorithmic trading strategy (ALPHA-SENT-V4) that uses cross-asset sentiment correlation, specifically between the ‘Agricultural Sentiment Index’ (ASI) and the ‘EUR/USD’ exchange rate. When the ASI drops below 30 points, a ‘Mean Reversion’ trade is triggered on Wheat Futures (ZW).
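
The answer above also mentions ALPHA-SENT-V4, most likely because the retriever returns several documents by default and the whole list of `Document` objects is inserted into the prompt as-is. A common pattern is to join only the `page_content` of the retrieved documents before templating. The sketch below is a minimal illustration: `Doc` is a hypothetical stand-in for `Document` so the snippet runs without LangChain, and `format_docs` is an illustrative helper name.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for langchain_core.documents.Document."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_docs(docs):
    """Join the text of the retrieved documents, dropping IDs and metadata."""
    return "\n\n".join(doc.page_content for doc in docs)


docs = [
    Doc("first passage", {"domain": "HFT"}),
    Doc("second passage", {"domain": "Bio"}),
]
context = format_docs(docs)
print(context)

# Hypothetical wiring into the chain above (assumption, not executed here):
# retriever = vector_store.as_retriever(search_kwargs={"k": 1})
# rag_chain = (
#     {"context": retriever | format_docs, "question": RunnablePassthrough()}
#     | prompt
#     | llm
# )
```

Lowering `k` narrows the context to the closest match, while `format_docs` keeps the prompt free of object representations.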
