<a href="https://colab.research.google.com/github/solo938/ZKWhisper/blob/main/ZKWhisper_Bot.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
!pip install streamlit langchain langchain-community langchain-openai chromadb openai tiktoken pyngrok pysqlite3-binary -q


In [None]:
knowledge_base_content = """
# Zero-Knowledge Proofs, Noir, and Tornado Cash Knowledge Base

## 1. Zero Knowledge Proofs (ZKPs) Fundamentals
**Definition:** A cryptographic method allowing a "Prover" to convince a "Verifier" that a statement is true without revealing the secret information (witness) underlying the statement.

**Core Properties:**
1. **Completeness:** If the statement is true and the prover is honest, the verifier will be convinced.
2. **Soundness:** If the statement is false, a dishonest prover cannot convince the verifier (except with negligible probability).
3. **Zero-Knowledge:** The verifier learns nothing other than the fact that the statement is true.

## 2. The Noir Programming Language
**Overview:**
Noir is a Domain Specific Language (DSL) for generating zero-knowledge proofs. It is backend-agnostic and compiles to an intermediate representation called **ACIR**.

**Key Features:**
* **Rust-like Syntax:** Uses `fn`, `let`, `struct`.
* **Tooling (Nargo):**
    * `nargo new`: Create project.
    * `nargo check`: Compile/check constraints.
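    * `nargo prove`: Generate witness and proof (older Nargo releases; newer ones use `nargo execute` with a proving backend such as Barretenberg's `bb`).
    * `nargo verify`: Verify the proof (older Nargo releases; verification has likewise moved to the backend).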

## 3. Tornado Cash Architecture
**Purpose:** An Ethereum privacy solution (mixer) that breaks the on-chain link between recipient and sender addresses.

**Core Mechanism:**
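1. **Deposit:** The user generates two random secrets, a nullifier `k` and randomness `r`, computes the commitment `C = Hash(k, r)`, and submits `C` with the deposit. The contract adds `C` as a leaf of a Merkle tree.
2. **Withdraw:** The user generates a ZK-SNARK proof of knowledge of `k` and `r` such that `C = Hash(k, r)` is a leaf of the tree under a known Merkle root, without revealing which leaf. The proof also exposes the **Nullifier Hash** (`Hash(k)`), which the contract records to prevent double-spending.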

## 4. Security & Auditing
**Common Vulnerabilities:**
* **Under-constrained Circuits:** Failing to constrain inputs sufficiently allows malicious provers to forge proofs.
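* **Input Aliasing:** Distinct integer inputs that are congruent modulo the field prime map to the same field element, so a verifier that does not range-check public inputs may accept more than one encoding of the same value.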
"""

with open("knowledge_base.md", "w") as f:
    f.write(knowledge_base_content)

print("✅ Knowledge base created successfully!")

✅ Knowledge base created successfully!
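
The deposit and withdraw flow described in section 3 of the knowledge base can be sketched in plain Python. This is a toy model under loud assumptions: SHA-256 stands in for the circuit-friendly hash a real mixer would use, a Python set replaces the Merkle tree, and the withdrawer reveals `k` and `r` directly where the real protocol would supply a ZK-SNARK proof instead. The names `ToyMixer` and `h` are hypothetical. The sketch only illustrates how the commitment and the nullifier hash fit together.

```python
import hashlib
import secrets


def h(*parts: bytes) -> bytes:
    """Hash stand-in (assumption: SHA-256 instead of a circuit-friendly hash)."""
    digest = hashlib.sha256()
    for part in parts:
        # Length-prefix each part so (a, bc) and (ab, c) hash differently.
        digest.update(len(part).to_bytes(4, "big"))
        digest.update(part)
    return digest.digest()


class ToyMixer:
    """Toy bookkeeping for commitments and spent nullifier hashes."""

    def __init__(self):
        self.commitments = set()       # stands in for the Merkle tree
        self.spent_nullifiers = set()  # nullifier hashes already used

    def deposit(self, commitment: bytes) -> None:
        self.commitments.add(commitment)

    def withdraw(self, k: bytes, r: bytes) -> bool:
        # A real withdrawal proves these checks in zero knowledge;
        # here k and r are revealed, so this sketch provides no privacy.
        nullifier_hash = h(k)
        if h(k, r) not in self.commitments:
            return False  # unknown commitment
        if nullifier_hash in self.spent_nullifiers:
            return False  # double-spend attempt
        self.spent_nullifiers.add(nullifier_hash)
        return True


mixer = ToyMixer()
k, r = secrets.token_bytes(31), secrets.token_bytes(31)
mixer.deposit(h(k, r))          # C = Hash(k, r)
first = mixer.withdraw(k, r)    # accepted once
second = mixer.withdraw(k, r)   # rejected: nullifier hash already recorded
print(first, second)
```

The second withdrawal fails even though the commitment is still in the set, which is the role the knowledge base assigns to the nullifier hash.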


In [None]:
%%writefile app.py
# --- CRITICAL FIX FOR COLAB (Must be at top) ---
import pysqlite3
import sys
sys.modules['sqlite3'] = sys.modules.pop('pysqlite3')
# -----------------------------------------------

import streamlit as st
import os
from langchain_community.document_loaders import TextLoader
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_text_splitters import RecursiveCharacterTextSplitter

# LCEL imports
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.messages import HumanMessage, AIMessage

# Page Configuration
st.set_page_config(page_title="ZK RAG Bot", layout="wide")
st.title("ZK Contextual RAG Bot")

# Sidebar - API Key Input
with st.sidebar:
    st.header("Configuration")
    api_key = st.text_input("Enter OpenAI API Key", type="password")

    if not api_key:
        st.warning("⚠️ Please enter your OpenAI API Key to continue.")
        st.stop()

    os.environ["OPENAI_API_KEY"] = api_key
    st.success("API Key loaded")

# Initialize RAG Chain (cached for efficiency)
@st.cache_resource
def setup_rag():
    """
    Initialize the RAG chain with:
    1. Document loading
    2. Text splitting
    3. Vector embeddings
    4. ChromaDB storage
    5. LLM + LCEL chain setup
    """
    try:
        with st.spinner("🔄 Loading documents..."):
            loader = TextLoader("knowledge_base.md")
            docs = loader.load()
            st.sidebar.info(f"📄 Loaded {len(docs)} document(s)")

        with st.spinner("✂️ Splitting documents..."):
            splitter = RecursiveCharacterTextSplitter(
                chunk_size=1000,
                chunk_overlap=200
            )
            splits = splitter.split_documents(docs)
            st.sidebar.info(f"Created {len(splits)} chunks")

        with st.spinner("Creating embeddings..."):
            embeddings = OpenAIEmbeddings()
            vectorstore = Chroma.from_documents(
                splits,
                embeddings,
                persist_directory="./chroma_db"
            )
            # Chroma 0.4+ writes to persist_directory automatically, so no explicit persist() call is needed.

        with st.spinner("Setting up LLM and Chains..."):
            llm = ChatOpenAI(model_name="gpt-4", temperature=0)
            retriever = vectorstore.as_retriever(search_kwargs={"k": 3})

            qa_system_prompt = """You are a helpful assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Keep the answer concise and relevant.

Context: {context}"""

            qa_prompt = ChatPromptTemplate.from_messages([
                ("system", qa_system_prompt),
                MessagesPlaceholder(variable_name="chat_history"),
                ("human", "{input}"),
            ])

            # Create a simple chain using LCEL
            from langchain_core.runnables import RunnablePassthrough

            def format_docs(docs):
                return "\n\n".join(doc.page_content for doc in docs)

            rag_chain = (
                RunnablePassthrough.assign(
                    context=lambda x: format_docs(retriever.invoke(x["input"]))
                )
                | qa_prompt
                | llm
            )

        st.sidebar.success("RAG setup complete!")
        return {"chain": rag_chain, "retriever": retriever}

    except Exception as e:
        st.error(f"Setup Error: {e}")
        st.stop()

# Initialize session state
if "conversation" not in st.session_state:
    st.session_state.conversation = setup_rag()

if "messages" not in st.session_state:
    st.session_state.messages = []

# Display chat history
for msg in st.session_state.messages:
    with st.chat_message(msg["role"]):
        st.write(msg["content"])

# Chat input and response
if prompt := st.chat_input("Ask about Noir, Tornado Cash, or Zero-Knowledge Proofs..."):
    st.session_state.messages.append({"role": "user", "content": prompt})

    with st.chat_message("user"):
        st.write(prompt)

    formatted_chat_history = []
    for msg in st.session_state.messages[:-1]:
        if msg["role"] == "user":
            formatted_chat_history.append(HumanMessage(content=msg["content"]))
        elif msg["role"] == "assistant":
            formatted_chat_history.append(AIMessage(content=msg["content"]))

    if st.session_state.conversation:
        with st.spinner("Thinking..."):
            try:
                chain = st.session_state.conversation["chain"]
                result = chain.invoke({
                    "input": prompt,
                    "chat_history": formatted_chat_history
                })
                answer = result.content if hasattr(result, 'content') else str(result)

                st.session_state.messages.append({"role": "assistant", "content": answer})

                with st.chat_message("assistant"):
                    st.write(answer)

            except Exception as e:
                st.error(f"Error generating response: {e}")


Overwriting app.py
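
To build intuition for the `chunk_size=1000` and `chunk_overlap=200` settings passed to the splitter in `app.py`, here is a minimal sliding-window sketch. It is not how `RecursiveCharacterTextSplitter` works internally (that class prefers to break on separators such as paragraph and line boundaries); the hypothetical `sliding_chunks` helper only shows how size and overlap interact.

```python
def sliding_chunks(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> list:
    """Cut text into fixed-size windows that share chunk_overlap characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


# A 2500-character sample, purely for illustration.
sample = "".join(str(i % 10) for i in range(2500))
chunks = sliding_chunks(sample)
print([len(c) for c in chunks])  # [1000, 1000, 900]
```

Each window starts 800 characters after the previous one, so neighbouring chunks share 200 characters. The overlap is what lets a sentence that straddles a boundary appear whole in at least one chunk.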


In [None]:
from pyngrok import ngrok

ngrok.set_auth_token("NGROK AUTH TOKEN")

ngrok.kill()

public_url = ngrok.connect(8501)
print(f"🚀 View your RAG Bot here: {public_url}")
print("📝 Logs will appear below...\n")

!streamlit run app.py &> logs.txt &


🚀 View your RAG Bot here: NgrokTunnel: "https://unmulcted-oneirocritical-lawerence.ngrok-free.dev" -> "http://localhost:8501"
📝 Logs will appear below...
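
Because the tunnel is opened before `streamlit run` has finished starting, the public URL can briefly return an error. One way to guard against that is to poll the local port before sharing the link. The `wait_for_port` helper below is a hypothetical addition that uses only the standard library, and its default timeout is an arbitrary choice. The demo targets a throwaway local listener so the sketch runs anywhere.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)  # nothing is accepting connections yet; retry
    return False


# Demo: listen on a free local port and confirm the helper detects it.
server = socket.socket()
server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
server.listen(1)
port = server.getsockname()[1]
reachable = wait_for_port("127.0.0.1", port, timeout=2.0)
server.close()
print(reachable)
```

In this notebook the equivalent call would be `wait_for_port("localhost", 8501)` after the `streamlit run` line; if it returns `False`, `logs.txt` is the place to look.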

