1. User Query Input
2. Retrieve Relevant Document Chunks
3. Load Conversation History
4. Format Prompt
5. Call LLM (Together LLaMA-3)
6. Store Response + Continue Chat

In [1]:
%cd ..

/home/vivek/Projects-and-programming/Python/NotebookLM-Clone


# Step 1: Inputs

| Component	| Input | Output |
|-----------|-------|--------|
|`user_query` |	string (e.g., "What is the main point?") |	user query (text) |
|`document_id` | string or UUID to identify the document | retrieved chunks |
|`session_id`	| string (unique ID for chat memory)	| memory restored or new |

In [2]:
from backend.vectordb.chroma_db import ChromaDBManager

chroma_manager = ChromaDBManager()

  from .autonotebook import tqdm as notebook_tqdm


In [3]:
query = "Give me a summary of this article!"
document_id = "a63f24d4"
session_id = "user_123"

# Step 2: Retrieval

| Component |	Input |	Output |
|-----------|---------|--------|
|ChromaDBManager |	`document_id`, `query` |	relevant `context` chunks |


In [None]:
retrieved_docs, retrieved_metadata = chroma_manager.hybrid_query(query, document_id=document_id, top_k=5)

retrieved_docs, retrieved_metadata

In [5]:
context = "\n".join([
        f"{doc}\n[Keywords: {', '.join(metadata.get('keywords', []))}]"  
        for doc, metadata in zip(retrieved_docs, retrieved_metadata or [{}])  
    ])
context

'that you have cloned under the schemes folderClick on Color Presets and choose a color scheme8. FINAL OUTPUT!You came this far! You’ve done it, congratulations!With the right tools that we now have, we can be much more productive compared to before. Customizing our terminal setup really requires some effort but the output that we’re aiming is all worth it.Figure 7. Final terminal UI with neofetch9) Resources/Further ReadingsAlexandre Zajac’s “Terminal 101 — The Setup Guide that will Boost your Workflow”Awesome Zsh PluginsCode Slicer’s “Your terminal can be much, MUCH more productive”Have VCS icons been replaced by simple symbols?Oh My Zsh READMEPowerlevel10k READMETerminalZshOh My ZshPowerlevel10kUnix----1FollowWritten by kjdeluna11 Followers·2 FollowingFollowResponses (1)See all responsesHelpStatusAboutCareersPressBlogPrivacyRulesTermsText to speech\n[Keywords: p, r, o, d, u, c, t, i, v, e, ,,  , s, e, t, u, p, ,,  , f, i, n, a, l, ,,  , t, e, r, m, i, n, a, l, ,,  , o, u, t, p, u, t

# Step 3: Memory

| Component |	Input |	Output |
|-----------|---------|--------|
| RedisChatMessageHistory |	`session_id` |	`message history` object |

In [6]:
from langchain_community.chat_message_histories import RedisChatMessageHistory
from langchain.memory import ConversationBufferMemory
from dotenv import load_dotenv
import os

load_dotenv()

redis_url = os.getenv("REDIS_URL")

def get_message_history(session_id: str) -> RedisChatMessageHistory:
    return RedisChatMessageHistory(session_id=session_id, url=redis_url)

# Step 4: Prompt Creation

| Component |	Input |	Output |
|-----------|---------|--------|
| ChatPromptTemplate |	query + context + history |	formatted messages |

In [7]:
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

chat_prompt = ChatPromptTemplate.from_messages([
    ("system",
     "You are a seasoned Document QA Expert and knowledgeable researcher. Your role is to carefully study and interpret the document context provided, "
     "and answer the user's query with nuanced insights and a deep understanding of the material. "
     "When formulating your response, consider all aspects of the content—its themes, data, structure, and any subtle nuances—without imposing a rigid structure such as always including 'Summary', 'Key Features', 'Benefits', or 'Conclusion' sections unless they naturally emerge from the document itself.\n\n"
     "Guidelines:\n"
     "1. **Contextual Adaptation:** Tailor your response based solely on the document context and the conversation history. Your answer should be organically structured to best reflect the content's unique characteristics and the query's specific focus.\n"
     "2. **Expert Insight:** Explain complex ideas clearly and provide detailed, thoughtful analysis. Think like a human expert who synthesizes information rather than a mechanical system that outputs preset sections.\n"
     "3. **Conversational Tone:** Maintain a professional yet engaging and conversational tone. Avoid a static, template-driven response. Let your natural voice guide the structure of the answer.\n"
     "4. **Evidence-Based Reasoning:** Draw on the provided document details and explicitly reference key points where relevant. Your answer should help the user understand not just what the document says, but why those points matter.\n"
     "5. **Flexibility:** Adjust your style, tone, and structure dynamically based on the nature of the document. If the document is narrative, be descriptive; if it is technical, be precise; if it contains tables or data, interpret and integrate that information seamlessly.\n"
    ),
    MessagesPlaceholder(variable_name="history"),
    ("human", "### User Query:\n{query}"),
    ("human", "### Relevant Document Context:\n{context}"),
])

# Step 5: LLM
| Component |	Input |	Output |
|-----------|---------|--------|
| TogetherChatModel |	formatted messages |	LLM response (text) |

In [8]:
from langchain_core.runnables import RunnableMap, RunnableLambda
from langchain_core.runnables.history import RunnableWithMessageHistory

In [9]:
from langchain_together import ChatTogether

llm = ChatTogether(
    model="meta-llama/Llama-3.3-70B-Instruct-Turbo-Free",
    temperature=0.5,
    top_p=0.9,
    together_api_key=os.getenv("TOGETHER_API_KEY")  
)


#### Chain Assembly

In [10]:
rag_chain = (
    RunnableMap({
        "query": lambda x: x["query"],
        "context": lambda x: x["context"],
        "history": lambda x: x["history"]
    })
    | chat_prompt
    | llm
)

#### Memory-Aware Wrapper

In [11]:
rag_with_memory = RunnableWithMessageHistory(
    rag_chain,
    get_message_history,
    input_messages_key="query",
    history_messages_key="history"
)

In [13]:
response = rag_with_memory.invoke(
    {
        "query": query,
        "context": context
    },
    config={"configurable": {"session_id": session_id}}
)

print(response.content)

It seems like you've provided a document context, but I'll summarize the main points for you.

The article is about upgrading and customizing your terminal experience using Zsh, Oh My Zsh, and Powerlevel10k. The goal is to make the terminal more aesthetically beautiful, descriptive, and functional.

The process starts by discarding Bash and switching to Zsh, which is the default shell in most Linux and Mac terminals. The author then installs Oh My Zsh, a framework for managing Zsh configurations, and Powerlevel10k, a theme for Zsh that provides a more customizable and functional interface.

The article highlights several plugins that can be used to enhance the terminal experience, including:

* zsh-syntax-highlighting: highlights commands in green if recognized by the shell and red if not, and provides hints about syntax errors
* zsh-autosuggestions: provides autocomplete suggestions based on command history
* fzf: a command-line fuzzy finder for searching recently used commands, files

In [14]:
rag_with_memory.invoke(
    {
        "query": "Does the content/article specifies the naem of author? Just give me name if you find it or else say Name not found",
        "context": context
    },
    config={"configurable": {"session_id": session_id}}
)

AIMessage(content='kjdeluna', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 4, 'prompt_tokens': 3044, 'total_tokens': 3048, 'completion_tokens_details': None, 'prompt_tokens_details': None, 'cached_tokens': 0}, 'model_name': 'meta-llama/Llama-3.3-70B-Instruct-Turbo-Free', 'system_fingerprint': None, 'id': 'npJAoqp-4Yz4kd-92d391a1ffb88b0f', 'finish_reason': 'stop', 'logprobs': None}, id='run-1e3af6ca-e81e-4249-87a5-b72f1600c079-0', usage_metadata={'input_tokens': 3044, 'output_tokens': 4, 'total_tokens': 3048, 'input_token_details': {}, 'output_token_details': {}})

# Step 6: Return & Persist
| Component |	Input |	Output |
|-----------|---------|--------|
| ChatMemory & Testing |	Save query/answer |	updated memory |

In [18]:
message_history = RedisChatMessageHistory(session_id=session_id, url=redis_url)

# Each message is a BaseMessage object (HumanMessage, AIMessage, etc.)
for msg in message_history.messages:
    role = "User" if msg.type == "human" else "AI"
    print(f"\n{role}: {msg.content}")


In [17]:
message_history.clear()