## 📂 Data Loading

In [33]:
# pip install llama-index llama-index-embeddings-huggingface llama-index-llms-huggingface openai python-dotenv gradio transformers torch

from llama_index.core import SimpleDirectoryReader, VectorStoreIndex, Settings, get_response_synthesizer
from llama_index.core.schema import TextNode
from llama_index.core.retrievers import VectorIndexRetriever
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.core.postprocessor import SimilarityPostprocessor

from openai import OpenAI
from dotenv import load_dotenv
import os
from IPython.display import display, Markdown
import torch, gradio as gr

# 3. Load documents
docs = SimpleDirectoryReader("Data/").load_data()

## 🔑 API Key Setup

In [15]:
# Load the secret key from the .env file

load_dotenv()

# Connect to the OpenAI API
client = OpenAI(api_key=os.getenv("BOBAD_OPENAI_KEY"))
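
`os.getenv` returns `None` when a variable is missing, so a typo in the `.env` file only surfaces later as an authentication error. A minimal sketch of a fail-fast check follows; `require_env` is a hypothetical helper, not part of `dotenv` or `openai`.

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`, or raise if it is unset or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(
            f"Environment variable {name} is not set; add it to your .env file."
        )
    return value


# Assumed usage in this notebook, after load_dotenv():
# client = OpenAI(api_key=require_env("BOBAD_OPENAI_KEY"))
```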

## 🔎 Embedding & Vectorization

In [16]:
# 4. Build index
index = VectorStoreIndex.from_documents(docs)

print(f"Embedding Model: {index._embed_model.model_name}")
print(f"Index Size: {len(index.vector_store.data.embedding_dict)}")
first_key = list(index.vector_store.data.embedding_dict.keys())[0]
first_embedding = index.vector_store.data.embedding_dict[first_key]
print(f"First embedding shape: {len(first_embedding)}")

Embedding Model: text-embedding-ada-002
Index Size: 20
First embedding shape: 1536


## 📊 Similarity Computation

In [17]:
# configure retriever
retriever = VectorIndexRetriever(
    index=index,
    similarity_top_k=10,
)
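
For intuition, the sketch below shows what a top-k retrieval step does conceptually, assuming cosine similarity between the query embedding and each stored embedding. It is a pure-Python illustration, not LlamaIndex's implementation; `cosine_similarity`, `top_k`, and the 3-dimensional toy vectors are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, embeddings, k):
    """Return the k (node_id, score) pairs most similar to query_vec, best first."""
    scored = [
        (node_id, cosine_similarity(query_vec, vec))
        for node_id, vec in embeddings.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy embeddings standing in for the 1536-dimensional vectors in the index
toy_embeddings = {
    "page-1": [1.0, 0.0, 0.0],
    "page-2": [0.8, 0.6, 0.0],
    "page-3": [0.0, 0.0, 1.0],
}
toy_hits = top_k([1.0, 0.0, 0.0], toy_embeddings, k=2)
print(toy_hits)
```

With `similarity_top_k=10` and 20 embedded nodes, the retriever configured above returns the ten highest-scoring nodes for each query.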

In [18]:
results = retriever.retrieve("what is Domain Driven Design?")

In [20]:
display(results)

[NodeWithScore(node=TextNode(id_='1a680426-d4f0-4ef8-b633-37565e519df2', embedding=None, metadata={'page_label': '1', 'file_name': 'A Crash Course on Domain-Driven Design.pdf', 'file_path': '/Users/dubeys/Documents/ABB/CortexLab/Assignment-3/Data/A Crash Course on Domain-Driven Design.pdf', 'file_type': 'application/pdf', 'file_size': 4062016, 'creation_date': '2024-10-08', 'last_modified_date': '2024-10-08'}, excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], relationships={<NodeRelationship.SOURCE: '1'>: RelatedNodeInfo(node_id='ef8ff1e7-b819-4602-873f-6c6f436233c5', node_type='4', metadata={'page_label': '1', 'file_name': 'A Crash Course on Domain-Driven Design.pdf', 'file_path': '/Users/dubeys/Documents/ABB/CortexLab/Assignment-3/Data/A Crash Course on Domain-Driven Design.pdf

In [22]:
# format results in markdown
results_markdown = ""
for i, result in enumerate(results, start=1):
    doc = result.node  # Access the TextNode wrapped by NodeWithScore
    text_snippet = doc.text  # Extract text
    metadata = doc.metadata  # Extract metadata dictionary

    # Example: get file_name or page_label safely
    file_name = metadata.get("file_name", "N/A")
    page_label = metadata.get("page_label", "N/A")

    results_markdown += f"{i}. **File:** {file_name}  \n"
    results_markdown += f"   **Page:** {page_label}  \n"
    results_markdown += f"   **Snippet:** {text_snippet[:500]}...  \n\n"  # Limit snippet to first 500 chars

In [26]:
display(Markdown(results_markdown))

1. **File:** A Crash Course on Domain-Driven Design.pdf  
   **Page:** 1  
   **Snippet:** A Crash Course on Domain-Driven Design
AUG 01, 2024
1 21 Share
Developing so ware for complex domains is a challenging task. 
As the complexity of the problem domain grows, it becomes increasingly di cult to create
so ware that accurately represents the business concepts, rules, and processes. Poorly designed
so ware can quickly turn into an incomprehensible tangle of code that is di cult to understand,
maintain, and extend.
Domain-Driven Design (DDD) o ers a solution to this problem.
DDD is an ...  

2. **File:** A Crash Course on Domain-Driven Design.pdf  
   **Page:** 19  
   **Snippet:** DDD focuses on creating a rich domain model that reflects a deep understanding of the
business domain. The domain model is based on input from domain experts and serves as a
conceptual framework for the software.
A ubiquitous language is cultivated based on the domain model. This language is used
consistently by developers and domain experts in all communication - in conversations,
documentation, and the code itself. Using the ubiquitous language helps keep the model
and implementation aligned.
Th...  

3. **File:** A Crash Course on Domain-Driven Design.pdf  
   **Page:** 2  
   **Snippet:** Domain-Driven Design (DDD) focuses on creating software systems that closely align with the
underlying business domain. 
It aims to bridge the gap between the technical implementation and the business requirements
by placing the domain model at the center of the development process.
Core Principles of Domain-Driven Design...  

4. **File:** A Crash Course on Domain-Driven Design.pdf  
   **Page:** 5  
   **Snippet:** In DDD, the domain model is not just a conceptual tool. It is the very foundation of the software
design. The structure and behavior of the software mirror the structure and behavior of the
model. 
This approach is known as model-driven design.
In practice, it means that the classes, relationships, and behaviors in the code directly
correspond to the concepts, relationships, and rules in the domain model. The design is not
driven by technical concerns or infrastructure details but by the need to e...  

5. **File:** A Crash Course on Domain-Driven Design.pdf  
   **Page:** 14  
   **Snippet:** Factories ensure that the objects they create are always in a consistent and fully initialized
state.
For example, an “OrderFactory” in an e-commerce system can take all the necessary details
(customer, items, quantities, etc.), create the “Order” and “LineItem” instances, link them
together appropriately, and return the resulting Order aggregate to the client.
When dealing with large models in big projects, some strategic design concepts should be
followed to gain the benefits of domain-driven d...  

6. **File:** A Crash Course on Domain-Driven Design.pdf  
   **Page:** 3  
   **Snippet:** There are three core principles of DDD:
Creating a rich domain model based on input from domain experts
Using a ubiquitous language based on the domain model
Driving the design of the software from the domain model
Let's explore each of these principles in more detail.
The foundation of DDD lies in the creation of a rich domain model that accurately captures the
key concepts, relationships, and business rules of the problem domain. This model is not created
in isolation by the development team bu...  

7. **File:** A Crash Course on Domain-Driven Design.pdf  
   **Page:** 20  
   **Snippet:** It would be even better if you included Domain Events as a part of tactical DDD components or
patterns. Though it's a newer concept it still plays a crucial role in the domain model when raising
business logic events in event-driven systems.
LIKE (5) REPLY SHARE
© 2024 ByteByteGo ∙ Privacy ∙ Terms ∙ Collection notice
Substack is the home for great culture...  

8. **File:** A Crash Course on Domain-Driven Design.pdf  
   **Page:** 7  
   **Snippet:** persistence, messaging, logging, and other technical services. 
For example, in an e-commerce system, the Domain Layer would contain core business concepts
like Product, Order, and Customer. The UI Layer would have screens for product display,
shopping cart, and checkout. The Application Layer would handle the flow of the order
placement process. The Infrastructure Layer would provide functionalities like persistence,
messaging, etc.
In Domain-Driven Design (DDD), an entity is a fundamental conce...  

9. **File:** A Crash Course on Domain-Driven Design.pdf  
   **Page:** 6  
   **Snippet:** Typically, a layered architecture consists of four main layers:
User Interface (Presentation) Layer: This layer is responsible for presenting information to
the user and interpreting user commands. It communicates with the Application Layer to
send user requests and receive responses.
Application Layer: The Application Layer coordinates the overall application activity and
orchestrates the data flow between the UI and the Domain Layer. It does not contain
business logic but delegates it to the Do...  

10. **File:** A Crash Course on Domain-Driven Design.pdf  
   **Page:** 13  
   **Snippet:** These methods allow the client code to interact with the “CustomerRepository” in a way that
mimics a collection of “Customer” objects. 
The client can add, remove, and query customers without worrying about how the data is stored
in the database. Behind the scenes, the repository may use an ORM tool, SQL query, or any other
persistence mechanism to interact with the underlying storage system.
In domain-driven design, a factory is a pattern that encapsulates the creation of complex entities
or ag...  



In [29]:
# configure response synthesizer
response_synthesizer = get_response_synthesizer()

## 🧠 Query Engine

In [30]:
# assemble query engine
query_engine = RetrieverQueryEngine(
    retriever=retriever,
    response_synthesizer=response_synthesizer,
    node_postprocessors=[SimilarityPostprocessor(similarity_cutoff=0.7)],
)
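
The effect of a similarity cutoff can be sketched in a few lines: retrieved chunks whose score falls below the threshold are dropped before the answer is synthesized. This is a simplified illustration under that assumption, not the library's code; `ScoredChunk`, `apply_cutoff`, and the scores are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ScoredChunk:
    """A retrieved text chunk together with its similarity score."""
    text: str
    score: float


def apply_cutoff(chunks, cutoff):
    """Keep only chunks whose score is at least `cutoff`, preserving their order."""
    return [chunk for chunk in chunks if chunk.score >= cutoff]


# Hypothetical scores; real scores come from the retriever
retrieved = [
    ScoredChunk("DDD is an approach to software development...", 0.82),
    ScoredChunk("There are three core principles of DDD...", 0.70),
    ScoredChunk("Privacy, Terms, Collection notice", 0.65),
]
kept = apply_cutoff(retrieved, cutoff=0.7)
print([chunk.score for chunk in kept])
```

A higher cutoff passes fewer, more relevant chunks to the synthesizer; if it is set too high, no chunks may survive and the engine has nothing to answer from.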

In [31]:
response = query_engine.query("what is Domain-Driven Design ?")
print(response)

Domain-Driven Design (DDD) is an approach to software development that focuses on modeling the core domain and business logic to guide the software design process. It emphasizes creating a rich domain model based on input from domain experts, using a ubiquitous language for communication, and driving the software design from the domain model itself. By placing the primary focus on the core domain, basing designs on the domain model, and fostering collaboration between technical and domain experts, DDD aims to manage complexity and create maintainable and extensible systems aligned with the business domain.


## 💬 Gradio UI

In [34]:
# 6. Gradio UI

def rag_chat(user_input: str) -> str:
    """Answer a user question with the RAG query engine and return the response as plain text."""
    return str(query_engine.query(user_input))

iface = gr.Interface(
    fn=rag_chat,
    inputs="text",
    outputs="text",
    title="📚 RAG Chatbot",
    description="Ask questions about your local documents (documents are read locally; embeddings use OpenAI's text-embedding-ada-002)."
)

iface.launch(inbrowser=True, share=False)

* Running on local URL:  http://127.0.0.1:7860
* To create a public link, set `share=True` in `launch()`.




Created dataset file at: .gradio/flagged/dataset1.csv
