# RAG on HTML documents


## Step-1: Configuration

In [1]:
from my_config import MY_CONFIG
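
The `my_config` module is not shown in this notebook. As a rough sketch, it could look like the block below. The attribute names are the ones used in later cells, and `DB_URI` matches the path printed in Step-3; the other values are placeholders chosen for illustration, not the notebook's actual settings.

```python
# Hypothetical sketch of my_config.py; the real module is not shown here.
class MyConfig:
    pass


MY_CONFIG = MyConfig()

# Embedding model name passed to HuggingFaceEmbedding (placeholder, an assumption)
MY_CONFIG.EMBEDDING_MODEL = "BAAI/bge-small-en-v1.5"
# Vector dimension of that model; must match the model actually used
MY_CONFIG.EMBEDDING_LENGTH = 384
# Local database file, as printed by the Milvus connection cell
MY_CONFIG.DB_URI = "./rag_website.db"
# Collection holding the document chunks (placeholder name)
MY_CONFIG.COLLECTION_NAME = "pages"
# Replicate model identifier (placeholder; fill in the model you use)
MY_CONFIG.LLM_MODEL = "<owner>/<model-name>"
```

If the embedding model changes, `EMBEDDING_LENGTH` has to change with it, since the vector store is created with a fixed dimension.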

## Step-2: Setup Embeddings

In [2]:
# If connecting to https://huggingface.co/ fails, the lines below route downloads through a mirror.
# Comment them out if you can reach Hugging Face directly.
import os
os.environ['HF_ENDPOINT'] = 'https://hf-mirror.com'
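
A possible variation, assuming you want the mirror only as a fallback: `os.environ.setdefault` leaves any `HF_ENDPOINT` already set in the environment untouched. As in the cell above, this should run before the embedding import in the next cell.

```python
import os

# Use the mirror only when no endpoint is configured yet, so an existing
# HF_ENDPOINT value in the environment is not overwritten.
os.environ.setdefault("HF_ENDPOINT", "https://hf-mirror.com")
print("HF_ENDPOINT =", os.environ["HF_ENDPOINT"])
```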

In [3]:
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.core import Settings

Settings.embed_model = HuggingFaceEmbedding(
    model_name = MY_CONFIG.EMBEDDING_MODEL
)



## Step-3: Connect to Milvus

In [4]:
# connect to vector db
from llama_index.core import VectorStoreIndex, StorageContext
from llama_index.vector_stores.milvus import MilvusVectorStore

vector_store = MilvusVectorStore(
    uri=MY_CONFIG.DB_URI,
    dim=MY_CONFIG.EMBEDDING_LENGTH,
    collection_name=MY_CONFIG.COLLECTION_NAME,
    overwrite=False,  # keep existing data so we can load the index from db
)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

print("✅ Connected to Milvus instance: ", MY_CONFIG.DB_URI)

2025-03-28 14:43:57,576 [DEBUG][_create_connection]: Created new connection using: d330a7f6f1e74a6081facd45190a5cc4 (async_milvus_client.py:600)


✅ Connected to Milvus instance:  ./rag_website.db


## Step-4: Load Document Index from DB

In [5]:
%%time

from llama_index.core import VectorStoreIndex

index = VectorStoreIndex.from_vector_store(
    vector_store=vector_store,
    storage_context=storage_context,
)

print("✅ Loaded index from vector db:", MY_CONFIG.DB_URI)

✅ Loaded index from vector db: ./rag_website.db
CPU times: user 92.8 ms, sys: 9.95 ms, total: 103 ms
Wall time: 102 ms
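
With the index loaded, each query is embedded with the same embedding model and compared against the stored chunk vectors. The toy sketch below illustrates that idea in plain Python. It assumes cosine similarity as the metric (the metric configured for this collection is not shown in the notebook), and the three-dimensional vectors and chunk names are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=2):
    """Return the k (score, name) pairs most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), name)
              for name, vec in doc_vecs.items()]
    return sorted(scored, reverse=True)[:k]


# Made-up vectors standing in for stored chunk embeddings.
doc_vecs = {
    "about": [1.0, 0.0, 0.0],
    "projects": [0.0, 1.0, 0.0],
    "events": [0.0, 0.0, 1.0],
}
query_vec = [0.9, 0.1, 0.0]

for score, name in top_k(query_vec, doc_vecs):
    print(f"{name}: {score:.3f}")
```

The retrieved chunks are what the LLM later receives as context; a real vector database uses an approximate index rather than this exhaustive scan.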


## Step-5: Setup LLM

In [6]:
from llama_index.llms.replicate import Replicate
from llama_index.core import Settings

llm = Replicate(
    model= MY_CONFIG.LLM_MODEL,
    temperature=0.1
)

Settings.llm = llm
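
The notebook does not show how the Replicate credentials are supplied. Assuming they come from the `REPLICATE_API_TOKEN` environment variable, which the Replicate client conventionally reads, a small check like this one can catch a missing token before the first query. The helper name is hypothetical.

```python
import os


def replicate_token_present(env=os.environ):
    """Return True when a non-empty REPLICATE_API_TOKEN is set (assumed variable name)."""
    return bool(env.get("REPLICATE_API_TOKEN", "").strip())


if not replicate_token_present():
    print("REPLICATE_API_TOKEN is not set; calls to Replicate will likely fail.")
```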

## Step-6: Query

In [7]:
query_engine = index.as_query_engine()
res = query_engine.query("What is AI Alliance?")
print(res)

The AI Alliance is an international community of researchers, developers, and organizational leaders committed to supporting and enhancing open innovation across the AI technology landscape. Its primary focus is to accelerate progress, improve safety, security, and trust in AI, and maximize benefits to people and society worldwide. The Alliance aims to foster an open community where developers and researchers can collaborate responsibly to innovate in AI, ensuring scientific rigor, trust, safety, security, diversity, and economic competitiveness. It brings together over 100 members from various sectors, including companies, startups, universities, research institutions, government organizations, and non-profit foundations, to work on aspects of AI education, research, development, deployment, and governance. The AI Alliance will operate through member-driven working groups, a governing board, and technical oversight committees, while also partnering with existing initiatives from gover
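
The query cells in this step all follow the same pattern: build a query engine, ask one question, print the response. A minimal sketch of a helper that runs a list of questions through one engine is shown below. `ask_all` and `EchoEngine` are hypothetical names; the stand-in engine exists only so the sketch runs without a vector db or an LLM.

```python
def ask_all(query_engine, questions):
    """Run each question through the engine and return (question, answer) pairs."""
    results = []
    for question in questions:
        response = query_engine.query(question)
        results.append((question, str(response)))
    return results


class EchoEngine:
    """Stand-in for a real query engine; it only echoes the question."""

    def query(self, question):
        return f"(answer to: {question})"


questions = ["What is AI Alliance?", "Where was the demo night held?"]
for question, answer in ask_all(EchoEngine(), questions):
    print(question, "->", answer)
```

In the notebook, `EchoEngine()` would be replaced by `index.as_query_engine()`, created once and reused for every question.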

In [8]:
query_engine = index.as_query_engine()
res = query_engine.query("What are the main focus areas of AI Alliance?")
print(res)

The AI Alliance has six goal-oriented thematic programs, which are referred to as Focus Areas. These Focus Areas are:

1. AI for Social Good
2. AI Safety and Security
3. AI Ethics and Fairness
4. AI Education and Workforce Development
5. AI Governance and Policy
6. AI Infrastructure and Scalability

These Focus Areas guide the AI Alliance's efforts to improve foundational capabilities, safety, security, and trust in AI, while responsibly maximizing benefits to people and society worldwide.


In [9]:
query_engine = index.as_query_engine()
res = query_engine.query("What are some ai alliance projects?")
print(res)

Based on the provided context, the AI Alliance consists of two main types of projects:

1. **Core Projects**: These are foundational building blocks of the AI Alliance. They are managed directly by the AI Alliance and have goals and a roadmap to address substantial cross-community challenges. Core Projects are openly licensed and are opportunities for individual contributors and members to collaborate and make an impact on the future of AI.

2. **Affiliated Projects**: These are typically from member organizations who seek deeper collaboration and impact, often by building with or on Core Projects or other Affiliated Projects. The project owners retain full management of the project, but they must have open permissively licensed artifacts and clear community calls-to-action.

The context does not provide a specific list of Core or Affiliated Projects. However, it emphasizes that the AI Alliance encourages grassroots collaboration and supports the discovery, enabling, and scaling of goo

In [10]:
query_engine = index.as_query_engine()
res = query_engine.query("Where was the demo night held?")
print(res)

The demo night, Open Source AI Demo Night, was held in San Francisco, California.


In [11]:
query_engine = index.as_query_engine()
res = query_engine.query("When was the moon landing?")
print(res)

The context information provided does not contain any details about the moon landing. It discusses various AI-related projects, frameworks, and initiatives, but there is no mention of a moon landing event.
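
When an answer says the context has no relevant details, as it does here, it can help to look at which chunks were retrieved and how they scored. The sketch below assumes the response object exposes a `source_nodes` list whose items have a `score` and a `node` with `get_content()`, as LlamaIndex responses generally do. `summarize_sources` is a hypothetical helper, and the stand-in response holds made-up data so the block runs on its own.

```python
from types import SimpleNamespace


def summarize_sources(response, max_chars=60):
    """Return (score, snippet) pairs for the chunks retrieved for a response."""
    rows = []
    for item in getattr(response, "source_nodes", []):
        text = item.node.get_content().replace("\n", " ")
        rows.append((item.score, text[:max_chars]))
    return rows


# Stand-in response with made-up data; in the notebook, pass `res` instead.
fake_response = SimpleNamespace(
    source_nodes=[
        SimpleNamespace(
            score=0.42,
            node=SimpleNamespace(get_content=lambda: "example chunk text"),
        )
    ]
)

for score, snippet in summarize_sources(fake_response):
    print(f"{score:.2f}  {snippet}")
```

Low scores across all retrieved chunks are a hint that the question falls outside the indexed documents.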


In [12]:
query_engine = index.as_query_engine()
res = query_engine.query("What is the AI Alliance doing in the area of material science?")
print(res)

The AI Alliance is working on the Materials and Chemistry Working Group, which focuses on curating datasets, tasks, and benchmarks for materials science. They aim to build out foundation models in chemistry for predicting properties, experimental outcomes, or generating new candidates. Additionally, they are creating a framework to foster collaboration between human experts and AI agents to address global urgent challenges in sustainability and the safety of materials.


In [13]:
query_engine = index.as_query_engine()
res = query_engine.query("How do I join the AI Alliance?")
print(res)

To join the AI Alliance, you can follow these steps:

1. **Apply to become a collaborator**: Submit the form at the "Become a collaborator" section to express your interest in joining one of the working groups or major initiatives. The AI Alliance will review your request and get back to you within 30 days or fewer.

2. **Subscribe to the community newsletter**: By submitting the form at the "Subscribe to our community newsletter" section, you agree that the AI Alliance will collect and process your personal information to keep you informed about AI Alliance initiatives and enable your involvement in AI Alliance activities. You can also request a permanent deletion of your personal information at any time.

3. **Join the community**: The AI Alliance is a community of leading AI innovators, researchers, developers, engineers, and early enterprise adopters. You can attend events, contribute to projects, and engage with the community through their website and social media channels.

4. **