# RAG on HTML documents


## Step-1: Configuration

In [1]:
from my_config import MY_CONFIG
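
The notebook does not show `my_config.py`. Below is a minimal sketch of what such a module might contain. Only the `DB_URI`, `LLM_RUN_ENV`, and `LLM_MODEL` values appear in this notebook's printed output; the embedding model, its dimension, and the collection name are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MyConfig:
    # Values that appear in this notebook's printed output
    DB_URI: str = "workspace/rag_website_milvus.db"
    LLM_RUN_ENV: str = "local_ollama"  # or 'replicate'
    LLM_MODEL: str = "qwen3:0.6b"

    # Assumptions: these values are not shown anywhere in the notebook
    EMBEDDING_MODEL: str = "BAAI/bge-small-en-v1.5"
    EMBEDDING_LENGTH: int = 384  # must match the embedding model's output size
    COLLECTION_NAME: str = "pages"  # hypothetical collection name


MY_CONFIG = MyConfig()
```

Whatever the real values are, `EMBEDDING_LENGTH` has to agree with the vectors produced by `EMBEDDING_MODEL`, since it is passed to Milvus as `dim` in Step-3.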

## Step-2: Setup Embeddings

In [2]:
# If the connection to https://huggingface.co/ fails, route downloads through the mirror below.
# Comment out the HF_ENDPOINT line to use the default endpoint.
import os
os.environ['HF_ENDPOINT'] = 'https://hf-mirror.com'

In [3]:
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.core import Settings

Settings.embed_model = HuggingFaceEmbedding(
    model_name = MY_CONFIG.EMBEDDING_MODEL
)



## Step-3: Connect to Milvus

In [4]:
# connect to vector db
from llama_index.core import VectorStoreIndex, StorageContext
from llama_index.vector_stores.milvus import MilvusVectorStore

vector_store = MilvusVectorStore(
    uri = MY_CONFIG.DB_URI ,
    dim = MY_CONFIG.EMBEDDING_LENGTH , 
    collection_name = MY_CONFIG.COLLECTION_NAME,
    overwrite=False  # so we load the index from db
)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

print ("✅ Connected to Milvus instance: ", MY_CONFIG.DB_URI )

2025-05-14 00:50:33,771 [DEBUG][_create_connection]: Created new connection using: a733e3dfd02845258053e25013d61c31 (async_milvus_client.py:600)


✅ Connected to Milvus instance:  workspace/rag_website_milvus.db


## Step-4: Load Document Index from DB

In [5]:
%%time

from llama_index.core import VectorStoreIndex

index = VectorStoreIndex.from_vector_store(
    vector_store=vector_store, storage_context=storage_context)

print ("✅ Loaded index from vector db:", MY_CONFIG.DB_URI )

✅ Loaded index from vector db: workspace/rag_website_milvus.db
CPU times: user 137 ms, sys: 13.5 ms, total: 150 ms
Wall time: 149 ms


## Step-5: Setup LLM

In [6]:
from llama_index.llms.ollama import Ollama
from llama_index.llms.replicate import Replicate

# Setup LLM
if MY_CONFIG.LLM_RUN_ENV == 'replicate':
    # Fail fast if the API token is missing, before creating the client
    if os.getenv('REPLICATE_API_TOKEN'):
        print("✅ Found REPLICATE_API_TOKEN")
    else:
        raise ValueError("❌ Please set the REPLICATE_API_TOKEN environment variable in the .env file.")
    llm = Replicate(
        model=MY_CONFIG.LLM_MODEL,
        temperature=0.1
    )
elif MY_CONFIG.LLM_RUN_ENV == 'local_ollama':
    llm = Ollama(
        model= MY_CONFIG.LLM_MODEL,
        request_timeout=30.0,
        temperature=0.1
    )
else:
    raise ValueError("❌ Invalid LLM run environment. Please set it to 'replicate' or 'local_ollama'.")
print("✅ LLM run environment: ", MY_CONFIG.LLM_RUN_ENV)    
print("✅ Using LLM model : ", MY_CONFIG.LLM_MODEL)
Settings.llm = llm

✅ LLM run environment:  local_ollama
✅ Using LLM model :  qwen3:0.6b


## Step-6: Query

In [7]:
import query_utils

query_engine = index.as_query_engine()
query = query_utils.tweak_query('What is AI Alliance?', MY_CONFIG.LLM_MODEL)
res = query_engine.query(query)
print(res)

<think>

</think>

The AI Alliance is a collaborative effort involving members from various sectors, including non-profits, government, and academic institutions, aimed at fostering collaboration and aligning AI skills and knowledge across different fields. It seeks to improve outcomes for students and job seekers, enhance the relevance of training programs, and support more collaborative and transparent approaches between academic and corporate partners. The Alliance is growing, with a diverse range of members in 23 countries and is expanding its influence.
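
`query_utils.tweak_query` is not shown in this notebook. The empty `<think>` block in the output suggests it may switch off Qwen3's thinking mode, which Qwen3 supports through a `/no_think` marker in the prompt. The sketch below is a guess at that behavior, not the actual helper. `strip_think` is a hypothetical addition that removes the leftover tags before printing.

```python
import re


def tweak_query(query: str, llm_model: str) -> str:
    """Append Qwen3's '/no_think' soft switch for Qwen3 models (assumed behavior)."""
    if "qwen3" in llm_model.lower():
        return f"{query}\n/no_think"
    return query


# Matches a <think>...</think> block, including an empty one, plus trailing whitespace
_THINK_BLOCK = re.compile(r"<think>.*?</think>\s*", flags=re.DOTALL)


def strip_think(text: str) -> str:
    """Remove <think> blocks from a model response (hypothetical helper)."""
    return _THINK_BLOCK.sub("", text).strip()
```

With a helper like this, `print(strip_think(str(res)))` would print only the answer text.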


In [8]:
query_engine = index.as_query_engine()
query = query_utils.tweak_query('What are the main focus areas of AI Alliance?', MY_CONFIG.LLM_MODEL)
res = query_engine.query(query)
print(res)

<think>

</think>

The main focus areas of AI Alliance, as outlined in the context, are:

1. **Deploy benchmarks, tools, and other resources** to enable responsible development and use of AI systems at a global scale.
2. **Create a catalog of vetted safety, security, and trust tools**.
3. **Support the development and use of AI systems** through open-source tools and collaboration.
4. **Foster a vibrant AI hardware accelerator ecosystem**.
5. **Develop educational content and resources** to inform the public and policymakers about AI.
6. **Launch initiatives** to encourage open development of AI in safe and beneficial ways.


In [9]:
query_engine = index.as_query_engine()
query = query_utils.tweak_query('What are some ai alliance projects?', MY_CONFIG.LLM_MODEL)
res = query_engine.query(query)
print(res)

<think>

</think>

Some AI Alliance projects include the development of frameworks for platform software such as PyTorch, Transformers, Diffusers, Kubernetes, Ray, Hugging Face, and Parameter Efficient Fine Tuning. Additionally, the AI Alliance includes the creation of open models like Llama2, Stable Diffusion, StarCoder, Bloom, and many others.


In [10]:
query_engine = index.as_query_engine()
query = query_utils.tweak_query('Where was the demo night held?', MY_CONFIG.LLM_MODEL)
res = query_engine.query(query)
print(res)

<think>

</think>

The AI Alliance hosted the Open Source AI Demo Night in San Francisco.
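
Each query cell in this step repeats the same calls with a different question. One way to batch them is sketched below; `ask_all` is a hypothetical helper, not part of `query_utils`, and it only assumes that the engine has a `query` method whose result converts to text with `str()`.

```python
def ask_all(query_engine, questions, tweak=None):
    """Run each question through the engine and return {question: answer text}."""
    answers = {}
    for question in questions:
        # Optionally rewrite the question before sending it to the engine
        prompt = tweak(question) if tweak else question
        answers[question] = str(query_engine.query(prompt))
    return answers
```

In this notebook it could be called as `ask_all(query_engine, ['What is AI Alliance?', 'How do I join the AI Alliance?'], tweak=lambda q: query_utils.tweak_query(q, MY_CONFIG.LLM_MODEL))`.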


In [11]:
query_engine = index.as_query_engine()
query = query_utils.tweak_query('What is the AI Alliance doing in the area of material science?', MY_CONFIG.LLM_MODEL)
res = query_engine.query(query)
print(res)

<think>

</think>

The AI Alliance is focused on advancing AI systems that address challenges in climate, human health, and beyond. It is also working on creating a catalog of vetted safety, security, and trust tools, supporting the development and use of AI systems, and fostering an ecosystem that promotes responsible AI development.


In [12]:
query_engine = index.as_query_engine()
query = query_utils.tweak_query('How do I join the AI Alliance?', MY_CONFIG.LLM_MODEL)
res = query_engine.query(query)
print(res)

<think>

</think>

To join the AI Alliance, you can visit the [contact form](https://thealliance.ai/contact) and submit a message to our membership team. Once your application is reviewed and approved, you will be invited to the AI Alliance Slack and receive additional instructions on how to join our community.


In [13]:
query_engine = index.as_query_engine()
query = query_utils.tweak_query('When was the moon landing?', MY_CONFIG.LLM_MODEL)
res = query_engine.query(query)
print(res)

<think>

</think>

The information provided in the context is about AI-related projects, demos, and open-source initiatives, and there is no mention of the moon landing or its date. Therefore, based on the given context, I cannot provide a specific answer to the query about the moon landing.
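
When an answer is generic (the material science query) or declined (the moon landing query), it can help to look at which chunks were retrieved. In LlamaIndex, a query response exposes them as `source_nodes`, each carrying a similarity `score` and a `node`. The helper below is a sketch: `format_sources` and its `min_score` parameter are hypothetical names, and a useful threshold depends on the embedding model.

```python
def format_sources(response, min_score: float = 0.0, preview_chars: int = 80):
    """Return one summary line per retrieved chunk with score >= min_score."""
    lines = []
    for item in getattr(response, "source_nodes", []):
        # A score can be missing; treat it as 0.0 for filtering and display
        score = item.score if item.score is not None else 0.0
        if score < min_score:
            continue
        text = item.node.get_content().replace("\n", " ")
        lines.append(f"{score:.3f}  {text[:preview_chars]}")
    return lines
```

Running `for line in format_sources(res): print(line)` after a query shows whether the retrieved chunks relate to the question at all.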
