In [1]:
import os

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.huggingface import HuggingFaceLLM
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Set up embedding model
Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-base-en-v1.5")

# Model and tokenizer (using 4-bit quantization)
model_name = "mistralai/Mistral-7B-Instruct-v0.1"
# Read the Hugging Face access token from the environment instead of hard-coding it.
# If HF_TOKEN is unset this is None, and transformers falls back to any cached login.
hf_token = os.environ.get("HF_TOKEN")

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype="bfloat16",
    bnb_4bit_use_double_quant=True
)

tokenizer = AutoTokenizer.from_pretrained(model_name, token=hf_token)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,  # 4-bit quantization
    device_map="auto",  # Auto GPU mapping
    low_cpu_mem_usage=True,
    token=hf_token
)

# Set up the LLM in LlamaIndex, reusing the model and tokenizer loaded above.
# device_map and model_kwargs are omitted here: they only matter when HuggingFaceLLM
# loads the model itself, and this model is already loaded and quantized.
Settings.llm = HuggingFaceLLM(
    model_name=model_name,
    model=model,
    tokenizer=tokenizer,
)

# Load documents from the "data" directory
documents = SimpleDirectoryReader("data").load_data()

# Create the index (embeds every document chunk with Settings.embed_model)
index = VectorStoreIndex.from_documents(documents)

# Persist the index for future use
index.storage_context.persist(persist_dir="./storage")

# Create a query engine
query_engine = index.as_query_engine()

# Ask a sample query
response = query_engine.query("Summarize the key points from these documents.")
print(response)

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.




The documents discuss the political process of renegotiation of the United Kingdom's (UK) demands on matters of sovereignty since its launch by David Cameron in January 2013. The demands include "an even closer union," subsidiarity, the role of national parliaments, the British exception with regard to the Area of Freedom, Security and Justice (AFSJ) and the issue of national security. The documents analyze the Prime Minister's explicit request, the position of the main Member States, the negotiation and its outcome, and the reasons for the demand from the British viewpoint. The documents also discuss the impact that the result may have both on the current model of the EU constitutionalized in the Treaties and on the integration project, that is, the chances of its future development. The documents also discuss the European Council agreement of February 2016, in both its political and legal aspects. The study of the legal nature of the Decision by the Heads of State or Government is 
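
Note: the summary above stops mid-sentence ("... Heads of State or Government is"), which usually means generation hit a token limit rather than finishing naturally. A minimal sketch for flagging such responses is shown below. The helper name `looks_truncated` and its punctuation heuristic are illustrative assumptions, not part of LlamaIndex.

```python
# Hypothetical helper (not a LlamaIndex API): flag text that ends without
# sentence-final punctuation, a common sign that generation was cut off.
SENTENCE_END = (".", "!", "?")
CLOSERS = "\"')]"  # closing quotes/brackets that may follow the final period


def looks_truncated(text):
    """Return True if non-empty text does not end in '.', '!' or '?'."""
    stripped = text.rstrip().rstrip(CLOSERS)
    return bool(stripped) and not stripped.endswith(SENTENCE_END)


tail = "The study of the legal nature of the Decision by the Heads of State or Government is "
print(looks_truncated(tail))
print(looks_truncated("The committee agreed to the amendments."))
```

In the notebook this could be called as `looks_truncated(str(response))`. If it returns True, one thing to try is passing a larger `max_new_tokens` to `HuggingFaceLLM`; the limit in effect here is assumed to be the library default, which may differ between versions.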

In [3]:
response = query_engine.query("Give me a quick summary of the UK gov renegotiation?")
print(response)

Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.



The UK Government’s renegotiation of EU membership was discussed in Parliament. The 
Parliamentary Sovereignty and Scrutiny committee discussed various paragraphs and agreed 
to them. The committee also discussed and agreed to various amendments. The committee 
also discussed and agreed to various provisions in the Referendum Bill concerning 
information to be provided using Government resources. The committee also discussed 
and disagreed to a paragraph concerning the provision of the Government’s opinion.
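
Note: the two queries return quite different material (the European Council agreement versus parliamentary committee proceedings). A likely reason is that the query engine only passes the few chunks most similar to the query on to the LLM, so each question sees a different slice of the documents. The sketch below illustrates that retrieval step in plain Python. The chunk names, the 3-dimensional vectors, and `k=2` are made-up assumptions for illustration; the real index uses the bge embeddings and LlamaIndex's own retriever.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(query_vec, chunk_vecs, k=2):
    """Return the ids of the k chunks most similar to the query, best first."""
    scored = [(cosine_similarity(query_vec, vec), chunk_id)
              for chunk_id, vec in chunk_vecs.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk_id for _, chunk_id in scored[:k]]


# Toy embeddings (hypothetical values, not produced by a real model).
toy_chunks = {
    "council_agreement": [0.9, 0.1, 0.0],
    "referendum_bill": [0.1, 0.9, 0.1],
    "committee_minutes": [0.0, 0.8, 0.6],
}
broad_query = [1.0, 0.2, 0.0]
narrow_query = [0.0, 0.7, 0.7]

print(top_k_chunks(broad_query, toy_chunks))   # ['council_agreement', 'referendum_bill']
print(top_k_chunks(narrow_query, toy_chunks))  # ['committee_minutes', 'referendum_bill']
```

With the real engine, the number of retrieved chunks can be adjusted through the `similarity_top_k` argument of `index.as_query_engine()`; raising it gives the LLM more context per question at the cost of a longer prompt.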
