- System Architecture
- Llama-index Setup
- Loading Documents
- Creating the Vector Store Index
- Creating Query Engine
- Q&A Over Documents
- How to Persist the Index
- What is inside the Index?
- How to change the default LLM
- Change the Chunk Size
- Use Open Source LLM with Llama Index

In [6]:
!pip install -q llama-index
!pip install -q transformers
!pip install -q accelerate

In [5]:
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from IPython.display import Markdown, display

In [7]:
documents = SimpleDirectoryReader("data").load_data()

In [8]:
documents

[Document(id_='84d9f69f-18cd-42ff-9f78-f35400c3ec09', embedding=None, metadata={'page_label': '1', 'file_name': 'Advanced auditing exam notes.pdf', 'file_path': '/content/data/Advanced auditing exam notes.pdf', 'file_type': 'application/pdf', 'file_size': 262194, 'creation_date': '2024-04-10', 'last_modified_date': '2024-04-10'}, excluded_embed_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], excluded_llm_metadata_keys=['file_name', 'file_type', 'file_size', 'creation_date', 'last_modified_date', 'last_accessed_date'], relationships={}, text="Advanced auditing  \nModule 1  \n1. Qualification and disqualification of a company auditor.  \nThe qualification and disqualification of a company auditor are critical aspects of \ncorporate governance and financial oversight. Auditors play a vital role in ensuring the \naccuracy and reliability of a company's financial statements. Here is  a brief overview \nof the qualification 

In [20]:
!pip install -q llama-index-embeddings-google

In [21]:
!pip install llama-index-llms-palm

In [22]:
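import os
import google.generativeai as palm

# Read the key from the environment instead of hardcoding it in the notebook.
# (GOOGLE_API_KEY is an assumed variable name; use whatever you have exported.)
palm_api_key = os.environ["GOOGLE_API_KEY"]
palm.configure(api_key=palm_api_key)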
models = [
    m
    for m in palm.list_models()
    if "generateText" in m.supported_generation_methods
]
model = models[0].name
print(model)

models/text-bison-001


In [23]:
# Settings replaces the deprecated ServiceContext for global configuration.
from llama_index.core import Settings

In [32]:
from llama_index.llms.palm import PaLM
from llama_index.embeddings.google import GooglePaLMEmbedding

# Set the global embedding model used for indexing and querying.
Settings.embed_model = GooglePaLMEmbedding(model_name='models/embedding-gecko-001', api_key=palm_api_key)
# embeddings = Settings.embed_model.get_text_embedding("Google PaLM Embeddings.")

In [33]:
# Settings.embed_model (set above) is used by default; it is passed explicitly here for clarity.
index = VectorStoreIndex.from_documents(documents, embed_model=Settings.embed_model)

In [36]:
query_engine = index.as_query_engine(llm=PaLM(model='models/text-bison-001', api_key=palm_api_key))

In [37]:
response = query_engine.query("What is meant by auditing in CIS environment?")

In [38]:
display(Markdown(f"<b>{response}</b>"))

<b>Auditing in a CIS environment refers to the systematic examination and evaluation of an organization's information technology systems, processes, and controls to ensure they are operating effectively, efficiently, and securely.</b>
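
Before the LLM writes an answer like the one above, the query engine embeds the question and retrieves the stored chunks whose embeddings are most similar to it. The sketch below illustrates that retrieval step with hand-made three-dimensional vectors and plain cosine similarity. The vectors, chunk names, and the `top_k` helper are hypothetical and only for illustration; the real index uses the configured embedding model and its own vector store.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, chunk_vecs, k=2):
    """Return the ids of the k chunks most similar to the query vector."""
    scored = [
        (cosine_similarity(query_vec, vec), chunk_id)
        for chunk_id, vec in chunk_vecs.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [chunk_id for _, chunk_id in scored[:k]]


# Toy "embeddings" (hypothetical values, not produced by any real model).
toy_vectors = {
    "chunk_a": [1.0, 0.0, 0.0],
    "chunk_b": [0.7, 0.7, 0.0],
    "chunk_c": [0.0, 0.0, 1.0],
}
query = [0.9, 0.1, 0.0]

best = top_k(query, toy_vectors, k=2)
print(best)
```

The retrieved chunks are then placed into the prompt as context, which is why the answer quality depends on both the embedding model and the LLM.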

In [39]:
# Write the index to disk; "./storage" is the default persist_dir.
index.storage_context.persist(persist_dir="./storage")

In [41]:
from llama_index.core import StorageContext, load_index_from_storage

storage_context = StorageContext.from_defaults(persist_dir='./storage')
index = load_index_from_storage(storage_context=storage_context)
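
To see what is inside the persisted index, one option is to look at the JSON files written to the persist directory. The helper below is a minimal sketch that uses only the standard library: it lists each JSON file and its top-level keys. `summarize_persist_dir` is a hypothetical helper, not part of LlamaIndex, and the exact file names and keys depend on the installed version.

```python
import json
from pathlib import Path


def summarize_persist_dir(persist_dir):
    """Map each JSON file in persist_dir to a sorted list of its top-level keys."""
    summary = {}
    for path in sorted(Path(persist_dir).glob("*.json")):
        with path.open(encoding="utf-8") as f:
            data = json.load(f)
        # Most store files are JSON objects; fall back to the type name otherwise.
        summary[path.name] = sorted(data) if isinstance(data, dict) else type(data).__name__
    return summary


# Only inspect the directory if the index has already been persisted.
if Path("./storage").is_dir():
    for name, keys in summarize_persist_dir("./storage").items():
        print(name, keys)
```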

Using a Hugging Face LLM

In [1]:
!pip install -q llama-index-embeddings-huggingface

In [2]:
!pip install -q llama-index-llms-huggingface

In [3]:
!pip install -q bitsandbytes

In [9]:
from llama_index.embeddings.huggingface import HuggingFaceEmbedding


embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
index = VectorStoreIndex.from_documents(documents, embed_model=embed_model)
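
When the index is built, each document is first split into chunks, and each chunk is embedded separately. In LlamaIndex the chunk size and overlap can be changed through `Settings.chunk_size` and `Settings.chunk_overlap` before calling `from_documents`. The sketch below shows the idea behind those two numbers with a simple sliding window over whitespace-separated words. It is an approximation for illustration only: `chunk_tokens` is a hypothetical helper, and the library's own splitter counts model tokens and respects sentence boundaries.

```python
def chunk_tokens(tokens, chunk_size, chunk_overlap):
    """Split tokens into windows of at most chunk_size items.

    Consecutive windows share chunk_overlap items.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reaches the end
    return chunks


words = "auditors play a vital role in ensuring reliable financial statements".split()
chunks = chunk_tokens(words, chunk_size=4, chunk_overlap=1)
for chunk in chunks:
    print(" ".join(chunk))
```

Smaller chunks give more precise retrieval but less context per chunk; larger chunks do the opposite, so the value is usually tuned per document set.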

In [10]:
!huggingface-cli login


    A token is already saved on your machine. Run `huggingface-cli whoami` to get more information or `huggingface-cli logout` if you want to log out.
    Setting a new token will erase the existing one.
    To login, `huggingface_hub` requires a token generated from https://huggingface.co/settings/tokens .
Token: 
Add token as git credential? (Y/n) y
Token is valid (permission: read).
Cannot authenticate through g

In [11]:
from llama_index.llms.huggingface import HuggingFaceLLM
from llama_index.core.prompts import PromptTemplate
import torch
from transformers import BitsAndBytesConfig

system_prompt = """
You are a Q&A assistant. Your goal is to answer questions as
accurately as possible based on the instructions and context provided.
"""

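# NOTE: this is a Zephyr-style chat template. Llama 2 chat models were trained
# on the "[INST] ... [/INST]" format, which is likely a better fit for this model.
query_wrapper_prompt = PromptTemplate('<|system|>\n</s>\n<|user|>\n{query_str}</s>\n<|assistant|>')

# Load the weights in 4-bit NF4 with double quantization; compute runs in float16.
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
)

# Llama 2 is a gated model: access must be granted on Hugging Face before download.
llm = HuggingFaceLLM(
    tokenizer_name="meta-llama/Llama-2-7b-chat-hf",
    model_name="meta-llama/Llama-2-7b-chat-hf",
    context_window=3900,  # kept below Llama 2's 4096-token limit
    system_prompt=system_prompt,
    max_new_tokens=256,  # answers longer than this are cut off
    model_kwargs={"quantization_config": quantization_config},
    generate_kwargs={"temperature": 0.5, "top_k": 50, "top_p": 0.95},
    query_wrapper_prompt=query_wrapper_prompt,
    device_map="auto",
)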
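
The 4-bit quantization config matters mainly for GPU memory. The float16 checkpoint for this model is downloaded as two shards of about 9.98 GB and 3.50 GB, and the back-of-the-envelope calculation below estimates how much memory the weights alone need at 16 bits versus 4 bits per parameter. It is a rough lower bound under stated assumptions: it ignores quantization constants, any layers kept in higher precision, activations, and the KV cache. `weight_memory_gb` is a hypothetical helper.

```python
def weight_memory_gb(num_params, bits_per_param):
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# Total size of the two float16 checkpoint shards, in GB.
fp16_checkpoint_gb = 9.98 + 3.50

# Each float16 weight takes 2 bytes, so this recovers the parameter count.
num_params = fp16_checkpoint_gb * 1e9 / 2

print(f"parameters: {num_params / 1e9:.2f} billion")
print(f"fp16 : {weight_memory_gb(num_params, 16):.2f} GB")
print(f"4-bit: {weight_memory_gb(num_params, 4):.2f} GB")
```

The roughly fourfold reduction is what makes it practical to load a 7B model on a single consumer or free-tier GPU.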


In [12]:
query_engine = index.as_query_engine(llm=llm)

In [13]:
response = query_engine.query("What is meant by auditing in CIS environment?")

In [14]:
display(Markdown(f"<b>{response}</b>"))

<b>
Auditing in a Computer Information Systems (CIS) environment refers to the systematic examination and evaluation of an organization's information technology systems, processes, and controls to ensure they are operating effectively, efficiently, and securely. It involves assessing the integrity, reliability, and security of an organization's computer systems, networks, and digital data to identify any weaknesses or vulnerabilities and provide recommendations for improvement.

The purpose of auditing in a CIS environment is to identify and mitigate risks associated with information technology, such as data breaches, financial fraud, and operational disruptions. It also helps ensure compliance with relevant laws and regulations, maintains trust and confidence in the organization's digital operations, and safeguards the organization's reputation.

Auditing in a CIS environment encompasses a wide range of areas, including hardware, software, network infrastructure, data storage, access controls, and disaster recovery plans. It also evaluates the alignment of IT strategies with an organization's overall business objectives.

The auditing process involves a systematic and methodical examination of IT systems and controls, which may include data</b>