os:

The os module provides functions for interacting with the operating system, such as reading or writing to the file system.


llama_index.core:

SimpleDirectoryReader: Used to read documents from a directory.
VectorStoreIndex: Manages a vector store for efficient retrieval of data.
service_context: Provides a context for running services.
StorageContext: Manages the storage context for data.
load_index_from_storage: Loads an index from stored data.
llama_index.llms.gemini:

Gemini: Represents the Gemini large language model (LLM), which can be used for generating text, embeddings, etc.
IPython.display:

Markdown: Used to display text formatted in Markdown in Jupyter notebooks.
display: Displays the Markdown formatted text.
google.generativeai:

This module is used for accessing Google's Generative AI services, such as text generation, image generation, etc.
llama_index.embeddings.gemini:

GeminiEmbedding: Used to create embeddings (vector representations) for documents using Gemini models.
dotenv:

load_dotenv: Loads environment variables from a .env file into the environment, making them accessible in the code. This is useful for storing sensitive information like API keys.

In [2]:
# Importing the 'os' module for interacting with the operating system.
import os

# Importing classes from 'llama_index.core' for directory reading, vector storage, service context, and storage context.
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex, service_context, StorageContext, load_index_from_storage

# Importing the 'Gemini' class from 'llama_index.llms.gemini' for using Gemini LLMs.
from llama_index.llms.gemini import Gemini

# Importing the 'Markdown' and 'display' classes from 'IPython.display' for displaying Markdown formatted text in Jupyter notebooks.
from IPython.display import Markdown, display

# Importing the 'google.generativeai' module for using Google's Generative AI services.
import google.generativeai as genai

# Importing the 'GeminiEmbedding' class from 'llama_index.embeddings.gemini' for creating embeddings using Gemini models.
from llama_index.embeddings.gemini import GeminiEmbedding

# Importing the 'load_dotenv' function from 'dotenv' to load environment variables from a '.env' file.
from dotenv import load_dotenv

# Loading environment variables from a '.env' file.
load_dotenv()


True

In [3]:
key=os.getenv("gemini_api_key")


In [4]:
genai.configure(api_key=key)
# to see all the available models
for models in genai.list_models():
    if 'generateContent' in models.supported_generation_methods:#'generateContent' is the category in which I want to see the available models. The models name and working may be upgraded in the future and  as a 4th year student my english writing skill may be a little bit heasty please bear with it
        print(models.name)

models/gemini-1.0-pro
models/gemini-1.0-pro-001
models/gemini-1.0-pro-latest
models/gemini-1.0-pro-vision-latest
models/gemini-1.5-flash
models/gemini-1.5-flash-001
models/gemini-1.5-flash-latest
models/gemini-1.5-pro
models/gemini-1.5-pro-001
models/gemini-1.5-pro-latest
models/gemini-pro
models/gemini-pro-vision


In [5]:
#Based on the location database we are going to make quaries to avoid hallucination
try:
    docs = SimpleDirectoryReader(r"C:\Users\Sayan Maity\OneDrive\Desktop\LLM\LLAMAINDEX\Data")
    doc=docs.load_data()
    print(type(doc))
    for all_items in doc:
        print(all_items.text)
except:
    print("nothing inside Data")

<class 'list'>
 
GenAI for next- gen 
governments  
January 2024
 
GenAI for next -gen governments   January 2024  
PwC   2 
Executive summary 
Generative AI (GenAI ) has the power to transform governments by facilitating inclusion and improving productivity 
and accessibility. GenAI, powered by data and driven by sophisticated algorithms, can create novel solutions for 
governments through informed decision making and hyperpersonalised citizen interaction.  
GenAI has the potential to radically transform both the public and private sectors. It has the potential to 
significantly boost individual productivity while also bridging the digital divide by promoting equitable inf ormation 
access and processes.  GenAI, like any other technology, has its unique set of challenges. Issues such as 
algorithmic bias, accountability, privacy and fairness become central when deploying GenAI for public use. This 
paper advocates the use of GenAI in  public service delivery and governance, and emphas

Gemini Model Instance:

Gemini: This class is imported from llama_index.llms.gemini and represents the Gemini large language model (LLM).
models="gemini-1.5-flash-latest": Specifies the version of the Gemini model you want to use. Here, "gemini-1.5-flash-latest" is chosen, which is the latest version of this model.
api_key=key: This is your API key required to access the Gemini model. This key is typically stored securely in an environment variable or a configuration file.
Gemini Embedding Instance:

GeminiEmbedding: This class is imported from llama_index.embeddings.gemini and is used to generate embeddings for text data.
model_name="models/embedding-001": Specifies the name or path of the embedding model to be used for generating vector representations of the text.


By creating these instances, I set up two main components for this project:

Language Model (model): Used for generating or processing text with the capabilities of the Gemini model.
Embedding Model (embedding_model): Used for converting text into numerical vectors (embeddings) that can be used in machine learning models or for similarity searches in vector stores.

In [9]:
# Create an instance of the Gemini model using a specific model version and an API key.
# This instance will be used for generating responses, text, or other tasks that require a language model.
model = Gemini(models="gemini-1.5-pro", api_key=key)#Change the model with the vavilable latest one for the best performance

# Create an instance of GeminiEmbedding using a specific embedding model name.
# This instance will be used to generate embeddings (vector representations) for text data.
embedding_model = GeminiEmbedding(model_name="models/embedding-001")


In [10]:
from llama_index.core.node_parser import SentenceSplitter
from llama_index.core import Settings

# Set the global language model to the Gemini instance
Settings.llm = model

# Set the global embedding model to the GeminiEmbedding instance
Settings.embed_model = embedding_model

# Configure the SentenceSplitter for splitting text into chunks
Settings.node_parser = SentenceSplitter(chunk_size=512, chunk_overlap=20)

# Set the number of output tokens and context window size
Settings.num_output = 512
Settings.context_window = 3900
Settings.index_storage_path="./Metadata"


In [14]:
# Assuming 'documents' contains the documents to be indexed and 'transformations' is a list of text transformations

# Create a vector store index using the embedding model
# This index will store embeddings of the documents for efficient similarity search
index = VectorStoreIndex.from_documents(
    doc, embed_model=embedding_model
)





DefaultCredentialsError: Your default credentials were not found. To set up Application Default Credentials, see https://cloud.google.com/docs/authentication/external/set-up-adc for more information.