### Creating a RAG system using LlamaIndex to query different PDF files
- The data is stored in the `Data` directory
- Inside the `Data` directory there are 2 papers about few-shot learning
- There is a `.env` file in my local directory that holds the OpenAI API key
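
The `.env` file itself is not shown in this notebook. As a minimal sketch, assuming the key is stored under the name `API_KEY` (the name passed to `reading_indexing_files` further down), the hypothetical helper `export_openai_key` copies it to `OPENAI_API_KEY` and fails with a clear message when the entry is missing. The helper and its `env` parameter are illustrative assumptions, not part of LlamaIndex or python-dotenv.

```python
import os
from collections.abc import MutableMapping


def export_openai_key(source_name: str, env: MutableMapping = os.environ) -> str:
    """Copy the key stored under `source_name` to OPENAI_API_KEY.

    Hypothetical helper for illustration. It assumes load_dotenv() (or the
    shell) has already placed `source_name` in the environment.
    """
    api_key = env.get(source_name)
    if not api_key:
        # os.environ rejects None values, so fail early with a readable error
        raise KeyError(f"{source_name} is missing or empty; check the .env file")
    env["OPENAI_API_KEY"] = api_key
    return api_key
```

Passing a plain dictionary as `env` makes the helper easy to try without touching the real environment.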




In [2]:
import os
from dotenv import load_dotenv
from llama_index.core import (
    VectorStoreIndex,
    SimpleDirectoryReader,
    StorageContext,
    load_index_from_storage,
)
from llama_index.core.retrievers import VectorIndexRetriever
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.core.postprocessor import SimilarityPostprocessor
from llama_index.core.response.pprint_utils import pprint_response

In [3]:
# Load the environment variables, read the documents and build the index
def reading_indexing_files(data_dir, API_KEY_NAME):
    # Load the variables defined in the .env file
    load_dotenv()
    # Copy the key stored under API_KEY_NAME into OPENAI_API_KEY
    api_key = os.getenv(API_KEY_NAME)
    if api_key is None:
        raise KeyError(f"{API_KEY_NAME} is not set in the environment or the .env file")
    os.environ["OPENAI_API_KEY"] = api_key
    # Read the files in the data directory, including their metadata
    files = SimpleDirectoryReader(data_dir).load_data()
    # Create the vector index for those files
    index = VectorStoreIndex.from_documents(files, show_progress=True)
    # Return the documents and the corresponding index for further use
    return files, index

files, index = reading_indexing_files(data_dir="Data", API_KEY_NAME="API_KEY")


Parsing nodes: 100%|██████████| 300/300 [00:00<00:00, 404.08it/s]
Generating embeddings: 100%|██████████| 390/390 [00:12<00:00, 31.05it/s]


### Setting up the query engine and its parameters:
- The retriever takes as parameters the index of the files and the number of top-scoring results we want it to return (`similarity_top_k`)
- The postprocessor sets the minimum similarity score (`similarity_cutoff`) a retrieved result must reach to be kept
- The query engine combines the retriever and the postprocessor above
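
To make the two parameters concrete, here is a plain-Python sketch of the "top k, then cutoff" idea. It assumes cosine similarity as the score and uses made-up 2-d vectors; `cosine_similarity` and `retrieve` are illustrative names, not LlamaIndex functions, and this is not how the library is implemented internally.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve(query_vec, chunk_vecs, similarity_top_k, similarity_cutoff):
    """Return (chunk_id, score) pairs: top-k selection first, then the cutoff."""
    scored = [(cid, cosine_similarity(query_vec, vec)) for cid, vec in chunk_vecs.items()]
    # Retriever step: keep the similarity_top_k best-scoring chunks
    top_k = sorted(scored, key=lambda pair: pair[1], reverse=True)[:similarity_top_k]
    # Postprocessor step: drop chunks scoring below similarity_cutoff
    return [(cid, score) for cid, score in top_k if score >= similarity_cutoff]


# Toy 2-d "embeddings", made up for illustration
chunks = {"a": [1.0, 0.0], "b": [0.8, 0.6], "c": [0.0, 1.0], "d": [-1.0, 0.0]}
results = retrieve([1.0, 0.0], chunks, similarity_top_k=3, similarity_cutoff=0.5)
# Keeps "a" and "b"; "c" makes the top 3 but fails the cutoff
print(results)
```

The order of the two steps matters: a high cutoff can leave fewer than `similarity_top_k` results, or none at all, in which case the query engine has no context to answer from.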


In [4]:
def setting_query_engine(index, similarity_top_k, similarity_cutoff):
    # Retrieve the similarity_top_k results most similar to the query
    retriever = VectorIndexRetriever(index=index, similarity_top_k=similarity_top_k)
    # Drop retrieved results whose similarity score is below similarity_cutoff
    postprocessor = SimilarityPostprocessor(similarity_cutoff=similarity_cutoff)
    query_engine = RetrieverQueryEngine(retriever=retriever, node_postprocessors=[postprocessor])
    return query_engine

query_engine = setting_query_engine(index, 4, 0.80)

### Handling a query through the query engine
#### Method querying_read_from_storage
- Takes as a parameter the storage directory where the index of the files is stored on disk rather than only in memory, so that memory is not exhausted when there are many files
- If the storage directory is found, the index is loaded from it; otherwise the directory is created and the index is saved in it
- Then runs the query through the query engine we created with the desired parameters
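
The load-or-create step can be sketched independently of any library. In the sketch below, `load_or_build` and the `demo_*` stand-ins are hypothetical names chosen for illustration; the stand-ins store a plain string in a file where a real setup would persist an index.

```python
import os
import tempfile


def load_or_build(storage_dir, build, persist, load):
    """Return (index, was_loaded), reusing what is on disk when possible.

    `build`, `persist` and `load` are caller-supplied callables, so the
    pattern does not depend on a particular indexing library.
    """
    if os.path.exists(storage_dir):
        # Storage directory found: load the saved index instead of rebuilding
        return load(storage_dir), True
    # First run: build the index, then save it for later runs
    index = build()
    persist(index, storage_dir)
    return index, False


# Stand-in callables for demonstration only
def demo_build():
    return "toy index"


def demo_persist(index, storage_dir):
    os.makedirs(storage_dir)
    with open(os.path.join(storage_dir, "index.txt"), "w") as fh:
        fh.write(index)


def demo_load(storage_dir):
    with open(os.path.join(storage_dir, "index.txt")) as fh:
        return fh.read()


with tempfile.TemporaryDirectory() as tmp:
    storage = os.path.join(tmp, "storage")
    first = load_or_build(storage, demo_build, demo_persist, demo_load)   # builds and saves
    second = load_or_build(storage, demo_build, demo_persist, demo_load)  # loads from disk
```

With LlamaIndex, `persist` would wrap `index.storage_context.persist(persist_dir=...)` and `load` would wrap `StorageContext.from_defaults(persist_dir=...)` followed by `load_index_from_storage(...)`, the calls used in the cell below. Building the query engine after this step, from whichever index is returned, ensures that queries run against the loaded index.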

In [5]:
# Persist the index on the first run, load it from disk otherwise, then run the query
def querying_read_from_storage(storage_dir, index, query_engine, query):
    # check if storage already exists
    if not os.path.exists(storage_dir):
        # first run: save the index to disk
        index.storage_context.persist(persist_dir=storage_dir)
    else:
        # load the existing index
        storage_context = StorageContext.from_defaults(persist_dir=storage_dir)
        index = load_index_from_storage(storage_context)

    # Note: the query runs on the query_engine passed in, which was built from the
    # in-memory index; the reloaded index is only used if a new engine is built from it
    response = query_engine.query(query)
    return response
response = querying_read_from_storage(storage_dir="./storage", index=index, query_engine=query_engine, query="Explain Few SHot Learning?")
pprint_response(response, show_source=True)

Final Response: Few-shot learning (FSL) is a learning method that
involves rapidly acquiring valid information from just a few or even
zero samples. It is inspired by human reasoning capabilities and is
commonly found in edge computing scenarios. FSL aims to address the
challenge of effectively learning from small sample datasets or cross-
domain scenarios.
______________________________________________________________________
Source Node 1/4
Node ID: 930d1281-0835-4f77-ac6b-36a2a4c3d5c6
Similarity: 0.8668009876908989
Text: 3 TABLE 2 A List Of Key Acronyms NOMENCLATURE Full Form
Abbreviation Full Form Abbreviation Artiﬁcial Intelligence AI Few-Shot
Learning FSL Deep Learning DL Machine Learning ML Zero-Shot Learning
ZSL One-Shot Learning OSL Neural Architecture Search NAS Conventional
Neural Network CNN K-NearestNeighbor KNN Support Vector Machine SVM
Nearestcentro...
______________________________________________________________________
Source Node 2/4
Node ID: 56f26db1-9600-4249-b68c

In [6]:
response = querying_read_from_storage(storage_dir="./storage", index=index, query_engine=query_engine, query="What is the Effect of Impact Angle on the Secondary Droplets at High Impact Velocity?")
pprint_response(response, show_source=True)

Final Response: The effect of impact angle on the secondary droplets
at high impact velocity is that it affects the shape and droplet size
distribution, while the velocity of the ejected droplets remains
constant in the azimuthal direction.
______________________________________________________________________
Source Node 1/4
Node ID: 70efba66-f565-414c-9b21-395633a6fac7
Similarity: 0.9185320029897003
Text: ILASS–Europe 2019, 29th Conference on Liquid Atomization and
Spray Systems, 2-4 September 2019, Paris, France The Effect of Impact
Angle on the Secondary Droplets at High Impact Velocity David A.
Burzynski∗, Stephan E. Bansmer Technische Universität Braunschweig,
Institute of Fluid Mechanics, Braunschweig, Germany *Corresponding
author: d.burzyn...
______________________________________________________________________
Source Node 2/4
Node ID: 7ac6938b-c60a-4489-9c96-c856d359c183
Similarity: 0.9046166471970942
Text: (a) demonstrates the effect of the impact angle on the secondary
dro

In [7]:
response = querying_read_from_storage(storage_dir="./storage", index=index, query_engine=query_engine, query="What is the Impact of High-Speed Drops on Dry and Wetted Surfaces?")
pprint_response(response, show_source=True)

Final Response: The impact of high-speed drops on dry and wetted
surfaces is a complex phenomenon that involves various stages such as
drop deformation, spreading, and receding. The dynamics of the impact
can be influenced by surface properties like roughness, porosity,
wettability, temperature, and stiffness. High-speed drop impacts can
lead to splashing phenomena, resulting in the generation of secondary
droplets. The splashing mechanisms are described by the
Rayleigh–Taylor instability of the accelerating liquid film, leading
to different splashing regimes such as corona and prompt splash. The
size, velocity, and volume of the ejected droplets can be affected by
the impact conditions, with the Weber number weakly affecting the
outcome compared to the Reynolds number. The study of high-speed drop
impacts on surfaces is crucial for various technical applications,
including vehicle soiling, aircraft icing, and additive manufacturing
technologies.
_______________________________________

In [8]:
response = querying_read_from_storage(storage_dir="./storage", index=index, query_engine=query_engine, query="Is the Impact of High-Speed Drops different on Dry surfaces than on Wetted Surfaces?")
pprint_response(response, show_source=True)

Final Response: The impact of high-speed drops on dry surfaces differs
from that on wetted surfaces. The study indicates that the physical
properties of the drop and film being the same do not result in the
break-up of an emerging crown during splashing on wetted surfaces.
Additionally, the governing physics of the different splashing regimes
suggest that they cannot be unified into one general theory,
indicating distinct behaviors based on the surface conditions.
______________________________________________________________________
Source Node 1/4
Node ID: 0d1a5f99-18fa-4b2a-8f97-043e598b9e62
Similarity: 0.8759468484591265
Text: 3 Drop splashing on dry surfaces 3.1 Time evolution of an
impacting drop The drop impact on dry smooth surfaces can be described
in four fun- damental phases depending on the elapsed time: (1) Drop
deformation and gas entrapment before contact with the surface, (2)
ejection of a thin lamella with possible break-up into small secondary
droplets,(...
__________