# Azure AI Search with NVIDIA NIM and LlamaIndex Integration

In this notebook, we show how to use NVIDIA's AI models and LlamaIndex to build an efficient Retrieval-Augmented Generation (RAG) pipeline. We will use NVIDIA's LLMs and embeddings, integrate them with Azure AI Search as the vector store, and run RAG to improve search quality and efficiency.

## Benefits
- **Scalability**: Leverage NVIDIA's large language models and Azure AI Search for scalable and efficient retrieval.
- **Cost efficiency**: Optimize search and retrieval with efficient vector storage and hybrid search techniques.
- **High performance**: Combine powerful LLMs with vectorized search for faster and more accurate responses.
- **Quality**: Maintain high search quality by grounding LLM responses in relevant documents.

## Prerequisites
- 🐍 Python 3.9 or higher
- 🔗 [Azure AI Search service](https://learn.microsoft.com/azure/search/)
- 🔗 NVIDIA API key for access to NVIDIA's LLMs and embeddings through NVIDIA NIM microservices

## Features Covered
- ✅ NVIDIA LLM integration (we will use [Phi-3.5-MOE](https://build.nvidia.com/microsoft/phi-3_5-moe))
- ✅ NVIDIA embeddings (we will use [nv-embedqa-e5-v5](https://build.nvidia.com/nvidia/nv-embedqa-e5-v5))
- ✅ Azure AI Search advanced retrieval modes
- ✅ Document indexing with LlamaIndex
- ✅ RAG using Azure AI Search and LlamaIndex with NVIDIA LLMs

Let's get started!


In [None]:
!pip install azure-search-documents==11.5.1
!pip install --upgrade llama-index
!pip install --upgrade llama-index-core
!pip install --upgrade llama-index-readers-file
!pip install --upgrade llama-index-llms-nvidia
!pip install --upgrade llama-index-embeddings-nvidia
!pip install --upgrade llama-index-postprocessor-nvidia-rerank
!pip install --upgrade llama-index-vector-stores-azureaisearch
!pip install python-dotenv

## Installation and Requirements
Create a Python environment with Python 3.10 or higher.

## Getting Started!


To get started, you need an `NVIDIA_API_KEY` to use the NVIDIA AI Foundation models:
1) Create a free account with [NVIDIA](https://build.nvidia.com/explore/discover).
2) Click on the model of your choice.
3) Under Input, select the Python tab, click **Get API Key**, and then click **Generate Key**.
4) Copy the generated key and save it as `NVIDIA_API_KEY`. From then on, you have access to the endpoints.


In [3]:
import getpass
import os
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

if not os.environ.get("NVIDIA_API_KEY", "").startswith("nvapi-"):
    nvidia_api_key = getpass.getpass("Enter your NVIDIA API key: ")
    assert nvidia_api_key.startswith("nvapi-"), f"{nvidia_api_key[:5]}... is not a valid key"
    os.environ["NVIDIA_API_KEY"] = nvidia_api_key


## RAG Example Using LLM and Embedding
### 1) Initialize the LLM
`llama-index-llms-nvidia`, NVIDIA's LLM connector, lets you connect to and generate from compatible models available in the NVIDIA API catalog. The list of chat completion models is here: https://build.nvidia.com/search?term=Text-to-Text

In this example, we will use the **microsoft/phi-3.5-moe-instruct** model.


In [75]:
from llama_index.core import Settings
from llama_index.llms.nvidia import NVIDIA

# Here we are using the microsoft/phi-3.5-moe-instruct model from the API Catalog
Settings.llm = NVIDIA(model="microsoft/phi-3.5-moe-instruct", api_key=os.getenv("NVIDIA_API_KEY"))

### 2) Initialize the Embedding
`llama-index-embeddings-nvidia`, the NVIDIA Embeddings connector, lets you connect to compatible models in the NVIDIA API catalog and generate embeddings from them. We chose `nvidia/nv-embedqa-e5-v5` as the embedding model. A list of text embedding models is here: https://build.nvidia.com/nim?filters=usecase%3Ausecase_text_to_embedding%2Cusecase%3Ausecase_image_to_embedding


In [6]:
from llama_index.embeddings.nvidia import NVIDIAEmbedding

Settings.embed_model = NVIDIAEmbedding(model="nvidia/nv-embedqa-e5-v5", api_key=os.getenv("NVIDIA_API_KEY"))
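
To make concrete what the embedding vectors are used for, here is a minimal sketch of cosine similarity, a common way to compare two embeddings. It assumes cosine similarity is the metric in play and uses toy 3-dimensional vectors with hypothetical values. Real vectors from `nvidia/nv-embedqa-e5-v5` are much longer, and `cosine_similarity` is an illustrative helper, not part of any library used in this notebook.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimensionality")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for real embeddings (hypothetical values).
query_vec = [1.0, 0.0, 1.0]
close_vec = [2.0, 0.0, 2.0]  # same direction as the query, different length
far_vec = [0.0, 1.0, 0.0]    # orthogonal to the query

print(cosine_similarity(query_vec, close_vec))  # close to 1: same direction
print(cosine_similarity(query_vec, far_vec))    # 0: nothing in common
```

Because the score depends only on direction, a long and a short vector pointing the same way count as equally similar to the query.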

### 3) Create an Azure AI Search Vector Store


In [76]:
import logging
import sys
import os
import getpass
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
from azure.search.documents.indexes import SearchIndexClient
from IPython.display import Markdown, display
from llama_index.vector_stores.azureaisearch import AzureAISearchVectorStore, IndexManagement


search_service_api_key = os.getenv('AZURE_SEARCH_ADMIN_KEY') or getpass.getpass('Enter your Azure Search API key: ')
search_service_endpoint = os.getenv('AZURE_SEARCH_SERVICE_ENDPOINT') or getpass.getpass('Enter your Azure Search service endpoint: ')
search_service_api_version = "2024-07-01"
credential = AzureKeyCredential(search_service_api_key)

# Index name to use
index_name = "llamaindex-nvidia-azureaisearch-demo"

# Use index client to demonstrate creating an index
index_client = SearchIndexClient(
    endpoint=search_service_endpoint,
    credential=credential,
)

# Use search client to demonstrate using existing index
search_client = SearchClient(
    endpoint=search_service_endpoint,
    index_name=index_name,
    credential=credential,
)

In [None]:
vector_store = AzureAISearchVectorStore(
    search_or_index_client=index_client,
    index_name=index_name,
    index_management=IndexManagement.CREATE_IF_NOT_EXISTS,
    id_field_key="id",
    chunk_field_key="chunk",
    embedding_field_key="embedding",
    embedding_dimensionality=1024, # dimensionality for nv-embedqa-e5-v5 model
    metadata_string_field_key="metadata",
    doc_id_field_key="doc_id",
    language_analyzer="en.lucene",
    vector_algorithm_type="exhaustiveKnn",
    # compression_type="binary" # Option to use "scalar" or "binary". NOTE: compression is only supported for HNSW
)
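
The `vector_algorithm_type="exhaustiveKnn"` setting above asks for a brute-force nearest-neighbor search, in which every stored vector is compared with the query. The sketch below illustrates that idea in plain Python. It is not how Azure AI Search is implemented. The ids and 2-dimensional unit vectors are hypothetical, and the dot product is used on the assumption that vectors are normalized, so it equals cosine similarity.

```python
import heapq


def exhaustive_knn(query, vectors, k):
    """Score every stored vector against the query and return the k best ids.

    `vectors` maps a document id to its embedding. Vectors are assumed to be
    unit-length, so the dot product equals cosine similarity.
    """
    scored = (
        (sum(q * v for q, v in zip(query, vec)), doc_id)
        for doc_id, vec in vectors.items()
    )
    # nlargest returns the k highest (score, id) pairs, best first.
    return [doc_id for _, doc_id in heapq.nlargest(k, scored)]


# Hypothetical 2-dimensional unit vectors; the index above stores 1024 dimensions.
store = {
    "chunk-a": [1.0, 0.0],
    "chunk-b": [0.6, 0.8],
    "chunk-c": [0.0, 1.0],
}
print(exhaustive_knn([0.8, 0.6], store, k=2))
```

A full scan gives exact results but its cost grows with the number of stored vectors, which is why approximate indexes such as HNSW are usually preferred for large collections.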

### 4) Load, Chunk, and Index the Documents


In [20]:
from llama_index.core import SimpleDirectoryReader, StorageContext, VectorStoreIndex
from llama_index.core.text_splitter import TokenTextSplitter

# Configure text splitter (nv-embedqa-e5-v5 model has a limit of 512 tokens per input size)
text_splitter = TokenTextSplitter(separator=" ", chunk_size=500, chunk_overlap=10)

# Load documents
documents = SimpleDirectoryReader(
    input_files=["data/txt/state_of_the_union.txt"]
).load_data()
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# Create index with text splitter
index = VectorStoreIndex.from_documents(
    documents,
    transformations=[text_splitter],
    storage_context=storage_context,
)
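
The `chunk_size` and `chunk_overlap` arguments above control how the text is windowed before embedding. The sketch below shows the general idea of overlapping windows. It is a simplification and not LlamaIndex's actual `TokenTextSplitter` algorithm: whitespace-separated words stand in for model tokens, and `chunk_tokens` is a hypothetical helper.

```python
def chunk_tokens(tokens, chunk_size, chunk_overlap):
    """Split a token list into windows of at most `chunk_size` tokens.

    Each window repeats the last `chunk_overlap` tokens of the previous one.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reaches the end of the text
    return chunks


# Whitespace-separated words stand in for model tokens (a simplification).
words = "one two three four five six seven eight nine ten".split()
for chunk in chunk_tokens(words, chunk_size=4, chunk_overlap=1):
    print(" ".join(chunk))
```

The overlap means a sentence that straddles a chunk boundary still appears with some surrounding context in at least one chunk.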

### 5) Create a Query Engine to Ask Questions over Your Data

Here is a query that uses pure vector search in Azure AI Search and passes the retrieved context to our LLM (Phi-3.5-MOE) to ground the response.


In [69]:
query_engine = index.as_query_engine()
response = query_engine.query("Who did the speaker mention as being present in the chamber?")
display(Markdown(f"{response}"))

 The speaker mentioned the Ukrainian Ambassador to the United States, along with other members of Congress, the Cabinet, and various officials such as the Vice President, the First Lady, and the Second Gentleman, as being present in the chamber.

Here is a query that uses hybrid search in Azure AI Search.


In [70]:
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.core.vector_stores.types import VectorStoreQueryMode
from IPython.display import Markdown, display
from llama_index.core.schema import MetadataMode

# Initialize hybrid retriever and query engine
hybrid_retriever = index.as_retriever(vector_store_query_mode=VectorStoreQueryMode.HYBRID)
hybrid_query_engine = RetrieverQueryEngine(retriever=hybrid_retriever)

# Query execution
query = "What were the exact economic consequences mentioned in relation to Russia's stock market?"
response = hybrid_query_engine.query(query)

# Display the response
display(Markdown(f"{response}"))
print("\n")

# Print the source nodes
print("Source Nodes:")
for node in response.source_nodes:
    print(node.get_content(metadata_mode=MetadataMode.LLM))

 The Russian stock market experienced a significant drop, losing 40% of its value. Additionally, trading had to be suspended due to the ongoing situation.



Source Nodes:
file_path: data\txt\state_of_the_union.txt

building a coalition of other freedom-loving nations from Europe and the Americas to Asia and Africa to confront Putin. 

I spent countless hours unifying our European allies. We shared with the world in advance what we knew Putin was planning and precisely how he would try to falsely justify his aggression.  

We countered Russia’s lies with truth.

And now that he has acted the free world is holding him accountable. 

Along with twenty-seven members of the European Union including France, Germany, Italy, as well as countries like the United Kingdom, Canada, Japan, Korea, Australia, New Zealand, and many others, even Switzerland. 

We are inflicting pain on Russia and supporting the people of Ukraine. Putin is now isolated from the world more than ever. 

Together with our allies –we are right now enforcing powerful economic sanctions.

We are cutting off Russia’s largest banks from the international financial syste

#### Hybrid Search Analysis
The LLM's response accurately captures the key economic consequences for the Russian stock market mentioned in the source text. Specifically, it states that the Russian stock market dropped sharply, losing 40% of its value, and that trading was suspended because of the situation. The response aligns well with the information in the source, indicating that the LLM correctly identified and summarized the relevant details about the stock market impact that resulted from Russia's actions and the sanctions.

#### Source Node Commentary
The source nodes detail the economic consequences Russia faced as a result of international sanctions. The text highlights that the Russian stock market lost 40% of its value and that trading was suspended. It also mentions other economic effects, such as the devaluation of the ruble and the broader isolation of Russia's economy. The LLM's response effectively summarized the key points from these nodes, focusing on the stock market impact as the query requested.
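
Hybrid search runs a keyword query and a vector query and merges the two ranked lists. Azure AI Search documents Reciprocal Rank Fusion (RRF) as its merging method, and the sketch below illustrates the idea only. The document ids are hypothetical, and `k=60` is the constant commonly used with RRF, assumed here rather than taken from this notebook.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document ids into one fused ranking.

    Each document earns 1 / (k + rank) for every list it appears in, with rank
    starting at 1. Scores are summed and documents sorted by total, best first.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# Hypothetical result lists for a single query.
keyword_hits = ["chunk-3", "chunk-1", "chunk-7"]
vector_hits = ["chunk-1", "chunk-5", "chunk-3"]
print(reciprocal_rank_fusion([keyword_hits, vector_hits]))
```

Documents that appear in both lists accumulate two contributions, so they tend to rise above documents that only one of the two searches found.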


Now let's look at a query where hybrid search does not produce a well-grounded answer:


In [71]:
# Query execution
query = "What was the precise date when Russia invaded Ukraine?"
response = hybrid_query_engine.query(query)

# Display the response
display(Markdown(f"{response}"))
print("\n")

# Print the source nodes
print("Source Nodes:")
for node in response.source_nodes:
    print(node.get_content(metadata_mode=MetadataMode.LLM))


 The provided context does not specify the exact date of Russia's invasion of Ukraine. However, it does mention that the events discussed are happening in the current era and that the actions taken are in response to Putin's aggression. For the precise date, one would need to refer to external sources or historical records.



Source Nodes:
file_path: data\txt\state_of_the_union.txt

our forces are not engaged and will not engage in conflict with Russian forces in Ukraine.  

Our forces are not going to Europe to fight in Ukraine, but to defend our NATO Allies – in the event that Putin decides to keep moving west.

For that purpose we’ve mobilized American ground forces, air squadrons, and ship deployments to protect NATO countries including Poland, Romania, Latvia, Lithuania, and Estonia.

As I have made crystal clear the United States and our Allies will defend every inch of territory of NATO countries with the full force of our collective power.  

And we remain clear-eyed. The Ukrainians are fighting back with pure courage. But the next few days weeks, months, will be hard on them.  

Putin has unleashed violence and chaos.  But while he may make gains on the battlefield – he will pay a continuing high price over the long run.

And a proud Ukrainian people, who have known 30 years  of indepen

### Hybrid Search: LLM Response Analysis
In the hybrid search example, the LLM's response indicates that the provided context does not contain the exact date of Russia's invasion of Ukraine. This suggests that the LLM relies on the information available in the source documents while acknowledging that the text lacks precise details.

The response correctly identifies that the context mentions events related to Russia's aggression but does not specify the date of the invasion. This illustrates the LLM's ability to understand the available information while recognizing gaps in the content. The LLM prompts the user to consult external sources or historical records for the exact date, showing caution when the information is incomplete.

### Source Node Analysis
In the hybrid search example, the source nodes contain excerpts from a speech discussing the United States' response to Russia's actions against Ukraine. These nodes emphasize the broader geopolitical impact and the steps taken by the United States and its allies in response to the invasion, but they do not mention the specific date of the invasion. This is consistent with the LLM's response, which correctly identifies that the context lacks the exact date.


In [72]:
# Initialize semantic hybrid (hybrid + reranking) retriever and query engine
semantic_reranker_retriever = index.as_retriever(vector_store_query_mode=VectorStoreQueryMode.SEMANTIC_HYBRID)
semantic_reranker_query_engine = RetrieverQueryEngine(retriever=semantic_reranker_retriever)

# Query execution
query = "What was the precise date when Russia invaded Ukraine?"
response = semantic_reranker_query_engine.query(query)

# Display the response
display(Markdown(f"{response}"))
print("\n")

# Print the source nodes
print("Source Nodes:")
for node in response.source_nodes:
    print(node.get_content(metadata_mode=MetadataMode.LLM))


 The provided context does not specify the exact date of Russia's invasion of Ukraine. However, it mentions that the event occurred six days before the speech was given. To determine the precise date, one would need to know the date of the speech.



Source Nodes:
file_path: data\txt\state_of_the_union.txt

Madam Speaker, Madam Vice President, our First Lady and Second Gentleman. Members of Congress and the Cabinet. Justices of the Supreme Court. My fellow Americans.  

Last year COVID-19 kept us apart. This year we are finally together again. 

Tonight, we meet as Democrats Republicans and Independents. But most importantly as Americans. 

With a duty to one another to the American people to the Constitution. 

And with an unwavering resolve that freedom will always triumph over tyranny. 

Six days ago, Russia’s Vladimir Putin sought to shake the foundations of the free world thinking he could make it bend to his menacing ways. But he badly miscalculated.

He thought he could roll into Ukraine and the world would roll over. Instead he met a wall of strength he never imagined. 

He met the Ukrainian people. 

From President Zelenskyy to every Ukrainian, their fearlessness, their courage, their determination, inspires the world

### Hybrid with Reranking: LLM Response Analysis
In the hybrid reranking example, the LLM's response provides additional context by noting that the event took place six days before the speech was given. This indicates that the LLM can infer the invasion date from the timing of the speech, although it still needs the exact date of the speech to be precise.

This response shows an improved ability to use contextual clues to give a more informative answer. It highlights the advantage of reranking, which lets the LLM access and prioritize more relevant information, bringing it closer to pinpointing the requested detail (the invasion date).

### Source Node Analysis
In this example, the source nodes refer to the timing of Russia's invasion, specifically mentioning that it occurred six days before the speech. Although the exact date is still not stated explicitly, the nodes provide temporal context that lets the LLM give a more nuanced answer. The inclusion of this detail shows how reranking can improve the LLM's ability to extract and infer information from the available context, resulting in more accurate and informative responses.
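
The two-stage pattern behind reranking can be sketched in a few lines: a first stage returns candidates, and a second scorer re-orders them before the top results go to the LLM. This is only an illustration of the pattern. `word_overlap` is a toy stand-in, whereas Azure AI Search's semantic ranker uses a learned language model, and the candidate strings are hypothetical snippets loosely based on the source nodes printed above.

```python
def rerank(query, candidates, score_fn, top_n):
    """Re-order first-stage candidates with a second scorer and keep the best."""
    ranked = sorted(candidates, key=lambda text: score_fn(query, text), reverse=True)
    return ranked[:top_n]


def word_overlap(query, text):
    """Toy relevance score: number of distinct query words found in the text."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    return len(query_words & text_words)


# Hypothetical first-stage results, in the order the retriever returned them.
candidates = [
    "our forces are not engaged in conflict with russian forces",
    "six days ago russia sought to shake the free world",
    "we are enforcing powerful economic sanctions",
]
print(rerank("when did russia invade", candidates, word_overlap, top_n=1))
```

Here the second candidate moves to the top because it is the only one that shares a word with the query, mirroring how reranking surfaced the "six days ago" passage in the example above.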


**Note:**
In this notebook, we used NVIDIA NIM microservices from the NVIDIA API Catalog.  
The APIs used above are `NVIDIA` (LLMs), `NVIDIAEmbedding`, and [Azure AI Search Semantic Hybrid Retrieval (built-in reranking)](https://learn.microsoft.com/azure/search/semantic-search-overview).  
Note that the APIs above also support self-hosted microservices.  

**Example:**
```python
NVIDIA(model="meta/llama3-8b-instruct", base_url="http://your-nim-host-address:8000/v1")
```



---

**Disclaimer**:  
This document was translated using the AI translation service [Co-op Translator](https://github.com/Azure/co-op-translator). While we strive for accuracy, please be aware that automated translations may contain errors or inaccuracies. The original document in its native language should be considered the authoritative source. For critical information, professional human translation is recommended. We are not liable for any misunderstandings or misinterpretations arising from the use of this translation.
