[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/pinecone-io/examples/blob/master/learn/generation/langchain/00-azure-openai-retrieval.ipynb) [![Open nbviewer](https://raw.githubusercontent.com/pinecone-io/examples/master/assets/nbviewer-shield.svg)](https://nbviewer.org/github/pinecone-io/examples/blob/master/learn/generation/langchain/00-azure-openai-retrieval.ipynb)

# Using Azure's OpenAI with LangChain

In [1]:
!pip install -qU \
    langchain==0.0.227 \
    openai==0.27.8 \
    "pinecone-client[grpc]"==2.2.2 \
    pinecone-datasets=='0.5.0rc10'

## Building the Knowledge Base

Adding external knowledge to a chatbot lets us ground its generated answers in that knowledge. For our use case, the external knowledge will be the LangChain docs, which we can load from Pinecone Datasets like so:

In [2]:
from pinecone_datasets import load_dataset

dataset = load_dataset('langchain-python-docs-text-embedding-ada-002')
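# drop the existing metadata and sparse_values columns, which are not needed for this example
dataset.documents.drop(['metadata', 'sparse_values'], axis=1, inplace=True)
# use the blob column (chunk number, text, and url) as the metadata instead
dataset.documents.rename(columns={'blob': 'metadata'}, inplace=True)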
dataset.head()

| | id | values | metadata |
|---|---|---|---|
| 0 | 417ede5d-39be-498f-b518-f47ed4e53b90 | [0.005949743557721376, 0.01983247883617878, -0... | {'chunk': 0, 'text': '.rst .pdf Welcome to Lan... |
| 1 | 110f550d-110b-4378-b95e-141397fa21bc | [0.009401749819517136, 0.02443608082830906, 0.... | {'chunk': 1, 'text': 'Use Cases# Best practice... |
| 2 | d5f00f02-3295-4567-b297-5e3262dc2728 | [-0.005517194513231516, 0.0208403542637825, 0.... | {'chunk': 2, 'text': 'Gallery: A collection of... |
| 3 | 0b6fe3c6-1f0e-4608-a950-43231e46b08a | [-0.006499645300209522, 0.0011573900701478124,... | {'chunk': 0, 'text': 'Search Error Please acti... |
| 4 | 39d5f15f-b973-42c0-8c9b-a2df49b627dc | [-0.005658374633640051, 0.00817849114537239, 0... | {'chunk': 0, 'text': '.md .pdf Dependents Depe... |


We must change the `"url"` field in the **metadata** column to `"source"` for compatibility with later LangChain components.

In [3]:
for i, row in dataset.documents.iterrows():
    row['metadata']['source'] = row['metadata'].pop('url')
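
As a minimal sketch of the same rename on a plain dictionary, the helper below copies the record and only renames the key when it is present, so a record without a `url` does not raise a `KeyError`. The helper name `rename_url_to_source` and the sample record are hypothetical and used only for illustration.

```python
def rename_url_to_source(metadata: dict) -> dict:
    """Return a copy of `metadata` with the 'url' key renamed to 'source'."""
    renamed = dict(metadata)  # shallow copy so the input is left untouched
    if "url" in renamed:
        renamed["source"] = renamed.pop("url")
    return renamed


# hypothetical record shaped like one entry of the metadata column
record = {
    "chunk": 0,
    "text": "Welcome to LangChain",
    "url": "https://example.com/docs",
}
renamed = rename_url_to_source(record)
print(renamed)
```

The loop above mutates each dictionary in place instead, which is fine here because every row is expected to carry a `url` field.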

Our input docs are ready, so we can move on to indexing everything.

## Initializing the Index

Now we need a place to store these embeddings and to enable an efficient vector search through them all. For that we use Pinecone. We can get a [free API key](https://app.pinecone.io/) and enter it below, where we initialize our connection to Pinecone and create a new index.

In [4]:
import os
from pinecone import Pinecone

# initialize connection to pinecone (get API key at app.pinecone.io)
api_key = os.getenv("PINECONE_API_KEY") or "PINECONE_API_KEY"
# find your environment/region next to the api key in pinecone console
env = os.getenv("PINECONE_ENVIRONMENT") or "PINECONE_ENV"

pc = Pinecone(api_key=api_key)

In [5]:
index_name = 'azure-openai-langchain-intro'

In [6]:
import time
from pinecone import PodSpec

# check if index already exists (it shouldn't if this is first time)
if index_name not in pc.list_indexes().names():
    # if does not exist, create index
    pc.create_index(
        index_name,
        dimension=1536,  # dimensionality of text-embedding-ada-002
        metric='cosine',
        # assumption: a pod-based index in the environment set above;
        # clients that expose the `Pinecone` class require a spec here
        spec=PodSpec(environment=env)
    )
    # wait for index to be initialized
    while not pc.describe_index(index_name).status['ready']:
        time.sleep(1)

# connect to index
index = pc.Index(index_name)
# view index stats
index.describe_index_stats()

{'dimension': 1536,
 'index_fullness': 0.0,
 'namespaces': {},
 'total_vector_count': 0}
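
The index above is created with `metric='cosine'`. As a rough illustration of what that metric measures, here is a minimal pure-Python sketch of cosine similarity on toy vectors. It is not Pinecone's implementation, and the function name `cosine_similarity` is chosen here for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


# toy 2-dimensional vectors; real embeddings here have 1536 dimensions
same_direction = cosine_similarity([1.0, 0.0], [2.0, 0.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
opposite = cosine_similarity([1.0, 0.0], [-1.0, 0.0])
print(same_direction, orthogonal, opposite)
```

Because the score depends only on direction, scaling a vector does not change its similarity to another vector.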

Now we add all of our docs to Pinecone:

In [7]:
index.upsert_from_dataframe(dataset.documents, batch_size=100)

upserted_count: 6952
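
To see what `batch_size=100` means for the number of requests, the sketch below splits a list of row positions into consecutive batches. It only illustrates batching in general and is not the Pinecone client's implementation. The helper name `iter_batches` is hypothetical.

```python
def iter_batches(rows, batch_size):
    """Yield consecutive slices of `rows`, each with at most `batch_size` items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


# stand-in for the 6952 rows that were upserted above
rows = list(range(6952))
batches = list(iter_batches(rows, 100))
print(len(batches), len(batches[-1]))
```

A smaller batch size means more, lighter requests, while a larger one means fewer, heavier requests.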

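After indexing everything, we can check the number of vectors in our index. The reported count can differ from `upserted_count`, for example if the index stats have not yet caught up with recent upserts or if several rows share an ID: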

In [8]:
index.describe_index_stats()

{'dimension': 1536,
 'index_fullness': 0.0,
 'namespaces': {'': {'vector_count': 3476}},
 'total_vector_count': 3476}

## Initializing Azure OpenAI

To use OpenAI's service via Azure, we first need to set up the service in Azure. Then, in **Azure OpenAI Studio**, we create two *Deployments*: one using `gpt-4` and another using `text-embedding-ada-002`.

Once we've done this, we need to set a few environment variables (all found in **Azure OpenAI Studio**) like so:

In [9]:
os.environ['OPENAI_API_KEY'] = 'YOUR_OPENAI_API_KEY'
os.environ['OPENAI_API_TYPE'] = 'azure'
os.environ['OPENAI_API_VERSION'] = '2023-03-15-preview'
os.environ['OPENAI_API_BASE'] = 'https://azure-pinecone-demo.openai.azure.com/'
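
Since a missing variable tends to surface later as a less obvious authentication or routing error, it can help to check the configuration up front. The sketch below is one way to do that. The helper name `missing_env_vars` is hypothetical, and the example values are placeholders rather than real credentials.

```python
import os

REQUIRED_VARS = [
    "OPENAI_API_KEY",
    "OPENAI_API_TYPE",
    "OPENAI_API_VERSION",
    "OPENAI_API_BASE",
]


def missing_env_vars(names, environ=None):
    """Return the names that are unset or empty in `environ` (defaults to os.environ)."""
    environ = os.environ if environ is None else environ
    return [name for name in names if not environ.get(name)]


# placeholder configuration with one variable deliberately left out
example_env = {
    "OPENAI_API_KEY": "placeholder-key",
    "OPENAI_API_TYPE": "azure",
    "OPENAI_API_VERSION": "2023-03-15-preview",
}
missing = missing_env_vars(REQUIRED_VARS, example_env)
print(missing)
```

Calling `missing_env_vars(REQUIRED_VARS)` without a second argument checks the real process environment.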

We can now connect to both of our deployments via LangChain. First, our `ChatCompletion` endpoint, which uses `gpt-4`:

In [10]:
from langchain.chat_models import AzureChatOpenAI

llm = AzureChatOpenAI(
    deployment_name="gpt4",
    model_name="gpt-4"
)

And then our embedding endpoint which uses `text-embedding-ada-002`:

In [11]:
from langchain.embeddings import OpenAIEmbeddings

embed = OpenAIEmbeddings(
    deployment='embedding',
    model='text-embedding-ada-002'
)

## Initializing Retrieval Component with LangChain

Before we move on, we must also initialize a connection to our index via LangChain, which we need for compatibility with later LangChain components. To do this, we pass the `index` from above into a LangChain `vectorstores.Pinecone` object:

In [12]:
from langchain.vectorstores import Pinecone

text_field = "text"

# connect to the index using the client (`pc`) created earlier
index = pc.Index(index_name)

vectorstore = Pinecone(
    index, embed.embed_query, text_field
)
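
The `text_field` argument names the metadata key that holds each chunk's text. As a simplified illustration of that idea (not LangChain's actual code), the sketch below turns a match returned from a vector search into a document by moving the text out of the metadata and keeping the rest, such as `source`, alongside it. The match structure and the helper name `match_to_document` are assumptions made for this example.

```python
def match_to_document(match, text_key="text"):
    """Split a search match into page content and the remaining metadata."""
    metadata = dict(match["metadata"])  # copy so the match is left untouched
    page_content = metadata.pop(text_key)
    return {"page_content": page_content, "metadata": metadata}


# hypothetical match shaped like a vector search result
match = {
    "id": "doc-0",
    "score": 0.87,
    "metadata": {
        "chunk": 0,
        "text": "Output parsers structure model responses.",
        "source": "https://example.com/output-parsers",
    },
}
document = match_to_document(match)
print(document)
```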

## Initializing the RetrievalQA Component

`RetrievalQA` and `RetrievalQAWithSourcesChain` are LangChain components that take a natural language query and return a response grounded in information retrieved from our knowledge base. We use `RetrievalQAWithSourcesChain` so that the original data sources are included in the response:

In [13]:
from langchain.chains import RetrievalQAWithSourcesChain

qa = RetrievalQAWithSourcesChain.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=vectorstore.as_retriever()
)
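
With `chain_type="stuff"`, all retrieved documents are placed ("stuffed") into a single prompt for the LLM. The sketch below illustrates that idea with a made-up prompt layout. The wording of the prompt, the helper name `build_stuff_prompt`, and the sample documents are all assumptions and do not reproduce LangChain's own template.

```python
def build_stuff_prompt(question, docs):
    """Concatenate every retrieved document into one prompt string."""
    sections = [
        f"Content: {doc['text']}\nSource: {doc['source']}" for doc in docs
    ]
    context = "\n\n".join(sections)
    return (
        "Answer the question using only the extracts below "
        "and list the sources you used.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# hypothetical retrieved documents
docs = [
    {"text": "Output parsers structure responses.", "source": "https://example.com/a"},
    {"text": "Parsers expose format instructions.", "source": "https://example.com/b"},
]
prompt = build_stuff_prompt("What is an output parser?", docs)
print(prompt)
```

The trade-off of this approach is that the combined documents must fit within the model's context window.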

Now we can begin asking questions about LangChain!

In [14]:
qa("can you tell me about the PromptLayer for OpenAI in LangChain?")

{'question': 'can you tell me about the PromptLayer for OpenAI in LangChain?',
 'answer': 'PromptLayer for OpenAI in LangChain is a middleware that allows developers to track, manage, and share GPT prompt engineering. It records all OpenAI API requests, enabling users to search and explore request history in the PromptLayer dashboard. LangChain provides PromptLayer wrappers for LLM, PromptLayerChatOpenAI, and PromptLayerOpenAIChat. To use PromptLayer within LangChain, you need to install the promptlayer python library, create a PromptLayer account, and create an API token to set as an environment variable (PROMPTLAYER_API_KEY).\n\n',
 'sources': '\n- https://python.langchain.com/en/latest/integrations/promptlayer.html\n- https://python.langchain.com/en/latest/modules/models/llms/integrations/promptlayer_openai.html\n- https://python.langchain.com/en/latest/modules/models/chat/integrations/promptlayer_chatopenai.html'}
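
The `sources` value comes back as a single string. If a list of URLs is more convenient, a small helper like the one below can split it. This is a sketch that assumes one source per line with an optional leading `-`, as in the output above. Other layouts, such as comma-separated sources, would need different handling. The helper name `parse_sources` is hypothetical.

```python
def parse_sources(sources):
    """Split a newline-separated sources string into a list of URLs."""
    urls = []
    for line in sources.splitlines():
        url = line.strip().lstrip("-").strip()  # drop bullet marker and padding
        if url:
            urls.append(url)
    return urls


# the sources string from the response above
sources = (
    "\n- https://python.langchain.com/en/latest/integrations/promptlayer.html"
    "\n- https://python.langchain.com/en/latest/modules/models/llms/integrations/promptlayer_openai.html"
    "\n- https://python.langchain.com/en/latest/modules/models/chat/integrations/promptlayer_chatopenai.html"
)
urls = parse_sources(sources)
print(urls)
```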

We can format responses nicely like so:

In [16]:
from IPython.display import display, Markdown

res = qa("why would I use an output parser in LangChain?")
display(Markdown(res['answer']))
print(res['sources'])

You would use an output parser in LangChain to get more structured information than just text back from language model responses. Output parsers are classes that help structure language model responses by implementing methods to format and parse the output into a desired structure. This can be useful in cases where you need specific data structures or structured information for further processing.


https://python.langchain.com/docs/modules/model_io/output_parsers/


In [15]:
from IPython.display import display, Markdown

res = qa("how can I use output parsers?")
display(Markdown(res['answer']))
print(res['sources'])

To use output parsers, you need to follow these steps:

1. Choose the output parser that fits your needs, such as PydanticOutputParser, RetryOutputParser, or OutputFixingParser.
2. Implement the necessary methods in the output parser, such as `get_format_instructions()` and `parse(str)`.
3. Optionally, implement the `parse_with_prompt(str, PromptValue)` method if you need additional information from the prompt to parse the output.
4. Use the output parser in your language model code, like in the PromptTemplate or ChatOpenAI.

Example usage with PydanticOutputParser:

```python
from langchain.prompts import PromptTemplate
from langchain.llms import OpenAI
from langchain.output_parsers import PydanticOutputParser
from pydantic import BaseModel, Field, validator

class Joke(BaseModel):
    setup: str = Field(description="question to set up a joke")
    punchline: str = Field(description="answer to resolve the joke")

parser = PydanticOutputParser(pydantic_object=Joke)
prompt = PromptTemplate(
    template="Your prompt template here",
    input_variables=["your_input_variables"],
    partial_variables={"format_instructions": parser.get_format_instructions()}
)
```



- https://python.langchain.com/en/latest/modules/prompts/output_parsers.html
- https://python.langchain.com/docs/modules/model_io/output_parsers/
- https://python.langchain.com/en/latest/modules/prompts/output_parsers/examples/retry.html


---