### Generating Guides from OpenAPI Specification

*Iteratively build more complex examples to prototype the generation of high-quality guides.*

### (1) Rudimentary Example

```mermaid
graph LR
    alloy.com.yaml --> DocumentLoader
    DocumentLoader --> Chat
    Query[How do I create a journey in Python?] --> Chat
    Chat --> Guide
```

In [6]:
from langchain_openai import ChatOpenAI



chat = ChatOpenAI(model="gpt-3.5-turbo-0125", temperature=0.2)

In [7]:
from langchain_community.document_loaders import TextLoader

file_path = 'alloy.com.yaml'
loader = TextLoader(file_path)
docs = loader.load()

### gpt-4-turbo Pricing

- Input: $10.00 / 1M tokens

- Output: $30.00 / 1M tokens

### gpt-3.5-turbo Pricing

- Input: $0.50 / 1M tokens

- Output: $1.50 / 1M tokens

In [8]:
import tiktoken

# Get the tokeniser for a specific OpenAI model. gpt-4 and gpt-3.5-turbo
# share the cl100k_base encoding, so one token count serves both estimates.
enc = tiktoken.encoding_for_model("gpt-4")

with open("alloy.com.yaml") as f:
    doc = f.read()

tokens = enc.encode(doc)
n_tokens = len(tokens)
# Input-token cost only: tokens * (USD per 1M input tokens) / 1M
gpt4_price = n_tokens * 10 / 1_000_000
gpt35_price = n_tokens * 0.5 / 1_000_000
print("gpt-4-turbo: Price of passing in alloy.com.yaml 1 time: ${}".format(gpt4_price))
print("gpt-4-turbo: Price of passing in alloy.com.yaml 20 times: ${}".format(gpt4_price * 20))
print("gpt-3.5-turbo: Price of passing in alloy.com.yaml 1 time: ${}".format(gpt35_price))
print("gpt-3.5-turbo: Price of passing in alloy.com.yaml 20 times: ${}".format(gpt35_price * 20))

gpt-4-turbo: Price of passing in alloy.com.yaml 1 time: $0.50122
gpt-4-turbo: Price of passing in alloy.com.yaml 20 times: $10.0244
gpt-3.5-turbo: Price of passing in alloy.com.yaml 1 time: $0.025061
gpt-3.5-turbo: Price of passing in alloy.com.yaml 20 times: $0.50122


### ⛔️ Problem #1: Passing entire OpenAPI Specification is too expensive
### ⛔️ Problem #2: Entire OpenAPI Specification does not fit in the gpt-3.5-turbo context window
### 🤔 Solution: Chunk the OpenAPI Specification, retrieve from vector store, and pass in as context instead
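
To make both problems concrete, here is a minimal sketch that works from the token count implied by the printed costs (50,122 tokens, since one pass costs $0.50122 at $10.00 per 1M tokens). The 16,385-token context window for `gpt-3.5-turbo-0125` is an assumption based on OpenAI's published model limits, not something stated in this notebook, and the helper names are hypothetical.

```python
# Token count implied by the cost output above: $0.50122 / ($10.00 per 1M tokens).
SPEC_TOKENS = 50_122

# Assumption: context window of gpt-3.5-turbo-0125 (not stated in the notebook).
GPT35_CONTEXT_WINDOW = 16_385


def input_cost(n_tokens: int, usd_per_million: float) -> float:
    """Return the input-token cost in USD for a single call."""
    return n_tokens * usd_per_million / 1_000_000


def fits_in_context(n_tokens: int, context_window: int) -> bool:
    """Return True if the prompt alone fits in the model's context window."""
    return n_tokens <= context_window


print(f"gpt-4-turbo, 1 call: ${input_cost(SPEC_TOKENS, 10.00):.5f}")
print(f"gpt-3.5-turbo, 1 call: ${input_cost(SPEC_TOKENS, 0.50):.6f}")
print("Fits in gpt-3.5-turbo:", fits_in_context(SPEC_TOKENS, GPT35_CONTEXT_WINDOW))
```

Under that assumption the spec is roughly three times larger than the window before any room is left for the question or the answer, which is why only a few retrieved chunks are passed as context.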

In [9]:

from langchain_chroma import Chroma
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Split the OpenAPI Specification (OAS) into overlapping chunks and store them in a vector DB
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
splits = text_splitter.split_documents(docs)
vectorstore = Chroma.from_documents(documents=splits, embedding=OpenAIEmbeddings())

# Retrieve the chunks of the OpenAPI Specification most relevant to a query.
retriever = vectorstore.as_retriever()

In [10]:
query = "How do I create a journey in Python?"

In [11]:
from langchain import hub

prompt = hub.pull("rlm/rag-prompt")

In [12]:
print(prompt)
print(type(prompt))

input_variables=['context', 'question'] metadata={'lc_hub_owner': 'rlm', 'lc_hub_repo': 'rag-prompt', 'lc_hub_commit_hash': '50442af133e61576e74536c6556cefe1fac147cad032f4377b60c436e6cdcb6e'} messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context', 'question'], template="You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.\nQuestion: {question} \nContext: {context} \nAnswer:"))]
<class 'langchain_core.prompts.chat.ChatPromptTemplate'>


In [13]:
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | chat
    | StrOutputParser()
)

output = rag_chain.invoke(query)

In [14]:
from IPython.display import display, Markdown

display(Markdown(output))

To create a journey in Python, you would need to define the properties of the journey object such as journey_name, journey_type, journey_token, and journey_version_num. You can also include links to related objects using the _links property. Additionally, you can embed journey applications within the journey object by defining the properties of the journey_applications array.

### 👎 Results 

The output is nearly useless: it provides neither environment setup instructions nor example code.

### 🤔 Maybe provide system instructions that are more explicit?

In [15]:
from langchain_core.prompts import ChatPromptTemplate

better_prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are an assistant for question-answering tasks. If you don't know the answer, just say that you don't know. Use markdown to ensure code blocks and commands are properly formatted. Make sure to always include environment setup instructions and code blocks that are helpful for a developer to copy-paste. Always explain inputs and outputs of API requests. Be as detailed as possible. Answer the user's questions based on the below context:\n\n{context}",
        ),
        (
            "user",
            "{question}"
        )
    ]
)

In [16]:
rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | better_prompt
    | chat
    | StrOutputParser()
)

In [17]:
output = rag_chain.invoke(query)

In [18]:
display(Markdown(output))

To create a journey in Python, you would typically need to make an API request to the journey creation endpoint provided by the service you are using. Below is a general outline of the steps involved in creating a journey in Python using requests library:

1. Import the necessary libraries:
```python
import requests
import json
```

2. Define the endpoint URL for creating a journey:
```python
url = "https://api.example.com/create_journey"
```

3. Prepare the payload data for the journey:
```python
payload = {
    "journey_name": "My Journey",
    "journey_type": "application",
    "journey_version_num": "1.0",
    "journey_token": "my_journey_token"
}
```

4. Convert the payload data to JSON format:
```python
payload_json = json.dumps(payload)
```

5. Set the headers for the request:
```python
headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer YOUR_ACCESS_TOKEN"
}
```

6. Make a POST request to create the journey:
```python
response = requests.post(url, data=payload_json, headers=headers)

# Check if the request was successful
if response.status_code == 201:
    print("Journey created successfully!")
else:
    print("Failed to create journey. Status code:", response.status_code)
```

7. Handle the response data as needed:
```python
journey_data = response.json()
# Process the journey data as needed
```

Please note that the actual implementation may vary depending on the specific API and service you are using to create journeys. Make sure to replace the placeholder values (e.g., URL, payload data, headers) with your actual data. Additionally, ensure you have the necessary permissions and access tokens to create journeys through the API.

### 👎 Results

The result lacks the correct base URL, the proper parameters, and an explanation of inputs and outputs, so the answer is still nearly useless. The likely cause is that the model was not given enough relevant context.

### 🤔 Let's investigate what our vector store is returning

In [20]:
results = retriever.invoke(query)

for doc in results:
    print(doc.page_content)

type: object
                                  properties:
                                    href:
                                      type: string
                      journey:
                        type: object
                        properties:
                          journey_name:
                            type: string
                          journey_type:
                            type: string
                            enum:
                              - application
                              - alert
                          journey_token:
                            type: string
                          journey_version_num:
                            type: string
                          _links:
                            type: object
                            properties:
                              self:
                                type: object
                                properties:
type: object
                                  propertie

### 👎 Useless chunk results

The retrieved chunks are fragments of schema definitions. They contain nothing about the relevant API endpoint: no path, no HTTP method, no request parameters.
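
A quick way to check this across many queries is a heuristic that flags whether a chunk mentions any endpoint at all. The sketch below is an illustration only: `mentions_operation` is a hypothetical helper, the regular expressions assume the spec is block-style YAML, and the path in `operation_chunk` is made up rather than taken from `alloy.com.yaml`.

```python
import re

# Matches a YAML key that looks like an OpenAPI path, e.g. "  /journeys:".
PATH_KEY = re.compile(r"^\s*/[^\s:]*:\s*$", re.MULTILINE)
# Matches a YAML key that is an HTTP method, e.g. "    post:".
METHOD_KEY = re.compile(
    r"^\s*(get|post|put|patch|delete|head|options)\s*:\s*$", re.MULTILINE
)


def mentions_operation(chunk_text: str) -> bool:
    """Heuristic: does this chunk contain a path key or an HTTP method key?"""
    return bool(PATH_KEY.search(chunk_text) or METHOD_KEY.search(chunk_text))


# Excerpt of the schema-only chunk printed above.
schema_chunk = """journey:
  type: object
  properties:
    journey_name:
      type: string
    journey_type:
      type: string
      enum:
        - application
        - alert
"""

# Hypothetical chunk that does describe an operation (the path is made up).
operation_chunk = """/example-journeys:
  post:
    summary: Create something
"""

print(mentions_operation(schema_chunk))     # False
print(mentions_operation(operation_chunk))  # True
```

In the notebook, the same check could be run as `[mentions_operation(doc.page_content) for doc in results]` to see how many retrieved chunks describe an operation.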

### 🤔 Can we improve the chunk results by doing smarter chunking?

Maybe if we chunk the OpenAPI spec based on operations, we can give the LLM better context.
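
One possible shape for operation-based chunking is sketched below. It assumes the YAML has already been parsed into a dictionary (for example with `yaml.safe_load`), and it emits one chunk per path and HTTP method so that each chunk carries its endpoint in the text. `chunk_by_operation` and the trimmed `example_spec` are hypothetical; the path and summaries are illustrative and not taken from `alloy.com.yaml`.

```python
import json

# Operation keys allowed under a path item in OpenAPI 3.x.
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def chunk_by_operation(spec: dict) -> list[dict]:
    """Return one chunk per (path, method) pair in a parsed OpenAPI document."""
    chunks = []
    for path, path_item in spec.get("paths", {}).items():
        for method, operation in path_item.items():
            if method.lower() not in HTTP_METHODS:
                continue  # skip non-operation keys such as "parameters"
            header = f"{method.upper()} {path}"
            body = json.dumps(operation, indent=2, sort_keys=True)
            chunks.append(
                {
                    "text": f"{header}\n{body}",
                    "metadata": {"path": path, "method": method.upper()},
                }
            )
    return chunks


# Hypothetical, heavily trimmed spec; the path and fields are illustrative only.
example_spec = {
    "paths": {
        "/example-journeys": {
            "parameters": [],
            "get": {"summary": "List example journeys"},
            "post": {"summary": "Create an example journey"},
        }
    }
}

for chunk in chunk_by_operation(example_spec):
    print(chunk["metadata"], "->", chunk["text"].splitlines()[0])
```

This sketch does not resolve `$ref` pointers or merge path-level parameters into each operation, so a real spec would likely need both before the chunks are self-contained enough to embed.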

### (2) Add documentation and custom configuration to input

```mermaid
graph LR
    alloy.com.yaml --> DocumentLoader
    Documentation --> DocumentLoader
    JSON[Custom JSON Configuration] --> DocumentLoader
    DocumentLoader --> Chat
    Query[How do I create a journey in Python?] --> Chat
    Chat --> Guide
```