### Generating Guides from OpenAPI Specification

*Iteratively build more complex examples to prototype the generation of high-quality guides.*

### 1️⃣ Rudimentary Example

```mermaid
graph LR
    alloy.com.yaml --> DocumentLoader
    DocumentLoader --> Chat
    Query[How do I create a journey in Python?] --> Chat
    Chat --> Guide
```

In [1]:
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain.chains.combine_documents import create_stuff_documents_chain


chat = ChatOpenAI(model="gpt-4-turbo", temperature=0.2)
question_answering_prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "When answering questions, use markdown to ensure code blocks and commands are properly formatted. Answer the user's questions based on the below context:\n\n{context}",
        ),
        MessagesPlaceholder(variable_name="messages"),
    ]
)
document_chain = create_stuff_documents_chain(chat, question_answering_prompt)
from langchain.document_loaders.text import TextLoader

file_path = 'alloy.com.yaml'
loader = TextLoader(file_path)
docs = loader.load()
from langchain.memory import ChatMessageHistory

demo_ephemeral_chat_history = ChatMessageHistory()

demo_ephemeral_chat_history.add_user_message("How do I create a journey in Python?")

output = document_chain.invoke(
    {
        "messages": demo_ephemeral_chat_history.messages,
        "context": docs,
    }
)
from IPython.display import display, Markdown

display(Markdown(output))

To create a journey in Python using the Alloy API, you'll need to make an HTTP POST request to the appropriate endpoint with the necessary headers and JSON body. Below is an example of how you can do this using the `requests` library in Python.

First, ensure you have the `requests` library installed. If not, you can install it using pip:

```bash
pip install requests
```

Here's a Python script to create a journey:

```python
import requests
import json

# Define the API endpoint
url = "https://demo-qasandbox.alloy.co/v1/journeys"

# Specify your API credentials
application_token = "your_application_token"
application_secret = "your_application_secret"

# Encode your credentials in base64 format for Basic Auth
import base64
credentials = f"{application_token}:{application_secret}"
encoded_credentials = base64.b64encode(credentials.encode("utf-8")).decode("utf-8")

# Set up the headers
headers = {
    "Authorization": f"Basic {encoded_credentials}",
    "Content-Type": "application/json"
}

# Define the JSON body of the request
data = {
    "entities": [
        {
            "external_entity_id": "entity-123",
            "data": {
                "name_first": "John",
                "name_last": "Doe",
                "birth_date": "1990-01-25",
                "document_ssn": "123-45-6789",
                "addresses": [
                    {
                        "type": "primary",
                        "line_1": "123 Main St",
                        "city": "Anytown",
                        "state": "CA",
                        "postal_code": "12345",
                        "country_code": "US"
                    }
                ],
                "emails": [
                    {
                        "email_address": "john.doe@example.com"
                    }
                ],
                "phones": [
                    {
                        "phone_number": "555-1234"
                    }
                ]
            },
            "entity_type": "person",
            "branch_name": "default"
        }
    ]
}

# Make the POST request
response = requests.post(url, headers=headers, data=json.dumps(data))

# Check the response
if response.status_code == 201:
    print("Journey created successfully.")
    print(response.json())
else:
    print("Failed to create journey.")
    print(response.text)
```

### Explanation:
1. **URL and Headers**: The URL points to the journey creation endpoint. The headers include authorization (using Basic Auth with your application token and secret) and content type set to `application/json`.

2. **Data**: The JSON body contains the details of the entities involved in the journey. Adjust the fields according to the specific requirements of your workflow.

3. **Request**: The `requests.post` method is used to send the POST request with the specified URL, headers, and JSON data.

4. **Response Handling**: The script checks if the response status code is 201 (Created), indicating success. It then prints the JSON response or error message.

Make sure to replace `"your_application_token"` and `"your_application_secret"` with your actual Alloy API credentials. Adjust the entity details and other parameters as per your specific use case.

### gpt-4-turbo Pricing

- Input: $10.00 / 1M tokens

- Output: $30.00 / 1M tokens

### gpt-3.5-turbo Pricing

- Input: $0.50 / 1M tokens

- Output: $1.50 / 1M tokens

In [2]:
import tiktoken

# To get the tokeniser corresponding to a specific model in the OpenAI API:
enc = tiktoken.encoding_for_model("gpt-4")

with open("alloy.com.yaml") as f:
    doc = f.read()

tokens = enc.encode(doc)
len(tokens)
gpt4_price = len(tokens) * 10 / 1_000_000
gpt35_price = len(tokens) * 0.5 / 1_000_000
print("gpt-4-turbo: Price of passing in alloy.com.yaml 1 time: ${}".format(gpt4_price))
print("gpt-4-turbo: Price of passing in alloy.com.yaml 20 times: ${}".format(gpt4_price * 20))
print("gpt-3.5-turbo: Price of passing in alloy.com.yaml 1 time: ${}".format(gpt35_price))
print("gpt-3.5-turbo: Price of passing in alloy.com.yaml 20 times: ${}".format(gpt35_price * 20))

gpt-4-turbo: Price of passing in alloy.com.yaml 1 time: $0.50122
gpt-4-turbo: Price of passing in alloy.com.yaml 20 times: $10.0244
gpt-3.5-turbo: Price of passing in alloy.com.yaml 1 time: $0.025061
gpt-3.5-turbo: Price of passing in alloy.com.yaml 20 times: $0.50122
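
The prices printed above imply the size of the spec: $0.50122 at $10.00 per 1M input tokens works out to 50,122 tokens. The helper below is a rough sketch that makes this arithmetic explicit and adds a context-window check. The function names are made up for this example, and the context-window sizes are assumptions that should be verified against OpenAI's current model documentation.

```python
# Prices come from the pricing lists above. Context-window sizes are
# ASSUMPTIONS; check them against OpenAI's model documentation.
MODELS = {
    "gpt-4-turbo": {"input_price_per_1m": 10.00, "context_window": 128_000},
    "gpt-3.5-turbo": {"input_price_per_1m": 0.50, "context_window": 16_385},
}


def input_cost(n_tokens: int, model: str, calls: int = 1) -> float:
    """Dollar cost of sending n_tokens of input to `model`, `calls` times."""
    return n_tokens * MODELS[model]["input_price_per_1m"] / 1_000_000 * calls


def fits_in_context(n_tokens: int, model: str) -> bool:
    """True if the prompt alone fits within the model's context window."""
    return n_tokens <= MODELS[model]["context_window"]


# Token count implied by the printed prices
spec_tokens = 50_122

for name in MODELS:
    cost = round(input_cost(spec_tokens, name), 6)
    print(name, cost, fits_in_context(spec_tokens, name))
```

Under these assumptions the spec fits in gpt-4-turbo but is roughly three times larger than the gpt-3.5-turbo window, before counting the question or the answer.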


### ⛔️ Problem #1: Passing the entire OpenAPI Specification is too expensive
### ⛔️ Problem #2: The entire OpenAPI Specification does not fit in gpt-3.5-turbo's context window
### 🤔 Solution: Chunk the OpenAPI Specification, retrieve relevant chunks from a vector store, and pass those in as context instead

#### Data Pipeline

```mermaid
graph LR
    alloy.com.yaml --> Chunk1
    alloy.com.yaml --> Chunk2
    alloy.com.yaml --> Chunk3
    Chunk1 -->|embed| VectorStore
    Chunk2 -->|embed| VectorStore
    Chunk3 -->|embed| VectorStore
    VectorStore -->|How do I create a journey in Python?| Chunk1Out[Chunk1]
    VectorStore -->|How do I create a journey in Python?| Chunk3Out[Chunk3]
    Chunk1Out --> Context
    Chunk3Out --> Context
    Context --> Chat
    Query[How do I create a journey in Python?] --> Chat
    Chat --> Guide
```

#### RAG Chain

```mermaid
graph LR
    alloy.com.yaml --> TextSplitter
    TextSplitter --> Chroma[Chroma 'Vector Store']
    Chroma --> Retriever
    Retriever --> Context
    Context --> Chat
    Query[How do I create a journey in Python?] --> Chat
    Chat --> Guide
```
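
To make the diagrams concrete, here is a toy sketch of the embed, store, and retrieve steps using only the standard library. It is not what Chroma or OpenAI embeddings do internally: the "embedding" is a plain word-count vector, the chunks are hand-written stand-ins for pieces of the spec, and every function name is hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts. A real pipeline calls an embedding model."""
    return Counter(re.findall(r"[a-z0-9_]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


# Hand-written stand-ins for chunks of the spec (illustrative only)
chunks = [
    "POST /journeys/{journey_token}/applications creates a journey application",
    "GET /entities/{entity_token} returns an entity",
    "POST /oauth/bearer returns a bearer token",
]

top = retrieve("How do I create a journey application?", chunks, k=1)
print(top[0])
```

Only the retrieved chunks are placed in the prompt, which is what keeps the context small compared with passing the whole file.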

In [3]:
from langchain_openai import ChatOpenAI

chat = ChatOpenAI(model="gpt-3.5-turbo-0125", temperature=0, model_kwargs={"seed": 0})

In [4]:
from langchain.document_loaders.text import TextLoader
def init_docs():
    file_path = 'alloy.com.yaml'
    loader = TextLoader(file_path)
    docs = loader.load()
    return docs

In [5]:
docs = init_docs()

In [6]:
from langchain_chroma import Chroma
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

# store OAS in vector DB
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
splits = text_splitter.split_documents(docs)
vectorstore = Chroma.from_documents(documents=splits, embedding=OpenAIEmbeddings())

# Retrieve and generate using the relevant snippets of the OpenAPI spec.
retriever = vectorstore.as_retriever()

In [7]:
query = "How do I create a journey in Python?"

In [8]:
from langchain import hub

prompt = hub.pull("rlm/rag-prompt")

In [9]:
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | chat
    | StrOutputParser()
)

output = rag_chain.invoke(query)

In [10]:
from IPython.display import display, Markdown

display(Markdown(output))

To create a journey in Python, you would need to define the journey properties such as journey_name, journey_type, journey_token, and journey_version_num. You can then use the provided endpoints to interact with the journey, such as creating applications or adding notes to applications. Additionally, you can create batches of journeys using the specified parameters and workflows.

### 👎 Results 

The output is nearly useless: it provides neither environment setup instructions nor example code.

### 🤔 Maybe provide system instructions that are more explicit?

In [11]:
from langchain_core.prompts import ChatPromptTemplate

better_prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are an assistant for question-answering tasks. If you don't know the answer, just say that you don't know. Use markdown to ensure code blocks and commands are properly formatted. Make sure to always include environment setup instructions and code blocks that are helpful for a developer to copy-paste. Always explain inputs and outputs of API requests. Be as detailed as possible. Answer the user's questions based on the below context:\n\n{context}",
        ),
        (
            "user",
            "{question}"
        )
    ]
)

In [12]:
rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | better_prompt
    | chat
    | StrOutputParser()
)

In [13]:
output = rag_chain.invoke(query)

In [14]:
display(Markdown(output))

To create a journey in Python, you typically need to make an API request to the endpoint that allows you to create a journey. Below are the general steps to create a journey using Python:

1. **Import necessary libraries**: You will need the `requests` library to make API requests.

```python
import requests
```

2. **Set up the base URL and authentication**: You need to have the base URL of the API and any authentication tokens required to access the API.

```python
base_url = "https://api.example.com"
headers = {
    "Authorization": "Bearer YOUR_ACCESS_TOKEN",
    "Content-Type": "application/json"
}
```

3. **Prepare the payload**: You need to prepare the data that you want to send in the API request. This data should follow the structure expected by the API.

```python
payload = {
    "journey_name": "My New Journey",
    "journey_type": "application",
    "journey_version_num": "1.0",
    # Add any other required fields for the journey
}
```

4. **Make the API request to create the journey**:

```python
response = requests.post(f"{base_url}/journeys", headers=headers, json=payload)

if response.status_code == 201:
    print("Journey created successfully")
    journey_data = response.json()
    journey_token = journey_data["journey_token"]
else:
    print("Failed to create journey. Status code:", response.status_code)
```

5. **Handle the response**: Check the response status code to ensure the journey was created successfully. If successful, you can extract the `journey_token` from the response for future reference.

Please note that the actual implementation may vary based on the specific API you are using. Make sure to refer to the API documentation for the exact endpoint, payload structure, and any additional parameters required for creating a journey.

### 👎 Results

The result lacks the correct base URL, the proper parameters, and an explanation of inputs and outputs, so the answer is still nearly useless. The likely cause is that the model was not given enough context.

### 🤔 Let's investigate what our vector store is returning

In [15]:
results = retriever.invoke(query)

for doc in results:
    print(doc.page_content)

type: object
                                  properties:
                                    href:
                                      type: string
                      journey:
                        type: object
                        properties:
                          journey_name:
                            type: string
                          journey_type:
                            type: string
                            enum:
                              - application
                              - alert
                          journey_token:
                            type: string
                          journey_version_num:
                            type: string
                          _links:
                            type: object
                            properties:
                              self:
                                type: object
                                properties:
properties:
                          self:
            

### 👎 Useless chunk results

The retrieved chunks are fragments of nested schema definitions; none of them contains anything about the relevant API endpoint.
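
One way to see why fixed-size chunks lose the endpoint: in YAML, the path appears once at the top of an operation, while the nested schema beneath it can run for thousands of characters. The sketch below uses a simplified character-window splitter (not `RecursiveCharacterTextSplitter` itself) on a made-up miniature spec and counts how many chunks still mention the path.

```python
def split_text(text: str, chunk_size: int, overlap: int) -> list[str]:
    """Fixed-size character windows with overlap, ignoring document structure."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


# Hypothetical miniature spec: one path followed by a long nested schema
schema_lines = [
    f"                field_{i}:\n                  type: string\n" for i in range(40)
]
spec_text = (
    "paths:\n"
    "  /journeys:\n"
    "    post:\n"
    "      summary: Create Journey\n"
    "      requestBody:\n"
    "        content:\n"
    "          application/json:\n"
    "            schema:\n"
    "              properties:\n"
    + "".join(schema_lines)
)

chunks = split_text(spec_text, chunk_size=300, overlap=60)
with_path = [c for c in chunks if "/journeys" in c]
print(f"{len(with_path)} of {len(chunks)} chunks mention the path")
```

Every chunk after the first is schema detail with no indication of which endpoint it belongs to, which matches the retriever output above.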

### 🤔 Can we improve the chunk results by doing smarter chunking?

If we chunk the OpenAPI spec by operation, we may be able to give the LLM better context. Each chunk becomes a document that contains a single operation from `paths` plus all of the contextual information (everything besides `paths`).
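
A per-operation chunk is only self-contained if `$ref` pointers are inlined first. The notebook uses `jsonref.replace_refs` for that; the sketch below shows the same idea with the standard library on a hypothetical toy spec. It handles only local `#/...` references, does not unescape JSON Pointer sequences, and does not guard against circular references.

```python
from copy import deepcopy


def resolve_pointer(root: dict, ref: str):
    """Follow a local reference such as '#/components/schemas/Journey'."""
    if not ref.startswith("#/"):
        raise ValueError(f"only local refs are supported: {ref}")
    node = root
    for part in ref[2:].split("/"):
        node = node[part]
    return node


def inline_refs(node, root: dict):
    """Return a copy of `node` with every local $ref replaced by its target."""
    if isinstance(node, dict):
        if "$ref" in node:
            return inline_refs(resolve_pointer(root, node["$ref"]), root)
        return {key: inline_refs(value, root) for key, value in node.items()}
    if isinstance(node, list):
        return [inline_refs(item, root) for item in node]
    return deepcopy(node)


def operation_chunk(spec: dict, path: str, method: str) -> dict:
    """Keep top-level context but only one inlined operation under `paths`."""
    chunk = {
        key: deepcopy(value)
        for key, value in spec.items()
        if key not in ("paths", "components")
    }
    chunk["paths"] = {path: {method: inline_refs(spec["paths"][path][method], spec)}}
    return chunk


# Hypothetical toy spec for illustration
spec = {
    "openapi": "3.0.0",
    "info": {"title": "Toy API", "version": "1.0.0"},
    "paths": {
        "/journeys": {
            "post": {
                "requestBody": {
                    "content": {
                        "application/json": {
                            "schema": {"$ref": "#/components/schemas/Journey"}
                        }
                    }
                }
            },
            "get": {"summary": "List journeys"},
        }
    },
    "components": {"schemas": {"Journey": {"type": "object"}}},
}

chunk = operation_chunk(spec, "/journeys", "post")
print(chunk["paths"])
```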

In [16]:
import yaml
import jsonref
from jsonref import replace_refs
from langchain_core.documents.base import Document
from copy import deepcopy
from pprint import pprint

with open("alloy.com.yaml") as f:
    spec = f.read()

def chunk_openapi_by_operation(openapi: str):
    parsed = yaml.safe_load(openapi)

    operations: list[tuple[str, str]] = []
    # 1) list all operations by (path, HTTP method)
    for path, methods in parsed['paths'].items():
        for method in methods.keys():
            # if method is not an HTTP method then skip
            if method.lower() not in ['get', 'post', 'put', 'delete', 'patch', 'head', 'options', 'trace']:
                continue
            operations.append((path, method))

    # 2) create a chunk for every operation

    # 2.a) Dereference entire OpenAPI Spec
    dereferenced = replace_refs(parsed, lazy_load=False)

    chunks = []
    for operation in operations:
        path = operation[0]
        method = operation[1]
        chunk = deepcopy(dereferenced)

        # use the operation's first tag if it has one; recomputed for every
        # operation so a tag from a previous iteration is never reused
        tags = chunk['paths'][path][method].get('tags') or []
        tag_name = tags[0] if tags else ""

        # keep only the top-level tag definition that matches this operation
        if 'tags' in chunk:
            chunk['tags'] = [tag for tag in chunk['tags'] if tag.get('name') == tag_name]

        if "summary" in chunk['paths'][path][method]:
            summary = chunk['paths'][path][method]['summary']
        else:
            summary = ""

        if "description" in chunk['paths'][path][method]:
            description = chunk['paths'][path][method]['description']
        else:
            description = ""

        # delete other operations
        for other_operation in operations:
            if other_operation[0] == operation[0]:
                continue
            if other_operation[0] in chunk['paths']:
                del chunk['paths'][other_operation[0]]

        # delete empty paths (iterate over a copy of the keys because the dict is mutated)
        for other_path in list(chunk['paths'].keys()):
            if not chunk['paths'][other_path]:
                del chunk['paths'][other_path]

        # delete other operations under same path
        for other_method in list(chunk['paths'][path].keys()):
            if other_method == method:
                continue
            del chunk['paths'][path][other_method]

        # delete all components (should be inlined from 2.a)
        chunk.pop('components', None)
        chunks.append(({
            "path": operation[0],
            "method": operation[1],
            "openapi": yaml.dump(chunk),
            "tag": tag_name,
            "summary": summary,
            "description": description
        }))
    return list(map(lambda chunk: Document(page_content=chunk["openapi"], metadata={
        "path": chunk["path"],
        "method": chunk["method"],
        "tag": chunk["tag"],
        "summary": chunk["summary"],
        "description": chunk["description"]
    }), chunks))
chunks = chunk_openapi_by_operation(spec)
# for chunk in chunks:
#     print(len(yaml.safe_load(chunk.page_content)['paths']))
print(len(chunks))

72


In [17]:
# reset chroma DB
from langchain_community.vectorstores.chroma import Chroma

Chroma().delete_collection()

In [20]:
from langchain.chains.query_constructor.base import AttributeInfo
from langchain.retrievers.self_query.base import SelfQueryRetriever
from langchain_community.vectorstores.chroma import Chroma
from langchain_openai import ChatOpenAI
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter
import os

# for chunk in chunks:
#     print(len(yaml.safe_load(chunk.page_content)['paths']))

query = "How do I create a journey application in Python?"

chat = ChatOpenAI(model="gpt-3.5-turbo-0125", temperature=0, model_kwargs={"seed": 0})

# Chunks are embedded whole; the text splitter below is intentionally left disabled
# text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
# splits = text_splitter.split_documents(chunks)
vectorstore = Chroma.from_documents(chunks, OpenAIEmbeddings())

In [21]:
metadata_field_info = [
    AttributeInfo(
        name="path",
        description="The subpath for this operation",
        type="string",
    ),
    AttributeInfo(
        name="method",
        description="The HTTP Method for this operation",
        type="string",
    ),
    AttributeInfo(
        name="tag",
        description="The logical grouping for this API operation",
        type="string",
    ),
    AttributeInfo(
        name="summary",
        description="A short description of this operation's functionality",
        type="string",
    ),
    AttributeInfo(
        name="description",
        description="A more detailed description of this operation's functionality",
        type="string",
    ),
]
document_content_description = "The pruned OpenAPI specification which includes only the relevant information for a particular operation in the OpenAPI specification."

# Dylan: really stupid, but I had to upgrade langchain to get this to work for some reason
retriever = SelfQueryRetriever.from_llm(
    chat, vectorstore, document_content_description, metadata_field_info, verbose=True, search_kwargs={"k": 4}
)

In [22]:
relevant_docs = retriever.invoke(query)
pprint(list(map(lambda doc: doc.metadata, relevant_docs)))

[{'description': 'Create a journey application for one or more entities.\n',
  'method': 'post',
  'path': '/journeys/{journey_token}/applications',
  'summary': 'Create Journey Application',
  'tag': 'Journeys'},
 {'description': 'Create a note associated with the specified Journey '
                 'Application',
  'method': 'post',
  'path': '/journeys/{journey_token}/applications/{journey_application_token}/notes',
  'summary': 'Create Journey Application Note',
  'tag': 'Journeys'},
 {'description': 'If a journey application has the status '
                 '`pending_journey_application_review`, this endpoint can be '
                 'used to inform the system of the outcome of the manual '
                 'review and submit review notes. The outcome submitted here '
                 'will be the final outcome of the journey application.',
  'method': 'post',
  'path': '/journeys/{journey_token}/applications/{journey_application_token}/review',
  'summary': 'Manual Review Jour

In [23]:
from langchain_core.runnables import RunnablePassthrough
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

better_prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a world-class solutions engineer and amazing at question-answering tasks. If you don't know the answer, just say that you don't know. Use markdown to ensure code blocks and commands are properly formatted. Make sure to always include environment setup instructions and code blocks that are helpful for a developer to copy-paste. Always explain inputs and outputs of API requests. Be as detailed as possible. Answer the user's questions based on the below context:\n\n{context}",
        ),
        (
            "user",
            "{question}"
        )
    ]
)

def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | better_prompt
    | chat
    | StrOutputParser()
)

In [24]:
from IPython.display import display, Markdown
output = rag_chain.invoke(query)
display(Markdown(output))

To create a journey application using the Alloy API in Python, you will need to send a POST request to the `/journeys/{journey_token}/applications` endpoint with the required payload. Here's a step-by-step guide on how to achieve this:

### Step 1: Install the Requests library
You can use the Requests library to make HTTP requests in Python. If you don't have it installed, you can install it using pip:

```bash
pip install requests
```

### Step 2: Make a POST request to create a journey application
Here's a sample Python script that demonstrates how to create a journey application using the Alloy API:

```python
import requests
import json

# Define the endpoint URL and journey token
url = "https://demo-qasandbox.alloy.co/v1/journeys/{journey_token}/applications"
journey_token = "your_journey_token_here"

# Define the payload for creating a journey application
payload = {
    "do_await_additional_entities": False,
    "entities": [
        {
            "branch_name": "persons",
            "data": {
                "addresses": [
                    {
                        "city": "New York",
                        "country_code": "US",
                        "line_1": "41 E. 11th",
                        "line_2": "2nd floor",
                        "postal_code": "10003",
                        "state": "NY",
                        "type": "primary"
                    }
                ],
                "birth_date": "1990-01-25",
                "document_ssn": "111223333",
                "email_address": "john@alloy.com",
                "ip_address_v4": "42.206.213.70",
                "meta": {
                    "user_type": "vip"
                },
                "name_first": "John",
                "name_last": "Doe",
                "name_middle": "Franklin",
                "phone_number": "8443825569"
            },
            "entity_type": "person",
            "external_entity_id": "my_system_entity_id_123"
        }
    ]
}

# Convert the payload to JSON
payload_json = json.dumps(payload)

# Make the POST request
response = requests.post(url.format(journey_token=journey_token), json=payload_json)

# Print the response
print(response.status_code)
print(response.json())
```

### Step 3: Replace placeholders
- Replace `your_journey_token_here` with the actual journey token you want to use.
- Update the payload with the necessary data for your journey application.

### Step 4: Run the script
Save the script to a Python file (e.g., `create_journey_application.py`) and run it. This will create a journey application using the Alloy API.

Make sure to handle any authentication requirements (OAuth2 or Basic Auth) as per the Alloy API documentation.

This script will send a POST request to create a journey application with the specified data. The response will contain the details of the created journey application.

If you have any specific authentication requirements or need further assistance, please let me know!

### 👎 Looks like we are missing security requirements

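Notice that the naive approach (passing in the entire spec) included directions on how to authenticate with the endpoint, while this output only mentions authentication in passing. Even the naive output was incomplete, though: it covered only one of the two authentication options, Basic but not OAuth.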
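
For reference, a guide that covered both options would need to show two different `Authorization` headers. The sketch below builds them with hypothetical helper names. Basic auth follows the same base64 approach as the first generated script; for OAuth, how the access token is obtained is not shown in this notebook, so the sketch simply takes it as an input.

```python
import base64


def basic_auth_header(token: str, secret: str) -> dict:
    """HTTP Basic: base64-encode 'token:secret'."""
    encoded = base64.b64encode(f"{token}:{secret}".encode("utf-8")).decode("utf-8")
    return {"Authorization": f"Basic {encoded}"}


def bearer_auth_header(access_token: str) -> dict:
    """OAuth2: send a previously obtained access token as a bearer token."""
    return {"Authorization": f"Bearer {access_token}"}


# Placeholder credentials, for illustration only
headers = {
    "Content-Type": "application/json",
    **basic_auth_header("your_token", "your_secret"),
}
print(headers)
```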

### 🤔 Let's make sure the model has all the necessary information to infer security requirements

In [25]:
relevant_docs = retriever.invoke(query)
print(relevant_docs[0].page_content)

_id: 626c513d1c9b07002812708b
info:
  description: hey hey hey, it's the Alloy API!
  title: Alloy API
  version: 1.0.0
openapi: 3.0.0
paths:
  /journeys/{journey_token}/applications:
    post:
      description: 'Create a journey application for one or more entities.

        '
      requestBody:
        content:
          application/json:
            schema:
              properties:
                do_await_additional_entities:
                  default: false
                  description: 'If this value is true, additional entities can be
                    sent after this request by using the PUT endpoint that updates
                    a journey application.


                    The journey application will not complete until the parameter
                    `has_finished_sending_additional_entities` is sent with the value
                    `true` to the PUT endpoint.

                    '
                  example: false
                  type: boolean
                e

### 💡 Because we delete `components` from the OpenAPI spec, each chunk loses its context about security requirements

### 👉 Preserve components.securitySchemes
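
The change is small: instead of deleting all of `components`, keep the one section that describes authentication. A minimal sketch of that idea on a hypothetical toy spec (the helper name `prune_components` is made up here) is below. It builds a new dict rather than deleting keys while looping over them, since mutating a dict during iteration raises a `RuntimeError` in Python.

```python
from copy import deepcopy


def prune_components(spec: dict, keep: tuple = ("securitySchemes",)) -> dict:
    """Return a copy of `spec` whose `components` holds only the sections in `keep`."""
    pruned = deepcopy(spec)
    components = pruned.get("components", {})
    # Build a new dict instead of deleting keys while iterating over them
    kept = {name: section for name, section in components.items() if name in keep}
    if kept:
        pruned["components"] = kept
    else:
        pruned.pop("components", None)
    return pruned


# Hypothetical miniature spec for illustration
spec = {
    "openapi": "3.0.0",
    "components": {
        "schemas": {"Journey": {"type": "object"}},
        "securitySchemes": {"basic": {"type": "http", "scheme": "basic"}},
    },
}

print(prune_components(spec)["components"])
```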

In [26]:
import yaml
import jsonref
from jsonref import replace_refs
from langchain_core.documents.base import Document
from copy import deepcopy
from pprint import pprint

with open("alloy.com.yaml") as f:
    spec = f.read()

def chunk_openapi_by_operation(openapi: str):
    parsed = yaml.safe_load(openapi)

    operations: list[tuple[str, str]] = []
    # 1) list all operations by (path, HTTP method)
    for path, methods in parsed['paths'].items():
        for method in methods.keys():
            # if method is not an HTTP method then skip
            if method.lower() not in ['get', 'post', 'put', 'delete', 'patch', 'head', 'options', 'trace']:
                continue
            operations.append((path, method))

    # 2) create a chunk for every operation

    # 2.a) Dereference entire OpenAPI Spec
    dereferenced = replace_refs(parsed, lazy_load=False)

    chunks = []
    for operation in operations:
        path = operation[0]
        method = operation[1]
        chunk = deepcopy(dereferenced)

        # use the operation's first tag if it has one; recomputed for every
        # operation so a tag from a previous iteration is never reused
        tags = chunk['paths'][path][method].get('tags') or []
        tag_name = tags[0] if tags else ""

        # keep only the top-level tag definition that matches this operation
        if 'tags' in chunk:
            chunk['tags'] = [tag for tag in chunk['tags'] if tag.get('name') == tag_name]

        if "summary" in chunk['paths'][path][method]:
            summary = chunk['paths'][path][method]['summary']
        else:
            summary = ""

        if "description" in chunk['paths'][path][method]:
            description = chunk['paths'][path][method]['description']
        else:
            description = ""

        # delete other operations
        for other_operation in operations:
            if other_operation[0] == operation[0]:
                continue
            if other_operation[0] in chunk['paths']:
                del chunk['paths'][other_operation[0]]

        # delete empty paths (iterate over a copy of the keys because the dict is mutated)
        for other_path in list(chunk['paths'].keys()):
            if not chunk['paths'][other_path]:
                del chunk['paths'][other_path]

        # delete other operations under same path
        for other_method in list(chunk['paths'][path].keys()):
            if other_method == method:
                continue
            del chunk['paths'][path][other_method]

        # delete all components besides securitySchemes (should be inlined from 2.a);
        # rebuild the dict rather than deleting keys while iterating over it
        if "components" in chunk:
            chunk["components"] = {
                key: value
                for key, value in chunk["components"].items()
                if key == "securitySchemes"
            }
        
        chunks.append(({
            "path": operation[0],
            "method": operation[1],
            "openapi": yaml.dump(chunk),
            "tag": tag_name,
            "summary": summary,
            "description": description
        }))
    return list(map(lambda chunk: Document(page_content=chunk["openapi"], metadata={
        "path": chunk["path"],
        "method": chunk["method"],
        "tag": chunk["tag"],
        "summary": chunk["summary"],
        "description": chunk["description"]
    }), chunks))
chunks = chunk_openapi_by_operation(spec)
# for chunk in chunks:
#     print(len(yaml.safe_load(chunk.page_content)['paths']))
print(len(chunks))

72


In [27]:
# reset chroma DB
from langchain_community.vectorstores.chroma import Chroma

Chroma().delete_collection()

In [28]:
from langchain.chains.query_constructor.base import AttributeInfo
from langchain.retrievers.self_query.base import SelfQueryRetriever
from langchain_community.vectorstores.chroma import Chroma
from langchain_openai import ChatOpenAI
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter
import os

# for chunk in chunks:
#     print(len(yaml.safe_load(chunk.page_content)['paths']))

query = "How do I create a journey application in Python?"

chat = ChatOpenAI(model="gpt-3.5-turbo-0125", temperature=0, model_kwargs={"seed": 0})

# Chunks are embedded whole; the text splitter below is intentionally left disabled
# text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
# splits = text_splitter.split_documents(chunks)
vectorstore = Chroma.from_documents(chunks, OpenAIEmbeddings())

In [29]:
metadata_field_info = [
    AttributeInfo(
        name="path",
        description="The subpath for this operation",
        type="string",
    ),
    AttributeInfo(
        name="method",
        description="The HTTP Method for this operation",
        type="string",
    ),
    AttributeInfo(
        name="tag",
        description="The logical grouping for this API operation",
        type="string",
    ),
    AttributeInfo(
        name="summary",
        description="A short description of this operation's functionality",
        type="string",
    ),
    AttributeInfo(
        name="description",
        description="A more detailed description of this operation's functionality",
        type="string",
    ),
]
document_content_description = "The pruned OpenAPI specification which includes only the relevant information for a particular operation in the OpenAPI specification."

# Dylan: really stupid, but I had to upgrade langchain to get this to work for some reason
retriever = SelfQueryRetriever.from_llm(
    chat, vectorstore, document_content_description, metadata_field_info, verbose=True, search_kwargs={"k": 4}
)

In [30]:
relevant_docs = retriever.invoke(query)
print(relevant_docs[0].page_content)

_id: 626c513d1c9b07002812708b
components:
  securitySchemes:
    basic:
      description: HTTP basic authorization using a workflow token and secret
      scheme: Basic
      type: http
    oauth2:
      description: Oauth2 using a workflow token and secret to generate a bearer token
      flows:
        clientCredentials:
          tokenUrl: /oauth/bearer
      type: oauth2
      x-default: viSRrSuUJEid8u0l3dyRTj5ATsWpHX9ShD51TH3j
info:
  description: hey hey hey, it's the Alloy API!
  title: Alloy API
  version: 1.0.0
openapi: 3.0.0
paths:
  /journeys/{journey_token}/applications:
    post:
      description: 'Create a journey application for one or more entities.

        '
      requestBody:
        content:
          application/json:
            schema:
              properties:
                do_await_additional_entities:
                  default: false
                  description: 'If this value is true, additional entities can be
                    sent after this reques

In [31]:
from langchain_core.runnables import RunnablePassthrough
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

better_prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a world-class solutions engineer and amazing at question-answering tasks. If you don't know the answer, just say that you don't know. Use markdown to ensure code blocks and commands are properly formatted. Make sure to always include environment setup instructions and code blocks that are helpful for a developer to copy-paste. Always explain inputs and outputs of API requests. Be as detailed as possible. Answer the user's questions based on the below context:\n\n{context}",
        ),
        (
            "user",
            "{question}"
        )
    ]
)

def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | better_prompt
    | chat
    | StrOutputParser()
)

In [32]:
from IPython.display import display, Markdown
output = rag_chain.invoke(query)
display(Markdown(output))

To create a journey application in Python, you will need to make an HTTP POST request to the endpoint `/journeys/{journey_token}/applications` with the necessary payload containing the entities and other relevant information. Here's a step-by-step guide on how to do this using the `requests` library in Python:

1. Install the `requests` library if you haven't already. You can install it using pip:

```bash
pip install requests
```

2. Import the necessary libraries:

```python
import requests
import json
```

3. Define the base URL and the journey token:

```python
base_url = "https://demo-qasandbox.alloy.co/v1"
journey_token = "JourneyTokenHere"
url = f"{base_url}/journeys/{journey_token}/applications"
```

4. Prepare the payload for the journey application. You can use the example payload provided in the API documentation:

```python
payload = {
    "do_await_additional_entities": False,
    "entities": [
        {
            "branch_name": "persons",
            "data": {
                "addresses": [
                    {
                        "city": "New York",
                        "country_code": "US",
                        "line_1": "41 E. 11th",
                        "line_2": "2nd floor",
                        "postal_code": "10003",
                        "state": "NY",
                        "type": "primary"
                    }
                ],
                "birth_date": "1990-01-25",
                "document_ssn": "111223333",
                "email_address": "john@alloy.com",
                "ip_address_v4": "42.206.213.70",
                "meta": {
                    "user_type": "vip"
                },
                "name_first": "John",
                "name_last": "Doe",
                "name_middle": "Franklin",
                "phone_number": "8443825569"
            },
            "entity_type": "person",
            "external_entity_id": "my_system_entity_id_123"
        }
    ]
}
```

5. Make the POST request to create the journey application:

```python
headers = {
    "Content-Type": "application/json"
}

response = requests.post(url, headers=headers, data=json.dumps(payload))

if response.status_code == 201:
    print("Journey application created successfully!")
    print(response.json())
else:
    print("Failed to create journey application.")
    print(response.text)
```

6. Replace `JourneyTokenHere` with the actual journey token you want to use.

7. Run the Python script, and it will create a journey application based on the provided payload.

This script will send a POST request to the Alloy API to create a journey application with the specified entities. Make sure to replace the placeholders with actual data before running the script.

### 👎 The guide did not explicitly guide the user through authentication
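A quick way to make this verdict repeatable across runs is to scan the generated guide for authentication markers. The sketch below is a rough heuristic, not part of the original pipeline: the `AUTH_MARKERS` patterns and the `mentions_authentication` helper are hypothetical names chosen for illustration, and a match only means the guide touches on authentication, not that its instructions are correct.

```python
import re

# Hypothetical markers; tune these for the API being documented.
AUTH_MARKERS = (
    r"\bAuthorization\b",   # explicit Authorization header
    r"\bBearer\b",          # OAuth2 bearer token usage
    r"\bHTTPBasicAuth\b",   # requests' basic auth helper
    r"\bauth\s*=",          # requests' auth= keyword argument
)


def mentions_authentication(guide: str) -> bool:
    """Return True if the guide text contains any authentication marker."""
    return any(re.search(pattern, guide) for pattern in AUTH_MARKERS)


# Stand-in snippets, not real model output.
guide_without_auth = 'headers = {"Content-Type": "application/json"}'
guide_with_auth = 'headers = {"Authorization": f"Bearer {token}"}'

print(mentions_authentication(guide_without_auth))
print(mentions_authentication(guide_with_auth))
```

In the notebook this could be called as `mentions_authentication(output)` right after `rag_chain.invoke(query)`.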

### 🤔 Maybe better prompting will do the trick

In [33]:
from langchain_core.runnables import RunnablePassthrough
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

better_prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            """You are a world-class solutions engineer and amazing at question-answering tasks. You are also a world-class API
            integrations specialist and deeply understand OpenAPI Specification.
            
            You strictly follow these rules:
            - If you don't know the answer, just say that you don't know.
            - Use markdown to ensure code blocks and commands are properly formatted.
            - Make sure to always include environment setup instructions and code blocks that are helpful for a developer to copy-paste.
            - Always explain inputs and outputs of API requests. Be as detailed as possible.
            - When presented with an OpenAPI Specification and asked questions about it, always include necessary steps to
            authenticate with the endpoint.
            - When presented with multiple authentication options, make sure to guide the customer through both methods
            
            Answer the user's questions based on the below context:\n\n{context}""",
        ),
        (
            "user",
            "{question}"
        )
    ]
)

def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | better_prompt
    | chat
    | StrOutputParser()
)

In [34]:
from IPython.display import display, Markdown
output = rag_chain.invoke(query)
display(Markdown(output))

To create a journey application in Python using the Alloy API, you will need to make a POST request to the `/journeys/{journey_token}/applications` endpoint with the necessary payload. Here's a step-by-step guide on how to achieve this:

### Prerequisites:
- You need to have the `requests` library installed. If you don't have it installed, you can install it using pip:
  ```bash
  pip install requests
  ```

### Python code to create a journey application:
```python
import requests
import json

# Define the endpoint URL and journey token
url = "https://demo-qasandbox.alloy.co/v1/journeys/{journey_token}/applications"
journey_token = "your_journey_token_here"

# Define the payload for creating a journey application
payload = {
    "do_await_additional_entities": False,
    "entities": [
        {
            "branch_name": "persons",
            "data": {
                "addresses": [
                    {
                        "city": "New York",
                        "country_code": "US",
                        "line_1": "41 E. 11th",
                        "line_2": "2nd floor",
                        "postal_code": "10003",
                        "state": "NY",
                        "type": "primary"
                    }
                ],
                "birth_date": "1990-01-25",
                "document_ssn": "111223333",
                "email_address": "john@alloy.com",
                "ip_address_v4": "42.206.213.70",
                "meta": {
                    "user_type": "vip"
                },
                "name_first": "John",
                "name_last": "Doe",
                "name_middle": "Franklin",
                "phone_number": "8443825569"
            },
            "entity_type": "person",
            "external_entity_id": "my_system_entity_id_123"
        }
    ]
}

# Convert the payload to JSON
payload_json = json.dumps(payload)

# Make the POST request to create the journey application
response = requests.post(url.format(journey_token=journey_token), data=payload_json, headers={"Content-Type": "application/json"})

# Check if the request was successful
if response.status_code == 201:
    print("Journey application created successfully!")
else:
    print("Failed to create journey application. Status code:", response.status_code)
    print("Response:", response.json())
```

In the code snippet above:
- Replace `your_journey_token_here` with the actual journey token you want to create the application for.
- Update the `payload` dictionary with the necessary data for creating the journey application.

This Python script will send a POST request to create a journey application with the specified data. Make sure to handle any authentication required by the Alloy API based on the security schemes provided in the OpenAPI Specification.

### 👎 Better prompting did not help. What if we switch the model from gpt-3.5 to gpt-4?

### 🤔 What is the token cost for gpt-4 with our new RAG architecture?

In [35]:
import tiktoken

# cl100k_base is the encoding used by both gpt-4-turbo and gpt-3.5-turbo
enc = tiktoken.get_encoding("cl100k_base")

relevant_docs = retriever.invoke(query)

tokens = []
for doc in relevant_docs:
    tokens += enc.encode(doc.page_content)
len(tokens)
gpt4_price = len(tokens) * 10 / 1_000_000
gpt35_price = len(tokens) * 0.5 / 1_000_000
print("gpt-4-turbo: Price of passing in alloy.com.yaml 1 time: ${}".format(gpt4_price))
print("gpt-4-turbo: Price of passing in alloy.com.yaml 20 times: ${}".format(gpt4_price * 20))
print("gpt-3.5-turbo: Price of passing in alloy.com.yaml 1 time: ${}".format(gpt35_price))
print("gpt-3.5-turbo: Price of passing in alloy.com.yaml 20 times: ${}".format(gpt35_price * 20))

gpt-4-turbo: Price of passing in alloy.com.yaml 1 time: $0.12483
gpt-4-turbo: Price of passing in alloy.com.yaml 20 times: $2.4966
gpt-3.5-turbo: Price of passing in alloy.com.yaml 1 time: $0.0062415
gpt-3.5-turbo: Price of passing in alloy.com.yaml 20 times: $0.12483


### RAG vs. Long Context

| Model   | RAG     | Long Context |
|---------|---------|--------------|
| GPT-4   | \$0.124  | \$0.5 (4x)    |
| GPT-3.5 | \$0.006  | \$0.025 (4x)   |
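The arithmetic behind the table can be reproduced without calling the API. This is a minimal sketch: the token count of 12,483 is implied by the printed \$0.12483 at \$10 per million tokens, the \$0.5 long-context figure is taken from the table above, and `prompt_cost` is a hypothetical helper name. Like the cell above, it counts only the retrieved document tokens, not the system prompt or the model's output.

```python
def prompt_cost(num_tokens: int, usd_per_million: float) -> float:
    """Cost in USD of sending num_tokens input tokens at the given rate."""
    return num_tokens * usd_per_million / 1_000_000


# Input prices in USD per 1M tokens, matching the multipliers used in the cell above.
GPT4_TURBO_INPUT = 10.0
GPT35_TURBO_INPUT = 0.5

rag_tokens = 12_483            # implied by $0.12483 at $10 per 1M tokens
long_context_cost_gpt4 = 0.5   # taken from the table above

rag_cost_gpt4 = prompt_cost(rag_tokens, GPT4_TURBO_INPUT)
rag_cost_gpt35 = prompt_cost(rag_tokens, GPT35_TURBO_INPUT)
ratio = long_context_cost_gpt4 / rag_cost_gpt4

print(f"gpt-4-turbo RAG: ${rag_cost_gpt4:.5f}")
print(f"gpt-3.5-turbo RAG: ${rag_cost_gpt35:.7f}")
print(f"Long context costs about {ratio:.1f}x more than RAG")
```

The ratio is the same for both models because both costs scale linearly with the token count.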



### 👎 Still kind of expensive

### 🤔 What if we limit context to 2 documents?

In [36]:
metadata_field_info = [
    AttributeInfo(
        name="path",
        description="The subpath for this operation",
        type="string",
    ),
    AttributeInfo(
        name="method",
        description="The HTTP Method for this operation",
        type="string",
    ),
    AttributeInfo(
        name="tag",
        description="The logical grouping for this API operation",
        type="string",
    ),
    AttributeInfo(
        name="summary",
        description="A short description of this operation's functionality",
        type="string",
    ),
    AttributeInfo(
        name="description",
        description="A more detailed description of this operation's functionality",
        type="string",
    ),
]
document_content_description = "The pruned OpenAPI specification which includes only the relevant information for a particular operation in the OpenAPI specification."

# Dylan: really stupid, but I had to upgrade langchain to get this to work for some reason
retriever = SelfQueryRetriever.from_llm(
    chat, vectorstore, document_content_description, metadata_field_info, verbose=True, search_kwargs={"k": 2}
)

In [37]:
import tiktoken

# To get the tokeniser corresponding to a specific model in the OpenAI API:
enc = tiktoken.get_encoding("cl100k_base")

relevant_docs = retriever.invoke(query)

tokens = []
for doc in relevant_docs:
    tokens += enc.encode(doc.page_content)
len(tokens)
gpt4_price = len(tokens) * 10 / 1_000_000
gpt35_price = len(tokens) * 0.5 / 1_000_000
print("gpt-4-turbo: Price of passing in alloy.com.yaml 1 time: ${}".format(gpt4_price))
print("gpt-4-turbo: Price of passing in alloy.com.yaml 20 times: ${}".format(gpt4_price * 20))
print("gpt-3.5-turbo: Price of passing in alloy.com.yaml 1 time: ${}".format(gpt35_price))
print("gpt-3.5-turbo: Price of passing in alloy.com.yaml 20 times: ${}".format(gpt35_price * 20))

gpt-4-turbo: Price of passing in alloy.com.yaml 1 time: $0.07569
gpt-4-turbo: Price of passing in alloy.com.yaml 20 times: $1.5137999999999998
gpt-3.5-turbo: Price of passing in alloy.com.yaml 1 time: $0.0037845
gpt-3.5-turbo: Price of passing in alloy.com.yaml 20 times: $0.07569000000000001


### RAG vs. Long Context

| Model   | RAG      | Long Context   |
|---------|----------|----------------|
| GPT-4   | \$0.075  | \$0.5 (6.6x)   |
| GPT-3.5 | \$0.0038 | \$0.025 (6.6x) |



### 🤔 Not bad, but does the output still include all the necessary info?

In [38]:
from langchain.chains.query_constructor.base import AttributeInfo
from langchain.retrievers.self_query.base import SelfQueryRetriever
from langchain_community.vectorstores.chroma import Chroma
from langchain_openai import ChatOpenAI
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter
import os

# for chunk in chunks:
#     print(len(yaml.safe_load(chunk.page_content)['paths']))

query = "How do I create a journey application in Python?"

chat = ChatOpenAI(model="gpt-3.5-turbo", temperature=0, model_kwargs={"seed": 0})

# Reset the collection to remove embeddings from previous runs
Chroma().delete_collection()

vectorstore = Chroma.from_documents(chunks, OpenAIEmbeddings())

In [39]:
metadata_field_info = [
    AttributeInfo(
        name="path",
        description="The subpath for this operation",
        type="string",
    ),
    AttributeInfo(
        name="method",
        description="The HTTP Method for this operation",
        type="string",
    ),
    AttributeInfo(
        name="tag",
        description="The logical grouping for this API operation",
        type="string",
    ),
    AttributeInfo(
        name="summary",
        description="A short description of this operation's functionality",
        type="string",
    ),
    AttributeInfo(
        name="description",
        description="A more detailed description of this operation's functionality",
        type="string",
    ),
]
document_content_description = "The pruned OpenAPI specification which includes only the relevant information for a particular operation in the OpenAPI specification."

# Dylan: really stupid, but I had to upgrade langchain to get this to work for some reason
retriever = SelfQueryRetriever.from_llm(
    chat, vectorstore, document_content_description, metadata_field_info, verbose=True, search_kwargs={"k": 2}
)

In [40]:
from langchain_core.runnables import RunnablePassthrough
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

better_prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            """You are a world-class solutions engineer and amazing at question-answering tasks. You are also a world-class API
            integrations specialist and deeply understand OpenAPI Specification.
            
            You strictly follow these rules:
            - If you don't know the answer, just say that you don't know.
            - Use markdown to ensure code blocks and commands are properly formatted.
            - Make sure to always include environment setup instructions and code blocks that are helpful for a developer to copy-paste.
            - Always explain inputs and outputs of API requests. Be as detailed as possible.
            - When presented with an OpenAPI Specification and asked questions about it, always include necessary steps to
            authenticate with the endpoint.
            - When presented with multiple authentication options, make sure to guide the customer through both methods
            
            Answer the user's questions based on the below context:\n\n{context}""",
        ),
        (
            "user",
            "{question}"
        )
    ]
)

def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | better_prompt
    | chat
    | StrOutputParser()
)

In [41]:
from IPython.display import display, Markdown
output = rag_chain.invoke(query)

display(Markdown(output))

To create a journey application using the Alloy API in Python, you will need to follow these steps:

1. **Authenticate with the Alloy API**:
   - The Alloy API uses OAuth2 for authentication. You will need to obtain a bearer token using your workflow token and secret.
   - Here is an example of how you can authenticate using OAuth2 in Python:

```python
import requests

url = "https://demo-qasandbox.alloy.co/oauth/bearer"
payload = {
    "grant_type": "client_credentials",
    "client_id": "your_workflow_token",
    "client_secret": "your_workflow_secret"
}
response = requests.post(url, data=payload)
bearer_token = response.json()["access_token"]
```

2. **Create a Journey Application**:
   - Once you have obtained the bearer token, you can proceed to create a journey application by sending a POST request to the `/journeys/{journey_token}/applications` endpoint.
   - You will need to provide the necessary data in the request body, such as the list of entities to be processed in the application.
   - Here is an example of how you can create a journey application in Python:

```python
import requests

url = "https://demo-qasandbox.alloy.co/v1/journeys/{journey_token}/applications"
headers = {
    "Authorization": f"Bearer {bearer_token}",
    "Content-Type": "application/json"
}
payload = {
    "do_await_additional_entities": False,
    "entities": [
        {
            "branch_name": "persons",
            "data": {
                "addresses": [
                    {
                        "city": "New York",
                        "country_code": "US",
                        "line_1": "41 E. 11th",
                        "line_2": "2nd floor",
                        "postal_code": "10003",
                        "state": "NY",
                        "type": "primary"
                    }
                ],
                "birth_date": "1990-01-25",
                "document_ssn": "111223333",
                "email_address": "john@alloy.com",
                "ip_address_v4": "42.206.213.70",
                "meta": {
                    "user_type": "vip"
                },
                "name_first": "John",
                "name_last": "Doe",
                "name_middle": "Franklin",
                "phone_number": "8443825569"
            },
            "entity_type": "person",
            "external_entity_id": "my_system_entity_id_123"
        }
    ]
}

response = requests.post(url, headers=headers, json=payload)
print(response.json())
```

3. **Handle the Response**:
   - The response will contain information about the created journey application, such as the application token and any additional details.

This is a basic example of how you can create a journey application using the Alloy API in Python. Make sure to replace placeholders like `{journey_token}`, `your_workflow_token`, and `your_workflow_secret` with your actual values.

### 👎 Guide for security authorization was not helpful

### 🤔 Does it produce significantly better results with gpt-4?

In [42]:
from langchain.chains.query_constructor.base import AttributeInfo
from langchain.retrievers.self_query.base import SelfQueryRetriever
from langchain_community.vectorstores.chroma import Chroma
from langchain_openai import ChatOpenAI
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter
import os

# for chunk in chunks:
#     print(len(yaml.safe_load(chunk.page_content)['paths']))

query = "How do I create a journey application in Python?"

chat = ChatOpenAI(model="gpt-4-turbo", temperature=0, model_kwargs={"seed": 0})

# Reset the collection to remove embeddings from previous runs
Chroma().delete_collection()

vectorstore = Chroma.from_documents(chunks, OpenAIEmbeddings())

In [43]:
metadata_field_info = [
    AttributeInfo(
        name="path",
        description="The subpath for this operation",
        type="string",
    ),
    AttributeInfo(
        name="method",
        description="The HTTP Method for this operation",
        type="string",
    ),
    AttributeInfo(
        name="tag",
        description="The logical grouping for this API operation",
        type="string",
    ),
    AttributeInfo(
        name="summary",
        description="A short description of this operation's functionality",
        type="string",
    ),
    AttributeInfo(
        name="description",
        description="A more detailed description of this operation's functionality",
        type="string",
    ),
]
document_content_description = "The pruned OpenAPI specification which includes only the relevant information for a particular operation in the OpenAPI specification."

# Dylan: really stupid, but I had to upgrade langchain to get this to work for some reason
retriever = SelfQueryRetriever.from_llm(
    chat, vectorstore, document_content_description, metadata_field_info, verbose=True, search_kwargs={"k": 2}
)

In [44]:
from langchain_core.runnables import RunnablePassthrough
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

better_prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            """You are a world-class solutions engineer and amazing at question-answering tasks. You are also a world-class API
            integrations specialist and deeply understand OpenAPI Specification.
            
            You strictly follow these rules:
            - If you don't know the answer, just say that you don't know.
            - Use markdown to ensure code blocks and commands are properly formatted.
            - Make sure to always include environment setup instructions and code blocks that are helpful for a developer to copy-paste.
            - Always explain inputs and outputs of API requests. Be as detailed as possible.
            - When presented with an OpenAPI Specification and asked questions about it, always include necessary steps to
            authenticate with the endpoint.
            - When presented with multiple authentication options, make sure to guide the customer through both methods
            
            Answer the user's questions based on the below context:\n\n{context}""",
        ),
        (
            "user",
            "{question}"
        )
    ]
)

def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | better_prompt
    | chat
    | StrOutputParser()
)

In [45]:
from IPython.display import display, Markdown
output = rag_chain.invoke(query)
display(Markdown(output))

To create a journey application using the Alloy API in Python, you'll need to follow these steps. This involves setting up your environment, authenticating with the API, and making a POST request to the appropriate endpoint.

### Environment Setup

First, ensure you have Python installed on your system. You can download Python from [python.org](https://www.python.org/downloads/). You will also need to install the `requests` library, which is used for making HTTP requests. You can install this library using pip:

```bash
pip install requests
```

### Authentication

The Alloy API provides two methods for authentication: Basic Authentication and OAuth2. Here's how you can authenticate using both methods:

#### 1. Basic Authentication

For Basic Authentication, you need a workflow token and secret. These should be encoded in base64 and used as the Authorization header.

```python
import base64

def get_basic_auth_header(token, secret):
    token_secret = f"{token}:{secret}".encode('utf-8')
    b64_encoded = base64.b64encode(token_secret).decode('utf-8')
    return {"Authorization": f"Basic {b64_encoded}"}
```

#### 2. OAuth2 Authentication

For OAuth2, you need to obtain a bearer token using the client credentials flow.

```python
def get_oauth2_token(client_id, client_secret, token_url):
    response = requests.post(
        token_url,
        auth=(client_id, client_secret)
    )
    return response.json()['access_token']

# Example usage
token_url = 'https://demo-qasandbox.alloy.co/v1/oauth/bearer'
access_token = get_oauth2_token('your_client_id', 'your_client_secret', token_url)
auth_header = {"Authorization": f"Bearer {access_token}"}
```

### Making the POST Request

Once authenticated, you can create a journey application by making a POST request to the `/journeys/{journey_token}/applications` endpoint.

```python
import requests
import json

def create_journey_application(journey_token, entities, auth_header):
    url = f"https://demo-qasandbox.alloy.co/v1/journeys/{journey_token}/applications"
    payload = {
        "entities": entities,
        "do_await_additional_entities": False
    }
    headers = {
        **auth_header,
        "Content-Type": "application/json"
    }
    response = requests.post(url, headers=headers, data=json.dumps(payload))
    return response.json()

# Example usage
journey_token = "your_journey_token"
entities = [
    {
        "branch_name": "persons",
        "data": {
            "name_first": "John",
            "name_last": "Doe",
            "birth_date": "1990-01-25",
            "document_ssn": "111223333",
            "email_address": "john@alloy.com",
            "phone_number": "8443825569",
            "addresses": [
                {
                    "line_1": "41 E. 11th",
                    "line_2": "2nd floor",
                    "city": "New York",
                    "state": "NY",
                    "postal_code": "10003",
                    "country_code": "US",
                    "type": "primary"
                }
            ]
        },
        "entity_type": "person",
        "external_entity_id": "my_system_entity_id_123"
    }
]

# Choose the authentication method
auth_header = get_basic_auth_header('your_token', 'your_secret')
# Or use OAuth2
# auth_header = {"Authorization": f"Bearer {access_token}"}

response = create_journey_application(journey_token, entities, auth_header)
print(response)
```

This script sets up the necessary headers, constructs the JSON payload for the request, and handles the API response. Adjust the `entities` list according to the specific requirements of your application.

### 👍 Pretty good

Includes both options for authentication and copy-pastable code examples

### 🤔 What about making it so security credentials are pulled through environment variables

Let's try editing the system prompt.
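For reference, the pattern we want the generated guide to follow might look like the sketch below. It assumes the credentials live in `ALLOY_WORKFLOW_TOKEN` and `ALLOY_WORKFLOW_SECRET`; the helper names `load_alloy_credentials` and `basic_auth_header` are hypothetical and not part of any Alloy SDK. The demo passes a stand-in mapping so it runs without real credentials.

```python
import base64
import os
from typing import Mapping

REQUIRED_VARS = ("ALLOY_WORKFLOW_TOKEN", "ALLOY_WORKFLOW_SECRET")


def load_alloy_credentials(env: Mapping[str, str] = os.environ) -> tuple[str, str]:
    """Read the workflow token and secret from the environment, failing fast if absent."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return env["ALLOY_WORKFLOW_TOKEN"], env["ALLOY_WORKFLOW_SECRET"]


def basic_auth_header(token: str, secret: str) -> dict[str, str]:
    """Build an HTTP Basic Authorization header from a token and secret."""
    encoded = base64.b64encode(f"{token}:{secret}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {encoded}"}


# Demo with a stand-in mapping instead of the real environment.
demo_env = {"ALLOY_WORKFLOW_TOKEN": "demo-token", "ALLOY_WORKFLOW_SECRET": "demo-secret"}
token, secret = load_alloy_credentials(demo_env)
header = basic_auth_header(token, secret)
print(sorted(header))  # print only the header names, never the secret itself
```

Failing fast on a missing variable avoids sending a request with `None:None` as credentials, which is what a bare `os.getenv` would silently produce.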

In [46]:
from langchain.chains.query_constructor.base import AttributeInfo
from langchain.retrievers.self_query.base import SelfQueryRetriever
from langchain_community.vectorstores.chroma import Chroma
from langchain_openai import ChatOpenAI
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter
import os

# for chunk in chunks:
#     print(len(yaml.safe_load(chunk.page_content)['paths']))

query = "How do I create a journey application in Python?"

chat = ChatOpenAI(model="gpt-4-turbo", temperature=0, model_kwargs={"seed": 0})

# Reset the collection to remove embeddings from previous runs
Chroma().delete_collection()

vectorstore = Chroma.from_documents(chunks, OpenAIEmbeddings())

In [47]:
metadata_field_info = [
    AttributeInfo(
        name="path",
        description="The subpath for this operation",
        type="string",
    ),
    AttributeInfo(
        name="method",
        description="The HTTP Method for this operation",
        type="string",
    ),
    AttributeInfo(
        name="tag",
        description="The logical grouping for this API operation",
        type="string",
    ),
    AttributeInfo(
        name="summary",
        description="A short description of this operation's functionality",
        type="string",
    ),
    AttributeInfo(
        name="description",
        description="A more detailed description of this operation's functionality",
        type="string",
    ),
]
document_content_description = "The pruned OpenAPI specification which includes only the relevant information for a particular operation in the OpenAPI specification."

# Dylan: really stupid, but I had to upgrade langchain to get this to work for some reason
retriever = SelfQueryRetriever.from_llm(
    chat, vectorstore, document_content_description, metadata_field_info, verbose=True, search_kwargs={"k": 2}
)

In [48]:
from langchain_core.runnables import RunnablePassthrough
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

better_prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            """You are a world-class solutions engineer and amazing at question-answering tasks. You are also a world-class API
            integrations specialist and deeply understand OpenAPI Specification. You also deeply understand computer security which
            allows you to produce secure code examples and guidance.
            
            You strictly follow these rules:
            - If you don't know the answer, just say that you don't know.
            - Use markdown to ensure code blocks and commands are properly formatted.
            - Make sure to always include environment setup instructions and code blocks that are helpful for a developer to copy-paste.
            - Always explain inputs and outputs of API requests. Be as detailed as possible.
            - When presented with an OpenAPI Specification and asked questions about it, always include necessary steps to
            authenticate with the endpoint.
            - When presented with multiple authentication options, make sure to guide the customer through both methods
            - Always name variables for security credentials based on standard protocols, API name, or descriptions
            - For example if the following security scheme is:
  "securitySchemes:
    basic:
      type: http
      description: HTTP basic authorization using a workflow token and secret
      scheme: Basic"

or

    "oauth2:
      type: oauth2
      x-default: viSRrSuUJEid8u0l3dyRTj5ATsWpHX9ShD51TH3j
      description: Oauth2 using a workflow token and secret to generate a bearer token
      flows:
        clientCredentials:
          tokenUrl: /oauth/bearer"
      
              Then you would name the variables in code as "ALLOY_WORKFLOW_TOKEN" and "ALLOY_WORKFLOW_SECRET".
            - When referencing security credentials, always pull values from environment variables.
            
            Answer the user's questions based on the below context:\n\n{context}""",
        ),
        (
            "user",
            "{question}"
        )
    ]
)

def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | better_prompt
    | chat
    | StrOutputParser()
)

In [49]:
from IPython.display import display, Markdown
output = rag_chain.invoke(query)
display(Markdown(output))

To create a journey application using the Alloy API in Python, you'll need to authenticate and send a POST request to the appropriate endpoint. Below, I'll guide you through setting up your environment, authenticating using both Basic and OAuth2 methods, and making the API request.

### Environment Setup

First, ensure you have Python and `requests` library installed. You can install the `requests` library using pip if you haven't already:

```bash
pip install requests
```

### Authentication

The Alloy API provides two methods for authentication: Basic and OAuth2. I'll provide examples for both.

#### 1. Basic Authentication

For Basic Authentication, you need your workflow token and secret. These should be stored securely and not hard-coded in your scripts. Use environment variables to store these credentials.

```python
import os
import requests
from requests.auth import HTTPBasicAuth

# Environment variables for security credentials
ALLOY_WORKFLOW_TOKEN = os.getenv('ALLOY_WORKFLOW_TOKEN')
ALLOY_WORKFLOW_SECRET = os.getenv('ALLOY_WORKFLOW_SECRET')

# API endpoint
url = "https://demo-qasandbox.alloy.co/v1/journeys/{journey_token}/applications"

# Replace {journey_token} with your actual journey token
url = url.format(journey_token="your_journey_token_here")

# Request body
payload = {
    "do_await_additional_entities": False,
    "entities": [
        {
            "branch_name": "persons",
            "data": {
                "addresses": [
                    {
                        "city": "New York",
                        "country_code": "US",
                        "line_1": "41 E. 11th",
                        "line_2": "2nd floor",
                        "postal_code": "10003",
                        "state": "NY",
                        "type": "primary"
                    }
                ],
                "birth_date": "1990-01-25",
                "document_ssn": 111223333,
                "email_address": "john@alloy.com",
                "ip_address_v4": "42.206.213.70",
                "meta": {
                    "user_type": "vip"
                },
                "name_first": "John",
                "name_last": "Doe",
                "name_middle": "Franklin",
                "phone_number": 8443825569
            },
            "entity_type": "person",
            "external_entity_id": "my_system_entity_id_123"
        }
    ]
}

# Headers
headers = {
    'Content-Type': 'application/json'
}

# Make the POST request
response = requests.post(url, json=payload, auth=HTTPBasicAuth(ALLOY_WORKFLOW_TOKEN, ALLOY_WORKFLOW_SECRET), headers=headers)

# Print the response
print(response.json())
```

#### 2. OAuth2 Authentication

For OAuth2, you'll first need to obtain a bearer token using the client credentials flow.

```python
# Token endpoint
token_url = "https://demo-qasandbox.alloy.co/v1/oauth/bearer"

# Obtain the bearer token
token_response = requests.post(token_url, auth=HTTPBasicAuth(ALLOY_WORKFLOW_TOKEN, ALLOY_WORKFLOW_SECRET))
access_token = token_response.json().get('access_token')

# Headers with the bearer token
headers = {
    'Authorization': f'Bearer {access_token}',
    'Content-Type': 'application/json'
}

# Make the POST request
response = requests.post(url, json=payload, headers=headers)

# Print the response
print(response.json())
```

### Inputs and Outputs

- **Inputs**: The request body includes details about the entities involved in the journey application, such as personal information, addresses, and other metadata.
- **Outputs**: The response will include details about the created journey application, such as tokens, links to related resources, and status information.

Make sure to replace placeholders like `your_journey_token_here` with actual values relevant to your use case. Always handle sensitive data such as tokens and personal information securely.

### 👍 Excellent

Looks like we should be using RAG (to reduce cost) + GPT-4 (for quality).
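
As a rough illustration of the cost argument, the sketch below compares an estimated prompt size when the whole specification is stuffed into the context against one where only the top `k=2` retrieved chunks are sent. The 4-characters-per-token heuristic and the document sizes are assumptions chosen for illustration, not measurements from `alloy.com.yaml`.

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Very rough token estimate (assumed heuristic; real tokenizers differ)."""
    return int(len(text) / chars_per_token)


def prompt_tokens(context_parts: list[str], question: str) -> int:
    """Estimated tokens for the context plus the question."""
    return sum(estimate_tokens(part) for part in context_parts) + estimate_tokens(question)


question = "How do I create a journey application in Python?"

# Hypothetical sizes: a 500,000-character spec vs. per-operation chunks of 8,000 characters.
full_spec = "x" * 500_000
retrieved_chunks = ["x" * 8_000 for _ in range(2)]  # top k=2 chunks

stuffed = prompt_tokens([full_spec], question)
retrieved = prompt_tokens(retrieved_chunks, question)

print(f"stuffed: ~{stuffed} tokens, retrieved: ~{retrieved} tokens")
```

Under these assumed sizes the retrieved prompt is a small fraction of the stuffed one; the actual ratio depends on how large the dereferenced chunks turn out to be.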

### 🤔 What about in Java?

In [50]:
from langchain.chains.query_constructor.base import AttributeInfo
from langchain.retrievers.self_query.base import SelfQueryRetriever
from langchain_community.vectorstores.chroma import Chroma
from langchain_openai import ChatOpenAI
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter
import os

# for chunk in chunks:
#     print(len(yaml.safe_load(chunk.page_content)['paths']))

query = "How do I create a journey application in Java?"

chat = ChatOpenAI(model="gpt-4-turbo", temperature=0, model_kwargs={"seed": 0})

# Reset the collection to remove embeddings from previous runs
Chroma().delete_collection()

vectorstore = Chroma.from_documents(chunks, OpenAIEmbeddings())

In [51]:
metadata_field_info = [
    AttributeInfo(
        name="path",
        description="The subpath for this operation",
        type="string",
    ),
    AttributeInfo(
        name="method",
        description="The HTTP Method for this operation",
        type="string",
    ),
    AttributeInfo(
        name="tag",
        description="The logical grouping for this API operation",
        type="string",
    ),
    AttributeInfo(
        name="summary",
        description="A short description of this operation's functionality",
        type="string",
    ),
    AttributeInfo(
        name="description",
        description="A more detailed description of this operation's functionality",
        type="string",
    ),
]
document_content_description = "The pruned OpenAPI specification which includes only the relevant information for a particular operation in the OpenAPI specification."

# Dylan: really stupid, but I had to upgrade langchain to get this to work for some reason
retriever = SelfQueryRetriever.from_llm(
    chat, vectorstore, document_content_description, metadata_field_info, verbose=True, search_kwargs={"k": 2}
)

In [52]:
from langchain_core.runnables import RunnablePassthrough
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

better_prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            """You are a world-class solutions engineer and amazing at question-answering tasks. You are also a world-class API
            integrations specialist and deeply understand OpenAPI Specification. You also deeply understand computer security which
            allows you to produce secure code examples and guidance.
            
            You strictly follow these rules:
            - If you don't know the answer, just say that you don't know.
            - Use markdown to ensure code blocks and commands are properly formatted.
            - Make sure to always include environment setup instructions and code blocks that are helpful for a developer to copy-paste.
            - Always explain inputs and outputs of API requests. Be as detailed as possible.
            - When presented with an OpenAPI Specification and asked questions about it, always include necessary steps to
            authenticate with the endpoint.
            - When presented with multiple authentication options, make sure to guide the customer through both methods of authentication in code
            - Always name variables for security credentials based on standard protocols, API name, or descriptions
            - For example if the following security scheme is:
  "securitySchemes:
    basic:
      type: http
      description: HTTP basic authorization using a workflow token and secret
      scheme: Basic"

or

    "oauth2:
      type: oauth2
      x-default: viSRrSuUJEid8u0l3dyRTj5ATsWpHX9ShD51TH3j
      description: Oauth2 using a workflow token and secret to generate a bearer token
      flows:
        clientCredentials:
          tokenUrl: /oauth/bearer"
      
              Then you would name the variables in code as "ALLOY_WORKFLOW_TOKEN" and "ALLOY_WORKFLOW_SECRET".
            - When referencing security credentials, always pull values from environment variables.
            
            Answer the user's questions based on the below context:\n\n{context}""",
        ),
        (
            "user",
            "{question}"
        )
    ]
)

def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | better_prompt
    | chat
    | StrOutputParser()
)

In [53]:
from IPython.display import display, Markdown
output = rag_chain.invoke(query)
display(Markdown(output))

To create a journey application using the Alloy API in Java, you'll need to set up your environment and authenticate with the API. You can choose between Basic Authentication and OAuth2 for authentication. Below, I'll guide you through both methods.

### Prerequisites
1. Java Development Kit (JDK) - Ensure you have Java installed on your system. You can download it from [Oracle's JDK Download Page](https://www.oracle.com/java/technologies/javase-jdk11-downloads.html).
2. An IDE or a simple text editor for writing your Java code.
3. A library for making HTTP requests. In this example, I'll use `OkHttp` for making requests. You can add it to your project using Maven or Gradle.

### Adding Dependencies
If you are using Maven, add the following dependencies to your `pom.xml`:

```xml
<dependencies>
    <dependency>
        <groupId>com.squareup.okhttp3</groupId>
        <artifactId>okhttp</artifactId>
        <version>4.9.0</version>
    </dependency>
</dependencies>
```

For Gradle, add this to your `build.gradle`:

```gradle
dependencies {
    implementation 'com.squareup.okhttp3:okhttp:4.9.0'
}
```

### Environment Setup
Set up environment variables for your credentials. This is a secure way to handle sensitive data like API keys and tokens.

```bash
export ALLOY_WORKFLOW_TOKEN="your_workflow_token"
export ALLOY_WORKFLOW_SECRET="your_workflow_secret"
```

### Code Example
Here’s how you can write the Java code to create a journey application. I'll provide examples for both Basic Authentication and OAuth2.

#### Using Basic Authentication

```java
import okhttp3.Credentials;
import okhttp3.MediaType;
import okhttp3.OkHttpClient;
import okhttp3.Request;
import okhttp3.RequestBody;
import okhttp3.Response;

public class AlloyApiExample {
    public static void main(String[] args) {
        try {
            OkHttpClient client = new OkHttpClient();

            String credentials = Credentials.basic(System.getenv("ALLOY_WORKFLOW_TOKEN"), System.getenv("ALLOY_WORKFLOW_SECRET"));
            MediaType mediaType = MediaType.parse("application/json");
            RequestBody body = RequestBody.create(mediaType, "{\n    \"entities\": [\n        {\n            \"branch_name\": \"persons\",\n            \"data\": {\n                \"name_first\": \"John\",\n                \"name_last\": \"Doe\",\n                \"birth_date\": \"1990-01-25\",\n                \"email_address\": \"john@alloy.com\"\n            },\n            \"entity_type\": \"person\"\n        }\n    ]\n}");
            Request request = new Request.Builder()
                .url("https://demo-qasandbox.alloy.co/v1/journeys/{journey_token}/applications")
                .post(body)
                .addHeader("Authorization", credentials)
                .addHeader("Content-Type", "application/json")
                .build();

            Response response = client.newCall(request).execute();
            System.out.println(response.body().string());
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
```

#### Using OAuth2

First, you need to obtain a bearer token using the client credentials flow:

```java
import okhttp3.*;

public class OAuthTokenFetcher {
    public static String fetchAccessToken() throws Exception {
        OkHttpClient client = new OkHttpClient();
        String credentials = Credentials.basic(System.getenv("ALLOY_WORKFLOW_TOKEN"), System.getenv("ALLOY_WORKFLOW_SECRET"));
        RequestBody body = RequestBody.create(MediaType.parse("application/x-www-form-urlencoded"), "grant_type=client_credentials");
        Request request = new Request.Builder()
            .url("https://demo-qasandbox.alloy.co/oauth/bearer")
            .post(body)
            .addHeader("Authorization", credentials)
            .build();

        Response response = client.newCall(request).execute();
        // Extract the access token from the response body
        // This is a simplified example, you should parse the JSON response
        return "extracted_access_token";
    }
}
```

Then use the token to make the API request:

```java
public class AlloyApiOAuthExample {
    public static void main(String[] args) {
        try {
            String accessToken = OAuthTokenFetcher.fetchAccessToken();
            OkHttpClient client = new OkHttpClient();

            MediaType mediaType = MediaType.parse("application/json");
            RequestBody body = RequestBody.create(mediaType, "{\n    \"entities\": [\n        {\n            \"branch_name\": \"persons\",\n            \"data\": {\n                \"name_first\": \"John\",\n                \"name_last\": \"Doe\",\n                \"birth_date\": \"1990-01-25\",\n                \"email_address\": \"john@alloy.com\"\n            },\n            \"entity_type\": \"person\"\n        }\n    ]\n}");
            Request request = new Request.Builder()
                .url("https://demo-qasandbox.alloy.co/v1/journeys/{journey_token}/applications")
                .post(body)
                .addHeader("Authorization", "Bearer " + accessToken)
                .addHeader("Content-Type", "application/json")
                .build();

            Response response = client.newCall(request).execute();
            System.out.println(response.body().string());
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
```

### Explanation
- Replace `{journey_token}` with the actual journey token you want to use.
- The JSON body in the `RequestBody.create` method should be modified according to the specific requirements of your journey application.

This setup should help you successfully create a journey application using the Alloy API in Java.

In [54]:
query = "How do I create a journey application in TypeScript?"

In [55]:
from IPython.display import display, Markdown
output = rag_chain.invoke(query)
display(Markdown(output))

To create a journey application using the Alloy API in TypeScript, you'll need to follow these steps. This involves setting up your environment, handling authentication, and making the API request. Below, I'll guide you through both Basic and OAuth2 authentication methods as specified in the API documentation.

### Environment Setup

1. **Install Node.js**: Ensure Node.js is installed in your environment. You can download it from [nodejs.org](https://nodejs.org/).

2. **Setup TypeScript**: Install TypeScript globally using npm if you haven't already:
   ```bash
   npm install -g typescript
   ```

3. **Create a new project**:
   ```bash
   mkdir alloy-journey-application
   cd alloy-journey-application
   npm init -y
   tsc --init
   ```

4. **Install required packages**:
   - `axios` for making HTTP requests.
   - `dotenv` for managing environment variables.

   ```bash
   npm install axios dotenv
   npm install @types/axios @types/node --save-dev
   ```

5. **Setup your environment variables**:
   Create a `.env` file in your project root and add the following (replace placeholders with actual values):
   ```plaintext
   ALLOY_WORKFLOW_TOKEN=your_workflow_token
   ALLOY_WORKFLOW_SECRET=your_workflow_secret
   ```

### TypeScript Code Example

Create a file named `createJourneyApplication.ts` and add the following TypeScript code. This example includes both Basic and OAuth2 authentication methods.

```typescript
import axios from 'axios';
import * as dotenv from 'dotenv';

dotenv.config();

const apiUrl = 'https://demo-qasandbox.alloy.co/v1/journeys/{journey_token}/applications';
const journeyToken = 'your_journey_token'; // Replace with your actual journey token

// Replace placeholders in apiUrl
const endpoint = apiUrl.replace('{journey_token}', journeyToken);

// Data to be sent in the request body
const requestData = {
  do_await_additional_entities: false,
  entities: [
    {
      branch_name: 'persons',
      data: {
        addresses: [
          {
            city: "New York",
            country_code: "US",
            line_1: "41 E. 11th",
            line_2: "2nd floor",
            postal_code: "10003",
            state: "NY",
            type: "primary"
          }
        ],
        birth_date: "1990-01-25",
        document_ssn: 111223333,
        email_address: "john@alloy.com",
        ip_address_v4: "42.206.213.70",
        meta: {
          user_type: "vip"
        },
        name_first: "John",
        name_last: "Doe",
        name_middle: "Franklin",
        phone_number: 8443825569
      },
      entity_type: "person",
      external_entity_id: "my_system_entity_id_123"
    }
  ]
};

// Function to perform Basic Authentication
async function createJourneyApplicationBasicAuth() {
  const auth = {
    username: process.env.ALLOY_WORKFLOW_TOKEN!,
    password: process.env.ALLOY_WORKFLOW_SECRET!
  };

  try {
    const response = await axios.post(endpoint, requestData, { auth });
    console.log('Response:', response.data);
  } catch (error) {
    console.error('Error:', error.response?.data || error.message);
  }
}

// Function to perform OAuth2 Authentication
async function createJourneyApplicationOAuth2() {
  const tokenResponse = await axios.post('https://demo-qasandbox.alloy.co/oauth/bearer', {}, {
    auth: {
      username: process.env.ALLOY_WORKFLOW_TOKEN!,
      password: process.env.ALLOY_WORKFLOW_SECRET!
    }
  });

  const accessToken = tokenResponse.data.access_token;

  try {
    const response = await axios.post(endpoint, requestData, {
      headers: { Authorization: `Bearer ${accessToken}` }
    });
    console.log('Response:', response.data);
  } catch (error) {
    console.error('Error:', error.response?.data || error.message);
  }
}

// Uncomment the function you want to use:
// createJourneyApplicationBasicAuth();
// createJourneyApplicationOAuth2();
```

### Running the Code

Compile and run your TypeScript file:
```bash
tsc createJourneyApplication.ts
node createJourneyApplication.js
```

This script will send a POST request to create a journey application with the specified data. Make sure to handle the response and errors appropriately based on your application's needs.

### 2️⃣ Add documentation and custom configuration to input

```mermaid
graph LR
    alloy.com.yaml --> Context
    HWD[Hand-written Documentation] --> Context
    JSON[JSON Payload] --> Context
    Context --> Chat
    Query[How do I create a journey in Python?] --> Chat
    Chat --> Guide

```

In [56]:
import yaml
import jsonref
from jsonref import replace_refs
from langchain_core.documents.base import Document
from copy import deepcopy
from pprint import pprint

with open("alloy.com.yaml") as f:
    spec = f.read()

def chunk_openapi_by_operation(openapi: str):
    parsed = yaml.safe_load(openapi)

    operations: list[tuple[str, str]] = []
    # 1) list all operations by (path, HTTP method)
    for path, methods in parsed['paths'].items():
        for method in methods.keys():
            # if method is not an HTTP method then skip
            if method.lower() not in ['get', 'post', 'put', 'delete', 'patch', 'head', 'options', 'trace']:
                continue
            operations.append((path, method))

    # 2) create a chunk for every operation

    # 2.a) Dereference entire OpenAPI Spec
    dereferenced = replace_refs(parsed, lazy_load=False)

    chunks = []
    for operation in operations:
        path = operation[0]
        method = operation[1]
        chunk = deepcopy(dereferenced)
        # use the operation's first tag if it has one; computed fresh for each
        # operation so a previous iteration's tag never leaks into this chunk
        tags = chunk['paths'][path][method].get('tags') or []
        tag_name = tags[0] if tags else ""

        # delete all tags on OAS except tag for this operation
        if 'tags' in chunk:
            chunk['tags'] = [tag for tag in chunk['tags'] if tag['name'] == tag_name]

        if "summary" in chunk['paths'][path][method]:
            summary = chunk['paths'][path][method]['summary']
        else:
            summary = ""

        if "description" in chunk['paths'][path][method]:
            description = chunk['paths'][path][method]['description']
        else:
            description = ""

        # delete other operations
        for other_operation in operations:
            if other_operation[0] == operation[0]:
                continue
            if other_operation[0] in chunk['paths']:
                del chunk['paths'][other_operation[0]]

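        # delete empty paths (iterate over a copy of the keys, since deleting
        # from a dict while iterating over it raises RuntimeError)
        for other_path in list(chunk['paths'].keys()):
            if not chunk['paths'][other_path]:
                del chunk['paths'][other_path]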

        # delete other operations under same path
        keys = list(chunk['paths'][operation[0]].keys())
        for method in keys:
            if operation[1] == method:
                continue
            del chunk['paths'][operation[0]][method]

        # delete all components besides securitySchemes (should be inlined from 2.a)
        if "components" in chunk:
            # iterate over a copy of the keys so entries can be deleted safely
            for key in list(chunk["components"]):
                if key == "securitySchemes":
                    continue
                del chunk['components'][key]
        
        chunks.append(({
            "path": operation[0],
            "method": operation[1],
            "openapi": yaml.dump(chunk),
            "tag": tag_name,
            "summary": summary,
            "description": description
        }))
    return list(map(lambda chunk: Document(page_content=chunk["openapi"], metadata={
        "path": chunk["path"],
        "method": chunk["method"],
        "tag": chunk["tag"],
        "summary": chunk["summary"],
        "description": chunk["description"]
    }), chunks))
chunks = chunk_openapi_by_operation(spec)
# for chunk in chunks:
#     print(len(yaml.safe_load(chunk.page_content)['paths']))
print(len(chunks))

72
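
To make the per-operation chunking concrete, here is a minimal sketch of its two core steps on a toy spec: listing `(path, method)` pairs and pruning a copy of the spec down to one operation. The toy paths and the helper names `list_operations` and `prune_to_operation` are hypothetical; dereferencing, tag filtering, and component pruning from the cell above are left out.

```python
from copy import deepcopy

HTTP_METHODS = {"get", "post", "put", "delete", "patch", "head", "options", "trace"}


def list_operations(spec: dict) -> list[tuple[str, str]]:
    """Return (path, method) pairs, skipping non-method keys such as 'parameters'."""
    return [
        (path, method)
        for path, item in spec.get("paths", {}).items()
        for method in item
        if method.lower() in HTTP_METHODS
    ]


def prune_to_operation(spec: dict, path: str, method: str) -> dict:
    """Copy the spec and keep only the given operation under 'paths'."""
    pruned = deepcopy(spec)
    pruned["paths"] = {path: {method: pruned["paths"][path][method]}}
    return pruned


# Toy spec; these paths are made up and not taken from alloy.com.yaml.
toy_spec = {
    "info": {"title": "Toy API"},
    "paths": {
        "/widgets": {
            "get": {"summary": "List widgets"},
            "post": {"summary": "Create a widget"},
        },
        "/widgets/{id}": {"parameters": [], "delete": {"summary": "Delete a widget"}},
    },
}

operations = list_operations(toy_spec)
toy_chunks = [prune_to_operation(toy_spec, path, method) for path, method in operations]
print(operations)
```

Each resulting chunk carries the shared top-level fields plus exactly one operation, which is what lets the retriever return a small, self-contained slice of the spec.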


In [57]:
import phoenix as px

# Launch phoenix
session = px.launch_app()

# With the Phoenix server running, instrument LangChain so that traces from the chains below are sent to it:

from phoenix.trace.langchain import LangChainInstrumentor

# By default, the traces will be exported to the locally running Phoenix server.
LangChainInstrumentor().instrument()

session.url

🌍 To view the Phoenix app in your browser, visit http://localhost:6006/
📺 To view the Phoenix app in a notebook, run `px.active_session().view()`
📖 For more information on how to use Phoenix, check out https://docs.arize.com/phoenix


'http://localhost:6006/'

In [58]:
from langchain.chains.query_constructor.base import AttributeInfo
from langchain.retrievers.self_query.base import SelfQueryRetriever
from langchain_community.vectorstores.chroma import Chroma
from langchain_openai import ChatOpenAI
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter
import os

# for chunk in chunks:
#     print(len(yaml.safe_load(chunk.page_content)['paths']))

query = "How do I create a journey application in Python?"

chat = ChatOpenAI(model="gpt-4-turbo", temperature=0, model_kwargs={"seed": 0})

# Reset the collection to remove embeddings from previous runs
Chroma().delete_collection()

vectorstore = Chroma.from_documents(chunks, OpenAIEmbeddings())

In [59]:
metadata_field_info = [
    AttributeInfo(
        name="path",
        description="The subpath for this operation",
        type="string",
    ),
    AttributeInfo(
        name="method",
        description="The HTTP Method for this operation",
        type="string",
    ),
    AttributeInfo(
        name="tag",
        description="The logical grouping for this API operation",
        type="string",
    ),
    AttributeInfo(
        name="summary",
        description="A short description of this operation's functionality",
        type="string",
    ),
    AttributeInfo(
        name="description",
        description="A more detailed description of this operation's functionality",
        type="string",
    ),
]
document_content_description = "The pruned OpenAPI specification which includes only the relevant information for a particular operation in the OpenAPI specification."

# Dylan: really stupid, but I had to upgrade langchain to get this to work for some reason
retriever = SelfQueryRetriever.from_llm(
    chat, vectorstore, document_content_description, metadata_field_info, verbose=True, search_kwargs={"k": 2}
)

In [60]:
from operator import itemgetter
from langchain_core.runnables import RunnablePassthrough, RunnableLambda
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.documents.base import Document
from pprint import pprint
import json

better_prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            """You are a world-class solutions engineer and amazing at question-answering tasks. You are also a world-class API
            integrations specialist and deeply understand OpenAPI Specification. You also deeply understand computer security which
            allows you to produce secure code examples and guidance. You are also a world-class technical documentation writer and make sure
            to include any details, edge cases, or options that the customer might consider.
            
            You strictly follow these rules:
            - If you don't know the answer, just say that you don't know.
            - Use markdown to ensure code blocks and commands are properly formatted.
            - Make sure to always include environment setup instructions and code blocks that are helpful for a developer to copy-paste.
            - Always explain inputs and outputs of API requests. Be as detailed as possible.
            - When presented with an OpenAPI Specification and asked questions about it, always include necessary steps to
            authenticate with the endpoint.
            - When presented with multiple authentication options, make sure to guide the customer through both methods
            - Always name variables for security credentials based on standard protocols, API name, or descriptions
            - Make sure to include sections about required parameters 
            - If optional parameters are provided, explain them in detail should the customer need them.
            - For example if the following security scheme is:
  "securitySchemes:
    basic:
      type: http
      description: HTTP basic authorization using a workflow token and secret
      scheme: Basic"

or

    "oauth2:
      type: oauth2
      x-default: viSRrSuUJEid8u0l3dyRTj5ATsWpHX9ShD51TH3j
      description: Oauth2 using a workflow token and secret to generate a bearer token
      flows:
        clientCredentials:
          tokenUrl: /oauth/bearer"
      
              Then you would name the variables in code as "ALLOY_WORKFLOW_TOKEN" and "ALLOY_WORKFLOW_SECRET".
            - When referencing security credentials, always pull values from environment variables.
            
            Answer the user's questions based on the below context:\n\n

            OpenAPI Specification:
            {openapi}

            Hand-written documentation:
            {hwd}

            Configurations:
            {configurations}
            """,
        ),
        (
            "user",
            "{question}"
        )
    ]
)

with open("creating-a-journey.md") as f:
    hwd = f.read()
    
def format_docs(docs: list[Document]):
    return "\n\n".join(doc.page_content for doc in docs)

def format_hwd(docs: list[str]):
    return "\n\n".join(docs)

def format_configurations(configurations: list[dict]):
    return "\n\n".join(map(json.dumps, configurations))

def log_prompt(chat_prompt):
    # print the fully rendered prompt once, then pass it through unchanged
    print(chat_prompt.to_string())
    return chat_prompt

rag_chain = (
    {
        "openapi": itemgetter("question") | retriever | format_docs,
        "question": itemgetter("question"),
        "hwd": itemgetter("documentation") | RunnableLambda(format_hwd),
        "configurations": itemgetter("configurations") | RunnableLambda(format_configurations)
    }
    | better_prompt
    | chat
    | StrOutputParser()
)
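
The dict at the head of the chain fans the same input out to every branch, and each branch picks the key it needs. A plain-Python analogue, with the retriever replaced by a stub, may make the data flow easier to follow. `build_prompt_variables` and `fake_retrieve` are hypothetical names used only for this sketch.

```python
import json
from operator import itemgetter


def format_hwd(docs: list[str]) -> str:
    return "\n\n".join(docs)


def format_configurations(configurations: list[dict]) -> str:
    return "\n\n".join(map(json.dumps, configurations))


def build_prompt_variables(inputs: dict, retrieve) -> dict:
    """Every branch receives the same input dict and selects its own key."""
    question = itemgetter("question")(inputs)
    return {
        "openapi": retrieve(question),
        "question": question,
        "hwd": format_hwd(itemgetter("documentation")(inputs)),
        "configurations": format_configurations(itemgetter("configurations")(inputs)),
    }


def fake_retrieve(question: str) -> str:
    """Stub standing in for `retriever | format_docs` (assumption for this sketch)."""
    return f"<spec chunks relevant to: {question}>"


variables = build_prompt_variables(
    {
        "question": "How do I create a journey application in Python?",
        "documentation": ["doc A", "doc B"],
        "configurations": [{"journey_version": "1"}],
    },
    fake_retrieve,
)
print(variables["configurations"])
```

The keys of the returned dict match the `{openapi}`, `{hwd}`, `{configurations}`, and `{question}` placeholders in the prompt template.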

In [61]:
from IPython.display import display, Markdown

with open("creating-a-journey.md") as f:
    documentation = f.read()

configuration = {
  "journey_token": "J-VCQoADBJxeHtmdAvFqoS",
  "journey_version": "1",
  "timestamp": 1632400000,
  "data": [
    {
      "branch_name": "branch1",
      "workflows": [
        {
          "workflow_name": "workflow1",
          "parameters": {
            "required": [
              {
                "gender": True,
                "addresses": {
                  "state": True
                }
              }
            ],
            "optional": [
              {
                "document_license": True
              }
            ],
            "or": []
          }
        }
      ]
    }
  ]
}

output = rag_chain.invoke({"question": query, "documentation": [documentation], "configurations": [configuration]})
display(Markdown(output))

To create a Journey Application using the Alloy API in Python, you'll need to follow these steps. This involves setting up your environment, handling authentication, and making the API request with the necessary parameters.

### Environment Setup

First, ensure you have Python installed on your system. You can download Python from [python.org](https://www.python.org/downloads/). You will also need the `requests` library, which can be installed using pip:

```bash
pip install requests
```

### Handling Authentication

The Alloy API provides two methods for authentication: Basic Auth and OAuth2. Below, I'll show you how to authenticate using both methods.

#### Using Basic Authentication

For Basic Auth, you need your workflow token and secret. These should be stored securely and not hard-coded in your scripts. Here’s how you can use Basic Auth:

```python
import requests
from requests.auth import HTTPBasicAuth
import os

# Environment variables for security credentials
ALLOY_WORKFLOW_TOKEN = os.getenv('ALLOY_WORKFLOW_TOKEN')
ALLOY_WORKFLOW_SECRET = os.getenv('ALLOY_WORKFLOW_SECRET')

# API endpoint
url = "https://demo-qasandbox.alloy.co/v1/journeys/{journey_token}/applications"

# Replace {journey_token} with your actual journey token
url = url.format(journey_token="J-VCQoADBJxeHtmdAvFqoS")

# Request headers
headers = {
    "Content-Type": "application/json"
}

# Request body
payload = {
    "entities": [
        {
            "branch_name": "persons",
            "data": {
                "name_first": "John",
                "name_last": "Doe",
                "birth_date": "1990-01-25",
                "document_ssn": 111223333,
                "email_address": "john@alloy.com",
                "ip_address_v4": "42.206.213.70",
                "phone_number": 8443825569,
                "addresses": [
                    {
                        "city": "New York",
                        "country_code": "US",
                        "line_1": "41 E. 11th",
                        "line_2": "2nd floor",
                        "postal_code": "10003",
                        "state": "NY",
                        "type": "primary"
                    }
                ]
            },
            "entity_type": "person"
        }
    ]
}

# Make the POST request
response = requests.post(url, json=payload, headers=headers, auth=HTTPBasicAuth(ALLOY_WORKFLOW_TOKEN, ALLOY_WORKFLOW_SECRET))

# Print the response
print(response.json())
```

#### Using OAuth2 Authentication

For OAuth2, you'll need to obtain a bearer token first:

```python
# Token endpoint
token_url = "https://demo-qasandbox.alloy.co/v1/oauth/bearer"

# Obtain the bearer token
token_response = requests.post(token_url, auth=HTTPBasicAuth(ALLOY_WORKFLOW_TOKEN, ALLOY_WORKFLOW_SECRET))
access_token = token_response.json().get('access_token')

# Headers with the bearer token
headers['Authorization'] = f"Bearer {access_token}"

# Make the POST request with the bearer token
response = requests.post(url, json=payload, headers=headers)

# Print the response
print(response.json())
```

### Required Parameters

Based on the provided configurations and the `Get Journey Parameters` endpoint, you must include the `gender` and `addresses.state` in your request payload under `entities.data`. These are marked as required. Optional parameters like `document_license` can also be included based on your application needs.

This setup should allow you to successfully create a Journey Application using the Alloy API in Python. Adjust the payload according to the specific requirements and parameters of your journey configuration.

### 👎 This guide doesn't do a good job of explaining important details
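
One concrete gap: the guide names `gender` as required in its prose, but its example payload never sets it. A small check like the sketch below could flag that automatically. It assumes the configuration's `required` entries are nested dicts of boolean flags, as in the cell above; the helper names are hypothetical.

```python
def flatten_flags(node: dict, prefix: str = "") -> list[str]:
    """Turn {"addresses": {"state": True}} into ["addresses.state"]."""
    paths = []
    for key, value in node.items():
        dotted = f"{prefix}{key}"
        if isinstance(value, dict):
            paths.extend(flatten_flags(value, prefix=dotted + "."))
        elif value:
            paths.append(dotted)
    return paths


def has_path(data, dotted: str) -> bool:
    """True if the dotted path exists; a list matches if any element matches."""
    if isinstance(data, list):
        return any(has_path(item, dotted) for item in data)
    if not isinstance(data, dict):
        return False
    head, _, rest = dotted.partition(".")
    if head not in data:
        return False
    return has_path(data[head], rest) if rest else True


def missing_required(parameters: dict, entity_data: dict) -> list[str]:
    """List required dotted paths that the entity data does not provide."""
    required = [p for group in parameters.get("required", []) for p in flatten_flags(group)]
    return [p for p in required if not has_path(entity_data, p)]


parameters = {
    "required": [{"gender": True, "addresses": {"state": True}}],
    "optional": [{"document_license": True}],
    "or": [],
}
# Abbreviated version of the entity data from the generated payload.
entity_data = {
    "name_first": "John",
    "name_last": "Doe",
    "addresses": [{"city": "New York", "state": "NY"}],
}

print(missing_required(parameters, entity_data))  # → ['gender']
```

A check of this kind could run against generated payloads as a cheap signal of whether a guide actually reflects the journey configuration.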

### 🤔 Can we use [agentic workflows](https://www.youtube.com/watch?v=sal78ACtGTc&t=735s) to do better?

 - Agentic workflow YouTube video (https://www.youtube.com/watch?v=sal78ACtGTc)
 - LangChain docs (LCEL, 2-3 use cases under the use-case section)
 - RAPTOR paper YouTube video (https://www.youtube.com/watch?v=jbGchdTL7d0)
 - Phoenix Evals (https://docs.arize.com/phoenix/evaluation/llm-evals)
 - If time permits or it interests you, learn about LangGraph (https://python.langchain.com/docs/langgraph/)