In [2]:
pip install -r requirements.txt

Defaulting to user installation because normal site-packages is not writeable
You should consider upgrading via the '/Library/Developer/CommandLineTools/usr/bin/python3 -m pip install --upgrade pip' command.
Note: you may need to restart the kernel to use updated packages.


In [3]:
import os 

from IPython.display import display, Markdown, Latex
from langchain.text_splitter import MarkdownHeaderTextSplitter
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings, ChatOpenAI

from utils import set_keys, load_files



In [4]:
DB_PATH = "choreo_docs.db"

# Set the OpenAI API key
set_keys()

## Building AI Assistant

First, let's call the LLM directly, without any documentation, and see how it answers a sample question.

In [5]:
question = "how to fix endpoint.yaml missing error"

prompt = f"""You are an assistant to the Choreo users, to help them to use Choreo platform by answering them on how to use Choreo.
Question: {question}
"""

llm = ChatOpenAI(model_name='gpt-3.5-turbo')
answer = llm.invoke(prompt).content

display(Markdown(answer))

If you are encountering an "endpoint.yaml missing" error while using Choreo, it means that the configuration file for your API endpoints is not present in the designated location. To fix this error, you will need to create an endpoint.yaml file and properly configure it with the necessary endpoints for your APIs.

Here's a step-by-step guide on how to fix the "endpoint.yaml missing" error:

1. Create a new file named "endpoint.yaml" in the root directory of your Choreo project.
2. Open the endpoint.yaml file in a text editor and add the necessary API endpoints in the following format:

```yaml
endpoints:
  - name: endpoint1
    url: https://api.example.com/endpoint1
  - name: endpoint2
    url: https://api.example.com/endpoint2
```

3. Save the changes to the endpoint.yaml file.
4. Restart the Choreo platform and run your Choreo project again.

By following these steps, you should be able to fix the "endpoint.yaml missing" error and successfully use Choreo with the configured API endpoints. If you continue to encounter any issues, feel free to reach out to the Choreo support team for further assistance.

The answer is not correct: the model has no knowledge of Choreo's actual configuration format, so it invents a plausible-looking file. To ground the answers, we will supply the Choreo docs to the LLM as context.


To do so, let's use a vector store (Chroma DB) and index all the docs in it, using OpenAI embeddings.

In [6]:

# Split each Markdown file into sections at level 1-3 headers
markdown_splitter = MarkdownHeaderTextSplitter(
    headers_to_split_on=[("#", "Header1"), ("##", "Header2"), ("###", "Header3")]
)

sections = []
files = load_files("data/")
for file in files:
    document = file[1].read()
    sections += markdown_splitter.split_text(document)

len(sections), sections[4:7]

(705,
 [Document(page_content='An organization is a logical grouping of users and their resources. It may represent a company, community, or a single user. Users can belong to multiple organizations, and each organization can have different roles assigned to its users to control access to Choreo features.', metadata={'Header1': 'Frequently Asked Questions', 'Header2': 'General', 'Header3': 'Q: What is an organization in Choreo?'}),
  Document(page_content='A project is a logical grouping of related components to help you organize your work. Each project provides runtime isolation through namespaces when you deploy components.', metadata={'Header1': 'Frequently Asked Questions', 'Header2': 'General', 'Header3': 'Q: What is a project in Choreo?'}),
  Document(page_content='A component is a workload designed to run on Choreo. Examples of components include integrations, APIs, microservices, manual/scheduled jobs, web apps, and triggers.', metadata={'Header1': 'Frequently Asked Questions',
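
To make the splitting step concrete, here is a simplified, dependency-free sketch of header-based splitting. It is an illustration only, not the `MarkdownHeaderTextSplitter` implementation: the function name `split_by_headers` is hypothetical, and it ignores details such as `#` characters inside fenced code blocks.

```python
import re


def split_by_headers(markdown_text, max_level=3):
    """Split Markdown into (metadata, text) sections at headers up to max_level."""
    header_re = re.compile(r"^(#{1,6})\s+(.*)$")
    sections = []
    current_headers = {}  # header level -> title for the active header path
    buffer = []

    def flush():
        # Emit the buffered text under the header path that was active for it.
        text = "\n".join(buffer).strip()
        if text:
            metadata = {f"Header{lvl}": title for lvl, title in sorted(current_headers.items())}
            sections.append((metadata, text))
        buffer.clear()

    for line in markdown_text.splitlines():
        match = header_re.match(line)
        if match and len(match.group(1)) <= max_level:
            flush()
            level = len(match.group(1))
            # A new header invalidates headers at the same or deeper levels.
            for lvl in [l for l in current_headers if l >= level]:
                del current_headers[lvl]
            current_headers[level] = match.group(2).strip()
        else:
            buffer.append(line)
    flush()
    return sections


sample = (
    "# FAQ\n## General\n"
    "### Q: What is a project?\nA project groups components.\n"
    "### Q: What is a component?\nA component is a workload.\n"
)
for metadata, text in split_by_headers(sample):
    print(metadata, "->", text)
```

Each section keeps the header path it was found under, which is what ends up in the `metadata` of the `Document` objects shown above.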

In [8]:
# Store embeddings in the database if it does not already exist (to avoid the cost of re-embedding the documents)
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
if not os.path.exists(DB_PATH):
    db = Chroma.from_documents(sections, embeddings, persist_directory=DB_PATH)
else:
    db = Chroma(persist_directory=DB_PATH, embedding_function=embeddings)

Now let's try to retrieve the docs related to the given question using similarity search based on OpenAI embeddings.

In [9]:
docs = db.similarity_search(question, k=3)

docs[0].page_content

'The `endpoints.yaml` file has a specific structure and contains the following details:  \n| Field                | Required     | Description                                                                      |\n|----------------------|--------------|----------------------------------------------------------------------------------|\n| **version**          | Required     | The version of the `endpoints.yaml` file.                                           |\n| **name**             | Required     | A unique name for the endpoint, which Choreo will use to generate the managed API.|\n| **port**             | Required     | The numeric port value that gets exposed via this endpoint.                      |\n| **type**             | Required     | The type of traffic this endpoint is accepting, such as `REST`, `GraphQL`, `gRPC`, `UDP`or `TCP`. Currently, the MI preset supports only the `REST` type.                                         |\n| **networkVisibility**| Required     | The netw
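
As a rough illustration of what a similarity search does, the sketch below ranks a few toy vectors against a query vector by cosine similarity. The vectors, names, and the `top_k` helper are made up for the example; real embeddings have many more dimensions, and the exact distance metric Chroma uses depends on its configuration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=3):
    """Return the names of the k documents most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), name) for name, vec in doc_vecs.items()]
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]


# Toy 3-dimensional "embeddings" (hypothetical values, for illustration only).
toy_index = {
    "endpoints.yaml structure": [0.9, 0.1, 0.0],
    "organizations FAQ": [0.1, 0.9, 0.2],
    "billing": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]

print(top_k(query, toy_index, k=2))
```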

Now write a new prompt that includes the retrieved docs as the information to be used when generating the answer.

In [17]:
prompt_template = """You are an assistant to the Choreo users, to help them to use Choreo platform by using the given information to answer their questions. You can also help the users by generating code snippets from the given information.

Information: {context}

Question: {question}
"""
prompt = prompt_template.format(context=docs, question=question)

display(Markdown(prompt))

You are an assistant to the Choreo users, to help them to use Choreo platform by using the given information to answer their questions. You can also help the users by generating code snippets from the given information.

Information: [Document(page_content='The `endpoints.yaml` file has a specific structure and contains the following details:  \n| Field                | Required     | Description                                                                      |\n|----------------------|--------------|----------------------------------------------------------------------------------|\n| **version**          | Required     | The version of the `endpoints.yaml` file.                                           |\n| **name**             | Required     | A unique name for the endpoint, which Choreo will use to generate the managed API.|\n| **port**             | Required     | The numeric port value that gets exposed via this endpoint.                      |\n| **type**             | Required     | The type of traffic this endpoint is accepting, such as `REST`, `GraphQL`, `gRPC`, `UDP`or `TCP`. Currently, the MI preset supports only the `REST` type.                                         |\n| **networkVisibility**| Required     | The network level visibility of this endpoint, which defaults to `Project` if not specified. Accepted values are `Project`, `Organization`, or `Public`.|\n| **context**          | Required     | The context (base path) of the API that Choreo exposes via this endpoint.        |\n| **schemaFilePath**   | Required     |  The swagger definition file path. Defaults to the wildcard route if not provided. This field should be a relative path to the project path when using the **Java**, **Python**, **NodeJS**, **Go**, **PHP**, **Ruby**, and **WSO2 MI** buildpacks. For REST endpoint types, when using the **Ballerina** or **Dockerfile** buildpack, this field should be a relative path to the component root or Docker context.|  \n#### Sample endpoints.yaml  \n**File location**:  \n```bash\n<docker-build-context-path>/.choreo/endpoints.yaml\n```  \n!!! 
note\n- For components built with Ballerina buildpack `docker-build-context-path` should be replaced with `component-root`.\nFor example: `<component-root>/.choreo/endpoints.yaml`  \n- For components built with WSO2 MI buildpack `docker-build-context-path` should be replaced with `<Project Path>`.\nFor example: `<Project Path>/.choreo/endpoints.yaml`  \n**File content**:  \n```yaml\n# +required Version of the endpoint configuration YAML\nversion: 0.1\n\n# +required List of endpoints to create\nendpoints:\n# +required Unique name for the endpoint. (This name will be used when generating the managed API)\n- name: Greeting Service\n# +required Numeric port value that gets exposed via this endpoint\nport: 9090\n# +required Type of the traffic this endpoint is accepting. Example: REST, GraphQL, etc.\n# Allowed values: REST, GraphQL, GRPC, UDP, TCP\ntype: REST\n# +optional Network level visibility of this endpoint. Defaults to Project\n# Accepted values: Project|Organization|Public.\nnetworkVisibility: Project\n# +optional Context (base path) of the API that is exposed via this endpoint.\n# This is mandatory if the endpoint type is set to REST or GraphQL.\ncontext: /greeting\n# +optional Path to the schema definition file. Defaults to wild card route if not provided\n# This is only applicable to REST endpoint types.\n# The path should be relative to the docker context.\nschemaFilePath: greeting_openapi.yaml\n```', metadata={'Header2': 'Configure endpoints', 'Header3': 'Learn the endpoints.yaml file'}), Document(page_content='The method of defining endpoints depends on the buildpack. For buildpacks other than `Ballerina` and `WSO2 MI`, it is required to have an `endpoints.yaml` file in project root directory to create the Service component.', metadata={'Header2': 'Configure endpoints'}), Document(page_content='{% include "configure-endpoints-body.md" %}', metadata={'Header1': 'Configure Endpoints'})]

Question: how to fix endpoint.yaml missing error
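
Note that the prompt above contains the Python repr of the `Document` objects, including escaped newlines and metadata dictionaries. One possible alternative, sketched below, is to render each chunk as plain text prefixed by its header path. `Chunk` is a hypothetical stand-in for a LangChain `Document` (it only mirrors the `page_content` and `metadata` attributes) so that the block runs without any dependencies, and `format_context` is an illustrative name.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """Hypothetical stand-in for a LangChain Document (same two attributes)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_context(chunks):
    """Render chunks as plain text, each prefixed by its header path if it has one."""
    parts = []
    for chunk in chunks:
        # Join Header1 > Header2 > Header3 when present, in that order.
        path = " > ".join(
            chunk.metadata[key]
            for key in ("Header1", "Header2", "Header3")
            if key in chunk.metadata
        )
        parts.append(f"[{path}]\n{chunk.page_content}" if path else chunk.page_content)
    return "\n\n".join(parts)


example_chunks = [
    Chunk("Body A", {"Header2": "Configure endpoints", "Header3": "Learn the endpoints.yaml file"}),
    Chunk("Body B"),
]
print(format_context(example_chunks))
```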


In [18]:
llm = ChatOpenAI(model_name='gpt-3.5-turbo')
answer = llm.invoke(prompt).content

display(Markdown(answer))

If you are encountering an "endpoint.yaml missing error", it means that the endpoints.yaml file is not present in the required location. To fix this error, make sure to create an endpoints.yaml file with the correct structure and details as specified in the documentation.

Here is a sample endpoints.yaml file content that you can use as a reference:

```yaml
# Version of the endpoint configuration YAML
version: 0.1

# List of endpoints to create
endpoints:
  # Unique name for the endpoint. (This name will be used when generating the managed API)
  - name: Greeting Service
  # Numeric port value that gets exposed via this endpoint
    port: 9090
  # Type of the traffic this endpoint is accepting. Example: REST, GraphQL, etc.
  # Allowed values: REST, GraphQL, GRPC, UDP, TCP
    type: REST
  # Network level visibility of this endpoint. Defaults to Project
  # Accepted values: Project|Organization|Public.
    networkVisibility: Project
  # Context (base path) of the API that is exposed via this endpoint.
  # This is mandatory if the endpoint type is set to REST or GraphQL.
    context: /greeting
  # Path to the schema definition file. Defaults to wild card route if not provided
  # This is only applicable to REST endpoint types.
  # The path should be relative to the docker context.
    schemaFilePath: greeting_openapi.yaml
```

Make sure to place this endpoints.yaml file in the correct location within your project structure.

You can build a reusable chain for answering any question as follows:

In [19]:
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain.prompts import ChatPromptTemplate, SystemMessagePromptTemplate, HumanMessagePromptTemplate

In [20]:
prompt = """You are an assistant to the Choreo users, to help them to use Choreo platform by using the given information to answer their questions.

Information: {context}

Question: {question}
"""

prompt_template = ChatPromptTemplate.from_messages([HumanMessagePromptTemplate.from_template(prompt)])


def format_docs(docs):
    # Join the text of the retrieved documents into a single context string
    return "\n\n".join(doc.page_content for doc in docs)

rag_chain = (
    {"context": db.as_retriever() | format_docs, "question": RunnablePassthrough()}
    | prompt_template
    | llm
    | StrOutputParser()
)
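
Conceptually, the `|` pipeline above behaves like function composition: retrieve, format, fill in the prompt, call the model. The plain-Python sketch below mirrors that flow under simplifying assumptions. `make_chain`, `fake_retrieve`, and `fake_llm` are hypothetical stand-ins so it runs without a vector store or an API key; it is not how LangChain implements runnables.

```python
def make_chain(retrieve, format_docs, template, llm):
    """Compose retrieve -> format -> fill prompt -> call model into one function."""
    def chain(question):
        context = format_docs(retrieve(question))
        prompt = template.format(context=context, question=question)
        return llm(prompt)
    return chain


# Hypothetical stand-ins for the retriever and the model.
def fake_retrieve(question):
    return ["endpoints.yaml lives in the .choreo directory."]


def fake_llm(prompt):
    return f"(model reply to {len(prompt)} prompt characters)"


template = "Information: {context}\n\nQuestion: {question}\n"
chain = make_chain(fake_retrieve, "\n\n".join, template, fake_llm)
print(chain("how to fix endpoint.yaml missing error"))
```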

In [21]:
answer = rag_chain.invoke(question)
display(Markdown(answer))

To fix the `endpoints.yaml` missing error, you need to create an `endpoints.yaml` file with the required structure and details in the appropriate location within your project. Here are the steps to do so:

1. Create a new file named `endpoints.yaml` in the `.choreo` directory at the build context path of your project.
2. Copy the sample `endpoints.yaml` content provided in the information above and paste it into the newly created `endpoints.yaml` file.
3. Make sure to fill in the required fields such as `version`, `name`, `port`, `type`, `networkVisibility`, `context`, and `schemaFilePath` with appropriate values for your endpoint.
4. Save the `endpoints.yaml` file and commit it to your source code repository.

By following these steps and ensuring that the `endpoints.yaml` file is correctly configured and located within your project, you should be able to resolve the `endpoints.yaml` missing error and successfully define your endpoint details for Choreo platform.

In [22]:
answer = rag_chain.invoke("how to learn more about Generative AI?")
display(Markdown(answer))

To learn more about Generative AI, you can explore online courses, tutorials, research papers, and resources provided by educational platforms, research institutions, and industry experts. Additionally, you can participate in workshops, attend conferences, and join online communities focused on Generative AI to stay updated on the latest advancements and best practices in the field. By actively engaging with the Generative AI community and continuously learning from various sources, you can deepen your understanding and expertise in this exciting and rapidly evolving technology.

In [23]:
prompt = """You are an assistant to the Choreo users, to help them to use Choreo platform by using the given information to answer their questions. DO NOT answer the questions unrelated to Choreo platform.

Choreo is an internal developer platform for building, deploying (CI/CD), observing, and managing user services, proxies, web apps, etc.

Information: {context}

Question: {question}
"""

prompt_template = ChatPromptTemplate.from_messages([HumanMessagePromptTemplate.from_template(prompt)])


def format_docs(docs):
    # Join the text of the retrieved documents into a single context string
    return "\n\n".join(doc.page_content for doc in docs)

rag_chain = (
    {"context": db.as_retriever() | format_docs, "question": RunnablePassthrough()}
    | prompt_template
    | llm
    | StrOutputParser()
)

answer = rag_chain.invoke("how to learn more about Generative AI?")
display(Markdown(answer))

I can only provide information related to the Choreo platform. If you have any questions about Choreo, feel free to ask!
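
The prompt instruction is one way to keep the assistant on topic. A complementary option, sketched here as an assumption rather than something this notebook does, is to refuse before calling the model when none of the retrieved chunks is relevant enough. LangChain vector stores expose `similarity_search_with_relevance_scores`, which returns (document, score) pairs; the sketch below works on plain (text, score) pairs, and the `answer_or_refuse` name and the 0.3 threshold are arbitrary illustrative choices.

```python
REFUSAL = "I can only provide information related to the Choreo platform."


def answer_or_refuse(question, scored_chunks, answer_fn, min_score=0.3):
    """Call answer_fn only if some chunk scores at least min_score; otherwise refuse."""
    relevant = [text for text, score in scored_chunks if score >= min_score]
    if not relevant:
        return REFUSAL
    return answer_fn(question, relevant)


# Hypothetical scores: an off-topic question matches nothing well.
off_topic = [("Configure endpoints ...", 0.08), ("What is a project ...", 0.05)]
print(answer_or_refuse("how to learn more about Generative AI?", off_topic, lambda q, c: "..."))
```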