In [1]:
!pip3 install -qU langchain langchain-community friendli-client requests

In [2]:
import requests


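FRIENDLI_TOKEN = "flp_xxxx"  # https://suite.friendli.ai/user-settings/tokens


def retrieve_contexts(document_ids: list[str], query: str, k: int) -> list[str]:
    """Retrieve up to k context passages for the query from the given documents."""
    resp = requests.post(
        "https://suite.friendli.ai/api/beta/retrieve",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {FRIENDLI_TOKEN}",
        },
        json={
            "document_ids": document_ids,
            "query": query,
            "k": k,
        },
        timeout=30,  # do not hang indefinitely if the service is unreachable
    )
    resp.raise_for_status()  # surface HTTP errors (e.g. an invalid token) early
    data = resp.json()
    return [r["content"] for r in data["results"]]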
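
The helper above relies on the response having the shape `{"results": [{"content": ...}, ...]}`. The extraction step can be sketched offline as follows. The `extract_contents` name and the sample payload are hypothetical and used for illustration only; they are not part of the Friendli API, and the real response may carry additional fields.

```python
from typing import Any


def extract_contents(data: dict[str, Any]) -> list[str]:
    """Pull the "content" field out of each entry in data["results"].

    Assumes the response shape used by retrieve_contexts; entries without
    a "content" key are skipped instead of raising KeyError.
    """
    results = data.get("results", [])
    return [r["content"] for r in results if "content" in r]


# Hypothetical payload for illustration only, not real API output.
sample = {
    "results": [
        {"content": "first chunk"},
        {"score": 0.1},  # entry with no "content" key is ignored
        {"content": "second chunk"},
    ]
}
print(extract_contents(sample))
```

Skipping malformed entries is one possible design choice; failing loudly with `KeyError`, as the original list comprehension does, may be preferable when a missing field signals a real problem.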

In [3]:
document_ids = [...]  # replace with your own document IDs
contexts = retrieve_contexts(document_ids, "What is Orca?", 2)
print(contexts)

['ORCA: A Distributed Serving System for\nTransformer-Based Generative Models...]


In [4]:
from langchain_community.chat_models.friendli import ChatFriendli

llm = ChatFriendli(model="meta-llama-3-70b-instruct", friendli_token=FRIENDLI_TOKEN)
# call_as_llm is deprecated in recent LangChain releases; invoke returns an AIMessage
llm.invoke("What is Orca?").content

"Orca can refer to different things, but I'll cover the most common meanings:\n\n1. **Orca (killer whale)**: The orca, also known as the killer whale, is a toothed whale belonging to the oceanic dolphin family. It is the largest member of the dolphin family and is known for its distinctive black and white coloring. Orcas are apex predators, which means they have no natural predators in the wild. They are highly social, intelligent, and communicate using a variety of clicks, whistles, and body language.\n2. **Orca (software)**: Orca is an open-source screen reader and assistive technology developed by the GNOME Project. It provides a way for people with visual impairments or blindness to interact with graphical user interfaces (GUIs) using a braille display or synthesized speech.\n3. **Orca (Marvel Comics)**: Orca is a fictional character in the Marvel Comics universe. She is a supervillain and an enemy of Namor the Sub-Mariner. Her real name is Suzanna Sherman, and she gained her abili

In [5]:
template = """Use the following pieces of context to answer the question at the end.
If you don't know the answer, just say that you don't know, don't try to make up an answer.
Use three sentences maximum and keep the answer as concise as possible.
Always say "thanks for asking!" at the end of the answer.

{context}

Question: {question}

Helpful Answer:"""
rag_message = template.format(context="\n".join(contexts), question="What is Orca?")
# call_as_llm is deprecated in recent LangChain releases; invoke returns an AIMessage
llm.invoke(rag_message).content

'ORCA is a distributed serving system for Transformer-based generative models. It is designed to provide low-latency and high-throughput inference serving for large-scale Transformer models, such as GPT-3. Thanks for asking!'
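
The retrieve-then-generate steps in the cells above can be folded into one reusable function. The following is a minimal sketch, not the notebook author's implementation: `build_rag_prompt` and `rag_answer` are hypothetical names, and the retriever and generator are passed in as plain callables so the flow can be exercised with stubs, without network access or a token.

```python
from typing import Callable

RAG_TEMPLATE = """Use the following pieces of context to answer the question at the end.
If you don't know the answer, just say that you don't know, don't try to make up an answer.
Use three sentences maximum and keep the answer as concise as possible.
Always say "thanks for asking!" at the end of the answer.

{context}

Question: {question}

Helpful Answer:"""


def build_rag_prompt(contexts: list[str], question: str) -> str:
    """Join the retrieved passages with newlines and fill in the template."""
    return RAG_TEMPLATE.format(context="\n".join(contexts), question=question)


def rag_answer(
    question: str,
    retrieve: Callable[[str], list[str]],
    generate: Callable[[str], str],
) -> str:
    """Retrieve contexts for the question, then ask the model with them."""
    contexts = retrieve(question)
    return generate(build_rag_prompt(contexts, question))


# Stub retriever and generator (illustrative only) to show the flow offline.
def stub_retrieve(question: str) -> list[str]:
    return ["ORCA is a serving system.", "It targets Transformer models."]


def stub_generate(prompt: str) -> str:
    return f"(model would answer a prompt of {len(prompt)} characters)"


print(rag_answer("What is Orca?", stub_retrieve, stub_generate))
```

With the objects defined earlier in this notebook, the stubs could be swapped for `lambda q: retrieve_contexts(document_ids, q, 2)` and `lambda p: llm.invoke(p).content`.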