This notebook runs in a Docker container where the project directory (i.e., the directory containing README.md) is located at `/code`; the first cell below changes into that directory. If you run the notebook locally, set the path to your project directory accordingly.

The `load_dotenv` function below loads the variables found in the `.env` file as environment variables. You must have a `.env` file in the project directory containing your OpenAI API key, in the following format:

```
OPENAI_API_KEY=sk-...
```
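
A missing or misnamed key usually only surfaces later as an authentication error, so it can help to check for it right after `load_dotenv()` runs. The following is a minimal sketch using only the standard library; `require_env` is a hypothetical helper name, not part of `python-dotenv` or `llm_chain`.

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`; raise if it is unset or empty.

    Hypothetical helper for illustration only.
    """
    value = os.environ.get(name, "")
    if not value:
        raise RuntimeError(f"{name} is not set; add it to the .env file in the project directory")
    return value
```

Calling `require_env("OPENAI_API_KEY")` after `load_dotenv()` would then fail early with a readable message if the key is absent.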

---

In [1]:
%cd /code

/code


In [2]:
from dotenv import load_dotenv
load_dotenv()

True

---

# Tools and more

Before looking at an example chain, it helps to understand the tools the chain uses.

## DuckDuckGo Search

We can use the DuckDuckGo search engine to retrieve the top URLs associated with a search query. The `DuckDuckGoSearch` object is a callable object that returns a list of dictionaries. Each item in the list corresponds to a search result.

In [5]:
from llm_chain.tools import DuckDuckGoSearch

duckduckgo_search = DuckDuckGoSearch(top_n=3)
duckduckgo_search("What is a large language model?")

[{'title': 'What are LLMs, and how are they used in generative AI?',
  'href': 'https://www.computerworld.com/article/3697649/what-are-large-language-models-and-how-are-they-used-in-generative-ai.html',
  'body': "Large language models are the algorithmic basis for chatbots like OpenAI's ChatGPT and Google's Bard. The technology is tied back to billions — even trillions — of parameters that can make them..."},
 {'title': 'What is a large language model and how does it work? - Fast Company',
  'href': 'https://www.fastcompany.com/90884581/what-is-a-large-language-model',
  'body': 'Large language models are the foundational technology behind recent artificial intelligence advancements like ChatGPT.'},
 {'title': 'What are Large Language Models - MachineLearningMastery.com',
  'href': 'https://machinelearningmastery.com/what-are-large-language-models/',
  'body': 'Large language models (LLMs) are recent advances in deep learning models to work on human languages. Some great use case of L
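
Because each result is a plain dictionary with `title`, `href`, and `body` keys, downstream code can work with the output using ordinary list operations. The sketch below pulls the URLs out of a result list; `result_urls` and `sample_results` are hypothetical names, and the sample data is a hand-written stand-in rather than real search output.

```python
def result_urls(results: list[dict]) -> list[str]:
    """Collect the `href` of every search result that has one, preserving order."""
    return [r["href"] for r in results if r.get("href")]


# Hand-written stand-in shaped like the search output above (not real results).
sample_results = [
    {"title": "Example A", "href": "https://example.com/a", "body": "..."},
    {"title": "Example B", "href": "https://example.com/b", "body": "..."},
]

print(result_urls(sample_results))
```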

# Chains

A chain consists of individual tasks called links. Each link is a callable (either a function or an object), and the output of one link becomes the input of the next link.
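
To make the idea concrete, here is a minimal sketch of how links could be composed. `SimpleChain` is a hypothetical class written for illustration; it is not the `Chain` class from `llm_chain`, which may behave differently.

```python
from typing import Any, Callable


class SimpleChain:
    """Hypothetical chain: calls each link in order, feeding each output to the next link."""

    def __init__(self, links: list[Callable[[Any], Any]]):
        self.links = links

    def __call__(self, value: Any) -> Any:
        for link in self.links:
            value = link(value)
        return value


def shout(text: str) -> str:
    """Example link implemented as a plain function."""
    return text.upper()


# Links can be built-in callables, functions, or lambdas.
pipeline = SimpleChain(links=[str.strip, shout, lambda s: f"{s}!"])
print(pipeline("  hello  "))  # prints "HELLO!"
```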

- `DuckDuckGoSearch`: returns the top search results for a query, as shown above.

In [None]:
from llm_chain.base import Document, Chain, Value
from llm_chain.models import OpenAIEmbeddings, OpenAIChat
from llm_chain.tools import DuckDuckGoSearch, html_page_loader, split_documents
from llm_chain.indexes import ChromaDocumentIndex
from llm_chain.prompt_templates import DocSearchTemplate

duckduckgo_search = DuckDuckGoSearch(top_n=3)
document_index = ChromaDocumentIndex(embeddings_model=OpenAIEmbeddings(model_name='text-embedding-ada-002'))
prompt_template = DocSearchTemplate(doc_index=document_index, n_docs=2)
non_streaming_chat = OpenAIChat(model_name='gpt-3.5-turbo')
streaming_callback = lambda x: print(x.response, end='|')
streaming_chat = OpenAIChat(model_name='gpt-3.5-turbo', streaming_callback=streaming_callback)

# For each search result, download the page at its URL, replace newlines
# with spaces, and wrap the resulting text in a Document.
def search_results_to_docs(results: list[dict]) -> list[Document]:
    return [Document(content=html_page_loader(x['href']).replace('\n', ' ')) for x in results]

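# `initial_question` appears twice in the chain below: first to receive the
# question, and again to hand it to the prompt template.
initial_question = Value()
# Builds the follow-up prompt from the first chat response.
question_2 = lambda x: f'Summarize the following in fewer than 20 words: "{x}"'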

# each link is a callable where the output of one link is the input to the next
chain = Chain(links=[
    initial_question,
    duckduckgo_search,
    search_results_to_docs,
    split_documents,  # defaults to chunk-size of 500
    document_index,  # __call__ function calls add() or search() based on input
    initial_question,
    prompt_template,
    non_streaming_chat,
    question_2,
    streaming_chat,
])

response = chain("What is the meaning of life?")
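
Two links in this chain do more than transform their input: `initial_question` is listed twice, and `document_index` decides between `add()` and `search()` based on what it receives. The sketch below shows one way such behavior could work. `CachedValue`, `ToyIndex`, `fake_search`, and `build_prompt` are hypothetical stand-ins written for illustration; they are assumptions, not the implementations of `Value`, `ChromaDocumentIndex`, or `DocSearchTemplate`, and the keyword-overlap search replaces real embeddings.

```python
from typing import Any


class CachedValue:
    """Hypothetical stand-in: stores a non-None input, then replays it when called with None."""

    def __init__(self) -> None:
        self._value: Any = None

    def __call__(self, value: Any = None) -> Any:
        if value is not None:
            self._value = value
        return self._value


class ToyIndex:
    """Hypothetical index: a list input is added, a string input is treated as a query."""

    def __init__(self) -> None:
        self._docs: list[str] = []

    def add(self, docs: list[str]) -> None:
        self._docs.extend(docs)

    def search(self, query: str, n_docs: int = 1) -> list[str]:
        # Rank documents by the number of words shared with the query.
        words = set(query.lower().split())
        ranked = sorted(
            self._docs,
            key=lambda d: len(words & set(d.lower().split())),
            reverse=True,
        )
        return ranked[:n_docs]

    def __call__(self, value: Any) -> Any:
        if isinstance(value, list):
            return self.add(value)  # returns None, so the next link receives None
        return self.search(value)


question = CachedValue()
index = ToyIndex()


def fake_search(_query: str) -> list[str]:
    """Stand-in for the search, page-loading, and splitting links."""
    return ["cats purr when happy", "dogs bark at strangers"]


def build_prompt(q: str) -> str:
    """Stand-in for a prompt template that looks up context in the index."""
    return f"Context: {index.search(q)[0]}\nQuestion: {q}"


links = [question, fake_search, index, question, build_prompt]

value: Any = "why do cats purr"
for link in links:
    value = link(value)
print(value)
```

In this sketch the second `question` link receives `None` from the index and returns the cached question, which is why the same object can appear twice in the list.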
