In [1]:
from pprint import pprint
from IPython.display import display, Markdown

def mprint(val: str):
    """Render a string as Markdown in the notebook output."""
    display(Markdown(val))

# DuckDuckGo Web-Search

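The following example uses `DuckDuckGoSearch` to perform a web search. Here `top_n=3` requests the top three results, and each result is a dictionary with `title`, `href`, and `body` keys.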

In [1]:
from llm_chain.tools import DuckDuckGoSearch

duckduckgo_search = DuckDuckGoSearch(top_n=3)
duckduckgo_search("What is a large language model?")

[{'title': 'What are LLMs, and how are they used in generative AI?',
  'href': 'https://www.computerworld.com/article/3697649/what-are-large-language-models-and-how-are-they-used-in-generative-ai.html',
  'body': "Large language models are the algorithmic basis for chatbots like OpenAI's ChatGPT and Google's Bard. The technology is tied back to billions — even trillions — of parameters that can make them..."},
 {'title': 'What is a large language model and how does it work? - Fast Company',
  'href': 'https://www.fastcompany.com/90884581/what-is-a-large-language-model',
  'body': 'Large language models are the foundational technology behind recent artificial intelligence advancements like ChatGPT.'},
 {'title': 'Introduction to Large Language Models - Baeldung',
  'href': 'https://www.baeldung.com/cs/large-language-models',
  'body': 'A quick and practical guide to Large Language Models. Natural Language Processing (NLP), is an interdisciplinary subfield of linguistics, computer scienc

The `duckduckgo_search` object keeps a `history` of the searches performed. Each entry exposes the `query` that was run and the `results` it returned:

In [7]:
print(duckduckgo_search.history[0].query)
pprint(duckduckgo_search.history[0].results)

What is a large language model?
[{'body': 'Large language models are the algorithmic basis for chatbots like '
          "OpenAI's ChatGPT and Google's Bard. The technology is tied back to "
          'billions — even trillions — of parameters that can make them...',
  'href': 'https://www.computerworld.com/article/3697649/what-are-large-language-models-and-how-are-they-used-in-generative-ai.html',
  'title': 'What are LLMs, and how are they used in generative AI?'},
 {'body': 'Large language models are the foundational technology behind recent '
          'artificial intelligence advancements like ChatGPT.',
  'href': 'https://www.fastcompany.com/90884581/what-is-a-large-language-model',
  'title': 'What is a large language model and how does it work? - Fast '
           'Company'},
 {'body': 'A quick and practical guide to Large Language Models. Natural '
          'Language Processing (NLP), is an interdisciplinary subfield of '
          'linguistics, computer science, and artifici
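
Since each result is a dictionary with `title`, `href`, and `body` keys, the results can be turned into Markdown and rendered with `mprint`. The following is a minimal sketch under that assumption; `results_to_markdown` is a hypothetical helper, not part of `llm_chain`, and the sample data is copied from the output above.

```python
def results_to_markdown(results: list[dict]) -> str:
    """Render search results as a Markdown bullet list.

    Assumes each result has 'title', 'href', and 'body' keys,
    matching the search output shown above.
    """
    lines = []
    for result in results:
        lines.append(f"- [{result['title']}]({result['href']}): {result['body']}")
    return "\n".join(lines)


# Sample data copied from the search output above (no live request is made).
sample_results = [
    {
        'title': 'What is a large language model and how does it work? - Fast Company',
        'href': 'https://www.fastcompany.com/90884581/what-is-a-large-language-model',
        'body': 'Large language models are the foundational technology behind recent artificial intelligence advancements like ChatGPT.',
    },
]
markdown = results_to_markdown(sample_results)
print(markdown)
```

In the notebook, something like `mprint(results_to_markdown(duckduckgo_search.history[0].results))` should display the most recent search as a list of links.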

---

# Scraping a URL

The `scrape_url` function fetches a web page and returns its text content as a string:

In [12]:
from llm_chain.tools import scrape_url

result = scrape_url(url='http://example.com')
mprint(result)

Example Domain







Example Domain
This domain is for use in illustrative examples in documents. You may use this
    domain in literature without prior coordination or asking for permission.
More information...
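
The scraped text above contains a long run of blank lines. If that matters downstream (for example, before splitting the text into chunks), a small post-processing step can squeeze them out. The sketch below is one possible approach; `collapse_blank_lines` is a hypothetical helper, not part of `llm_chain`, and the sample string only imitates the shape of the output shown.

```python
import re


def collapse_blank_lines(text: str) -> str:
    """Strip each line and squeeze runs of blank lines down to one."""
    stripped = "\n".join(line.strip() for line in text.splitlines())
    # Three or more consecutive newlines mean two or more blank lines in a row.
    return re.sub(r"\n{3,}", "\n\n", stripped).strip()


# Sample input shaped like the scraped output above (not a live request).
raw = (
    "Example Domain\n\n\n\n\n\n\n\n"
    "Example Domain\n"
    "This domain is for use in illustrative examples in documents."
)
cleaned = collapse_blank_lines(raw)
print(cleaned)
```

In the notebook this could be applied as `mprint(collapse_blank_lines(result))`.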

---

# Split Documents

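A `Document` is a simple class with two properties: `content` (the text) and `metadata` (a dictionary of extra information, such as an id).

Long documents often need to be split into smaller chunks. In the example below, `split_documents` produces chunks of at most `max_chars` characters, the splits fall on word boundaries, and every chunk keeps the metadata of its source document.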

In [2]:
from llm_chain.base import Document

doc_a = Document(
    content="Pretend this is a long document that we want to split.",
    metadata={'id': 'some unique-id'},
)
doc_b = Document(
    content="Pretend this is another long document that we want to split.",
    metadata={'id': 'another unique-id'},
)
print(doc_a)
print(doc_b)

content='Pretend this is a long document that we want to split.' metadata={'id': 'some unique-id'}
content='Pretend this is another long document that we want to split.' metadata={'id': 'another unique-id'}


In [3]:
from llm_chain.tools import split_documents

split_documents(docs=[doc_a, doc_b], max_chars=20)

[Document(content='Pretend this is a', metadata={'id': 'some unique-id'}),
 Document(content='long document that', metadata={'id': 'some unique-id'}),
 Document(content='we want to split.', metadata={'id': 'some unique-id'}),
 Document(content='Pretend this is', metadata={'id': 'another unique-id'}),
 Document(content='another long', metadata={'id': 'another unique-id'}),
 Document(content='document that we', metadata={'id': 'another unique-id'}),
 Document(content='want to split.', metadata={'id': 'another unique-id'})]
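
The output above is consistent with a greedy strategy that packs whole words into each chunk until the next word would exceed `max_chars`. The sketch below illustrates that idea only; it is not the `llm_chain` implementation. `Doc` and `split_on_words` are hypothetical stand-ins so the example runs without the library.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for llm_chain's Document (content + metadata)."""
    content: str
    metadata: dict = field(default_factory=dict)


def split_on_words(docs: list[Doc], max_chars: int) -> list[Doc]:
    """Greedily pack whole words into chunks of at most max_chars characters.

    A single word longer than max_chars is kept whole as its own chunk.
    """
    chunks = []
    for doc in docs:
        current = ""
        for word in doc.content.split():
            candidate = f"{current} {word}" if current else word
            if current and len(candidate) > max_chars:
                # Adding this word would overflow: close the chunk, start a new one.
                chunks.append(Doc(current, dict(doc.metadata)))
                current = word
            else:
                current = candidate
        if current:
            chunks.append(Doc(current, dict(doc.metadata)))
    return chunks


doc = Doc(
    content="Pretend this is a long document that we want to split.",
    metadata={'id': 'some unique-id'},
)
pieces = split_on_words([doc], max_chars=20)
print([piece.content for piece in pieces])
```

Each chunk receives a copy of the source metadata, so changing one chunk's metadata later does not affect the others.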