## Setup

Ollama is a tool for running open-source LLMs locally.

If you haven't already, download Ollama from [ollama.com](https://ollama.com/).

Then open a terminal and run:

`ollama run llama3`

Next, let's install our Python dependencies, including langchain, faiss, groq, ollama, and firecrawl-py.

In [1]:
%pip install --upgrade --quiet langchain langchain-community groq faiss-cpu ollama firecrawl-py
%pip install --force-reinstall typing-extensions==4.5
%pip install --force-reinstall groq==0.5.0

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
tables 3.8.0 requires blosc2~=2.0.0, which is not installed.
tables 3.8.0 requires cython>=0.29.21, which is not installed.
spyder 5.4.3 requires pyqt5<5.16, which is not installed.
spyder 5.4.3 requires pyqtwebengine<5.16, which is not installed.
storage3 0.5.2 requires httpx<0.24,>=0.23, but you have httpx 0.27.0 which is incompatible.
llama-index-legacy 0.9.48 requires aiohttp<4.0.0,>=3.8.6, but you have aiohttp 3.8.4 which is incompatible.
llama-index-legacy 0.9.48 requires requests>=2.31.0, but you have requests 2.28.1 which is incompatible.
llama-index-legacy 0.9.48 requires SQLAlchemy[asyncio]>=1.4.49, but you have sqlalchemy 1.4.47 which is incompatible.
llama-index 0.6.11 requires typing-extensions==4.5.0, but you have typing-extensions 4.11.0 which is incompatible.
sphinx 5.0.2 requires docutils<0.1

## Load website with Firecrawl

In [2]:
from langchain_community.document_loaders import FireCrawlLoader  # Importing the FireCrawlLoader

url = "https://firecrawl.dev"
loader = FireCrawlLoader(
    api_key="fc-YOUR_API_KEY", # Note: Replace 'YOUR_API_KEY' with your actual FireCrawl API key
    url=url,  # Target URL to crawl
    mode="crawl"  # Mode set to 'crawl' to crawl all accessible subpages
)
docs = loader.load()
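
Hard-coding an API key in a notebook makes it easy to leak when the notebook is shared. As a minimal sketch, the key could be read from an environment variable instead. The variable name `FIRECRAWL_API_KEY` and the helper `get_api_key` are assumptions made for illustration; they are not part of FireCrawl or LangChain.

```python
import os


def get_api_key(var_name: str = "FIRECRAWL_API_KEY") -> str:
    """Return the API key stored in the given environment variable.

    The variable name is an assumption for this sketch; use whatever
    name you export in your own shell.
    """
    key = os.environ.get(var_name, "").strip()
    if not key:
        # Fail early with a clear message instead of sending an empty key
        raise RuntimeError(f"Set the {var_name} environment variable first.")
    return key
```

The result can then be passed as `api_key=get_api_key()` when constructing `FireCrawlLoader`.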

In [3]:
from langchain_community.embeddings import OllamaEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import FAISS
import ollama

## Set up Vectorstore

In [4]:
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
splits = text_splitter.split_documents(docs)
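# Embed with the llama3 model pulled during setup (OllamaEmbeddings otherwise defaults to llama2)
vectorstore = FAISS.from_documents(documents=splits, embedding=OllamaEmbeddings(model="llama3"))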
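
To make the `chunk_size` and `chunk_overlap` parameters concrete, here is a simplified sketch of fixed-window chunking with overlap. It is not how `RecursiveCharacterTextSplitter` works internally (that splitter prefers to break on separators such as paragraphs, lines, and spaces); the function `split_with_overlap` is a hypothetical helper that only illustrates what the two numbers mean.

```python
def split_with_overlap(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> list[str]:
    """Cut text into windows of chunk_size characters.

    Consecutive windows share chunk_overlap characters, so a sentence
    that straddles a boundary still appears whole in one of the chunks.
    """
    if chunk_size <= 0 or not 0 <= chunk_overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= chunk_overlap < chunk_size")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


# Small values make the overlap easy to see
print(split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=2))
# → ['abcd', 'cdef', 'efgh', 'ghij']
```

With the values used above (1000 and 200), each chunk repeats the last 200 characters of the previous one, which trades some index size for better recall at chunk boundaries.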

## Retrieval and Generation

In [6]:
from groq import Groq

client = Groq(
    api_key="API_KEY",  # Note: Replace 'API_KEY' with your actual Groq API key
)

question = "What is firecrawl?"
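# Retrieve the chunks most similar to the question
docs = vectorstore.similarity_search(query=question)
# Join the chunk texts so the prompt contains readable content rather than Document reprs
context = "\n\n".join(doc.page_content for doc in docs)
completion = client.chat.completions.create(
    model="llama3-8b-8192",
    messages=[
        {
            "role": "user",
            "content": f"You are a friendly assistant. Your job is to answer the user's question based on the documentation provided below:\nDocs:\n\n{context}\n\nQuestion: {question}"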
        }
    ],
    temperature=1,
    max_tokens=1024,
    top_p=1,
    stream=False,
    stop=None,
)

print(completion.choices[0].message)


ChoiceMessage(content='According to the documentation provided, FireCrawl is a tool that "crawls and converts any website into clean markdown" and specializes in converting web data into clean, well-formatted markdown.', role='assistant', tool_calls=None)
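
For intuition about what `vectorstore.similarity_search` does in the retrieval step, the sketch below ranks stored vectors by Euclidean distance to a query vector and returns the indices of the closest ones. It is a brute-force, pure-Python illustration under assumptions: the function `top_k_nearest` is hypothetical, the toy vectors stand in for real embeddings, and the default `k=4` and the Euclidean metric mirror what I understand LangChain's FAISS wrapper to use by default.

```python
import math


def top_k_nearest(query_vec, doc_vecs, k=4):
    """Return the indices of the k vectors closest to query_vec.

    Brute-force search by Euclidean distance; FAISS does the same job
    far more efficiently on real embedding matrices.
    """
    distances = [(math.dist(query_vec, vec), idx) for idx, vec in enumerate(doc_vecs)]
    distances.sort()  # smallest distance first
    return [idx for _, idx in distances[:k]]


# Toy 2-D "embeddings"; real ones have hundreds or thousands of dimensions
doc_vecs = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [0.9, 1.1]]
print(top_k_nearest([1.0, 1.0], doc_vecs, k=2))
# → [1, 3]
```

The returned indices correspond to the chunks whose text is then placed in the prompt sent to the model.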
