Cookbook with Groq and LanceDB #310

Closed
sridharaiyer opened this issue May 17, 2024 · 2 comments
@sridharaiyer

Can we get a cookbook with Groq and LanceDB? Would I still need a local embedding model? If so, can I get an example of how this Streamlit app can be hosted using a proper Dockerfile that installs all packages, including Ollama to pull the nomic-embed-text embedding model?

Goal: I want to upload a PDF file through Streamlit and have a summarizer assistant with a specific prompt summarize the document. I need this combined with a RAG chatbot input at the bottom of the main Streamlit window, below the summarized output rendered in markdown. All of this should work without my customer needing to run a pgvector Docker container; in other words, using a local DB such as LanceDB.

jacobweiss2305 self-assigned this May 17, 2024
@jacobweiss2305
Contributor

@sridharaiyer we have an integration with LanceDB:

"""Run `pip install openai lancedb tantivy` to install dependencies."""

from phi.agent import Agent
from phi.model.openai import OpenAIChat
from phi.knowledge.pdf import PDFUrlKnowledgeBase
from phi.vectordb.lancedb import LanceDb, SearchType

db_uri = "tmp/lancedb"
knowledge_base = PDFUrlKnowledgeBase(
    urls=["https://phi-public.s3.amazonaws.com/recipes/ThaiRecipes.pdf"],
    vector_db=LanceDb(table_name="recipes", uri=db_uri, search_type=SearchType.vector),
)
# Load the knowledge base: Comment out after first run
knowledge_base.load(upsert=True)

agent = Agent(
    model=OpenAIChat(id="gpt-4o"),
    knowledge=knowledge_base,
    # Add a tool to read chat history.
    read_chat_history=True,
    show_tool_calls=True,
    markdown=True,
    # debug_mode=True,
)
agent.print_response("How do I make chicken and galangal in coconut milk soup", stream=True)

And it's super simple to add Groq:

from phi.agent import Agent, RunResponse  # noqa
from phi.model.groq import Groq

agent = Agent(model=Groq(id="llama3-groq-70b-8192-tool-use-preview"), markdown=True)
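On the local-embedding question: the two snippets above can probably be combined as in the rough sketch below. It assumes Groq is used only for chat completion, so an embedder still has to be supplied to LanceDB, and that when none is passed `LanceDb` falls back to an OpenAI embedder. The `OllamaEmbedder` import path, the `embedder=` and `search_knowledge=` parameters, the table name, and the helper `build_rag_agent` are assumptions for illustration and have not been checked against a specific phidata release. It also assumes an Ollama server is running with `nomic-embed-text` pulled and that `GROQ_API_KEY` is set.

```python
"""Sketch only: Groq for generation, LanceDB for storage, Ollama for local embeddings.

Assumed install: pip install phidata groq lancedb tantivy pypdf ollama
"""

EMBED_MODEL = "nomic-embed-text"  # assumption: model already pulled into Ollama
EMBED_DIMENSIONS = 768  # nomic-embed-text produces 768-dimensional vectors


def build_rag_agent(pdf_path: str, db_uri: str = "tmp/lancedb", table_name: str = "documents"):
    """Return (agent, knowledge_base) for one local PDF. All names are illustrative."""
    # Imported lazily so this module can be imported without the packages installed.
    from phi.agent import Agent
    from phi.embedder.ollama import OllamaEmbedder
    from phi.knowledge.pdf import PDFKnowledgeBase
    from phi.model.groq import Groq
    from phi.vectordb.lancedb import LanceDb, SearchType

    knowledge_base = PDFKnowledgeBase(
        path=pdf_path,
        vector_db=LanceDb(
            table_name=table_name,
            uri=db_uri,
            search_type=SearchType.vector,
            # Local embeddings through Ollama instead of a hosted embedding API.
            embedder=OllamaEmbedder(model=EMBED_MODEL, dimensions=EMBED_DIMENSIONS),
        ),
    )
    agent = Agent(
        model=Groq(id="llama3-groq-70b-8192-tool-use-preview"),
        knowledge=knowledge_base,
        search_knowledge=True,  # let the agent query the knowledge base
        markdown=True,
    )
    return agent, knowledge_base
```

With this in place, `agent, kb = build_rag_agent("uploaded.pdf")` followed by `kb.load(upsert=True)` and `agent.print_response("Summarize this document.", stream=True)` would mirror the flow of the OpenAI example; `uploaded.pdf` is a placeholder for whatever file the Streamlit uploader saves.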

@rekdt

rekdt commented Oct 19, 2024

@jacobweiss2305 How come PDFUrlKnowledgeBase supports LanceDb but WebsiteKnowledgeBase does not?

lance_db

def name_exists(self, name: str) -> bool:
    raise NotImplementedError

Update

def name_exists(self, name: str) -> bool:
    """Checks if a URL already exists in the LanceDB table.  Uses the 'id' column (MD5 hash)."""
    try:
        # Search for the name (URL) in the 'id' column (MD5 hash of the content)
        result = self.table.search().where(f"{self._id} = '{name}'").to_arrow()
        return len(result) > 0  # True if a row with matching ID is found
    except Exception as e:
        logger.error(f"Error checking for existing name '{name}': {e}")
        return False
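One caveat with building the filter through an f-string: a name containing a single quote (not unusual in URLs or titles) would break the expression. A minimal sketch of a helper that escapes quotes before interpolation is shown below. `build_equality_filter` is a hypothetical name, not part of phidata or LanceDB, and it assumes the filter follows SQL string-literal rules, where a quote is escaped by doubling it.

```python
def build_equality_filter(column: str, value: str) -> str:
    """Build a SQL-style `column = 'value'` filter string.

    Single quotes inside the value are doubled, which is the standard
    SQL way to escape them inside a string literal.
    """
    escaped = value.replace("'", "''")
    return f"{column} = '{escaped}'"


# A value without quotes passes through unchanged.
print(build_equality_filter("id", "abc123"))
# A value containing a quote is escaped rather than terminating the literal early.
print(build_equality_filter("id", "it's-a-page"))
```

Inside `name_exists`, the `where(...)` argument could then be `build_equality_filter(self._id, name)`. Whether the `id` column should be compared against the raw name or against a hash of it depends on how the rows were written, which the snippet above does not settle.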
