Add File Retrieval to Agents #102

Merged
merged 5 commits into from
Apr 19, 2024

tmichaeldb (Contributor) commented Apr 19, 2024

Requires https://github.com/mindsdb/mindsdb/pull/9098/files for testing.

This PR adds file retrieval to agents through the retrieval skill.

Example usage:

agent = agents.get('my_agent')
agent.add_file('./hooblyblob.txt', 'It has information about the company hooblyblob')
agent.completion([{'question': 'When was hooblyblob founded?', 'answer': None}])

with contents of ./hooblyblob.txt:

Hooblyblob was founded in 2024.

Steps taken through the SDK:

  1. Check by filename whether the file already exists in MindsDB, and upload it if it doesn't
  2. Insert the file contents into a knowledge base (an existing KB can optionally be passed; otherwise a new one is created for the file)
  3. Create a new retrieval skill using the knowledge base from step 2
  4. Add the new retrieval skill to the agent
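
The four steps above can be sketched roughly as follows. This is only an illustration of the flow, not the SDK implementation: `FakeMindsDB`, `FakeAgent`, the standalone `add_file` function, the `_kb` and `_skill` naming, and the `skill_type` parameter are all hypothetical in-memory stand-ins, so the snippet runs without a MindsDB server.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import Optional


@dataclass
class FakeMindsDB:
    """Hypothetical in-memory stand-in for a MindsDB server (not the real SDK)."""
    files: dict = field(default_factory=dict)            # file name -> contents
    knowledge_bases: dict = field(default_factory=dict)  # KB name -> list of documents
    skills: dict = field(default_factory=dict)           # skill name -> parameters


@dataclass
class FakeAgent:
    """Hypothetical stand-in for an agent; it only tracks attached skill names."""
    name: str
    skills: list = field(default_factory=list)


def add_file(server: FakeMindsDB, agent: FakeAgent, file_path: str,
             description: str, knowledge_base: Optional[str] = None,
             skill_type: str = "retrieval") -> str:
    name = Path(file_path).stem

    # Step 1: upload the file only if it is not already known to the server.
    if name not in server.files:
        server.files[name] = Path(file_path).read_text()

    # Step 2: insert the contents into the given KB, or a new one for this file.
    kb_name = knowledge_base or f"{name}_kb"
    server.knowledge_bases.setdefault(kb_name, []).append(server.files[name])

    # Step 3: create a skill that points at the knowledge base from step 2.
    skill_name = f"{name}_skill"
    server.skills[skill_name] = {
        "type": skill_type,
        "source": kb_name,
        "description": description,
    }

    # Step 4: attach the new skill to the agent.
    agent.skills.append(skill_name)
    return skill_name
```

Calling `add_file(server, agent, './hooblyblob.txt', '...')` twice with the same path would skip the upload the second time, since the file name is already registered in step 1.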

github-actions bot commented Apr 19, 2024

Coverage Report

File                      Stmts   Miss  Cover  Missing
mindsdb_sdk
   agents.py                122     45    63%  20, 79, 87, 90, 94, 96, 98, 100, 102, 168–206, 229–231, 252–254, 258–262
   databases.py              42      2    95%  107, 134
   handlers.py               39      1    97%  77
   jobs.py                   65      3    95%  96–99
   knowledge_bases.py       110     12    89%  56–59, 139, 143–152, 230–233
   ml_engines.py             42      3    93%  94, 126, 128
   models.py                181     13    93%  107, 195, 204, 206, 276, 308, 336, 457, 465, 484, 500, 527, 531
   projects.py               63      1    98%  167
   query.py                  13      1    92%  14
   skills.py                 50      7    86%  43, 45, 49, 56, 72–76, 118
   tables.py                108      4    96%  177, 189, 205, 297
   views.py                  37      2    95%  105, 138
mindsdb_sdk/connectors
   rest_api.py              172     28    84%  16–26, 32–33, 48, 69–71, 88, 91, 98–101, 152–160
TOTAL                      1116    122    89%

Tests   Skipped   Failures   Errors   Time
14      0 💤      0 ❌       0 🔥     8.039s ⏱️

@tmichaeldb tmichaeldb self-assigned this Apr 19, 2024
@tmichaeldb tmichaeldb added the enhancement New feature or request label Apr 19, 2024
mindsdb_sdk/agents.py (outdated, resolved)
dusvyat (Contributor) left a comment

Other than a few minor points, looks good 🚀

# Upload file if it doesn't exist.
with open(file_path, 'rb') as file:
    content = file.read()
    df = pd.DataFrame.from_records([{'content': content}])

We should integrate chunking here. It's alright for now, but definitely one to pick up after the hackathon if we don't have time.

tmichaeldb (Contributor, Author) commented Apr 19, 2024

I was imagining we would handle chunking on the MindsDB side, in the KB. When doing a KB insert, we could simply check whether we're selecting from the files table and, if so, chunk the content before inserting into the VectorDB.

So yes, we should integrate chunking in some way for sure.
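
As a rough illustration of the idea discussed in this thread, here is a minimal sketch of fixed-size chunking with overlap, applied only when the rows come from the files database. The function names `chunk_text` and `rows_for_kb_insert`, the `'files'` check, and the default sizes are assumptions made for the example; this is not how MindsDB implements KB inserts.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list:
    """Split text into chunks of at most chunk_size characters.

    Consecutive chunks share `overlap` characters so that a sentence cut at a
    boundary still appears whole in at least one neighbouring chunk.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current chunk already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks


def rows_for_kb_insert(source_database: str, rows: list,
                       chunk_size: int = 500, overlap: int = 50) -> list:
    """Hypothetical pre-insert hook: chunk rows only when they come from files."""
    if source_database != "files":
        return rows
    chunked = []
    for row in rows:
        for chunk in chunk_text(row["content"], chunk_size, overlap):
            # Keep any other columns and replace the content with one chunk.
            chunked.append({**row, "content": chunk})
    return chunked
```

Character-based splitting is the simplest option; a real implementation would more likely split on sentence or token boundaries before writing to the vector store.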

tmichaeldb (Contributor, Author) commented

I updated this PR to use the knowledge_base skill type instead of retrieval for now, while we polish retrieval until it is fully working.

@tmichaeldb tmichaeldb marked this pull request as ready for review April 19, 2024 19:31
@tmichaeldb tmichaeldb merged commit 2ce18ca into staging Apr 19, 2024
4 checks passed
Labels
enhancement New feature or request
2 participants