Add grammar-based sampling (for webui, llamacpp, and koboldcpp) #293

Merged
merged 12 commits into from
Nov 4, 2023

cpacker
Owner

@cpacker cpacker commented Nov 4, 2023

Closes #273

  • Adds koboldcpp and llama.cpp backend support
  • Adds grammar-based sampling support for koboldcpp, llama.cpp, and webui backends
  • Adds basic grammar (json.gbnf) and MemGPT-specific grammar (json_func_calls_with_inner_thoughts.gbnf)
  • If you are using a backend that supports grammars, the airoboros wrapper with grammar is used by default

Adds grammar-based sampling to prevent JSON-related parsing errors when using local LLMs (see llama.cpp).

Example

Starting the server (llama.cpp on macOS):

llama.cpp % ./server -m ~/models/TheBloke/dolphin-2.2.1-mistral-7B-GGUF/dolphin-2.2.1-mistral-7b.Q6_K.gguf -c 8000

Starting the server (koboldcpp on macOS):

koboldcpp % ./koboldcpp.py ~/models/TheBloke/dolphin-2.2.1-mistral-7B-GGUF/dolphin-2.2.1-mistral-7b.Q6_K.gguf --contextsize 8192

Specifying a wrapper with a grammar:

python main.py --model airoboros-l2-70b-2.1-grammar --first
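
For anyone wiring this up by hand, here is a minimal sketch of how a GBNF grammar could be attached to a completion request for a llama.cpp-style server. The field names `prompt`, `n_predict`, and `grammar`, the helper name `build_completion_request`, and the demo grammar are assumptions for illustration, not code from this PR; check them against the server documentation for your build.

```python
import json


def build_completion_request(prompt: str, grammar_text: str, max_tokens: int = 512) -> dict:
    """Build a JSON body for a llama.cpp-style completion request (hypothetical helper)."""
    return {
        "prompt": prompt,
        "n_predict": max_tokens,  # assumed name of the token-limit field
        "grammar": grammar_text,  # GBNF text that constrains which tokens can be sampled
    }


if __name__ == "__main__":
    # Toy grammar: an opening brace, any characters except a closing brace, then a closing brace.
    demo_grammar = 'root ::= "{" [^}]* "}"'
    body = build_completion_request("USER: hi\nASSISTANT:", demo_grammar)
    print(json.dumps(body, indent=2))
```

The body is only built and printed here; sending it to a running server is left out so the sketch stays independent of any particular backend.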

@cpacker cpacker marked this pull request as draft November 4, 2023 00:46
@Drake-AI
Contributor

Drake-AI commented Nov 4, 2023

I'm working on a grammar to avoid errors. It is still in progress, but it looks like this and includes the function names plus the names and types of their params:

root ::= Function
Function ::= SendMessage | PauseHeartbeats
SendMessage ::= "{" ws "\"function\":" ws "\"send_message\"," ws "\"params\":" ws SendMessageParams "}"
PauseHeartbeats ::= "{" ws "\"function\":" ws "\"pause_heartbeats\"," ws "\"params\":" ws PauseHeartbeatsParams "}"
SendMessageParams ::= "{}" | "{" ws Message ("," ws Message)* ws "}"
PauseHeartbeatsParams ::= "{}" | "{" ws Minutes ("," ws Minutes)* ws "}"
Message ::= "\"inner_thoughts\":" ws string | "\"message\":" ws string | "\"request_heartbeat\":" ws boolean
Minutes ::= "\"minutes\":" ws number
string ::= "\"" ([^"]*) "\""
boolean ::= "true" | "false"
ws ::= [ \t\n]*
number ::= [0-9]+ "."? [0-9]*

With this, it should be possible to use any model without fine-tuning.

@Drake-AI
Contributor

Drake-AI commented Nov 4, 2023

It is working now with all the basic functions and is easy to expand to extra functions. Feel free to use it or adapt it to your needs.

root ::= Function
Function ::= SendMessage | PauseHeartbeats | CoreMemoryAppend | CoreMemoryReplace | ConversationSearch | ConversationSearchDate | ArchivalMemoryInsert | ArchivalMemorySearch
SendMessage ::= "{"   ws   "\"function\":"   ws   "\"send_message\","   ws   "\"params\":"   ws   SendMessageParams   "}"
PauseHeartbeats ::= "{"   ws   "\"function\":"   ws   "\"pause_heartbeats\","   ws   "\"params\":"   ws   PauseHeartbeatsParams   "}"
CoreMemoryAppend ::= "{"   ws   "\"function\":"   ws   "\"core_memory_append\","   ws   "\"params\":"   ws   CoreMemoryAppendParams   "}"
CoreMemoryReplace ::= "{"   ws   "\"function\":"   ws   "\"core_memory_replace\","   ws   "\"params\":"   ws   CoreMemoryReplaceParams   "}"
ConversationSearch  ::= "{"   ws   "\"function\":"   ws   "\"conversation_search\","   ws   "\"params\":"   ws   ConversationSearchParams   "}"
ConversationSearchDate  ::= "{"   ws   "\"function\":"   ws   "\"conversation_search_date\","   ws   "\"params\":"   ws   ConversationSearchDateParams   "}"
ArchivalMemoryInsert  ::= "{"   ws   "\"function\":"   ws   "\"archival_memory_insert\","   ws   "\"params\":"   ws   ArchivalMemoryInsertParams   "}"
ArchivalMemorySearch  ::= "{"   ws   "\"function\":"   ws   "\"archival_memory_search\","   ws   "\"params\":"   ws   ArchivalMemorySearchParams   "}"
SendMessageParams ::= "{"   ws   InnerThoughtsParam   ","   ws   "\"message\":"   ws   string   ws   "}"
PauseHeartbeatsParams ::= "{"   ws   InnerThoughtsParam   ","      ws   "\"minutes\":"   ws   number   ws   "}"
CoreMemoryAppendParams ::= "{"   ws   InnerThoughtsParam   ","      ws   "\"name\":"   ws   namestring   ","   ws   "\"content\":"   ws   string   ws   ","   ws   RequestHeartbeatParam   ws   "}"
CoreMemoryReplaceParams ::= "{"   ws   InnerThoughtsParam   ","      ws   "\"name\":"   ws   namestring   ","   ws   "\"old_content\":"   ws   string   ","   ws   "\"new_content\":"   ws   string   ws   ","   ws   RequestHeartbeatParam   ws   "}"
ConversationSearchParams ::= "{"   ws   InnerThoughtsParam   ","      ws   "\"query\":"   ws   string   ws   ","   ws   "\"page\":"   ws   number   ws   ","   ws   RequestHeartbeatParam   ws   "}"
ConversationSearchDateParams ::= "{"   ws   InnerThoughtsParam   ","      ws   "\"start_date\":"   ws   string   ws   ","      ws   "\"end_date\":"   ws   string   ws   ","   ws   "\"page\":"   ws   number   ws   ","   ws   RequestHeartbeatParam   ws   "}"
ArchivalMemoryInsertParams ::= "{"   ws   InnerThoughtsParam    ","   ws   "\"content\":"   ws   string   ws   ","   ws   RequestHeartbeatParam   ws   "}"
ArchivalMemorySearchParams ::= "{"   ws   InnerThoughtsParam   ","  ws   "\"query\":"   ws   string   ws   ","   ws   "\"page\":"   ws   number   ws   ","   ws   RequestHeartbeatParam   ws   "}"
InnerThoughtsParam ::= "\"inner_thoughts\":"   ws   string
RequestHeartbeatParam ::= "\"request_heartbeat\":"   ws   boolean
namestring ::= "\"human\"" | "\"persona\""
string ::= "\""   ([^"\[\]{}]*)   "\""
boolean ::= "true" | "false"
ws ::= [ \\t\\n]*
number ::= [0-9]+
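
Since every function in the grammar above follows the same two-rule pattern (a call rule plus a `...Params` rule), extra functions could be added by generating the rules instead of typing them. Below is a rough sketch under that assumption. `gbnf_literal`, `to_rule_name`, and `function_rules` are hypothetical helpers, not part of MemGPT or llama.cpp, and the output uses single spaces where the grammar above uses wider spacing.

```python
from typing import List, Tuple


def gbnf_literal(text: str) -> str:
    """Quote text as a GBNF string literal, escaping backslashes and double quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def to_rule_name(function_name: str) -> str:
    """Turn a snake_case function name into a rule name: core_memory_append -> CoreMemoryAppend."""
    return "".join(part.capitalize() for part in function_name.split("_"))


def function_rules(
    function_name: str,
    params: List[Tuple[str, str]],
    request_heartbeat: bool = True,
) -> List[str]:
    """Return the call rule and the params rule for one function.

    params is a list of (json_key, value_rule) pairs, e.g. ("page", "number").
    """
    rule = to_rule_name(function_name)
    call = [
        gbnf_literal("{"), "ws",
        gbnf_literal('"function":'), "ws",
        gbnf_literal('"' + function_name + '",'), "ws",
        gbnf_literal('"params":'), "ws",
        rule + "Params",
        gbnf_literal("}"),
    ]
    # Every params object starts with the inner thoughts field.
    body = [gbnf_literal("{"), "ws", "InnerThoughtsParam"]
    for key, value_rule in params:
        body += [gbnf_literal(","), "ws", gbnf_literal('"' + key + '":'), "ws", value_rule, "ws"]
    if request_heartbeat:
        body += [gbnf_literal(","), "ws", "RequestHeartbeatParam", "ws"]
    body.append(gbnf_literal("}"))
    return [rule + " ::= " + " ".join(call), rule + "Params ::= " + " ".join(body)]


if __name__ == "__main__":
    for line in function_rules("archival_memory_insert", [("content", "string")]):
        print(line)
```

The new rule name would still have to be added to the `Function` alternatives by hand (or by a similar helper).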

@cpacker
Owner Author

cpacker commented Nov 4, 2023

This is really awesome, thank you @Drake-AI! I'll merge it into this PR as an additional grammar file.

Out of curiosity, @Drake-AI, which backend are you using while testing this? llama.cpp? web UI?

@Drake-AI
Contributor

Drake-AI commented Nov 4, 2023

koboldcpp, which is a fork of llama.cpp. It should work with both.

@cpacker
Owner Author

cpacker commented Nov 4, 2023

Awesome. I guess I should probably also add "official" support for koboldcpp with a catch for BACKEND_TYPE=kobold, though since it's a fork of llama.cpp I wouldn't be surprised if BACKEND_TYPE=llamacpp already works for it.

@Drake-AI
Contributor

Drake-AI commented Nov 4, 2023

No, it needs these variables:

"stop_sequence": [
    "\nUSER:",
    "\nASSISTANT:",
    "\nFUNCTION RETURN:",
    # '\n' +
    # '</s>',
    # '<|',
    # '\n#',
    # '\n\n\n',
],
"max_context_length": 4096,
"max_length": 512,

You can also send all the sampling parameters, such as temperature, via the API.
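
As an illustration of how these settings might be combined with a grammar and per-request sampling parameters, here is a small sketch. The helper `build_kobold_request` is hypothetical, and the key names (`stop_sequence`, `max_context_length`, `max_length`, `grammar`, `temperature`) are assumptions about the koboldcpp generate API that should be verified against the backend's documentation.

```python
import copy
import json

# Base settings for the request; key names are assumptions about the API.
KOBOLD_SETTINGS = {
    "stop_sequence": ["\nUSER:", "\nASSISTANT:", "\nFUNCTION RETURN:"],
    "max_context_length": 4096,
    "max_length": 512,
}


def build_kobold_request(prompt, grammar=None, **overrides):
    """Merge the base settings with a prompt, an optional grammar, and overrides."""
    request = copy.deepcopy(KOBOLD_SETTINGS)  # never mutate the shared defaults
    request.update(overrides)                 # e.g. temperature=0.7
    request["prompt"] = prompt
    if grammar is not None:
        request["grammar"] = grammar          # GBNF text, assumed to be accepted as a string
    return request


if __name__ == "__main__":
    body = build_kobold_request("USER: hi\nASSISTANT:", grammar='root ::= "{}"', temperature=0.7)
    print(json.dumps(body, indent=2))
```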

@Drake-AI
Contributor

Drake-AI commented Nov 4, 2023

I found a bug in my grammar. Please switch to this new version before merging:

root ::= Function
Function ::= SendMessage | PauseHeartbeats | CoreMemoryAppend | CoreMemoryReplace | ConversationSearch | ConversationSearchDate | ArchivalMemoryInsert | ArchivalMemorySearch
SendMessage ::= "{"   ws   "\"function\":"   ws   "\"send_message\","   ws   "\"params\":"   ws   SendMessageParams   "}"
PauseHeartbeats ::= "{"   ws   "\"function\":"   ws   "\"pause_heartbeats\","   ws   "\"params\":"   ws   PauseHeartbeatsParams   "}"
CoreMemoryAppend ::= "{"   ws   "\"function\":"   ws   "\"core_memory_append\","   ws   "\"params\":"   ws   CoreMemoryAppendParams   "}"
CoreMemoryReplace ::= "{"   ws   "\"function\":"   ws   "\"core_memory_replace\","   ws   "\"params\":"   ws   CoreMemoryReplaceParams   "}"
ConversationSearch  ::= "{"   ws   "\"function\":"   ws   "\"conversation_search\","   ws   "\"params\":"   ws   ConversationSearchParams   "}"
ConversationSearchDate  ::= "{"   ws   "\"function\":"   ws   "\"conversation_search_date\","   ws   "\"params\":"   ws   ConversationSearchDateParams   "}"
ArchivalMemoryInsert  ::= "{"   ws   "\"function\":"   ws   "\"archival_memory_insert\","   ws   "\"params\":"   ws   ArchivalMemoryInsertParams   "}"
ArchivalMemorySearch  ::= "{"   ws   "\"function\":"   ws   "\"archival_memory_search\","   ws   "\"params\":"   ws   ArchivalMemorySearchParams   "}"
SendMessageParams ::= "{"   ws   InnerThoughtsParam   ","   ws   "\"message\":"   ws   string   ws   "}"
PauseHeartbeatsParams ::= "{"   ws   InnerThoughtsParam   ","      ws   "\"minutes\":"   ws   number   ws   "}"
CoreMemoryAppendParams ::= "{"   ws   InnerThoughtsParam   ","      ws   "\"name\":"   ws   namestring   ","   ws   "\"content\":"   ws   string   ws   ","   ws   RequestHeartbeatParam   ws   "}"
CoreMemoryReplaceParams ::= "{"   ws   InnerThoughtsParam   ","      ws   "\"name\":"   ws   namestring   ","   ws   "\"old_content\":"   ws   string   ","   ws   "\"new_content\":"   ws   string   ws   ","   ws   RequestHeartbeatParam   ws   "}"
ConversationSearchParams ::= "{"   ws   InnerThoughtsParam   ","      ws   "\"query\":"   ws   string   ws   ","   ws   "\"page\":"   ws   number   ws   ","   ws   RequestHeartbeatParam   ws   "}"
ConversationSearchDateParams ::= "{"   ws   InnerThoughtsParam   ","      ws   "\"start_date\":"   ws   string   ws   ","      ws   "\"end_date\":"   ws   string   ws   ","   ws   "\"page\":"   ws   number   ws   ","   ws   RequestHeartbeatParam   ws   "}"
ArchivalMemoryInsertParams ::= "{"   ws   InnerThoughtsParam    ","   ws   "\"content\":"   ws   string   ws   ","   ws   RequestHeartbeatParam   ws   "}"
ArchivalMemorySearchParams ::= "{"   ws   InnerThoughtsParam   ","  ws   "\"query\":"   ws   string   ws   ","   ws   "\"page\":"   ws   number   ws   ","   ws   RequestHeartbeatParam   ws   "}"
InnerThoughtsParam ::= "\"inner_thoughts\":"   ws   string
RequestHeartbeatParam ::= "\"request_heartbeat\":"   ws   boolean
namestring ::= "\"human\"" | "\"persona\""
string ::= "\""   ([^"\[\]{}]*)   "\""
boolean ::= "true" | "false"
ws ::= ""
number ::= [0-9]+
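
To make the effect of the empty `ws` rule concrete: with no whitespace allowed, the only output the `SendMessage` rule accepts is compact JSON with no spaces between tokens. The sketch below mirrors that one rule as a Python regular expression purely for illustration. The regex, `STRING`, and `matches_send_message` are assumptions written for this example, not how llama.cpp or koboldcpp evaluate grammars.

```python
import json
import re

# Mirrors: string ::= "\"" ([^"\[\]{}]*) "\""
STRING = r'"[^"\[\]{}]*"'

# Mirrors SendMessage and SendMessageParams with every ws expanded to the empty string.
SEND_MESSAGE = re.compile(
    r'\{"function":"send_message","params":'
    r'\{"inner_thoughts":' + STRING + r',"message":' + STRING + r'\}\}'
)


def matches_send_message(text: str) -> bool:
    """Return True if text is exactly one send_message call in the compact form."""
    return SEND_MESSAGE.fullmatch(text) is not None


if __name__ == "__main__":
    compact = '{"function":"send_message","params":{"inner_thoughts":"Greet the user.","message":"Hello!"}}'
    spaced = '{"function": "send_message", "params": {"inner_thoughts": "Greet the user.", "message": "Hello!"}}'
    print(matches_send_message(compact), matches_send_message(spaced))
    print(json.loads(compact)["params"]["message"])
```

Note that the `string` rule still admits raw newlines and backslashes, so output that matches the grammar is not guaranteed to parse with `json.loads`.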

@cpacker
Owner Author

cpacker commented Nov 4, 2023

Updated, thank you for the patch! I also added koboldcpp support, though I haven't tested it yet.

@cpacker cpacker marked this pull request as ready for review November 4, 2023 17:57
@cpacker cpacker changed the title [Draft] Add grammar-based sampling (initially via llama.cpp server directly) Add grammar-based sampling (initially via llama.cpp server directly) Nov 4, 2023
@cpacker cpacker requested a review from vivi November 4, 2023 17:58
@cpacker cpacker changed the title Add grammar-based sampling (initially via llama.cpp server directly) Add grammar-based sampling (for webui, llamacpp, and koboldcpp) Nov 4, 2023
@cpacker
Owner Author

cpacker commented Nov 4, 2023

Tested on koboldcpp, seems to be working:

MemGPT % python main.py --model airoboros-l2-70b-2.1-grammar --first

@cpacker
Owner Author

cpacker commented Nov 4, 2023

Will merge pending an extra review from @vivi. @Drake-AI, let me know if you catch any bugs in the grammar in the meantime.

Contributor

@vivi vivi left a comment

LGTM 😈

@vivi
Contributor

vivi commented Nov 4, 2023

Thanks for your help @Drake-AI! Will merge this in and add you as a co-author.

@vivi vivi merged commit 1524d6a into main Nov 4, 2023
2 checks passed
@cpacker cpacker deleted the grammar-based-sampling branch November 5, 2023 05:25
maociao added a commit to maociao/MemGPT that referenced this pull request Dec 10, 2023
* FIx cpacker#261 (cpacker#300)

* should fix issue 261 - pickle fail on DotDict class

* black patch

---------

Co-authored-by: cpacker <packercharles@gmail.com>

* Add grammar-based sampling (for webui, llamacpp, and koboldcpp) (cpacker#293)

* add llamacpp server support

* use gbnf loader

* cleanup and warning about grammar when not using llama.cpp

* added memgpt-specific grammar file

* add grammar support to webui api calls

* black

* typo

* add koboldcpp support

* no more defaulting to webui, should error out instead

* fix grammar

* patch kobold (testing, now working) + cleanup log messages

Co-Authored-By: Drake-AI <drake-ai@users.noreply.github.com>

* Bump version to 0.1.18-alpha.1

* fix: import PostgresStorageConnector only if postgres is selected as storage type (cpacker#310)

* Don't import postgres storage if not specified in config (cpacker#318)

* Aligned code with README that environment variable for Azure embeddings should be AZURE_OPENAI_EMBEDDINGS_DEPLOYMENT (cpacker#308)

* Fix: imported wrong storage connector  (cpacker#320)

* Fix formatting in README.md

* Remove embeddings as argument in archival_memory.insert (cpacker#284)

* Create docs pages (cpacker#328)

* Create docs  (cpacker#323)

* Create .readthedocs.yaml

* Update mkdocs.yml

* update

* revise

* syntax

* syntax

* syntax

* syntax

* revise

* revise

* spacing

* Docs (cpacker#327)

* add stuff

* patch homepage

* more docs

* updated

* updated

* refresh

* refresh

* refresh

* update

* refresh

* refresh

* refresh

* refresh

* missing file

* refresh

* refresh

* refresh

* refresh

* fix black

* refresh

* refresh

* refresh

* refresh

* add readme for just the docs

* Update README.md

* add more data loading docs

* cleanup data sources

* refresh

* revised

* add search

* make prettier

* revised

* updated

* refresh

* favi

* updated

---------

Co-authored-by: Sarah Wooders <sarahwooders@gmail.com>

* patch in-chat command info (cpacker#332)

* Update chat_completion_proxy.py (cpacker#326)

grammar_name has to be defined; otherwise there's an issue with line 92

* cleanup cpacker#326 (cpacker#333)

* Stopping the app to repeat the user message in normal use. (cpacker#304)

- Stopped repeating every user message as if in debug mode
- Re-added the "dump" flag for the user message, to make it look nicer.
  I may "reformat" other messages too when dumping, but that was what
  stuck out to me as unpleasant.

* Remove redundant docs from README (cpacker#334)

* Fix README local LLM link

* Add autogen+localllm docs (cpacker#335)

Co-authored-by: Jirito0 <jirito0@users.noreply.github.com>

* Update quickstart.md to show flag list properly

* Add `memgpt version` command and package version (cpacker#336)

* add ollama support (cpacker#314)

* untested

* patch

* updated

* clarified using tags in docs

* tested ollama, working

* fixed template issue by creating dummy template, also added missing context length indicator

* moved count_tokens to utils.py

* clean

* Better interface output for function calls (cpacker#296)

Co-authored-by: Charles Packer <packercharles@gmail.com>

* Better error message printing for function call failing (cpacker#291)

* Better error message printing for function call failing

* only one import traceback

* don't forward entire stack trace to memgpt

* Fixing some dict value checking for function_call (cpacker#249)

* Specify model inference and embedding endpoint separately  (cpacker#286)

* Fix config tests (cpacker#343)

Co-authored-by: Vivian Fang <hi@vivi.sh>

* Avoid throwing error for older `~/.memgpt/config` files due to missing section `archival_storage` (cpacker#344)

* avoid error if has old config type

* Dependency management  (cpacker#337)

* Divides dependencies into `pip install pymemgpt[legacy,local,postgres,dev]`. 
* Update docs

* Relax verify_first_message_correctness to accept any function call (cpacker#340)

* Relax verify_first_message_correctness to accept any function call

* Also allow missing internal monologue if request_heartbeat

* Cleanup

* get instead of raw dict access

* Update `poetry.lock` (cpacker#346)

* mark depricated API section

* add readme

* add readme

* add readme

* add readme

* add readme

* add readme

* add readme

* add readme

* add readme

* CLI bug fixes for azure

* check azure before running

* Update README.md

* Update README.md

* bug fix with persona loading

* remove print

* make errors for cli flags more clear

* format

* fix imports

* fix imports

* add prints

* update lock

* Add autogen example that lets you chat with docs (cpacker#342)

* Relax verify_first_message_correctness to accept any function call

* Also allow missing internal monologue if request_heartbeat

* Cleanup

* get instead of raw dict access

* Support attach in memgpt autogen agent

* Add docs example

* Add documentation, cleanup

* add gpt-4-turbo (cpacker#349)

* add gpt-4-turbo

* add in another place

* change to 3.5 16k

* Revert relaxing verify_first_message_correctness, still add archival_memory_search as an exception (cpacker#350)

* Revert "Relax verify_first_message_correctness to accept any function call (cpacker#340)"

This reverts commit 30e9110.

* add archival_memory_search as an exception for verify

* Bump version to 0.1.18 (cpacker#351)

* Remove `requirements.txt` and `requirements_local.txt` (cpacker#358)

* update requirements to match poetry

* update with extras

* remove requirements

* disable pretty exceptions (cpacker#367)

* Updated documentation for users (cpacker#365)


---------

Co-authored-by: Vivian Fang <hi@vivi.sh>

* Create pull_request_template.md (cpacker#368)

* Create pull_request_template.md

* Add pymemgpt-nightly workflow (cpacker#373)

* Add pymemgpt-nightly workflow

* change token name

* Update lmstudio.md (cpacker#382)

* Update lmstudio.md

* Update lmstudio.md

* Update lmstudio.md to show the Prompt Formatting Option (cpacker#384)

* Update lmstudio.md to show the Prompt Formatting Option

* Update lmstudio.md Update the screenshot

* Swap asset location from cpacker#384 (cpacker#385)

* Update poetry with `pg8000` and include `pgvector` in docs  (cpacker#390)

* Allow overriding config location with `MEMGPT_CONFIG_PATH` (cpacker#383)

* Always default to local embeddings if not OpenAI or Azure  (cpacker#387)

* Add support for larger archival memory stores (cpacker#359)

* Replace `memgpt run` flags error with warning + remove custom embedding endpoint option + add agent create time (cpacker#364)

* Update webui.md (cpacker#397)

turn emoji warning into markdown warning

* Update webui.md (cpacker#398)

* softpass test when keys are missing (cpacker#369)

* softpass test when keys are missing

* update to use local model

* both openai and local

* typo

* fix

* dont hard code embeddings

* formatting

* black

* add full deps

* remove changes

* update poetry

---------

Co-authored-by: Sarah Wooders <sarahwooders@gmail.com>
Co-authored-by: Vivian Fang <hi@vivi.sh>
Co-authored-by: MSZ-MGS <65172063+MSZ-MGS@users.noreply.github.com>

* Use `~/.memgpt/config` to set questionary defaults in `memgpt configure` + update tests to use specific config path (cpacker#389)

* Dockerfile for running postgres locally (cpacker#393)

* Return empty list if archival memory search over empty local index  (cpacker#402)

* Remove AsyncAgent and async from cli (cpacker#400)

* Remove AsyncAgent and async from cli

Refactor agent.py memory.py

Refactor interface.py

Refactor main.py

Refactor openai_tools.py

Refactor cli/cli.py

stray asyncs

save

make legacy embeddings not use async

Refactor presets

Remove deleted function from import

* remove stray prints

* typo

* another stray print

* patch test

---------

Co-authored-by: cpacker <packercharles@gmail.com>

* I added some json repairs that helped me with malformed messages (cpacker#341)

* I added some json repairs that helped me with malformed messages

There are two of them: The first will remove hard line feeds that appear
in the message part because the model added those instead of escaped
line feeds. This happens a lot in my experiments and that actually fixes
them.

The second one is less tested and should handle the case that the model
answers with multiple blocks of strings in quotes or even uses unescaped
quotes. It should grab everything between the message: " and the ending
curly braces, escape them and make it proper JSON that way.

Disclaimer: Both functions were written with the help of ChatGPT-4 (I
can't write much Python). I think the first one is quite solid but doubt
that the second one is fully working. Maybe somebody with more Python
skills than me (or with more time) has a better idea for this type of
malformed replies.

* Moved the repair output behind the debug flag and removed the "clean" one

* Added even more fixes (out of what I just encountered while testing)

It seems that cut-off JSON can be corrected, and sometimes the model is too
lazy to add not just one curly brace but two. I think it does not "cost"
a lot to try them all out. But the exceptions get massive that way :)

* black

* for the final hail mary with extract_first_json, might as well add a double end bracket instead of single

---------

Co-authored-by: cpacker <packercharles@gmail.com>

* Fix max tokens constant (cpacker#374)

* stripped LLM_MAX_TOKENS constant, instead it's a dictionary, and context_window is set via the config (defaults to 8k)

* pass context window in the calls to local llm APIs

* safety check

* remove dead imports

* context_length -> context_window

* add default for agent.load

* in configure, ask for the model context window if not specified via dictionary

* fix default, also make message about OPENAI_API_BASE missing more informative

* make openai default embedding if openai is default llm

* make openai on top of list

* typo

* also make local the default for embeddings if you're using localllm instead of the locallm endpoint

* provide --context_window flag to memgpt run

* fix runtime error

* stray comments

* stray comment

* [version] bump version to 0.2.0 (cpacker#410)

* Fix main.yml to not rely on requirements.txt (cpacker#411)

* Hotfix openai create all with context_window kwarg (cpacker#413)

* fix agent load (cpacker#412)

* Patch local LLMs with context_window (cpacker#416)

* patch

* patch ollama

* patch lmstudio

* patch kobold

* Fix model configuration for when `config.model == "local"` previously  (cpacker#415)

* fix agent load

* fix model config

* add errors to make sure envs set correctly (cpacker#418)

* [version] bump version to 0.2.1 (cpacker#417)

* fix memgptagent attach docs error (cpacker#427)

Co-authored-by: Anjalee Sudasinghe <anjalee@codegen.net>

* [fix] remove asserts for `OPENAI_API_BASE` (cpacker#432)

* mark depricated API section

* add readme

* add readme

* add readme

* add readme

* add readme

* add readme

* add readme

* add readme

* add readme

* CLI bug fixes for azure

* check azure before running

* Update README.md

* Update README.md

* bug fix with persona loading

* remove print

* make errors for cli flags more clear

* format

* fix imports

* fix imports

* add prints

* update lock

* remove asserts

* patch (cpacker#435)

* patch cpacker#428 (cpacker#433)

* [version] bump release to 0.2.2 (cpacker#436)

* fix config (cpacker#438)

* Configurable presets to support easy extension of MemGPT's function set (cpacker#420)

* partial

* working schema builder, tested that it matches the hand-written schemas

* correct another schema diff

* refactor

* basic working test

* refactored preset creation to use yaml files

* added docstring-parser

* add code for dynamic function linking in agent loading

* pretty schema diff printer

* support pulling from ~/.memgpt/functions/*.py

* clean

* allow looking for system prompts in ~/.memgpt/system_prompts

* create ~/.memgpt/system_prompts if it doesn't exist

* pull presets from ~/.memgpt/presets in addition to examples folder

* add support for loading agent configs that have additional keys

---------

Co-authored-by: Sarah Wooders <sarahwooders@gmail.com>

* WebSocket interface and basic `server.py` process (cpacker#399)

* patch getargspec error (cpacker#440)

* always cast `config.context_window` to `int` before use (cpacker#444)

* always cast config.context_window to int before use

* extra code to be super safe if self.config.context_window is somehow None

* Refactor config + determine LLM via `config.model_endpoint_type` (cpacker#422)

* mark depricated API section

* CLI bug fixes for azure

* check azure before running

* Update README.md

* Update README.md

* bug fix with persona loading

* remove print

* make errors for cli flags more clear

* format

* fix imports

* fix imports

* add prints

* update lock

* update config fields

* cleanup config loading

* commit

* remove asserts

* refactor configure

* put into different functions

* add embedding default

* pass in config

* fixes

* allow overriding openai embedding endpoint

* black

* trying to patch tests (some circular import errors)

* update flags and docs

* patched support for local llms using endpoint and endpoint type passed via configs, not env vars

* missing files

* fix naming

* fix import

* fix two runtime errors

* patch ollama typo, move ollama model question pre-wrapper, modify question phrasing to include link to readthedocs, also have a default ollama model that has a tag included

* disable debug messages

* made error message for failed load more informative

* don't print dynamic linking function warning unless --debug

* updated tests to work with new cli workflow (disabled openai config test for now)

* added skips for tests when vars are missing

* update bad arg

* revise test to soft pass on empty string too

* don't run configure twice

* extend timeout (try to pass against nltk download)

* update defaults

* typo with endpoint type default

* patch runtime errors for when model is None

* catching another case of 'x in model' when model is None (preemptively)

* allow overrides to local llm related config params

* made model wrapper selection from a list vs raw input

* update test for select instead of input

* Fixed bug in endpoint when using local->openai selection, also added validation loop to manual endpoint entry

* updated error messages to be more informative with links to readthedocs

* add back gpt3.5-turbo

---------

Co-authored-by: cpacker <packercharles@gmail.com>

* patch bad merge

* patch websocket server after presets refactor

* Update config to include `memgpt_version` and re-run configuration for old versions on `memgpt run` (cpacker#450)

* mark deprecated API section

* add readme

* add readme

* add readme

* add readme

* add readme

* add readme

* add readme

* add readme

* add readme

* CLI bug fixes for azure

* check azure before running

* Update README.md

* Update README.md

* bug fix with persona loading

* remove print

* make errors for cli flags more clear

* format

* fix imports

* fix imports

* add prints

* update lock

* remove asserts

* store config versions and force update in some cases

* Add load and load_and_attach functions to memgpt autogen agent. (cpacker#430)

* Add load and load_and_attach functions to memgpt autogen agent.

* Only recompute files if dataset does not exist.

* Update documentation [local LLMs, presets] (cpacker#453)

* updated local llm documentation

* updated cli flags to be consistent with documentation

* added preset documentation

* update test to use new arg

* update test to use new arg

* missing .md file

* When default_mode_endpoint has a value, it needs to become model_endpoint. (cpacker#452)

Co-authored-by: Oliver Smith <oliver.smith@superevilmegacorp.com>

* Upgrade workflows to Python 3.11 (cpacker#441)

* use python 3.11

* change format

* [version] bump version to 0.2.3 (cpacker#457)

* Set service context for llama index in `local.py`  (cpacker#462)

* mark deprecated API section

* add readme

* add readme

* add readme

* add readme

* add readme

* add readme

* add readme

* add readme

* add readme

* CLI bug fixes for azure

* check azure before running

* Update README.md

* Update README.md

* bug fix with persona loading

* remove print

* make errors for cli flags more clear

* format

* fix imports

* fix imports

* add prints

* update lock

* remove asserts

* bump version

* set global context for llama index

* Update functions.md (cpacker#461)

* bugfix for linking functions from ~/.memgpt/functions (cpacker#463)

* Add d20 function example to readthedocs (cpacker#464)

* Update functions.md

* Update functions.md

* move webui to new openai completions endpoint, but also provide existing functionality via webui-legacy backend (cpacker#468)

* updated websocket protocol and server (cpacker#473)

* Lancedb storage integration (cpacker#455)

* Docs: Fix typos (cpacker#477)

* Remove .DS_Store from agents list (cpacker#485)

* Fix cpacker#487 (summarize call uses OpenAI even with local LLM config) (cpacker#488)

* use new chatcompletion function that takes agent config inside of summarize

* patch issue with model now missing

* patch web UI (cpacker#484)

* patch web UI

* set truncation_length

* ANNA, an acronym for Adaptive Neural Network Assistant. Which acts as your personal research assistant really good with archival documents and research. (cpacker#494)

* vLLM support (cpacker#492)

* init vllm (not tested), uses POST API not openai wrapper

* add to cli config list

* working vllm endpoint

* add model configuration for vllm

---------

Co-authored-by: Sarah Wooders <sarahwooders@gmail.com>

* Add error handling during linking imports (cpacker#495)

* Add error handling during linking imports

* correct typo + make error message even more explicit

* deadcode

* Fixes bugs with AutoGen implementation and examples (cpacker#498)

* patched bugs in autogen agent example, updated autogen agent creation to follow agentconfig paradigm

* more fixes

* black

* fix bug in autoreply

* black

* pass default autoreply through to the memgpt autogen ConversableAgent subclass so that it doesn't leave empty messages, which can trigger errors in local llm backends like lmstudio

* update version (cpacker#497)

* add new manual json parser meant to catch send_message calls with trailing bad extra chars (cpacker#509)

* add new manual json parser meant to catch send_message calls with stray trailing chars, patch json error passing

* typo

* add a longer prefix to the default wrapper (cpacker#510)

* add a longer prefix to the default wrapper (not just opening brace, but up to 'function: ' part since that is always present)

* drop print

* add core memory char limits to text shown in core memory (cpacker#508)

* add core memory char limits to text shown in core memory

* include char limit in xml tag

* add flag to allow reverting to old version

* extra arg being passed causing a runtime error (cpacker#517)

* Add warning if no data sources loaded on `/attach` command  (cpacker#513)

* minor fix

* add warn instead of error for no data sources

* fix autogem to autogen (cpacker#512)

* Update contributing guidelines  (cpacker#516)

* update contributing

* Update CONTRIBUTING.md

* Update CONTRIBUTING.md

* Update CONTRIBUTING.md

* Update CONTRIBUTING.md

* Update CONTRIBUTING.md

* Update contributing.md (cpacker#518)

* Update contributing.md (cpacker#520)

* Add support for HuggingFace Text Embedding Inference endpoint for embeddings  (cpacker#524)

* Update mkdocs theme, small fixes for `mkdocs.yml` (cpacker#522)

* Update mkdocs.yml (cpacker#525)

* Clean memory error messages (cpacker#523)

* Raise a custom keyerror instead of basic keyerror to clarify issue to LLM processor

* remove self value from error message passed to LLM processor

* simplify error message propagated to llm processor

* Fix class names used in persistence manager logging (cpacker#503)

* Fix class names used in persistence manager logging

Signed-off-by: Claudio Cambra <developer@claudiocambra.com>

* Use self.__class__.__name__ for logging in different persistence managers

Signed-off-by: Claudio Cambra <developer@claudiocambra.com>

---------

Signed-off-by: Claudio Cambra <developer@claudiocambra.com>

* add autogen extra (cpacker#530)

* Add `user` field for vLLM endpoint  (cpacker#531)

* patched a bug where outputs of a regex extraction weren't getting cast back to string, causing an issue when the dict was then passed to json.dumps() (cpacker#533)

* Update bug_report.md (cpacker#532)

* Update bug_report.md

* LanceDB integration bug fixes and improvements (cpacker#528)

* fixes

* update

* lint

* Remove `openai` package and migrate to requests (cpacker#534)

* Update contributing.md (typo) (cpacker#538)

* Run formatting checks with poetry (cpacker#537)

* update black version

* add workflow dispatch

* Removing dead code + legacy commands  (cpacker#536)

* Remove usage of `BACKEND_TYPE` (cpacker#539)

* Update AutoGen documentation and notebook example (cpacker#540)

* Update AutoGen documentation

* Update webui.md

* Update webui.md

* Update lmstudio.md

* Update lmstudio.md

* Update mkdocs.yml

* Update README.md

* Update README.md

* Update README.md

* Update autogen.md

* Update local_llm.md

* Update local_llm.md

* Update autogen.md

* Update autogen.md

* Update autogen.md

* refreshed the autogen examples + notebook (notebook is untested)

* unrelated patch of typo I noticed

* poetry remove pyautogen, then manually removed autogen extra in .toml

* add pdf dependency

---------

Co-authored-by: Sarah Wooders <sarahwooders@gmail.com>

* Update local_llm.md (cpacker#542)

* Documentation update (cpacker#541)

* Update autogen.md

* Update autogen.md

* clean docs (cpacker#543)

* Update autogen.md (cpacker#544)

* update docs (cpacker#547)

* update admonitions

* Update local_llm.md

* Update webui.md

* Update autogen.md

* Update storage.md

* Update example_chat.md

* Update example_data.md

* Update example_chat.md

* Update example_data.md

* added vLLM doc page since we support it (cpacker#545)

* added vLLM doc page since we support it

* capitalization

* updated documentation

* Update vllm.md

* Update ollama.md

* Update ollama.md

* Update ollama.md

* Update autogen.md

* Fix vLLM endpoint to have correct suffix (cpacker#548)

* minor fix

* fix vllm endpoint

* fix docs

* Add documentation for using Hugging Face models for embeddings  (cpacker#549)

* Update README.md

* bump version (cpacker#551)

* Add docs file for customizing embedding mode  (cpacker#554)

* minor fix

* forgot to add embedding file

* Upgrade to `llama_index=0.9.10` (cpacker#556)

* minor fix

* forgot to add embedding file

* upgrade llama index

* fix cannot import name 'EmptyIndex' from 'llama_index' (cpacker#558)

* Update README.md

* Update storage.md (cpacker#564)

fix typo

* use a consistent warning prefix across codebase (cpacker#569)

* Update autogen.md to include Azure config example + patch for `pyautogen>=0.2.0` (cpacker#555)

* Update autogen.md

* in groupchat example add an azure elif

* fixed missing azure mappings + corrected the gpt-4-turbo one

* Updated MemGPT AutoGen agent to take credentials and store them in the config (allows users to use memgpt+autogen without running memgpt configure), also patched api_base kwarg for autogen >=v0.2

* add note about 0.2 testing

* added overview to autogen integration page

* default examples to openai, sync config header between the two main examples, change speaker mode to round-robin in 2-way chat to suppress warning

* sync config header on last example (not used in docs)

* refactor to make sure we use existing config when writing out extra credentials

* fixed bug in local LLM where we need to comment out api_type (for pyautogen>=0.2.0)

* Update autogen.md

* Update autogen.md (cpacker#571)

Update example config to match `pyautogen==0.2.0`

* Fix crash from bad key access into response_message without function_call (cpacker#437)

Signed-off-by: Claudio Cambra <developer@claudiocambra.com>

* sort agents by directory-last-modified time (cpacker#574)

* sort agents by directory-last-modified time

* only save agent config when agent is saved

---------

Co-authored-by: Sarah Wooders <sarahwooders@gmail.com>

* Add safety check to pop (cpacker#575)

* Add safety check to pop

* typo

* Add `pyyaml` package to `pyproject.toml` (cpacker#557)

* add back dotdict for backcompat (cpacker#572)

* Bump version to 0.2.6 (cpacker#573)

* Update cli_faq.md

* Update cli_faq.md

* Update cli_faq.md

* allow passing `skip_verify` to autogen constructors (cpacker#581)

* allow passing skip_verify to autogen constructors

* added flag to examples with a NOTE, also added to docs

* Chroma storage integration  (cpacker#285)

* Fix `pyproject.toml` chroma version  (cpacker#582)

* mark deprecated API section

* add readme

* add readme

* add readme

* add readme

* add readme

* add readme

* add readme

* add readme

* add readme

* CLI bug fixes for azure

* check azure before running

* Update README.md

* Update README.md

* bug fix with persona loading

* remove print

* make errors for cli flags more clear

* format

* add initial postgres implementation

* working chroma loading

* add postgres tests

* working initial load into postgres and chroma

* add load index command

* semi working load index

* disgusting import code thanks to llama index's nasty APIs

* add postgres connector

* working postgres integration

* working local storage (changed saving)

* implement /attach

* remove old code

* split up storage connectors into multiple files

* remove unused code

* cleanup

* implement vector db loading

* cleanup state saving

* add chroma

* minor fix

* fix up chroma integration

* fix list error

* update dependencies

* update docs

* format

* cleanup

* forgot to add embedding file

* upgrade llama index

* fix data source naming bug

* remove legacy

* os import

* upgrade chroma version

* fix chroma package

* Remove broken tests from chroma merge (cpacker#584)

* fix runtime error (cpacker#586)

* Patch azure embeddings + handle azure deployments properly (cpacker#594)

* Fix bug where embeddings endpoint was getting set to deployment, upgraded pinned llama-index to use new version that has azure endpoint

* updated documentation

* added memgpt example for openai

* change wording to match configure

---------

Signed-off-by: Claudio Cambra <developer@claudiocambra.com>
Co-authored-by: danx0r <danbmil99@gmail.com>
Co-authored-by: cpacker <packercharles@gmail.com>
Co-authored-by: Drake-AI <drake-ai@users.noreply.github.com>
Co-authored-by: Vivian Fang <hi@vivi.sh>
Co-authored-by: Robin Goetz <35136007+goetzrobin@users.noreply.github.com>
Co-authored-by: Sarah Wooders <sarahwooders@gmail.com>
Co-authored-by: Dividor <matthew@regolith.org>
Co-authored-by: borewik <borewik@gmail.com>
Co-authored-by: Hans Raaf <hara@oderwat.de>
Co-authored-by: Jirito0 <jirito0@users.noreply.github.com>
Co-authored-by: Mo Nuaimat <nuaimat2002@yahoo.com>
Co-authored-by: MSZ-MGS <65172063+MSZ-MGS@users.noreply.github.com>
Co-authored-by: Bob Kerns <1154903+BobKerns@users.noreply.github.com>
Co-authored-by: Anjalee Sudasinghe <42403668+anjaleeps@users.noreply.github.com>
Co-authored-by: Anjalee Sudasinghe <anjalee@codegen.net>
Co-authored-by: Wes <wryanmedford@gmail.com>
Co-authored-by: Oliver Smith <oliver@kfs.org>
Co-authored-by: Oliver Smith <oliver.smith@superevilmegacorp.com>
Co-authored-by: Prashant Dixit <54981696+PrashantDixit0@users.noreply.github.com>
Co-authored-by: sahusiddharth <112792547+sahusiddharth@users.noreply.github.com>
Co-authored-by: Max Blackmer, CSM <max@agiletechnologist.us>
Co-authored-by: Paul Asquin <paul.asquin@gmail.com>
Co-authored-by: Claudio Cambra <developer@claudiocambra.com>
Co-authored-by: Ayush Chaurasia <ayush.chaurarsia@gmail.com>
Co-authored-by: Alex Perez <alexperezdev@gmail.com>