seed_type memorize and recall #23

Open
12 of 18 tasks
jkomoros opened this issue Jul 1, 2023 · 0 comments

jkomoros commented Jul 1, 2023

A local associative memory plugged into the library and matching a typespec.

  • Add a memory-mode remember
  • If value for memorize is an array of text, process them in parallel (combining into a single request) instead of sequentially (which costs one network round trip per item, i.e. round-trip time * items.length)
  • Consider renaming default_memory (currently _default) to something more distinctive that signals it's being used in a memory context, not a profile context (similar to ids having c_whatever)
  • Memoize hsnw readers when vending them out
  • Everywhere that a seed expects a string, have extractString() and accept an embedding too (document this)
  • Only save hsnw every so often (and on process exit)
  • If query is not provided, create a random embedding and fetch
  • Persist the text to memory (just flat json, then later duckdb). Get the vector from hsnw.getPoint()
  • Store metadata in duckdb (this is more efficient for larger memories but is less easy to debug)
  • embedding.text should not be optional
  • Profile.recall / memorize shouldn't have defaults, that should be on the caller to provide
  • Recreate embeddings of the proper type and constructor from persisted data. (Can I use query.constructor?)
  • Set maxElements intelligently and handle resizing into a larger store
  • Add a hnswlib version in ProfileFilesystem. See https://github.com/polymath-ai/polymath-ai/tree/main/core/db
  • Add a read_only ability for memory, where writing is allowed only if a secret boolean key that permits all writes is set in env. (Or maybe it should require an access key to allow writing? That would allow even remote memory writing; see Allow remote memories #28.)
  • Clean up the testing recall seed by actually using real (cached) embeddings for those values. This is somewhat important to verify the sorting is actually correct and not backwards...
  • Store the memories in .profile/memory/MEMORY-NAME/${normalized_embedding_model_name}/hsnw.db. This requires the embedding_model_name to not have any illegal path characters
  • Allow the recall.k seed argument to be omitted (possibly needs new machinery to allow optional arguments). Then maybe allow memory to be provided as an optional argument (falling back to env.memory) on recall and memorize, using the same machinery.
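The storage-path task above requires the embedding model name to be safe to use as a directory name. A minimal sketch of one way to do that, assuming a simple "replace anything outside a conservative whitelist" rule; `normalizeModelName`, `memoryPath`, and the example model name are hypothetical and not taken from the project's code:

```typescript
// Hypothetical helpers for illustration; not the project's actual implementation.

// Anything outside this conservative set is treated as unsafe in a path segment.
const ILLEGAL_PATH_CHARS = /[^A-Za-z0-9._-]/g;

// Replace every unsafe character with an underscore.
function normalizeModelName(modelName: string): string {
  return modelName.replace(ILLEGAL_PATH_CHARS, "_");
}

// Build the layout described in the task list:
// .profile/memory/MEMORY-NAME/${normalized_embedding_model_name}/hsnw.db
function memoryPath(memoryName: string, embeddingModelName: string): string {
  return [
    ".profile",
    "memory",
    memoryName,
    normalizeModelName(embeddingModelName),
    "hsnw.db",
  ].join("/");
}

console.log(memoryPath("notes", "openai.com:text-embedding-ada-002"));
```

Note that this kind of replacement is lossy: two model names that differ only in an illegal character would map to the same directory, so a real implementation may want a stricter check or an escaping scheme instead.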
This was referenced Jul 1, 2023
jkomoros added a commit that referenced this issue Jul 2, 2023
This is just a low-efficiency in-memory representation. ProfileFilesystem should implement a higher-performance version.

Part of #23.
jkomoros added a commit that referenced this issue Jul 2, 2023
jkomoros added a commit that referenced this issue Jul 2, 2023
Just lightly tested currently.

Doesn't do much until `recall` is wired up.

Part of #23.
jkomoros added a commit that referenced this issue Jul 2, 2023
jkomoros added a commit that referenced this issue Jul 2, 2023
jkomoros added a commit that referenced this issue Jul 2, 2023
jkomoros added a commit that referenced this issue Jul 2, 2023
If there were fewer than k items then it would return none of them.

Part of #23.
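The fix above (returning whatever exists when fewer than k items are stored) can be sketched with a brute-force search. This is an illustration only, assuming cosine similarity and an in-memory array; `StoredMemory`, `cosineSimilarity`, and `recall` are hypothetical names, not the project's code:

```typescript
// Hypothetical brute-force recall for illustration; not the project's implementation.
type StoredMemory = { text: string; vector: number[] };

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return up to k memories, most similar first.
function recall(memories: StoredMemory[], query: number[], k: number): StoredMemory[] {
  // Clamp k so that asking for more items than exist returns all of them
  // rather than none. (slice would clamp on its own here; the explicit limit
  // mirrors what an index that rejects an oversized k would need.)
  const limit = Math.min(k, memories.length);
  return memories
    .map((memory) => ({ memory, score: cosineSimilarity(memory.vector, query) }))
    .sort((a, b) => b.score - a.score) // descending: highest similarity first
    .slice(0, limit)
    .map((item) => item.memory);
}

const stored: StoredMemory[] = [
  { text: "b", vector: [0, 1] },
  { text: "a", vector: [1, 0] },
];
console.log(recall(stored, [1, 0.1], 5).map((m) => m.text));
```

The same example doubles as a check that results are sorted nearest-first rather than backwards.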
jkomoros added a commit that referenced this issue Jul 2, 2023
It recalls memories previously stored with `memorize`.

Currently `memorize` doesn't persist because ProfileFilesystem just holds the memories in memory. But once that is implemented it will actually be useful...

Part of #23.
jkomoros added a commit that referenced this issue Jul 3, 2023
jkomoros added a commit that referenced this issue Jul 3, 2023
…in a local db.

Currently it has a number of limitations including not actually persisting the text associated with each embedding, and a number of other TODOs in the code.

But it does roughly work.

Part of #23
jkomoros added a commit that referenced this issue Jul 3, 2023
Part of #23.
jkomoros added a commit that referenced this issue Jul 3, 2023
jkomoros added a commit that referenced this issue Jul 3, 2023
jkomoros added a commit that referenced this issue Jul 3, 2023
…BY_MODEL.

Currently it's unnecessary, but it will be soon when we start persisting the text to disk.

Part of #23.
jkomoros added a commit that referenced this issue Jul 3, 2023
It's up to the callsite to set defaults.

Part of #23.
jkomoros added a commit that referenced this issue Jul 3, 2023
jkomoros added a commit that referenced this issue Jul 3, 2023
jkomoros added a commit that referenced this issue Jul 3, 2023
…bedding.

For now it's just a JSON file that's written out to filesystem fully on each update.

Part of #23.
jkomoros added a commit that referenced this issue Jul 3, 2023
jkomoros added a commit that referenced this issue Jul 3, 2023
…rom filesystem.

We were initializing the _metadata to {} in constructor so it was never empty.

Part of #23.
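The bug described above comes from using `{}` as the initial value: an empty object looks the same as "loaded, but nothing stored", so the load step never runs. A minimal sketch of the alternative, assuming `undefined` as a "not loaded yet" sentinel and a JSON file rewritten in full on each update; `TextFile`, `MetadataStore`, and `InMemoryFile` are hypothetical stand-ins, and the real code would read and write an actual file:

```typescript
// Hypothetical abstraction over one file, so the sketch runs without disk access.
interface TextFile {
  read(): string | undefined; // undefined if the file does not exist yet
  write(contents: string): void;
}

type Metadata = Record<string, { text: string }>;

class MetadataStore {
  private file: TextFile;
  // undefined means "not loaded yet"; {} would be indistinguishable
  // from "loaded, but empty" and would skip the load step.
  private metadata: Metadata | undefined = undefined;

  constructor(file: TextFile) {
    this.file = file;
  }

  // Lazily load the persisted JSON the first time it is needed.
  private load(): Metadata {
    if (this.metadata === undefined) {
      const raw = this.file.read();
      this.metadata = raw === undefined ? {} : (JSON.parse(raw) as Metadata);
    }
    return this.metadata;
  }

  getText(id: string): string | undefined {
    return this.load()[id]?.text;
  }

  // Write the whole JSON document out on every update.
  setText(id: string, text: string): void {
    const metadata = this.load();
    metadata[id] = { text };
    this.file.write(JSON.stringify(metadata, null, "\t"));
  }
}

// In-memory stand-in for a file, used only for this demonstration.
class InMemoryFile implements TextFile {
  private contents: string | undefined = undefined;
  read(): string | undefined {
    return this.contents;
  }
  write(contents: string): void {
    this.contents = contents;
  }
}

const file = new InMemoryFile();
new MetadataStore(file).setText("0", "hello");
// A fresh store over the same file sees the persisted text.
console.log(new MetadataStore(file).getText("0"));
```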
jkomoros added a commit that referenced this issue Jul 3, 2023
If so, double the size of the index before inserting.

Part of #23.
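The doubling step can be sketched against a small interface. The method names (`getCurrentCount`, `getMaxElements`, `resizeIndex`, `addPoint`) are modeled on hnswlib-node-style names and should be treated as assumptions; `ResizableIndex`, `addWithGrowth`, and `FakeIndex` are hypothetical and exist only for this illustration:

```typescript
// Assumed shape of a fixed-capacity vector index; names are illustrative.
interface ResizableIndex {
  getCurrentCount(): number;
  getMaxElements(): number;
  resizeIndex(newMaxElements: number): void;
  addPoint(point: number[], label: number): void;
}

// If the index is full, double its capacity before inserting.
function addWithGrowth(index: ResizableIndex, point: number[], label: number): void {
  if (index.getCurrentCount() >= index.getMaxElements()) {
    // Math.max guards the degenerate case of a zero-capacity index.
    index.resizeIndex(Math.max(1, index.getMaxElements() * 2));
  }
  index.addPoint(point, label);
}

// Toy index that refuses inserts past its capacity, for demonstration only.
class FakeIndex implements ResizableIndex {
  private points = new Map<number, number[]>();
  private maxElements: number;

  constructor(maxElements: number) {
    this.maxElements = maxElements;
  }
  getCurrentCount(): number {
    return this.points.size;
  }
  getMaxElements(): number {
    return this.maxElements;
  }
  resizeIndex(newMaxElements: number): void {
    this.maxElements = newMaxElements;
  }
  addPoint(point: number[], label: number): void {
    if (this.points.size >= this.maxElements) {
      throw new Error("index is full");
    }
    this.points.set(label, point);
  }
}

const index = new FakeIndex(2);
for (let label = 0; label < 5; label++) {
  addWithGrowth(index, [label, label], label);
}
console.log(index.getCurrentCount(), index.getMaxElements());
```

Doubling keeps the number of resizes logarithmic in the number of stored items, at the cost of up to half the capacity sitting unused.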
jkomoros added a commit that referenced this issue Jul 3, 2023
…bedding can be used.

Where an embedding is found, it will use embedding.text.

Part of #23.
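A minimal sketch of that behavior, assuming an embedding object that always carries its source text (matching the task that `embedding.text` should not be optional). `extractString` is named in the task list, but the `Embedding` class shape and the function's signature here are assumptions for illustration:

```typescript
// Hypothetical embedding shape; the real class likely carries more (model, etc.).
class Embedding {
  readonly vector: number[];
  readonly text: string; // required, so extractString can always fall back to it

  constructor(vector: number[], text: string) {
    this.vector = vector;
    this.text = text;
  }
}

// Accept either a plain string or an embedding wherever a seed expects a string.
function extractString(input: string | Embedding): string {
  return typeof input === "string" ? input : input.text;
}

console.log(extractString("plain text"));
console.log(extractString(new Embedding([0.1, 0.2], "text behind the embedding")));
```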
jkomoros added a commit that referenced this issue Jul 3, 2023
…og statements and have them only logged

if verbose is set.

Part of #14. Part of #23.
jkomoros added a commit that referenced this issue Jul 3, 2023
jkomoros added a commit that referenced this issue Jul 3, 2023
Embeddings are so long that printing them all out in weird JSON was unnecessary and overwhelming.

Part of #23.
jkomoros added a commit that referenced this issue Jul 3, 2023
…default_memory` to make it more obvious

what context they're in.

Part of #23.
jkomoros added a commit that referenced this issue Jul 3, 2023
jkomoros added a commit that referenced this issue Jul 3, 2023
Turns out it doesn't require complex new machinery; `input.default` was also handled similarly.

Part of #23.
jkomoros added a commit that referenced this issue Jul 3, 2023
If provided, then it will be used instead of the `memory` env variable.

Part of #23.
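The precedence described above can be sketched as a small resolver. This assumes the order is explicit argument, then the `memory` env variable, then a default supplied by the call site (in line with the earlier note that defaults are the caller's job); `Environment`, `resolveMemoryName`, and the example names are hypothetical:

```typescript
// Hypothetical environment shape; only the `memory` variable matters here.
type Environment = { memory?: string };

// Explicit seed argument wins, then env.memory, then the caller-provided default.
function resolveMemoryName(
  argument: string | undefined,
  env: Environment,
  callerDefault: string
): string {
  return argument ?? env.memory ?? callerDefault;
}

console.log(resolveMemoryName("notes", { memory: "work" }, "fallback"));
console.log(resolveMemoryName(undefined, { memory: "work" }, "fallback"));
console.log(resolveMemoryName(undefined, {}, "fallback"));
```

Using `??` rather than `||` means only a missing value triggers the fallback; an empty string passed explicitly would be kept as-is, which a real implementation may or may not want.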
jkomoros added a commit that referenced this issue Jul 3, 2023
jkomoros changed the title from "seed_type remember and recall" to "seed_type memorize and recall" Jul 4, 2023
jkomoros added a commit that referenced this issue Jul 23, 2023