Tasti engine #50

ttt-77 · 2023-10-10T02:44:55Z

No description provided.

# Conflicts: # aidb/engine/base_engine.py

# Conflicts: # aidb/query/query.py # aidb/utils/constants.py # aidb/vector_database/chroma_vector_database.py # aidb/vector_database/faiss_vector_database.py # aidb/vector_database/tasti.py # aidb/vector_database/vector_database.py # aidb/vector_database/weaviate_vector_database.py # tests/tasti_test/tasti_test.py

ttt-77 · 2023-10-10T04:20:22Z

Could you review this branch? @ddkang

aidb/config/config_types.py

aidb/engine/tasti_engine.py

aidb/engine/full_scan_engine.py

ddkang · 2023-10-10T06:56:25Z

I think part of the reason this PR is a bit confusing to review is that it isn't connected to any downstream query. I think it would be best to merge this with an example of a downstream query optimization. I think there are a few options:

1. Implement a limit query optimization, where the records are searched by how close they are to cluster representatives that match the predicate
2. Implement the optimization where the inference services are ordered for a select query with a complex predicate
3. Wait for the approximate aggregation PR to be merged and implement control variates

What do you think? @ttt-77
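The limit query option above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in this PR: `proxy_scores`, `search_order`, and the `1 / (1 + distance)` scoring are hypothetical, and each record is scored only by its distance to the nearest cluster representative that matches the predicate.

```python
import math


def proxy_scores(embeddings, rep_ids, rep_matches):
    """Score every record by closeness to the nearest matching representative.

    embeddings:  dict of record id -> embedding tuple (hypothetical layout)
    rep_ids:     ids of the cluster representatives
    rep_matches: dict of representative id -> True if it satisfies the predicate
    """
    matching = [r for r in rep_ids if rep_matches[r]]
    scores = {}
    for rec_id, vec in embeddings.items():
        if not matching:
            # No representative matches, so there is nothing to rank by.
            scores[rec_id] = 0.0
            continue
        nearest = min(math.dist(vec, embeddings[r]) for r in matching)
        # Assumed scoring: closer to a matching representative means a higher score.
        scores[rec_id] = 1.0 / (1.0 + nearest)
    return scores


def search_order(scores):
    """Return record ids from highest to lowest proxy score (ties broken by id)."""
    return sorted(scores, key=lambda rec_id: (-scores[rec_id], rec_id))


embeddings = {0: (0.0, 0.0), 1: (0.1, 0.0), 2: (5.0, 5.0), 3: (5.1, 5.0)}
rep_ids = [0, 2]
rep_matches = {0: False, 2: True}  # only representative 2 matches the predicate

scores = proxy_scores(embeddings, rep_ids, rep_matches)
order = search_order(scores)
```

A limit engine would then run the inference services on records in `order` and stop once enough matching rows are found.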

ttt-77 · 2023-10-10T07:14:00Z

Yes, when I was writing this PR, I was not clear about what should be returned from the TASTI engine and how it could be combined with a specific query. I checked the previous AIDB code and the TASTI paper many times. I think in the previous code, the proxy scores are only used to order the inference services. Maybe I am missing something. I will consider the first and second options and determine whether any changes need to be made to the current PR.

ttt-77 · 2023-10-10T07:29:23Z

And maybe I can also add a design document to the Wiki page?

ddkang · 2023-10-10T14:08:08Z

A design doc sounds good. I'd like to see an actual implementation of 1, 2, or 3 in this PR, otherwise it's a bit hard to tell.

ttt-77 · 2023-10-10T14:56:02Z

I will do that

ttt-77 · 2023-10-13T15:24:04Z

@ddkang Could you please review the new commit? And here is the design document, https://github.com/ddkang/aidb-new/wiki/TASTI-Engine-%E2%80%90-Design-Document

aidb/engine/limit_engine.py

aidb/engine/tasti_engine.py

ttt-77 · 2023-10-15T05:15:01Z

This PR is ready for review @ddkang

aidb/query/query.py

aidb/engine/engine.py

aidb/engine/limit_engine.py

aidb/engine/tasti_engine.py

tests/test_tasti_engine.py

aidb/engine/base_engine.py

akash17mittal · 2023-10-16T16:02:39Z

@ttt-77 Just to be sure that the proxy score implementation and all is correct, I would also suggest that you compare the number of inference service calls for limit queries with these proxy scores against some adversarial proxy scores (maybe you can test with (1 - proxy_score)). If the implementation is correct, the number of inference service calls will be fewer in the case of perfect proxy scores.
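A minimal sketch of the comparison suggested above, assuming a toy ground truth and a stand-in for the inference service. `count_calls_for_limit`, `truth`, and the score dictionaries are hypothetical names for illustration and are not part of AIDB.

```python
def count_calls_for_limit(proxy_scores, oracle, limit):
    """Count oracle (inference service) calls needed to find `limit` matches.

    Records are visited from highest to lowest proxy score, ties broken by id.
    """
    calls = 0
    found = 0
    for rec_id in sorted(proxy_scores, key=lambda r: (-proxy_scores[r], r)):
        if found >= limit:
            break
        calls += 1
        if oracle(rec_id):
            found += 1
    return calls


# Toy ground truth: 10 of 100 records satisfy the predicate.
truth = {i: i % 10 == 0 for i in range(100)}

# Perfect proxy scores agree with the ground truth; adversarial ones invert them.
perfect = {i: 1.0 if truth[i] else 0.0 for i in truth}
adversarial = {i: 1.0 - s for i, s in perfect.items()}

good_calls = count_calls_for_limit(perfect, truth.__getitem__, limit=5)
bad_calls = count_calls_for_limit(adversarial, truth.__getitem__, limit=5)
```

With real embeddings the gap would be smaller than in this toy setup, but useful proxy scores should still need noticeably fewer calls than `1 - proxy_score`.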

ttt-77 · 2023-10-16T16:10:28Z

> Just to be sure that the proxy score implementation and all is correct. I

Currently, the embeddings in the vector database are generated randomly. I don't think we can test it now. Could you please move your suggestion to issue #54? We can check it later.

akash17mittal · 2023-10-16T16:23:15Z

> > Just to be sure that the proxy score implementation and all is correct. I

> Currently, the embeddings in the vector database are generated randomly. I don't think we can test it now. Could you please move your suggestion to issue #54? We can check it later.

Yes. We have the jackson dataset with the embeddings, so we can surely test it later. From a code execution point of view, it is alright. But for proxy score correctness etc., is there a better way to test than just reading the code?

aidb/vector_database/tasti.py

aidb/engine/base_engine.py

ttt-77 and others added 15 commits September 27, 2023 01:15

TASTI and vector database

8982ac5

Functions to deal with the situation of adding new data. And modify class to keep old cluster representatives

c9917cf

gpu for faiss

2c7b45d

Modify format, add more comments

47b0aa3

fix bug and add test code for Tasti

fe5da0e

refactor TASTI

7808d69

store rep tables and topk tables in normal DB, add function for generating proxy scores.

d6636b7

fix problem

58b3a2d

Merge branch 'main' into TASTI

85b45d4

# Conflicts: # aidb/engine/base_engine.py

add required bound service for query function, refactor full scan engine

b89d575

refactor tasti engine

6a5031b

refine function

3b1a957

refactor tasti engine

4a1f632

remove unneeded function

400bf92

ddkang reviewed Oct 10, 2023

aidb/config/config_types.py (outdated)

ddkang reviewed Oct 10, 2023

aidb/engine/tasti_engine.py (outdated)

ddkang reviewed Oct 10, 2023

aidb/engine/full_scan_engine.py

fix mistake

504e246

add proxy score computation and implement limit engine

c728142

ddkang reviewed Oct 13, 2023

aidb/engine/limit_engine.py (outdated)

aidb/engine/tasti_engine.py (outdated)

ttt-77 added 2 commits October 15, 2023 00:08

revert to original version, add some addition functions

94409f2

refactor proxy score computation, add query type judgement

4de987b

add blobs mapping table file

74ea6e3

ddkang reviewed Oct 15, 2023

aidb/query/query.py (outdated)

fix problem

2c11107

akash17mittal reviewed Oct 15, 2023

aidb/engine/engine.py (outdated)

aidb/engine/limit_engine.py (outdated)

aidb/engine/tasti_engine.py

akash17mittal reviewed Oct 15, 2023

tests/test_tasti_engine.py (outdated)

akash17mittal reviewed Oct 15, 2023

tests/test_tasti_engine.py (outdated)

ttt-77 added 2 commits October 16, 2023 00:49

fix typo, add one TODO

4119dd2

rename and add the test for correctness

1bc1213

akash17mittal reviewed Oct 16, 2023

aidb/engine/base_engine.py (outdated)

aidb/engine/base_engine.py (outdated)

change all blob_id to vector_id

64d79f6

akash17mittal mentioned this pull request Oct 16, 2023

Test proxy score correctness #55

Open

fix a bug, which doesn't set rep_id's index correctly

09b04a4

akash17mittal reviewed Oct 18, 2023

aidb/vector_database/tasti.py

akash17mittal reviewed Oct 18, 2023

aidb/vector_database/tasti.py

akash17mittal reviewed Oct 18, 2023

aidb/engine/base_engine.py (outdated)

aidb/engine/base_engine.py (outdated)

aidb/engine/base_engine.py (outdated)

ttt-77 added 3 commits October 17, 2023 23:48

add semicolon

b269657

Add docstring

fbc7404

fix a bug, previous code forgets to add vector_id column in topk table.

7e30a47

akash17mittal reviewed Oct 18, 2023

aidb/engine/base_engine.py (outdated)

akash17mittal previously approved these changes Oct 18, 2023

add semicolon

a65d369

ttt-77 dismissed akash17mittal’s stale review via a65d369 October 18, 2023 14:12

ddkang self-requested a review October 18, 2023 17:03

ddkang approved these changes Oct 18, 2023

ddkang merged commit e134584 into main Oct 18, 2023

ddkang deleted the tasti_engine branch October 18, 2023 17:04
