Add an example of few-shot memory bank model with MultiModal Feature Extraction #2822
Conversation
Thanks for the PR, performance looks good.
@Linuxdex, let's add a more detailed description of the memory cache structure.
@@ -0,0 +1,65 @@
# Use MultiModal Feature Extraction to Create a Few-shot Cache Adapter Model
I also feel that cache adapter is not a good name.
LGTM. We can revise the textual descriptions in another PR.
The memory bank follows the design of [Tip-Adapter](https://arxiv.org/pdf/2207.09519.pdf), which stores the image features of the few-shot training set and uses feature similarity to improve the performance of zero-shot CLIP. The stored features can also serve as the initialization of a trainable classifier. This ProtoNet-like design makes full use of the few-shot training information and leads to good performance [3]. We believe that the effectiveness of this design is not limited to CLIP; it can be applied broadly to few-shot classification tasks on images and texts.
The memory bank, a derivative application of Tip-Adapter, obtains diversified multi-modal features through MultiModal feature extraction. In this example, we first train a linear classifier on the multi-modal features to establish a baseline accuracy. Then, the similarity between the test features and the memory bank is combined with the baseline predicted probabilities. Finally, an additional linear adapter, initialized from the memory bank, is trained to assist few-shot classification.
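The combination this paragraph describes can be sketched in the Tip-Adapter style: turn the cosine similarity between test features and the stored keys into sharp non-negative weights, aggregate the stored one-hot labels, and blend the result into the baseline logits. A minimal NumPy sketch; the function and argument names are illustrative, not the PR's actual API:

```python
import numpy as np

def memory_bank_logits(test_feat, cache_keys, cache_values, base_logits, alpha, beta):
    """Blend memory-bank similarity logits into baseline logits (Tip-Adapter style).

    test_feat:    (N, D) L2-normalized test features
    cache_keys:   (D, M) L2-normalized stored few-shot features (memory bank keys)
    cache_values: (M, C) one-hot labels of the stored features
    base_logits:  (N, C) logits from the baseline classifier
    """
    # Cosine similarity between test features and stored keys: (N, M)
    affinity = test_feat @ cache_keys
    # Exponential activation sharpens the similarity into aggregation weights
    cache_logits = np.exp(-beta * (1.0 - affinity)) @ cache_values  # (N, C)
    # alpha controls how much the memory bank contributes
    return base_logits + alpha * cache_logits
```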
We need to rephrase this paragraph with proper English.
The hyper-parameters `alpha` and `beta`, which weight the memory bank's contribution, are tuned via grid search on the validation set to attain the best performance.
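The grid search over `alpha` and `beta` can be sketched as follows: compute the validation-set affinity to the memory bank once, then pick the pair that maximizes validation accuracy. A hedged sketch assuming the same shapes as above; names are illustrative:

```python
import numpy as np
from itertools import product

def search_alpha_beta(val_feat, val_labels, cache_keys, cache_values,
                      base_logits, alphas, betas):
    """Return the (alpha, beta) pair with the best validation accuracy."""
    # The affinity does not depend on alpha/beta, so compute it once.
    affinity = val_feat @ cache_keys  # (N, M)
    best_alpha, best_beta, best_acc = None, None, -1.0
    for alpha, beta in product(alphas, betas):
        logits = base_logits + alpha * np.exp(-beta * (1.0 - affinity)) @ cache_values
        acc = float((logits.argmax(axis=1) == val_labels).mean())
        if acc > best_acc:
            best_alpha, best_beta, best_acc = alpha, beta, acc
    return best_alpha, best_beta, best_acc
```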
Same here. Need to revise the paragraph.
return features


def generate_clip_weights(args, classnames, template, predictor):
We can consider just introducing a variable called semantic_label_embedding and comparing the similarity between semantic_label_embedding and the embeddings of the labels stored in the memory bank.
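The reviewer's suggestion could look roughly like this: embed the class names once, then score them against the label embeddings held in the memory bank via cosine similarity. A hypothetical sketch; `semantic_label_embedding` and the stored-embedding matrix are assumed shapes, not the PR's actual code:

```python
import numpy as np

def label_similarity(semantic_label_embedding, stored_label_embeddings):
    """Cosine similarity between class-name embeddings and stored label embeddings.

    semantic_label_embedding: (num_classes, D) embeddings of the class names
    stored_label_embeddings:  (num_stored, D) label embeddings in the memory bank
    """
    # L2-normalize both sides so the dot product is cosine similarity.
    a = semantic_label_embedding / np.linalg.norm(
        semantic_label_embedding, axis=-1, keepdims=True)
    b = stored_label_embeddings / np.linalg.norm(
        stored_label_embeddings, axis=-1, keepdims=True)
    return a @ b.T  # (num_classes, num_stored)
```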
This example provides a simple and clear way of implementing a memory-bank-powered few-shot learning model with AutoGluon MultiModal, following Tip-Adapter. The idea is to store `<feature, label>` pairs from the training data in a key-value memory bank. In the prediction phase, we compare the similarity between the test image features and the memory bank keys, and aggregate the prediction logits. The logits obtained via feature similarity are combined with the logits of a classification model that directly predicts the label from the features. Experiments show that adding a memory bank improves the performance of image, text, and image-text classification in the few-shot learning scenario.
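The key-value store described above can be sketched as a small container: keys are L2-normalized training features, values are one-hot labels, and prediction aggregates the values by similarity to the query. A minimal illustrative sketch; the class and method names are assumptions, not the example's actual API:

```python
import numpy as np

class MemoryBank:
    """Key-value store of <feature, label> pairs from the few-shot training set."""

    def __init__(self, num_classes):
        self.num_classes = num_classes
        self._keys, self._labels = [], []

    def add(self, feature, label):
        # Key: L2-normalized feature. Value: the class id (one-hot at query time).
        self._keys.append(feature / np.linalg.norm(feature))
        self._labels.append(label)

    def logits(self, query, beta=5.5):
        keys = np.stack(self._keys)                       # (M, D)
        values = np.eye(self.num_classes)[self._labels]   # (M, C) one-hot
        q = query / np.linalg.norm(query)
        affinity = keys @ q                               # (M,) cosine similarity
        # Similar stored features vote for their labels with higher weight.
        return np.exp(-beta * (1.0 - affinity)) @ values  # (C,)
```

These similarity logits would then be added to a baseline classifier's logits, weighted by `alpha`, as in the example's description.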