This repository has been archived by the owner on Mar 19, 2024. It is now read-only.
Fix gem logic, reverse cropping/transformation order. #379
Closed
Summary: Various Instance Retrieval improvements:

1. Add support for Manifold.
2. Clean up noisy logs and add helpful logging.
3. Add DEBUG_MODE support for the Revisited Datasets.
4. Add the ability to save results/logs/features.
5. Fix ROI crop bug.
6. Fix typo in benchmark_workflow.py that caused benchmarks to fail.

Differential Revision: D29995282

fbshipit-source-id: 3500c545819d62f9e627e25cb40e673c47f9918b
Summary:

1. Fix the GeM post-processing logic. Before this change, the code assumed that every non-preprocessed feature tensor has the same shape:

```
if cfg.IMG_RETRIEVAL.FEATS_PROCESSING_TYPE == "gem":
    gem_out_fname = f"{out_dir}/{train_dataset_name}_GeM.npy"
    train_features = torch.tensor(np.concatenate(train_features))
```

This is not the case: ROxford/RParis images do not have a standard size, so the resx layer outputs have different heights and widths (but the same number of channels). GeM pooling transforms a feature map of any spatial size into a vector of shape `(num_channels)`. The change applies gem_pooling to each image individually, as opposed to all the images at once. This should be fine because both GeM pooling and L2 normalization are performed per image.

2. Transform before cropping to the bounding box (as opposed to after cropping). The experiments show that this yields much better results. This is also what the deepcluster implementation uses: https://github.com/facebookresearch/deepcluster/blob/master/eval_retrieval.py#L44

```
Oxford: 61.57 / 41.74 / 14.33 vs. 69.65 / 48.51 / 16.41
Paris: 83.7 / 66.87 / 44.81 vs. 87.9 / 70.57 / 47.39
```

f288434289 f288438150

Differential Revision: D29993204

fbshipit-source-id: 4e0a59768e2710e9c611f5c0562f5899ed3e32df
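For illustration, the per-image GeM pooling described in item 1 can be sketched as follows. This is a minimal NumPy sketch, not the code from this PR: the function names `gem_pool`, `l2_normalize`, and `pool_features_per_image`, the exponent `p=3.0`, and the `(C, H, W)` feature layout are all assumptions made for the example. It shows why pooling each feature map separately sidesteps the shape mismatch that makes concatenating the raw maps fail.

```python
import numpy as np


def gem_pool(feature_map, p=3.0, eps=1e-6):
    """Generalized-mean pool a (C, H, W) feature map to a (C,) vector.

    Hypothetical helper: p and eps are illustrative defaults.
    """
    # Clamp to a small positive value so the power is well defined.
    clamped = np.clip(feature_map, eps, None)
    # Mean of x^p over the spatial dimensions, then the p-th root.
    return np.mean(clamped ** p, axis=(1, 2)) ** (1.0 / p)


def l2_normalize(vec, eps=1e-12):
    """Scale a vector to unit L2 norm (guarding against zero norm)."""
    return vec / max(np.linalg.norm(vec), eps)


def pool_features_per_image(feature_maps, p=3.0):
    """Pool and normalize each image on its own, then stack the results.

    Every pooled vector has shape (C,), so stacking works even when the
    input maps have different heights and widths.
    """
    return np.stack([l2_normalize(gem_pool(f, p)) for f in feature_maps])


# Three feature maps with the same channel count but different spatial sizes,
# standing in for features of images that do not share a standard size.
rng = np.random.default_rng(0)
feature_maps = [
    rng.random((8, 5, 7)),
    rng.random((8, 3, 4)),
    rng.random((8, 6, 2)),
]

descriptors = pool_features_per_image(feature_maps)
print(descriptors.shape)  # one row per image, one column per channel
```

Because the pooled output depends only on the channel count, the stacked result is a regular `(num_images, num_channels)` array that can be saved or compared directly.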
facebook-github-bot
added
CLA Signed
This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
fb-exported
labels
Jul 29, 2021
This pull request was exported from Phabricator. Differential Revision: D29993204
iseessel
added a commit
to iseessel/vissl
that referenced
this pull request
Aug 2, 2021
…ch#379) Summary: Pull Request resolved: facebookresearch#379 Differential Revision: D29993204 fbshipit-source-id: c44007d11bb81f44d0d758460ed171b974612e38
iseessel
added a commit
to iseessel/vissl
that referenced
this pull request
Aug 9, 2021
…ch#379) Summary: Pull Request resolved: facebookresearch#379 Differential Revision: D29993204 fbshipit-source-id: d6d02b6b96d59b43a00a1d1e99f34c03ee8a85b2
iseessel
added a commit
to iseessel/vissl
that referenced
this pull request
Aug 9, 2021
…ch#379) Summary: Pull Request resolved: facebookresearch#379 Differential Revision: D29993204 fbshipit-source-id: 6eb48e00011704c6f670f60417e2ed53a9ff0cb9
iseessel
added a commit
to iseessel/vissl
that referenced
this pull request
Aug 9, 2021
…ch#379) Summary: Pull Request resolved: facebookresearch#379 Differential Revision: D29993204 fbshipit-source-id: 359b0b0e848d4dc6a0fcd9851effa34c76b0b891
This pull request has been merged in b9bb004.