This repository has been archived by the owner on Mar 19, 2024. It is now read-only.

Fix gem logic, reverse cropping/transformation order. #379

Closed
wants to merge 2 commits

Conversation

iseessel
Contributor

Summary:

  1. Fix the gem post processing logic.

Before this change, the code assumed that every non-preprocessed feature tensor had the same shape:

```
    if cfg.IMG_RETRIEVAL.FEATS_PROCESSING_TYPE == "gem":
        gem_out_fname = f"{out_dir}/{train_dataset_name}_GeM.npy"
        train_features = torch.tensor(np.concatenate(train_features))
```

This is not the case: ROxford/RParis images do not have a standard size, so the resx feature maps have different heights and widths (but the same number of channels). GeM pooling maps a feature map of any spatial shape to a vector of shape `(num_channels)`.

The change performs GeM pooling on each individual image, as opposed to all images at once. This is safe because both GeM pooling and L2 normalization are applied per image (see the sketch below).
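For concreteness, a minimal per-image sketch of that idea (illustrative only, with hypothetical helper names; not the exact VISSL code). GeM is the generalized mean over spatial positions, so a `(C, H, W)` map of any `H`/`W` reduces to a `C`-dimensional descriptor, which is then L2-normalized:

```python
import torch
import torch.nn.functional as F

def gem_pool_single(feature_map: torch.Tensor, p: float = 3.0, eps: float = 1e-6) -> torch.Tensor:
    """GeM-pool one (C, H, W) feature map to a (C,) descriptor, then L2-normalize it.

    GeM: f_c = (mean over h, w of x_{c,h,w}^p)^(1/p); p=1 is average pooling,
    p -> infinity approaches max pooling.
    """
    x = feature_map.clamp(min=eps).pow(p)          # (C, H, W)
    pooled = x.mean(dim=(-2, -1)).pow(1.0 / p)     # (C,)
    return F.normalize(pooled, p=2, dim=0)         # unit-length descriptor

# Because ROxford/RParis feature maps have varying H and W, pool image by image
# instead of concatenating everything into one tensor, e.g.:
# descriptors = torch.stack([gem_pool_single(fm) for fm in train_features])  # (N, C)
```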

  2. Transform before cropping to the bounding box (as opposed to after cropping).

The experiments show that this ordering yields much better results; it is also what the deepcluster implementation uses: https://github.com/facebookresearch/deepcluster/blob/master/eval_retrieval.py#L44. The numbers below are crop-then-transform vs. transform-then-crop (a sketch of the new order follows):

```
Oxford: 61.57 / 41.74 / 14.33 vs. 69.65 / 48.51 / 16.41
Paris: 83.7 / 66.87 / 44.81 vs. 87.9 / 70.57 / 47.39
```
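As a rough illustration of the ordering change (a sketch under the assumption that the transform is essentially a rescale; the helper name and bbox convention are hypothetical, and the box has to be rescaled by the same factor before cropping):

```python
from PIL import Image
import torchvision.transforms.functional as TF

def transform_then_crop(img: Image.Image, bbox, max_side: int = 1024) -> Image.Image:
    """Resize the full image first, then crop the rescaled bounding box.

    bbox = (x1, y1, x2, y2) in original-image coordinates. Illustrative sketch of
    the new order only; the previous code cropped to the bbox first and then
    applied the transform to the crop.
    """
    w, h = img.size
    scale = max_side / max(w, h)
    resized = TF.resize(img, [round(h * scale), round(w * scale)])
    x1, y1, x2, y2 = (round(c * scale) for c in bbox)
    return resized.crop((x1, y1, x2, y2))
```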

f288434289
f288438150

Differential Revision: D29993204

Summary:
Various Instance Retrieval improvements:
1. Add support for Manifold

2. Clean up noisy logs and add helpful logging.

3. Add DEBUG_MODE support for the Revisited Datasets.

4. Add ability to save results/logs/features.

5. Fix ROI crop bug.

6. Fix typo in benchmark_workflow.py causing benchmarks to fail.

Differential Revision: D29995282

fbshipit-source-id: 3500c545819d62f9e627e25cb40e673c47f9918b
@facebook-github-bot added the CLA Signed and fb-exported labels on Jul 29, 2021
@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D29993204

iseessel added a commit to iseessel/vissl that referenced this pull request Aug 2, 2021
iseessel added a commit to iseessel/vissl that referenced this pull request Aug 9, 2021
iseessel added a commit to iseessel/vissl that referenced this pull request Aug 9, 2021
iseessel added a commit to iseessel/vissl that referenced this pull request Aug 9, 2021
@facebook-github-bot
Contributor

This pull request has been merged in b9bb004.
