This repository has been archived by the owner on Mar 19, 2024. It is now read-only.

Fix gem logic, reverse cropping/transformation order. #379

Closed
wants to merge 2 commits

Conversation

iseessel
Contributor

Summary:

  1. Fix the gem post processing logic.

Before this change, the code assumed that every non-preprocessed feature tensor had the same shape:

```
    if cfg.IMG_RETRIEVAL.FEATS_PROCESSING_TYPE == "gem":
        gem_out_fname = f"{out_dir}/{train_dataset_name}_GeM.npy"
        train_features = torch.tensor(np.concatenate(train_features))
```

This is not the case: ROxford/RParis images do not have a standard size, so the resx feature maps have different heights and widths (but the same number of channels). GeM pooling maps a feature map of any spatial shape to a vector of shape `(num_channels)`.

The change performs GeM pooling on each individual image, as opposed to all images at once. This is safe because both GeM pooling and L2 normalization are applied per image (see the sketch below).
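For concreteness, a minimal per-image sketch of that idea (illustrative only, with hypothetical helper names; not the exact VISSL code). GeM is the generalized mean over spatial positions, so a `(C, H, W)` map of any `H`/`W` reduces to a `C`-dimensional descriptor, which is then L2-normalized:

```python
import torch
import torch.nn.functional as F

def gem_pool_single(feature_map: torch.Tensor, p: float = 3.0, eps: float = 1e-6) -> torch.Tensor:
    """GeM-pool one (C, H, W) feature map to a (C,) descriptor, then L2-normalize it.

    GeM: f_c = (mean over h, w of x_{c,h,w}^p)^(1/p); p=1 is average pooling,
    p -> infinity approaches max pooling.
    """
    x = feature_map.clamp(min=eps).pow(p)          # (C, H, W)
    pooled = x.mean(dim=(-2, -1)).pow(1.0 / p)     # (C,)
    return F.normalize(pooled, p=2, dim=0)         # unit-length descriptor

# Because ROxford/RParis feature maps have varying H and W, pool image by image
# instead of concatenating everything into one tensor, e.g.:
# descriptors = torch.stack([gem_pool_single(fm) for fm in train_features])  # (N, C)
```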

  2. Transform before cropping to the bounding box (as opposed to after cropping).

The experiments show that this ordering yields much better results; it is also what the deepcluster implementation uses: https://github.com/facebookresearch/deepcluster/blob/master/eval_retrieval.py#L44. The numbers below are crop-then-transform vs. transform-then-crop (a sketch of the new order follows):

```
Oxford: 61.57 / 41.74 / 14.33 vs. 69.65 / 48.51 / 16.41
Paris: 83.7 / 66.87 / 44.81 vs. 87.9 / 70.57 / 47.39
```
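As a rough illustration of the ordering change (a sketch under the assumption that the transform is essentially a rescale; the helper name and bbox convention are hypothetical, and the box has to be rescaled by the same factor before cropping):

```python
from PIL import Image
import torchvision.transforms.functional as TF

def transform_then_crop(img: Image.Image, bbox, max_side: int = 1024) -> Image.Image:
    """Resize the full image first, then crop the rescaled bounding box.

    bbox = (x1, y1, x2, y2) in original-image coordinates. Illustrative sketch of
    the new order only; the previous code cropped to the bbox first and then
    applied the transform to the crop.
    """
    w, h = img.size
    scale = max_side / max(w, h)
    resized = TF.resize(img, [round(h * scale), round(w * scale)])
    x1, y1, x2, y2 = (round(c * scale) for c in bbox)
    return resized.crop((x1, y1, x2, y2))
```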

f288434289
f288438150

Differential Revision: D29993204

Summary:
Various Instance Retrieval improvements:
1. Add support for Manifold

2. Clean up noisy logs and add helpful logging.

3. Add DEBUG_MODE support for the Revisited Datasets.

4. Add ability to save results/logs/features.

5. Fix ROI crop bug.

6. Fix typo in benchmark_workflow.py causing benchmarks to fail.

Differential Revision: D29995282

fbshipit-source-id: 3500c545819d62f9e627e25cb40e673c47f9918b
@facebook-github-bot added the CLA Signed and fb-exported labels on Jul 29, 2021
@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D29993204

iseessel added a commit to iseessel/vissl that referenced this pull request Aug 2, 2021
iseessel added a commit to iseessel/vissl that referenced this pull request Aug 9, 2021
iseessel added a commit to iseessel/vissl that referenced this pull request Aug 9, 2021
iseessel added a commit to iseessel/vissl that referenced this pull request Aug 9, 2021
@facebook-github-bot
Contributor

This pull request has been merged in b9bb004.
