
Reproducing MP3D Experiments #6

Open
Jaraxxus-Me opened this issue Feb 9, 2024 · 8 comments

Comments

@Jaraxxus-Me

Hi,
Thanks for this work.
Would it be possible for you to also open-source the model weights and code for reproducing the MP3D experiments (Table 2)?
Also, I have a question about the PONI results: their official paper reports a Success rate of 31.8, so why is this number lower in the PEANUT paper?
(Sorry for opening two issues.)

@ajzhai
Owner

ajzhai commented Feb 9, 2024

To run on MP3D, this is what you need:

Unfortunately, I don't have time to integrate everything into the official repo, but I hope this helps.

As for the 31.8, it seems like we made a typo there; thanks for finding that!

@Jaraxxus-Me
Author

Great, thanks, I'll try.

@Jaraxxus-Me
Author

Hi. Using the provided code and weights, I'm unable to reproduce the MP3D experiments and get a very low score.
I think I'm confused about the following questions:

  1. I'm using the objectnav_mp3d_v1 episode dataset, and I modified the config file as:

```yaml
DATASET:
  TYPE: ObjectNav-v1
  SPLIT: train
  DATA_PATH: "habitat-challenge-data/objectnav_mp3d_v1/{split}/{split}.json.gz"
  SCENES_DIR: "habitat-challenge-data/data/scene_datasets/"
```

If so, how do I map the goal id observations['objectgoal'] from habitat to its actual name? I'm using the constants from Stubborn here:

```python
habitat_labels = {
    'background': 0,
    'chair': 1,             # g
    'table': 2,             # g
    'picture': 3,           # b
    'cabinet': 4,           # in resnet
    'cushion': 5,           # in resnet
    'sofa': 6,              # g
    'bed': 7,               # g
    'chest_of_drawers': 8,  # b, in resnet
    'plant': 9,             # g
    'sink': 10,             # g
    'toilet': 11,           # g
    'stool': 12,            # b
    'towel': 13,            # b, in resnet
    'tv_monitor': 14,       # g
    'shower': 15,           # b
    'bathtub': 16,          # b, in resnet
    'counter': 17,          # b, isn't this table?
    'fireplace': 18,
    'gym_equipment': 19,
    'seating': 20,
    'clothes': 21,          # in resnet
}
```

Not sure if I'm right.
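As a sanity check, the id-to-name lookup can be sketched by inverting the dict above (a minimal sketch; whether the raw id needs an offset is exactly the open question here):

```python
# Hypothetical helper: invert the Stubborn habitat_labels dict to look up
# a category name from a goal id. Note: the raw observations['objectgoal']
# id may still be offset relative to these values.
habitat_labels = {
    'background': 0, 'chair': 1, 'table': 2, 'picture': 3, 'cabinet': 4,
    'cushion': 5, 'sofa': 6, 'bed': 7, 'chest_of_drawers': 8, 'plant': 9,
    'sink': 10, 'toilet': 11, 'stool': 12, 'towel': 13, 'tv_monitor': 14,
    'shower': 15, 'bathtub': 16, 'counter': 17, 'fireplace': 18,
    'gym_equipment': 19, 'seating': 20, 'clothes': 21,
}

# Build the reverse mapping id -> name.
habitat_id_to_name = {v: k for k, v in habitat_labels.items()}

print(habitat_id_to_name[6])  # sofa
```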

  2. How do I map the goal id from observations['objectgoal'] to the id in the segmentation output?
    For HM3D, I see you have a mapper:
```python
hm3d_to_coco = {0: 0,
                1: 3,
                2: 2,
                3: 4,
                4: 5,
                5: 1}
```

I assume this is because observations['objectgoal'] does not align with the channel ids of the segmentation and prediction.
In the RedNet segmentation code, I see:

```python
def get_prediction(self, img, depth, goal_cat=None):
    args = self.args
    # Add a batch dimension and move inputs to the device.
    img = img[np.newaxis, :, :, :]
    depth = depth[np.newaxis, :, :, :]
    img = torch.from_numpy(img).float().to(self.args.device)
    depth = torch.from_numpy(depth).float().to(self.args.device)
    output, mask = self.segmentation_model(img, depth)
    output = output[0]
    # Rescale scores and zero out low-confidence predictions.
    output = output * 0.1
    output[output < self.threshold] = 0  # 0.9: 30, 1.1: 26
    semantic_input = np.zeros((img.shape[1], img.shape[2], 23))
    # Collapse the 40 input classes into the 23 output channels.
    for i in range(0, 40):
        if i in fourty221.keys():
            output[i][mask != i] = 0
            j = fourty221[i]
            if (self.gt_mask is not None) and j == goal_cat:
                # Substitute the ground-truth mask for the goal category once.
                semantic_input[:, :, j] += np.copy(self.gt_mask)
                self.gt_mask = None
            else:
                semantic_input[:, :, j] += output[i].cpu().numpy()

    return semantic_input, mask
```

Does this mean the ids do not need to be changed? Are they already aligned across the 23 channels?
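For reference, the HM3D mapper quoted above would presumably be applied like this (a sketch; the variable names are assumptions):

```python
# Convert the raw goal id from observations['objectgoal'] to the COCO
# category index used by the segmentation channels (HM3D case).
hm3d_to_coco = {0: 0, 1: 3, 2: 2, 3: 4, 4: 5, 5: 1}

raw_goal = 1                         # e.g. observations['objectgoal'][0]
coco_goal = hm3d_to_coco[raw_goal]
print(coco_goal)  # 3
```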

  3. Why do we need 23 layers in RedNet? The 22 classes in the constants already include 'background'.

  4. Why multiply by 0.1 (output = output * 0.1) in RedNet? The max score can exceed one in that case.

  5. Evaluation on MP3D took me a very long time: 2195 episodes took about 2 days. Is this expected?

I'd appreciate it if you could provide some guidance on my questions!

@ajzhai
Owner

ajzhai commented Feb 13, 2024

1 and 2: Oh, sorry, I forgot to say that you need to add 1 to the goal id. That's something I found in Stubborn: https://github.com/Improbable-AI/Stubborn/blob/1a0f85bab2f203229406a5f311688ea7cf3b251e/Stubborn/agent/stubborn_agent.py#L52
The name mapping you have should be correct.

3: I forgot why; it may be unnecessary.
4: I don't know (it's copied directly from Stubborn).
5: Unfortunately yes, but you can run multiple threads on different subsets of episodes to parallelize.
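The offset fix described in 1 and 2 can be sketched as follows (a minimal sketch following the linked Stubborn code; variable names are assumptions):

```python
# The raw MP3D goal id from observations['objectgoal'] is assumed to be
# 0-based over the goal classes, while the habitat_labels constants
# reserve 0 for 'background', so the id is shifted by 1.
raw_goal = 0              # e.g. observations['objectgoal'][0] for 'chair'
goal_id = raw_goal + 1    # 1 == 'chair' in the Stubborn habitat_labels
print(goal_id)  # 1
```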

@Jaraxxus-Me
Author

I see, I'll try to follow their code.
Thanks!

@Jaraxxus-Me
Author

Hi, I tried to reproduce the MP3D results with the correct class mapping.
However, I'm only able to achieve Succ 0.321 | SPL 0.0926 over all 2195 episodes.
I'm using your provided MP3D episode dataset and downloaded the MP3D scene datasets from here.
Where could it have gone wrong? Thanks!

@ajzhai
Owner

ajzhai commented Feb 15, 2024

Hmm, not sure, but the issue might be in some of the mapping args. Try using --map_size_cm 7200 --global_downscaling 3, since MP3D scenes can be larger and 4800 might be too small. Sadly, I forgot to write down exactly what I used previously, but I think it was something like that. It might also be a different confidence threshold for RedNet vs. Mask R-CNN.
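For example (a hypothetical invocation; the entry-point script and any remaining flags depend on your setup, only the two map flags come from the comment above):

```shell
# Hypothetical: evaluate with a larger global map for MP3D scenes.
python main.py --map_size_cm 7200 --global_downscaling 3
```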

@Jaraxxus-Me
Author

Okay, thanks, I'll try that and update here.
