
Can't get 100% accuracy in Sub-Goal evaluation with ground-truth actions and masks. #19

Open
bhkim94 opened this issue Mar 27, 2020 · 6 comments
Labels
bug Something isn't working

Comments

@bhkim94

bhkim94 commented Mar 27, 2020

I'm trying to reproduce the Sub-Goal evaluation results with ground-truth actions and masks.
However, I got index-out-of-bound errors and couldn't get 100% PLW for some trajectories in the seen and unseen validation sets. (Both SR and PLW should be 100%, since the evaluation uses ground truths.)

These are the only changes I made in eval_subgoals.py (lines 69 and 128):

...

 68: expert_init_actions = [a['discrete_action'] for a in traj_data['plan']['low_actions'] if a['high_idx'] < eval_idx]
 69: expert_init_actions_gt = [a['discrete_action'] for a in traj_data['plan']['low_actions']]

...

127: mask = np.squeeze(mask, axis=0) if model.has_interaction(action) else None
128: action = expert_init_actions_gt[t]['action']
     compressed_mask = expert_init_actions_gt[t]['args']['mask'] if 'mask' in expert_init_actions_gt[t]['args'] else None
     mask = env.decompress_mask(compressed_mask) if compressed_mask is not None else None
129: # debug
130: if args.debug:

...
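For context, the replay logic these edits modify can be sketched roughly as below. This is a simplified, stand-alone sketch (not the actual eval_subgoals.py code): it assumes each ground-truth entry is a dict with an 'action' key and an optional 'args'['mask'], mirroring the structure referenced above, and the `step_fn` callback stands in for the environment step.

```python
# Simplified sketch of replaying ground-truth (expert) actions.
# Assumption: each entry looks like {'action': str, 'args': {'mask': ...}};
# this mirrors the dict structure quoted in the issue, not the real API.

def replay_expert_actions(expert_actions, step_fn, max_steps=1000):
    """Replay expert actions, stopping before running off the end of the
    list (the GT action list has no explicit STOP token)."""
    for t in range(max_steps):
        if t >= len(expert_actions):  # guard against IndexError
            break
        entry = expert_actions[t]
        action = entry['action']
        mask = entry.get('args', {}).get('mask')  # None for navigation actions
        step_fn(action, mask)
    return min(max_steps, len(expert_actions))

# Tiny usage example with a stub environment step:
log = []
expert = [{'action': 'MoveAhead', 'args': {}},
          {'action': 'PickupObject', 'args': {'mask': [[1]]}}]
steps = replay_expert_actions(expert, lambda a, m: log.append((a, m is not None)))
```

With the guard in place, the replay stops cleanly after the last GT action instead of indexing past the end of the list.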

If these changes are the correct way to use ground-truth actions and masks, do you have any idea why I can't get 100% PLW?
I also don't understand why I got index-out-of-bound errors with ground-truth trajectories.

Thanks in advance!

@MohitShridhar
Collaborator

Hi @bhkim94, can you post the index-out-of-bound error?

We did have a related bug, but it's fixed in the latest master.

@bhkim94
Author

bhkim94 commented Mar 29, 2020

Hi, thanks for replying.

These are the index errors I got:

...

No. of trajectories left: 805
Resetting ThorEnv
Task: Place two salt shakers in the drawer.
Evaluating: data/json_feat_2.1.0/pick_two_obj_and_place-PepperShaker-None-Drawer-10/trial_T20190912_221141_608117
Subgoal GotoLocation (0)
Instr: Look down, turn left, walk straight, turn left to face the fridge, walk straight, turn right when you reach the fridge, walk st
raight, turn left to face the counter with the bread on the counter and look up to the cabinet.
-------------
GotoLocation ==========
SR: 40/40 = 1.000
PLW S: 0.994
------------
No. of trajectories left: 805
Resetting ThorEnv
Task: Place two salt shakers in the drawer.
Evaluating: data/json_feat_2.1.0/pick_two_obj_and_place-PepperShaker-None-Drawer-10/trial_T20190912_221141_608117
Subgoal GotoLocation (2)
Instr: Look down, turn right, walk straight, turn right to face the pot on the counter and turn right.
Traceback (most recent call last):
  File "/home/user/Desktop/alfred/models/eval/eval_subgoals.py", line 49, in run
    cls.evaluate(env, model, eval_idx, r_idx, resnet, traj, args, lock, successes, failures, results)
  File "/home/user/Desktop/alfred/models/eval/eval_subgoals.py", line 132, in evaluate
    action = expert_init_actions_gt[t]['action']
IndexError: list index out of range
Error: IndexError('list index out of range',)
No. of trajectories left: 804
Resetting ThorEnv
Task: Put two shakers in the second drawer.
Evaluating: data/json_feat_2.1.0/pick_two_obj_and_place-PepperShaker-None-Drawer-10/trial_T20190912_221141_608117
Subgoal GotoLocation (0)
Instr: Make a right and step forward then turn left at the island.
-------------
GotoLocation ==========
SR: 41/41 = 1.000
PLW S: 0.988
------------
No. of trajectories left: 804
Resetting ThorEnv
Task: Put two shakers in the second drawer.
Evaluating: data/json_feat_2.1.0/pick_two_obj_and_place-PepperShaker-None-Drawer-10/trial_T20190912_221141_608117
Subgoal GotoLocation (2)
Instr: Turn right then walk forward. Turn around once you are past the sink.
Traceback (most recent call last):
  File "/home/user/Desktop/alfred/models/eval/eval_subgoals.py", line 49, in run
    cls.evaluate(env, model, eval_idx, r_idx, resnet, traj, args, lock, successes, failures, results)
  File "/home/user/Desktop/alfred/models/eval/eval_subgoals.py", line 132, in evaluate
    action = expert_init_actions_gt[t]['action']

...

I've found that the agent takes more actions to complete a subgoal than the validation set records.
For example, the first subgoal (GotoLocation) of "pick_two_obj_and_place-PepperShaker-None-Drawer-10/trial_T20190912_221141_608117" consists of 19 low-level actions, but the agent took 23 actions to accomplish it.
I think this is the cause of the errors: the subsequent subgoal (i.e. the second subgoal) then starts from the 24th action, which doesn't belong to that subgoal, so the agent eventually fails to complete the subsequent subgoals.
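The failure mode described above can be reproduced in miniature. Using the numbers from the example (19 recorded GT actions, 23 simulator steps actually needed), the step counter walks past the end of the ground-truth list:

```python
# Illustration of the off-by-N indexing failure described above.
# Numbers come from the example in this comment: the GT subgoal records
# 19 low-level actions, but the agent needed 23 environment steps.
gt_actions = ['MoveAhead'] * 19  # ground-truth low-level actions
steps_needed = 23                # steps actually taken in the simulator

errors = 0
for t in range(steps_needed):
    try:
        action = gt_actions[t]   # raises IndexError once t >= 19
    except IndexError:
        errors += 1
# steps 19..22 all fall off the end, so errors == 4
```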

Additionally, I've evaluated a model with GT actions and masks in the same way as above, and I didn't get 100% SR or PC on either the seen or the unseen validation set.

On the seen validation set:
SR: 818 / 820 = 0.998
PC: 2104 / 2109 = 0.998
PLW SR: 0.998
PLW PC: 0.999

On the unseen validation set:
SR: 819 / 821 = 0.998
PC: 2118 / 2120 = 0.999
PLW SR: 0.998
PLW PC: 0.999

I think #7 mentions three trajectories being missed in validation, but I included the whole validation set (820 for seen, and 821 for unseen).

@MohitShridhar
Collaborator

MohitShridhar commented Mar 31, 2020

@bhkim94 something seems strange. If there are 19 low-level actions in the expert trajectory, I don't know why the agent takes 23 actions when you are simply replaying the expert actions. Could you share your fork so I can get a better understanding of what's happening?

@bhkim94
Author

bhkim94 commented Mar 31, 2020

Here is the fork: https://github.com/bhkim94/alfred

The only changes are in eval_task.py and eval_subgoals.py, marked with # in the code.
I removed the index errors by adding an explicit exit condition, since there is no explicit stop token in a trajectory:

if t == len(expert_init_actions_gt): # GT doesn't have the STOP token.
    break

But the index error itself is not the main point: the root cause is that a subgoal does not complete within the actions recorded in the validation trajectory, and that is what triggers the index errors.

@MohitShridhar
Collaborator

MohitShridhar commented Apr 8, 2020

@bhkim94 sorry about the delay. Will take a look at this soon.

Meanwhile, you can just ignore these 5 trajectories, while I try to figure out what's happening. Since the SR is 99.8% anyway, this issue shouldn't hinder your progress.

@MohitShridhar
Collaborator

@bhkim94, it seems the issue is caused by some non-deterministic behavior in AI2THOR.

For instance, when PutObject is used to place a Knife, the object occasionally slips and falls, invalidating the GT mask in the dataset:

No Slip:
[screenshot]

Slip (rare occurrence):
[screenshot]

We will take a look at fixing this, but it needs to be done on the simulator side. Fortunately, it's a rare occurrence, so you still get 99.8% SR. This shouldn't affect your modeling progress for now.
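One way such a slip could be detected programmatically (a hypothetical check, not part of the ALFRED codebase or the AI2THOR API) is to verify that the stored GT interaction mask still overlaps the object's current instance-segmentation mask before issuing the interaction:

```python
import numpy as np

def mask_still_valid(gt_mask, current_instance_mask, min_iou=0.5):
    """Return True if the ground-truth interaction mask still covers the
    object's current position. Hypothetical helper: both arguments are
    boolean HxW arrays; ALFRED itself does not ship this function."""
    inter = np.logical_and(gt_mask, current_instance_mask).sum()
    union = np.logical_or(gt_mask, current_instance_mask).sum()
    return bool(union > 0 and inter / union >= min_iou)

# The object stayed put: the masks coincide, so the GT mask is still valid.
a = np.zeros((4, 4), dtype=bool); a[1:3, 1:3] = True
# The object slipped: its segmentation moved out from under the GT mask.
b = np.zeros((4, 4), dtype=bool); b[0:2, 2:4] = True
```

A check like this could flag the rare slip cases so they can be skipped or retried rather than silently failing the subgoal.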

Thanks again for pointing this out!

@MohitShridhar MohitShridhar added the bug Something isn't working label Oct 26, 2020