Evaluation metrics per class #42

Open
mariiak2021 opened this issue Apr 13, 2023 · 4 comments
@mariiak2021

Hi @Lucaweihs, @mattdeitke!

Can you please tell me whether there are any existing evaluation metrics per class or per type of object (pickable/openable, ...)?

Best regards,
Mariia

@Lucaweihs
Contributor

Hi @mariiak2021,

Unfortunately there are no per-class evaluation metrics currently defined, but if you're interested, they would be very easy to add. You'd want to update the metrics function to return such values. E.g., in pseudocode you could do something like:

for object_type in object_types_whose_position_was_different_at_the_start_of_the_unshuffle_phase:
    metrics[f"{object_type}_fixed"] = end_energy_of_object == 0.0

Let me know if you need any help getting this working.

@mariiak2021
Author

mariiak2021 commented Apr 20, 2023

Hi @Lucaweihs, thanks a lot for your reply!

I might need your help, if possible. :) Would it look like this? To compute the end energy of each object that was misplaced:

    def metrics(self) -> Dict[str, Any]:
        if not self.is_done():
            return {}
        env = self.unshuffle_env
        ips, gps, cps = env.poses
        for gp, ip, cp in zip(gps, ips, cps):
            gp_vs_ip = env.compare_poses(gp, ip)
            if (gp_vs_ip["iou"] is not None and gp_vs_ip["iou"] != 1.0) or (
                gp_vs_ip["openness_diff"] is not None
                and gp_vs_ip["openness_diff"] != 0.0
            ):
                object_class = gp["name"].split("_")[0]
                end_energy_of_object = env.pose_difference_energy(gp, ip)
                metrics = {f"{object_class}/end_energy": end_energy_of_object == 0.0}
        return metrics

Please correct me if this is the wrong way. Also, which metrics can I reuse per class besides end energy?

best,
Mariia

@mariiak2021
Author

Hi @Lucaweihs, sorry to disturb you, but did you have time to look into my question? :) Thank you!

Lucaweihs self-assigned this May 8, 2023
@Lucaweihs
Contributor

Hi @mariiak2021,

Just to double check: the number that you want reported per class is a = the average energy of an object of this class, conditional on it ending misplaced at the end of an episode, right? Note that this means you'll never be recording cases where the energy is 0, so if you're trying to measure something like b = the average energy of an object at the end of the episode given that it _started_ misplaced, then you'd need to compute things a bit differently. If you want to compute a, then this is how I'd do it:

    # Requires `from collections import defaultdict` at the top of the file.
    def metrics(self) -> Dict[str, Any]:
        metrics = ...  # Old UnshuffleTask metrics code

        # New per-class metrics code: a running average of end energy per
        # object type, over objects that ended the episode misplaced.
        key_to_count = defaultdict(lambda: 0)
        for object_type, end_energy in zip([gp["type"] for gp in gps], end_energies):
            if end_energy > 0.0:
                key = f"end_energy__{object_type}"

                # Undo the running average for this object type and recompute
                # it with the new value/count.
                metrics[key] = metrics.get(key, 0.0) * key_to_count[key] + end_energy
                key_to_count[key] += 1
                metrics[key] /= key_to_count[key]

        return metrics
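
If you instead wanted b from above, here's a minimal sketch under the same assumptions; it reuses env, ips/gps/cps, and the compare_poses/pose_difference_energy calls from your snippet, and the start_misplaced_end_energy__ key name is just a hypothetical suggestion:

    # Hypothetical sketch for b: average end-of-episode energy per object
    # type, over objects that *started* misplaced (zero energies included).
    key_to_count_b = defaultdict(lambda: 0)
    for ip, gp, cp in zip(ips, gps, cps):
        gp_vs_ip = env.compare_poses(gp, ip)
        started_misplaced = (
            gp_vs_ip["iou"] is not None and gp_vs_ip["iou"] != 1.0
        ) or (
            gp_vs_ip["openness_diff"] is not None
            and gp_vs_ip["openness_diff"] != 0.0
        )
        if started_misplaced:
            key = f"start_misplaced_end_energy__{gp['type']}"
            # Compare the goal pose to the *current* pose at episode end.
            end_energy = env.pose_difference_energy(gp, cp)
            # Same running-average trick as in the snippet above.
            metrics[key] = metrics.get(key, 0.0) * key_to_count_b[key] + end_energy
            key_to_count_b[key] += 1
            metrics[key] /= key_to_count_b[key]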

Note that these snippets reuse the ips/gps/cps and end_energies variables defined in the original UnshuffleTask.metrics code. A few ways this differs from the code you wrote:

  1. If you want to compare the status of objects at the end of the episode against the goal positions, then you'll need to use cps/gps and not ips/gps (ips = initial poses, gps = goal poses, cps = current poses). Note that end_energies equals the pairwise comparisons between cps and gps.
  2. Multiple objects of the same category might be misplaced in the same scene; I've added some code to average in this case.
  3. I saw you were saving end_energy_of_object == 0.0 and not just end_energy_of_object. The boolean end_energy_of_object == 0.0 is True if the object poses are equal and False otherwise, so this is the same as recording the per-object-class fixed rate. That seems like something interesting to measure but it is not the same as the energy. If you want to measure the per-object-class fixed rate, then I would recommend something like the following (see the sketch after this list):
  • First compare gps and ips to figure out which objects had different initial/goal states (i.e. the objects that should be rearranged).
  • Next grab the energies corresponding to these "should be rearranged" objects at the end of the episode (i.e. comparing cps to gps at this stage).
  • For each such object, record a metric like f"fixed__{object_type}": energy == 0.0.
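
Concretely, here's a minimal sketch of that fixed-rate computation (same assumptions as the sketches above; the fixed__ key is just a suggested name):

    # Hypothetical sketch of the per-object-class fixed rate described above,
    # reusing the same started-misplaced test as in the b sketch.
    for ip, gp, cp in zip(ips, gps, cps):
        gp_vs_ip = env.compare_poses(gp, ip)
        if (gp_vs_ip["iou"] is not None and gp_vs_ip["iou"] != 1.0) or (
            gp_vs_ip["openness_diff"] is not None
            and gp_vs_ip["openness_diff"] != 0.0
        ):
            # Comparing cps to gps at the end of the episode (point 1 above).
            end_energy = env.pose_difference_energy(gp, cp)
            metrics[f"fixed__{gp['type']}"] = end_energy == 0.0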

Let me know if that helps!
