Dataset extraction API #483

mpiseno · 2020-02-10T23:11:25Z

Motivation and Context

Adds a dataset extraction API to habitat-sim that lets users specify a scene file and get back a set of images that can easily be fed into a PyTorch model or saved for later use. The API supports RGBA, depth, and semantic images, and lets the user specify the image size.

How Has This Been Tested

Created two tests in tests/test_dataset_extraction.py that:

Create a topdown view
Input a scene file to the data extractor and feed the result through a PyTorch model

Types of changes

New feature (non-breaking change which adds functionality)

Example Output

erikwijmans · 2020-02-10T23:16:55Z

src/esp/nav/PathFinder.cpp

@@ -229,6 +231,8 @@ struct PathFinder::Impl {

  std::pair<vec3f, vec3f> bounds() const { return bounds_; };

+  std::vector<std::vector<bool>> getTopDownView(const float res);


This should be an Eigen::Matrix<bool, Eigen::Dynamic, Eigen::Dynamic> matrix -- that'll get sent over to python as a numpy array with zero copy :)

erikwijmans · 2020-02-10T23:18:04Z

src/esp/nav/PathFinder.cpp

@@ -1128,6 +1132,42 @@ bool PathFinder::Impl::isNavigable(const vec3f& pt,
  return true;
 }

+std::vector<std::vector<bool>> PathFinder::Impl::getTopDownView(
+    const float res) {


The caller should be responsible for providing a height.

erikwijmans · 2020-02-10T23:21:08Z

src/esp/nav/PathFinder.cpp

@@ -229,6 +231,8 @@ struct PathFinder::Impl {

  std::pair<vec3f, vec3f> bounds() const { return bounds_; };

+  std::vector<std::vector<bool>> getTopDownView(const float res);


Let's give res a more intelligible name. Seems like it is being used as metersPerPixel?

erikwijmans · 2020-02-10T23:24:19Z

habitat_sim/utils/data/dataextractor.py

+        bound1, bound2 = self.pathfinder.get_bounds()
+        startw = min(bound1[0], bound2[0])
+        starth = min(bound1[2], bound2[2])
+        starty = self.pathfinder.get_random_navigable_point()[


If you have the caller send in the height for generating the top-down map, then you can get this exactly.

erikwijmans · 2020-02-10T23:26:26Z

habitat_sim/utils/data/dataextractor.py

+
+import habitat_sim
+import habitat_sim.bindings as hsim
+from examples.settings import make_cfg


We should not be pulling something in from examples into the core of habitat-sim.

erikwijmans · 2020-02-10T23:28:11Z

habitat_sim/utils/data/dataextractor.py

+        new_state.rotation = rot
+        self.sim.agents[0].set_state(new_state)
+        obs = self.sim.get_sensor_observations()
+        sample = {


Color is output as RGB-A, so you'll want to either chop-off the alpha channel or change the name.
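For reference, chopping off the alpha channel is a one-line slice on the observation array. This is only a sketch: it assumes the color observation is a height x width x 4 NumPy array, and `rgba_to_rgb` is a hypothetical helper name, not something in the PR.

```python
import numpy as np


def rgba_to_rgb(rgba):
    """Drop the alpha channel from an H x W x 4 image (hypothetical helper)."""
    rgba = np.asarray(rgba)
    if rgba.ndim != 3 or rgba.shape[-1] != 4:
        raise ValueError(f"expected an H x W x 4 array, got shape {rgba.shape}")
    # Slicing returns a view, so no pixel data is copied here.
    return rgba[..., :3]


# Stand-in for a color observation; a real one would come from the simulator.
fake_obs = np.zeros((2, 3, 4), dtype=np.uint8)
fake_obs[..., 3] = 255  # fully opaque alpha
rgb = rgba_to_rgb(fake_obs)
```

The alternative mentioned above is to keep all four channels and name the sample key "rgba" instead.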

erikwijmans · 2020-02-10T23:30:07Z

src/esp/nav/PathFinder.cpp

+    std::vector<bool> curRow;
+    for (int w = 0; w < widthResolution; w++) {
+      vec3f point = vec3f(curWidth, navigableHeight, curHeight);
+      if (isNavigable(point, 0.5)) {


This will need to change for the Eigen matrix, but anyways, you can do curRow.push_back(isNavigable(point, 0.5)) here.
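Pulling the review comments on getTopDownView together (caller-supplied height, a resolution argument named for what it measures, and writing the navigability result straight into a 2D boolean array), the loop could look roughly like the Python sketch below. It is not the C++ implementation from this PR: `top_down_view`, `is_navigable`, and the bounds layout are assumptions made for illustration.

```python
import math

import numpy as np


def top_down_view(bounds, height, meters_per_pixel, is_navigable):
    """Sample a boolean navigability grid at a caller-supplied height.

    bounds is ((x, y, z), (x, y, z)); is_navigable is a hypothetical
    predicate taking an (x, y, z) point and returning a bool.
    """
    (x0, _, z0), (x1, _, z1) = bounds
    x_min, x_max = min(x0, x1), max(x0, x1)
    z_min, z_max = min(z0, z1), max(z0, z1)
    cols = max(1, math.ceil((x_max - x_min) / meters_per_pixel))
    rows = max(1, math.ceil((z_max - z_min) / meters_per_pixel))

    grid = np.zeros((rows, cols), dtype=bool)
    for r in range(rows):
        for c in range(cols):
            point = (x_min + c * meters_per_pixel, height, z_min + r * meters_per_pixel)
            # Store the predicate result directly; no if/else branch is needed.
            grid[r, c] = bool(is_navigable(point))
    return grid


# Toy predicate: everything with x < 1.0 counts as navigable.
grid = top_down_view(((0.0, 0.0, 0.0), (2.0, 1.0, 1.0)), 0.0, 0.5, lambda p: p[0] < 1.0)
```

Because the height is an argument, the caller knows exactly which y value every cell of the grid was sampled at.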

codecov · 2020-02-11T01:10:11Z

Codecov Report

Merging #483 into master will increase coverage by 1.02%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master     #483      +/-   ##
==========================================
+ Coverage   59.19%   60.22%   +1.02%     
==========================================
  Files         165      168       +3     
  Lines        7340     7570     +230     
  Branches       84       84              
==========================================
+ Hits         4345     4559     +214     
- Misses       2995     3011      +16

Flag	Coverage Δ
#CPP	`54.78% <100.00%> (+0.22%)`	⬆️
#JavaScript	`10.00% <ø> (ø)`	⬆️
#Python	`80.72% <87.55%> (+0.99%)`	⬆️

Impacted Files	Coverage Δ
tests/test_data_extraction.py	`100.00% <0.00%> (ø)`
habitat_sim/utils/data/data_extractor.py	`84.14% <0.00%> (ø)`
habitat_sim/utils/data/pose_extractor.py	`86.31% <0.00%> (ø)`
habitat_sim/simulator.py	`95.54% <0.00%> (+1.21%)`	⬆️
habitat_sim/utils/common.py	`66.21% <0.00%> (+9.45%)`	⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 05db83e...3ec5eaf.

mpiseno · 2020-02-11T01:14:52Z

I had to force-push to fix some difficult errors that caused the API tests to fail. I will address the comments above. @erikwijmans, thanks for the feedback.

dhruvbatra · 2020-02-11T21:39:34Z

requirements.txt

@@ -5,3 +5,5 @@ numpy-quaternion
 pillow
 scipy>=1.3.0
 tqdm
+torch


I think we've avoided adding torch as a dependency so far. @erikwijmans -- thoughts?

Yeah, we should continue to avoid it as a hard dependency. @mpiseno we can have the dataset extractor class be API-compatible and then mix-in a torch dataset in the training examples or something. It could also be a soft dependency (i.e. you will only need it if you want to use this code).

The only thing I actually use PyTorch for is the tests in tests/test_dataset_extraction. I could remove this without any changes to the ImageExtractor class, but I think @mathfac wanted to run the data through a PyTorch model in testing?

That's fine, we have a soft dependency on pytorch for some tests already (--with-cuda relies on torch in places), but that soft dependency doesn't need to be in requirements.txt

Is it sufficient to simply remove torch from requirements.txt? Will the soft dependency during testing be handled appropriately even if I import torch in my tests/test_dataset_extraction.py file?

Yeah, just removing it from requirements.txt is sufficient. All a "soft" dependency means in these terms is something you don't list in requirements.txt and that the user only needs to install if they actually want to run certain code (as opposed to just importing the function).
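A rough illustration of the two ideas in this thread, an API-compatible extractor and a soft torch dependency. All names here (`optional_import`, `ListBackedExtractor`, `make_loader`) are hypothetical and are not part of the PR; the only torch call used is `torch.utils.data.DataLoader`, which accepts any object implementing `__len__` and `__getitem__`.

```python
import importlib


def optional_import(name):
    """Return the named module, or None if it is not installed."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


class ListBackedExtractor:
    """Stand-in for an extractor: it follows the map-style dataset
    protocol (__len__ / __getitem__) without importing torch."""

    def __init__(self, samples):
        self._samples = list(samples)

    def __len__(self):
        return len(self._samples)

    def __getitem__(self, idx):
        return self._samples[idx]


def make_loader(dataset, batch_size=4):
    """Only this helper needs torch, so torch stays a soft dependency."""
    torch = optional_import("torch")
    if torch is None:
        raise RuntimeError("install torch to use make_loader")
    return torch.utils.data.DataLoader(dataset, batch_size=batch_size)


extractor = ListBackedExtractor([{"rgb": 0}, {"rgb": 1}, {"rgb": 2}])
```

Importing this module works without torch installed; the dependency only matters once `make_loader` is called. In tests, `pytest.importorskip("torch")` gives a similar effect by skipping the test when torch is missing.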

mpiseno · 2020-02-12T01:55:29Z

Fixed everything you gave feedback on, @erikwijmans. Thank you for the help!

erikwijmans

One layout request: Can you break ImageExtractor and PoseExtractor into their own files? Both seem like they provide enough functionality to warrant being in their own file and it also makes quickly remembering where they are easier :)

erikwijmans · 2020-02-12T19:05:32Z

habitat_sim/utils/data/dataextractor.py

+        return (startw, starty, starth)  # width, y, height
+
+
+class TopdownView(object):


What does this class do? Seems like it's just a helper function?

Right now it is essentially a helper function. @msbaines thought the topdown functionality could be useful in other contexts in the future, which is why I made it separate.

erikwijmans · 2020-02-12T19:05:46Z

habitat_sim/utils/data/dataextractor.py

+
+class TopdownView(object):
+    def __init__(self, sim, height, pixels_per_meter=0.1):
+        self.topdown_view = np.array(


Should no longer need the np.array here.

erikwijmans · 2020-02-12T19:10:32Z

habitat_sim/utils/data/dataextractor.py

+            "silent": True,
+        }
+
+        return make_cfg(sim_settings)


Using the dictionary and helper function doesn't really seem necessary here. Just making the config in this method would be cleaner IMO.

For the next PR I'm working on, which depends on this one, I need to call this method from multiple classes, but for this branch it does seem cleaner to do what you suggested.

erikwijmans · 2020-02-12T19:11:12Z

habitat_sim/utils/data/dataextractor.py

+
+    def _compute_quat(self, cam_normal):
+        """Rotations start from -z axis"""
+        return quat_from_two_vectors(np.array([0, 0, -1]), cam_normal)


Suggested change

-        return quat_from_two_vectors(np.array([0, 0, -1]), cam_normal)
+        return quat_from_two_vectors(habitat_sim.geo.FRONT, cam_normal)
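For readers unfamiliar with what this call computes, here is a NumPy-only sketch of a shortest-arc rotation between two vectors. It is not habitat-sim's `quat_from_two_vectors`; `shortest_arc_quat` and `rotate` are hypothetical helpers, quaternions are plain `[w, x, y, z]` arrays, and the local `FRONT` constant stands in for `habitat_sim.geo.FRONT`, assumed here to be the -z axis as the replaced literal suggests.

```python
import numpy as np

# Local stand-in for habitat_sim.geo.FRONT (assumed to be the -z axis).
FRONT = np.array([0.0, 0.0, -1.0])


def shortest_arc_quat(v0, v1):
    """Unit quaternion [w, x, y, z] rotating direction v0 onto direction v1."""
    a = np.asarray(v0, dtype=float)
    b = np.asarray(v1, dtype=float)
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    w = 1.0 + float(np.dot(a, b))
    if w < 1e-8:
        # Antiparallel vectors: rotate 180 degrees about any axis orthogonal to a.
        helper = np.array([1.0, 0.0, 0.0]) if abs(a[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
        axis = np.cross(a, helper)
        axis = axis / np.linalg.norm(axis)
        return np.concatenate(([0.0], axis))
    q = np.concatenate(([w], np.cross(a, b)))
    return q / np.linalg.norm(q)


def rotate(q, v):
    """Rotate vector v by unit quaternion q = [w, x, y, z]."""
    w, u = q[0], q[1:]
    v = np.asarray(v, dtype=float)
    t = np.cross(u, v)
    return v + 2.0 * w * t + 2.0 * np.cross(u, t)


cam_normal = np.array([1.0, 0.0, 0.0])
q = shortest_arc_quat(FRONT, cam_normal)
```

Using a named constant instead of the literal keeps the "rotations start from -z" convention in one place.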

erikwijmans · 2020-02-14T02:17:53Z

habitat_sim/utils/data/dataextractor.py

+    def close(self):
+        r"""Deletes the instance of the simulator. Necessary for instatiating a different ImageExtractor.
+        """
+        self.sim.close()


Add a self.sim = None below, and then guard all of this with if self.sim is not None: so that close can be called twice.
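A minimal sketch of the guarded close described above. `FakeSim` and `Extractor` are stand-ins invented for illustration; only the shape of `close` mirrors the suggestion.

```python
class FakeSim:
    """Stand-in simulator that counts how many times close() is called."""

    def __init__(self):
        self.close_calls = 0

    def close(self):
        self.close_calls += 1


class Extractor:
    """Hypothetical owner of a simulator instance."""

    def __init__(self, sim):
        self.sim = sim

    def close(self):
        # Guard so that a second close() is a no-op instead of an error.
        if self.sim is not None:
            self.sim.close()
            self.sim = None


fake_sim = FakeSim()
extractor = Extractor(fake_sim)
extractor.close()
extractor.close()  # safe: the simulator was already released
```

With the guard in place, the underlying simulator is closed exactly once no matter how many times `close` is called.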

erikwijmans

Couple comments

erikwijmans · 2020-02-25T22:04:09Z

src/esp/nav/PathFinder.cpp

@@ -1222,5 +1258,17 @@ std::pair<vec3f, vec3f> PathFinder::bounds() const {
  return pimpl_->bounds();
 }

+// std::vector<std::vector<bool>> PathFinder::getTopDownView(


erikwijmans · 2020-02-25T22:04:48Z

habitat_sim/utils/data/poseextractor.py

@@ -0,0 +1,192 @@
+import collections


snake_case filename to match the rest of the codebase.

erikwijmans · 2020-02-25T22:05:00Z

habitat_sim/utils/data/dataextractor.py

@@ -0,0 +1,183 @@
+import math


Same request for filename.

erikwijmans · 2020-02-25T22:08:18Z

examples/Image-Data-Extraction-API.ipynb

@@ -0,0 +1,234 @@
+{


We banished notebooks a while ago as they don't play nice with git, CC @abhiskk

I had asked Michael to make one so that it serves as a tutorial. What would be better if not a notebook?

We banished them from git, they still exist, just not in git.

Oleksandr said I should make a tutorial on the habitat website that reflects the same information

mpiseno · 2020-02-26T00:25:43Z

I believe @mathfac said the 1 test that is currently failing is a known issue and if everything else is passing then this PR should be good. If that's the case, I'm just waiting on PR approval to merge.

* created data extraction API for habitat-sim
* height passed in as arg to getTopdownView and removed examples/settings import
* added jupyter notebook example for Dataset Extraction API and changed getTopdownView to return Eigen Matrix
* formatting
* moved PoseExtractor to separate file and other formatting changes
* removed torch dependency
* failing api tests fix
* this commit builds
* modified close method in ImageExtractor
* removed jupyter notebook and modified file names to match other python files format
* small test change

mpiseno changed the title ~~Dataset extraction~~ Dataset extraction API Feb 10, 2020

facebook-github-bot added the CLA Signed label Feb 10, 2020

mpiseno requested a review from msbaines February 10, 2020 23:11

erikwijmans suggested changes Feb 10, 2020

created data extraction API for habitat-sim

17131ca

mpiseno force-pushed the dataset-extraction branch from 2b007b1 to 17131ca Compare February 11, 2020 00:33

height passed in as arg to getTopdownView and removed examples/settings import

65db127

dhruvbatra reviewed Feb 11, 2020

Michael Piseno added 2 commits February 11, 2020 17:50

added jupyter notebook example for Dataset Extraction API and changed getTopdownView to return Eigen Matrix

20a19c1

formatting

e43c951

mpiseno requested review from mathfac and abhiskk February 12, 2020 01:56

erikwijmans reviewed Feb 12, 2020

moved PoseExtractor to separate file and other formatting changes

a1701b3

erikwijmans reviewed Feb 14, 2020

Michael Piseno added 3 commits February 21, 2020 10:54

removed torch dependency

36cbb5f

failing api tests fix

2107e54

this commit builds

1b7ae03

mpiseno force-pushed the dataset-extraction branch from 625beb0 to 1b7ae03 Compare February 25, 2020 01:27

modified close method in ImageExtractor

e547685

msbaines requested review from erikwijmans and dhruvbatra February 25, 2020 21:56

erikwijmans reviewed Feb 25, 2020

Michael Piseno added 2 commits February 25, 2020 14:20

removed jupyter notebook and modified file names to match other python files format

5c9963e

small test change

3ec5eaf

erikwijmans mentioned this pull request Feb 26, 2020

How to accuess all reachable positions within a Gibson dataset for a habitat agent? facebookresearch/habitat-lab#315

Closed

erikwijmans approved these changes Feb 26, 2020

mpiseno merged commit e083724 into master Feb 26, 2020

mpiseno deleted the dataset-extraction branch February 26, 2020 23:28

erikwijmans mentioned this pull request Mar 4, 2020

Use Heading sensor and Top Down Map without a task facebookresearch/habitat-lab#321

Closed
