
Sidewalk CV Code Image Resolution Bugs #2

Open
kavidey opened this issue Jul 11, 2019 · 9 comments


@kavidey
Collaborator

kavidey commented Jul 11, 2019

I just found an issue with how we were generating crops for panos in Seattle.
The issue is that the depth data in Seattle and in DC/Newberg has the same resolution, but the panos do not. The function that calculates the crop size (predict_crop_size) uses GSV_IMAGE_WIDTH and GSV_IMAGE_HEIGHT to decide where in the depth data file to sample. When it tries to sample depth data for points on the bottom or right of the panorama, it can’t find any data, because the resolution of the depth data didn’t increase with the resolution of the panorama. This can be fixed by always passing 13312 and 6656 (the resolution of the DC/Newberg images) to the function instead of the panorama’s actual width and height. Right now, when the function fails, it falls back on a less accurate function that calculates crop size based on position in the image.
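
A minimal sketch of the failure mode described above, not the project's actual predict_crop_size. The depth-map size (512x256) and the helper name `depth_index` are assumptions made for illustration, and the 16384x8192 pano size is inferred from the 0.8125 factor discussed later in this thread.

```python
REF_WIDTH, REF_HEIGHT = 13312, 6656    # DC/Newberg pano resolution (from the thread)
DEPTH_WIDTH, DEPTH_HEIGHT = 512, 256   # ASSUMED depth-map size, for illustration only


def depth_index(x, y):
    """Map a coordinate in the 13312x6656 reference space to a depth-map cell.

    Returns (row, col), or None when the coordinate falls outside the depth map.
    """
    col = int(x * DEPTH_WIDTH / REF_WIDTH)
    row = int(y * DEPTH_HEIGHT / REF_HEIGHT)
    if 0 <= col < DEPTH_WIDTH and 0 <= row < DEPTH_HEIGHT:
        return row, col
    return None


# A label near the bottom-right corner, expressed in reference coordinates.
print(depth_index(13000, 6500))

# The same label after being scaled into an assumed 16384x8192 pano:
# the lookup now lands outside the depth map, so no depth is found.
scale = 16384 / REF_WIDTH
print(depth_index(13000 * scale, 6500 * scale))
```

Under these assumptions the first lookup succeeds and the second returns None, which matches the observation that only labels toward the bottom and right lose their depth data.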

I don’t think this is a big deal for the ASSETS paper because of how few labels are on the very bottom and right of the panos, but it is definitely something that we should fix and it makes me wonder how many other pieces of code are being affected in weird ways by the difference in resolution. I think that this thread can be a place to gather and resolve any additional bugs we find related to image resolution.

@jonfroehlich
Member

I don't think the solution should be to use constants. Instead, why can't we read the resolution of the input image?

@galenweld
Collaborator

Gotcha.... well.... Damnit! I should have asked about depth data because this is a totally reasonable type of error to expect.

Regardless, @kaviMD I don't think your conclusion sounds correct. It sounds to me like the SV_x and SV_y values should not be scaled by our ~1.23 factor when sampling from depth data, no? So with our current approach (i.e. with the crops we're currently using for training), even when we do get depth data, it's not for the correct section of the scene. Again, this isn't going to make a huge difference, but it sounds to me like we probably ought to redo our Seattle crops once we fix this; otherwise we're not using the same technique we used for Newberg and D.C. If this is the case, we should confirm ASAP because it will take some time to redo the work.

@kavidey
Collaborator Author

kavidey commented Jul 11, 2019

Yeah, I think you're right. For the higher resolution images, the X,Y coordinate needs to be effectively "unscaled" by multiplying by 0.8125 (the inverse of 1.2307692308). I can do that automatically in the code so that it works for both high and low res images.
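
One way the "unscaling" could look, as a rough sketch rather than the code that was actually committed. The function name `to_reference_coords` is hypothetical, and the 16384-pixel width is an assumption derived from 13312 / 0.8125.

```python
REF_WIDTH = 13312  # width of the lower-res DC/Newberg panos (from the thread)


def to_reference_coords(x, y, image_width):
    """Convert pixel coordinates in a pano of `image_width` back into the
    13312-wide reference space used when sampling depth data."""
    factor = REF_WIDTH / image_width  # 0.8125 for an assumed 16384-wide pano
    return x * factor, y * factor


# Low-res panos are left unchanged; high-res coordinates shrink by 0.8125.
print(to_reference_coords(1000, 500, 13312))
print(to_reference_coords(1000, 500, 16384))
```

Deriving the factor from the image width, instead of hard-coding 0.8125, keeps the same function usable for both resolutions.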

@tongning

Good catch! Just want to add that even with correct depth data, it might be appropriate to reevaluate the formula used to compute the crop size from distance. The way I originally came up with that formula, I just manually made a bunch of crops, plotted a chart of distance vs. the crop size I used, and fitted a formula using Excel. I'm not sure whether it might be necessary to redo this process to come up with a new formula for the Seattle resolution, or if simply scaling up the predicted crop size by a little bit would be sufficient.

@jonfroehlich
Member

jonfroehlich commented Jul 11, 2019

Thanks @tongning. Yes, the results of some of @tongning's initial crop experiments are here: ProjectSidewalk/SidewalkWebpage#633 (comment). Is there another location with a fuller examination?

This is definitely something that @kaviMD or @sarda-devesh could look into improving in the future. We should create a new Issue for it.

Update: created Issue: #3

@kavidey
Collaborator Author

kavidey commented Jul 11, 2019

@galenweld and I just met and we came up with a universal solution that eliminates all future issues with image sizes. It involves three changes to the code.

First:
The global variables GSV_IMAGE_WIDTH and GSV_IMAGE_HEIGHT will always be equal to 13312 and 6656, the resolution of the lower-res Newberg/DC panos. As a result, any conversion between SV and XY coordinates, along with other math such as generating crop locations, will be calculated as if it were for the lower-res panos.

Second:
When computing the center location of a crop, IF the image is a higher-resolution Seattle image, the x and y coordinates of the crop will be scaled up accordingly. (This can easily happen automatically in code.)

Third:
When computing the size of a crop, IF the image is a higher-resolution Seattle image, the size of the crop will be scaled up. (Again, this is easy to do automatically in code.)

These three changes should not only handle the current different image resolutions automatically in every part of the CV pipeline, but also make the code forward and backward compatible in the event that Google changes image resolution again.
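
The three changes could be sketched roughly as follows. This is an illustration under stated assumptions, not the committed implementation: `crop_box` and its parameters are hypothetical names, the crop is assumed to be square, and a single scale factor is used for both axes on the assumption that every pano keeps the same 2:1 aspect ratio.

```python
# First: the globals always describe the lower-res reference space.
GSV_IMAGE_WIDTH = 13312
GSV_IMAGE_HEIGHT = 6656


def crop_box(ref_x, ref_y, ref_crop_size, image_width):
    """Return (left, top, right, bottom) in the pixel space of the actual image.

    ref_x, ref_y and ref_crop_size are expressed in the 13312x6656 reference
    space; image_width is the width of the pano being cropped.
    """
    scale = image_width / GSV_IMAGE_WIDTH        # 1.0 for DC/Newberg panos
    # Second: scale the crop center into the actual image.
    center_x, center_y = ref_x * scale, ref_y * scale
    # Third: scale the crop size by the same factor.
    half = ref_crop_size * scale / 2
    return (round(center_x - half), round(center_y - half),
            round(center_x + half), round(center_y + half))


# Same label and reference crop size on a low-res and an assumed high-res pano.
print(crop_box(6656, 3328, 1300, 13312))
print(crop_box(6656, 3328, 1300, 16384))
```

Because the scale is computed from the image itself, a future resolution change would only alter `image_width`, not the reference-space math.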

@jonfroehlich
Member

jonfroehlich commented Jul 11, 2019 via email

@galenweld
Collaborator

To elaborate on one point: the crop size needs to be scaled up when cropping from a higher-resolution panorama because the same object covers proportionally more pixels. A curb ramp that takes up 1 degree of the field of view will be more pixels wide in a higher resolution image.
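
As a quick worked example of that point, assuming the panorama spans a full 360 degrees horizontally with pixels spread evenly across it (the helper name `pixels_for_angle` and the 16384 width are assumptions):

```python
def pixels_for_angle(degrees, image_width):
    """Horizontal pixels covered by `degrees` of view in a pano assumed to
    span 360 degrees across `image_width` pixels."""
    return degrees * image_width / 360.0


low = pixels_for_angle(1, 13312)    # roughly 37 pixels per degree
high = pixels_for_angle(1, 16384)   # roughly 45.5 pixels per degree
print(low, high, high / low)
```

The ratio between the two results is the same factor of about 1.23 by which the crop size has to grow.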

@galenweld
Collaborator

And yes, Jon, your understanding is correct. The SV_x and SV_y values (as stored in the main Project Sidewalk database) are based upon a coordinate space of size 13312x6656. We use this coordinate system universally for reading from depth data and determining where in the panorama a label is located. The only place we need to adjust this is when converting from SV_x and SV_y to pixel values to determine where to make a crop from an underlying panorama image. In this case, and this case only, we need to consider the resolution of the image we're cropping from, which @kaviMD will implement automatically based on that resolution.
