Non-360 View #25
By introducing the "occluded" class, we explicitly ask the network not to hallucinate, i.e., not to make predictions about areas it cannot possibly reason about. Let's assume you pre-trained DeepLab Xception with our dataset for 4 cameras (360°). If you then created a classical IPM image using only 3 cameras (<360°), whatever the network predicted in the uncovered region would have to be considered "undefined behavior", since the network was trained on full 360° IPM images. You could, however, apply the same occlusions to our inputs and labels in pre-processing, such that the network would also reliably predict the region not covered by your cameras as "occluded". If you wanted to use the uNetXST approach, this gets more complicated, since the transformation happens inside the network. In that case, you probably cannot train on e.g. 4 cameras and then run inference on 3 different cameras without further modifications.
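As a rough illustration of that pre-processing step, here is a minimal sketch (Python/NumPy). The function name, the boolean mask layout, and the `OCCLUDED_COLOR` value are hypothetical and not taken from the Cam2BEV code; the actual occluded color depends on your class palette:

```python
import numpy as np

# Hypothetical RGB color for the "occluded" class; use your palette's value.
OCCLUDED_COLOR = np.array([150, 150, 150], dtype=np.uint8)

def mask_uncovered_region(ipm_input, bev_label, uncovered_mask):
    """Paint the camera-uncovered region as 'occluded' in both the
    homography (IPM) input image and the BEV ground-truth label.

    uncovered_mask: boolean array (H, W), True where no camera sees the ground.
    """
    ipm_input = ipm_input.copy()
    bev_label = bev_label.copy()
    ipm_input[uncovered_mask] = 0               # black out uncovered input pixels
    bev_label[uncovered_mask] = OCCLUDED_COLOR  # relabel uncovered pixels as occluded
    return ipm_input, bev_label
```

Applying the same mask to both the input and the label is the key point: the network then consistently sees "no information here" paired with the "occluded" target.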
So if the simulation is set up to also not have a 360-degree view, since that is how the cameras are mounted on the car, we could train uNetXST with a non-360-degree view. This would work because our training data would also not have a 360-degree view, correct? The area not covered by the cameras wouldn't be part of the occluded class, because there is simply no data there to classify, even during training.
Well, you need to assign some class to the uncovered area in the bird's-eye-view label image. Whether that is the occluded class or a new class such as unlabeled is up to you. But yes, training such a setup should be possible in principle.
Note that this is in principle similar to the
Understood. So, in that case, the unseen area was just categorized as occluded. So we could augment the BEV images to pre-label the unseen area as occluded before showing the labels to the network during training.
Yes, exactly. Note that you should also apply this "augmentation" to the input image, i.e., the homography image. As I said earlier, if you want to use the uNetXST approach, this gets more complicated and may not be feasible without custom training data.
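A minimal sketch of building such an augmentation mask, assuming the ego vehicle sits at the BEV image center and the uncovered region is an angular wedge (e.g. a ~30° gap at the rear); the function name and angle convention are hypothetical, not part of the repository:

```python
import numpy as np

def angular_gap_mask(height, width, gap_center_deg, gap_width_deg):
    """Boolean mask of a pie-slice (angular wedge) around the image center,
    e.g. a ~30 deg rear sector that no camera covers.

    Angle convention (assumed): 0 deg points up (vehicle front),
    angles grow clockwise, so 180 deg is the rear.
    """
    ys, xs = np.mgrid[0:height, 0:width]
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    # angle of each BEV pixel relative to the image center
    ang = np.degrees(np.arctan2(xs - cx, -(ys - cy)))  # 0 deg = up
    # smallest signed angular difference to the gap center, in (-180, 180]
    diff = (ang - gap_center_deg + 180.0) % 360.0 - 180.0
    return np.abs(diff) <= gap_width_deg / 2.0
```

For example, `angular_gap_mask(256, 256, 180.0, 30.0)` marks a 30° wedge behind the vehicle; that mask can then be applied to both the homography input and the BEV label as described above.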
Ah, I see. So the plan was to use our own simulation data, write a script to semantically segment the images using known ground truths from a drone camera, and then use your script to cast rays and determine where the occluded regions are from our intrinsics/extrinsics. If I understand correctly, though, this wouldn't work, because we need to generate a homography image using the ipm script, and homography won't work unless there are overlapping regions. Does this sound accurate?
Does the above seem correct? If so, we will look into ways around the problem. Otherwise, we will allocate some time to testing this soon.
Creating a homography image using
Does that make sense?
Yes, at a high level this makes sense. We will try to look into this deeper. Thank you again.
Our camera suite lacks a view of a ~30-degree region at the rear of the vehicle. Classical image-stitching methods would break down without overlapping regions between cameras. Since this method uses a learning network, I could see it being robust to this.
So to clarify the question: how would you expect this method to respond to a non-360 view? Would the region just show up as the added "occluded" class, since it is not within the view of any of the cameras?