depth_from_video_in_the_wild: not able to reproduce the result #46
@gariel-google Dear author, thanks for sharing the source code of the paper.
I was trying to reproduce the result of the paper using your code. However, with your default settings (batch_size=4, learning_rate=0.0002, etc.) and training from scratch, the result I got is quite far from what you stated in the paper (Abs Rel of 0.147 for the best checkpoint, at around the 370k-th step, vs. 0.128 in the paper). For your information, I am using the evaluation code from sfmlearner <https://github.com/tinghuiz/SfMLearner/tree/master/kitti_eval>, as struct2depth does.
Therefore, may I know what settings were used to obtain the paper's results? Or is there any critical part missing in the currently released code (maybe a pretrained checkpoint, for example)?
Thank you in advance.

Comments
Hi,
I am assuming you're training on KITTI? Did you create a "possibly mobile" mask for each image? Did you use a segmentation network to do that? Which one?
Hi, sorry for the missing information in my previous comment. Yes, I am using KITTI, eigen split, using the data generation code from vid2depth <https://github.com/tensorflow/models/tree/master/research/vid2depth/dataset>, as struct2depth does. Yes, I have created a "possibly mobile" mask for each image. I am using the same masks as struct2depth (each object has a different object ID and the objects are tracked across the three consecutive frames). I am using Mask-RCNN to obtain the masks. Also, I have turned on boxify=True, so the masks become bounding boxes, as I understand. For your information, I also attached the TensorBoard image of the variable seg_stack from line 172 of model.py <https://github.com/google-research/google-research/blob/master/depth_from_video_in_the_wild/model.py>:
[image: Screenshot 2019-08-18 at 9 24 02 AM] <https://user-images.githubusercontent.com/18667188/63219007-1355ea00-c19b-11e9-88ef-e1da17bf8943.png>
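(As I understand the boxify flag, a minimal numpy sketch of its effect; the function here is illustrative, not the repo's implementation:)

```python
import numpy as np

def boxify_mask(mask):
    """Replace each object's segment with its tight bounding box.

    mask: int array (H, W); 0 is background, positive values are object IDs.
    Overlapping boxes simply overwrite each other in this sketch.
    """
    out = np.zeros_like(mask)
    for obj_id in np.unique(mask):
        if obj_id == 0:
            continue
        ys, xs = np.where(mask == obj_id)
        out[ys.min():ys.max() + 1, xs.min():xs.max() + 1] = obj_id
    return out
```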
Thanks for the information, that helps a lot. On KITTI we trained with a batch size of 16 and a learning rate of 1e-4. All other parameters are at their default values. In addition, we initialized from an ImageNet checkpoint for ResNet 18. The current release does not yet support initialization from a checkpoint, but it should be easy to set up. Fig. 5 in the paper indicates when you should expect convergence. Lastly, are you using https://github.com/google-research/google-research/blob/master/depth_from_video_in_the_wild/model.py#L388 to infer depth?
We are planning to release pretrained checkpoints, more code and more documentation before ICCV. We will do our best to do it sooner rather than later.
Thanks for the reply. Yes, for depth inference, I am using the link <https://github.com/google-research/google-research/blob/master/depth_from_video_in_the_wild/model.py#L388> you mentioned in your previous comment. In Figure 5 of the paper, for *Evaluated on KITTI*, the training converges at around 1 million training images. So if we assume that batch_size=4, learning_rate=0.0002 has similar convergence, and given that my training reached Abs_Rel=0.147 for the best checkpoint at around the 370k-th step (370k steps x batch size 4 = 1.48 million images), can I conclude that the pretrained ImageNet checkpoint has a huge impact (0.147 vs 0.128) on the result? Looking forward to your release and thanks for the efforts.
I feel that it's too early to draw a conclusion; we need to investigate this more. If you're ready to do it together, that would be great. We are planning to release a pretrained KITTI checkpoint in the next few days, and a first step could be to establish that we agree on how the eval metrics are calculated. We can both run evals on the same checkpoint and exchange results. Based on that, we'll see how to proceed. How does that sound?
Hi, it sounds great to me. Let's do it together.
In the paper, it is said that some videos were selected from 3079 YouTube8M videos labeled 'Quadcopter'; will their IDs be made public soon? I also realize that it takes much time to process so many videos into three-frame sequences and to generate the masks and alignment, so will the pretrained YouTube8M checkpoint also be released soon? And I noticed that the current release does not yet support initialization from the ResNet-18 checkpoint pretrained on ImageNet; I'm trying to write code to implement that, since struct2depth has a similar code organization...
Yes, we will release the IDs soon, and also pretrained checkpoints.
Regarding the initialization mechanism, we're trying to release it soon,
but it might take a bit longer. We will update as soon as possible.
Hi @gariel-google, I am also evaluating the egomotion prediction, using inference_egomotion <https://github.com/google-research/google-research/blob/master/depth_from_video_in_the_wild/model.py#L423> to obtain the egomotion and sfmlearner <https://github.com/tinghuiz/SfMLearner/blob/master/kitti_eval/eval_pose.py> to compute the 5-point/3-point ATE. Since I am evaluating my own trained model (the one with Abs_Rel=0.147, trained on the eigen split training set), and the eigen split training set has frames overlapping with odometry sequences 09 and 10, the ATE should be reasonably good. However, the result I got is quite bad compared to what you stated in the paper. (I apologize if I am evaluating the egomotion prediction wrongly.)

|         | Seq. 09 | Seq. 10 |
|---------|:-------:|:-------:|
| 5-point | 0.0296  | 0.0245  |
| 3-point | 0.0212  | 0.0180  |

For your information, I also attached the plotted trajectories.
[image: seq09] <https://user-images.githubusercontent.com/18667188/63317302-e0e7f080-c344-11e9-9916-90d3327fd694.png>
[image: seq10] <https://user-images.githubusercontent.com/18667188/63317304-e3e2e100-c344-11e9-9780-5357bfd24a6b.png>
Maybe we could exchange the odometry evaluation results as well?
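For reference, a minimal numpy sketch of the snippet ATE as I understand it from the linked kitti_eval code (origin alignment plus a least-squares scale, since the predictions are scale-less):

```python
import numpy as np

def compute_ate(gt_xyz, pred_xyz):
    # Align both snippets at their first frame.
    gt = gt_xyz - gt_xyz[0]
    pred = pred_xyz - pred_xyz[0]
    # Least-squares scale factor for the scale-less predictions.
    scale = np.sum(gt * pred) / np.sum(pred ** 2)
    alignment_error = pred * scale - gt
    return np.sqrt(np.sum(alignment_error ** 2)) / len(gt)
```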
We will release the KITTI-trained checkpoints; that should be enough for comparing odometry evals, correct?
It will take at least a few days though - please bear with me ;-)
Yes, exchanging results should be enough for comparing odometry evals. Thanks again for the efforts.
We just released some checkpoints (links in the README file) with the respective metrics. Note that there is a slight change in the code (in depth_prediction_net). Would you be ready to try them and see what metrics you obtain? YouTube8M IDs coming soon.
Thanks for the release. Yes, I am ready to try! However, I noticed that only the data files are released, and as I understand it, to restore a model in TensorFlow we need 3 files (correct me if I am wrong) -- index, data, and meta. Therefore, could you release the complete checkpoints?
Sorry about that. I replaced the links; they now link to zip files that contain all the checkpoint components. Could you check them out? Thanks!
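For reference, a TF1 checkpoint is addressed by its prefix rather than by any single file; a minimal restore sketch (paths illustrative):

```python
import tensorflow.compat.v1 as tf

# After unzipping, the folder should contain three files sharing one prefix:
#   model-1000977.index, model-1000977.meta, model-1000977.data-00000-of-00001
ckpt_prefix = 'checkpoints/model-1000977'  # illustrative path

saver = tf.train.import_meta_graph(ckpt_prefix + '.meta')
with tf.Session() as sess:
    saver.restore(sess, ckpt_prefix)  # pass the prefix, not a single file name
```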
YouTube8M IDs are out (see the README file).
Hi @gariel-google, thanks for the work. The new zip files work for me. I have tested the checkpoint trained on KITTI. The following is what I have:

The depth result differs when running inference with different batch_size:

|               | abs_rel | sq_rel | rms    | log_rms | a1     | a2     | a3     |
|---------------|:-------:|:------:|:------:|:-------:|:------:|:------:|:------:|
| batch_size=1  | 0.1262  | 0.9462 | 5.2214 | 0.2086  | 0.8470 | 0.9475 | 0.9774 |
| batch_size=16 | 0.1305  | 1.0186 | 5.3237 | 0.2136  | 0.8389 | 0.9430 | 0.9751 |

With batch_size=1 we get the same result, so we should be computing the same evaluation metrics. However, the depth output is not consistent when batch_size changes; is it the same case for you? Where does the variation come from?

Odometry result (ATE) when running inference with batch_size=1:

|         | seq_09 | seq_10 |
|---------|:------:|:------:|
| 5-point | 0.0231 | 0.0195 |
| 3-point | 0.0170 | 0.0149 |

The plotted trajectories:
seq_09
[image: dfvauthorseq09] <https://user-images.githubusercontent.com/18667188/63561747-a02ee800-c58d-11e9-8462-285ec61e84e7.png>
seq_10
[image: dfvauthorseq10] <https://user-images.githubusercontent.com/18667188/63561754-aa50e680-c58d-11e9-9f53-762a795cba98.png>
The odometry result looks quite bad to me. Do you have the same result? Since the eigen split training set overlaps with odometry sequences 09 and 10, shouldn't the ATE be better than what you stated in the paper (0.0231 vs 0.012 and 0.0195 vs 0.010)?
Thanks much @liyingliu for testing the checkpoints so quickly. I am happy that we are getting the exact same result on depth prediction.
Regarding the batch size: we tested at 1. If batch normalization is replaced everywhere by randomized layer normalization, the inference results do not depend on the batch size, as they should not. Due to an oversight, when we were obtaining the results for the paper, we left a few batch-normalization layers in place. We have since fixed that, but to remain compatible with the checkpoints used for the paper, we needed to leave the batch norms there, hence the dependence on the batch size.
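To illustrate with a toy numpy sketch (not the model's actual code): batch normalization in training mode normalizes each channel by statistics of the whole batch, so a given image's output depends on its batch-mates:

```python
import numpy as np

def batch_norm_train_mode(x, eps=1e-3):
    # Normalize over batch and spatial axes, per channel, as conv batch norm does.
    mean = x.mean(axis=(0, 1, 2), keepdims=True)
    var = x.var(axis=(0, 1, 2), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

x = np.random.RandomState(0).randn(16, 4, 4, 8)  # a batch of 16 toy feature maps
out_in_batch_of_16 = batch_norm_train_mode(x)[0]
out_in_batch_of_1 = batch_norm_train_mode(x[:1])[0]
print(np.abs(out_in_batch_of_16 - out_in_batch_of_1).max())  # clearly nonzero
```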
Regarding odometry, there may be a few explanations for that:
1. We used a more mature checkpoint for odometry, which seemed to converge more slowly than depth prediction.
2. We used inference-time correction for the intrinsics (even though the result you're showing seems worse than even our uncorrected one).
3. There might be a difference in the way we stack the rotations and translations together to obtain a trajectory - this is a bit tricky (see the sketch below).
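To make point 3 concrete, a sketch of one common stacking convention (the direction of each relative transform and the multiplication order are exactly the details that can differ between implementations):

```python
import numpy as np

def accumulate_trajectory(rotations, translations):
    """Compose per-pair relative motions into global (x, y, z) positions.

    rotations: sequence of 3x3 relative rotation matrices.
    translations: sequence of 3-vectors, expressed in the source frame.
    """
    pose = np.eye(4)
    positions = [pose[:3, 3].copy()]
    for rot, trans in zip(rotations, translations):
        rel = np.eye(4)
        rel[:3, :3] = rot
        rel[:3, 3] = trans
        pose = pose @ rel  # right-multiply: apply each motion in the current frame
        positions.append(pose[:3, 3].copy())
    return np.array(positions)
```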
We can debug odometry together: we can release the checkpoints used for odometry and the respective inferred trajectories.
At this point I would like to ask what you would like to prioritize - getting the code and checkpoint for ImageNet initialization, so that you can reproduce the training, or getting the odometry evaluation right? Please let me know which one you prefer and I'll start there. It will take at least a few days, in either case.
Thank you again for your help in debugging this.
Hi @gariel-google, sorry for the late reply and thanks for your explanations.
Hi @gariel-google, thanks for your checkpoint. But when I test intrinsics inference with your KITTI checkpoint, the intrinsic matrix is not right.
The input is one pair of KITTI images like this:
[image: kitti2] <https://user-images.githubusercontent.com/6870525/63674721-b81a9c00-c819-11e9-90b4-17f32a181d82.png>
We use the top two images to infer the intrinsic matrix:

[[119.80293    0.       702.5139  ]
 [  0.        74.126114 -29.449604]
 [  0.         0.         1.      ]]

But the ground-truth intrinsic matrix is:

[[241.67446312   0.         204.16801031]
 [  0.         246.28486827  59.000832  ]
 [  0.           0.           1.        ]]

Why is the intrinsic matrix not right?
@liyingliu Cool, so I am aiming to release the code for initializing from an ImageNet checkpoint, and the respective checkpoint, this week.
@buaafish Let's try to debug it together; let me start by asking you some questions:
1. Do we have enough evidence to rule out an incorrect normalization of the images (0-1 vs 0-255), or some other error in running the inference? For example, were you able to reproduce the depth metrics? Were you able to obtain reasonable trajectories on the KITTI odometry set?
2. How did you calculate the intrinsics at inference time? Did you run the inference twice, swapping the order of the images, and take the average, like here <https://github.com/google-research/google-research/blob/master/depth_from_video_in_the_wild/model.py#L258>? (A sketch follows below.)
3. As we write in the paper, there are two settings in which we learn the intrinsics: one with the constraint that the intrinsics are the same throughout the dataset (as in EuRoC), and the other where we predict the intrinsics independently from each pair of images. In the second case, which is the one you are referring to, Eq. 3 and Fig. 9 show that the intrinsics are only correct up to the accuracy imposed by the rotations, and when there are no rotations, the error can be large. Have you tried to create a plot similar to Fig. 9, or is the example you're showing the only one you ran? Does that example have rotations?
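For question 2, a minimal sketch of the symmetrized estimate (assuming, as in the test code further down, that inference_egomotion returns the predicted intrinsics as its third output):

```python
def estimate_intrinsics(test_model, img1, img2, sess):
    # Run the egomotion/intrinsics inference in both frame orders and average;
    # assumes inference_egomotion returns the predicted intrinsics at index 2.
    ret_fwd = test_model.inference_egomotion(img1, img2, sess)
    ret_bwd = test_model.inference_egomotion(img2, img1, sess)
    return 0.5 * (ret_fwd[2] + ret_bwd[2])
```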
My inference code is like this (the full test code is in my next comment): I modified your code, and then I read RGB images to feed to image1 and image2. The intrinsic matrix is not right either.
@gariel-google Test code like this:

```python
# (Imports and FLAGS definitions as in depth_from_video_in_the_wild's train.py.)

def main(_):
  seed = FLAGS.seed
  tf.set_random_seed(seed)
  np.random.seed(seed)
  random.seed(seed)
  if not gfile.Exists(FLAGS.checkpoint_dir):
    gfile.MakeDirs(FLAGS.checkpoint_dir)
  test_model = model.Model(
      boxify=FLAGS.boxify,
      data_dir=FLAGS.data_dir,
      file_extension=FLAGS.file_extension,
      is_training=False,
      foreground_dilation=FLAGS.foreground_dilation,
      learn_intrinsics=FLAGS.learn_intrinsics,
      learning_rate=FLAGS.learning_rate,
      reconstr_weight=FLAGS.reconstr_weight,
      smooth_weight=FLAGS.smooth_weight,
      ssim_weight=FLAGS.ssim_weight,
      translation_consistency_weight=FLAGS.translation_consistency_weight,
      rotation_consistency_weight=FLAGS.rotation_consistency_weight,
      batch_size=FLAGS.batch_size,
      img_height=FLAGS.img_height,
      img_width=FLAGS.img_width,
      weight_reg=FLAGS.weight_reg,
      depth_consistency_loss_weight=FLAGS.depth_consistency_loss_weight,
      queue_size=FLAGS.queue_size,
      input_file=FLAGS.input_file)
  _test(test_model, FLAGS.checkpoint_dir)


def readImages(path, subdir, name):
  # Each input file holds two 416-pixel-wide frames side by side.
  filename = name + ".png"
  filepath = os.path.join(path, subdir, filename)
  im = Image.open(filepath)
  im_array = np.array(im)
  img1 = im_array[:, 0:416, :]
  img2 = im_array[:, 416:832, :]
  return img1[np.newaxis, :, :, :], img2[np.newaxis, :, :, :]


def readMat(path, subdir, name):
  # Reads the 3x3 ground-truth intrinsic matrix from the *_cam.txt file.
  filename = name + "_cam.txt"
  filepath = os.path.join(path, subdir, filename)
  data_temp = []
  with open(filepath) as fdata:
    line = fdata.readline()
    data_temp.append([float(i) for i in line.split(',')])
  return np.array(data_temp).reshape((3, 3))


def readFileList(list_data_dir):
  with gfile.Open(list_data_dir) as f:
    frames = f.readlines()
  frames = [k.rstrip() for k in frames]
  subfolders = [x.split(' ')[0] for x in frames]
  frame_ids = [x.split(' ')[1] for x in frames]
  return subfolders, frame_ids


def _test(test_model, checkpoint_dir):
  checkpointpath = "./pretrained/cityscapes_kitti_learned_intrinsics/"
  saver = tf.train.import_meta_graph(checkpointpath + 'model-1000977.meta')
  checkpoint = checkpointpath + "model-1000977"
  with tf.device('/cpu:0'):
    with tf.Session() as sess:
      sess.run(tf.local_variables_initializer())
      sess.run(tf.global_variables_initializer())
      logging.info('Loading checkpoint...')
      saver.restore(sess, checkpoint)
      print(reader.IMAGENET_MEAN)
      print(reader.IMAGENET_SD)
      logging.info('Reading data...')
      path = "./kitti/format_data"
      list_data_dir = "test.txt"
      subfolders, frame_ids = readFileList(list_data_dir)
      for (subdir, name) in zip(subfolders, frame_ids):
        img1, img2 = readImages(path, subdir, name)
        logging.info('Start testing...')
        ret = test_model.inference_egomotion(img1, img2, sess)
        print(ret[2])  # the predicted intrinsic matrix
        mat = readMat(path, subdir, name)
        print(mat)
      logging.info('End testing...')


if __name__ == '__main__':
  app.run(main)
```

@liyingliu We just added the code for initialization from ImageNet, as well as some corrections to the hyperparameters for training. Unfortunately I was unable to obtain clearance to release the specific ImageNet checkpoint itself yet - sorry about that, things sometimes get more bureaucratic than expected.
@buaafish Thanks for sharing your code; it's not easy for me, though, to spot a bug if there is one. Is there a chance you have an answer for me on whether you were able to reproduce the depth inference metrics and/or whether the trajectories look reasonable? The intrinsic matrix is so far off that I still suspect there is some sort of crude error somewhere.
My next steps are to release the checkpoints we used for calculating odometry, with learned and given intrinsics, as well as the respective odometry trajectories. Then I can try to add a small piece of code for generating Fig. 9 in the paper for the intrinsics, which should hopefully resolve the intrinsics issue.
Thank you all for your help debugging this; our goal is that everyone will be able to reproduce our results.
@gariel-google Understood, and thanks! Looking forward to exchanging the odometry results.
@liyingliu we just released the odometry results, and code for generating trajectories from checkpoints.
@buaafish intrinsics is coming next.
@gariel-google did you get clearance to release the specific ImageNet checkpoint? I want to try with that.
Not learning.
@cognitiveRobot If your question is how to restore and train from the checkpoint that the author provided, then you could try to add a file named "checkpoint" in your checkpoint folder (the folder containing the .index, .meta and .data-xxxx files). The content of the "checkpoint" file can be the following:
model_checkpoint_path: "path_to_kitti_learned_intrinsics/model-248900"
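With that file in place, TensorFlow can resolve the checkpoint prefix automatically; a quick sketch:

```python
import tensorflow.compat.v1 as tf

# Reads the 'checkpoint' file and returns the newest prefix, e.g.
# 'path_to_kitti_learned_intrinsics/model-248900', which can then be
# passed to tf.train.Saver().restore(sess, ckpt).
ckpt = tf.train.latest_checkpoint('path_to_kitti_learned_intrinsics')
print(ckpt)
```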
@cognitiveRobot the pretrained ResNet-18 checkpoint that we used was taken from here. We cannot release it here because it was taken from somewhere else. Sorry about that...
Sorry, here's the link to where we took the pretrained checkpoint from:
https://pytorch.org/docs/stable/torchvision/models.html
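That is, the weights come from torchvision's model zoo; a sketch of fetching them (converting the PyTorch state dict into TF variables requires a name-mapping step that is not shown here):

```python
import torch
import torchvision

# Downloads the ImageNet-pretrained ResNet-18 weights.
resnet18 = torchvision.models.resnet18(pretrained=True)
torch.save(resnet18.state_dict(), 'resnet18_imagenet.pth')
```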
@gariel-google, thanks. I will test. :)
@liyingliu Sorry to bother you, but I have some questions I'd like to ask you. I tried to infer the depth map using the existing checkpoint cityscape_kitti, but the depth value I read directly from the '.npy' file was far from the real depth, whether for my own images or for images from cityscapes. Did I do something wrong? Or are further operations required to obtain true depth values? Thank you very much.
I used 'inference.py' from https://github.com/tensorflow/models/blob/master/research/struct2depth/inference.py; img_width and img_height are the defaults (416, 128).
Hi, there is an unknown scale factor between the depth predicted by the network and the real depth. You need to multiply the predicted depth by this scale factor to obtain the true depth. You can use the median of the ground truth divided by the median of your prediction as the scale.
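A minimal sketch of that median scaling:

```python
import numpy as np

def scale_to_gt(pred_depth, gt_depth):
    # The network's depth is defined only up to scale; matching medians
    # recovers metric depth (the standard convention in this line of work).
    return pred_depth * (np.median(gt_depth) / np.median(pred_depth))
```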
The predicted depth is up to an unknown scale factor. When evaluating, this factor is found by matching the medians of the predicted and ground-truth depth (this is standard in this line of publications). Are you observing strong discrepancies even beyond that global factor?
@liyingliu @gariel-google
@gariel-google
@StephenStorm by "observing strong discrepancies even beyond that global factor" I mean: if you multiply the predicted depth by a factor such that its median matches the median ground-truth depth, do you still see significant discrepancies? @player1321 In the KITTI format, the first 3 columns are the (x, y, z) position of the car, if I'm not mistaken. This code generates the inferred (x, y, z)-s of the trajectory.
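A hedged sketch of extracting the positions: note that in the standard KITTI odometry ground-truth files, each line is a flattened 3x4 [R | t] matrix, so the translation sits at elements 3, 7 and 11, while simpler trajectory dumps may store x, y, z directly as the first three columns:

```python
import numpy as np

def load_kitti_gt_positions(pose_file):
    # Each line: 12 floats, a row-major 3x4 [R | t]; t is the fourth column.
    poses = np.loadtxt(pose_file).reshape(-1, 3, 4)
    return poses[:, :, 3]  # (N, 3) array of x, y, z
```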
@gariel-google Thanks a lot for your patient guidance.
@player1321 I looked up the checkpoint we used for KITTI odometry (with given intrinsics), and its depth prediction metric is 0.1321, which is indeed worse than the KITTI-only depth error that we report in the paper for given intrinsics (0.129). Is that your concern? We did observe that odometry results tend to improve the longer we train, whereas depth results tend to become slightly worse and noisier beyond some point. We did not try to evaluate the cityscapes + KITTI checkpoints for odometry, and I don't know how they would perform. Would you like to share your numbers on both evaluations?
@gariel-google Thanks for your reply. And here are the numbers:
It seems that the definition of the ATE is not the same as yours. Would you share some evaluation tools? Or could you recommend any reference for the definitions?
@gariel-google Thanks for sharing the code and helping us reproduce the results. I'm able to reproduce figures similar to the paper's using the odometry checkpoints, but the scale seems to be wrong. Is the egomotion network supposed to output positions at real-world scale immediately, or is it assumed that we perform a scaling as postprocessing? If the latter, which type of scaling is used in the paper? EDIT: From the looks of it, I think the scale-7dof scaling technique is used (see https://github.com/Huangying-Zhan/kitti-odom-eval).
Also, I realized that the given-intrinsics weights link for KITTI odometry is wrong: it references the cityscapes model from the depth table right above it: https://www.googleapis.com/download/storage/v1/b/gresearch/o/depth_from_video_in_the_wild%2Fcheckpoints%2Fcityscapes_learned_intrinsics.zip?generation=1566493765410932&alt=media
@player1321 The definition of ATE we used follows Zhou et al. Our ATE eval is based on theirs, which is given here: https://github.com/tinghuiz/SfMLearner/tree/master/kitti_eval. The numbers are typically in the 10^-2 range. Yours are in meters and are large-ish, so indeed it's probably not the same definition. Regarding the translation error, we didn't check it for the cityscapes + KITTI checkpoint, and while your numbers are different, it seems that they are not far from ours, assuming you tested checkpoints with learned intrinsics (right?). @frobinet The odometry predictions are scale-less, just like the depth predictions. We normalized the entire trajectory by its length. That is, we scaled the predicted trajectory uniformly until its total length was identical to the GT length.
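A minimal sketch of that length normalization (distinct from the per-snippet least-squares scale used inside the ATE eval):

```python
import numpy as np

def normalize_trajectory_length(pred_xyz, gt_xyz):
    # Uniformly scale the predicted trajectory so its total path length
    # matches the ground truth's.
    pred_length = np.linalg.norm(np.diff(pred_xyz, axis=0), axis=1).sum()
    gt_length = np.linalg.norm(np.diff(gt_xyz, axis=0), axis=1).sum()
    return pred_xyz * (gt_length / pred_length)
```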
@frobinet I'll have a look at the model links and get back to you, thanks for pointing this out.
@gariel-google Thanks for helping with this! Any news about releasing the weights for the given-intrinsics odometry model?
@gariel-google Thanks for your guidance! It's very helpful!
@gariel-google Thanks for sharing the code and helping us. Can you provide trajectories with poses and/or code to reproduce the same values?
Sorry for the delayed response. @NHirose we released the trajectories here - please see the table below the title, under the "trajectory" links. @player1321 We didn't quantitatively evaluate the prediction of residual motion. Qualitatively it looks good in most cases - I know it sounds hand-wavy, but unfortunately there is no number I can quote to support this quantitatively.
@gariel-google Thank you for your reply. However, your released trajectory file only includes the XYZ positions; I additionally need the roll, pitch and yaw angles to reproduce the values in your paper. Alternatively, could you provide the evaluation file used to produce the egomotion values in your paper? That would help me find the differences!
Hi, I'm getting an error when loading the EuRoC MAV checkpoint [depth_from_video_in_the_wild_euroc_ckpt_MachineHallAll] for training:

Key MotionFieldNet/compute_loss/MotionFieldNet_2/Conv1/Relu/MotionBottleneck/weights not found in checkpoint
[[node save/RestoreV2 (defined at /depth_from_video_in_the_wild/model.py:117) ]]

When using the same code with a checkpoint saved after training from scratch, there are no errors. @gariel-google
Did you run the code as is or with modifications? I am asking because a few
months ago, when I uploaded the checkpoint, I did verify that it loads, so
I am trying to track down the reason for the change in behavior.
No modifications. The cityscapes and KITTI snapshots load normally.
Thank you
@gariel-google @adizhol I am facing the same issue: the cityscapes and KITTI checkpoints work well with model.py, but the EuRoC checkpoint does not.
Are there any updates on this? Thanks
Thanks for pointing this out. This seems to be a bug on our side, then. I will look into it, but it may take some time till I can get to it and debug. Sorry about that :-)
After training on custom data, I'm getting different depth when training and when doing inference (on the same images).
Update: the error/warning is gone, but the problem still exists.
Update: also, during inference you're inferring on a flipped image and taking the minimum with the non-flipped image.
Hey there, I'm still facing this:

NotFoundError (see above for traceback): Restoring from checkpoint failed. This is most likely due to a Variable name or other graph key that is missing from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error: Key MotionFieldNet/CameraIntrinsics/foci/biases not found in checkpoint

Currently, I'm using the latest code version.
@VladimirYugay @adizhol @mathmax12 Run tf.reset_default_graph() before restoring the checkpoint to cope with the error above.
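In context (a sketch; the checkpoint path is illustrative):

```python
import tensorflow.compat.v1 as tf

tf.reset_default_graph()  # drop any graph built earlier in the process
saver = tf.train.import_meta_graph('euroc_ckpt/model.meta')  # illustrative path
with tf.Session() as sess:
    saver.restore(sess, 'euroc_ckpt/model')
```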
I have downloaded the checkpoints provided by the author, extracted them and put them in the folder, and, as you said, also added a "checkpoint" file. In run.sh I wrote as follows: Is my path wrong? Can you help me? Thank you.