About temporal_loss: it only uses 2 frames. Would it be better to compute it across many frames? #4

Open
zhanghongyong123456 opened this issue Oct 11, 2022 · 9 comments

Comments

@zhanghongyong123456

I see that the basic loss calculation only takes two frames. Most real videos are 30 fps, so how well does a two-frame calculation work? Is it necessary to use multiple consecutive frames in the calculation?
image

@daipengwa
Member

I agree with you; maybe you can apply some long-term constraints over more frames. In our experiments, using two frames already improved temporal consistency.
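For readers following along, here is a minimal sketch of what a two-frame relation loss can look like. It assumes the loss is the L1 distance between the predicted frame-to-frame change and the ground-truth frame-to-frame change; the function name `relation_loss` and the NumPy arrays are illustrative only and are not taken from the repository.

```python
import numpy as np

def relation_loss(pred_a, pred_b, gt_a, gt_b):
    """L1 distance between the predicted and ground-truth frame-to-frame change.

    Assumption: this is one reading of a basic relation-based loss,
    not the repository's actual implementation.
    """
    pred_change = pred_b - pred_a
    gt_change = gt_b - gt_a
    return float(np.mean(np.abs(pred_change - gt_change)))

rng = np.random.default_rng(0)
gt_t, gt_t1 = rng.random((2, 3, 8, 8))  # two ground-truth frames, shape (3, 8, 8) each

# The same brightness offset on both frames leaves the change untouched.
consistent = relation_loss(gt_t + 0.1, gt_t1 + 0.1, gt_t, gt_t1)

# An offset on only one frame is flicker, and it is penalised.
flicker = relation_loss(gt_t, gt_t1 + 0.1, gt_t, gt_t1)

print(consistent, flicker)
```

Under this assumption the loss only looks at how the output changes between frames, so it is meant to complement a per-frame reconstruction loss rather than replace it.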

@zhanghongyong123456
Author

I only have a rough idea, and I don't fully understand this consistency loss (especially the Multi-Scale Region-Level Relation Loss). Could you give a general outline of how to implement multi-frame temporal consistency? For example, if a sample has 10 frames, what should I do? Thank you very much.

@zhanghongyong123456
Author

  1. While debugging the code, I noticed that it differs slightly from the paper:
    image
    image

@daipengwa
Member

First, the motivation for the multi-scale design is that the distance between the eye and the screen is not fixed (e.g., the screen covers a small area of your field of view when you stand far away, and vice versa).

Second, I think you can choose frames with random steps (t, t+?), or propagate it: (t, t+2) = (t, t+1) + (t+1, t+2)? I am not sure.

Third, thanks for pointing this out; the lambda should be applied to the first part. You can freely change these hyperparameters.
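The random-step suggestion above could be sketched roughly as follows. This is only an illustration under assumptions: `relation_loss` is a hypothetical two-frame loss (L1 between predicted and ground-truth change), and `random_step_loss`, `max_step`, and `num_pairs` are made-up names, not options from the repository.

```python
import numpy as np

def relation_loss(pred_a, pred_b, gt_a, gt_b):
    """Hypothetical two-frame loss: L1 between predicted and true change."""
    return float(np.mean(np.abs((pred_b - pred_a) - (gt_b - gt_a))))

def random_step_loss(pred, gt, max_step, num_pairs, rng):
    """Average the two-frame loss over randomly sampled pairs (t, t + step)."""
    num_frames = pred.shape[0]
    if not 1 <= max_step < num_frames:
        raise ValueError("max_step must satisfy 1 <= max_step < number of frames")
    total = 0.0
    for _ in range(num_pairs):
        step = int(rng.integers(1, max_step + 1))    # 1 <= step <= max_step
        t = int(rng.integers(0, num_frames - step))  # keeps t + step in range
        total += relation_loss(pred[t], pred[t + step], gt[t], gt[t + step])
    return total / num_pairs

rng = np.random.default_rng(0)
gt = rng.random((10, 3, 8, 8))                       # a 10-frame clip
noisy = gt + 0.05 * rng.standard_normal(gt.shape)    # independent noise per frame

loss_clean = random_step_loss(gt + 0.2, gt, max_step=4, num_pairs=16, rng=rng)
loss_noisy = random_step_loss(noisy, gt, max_step=4, num_pairs=16, rng=rng)
print(loss_clean, loss_noisy)
```

Sampling the step at random mixes short-term and longer-term pairs in one batch, which is one way to read "(t, t+?)". Whether it helps in practice would need to be checked experimentally.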

@zhanghongyong123456
Author

OK. For the second point, I will try both (t, t+n) and (t, t+2) = (t, t+1) + (t+1, t+2). Thanks for the idea.

@zhanghongyong123456
Author

Hi, I would appreciate your guidance. Thank you very much.

On the second point: I use temporal_loss for video matting, but the test results are not good. This is my configuration (temporal_loss_mode = 1, weight_t=50):

  1. Is my design correct? I first compute the differences between consecutive images in the sequence, add them up, and finally compute a single L1 loss over the sum.
    <0> mode == 0
    image
    <1> mode == 1
    image
  2. For output = alpha, is mode 0 (basic relation-based loss) better than mode 1 (multi-scale relation-based loss)?
    Since the alpha output is just a black-and-white image, multi-scale images seem unnecessary.
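To make the question about the two modes more concrete, here is a small sketch of how a pixel-level and a multi-scale variant can behave differently on a single-channel alpha map. It assumes "multi-scale" means average-pooling the change error over square regions of several sizes before taking the L1 norm; that is one plausible reading, not necessarily the paper's exact formulation, and `avg_pool`, `multiscale_relation_loss`, and `scales` are hypothetical names.

```python
import numpy as np

def avg_pool(x, k):
    """Non-overlapping k x k average pooling over the last two axes."""
    h = x.shape[-2] // k * k
    w = x.shape[-1] // k * k
    x = x[..., :h, :w]  # crop so both sides divide evenly by k
    return x.reshape(*x.shape[:-2], h // k, k, w // k, k).mean(axis=(-3, -1))

def multiscale_relation_loss(pred_a, pred_b, gt_a, gt_b, scales=(1, 2, 4)):
    """Mean over scales of the L1 norm of the region-averaged change error."""
    error = (pred_b - pred_a) - (gt_b - gt_a)
    return float(np.mean([np.mean(np.abs(avg_pool(error, k))) for k in scales]))

# Static ground-truth alpha, shape (1, 8, 8); the prediction flickers by
# +/-0.1 in a checkerboard pattern in the second frame.
gt_a = gt_b = np.zeros((1, 8, 8))
checker = np.indices((8, 8)).sum(axis=0) % 2         # 0/1 checkerboard
pred_a = np.zeros((1, 8, 8))
pred_b = (0.1 * (2 * checker - 1))[None]

basic = multiscale_relation_loss(pred_a, pred_b, gt_a, gt_b, scales=(1,))
multi = multiscale_relation_loss(pred_a, pred_b, gt_a, gt_b, scales=(1, 2, 4))
print(basic, multi)
```

In this toy case the pixel-level term sees the full flicker, while the pooled terms average it away, so the multi-scale loss is smaller. Under this reading the multi-scale variant mainly changes how much weight fine-grained flicker gets, which is independent of whether the output is an RGB frame or an alpha map.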

@daipengwa
Member

  1. In your design, if img_count=40, you end up with gt_error=img(39)-img(0). You skip all the frames in between (0~39) and keep only the long-term change between frames 0 and 39.
  2. I suggest you begin with the basic one, then add the other designs to see whether the quality improves.
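The cancellation described in point 1 can be checked numerically. The sketch below uses synthetic data and hypothetical names (`summed_first`, `per_pair`); it compares summing the consecutive differences before the L1 loss with computing an L1 loss per consecutive pair and averaging.

```python
import numpy as np

def l1(x):
    """Mean absolute value, i.e. an L1 loss against zero."""
    return float(np.mean(np.abs(x)))

rng = np.random.default_rng(0)
img_count = 40
gt = rng.random((img_count, 1, 8, 8))
pred = gt + 0.05 * rng.standard_normal(gt.shape)  # flickering prediction
pred[0], pred[-1] = gt[0], gt[-1]                 # but exact first and last frame

gt_changes = np.diff(gt, axis=0)      # gt[i + 1] - gt[i] for every i
pred_changes = np.diff(pred, axis=0)

# Summing the differences telescopes to img(39) - img(0).
assert np.allclose(gt_changes.sum(axis=0), gt[-1] - gt[0])

# Loss on the summed differences: the flicker in frames 1..38 is invisible.
summed_first = l1(pred_changes.sum(axis=0) - gt_changes.sum(axis=0))

# Loss per consecutive pair, then averaged: the flicker is penalised.
per_pair = float(np.mean([l1(p - g) for p, g in zip(pred_changes, gt_changes)]))

print(summed_first, per_pair)
```

With the differences summed first, the loss is essentially zero even though every intermediate frame is noisy. Taking the loss per pair and then combining the losses keeps the intermediate frames in play.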

@zhanghongyong123456
Author

OK, sorry, I made a mistake. I see now. If there are multiple frames, it should be like this, right?
image

@onlyinheaven

Your discussion is very interesting. I am currently also experimenting with similar things. I would like to know if you have figured out how to implement temporal loss between multiple images in the end.
