So confused about the return value of the get_loss_img2text_image function in the Trainer file. #1

Closed
CCaoWWei opened this issue Jul 3, 2024 · 2 comments

CCaoWWei commented Jul 3, 2024

Thank you for open-sourcing the code. This article has been very insightful and inspiring to me. However, I have some questions while reviewing the code.

Q1: This function currently appears to contain only the Lc loss from the paper; it does not include the Lr loss component.

Q2: In the get_loss_img2text function, the loss and extra_loss computed in the if branch do not correspond to those in the else branch.

Q1: [image attachment]

Q2: [image attachment]

Owner

suoych commented Jul 3, 2024

Hi, thank you for your interest!

First, I apologize for the raw state of the code in the repo. I promise I will reformat it and add more docstrings.

For the first question: in our real implementation, we train the two branches separately (each with 4 GPUs). The get_loss_img2text_image function is the image-only contrastive branch, and get_loss_img2text is the textual alignment branch.

As for the second question: we only train the model using 4 cards, so you can always refer to the if branch. The else branch is only used for debugging and can be ignored.
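
For readers following along, here is a minimal sketch of why the two branches can look different. It is not the repository's code. It assumes, as is common in CLIP-style trainers, that the if branch is the multi-GPU path that gathers features from every card before computing a contrastive loss, while the else branch only sees the local batch. The names `info_nce`, `symmetric_loss`, `branch_loss`, the `temperature` value, and the list-of-shards stand-in for a real all-gather are all hypothetical.

```python
import math


def info_nce(queries, keys, temperature=0.07):
    """Mean cross-entropy of each query against all keys.

    The positive key for queries[i] is keys[i]; all other keys are negatives.
    The temperature value is an illustrative assumption.
    """
    total = 0.0
    for i, q in enumerate(queries):
        logits = [sum(a * b for a, b in zip(q, k)) / temperature for k in keys]
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_z - logits[i]
    return total / len(queries)


def symmetric_loss(feats_a, feats_b):
    """Average of the a->b and b->a contrastive losses."""
    return (info_nce(feats_a, feats_b) + info_nce(feats_b, feats_a)) / 2


def branch_loss(shards_a, shards_b, distributed):
    """shards_* hold one list of feature vectors per GPU (hypothetical stand-in
    for what an all-gather would return)."""
    if distributed:
        # Assumed "if" branch: contrast over the features of every card.
        all_a = [v for shard in shards_a for v in shard]
        all_b = [v for shard in shards_b for v in shard]
        return symmetric_loss(all_a, all_b)
    # Assumed "else" branch: single-process debug path, local features only.
    return symmetric_loss(shards_a[0], shards_b[0])
```

Under these assumptions the two paths agree when there is a single shard, and the multi-GPU path simply sees more negatives when there are several, which is why only the if branch matters for the 4-card training setup described above.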

Author

CCaoWWei commented Jul 4, 2024

I understand now. Thank you for your response. ^_^

suoych closed this as completed Jul 4, 2024