Add an introduction section to the MD file #321
Conversation
Sync to origin
@fchollet - Thank you so much for merging my previous PR. I am not sure why the introduction section was not added to the MD file. I created this PR to add the introduction.
Co-authored-by: 8bitmp3 <19637339+8bitmp3@users.noreply.github.com>
8bitmp3
left a comment
Hey @ksalama I have a few proposals to improve this intro, if you don't mind.
I think here you're describing the results of an experiment (by saying it "outperforms"). Maybe it'd be more useful for the readers to first learn about the gist of this supervised contrastive learning (SCL) and how it works in 1-2 sentences.
Then, you could finish off this small introductory paragraph with the "outperforms" statement, while also being explicit about how SCL outperforms traditional vanilla cross-entropy supervised learning (judging by Tables 2 and 3 on page 7 of https://arxiv.org/pdf/2004.11362.pdf, SCL "outperforms" in terms of accuracy by a margin).
I think the cool thing about SCL is that it extends the previous (?) self-supervised approach to supervised learning - you should definitely highlight it here ("the self-supervised batch contrastive approach to the fully-supervised" - page 1, https://arxiv.org/pdf/2004.11362.pdf). SCL "contrasts the set of all samples from the same class as positives against the negatives from the remainder of the batch" (page 2, figure 3).
On page 4 under "Method", the paper actually summarizes what SCL does:
"Given an input batch of data, we first apply data augmentation twice to obtain two copies of the batch. Both copies are forward propagated through the encoder network to obtain a 2048-dimensional normalized embedding. During training, this representation is further propagated through a projection network that is discarded at inference time. The supervised contrastive loss is computed on the outputs of the projection network. To use the trained model for classification, we train a linear classifier on top of the frozen representations using a cross-entropy loss."
I recommend you summarize it in a human-friendly way to appeal to non-academics.
I can assist you with that, if you need help.
Anyway, these are just suggestions.
[Supervised Contrastive Learning](https://arxiv.org/abs/2004.11362)
(Prannay Khosla et al.) is a training methodology that outperforms
plain crossentropy-supervised training on classification tasks.
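Not part of the review, but in case it helps: here is a minimal, hypothetical Keras sketch of the two-stage setup that "Method" paragraph describes. The helper names (`make_encoder`, `add_projection_head`, `make_classifier`) and layer sizes are my own placeholders, not the paper's or the keras.io example's actual code.

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers


def make_encoder(input_shape=(32, 32, 3)):
    # Stage 1 backbone: maps an augmented image to an L2-normalized embedding.
    inputs = keras.Input(shape=input_shape)
    x = layers.Conv2D(64, 3, activation="relu")(inputs)
    x = layers.GlobalAveragePooling2D()(x)
    outputs = layers.Lambda(lambda t: tf.math.l2_normalize(t, axis=1))(x)
    return keras.Model(inputs, outputs, name="encoder")


def add_projection_head(encoder, projection_dim=128):
    # The projection network is only used while optimizing the contrastive
    # loss and is discarded at inference time.
    x = layers.Dense(projection_dim, activation="relu")(encoder.output)
    outputs = layers.Lambda(lambda t: tf.math.l2_normalize(t, axis=1))(x)
    return keras.Model(encoder.input, outputs, name="encoder_with_projection_head")


def make_classifier(encoder, num_classes=10):
    # Stage 2: a linear classifier trained with crossentropy on top of the
    # frozen representations.
    encoder.trainable = False
    outputs = layers.Dense(num_classes, activation="softmax")(encoder.output)
    return keras.Model(encoder.input, outputs, name="classifier")
```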
@8bitmp3 Thanks a lot for the suggestion. Please feel free to provide an intro text that you think could be simple and useful, and I will be happy to commit your suggestion.
@ksalama Would it be fair to say that the supervised contrastive learning paper introduces a way of training that includes the supervised contrastive loss? Also, we could say that the method offers a two-stage framework that enhances image classification performance (borrowed from: https://github.com/sayakpaul/Supervised-Contrastive-Learning-in-TensorFlow-2). I also like how they worded it here: "Learn how to map the normalized encoding of samples belonging to the same category closer and the samples belonging to the other classes farther." (https://wandb.ai/authors/scl/reports/Improving-Image-Classifiers-With-Supervised-Contrastive-Learning--VmlldzoxMzQwNzE). We could rephrase it with attribution. cc @fchollet
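To make that "closer / farther" idea concrete, here is a rough, unofficial sketch of a supervised contrastive loss (loosely the batch-wise formulation from the paper, written by hand; the keras.io example may compute it differently). It pulls the normalized embeddings of same-class samples together and pushes the rest of the batch apart:

```python
import tensorflow as tf


def supervised_contrastive_loss(labels, feature_vectors, temperature=0.1):
    # feature_vectors: L2-normalized projection-head outputs, shape (batch, dim).
    # labels: integer class labels, shape (batch,) or (batch, 1).
    labels = tf.reshape(labels, (-1, 1))
    batch_size = tf.shape(feature_vectors)[0]

    # Pairwise similarity logits, scaled by the temperature.
    logits = tf.matmul(feature_vectors, feature_vectors, transpose_b=True) / temperature

    # Mask out self-similarity on the diagonal.
    diag_mask = tf.eye(batch_size)
    logits = logits - 1e9 * diag_mask

    # Positives are the other samples in the batch with the same label.
    positives_mask = tf.cast(tf.equal(labels, tf.transpose(labels)), tf.float32) - diag_mask

    # Log-softmax over all other samples in the batch.
    log_prob = logits - tf.reduce_logsumexp(logits, axis=1, keepdims=True)

    # Average the log-probabilities of the positives for each anchor.
    mean_log_prob_pos = tf.reduce_sum(positives_mask * log_prob, axis=1) / tf.maximum(
        tf.reduce_sum(positives_mask, axis=1), 1.0
    )
    return -tf.reduce_mean(mean_log_prob_pos)
```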
Bear with me, there's a talk by one of the sponsors at the NeurIPS conference today - and probably next week by the paper's authors - that cover(s) contrastive and supervised contrastive learning. I'll take some notes and revise the intro to make it more useful for the readers. cc @fchollet
@8bitmp3 Sounds good. I would merge this basic introduction to the .md file so that the example page on the website will have "an" introduction (as it currently doesn't have one!). Then we can update the introduction as you suggest.
Co-authored-by: 8bitmp3 <19637339+8bitmp3@users.noreply.github.com>
  [Supervised Contrastive Learning](https://arxiv.org/abs/2004.11362)
  (Prannay Khosla et al.) is a training methodology that outperforms
- plain crossentropy-supervised training on classification tasks.
+ supervised training on classification tasks with cross-entropy.
In the Keras API, "crossentropy" is a single word
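For example, in tf.keras the loss class and its string alias both spell it as one word:

```python
from tensorflow import keras

# Class name and string alias both use "crossentropy" as a single word.
loss = keras.losses.SparseCategoricalCrossentropy(from_logits=True)
loss_alias = "sparse_categorical_crossentropy"
```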
Got it @fchollet
fchollet
left a comment
Thanks for the PR. Any changes should be first applied to the .py file, then replicated in the md and ipynb files.
Also note that I have fixed the issue with the intro not showing up. The reason why it happened is that it was part of the same block of text as the header. I've added a test that makes sure we'll catch this sort of issue in the future.
Co-authored-by: 8bitmp3 <19637339+8bitmp3@users.noreply.github.com>
@fchollet - I have updated the introduction in the
Probable error in example "TemporalSoftmax" (keras-team#320)
fchollet
left a comment
LGTM otherwise
@fchollet - I have updated the intro in the three files.
fchollet
left a comment
LGTM, thank you
Add an introduction section to the MD file (keras-team#321)