regression #3
Comments
This is a pure code implementation; there are no experiments, training code, or tests.
What do you mean by "number of regressors"? I initially had classification-based transformer code and then converted it to a regressor. I am not sure if the following is correct.
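A conversion along those lines, sketched against the lucidrains `timesformer-pytorch` API (the hyperparameters and shapes below are illustrative, not the poster's actual code): the "number of regressors" is just the width of the final linear head, so `num_classes` becomes the number of regression outputs.

```python
import torch
from timesformer_pytorch import TimeSformer  # lucidrains implementation

NUM_FRAMES = 32  # illustrative: one regression target per frame

model = TimeSformer(
    dim = 512,
    image_size = 224,
    patch_size = 16,
    num_frames = NUM_FRAMES,
    num_classes = NUM_FRAMES,  # regression outputs instead of class logits
    depth = 12,
    heads = 8,
    dim_head = 64,
)

video = torch.randn(2, NUM_FRAMES, 3, 224, 224)  # (batch, frames, channels, height, width)
pred = model(video)                              # (2, NUM_FRAMES): raw values, no softmax
loss = torch.nn.functional.mse_loss(pred, torch.randn(2, NUM_FRAMES))
```

The only real change from the classification setup is the meaning of the head's output: the values are trained against continuous targets with a regression loss rather than cross-entropy.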
Hi, did you figure out how to use TimeSformer for regression tasks? I am trying to do the same but have had no luck.
Yeah, that's it!
Thank you for the quick response. Let's say I am hoping to use the pretrained TimeSformer model for regression instead of classification, for example using a negative Pearson loss, with each frame of the video having a unique numeric label/ground truth. So essentially the training data would be a 60-second video broken into frames, with corresponding values/labels for each frame. In this case we will only have a one-dimensional regression, am I right?
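For concreteness, the negative Pearson loss mentioned above can be written as a short PyTorch function; this is a generic sketch, not code from the thread:

```python
import torch

def neg_pearson_loss(pred, target, eps=1e-8):
    # Negative Pearson correlation, computed per sample over the frame axis.
    # pred, target: (batch, num_frames)
    pred = pred - pred.mean(dim=1, keepdim=True)
    target = target - target.mean(dim=1, keepdim=True)
    corr = (pred * target).sum(dim=1) / (pred.norm(dim=1) * target.norm(dim=1) + eps)
    return (1.0 - corr).mean()  # 0 when perfectly correlated, 2 when anti-correlated
```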
I think that's right. So you have to construct a dataloader that generates this. When I used these models I trained from scratch, so I was not carefully checking what input the model expects; I used the model as an architecture. For training, construct a dataloader that, for each batch of videos, gives you a batch of values. How you label these snippets of video is up to you (you will have to subsample or reduce the input size, as the model cannot ingest inputs that are too long). I hope that clarifies the strategy to follow. Another quick tip: you can create a super simple dataloader by stacking the full video together and then just slicing randomly into it; here you have an example (a sketch follows).
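A minimal sketch of that stack-and-slice idea (the names, shapes, and per-frame pulse labels are assumptions for illustration):

```python
import torch
from torch.utils.data import Dataset, DataLoader

class SlicedVideoDataset(Dataset):
    """Stack the full video and its per-frame labels, then serve random
    fixed-length windows from it."""
    def __init__(self, frames, labels, clip_len=32, samples_per_epoch=1000):
        # frames: (total_frames, 3, H, W) tensor; labels: (total_frames,) tensor
        self.frames, self.labels = frames, labels
        self.clip_len = clip_len
        self.samples_per_epoch = samples_per_epoch

    def __len__(self):
        return self.samples_per_epoch

    def __getitem__(self, _):
        # Pick a random starting frame and cut out a contiguous window.
        start = torch.randint(0, len(self.frames) - self.clip_len + 1, (1,)).item()
        clip = self.frames[start:start + self.clip_len]    # (clip_len, 3, H, W)
        target = self.labels[start:start + self.clip_len]  # (clip_len,)
        return clip, target

# e.g. loader = DataLoader(SlicedVideoDataset(frames, labels), batch_size=4, shuffle=True)
```

Because the windows are sampled on the fly, there is no need to materialize every possible clip ahead of time.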
Thank you so much for the quick and detailed response. I am sorry for asking so many questions; I am new to the whole video transformer domain. I just have a follow-up question: my dataloader looks something like this, containing video frames and, corresponding to them, the pulse signal:

```python
train_loader = torch.utils.data.DataLoader(
```
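The call above is cut off in the thread; a typical completion, assuming a hypothetical `train_dataset` that yields `(clip, pulse)` pairs, might look like the following, together with a bare-bones training step (model and hyperparameters are illustrative):

```python
import torch

# Assumed: train_dataset yields (clip, pulse) pairs,
# clip: (32, 3, H, W) frames, pulse: (32,) per-frame target values.
train_loader = torch.utils.data.DataLoader(
    train_dataset,
    batch_size=4,
    shuffle=True,
    num_workers=2,
    pin_memory=True,
)

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
model.train()
for clips, targets in train_loader:   # clips: (B, 32, 3, H, W), targets: (B, 32)
    preds = model(clips)              # (B, 32): one prediction per frame
    loss = torch.nn.functional.mse_loss(preds, targets)  # or a negative Pearson loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```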
@tcapelle hi, thanks for all the help regarding the dataloader. I am sorry to bother you yet again: I was having some trouble understanding where this issue arises from and why, as the only thing I changed is the dataloaders.
Sorry, I can't help you with this. Maybe ask on the PyTorch forums? |
Sorry, don't know.
Thank you for all the help. Just a tiny follow-up for the TimeSformer: did you use the code provided by Facebook, or did you manage to find some other script?
I used @lucidrains' implementation.
But @lucidrains' implementation doesn't have trainer code, does it?
Hi @tcapelle, using TimeSformer (orange line) for regression compared to a 3D CNN (pink line), my results are quite weird. I am adding a screenshot of the loss (MSE) vs. epoch graph for training and validation. Note: each video is broken into chunks of 32 consecutive frames, each with their corresponding ground-truth values. The model predicts 1 value per frame fed, so for 32 frames it outputs 32 values.
Beautiful work as usual, thanks for this implementation.
I'm curious if you tried using this for a regression task? I have tried using TimeSformer without success yet. I know the signal is there because I can learn it with a small 3D CNN trained from scratch, so I suspect my understanding of how and where to modify the transformer is the culprit. The output is a 1D vector with len == num_frames. Any suggestions very appreciated!
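For context, the kind of small 3D CNN baseline described here (one output value per frame, so the output is a 1D vector with len == num_frames) can be sketched as follows; the architecture is illustrative, not the poster's actual model:

```python
import torch
import torch.nn as nn

class Small3DCNN(nn.Module):
    def __init__(self, num_frames=32):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d((1, 2, 2)),                     # pool space, keep time
            nn.Conv3d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d((num_frames, 1, 1)),    # collapse space, keep the time axis
        )
        self.head = nn.Conv3d(32, 1, kernel_size=1)      # one value per frame

    def forward(self, video):
        # video: (B, 3, T, H, W) -- note Conv3d wants channels first,
        # unlike TimeSformer's (B, T, 3, H, W) layout.
        x = self.features(video)          # (B, 32, T, 1, 1)
        return self.head(x).flatten(1)    # (B, T): 1D output, len == num_frames
```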