Issue Description
I've noticed a potential issue in the implementation of the _TSSequencerEncoderLayer class, where the LSTM layer appears to be applied along the channel axis (feature size) instead of the temporal axis (sequence length). This is evident from the initialization of the LSTM layer:
LSTM Layer Initialization:
Currently, the LSTM layer is initialized as follows:
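The original snippet was lost in this copy of the issue; the following is a sketch of the behavior being described, not the library's exact source (variable names and sizes are illustrative). The point is that the LSTM's input_size equals the sequence length, so the input has to be transposed and the recurrence runs over the feature channels:

```python
import torch
import torch.nn as nn

# Illustrative sketch of the channel-axis behavior (shapes/names assumed).
bs, seq_len, d_model = 2, 128, 64
x = torch.randn(bs, seq_len, d_model)  # (batch, time, features)

# input_size == seq_len, so the data must be transposed to
# (bs, d_model, seq_len) and the recurrence steps over the
# d_model channels instead of the seq_len time steps.
lstm = nn.LSTM(seq_len, seq_len, num_layers=1,
               bidirectional=True, batch_first=True)
out, _ = lstm(x.transpose(1, 2))  # out: (bs, d_model, 2 * seq_len)
print(out.shape)                  # torch.Size([2, 64, 256])
```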
This should be revised to:
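The proposed code block is also missing here; a sketch of the temporal-axis version I am suggesting (hyperparameters assumed for illustration) would set input_size to the feature size, so the recurrence runs over the time steps directly:

```python
import torch
import torch.nn as nn

# Proposed temporal-axis sketch (shapes/names assumed).
bs, seq_len, d_model = 2, 128, 64
x = torch.randn(bs, seq_len, d_model)  # (batch, time, features)

# input_size == d_model: no transpose needed, the recurrence
# steps over the seq_len time steps as intended.
lstm = nn.LSTM(d_model, d_model, num_layers=1,
               bidirectional=True, batch_first=True)
out, _ = lstm(x)  # out: (bs, seq_len, 2 * d_model)
print(out.shape)  # torch.Size([2, 128, 128])
```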
Fully Connected Layer Adjustment:
The self.fc layer needs to be updated to accommodate the change in LSTM layer dimensions.

Modifications in Forward Pass:
The forward method needs modifications to correctly process the data through the LSTM layer:
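Since the original code blocks did not survive in this copy, here is a minimal self-contained sketch of how the fc adjustment and the revised forward pass could fit together. The class name, layer composition, and sizes are my assumptions for illustration, not the library's exact code:

```python
import torch
import torch.nn as nn

class TemporalSequencerLayer(nn.Module):
    """Sketch of the proposed change: LSTM over the temporal axis,
    with fc projecting the bidirectional output back to d_model.
    Names and structure are illustrative, not the tsai source."""
    def __init__(self, seq_len, d_model):
        super().__init__()
        self.lstm = nn.LSTM(d_model, d_model, num_layers=1,
                            bidirectional=True, batch_first=True)
        # fc maps the concatenated forward/backward states
        # (2 * d_model) back to the model dimension.
        self.fc = nn.Linear(2 * d_model, d_model)

    def forward(self, x):          # x: (bs, seq_len, d_model)
        out, _ = self.lstm(x)      # (bs, seq_len, 2 * d_model)
        return self.fc(out)        # (bs, seq_len, d_model)

layer = TemporalSequencerLayer(seq_len=128, d_model=64)
y = layer(torch.randn(2, 128, 64))
print(y.shape)  # torch.Size([2, 128, 64])
```

With this arrangement the layer preserves its input shape, so it can still be stacked like the original encoder layer.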
Additional Context:
These issues were identified during a detailed code review while integrating the model into my project. Specifically, I applied the model to two different tasks on the RAVDESS AV emotion dataset:
Emotion prediction using a facial embedding sequence extracted from a video.
Emotion prediction using audio feature sequences.
To address these concerns, I tested the model's performance with the proposed changes, and the results were surprising: performance remained similar whether the LSTM was applied across the channel axis (the current implementation) or across the time steps (the proposed modification). This raises questions about the expected impact of the change and suggests the model's behavior deserves further investigation in different application contexts.