Masking certain features in attention based models #101

Closed
Aceticia opened this issue Nov 12, 2022 · 3 comments
Labels
wontfix This will not be worked on

Comments

@Aceticia

Is your feature request related to a problem? Please describe.
In my dataset, individual cells (certain columns of certain rows, as opposed to entire columns) hold invalid values. Currently I'm not sure how to handle this.

Describe the solution you'd like
An optional mask DataFrame could be passed in, in a format similar to how PyTorch's own transformer modules mask tokens such as padding.
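
To make the analogy concrete, here is a minimal sketch of how plain PyTorch masks keys in attention, using `torch.nn.MultiheadAttention` and its `key_padding_mask` argument. This is not PyTorch Tabular code: it assumes each column has already been embedded as one token, and the names `tokens` and `invalid`, along with the shapes, are made up for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

batch, n_features, d_model = 4, 5, 8

# Hypothetical per-feature embeddings: one token per column of each row.
tokens = torch.randn(batch, n_features, d_model)

# Hypothetical cell-level mask: True marks an invalid cell that should be
# ignored as an attention key (same convention as padding masks).
invalid = torch.zeros(batch, n_features, dtype=torch.bool)
invalid[0, 2] = True
invalid[1, 0] = True
invalid[1, 4] = True

attn = nn.MultiheadAttention(embed_dim=d_model, num_heads=2, batch_first=True)
attn.eval()

with torch.no_grad():
    # key_padding_mask has shape (batch, n_features); masked keys receive
    # zero attention weight from every query in that row.
    out, weights = attn(tokens, tokens, tokens, key_padding_mask=invalid)

# weights has shape (batch, n_features, n_features), averaged over heads.
print(out.shape)
print(weights[0, :, 2])  # attention paid to the masked feature of row 0
```

Note that the masked feature still produces an output token as a query; the mask only stops other features from attending to it, so a model would also need to decide what to feed in for the invalid cell itself.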

Describe alternatives you've considered
I considered just removing the rows or replacing them with a fixed value. It's probably not ideal but it somewhat works.
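
For reference, both workarounds can be sketched with pandas on a toy table. The values below are made up, and `NaN` is assumed to stand in for an invalid cell; the same `isna()` call also yields the kind of cell-level mask DataFrame described above.

```python
import numpy as np
import pandas as pd

# Toy table with made-up values; NaN stands in for an invalid cell.
df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0, 4.0],
    "b": [np.nan, 2.0, 3.0, 4.0],
    "c": [1.0, 2.0, np.nan, 4.0],
})

# Cell-level mask with the same shape as the data: True where invalid.
mask = df.isna()

# Workaround 1: drop every row that contains at least one invalid cell.
dropped = df.dropna()

# Workaround 2: replace invalid cells with a fixed value.
filled = df.fillna(0.0)

rows_kept = len(dropped)
cells_invalid = int(mask.to_numpy().sum())
print(f"rows kept after dropping: {rows_kept} of {len(df)}")
print(f"invalid cells: {cells_invalid} of {df.size}")
```

In this toy case a handful of invalid cells spread across rows removes most of the table when dropping, while the fixed-value fill keeps every row but makes invalid cells indistinguishable from real zeros unless the mask is kept alongside.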

Additional context
I don't know if this is possible yet, so I'm mostly asking a question here. If it's not planned I might be able to help out on this. I imagine the implementation might be relatively easy for the transformer- / attention-based models.

@manujosephv
Owner

@Aceticia Sorry for the late response; I've been tied up with other engagements. Currently there is no way to use a mask in attention-based models, at least not with the current API in PyTorch Tabular.

But out of curiosity, why isn't it ideal to just drop the invalid rows? Is it because of some temporal nature of the data?

@Aceticia
Author

I understand. The reason I'm interested in this feature is that I have a small table where most of the rows contain a small number of invalid values. Simply dropping those rows leaves approximately 75% fewer rows. If you think it's something that can be included in the official API, I'll be happy to discuss ways I can contribute to make this feature available.

@stale

stale bot commented Feb 11, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix This will not be worked on label Feb 11, 2023
@stale stale bot closed this as completed Feb 18, 2023
2 participants