Dictionary Input and Custom Collate Function #1050

Open
meepd opened this issue Feb 27, 2024 · 1 comment

meepd commented Feb 27, 2024

It's not clear to me how to combine dictionary input (the suggested solution for multi-input models and RNNs) with a custom collate function that pads sequences per batch. I know that by default the dictionary is unpacked when it is passed to the forward function, but I can't imagine that's true for the collate_fn. So does it assume that the collate_fn also takes in (X, y) as a tuple, where X is a SliceDict? How do we unpack that SliceDict?

Also, I'm not sure how to actually use SliceDict for variable-length sequence data. It doesn't seem to accept a list of tensors.

I would prefer to use a Dataset I defined, but it's not clear how to handle multiple inputs in that case.

@BenjaminBossan
Collaborator

> but I can't imagine that's true for the collate_fn

The default collate_fn from PyTorch correctly deals with dictionary inputs. If you want to pass a custom collate_fn, you'd have to ensure that it does too. Ideally, you could share your code and the error so that we can take a look.
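
As a rough illustration of what "ensure that it does too" could look like, here is a minimal sketch of a collate function that pads per batch. It assumes each sample arrives as an `(x_dict, y)` tuple; the key names `seq`, `static` and `lengths` and the function name `pad_collate` are made up for this example and are not part of skorch or PyTorch.

```python
import torch
from torch.nn.utils.rnn import pad_sequence


def pad_collate(batch):
    """Collate a list of (x_dict, y) samples, padding 'seq' per batch.

    Assumption: every x_dict holds a variable-length tensor under the
    hypothetical key 'seq' and fixed-size tensors under all other keys.
    """
    xs, ys = zip(*batch)
    seqs = [x["seq"] for x in xs]
    collated = {
        # Pad only up to the longest sequence in this batch.
        "seq": pad_sequence(seqs, batch_first=True, padding_value=0),
        # Keep the true lengths so the model can mask or pack the padding.
        "lengths": torch.tensor([len(s) for s in seqs]),
    }
    # Fixed-size entries can simply be stacked along a new batch dimension.
    for key in xs[0]:
        if key != "seq":
            collated[key] = torch.stack([x[key] for x in xs])
    return collated, torch.stack([torch.as_tensor(y) for y in ys])


# Two dummy samples with sequence lengths 3 and 5.
samples = [
    ({"seq": torch.ones(3, 4), "static": torch.zeros(2)}, 0),
    ({"seq": torch.ones(5, 4), "static": torch.zeros(2)}, 1),
]
X_batch, y_batch = pad_collate(samples)
```

With skorch, such a function would typically be passed via `iterator_train__collate_fn` (and the matching `iterator_valid__collate_fn`). Since the returned dictionary is unpacked into `forward`, the module would then need arguments named after its keys.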

> So does it assume that the collate_fn also takes in (X, y) as a tuple, where X is a SliceDict? How do we unpack that SliceDict?

Just to be clear, SliceDict is not involved when using dictionary inputs. Its main purpose is to trick sklearn into accepting dictionary inputs, e.g. when you want to use GridSearchCV and pass a dict as X.

> I would prefer to use a Dataset I defined, but it's not clear how to handle multiple inputs in that case.

Again, if you could provide some code and (dummy) data, it would help us to figure out your issue.
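
In the meantime, one possible starting point for the Dataset question is sketched below. It assumes the dataset returns each sample as an `(x_dict, y)` tuple; the class name `TwoInputDataset` and the keys `a` and `b` are hypothetical, and the data is random dummy data. With fixed-size inputs like these, PyTorch's default collate function batches the dictionary without extra code.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class TwoInputDataset(Dataset):
    """Hypothetical dataset yielding (x_dict, y) with two named inputs."""

    def __init__(self, n=6):
        self.a = torch.randn(n, 3)
        self.b = torch.randn(n, 5)
        self.y = torch.arange(n) % 2

    def __len__(self):
        return len(self.y)

    def __getitem__(self, i):
        # The dict keys would need to match the argument names of forward.
        return {"a": self.a[i], "b": self.b[i]}, self.y[i]


# No collate_fn given, so the default one stacks each dict entry per batch.
loader = DataLoader(TwoInputDataset(), batch_size=4)
X_batch, y_batch = next(iter(loader))
```

Variable-length inputs would not stack this way and would need a custom collate function instead.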
