A Field of tensors #127

Closed
kylegao91 opened this issue Sep 22, 2017 · 5 comments

Comments

@kylegao91
Contributor

I have an application where the target field is a list of integers that should be converted directly to a vector, with no tokenization or numericalization needed. Is this currently possible with torchtext?

Specifically, the application is a pointer network where the target sequence is a list of indices pointing to the input sequence. It is useful for combinatorial problems.
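To make the shape of such a target concrete, here is a minimal sketch using sorting as the combinatorial problem. It assumes the target is simply the list of input positions in the order they should be visited; the function name `pointer_targets` is made up for illustration and is not part of torchtext.

```python
def pointer_targets(values):
    """Return the input positions that visit `values` in ascending order."""
    return sorted(range(len(values)), key=lambda i: values[i])


inputs = [0.3, 0.1, 0.2]
targets = pointer_targets(inputs)
print(targets)  # [1, 2, 0]: each entry points back into `inputs`
```

The target is already a list of integers, so the only remaining steps are padding and conversion to a tensor.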

@jekbradbury
Contributor

This should be possible after #119. Can you check out that PR branch and see if it solves your use case?

@nelson-liu
Contributor

nelson-liu commented Sep 22, 2017

Actually, #119 doesn't quite address this: it only covers fields whose value is a number, as opposed to a list of numbers.

To elaborate more on the issue:

In this use case, say you want to turn the string "0.1 0.2 0.3" into a Tensor of [0.1, 0.2, 0.3]. You obviously can't coerce the whole string directly to a float, so you have to tokenize it in some manner. But how do you deal with padding, if there is any? (This is why using sequential isn't super clean.)

Perhaps we should have some other parameter for numeric types? Alternatively, maybe I'm thinking about this all wrong and there's a cleaner solution?
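The padding concern can be illustrated with a small plain-Python sketch (no torchtext API involved). The helper names `parse_floats` and `pad_with_lengths` and the choice of 0.0 as the pad value are assumptions made only for this example. Because 0.0 could also be real data, the sketch returns the true lengths alongside the padded rows.

```python
def parse_floats(line):
    """Split on whitespace and convert each token to a float."""
    return [float(tok) for tok in line.split()]


def pad_with_lengths(rows, pad_value=0.0):
    """Right-pad rows to a common width and return their original lengths."""
    lengths = [len(row) for row in rows]
    width = max(lengths, default=0)
    padded = [row + [pad_value] * (width - len(row)) for row in rows]
    return padded, lengths


rows = [parse_floats("0.1 0.2 0.3"), parse_floats("0.5")]
padded, lengths = pad_with_lengths(rows)
print(padded)   # [[0.1, 0.2, 0.3], [0.5, 0.0, 0.0]]
print(lengths)  # [3, 1]
```

Without the lengths, nothing distinguishes a padded 0.0 from a genuine one, which is the ambiguity raised above.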

@kylegao91
Contributor Author

True, it is tricky when it comes to floats. I was assuming integers when I thought about it, in which case the padding is just another special number. Is there any use case for floats?
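For the integer case, a rough sketch of "padding is just another special number" might look like the following. The sentinel -1 and the name `pad_index_targets` are assumptions for illustration; any value outside the valid index range would serve, since pointer indices are non-negative.

```python
PAD_INDEX = -1  # assumed sentinel; valid pointer indices are >= 0


def pad_index_targets(targets, pad_index=PAD_INDEX):
    """Right-pad each list of indices to the longest length in the batch."""
    width = max((len(t) for t in targets), default=0)
    return [list(t) + [pad_index] * (width - len(t)) for t in targets]


batch = pad_index_targets([[1, 2, 0], [0, 1]])
print(batch)  # [[1, 2, 0], [0, 1, -1]]
```

Positions holding the sentinel can then be skipped when computing the loss.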

@kylegao91
Contributor Author

I found in OpenNMT/OpenNMT-py that they store dynamic information in each example:
https://github.com/OpenNMT/OpenNMT-py/blob/2a8653d88e6f7caa8754720176fdb69d0b1d0758/onmt/IO.py#L168

I think that would work as a workaround for this issue.

@jekbradbury
Contributor

jekbradbury commented Oct 17, 2017

I think #147 closes this too.

3 participants