Using a field representing real numbers with the iterator #78
Comments
Found use_vocab argument 😞
Even after setting use_vocab=False, I get an error.
It is the same error that you get when you try torch.DoubleTensor('1.2'). Is there something I am doing wrong?
Thanks for the issue. Torchtext needs to convert the string number to a numeric type.
Edit: I just noticed that your example uses doubles, so I changed my code accordingly (tab-separated file).
The following works on my machine in the meantime while we patch this:
In [1]: import torch
In [2]: from torchtext import data
In [3]: TEXT = data.Field(batch_first=True)
In [4]: TARGETS = data.Field(sequential=False, tensor_type=torch.DoubleTensor, batch_first=True, use_vocab=False, postprocessing=data.Pipeline(lambda x: float(x)))
In [5]: fields = [('targets', TARGETS), ('text', TEXT)]
In [6]: dataset = data.TabularDataset(path="test.txt", format="tsv", fields=fields)
In [7]: TEXT.build_vocab(dataset)
In [8]: train_iter = data.Iterator(dataset, batch_size=1, sort_key=lambda x: len(x.text), shuffle=True)
In [9]: batch = next(iter(train_iter))
In [10]: batch.targets
Out[10]:
Variable containing:
1.3000
[torch.cuda.DoubleTensor of size 1 (GPU 0)]
Hope that helps.
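For readers who want to see the idea without this torchtext version installed, here is a minimal library-free sketch of what the workaround amounts to: read the tab-separated rows and cast the target column from a string to a float before batching. The sample file contents and the helper name load_rows are assumptions made for illustration; they are not part of torchtext.

```python
import csv
import io

# Hypothetical file contents: target <TAB> text, mirroring the field order
# used in the session above (targets first, then text).
SAMPLE_TSV = "1.3\tsome example text\n0.7\tanother example\n"


def load_rows(handle):
    """Parse TSV rows and convert the target column from str to float."""
    rows = []
    for target, text in csv.reader(handle, delimiter="\t"):
        # The cast is the step the postprocessing pipeline is used for above.
        rows.append((float(target), text.split()))
    return rows


rows = load_rows(io.StringIO(SAMPLE_TSV))
print(rows[0])
```

The only point of the sketch is that the float cast has to happen somewhere between reading the file and building the tensor.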
Thanks for the solution @nelson-liu
Could you leave this open for now? There is a bug behind this that would be nice to track (the fact that we do not actually convert values with
Sure, I agree.
Yeah, I was originally imagining that values would be provided as Python numerical types, but that isn't really consistent with the nature of the library, which mostly loads text values. Certainly if it sees strings it should convert them!
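A rough sketch of the behavior described in the last comment, assuming the goal is simply "pass numbers through, convert strings". The function names to_number and numericalize_values are hypothetical and do not reflect how torchtext implements or later fixed this.

```python
def to_number(value, cast=float):
    """Return value as a number: cast strings, pass numeric types through."""
    if isinstance(value, str):
        return cast(value)
    if isinstance(value, (int, float)):
        return value
    raise TypeError("expected str, int or float, got %s" % type(value).__name__)


def numericalize_values(batch, cast=float):
    """Convert every raw value in a batch of non-sequential field values."""
    return [to_number(value, cast) for value in batch]


converted = numericalize_values(["1.2", 3, 4.5])
print(converted)
```

With a check like this, a field without a vocabulary would accept both values that are already numeric and values read from a text file.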
If both of my fields, such as target and source, are sequences, I get the same error. Any idea how to resolve this?
For me, the solution above didn't work.
@greed2411 you don't even need the lambda. Edit: It seems to work for
I am trying to train a regressor on text data. I use torchtext for all my other tasks, but I have run into a problem using it for this use case.
I define the field for targets as follows:
I have a tab-separated (\t) file.
When I make iterators out of it, it gives me an error when getting the next batch.
I know this is because I didn't run .build_vocab for the TARGETS field. But why do I really need to do this? What if I just want to get real numbers and compute losses on them?
Any workaround is appreciated. If I am doing something wrong, please let me know too.
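As an illustration of the question above, the following library-free sketch contrasts the two kinds of field. A vocabulary exists to map token strings to integer ids, whereas real-valued targets only need a float cast and no vocabulary at all. The helper make_vocab and the sample data are assumptions made for this example, not torchtext's implementation.

```python
from collections import Counter


def make_vocab(token_lists):
    """Map each distinct token to an integer id, most frequent first."""
    counts = Counter(tok for toks in token_lists for tok in toks)
    # Break frequency ties alphabetically so the mapping is deterministic.
    ordered = sorted(counts, key=lambda tok: (-counts[tok], tok))
    return {tok: idx for idx, tok in enumerate(ordered)}


# Hypothetical examples: tokenized text plus targets as read from a file.
texts = [["good", "movie"], ["bad", "movie"]]
targets = ["1.3", "0.2"]

vocab = make_vocab(texts)
text_ids = [[vocab[tok] for tok in toks] for toks in texts]  # needs a vocab
target_values = [float(t) for t in targets]                  # needs no vocab

print(text_ids, target_values)
```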