Describe the bug
I wanted to test this package for multilabel so I tried the example code for "Minimal Start for Multilabel Classification".
To Reproduce
Copy the example code from README.md (reproduced here) and paste it into a file multilabelmve.py:
from simpletransformers.classification import MultiLabelClassificationModel
import pandas as pd
# Train and Evaluation data needs to be in a Pandas Dataframe containing at least two columns, a 'text' and a 'labels' column. The `labels` column should contain multi-hot encoded lists.
train_data = [['Example sentence 1 for multilabel classification.', [1, 1, 1, 1, 0, 1]]] + [['This is another example sentence. ', [0, 1, 1, 0, 0, 0]]]
train_df = pd.DataFrame(train_data, columns=['text', 'labels'])
train_df = pd.DataFrame(train_data)
eval_data = [['Example eval sentence for multilabel classification.', [1, 1, 1, 1, 0, 1]], ['Another example eval sentence.', 0], ['Example eval senntence belonging to class 2', [0, 1, 1, 0, 0, 0]]]
eval_df = pd.DataFrame(eval_data)
# Create a MultiLabelClassificationModel
model = MultiLabelClassificationModel('roberta', 'roberta-base', num_labels=6, args={'reprocess_input_data': True, 'overwrite_output_dir': True, 'num_train_epochs': 5})
print(train_df.head())
# Train the model
model.train_model(train_df)
# Evaluate the model
result, model_outputs, wrong_predictions = model.eval_model(eval_df)
print(result)
print(model_outputs)
predictions, raw_outputs = model.predict(['This thing is entirely different from the other thing. '])
print(predictions)
print(raw_outputs)
From an active simpletransformers conda environment, run: python multilabelmve.py
The model trains fine but fails during evaluation at line 21, model.eval_model(eval_df).
Error trace:
Traceback (most recent call last):
File "multilabelmve.py", line 21, in <module>
result, model_outputs, wrong_predictions = model.eval_model(eval_df)
File "/home/gilles/repos/simpletransformers/simpletransformers/classification/multi_label_classification_model.py", line 103, in eval_model
return super().eval_model(eval_df, output_dir=output_dir, multi_label=multi_label, verbose=verbose, **kwargs)
File "/home/gilles/repos/simpletransformers/simpletransformers/classification/classification_model.py", line 307, in eval_model
result, model_outputs, wrong_preds = self.evaluate(eval_df, output_dir, multi_label=multi_label, **kwargs)
File "/home/gilles/repos/simpletransformers/simpletransformers/classification/multi_label_classification_model.py", line 106, in evaluate
return super().evaluate(eval_df, output_dir, multi_label=multi_label, prefix=prefix, **kwargs)
File "/home/gilles/repos/simpletransformers/simpletransformers/classification/classification_model.py", line 337, in evaluate
eval_dataset = self.load_and_cache_examples(eval_examples, evaluate=True)
File "/home/gilles/repos/simpletransformers/simpletransformers/classification/multi_label_classification_model.py", line 109, in load_and_cache_examples
return super().load_and_cache_examples(examples, evaluate=evaluate, no_cache=no_cache, multi_label=multi_label)
File "/home/gilles/repos/simpletransformers/simpletransformers/classification/classification_model.py", line 446, in load_and_cache_examples
all_label_ids = torch.tensor([f.label_id for f in features], dtype=torch.long)
TypeError: not a sequence
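The failure can be reproduced outside simpletransformers. The sketch below (an illustration, not code from the library) passes torch.tensor the same mixed list of multi-hot lists and a bare int that load_and_cache_examples ends up building here; the exact exception message may vary between torch versions:
import torch
# Mimics [f.label_id for f in features] when one eval row carries a bare 0
# instead of a multi-hot encoded list (see eval_data above).
label_ids = [[1, 1, 1, 1, 0, 1], 0, [0, 1, 1, 0, 0, 0]]
try:
    torch.tensor(label_ids, dtype=torch.long)
except (TypeError, ValueError) as e:
    # Surfaced as "TypeError: not a sequence" in the traceback above; newer
    # torch versions may raise a ValueError about mismatched sequence lengths.
    print(e)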
Expected behavior
Evaluation in the minimal example for multilabel classification works.
I figured out that the value of [f.label_id for f in features] is [[1, 1, 1, 1, 0, 1], 0, [0, 1, 1, 0, 0, 0]], which is probably not correct, because the input is not a multi-hot encoded list but a plain int 0.
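For reference, a sketch of what the eval_data presumably needs to look like: every label must be a multi-hot list of length num_labels. The label for the second eval sentence below is a placeholder, not a value given anywhere in the README:
eval_data = [
    ['Example eval sentence for multilabel classification.', [1, 1, 1, 1, 0, 1]],
    ['Another example eval sentence.', [0, 0, 0, 0, 0, 1]],  # placeholder multi-hot label instead of the bare 0
    ['Example eval senntence belonging to class 2', [0, 1, 1, 0, 0, 0]]]
eval_df = pd.DataFrame(eval_data)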
Desktop (please complete the following information):
Ubuntu 18.04
All requirements except Apex installed following README.md
I am making a PR to fix the README. Should I also add all examples as separate files in an examples/ directory at the project root while I am at it? e.g. ./examples/multiclass.py, ./examples/multilabel.py, etc.
I think it would help discoverability for this project because people could quickly test the code.