ValueError: Unsupported dataset schema #449 #529
I suggested a fix that you haven't tried yet. A quick diagnosis tells me you should be using our `HuggingFaceDataset`:

```python
from textattack.datasets import HuggingFaceDataset

train_dataset = HuggingFaceDataset('squad', split='train')
eval_dataset = HuggingFaceDataset('squad', split='validation')
```
Thank you, Jack. Things are working now. In the same code above, when I try the yelp dataset, it shows that it will take several days to complete because there are about 560,000 examples. Is it possible to reduce the number of examples to about 10k so that it would go faster?
Yes! I would try using the rotten_tomatoes dataset instead. It's much smaller.
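If you'd rather stay with yelp, one way to cap it at around 10k examples is to load just a slice of the split with the HuggingFace `datasets` library and wrap it in TextAttack's generic dataset class. This is only a sketch, assuming the textattack 0.3.x `Dataset` API and the yelp_polarity column names (`text`, `label`):

```python
import datasets
import textattack

# Pull only the first 10,000 training examples; split slicing like
# "train[:10000]" is a feature of the `datasets` library.
raw_train = datasets.load_dataset("yelp_polarity", split="train[:10000]")

# Wrap the (text, label) pairs in TextAttack's generic Dataset class.
train_dataset = textattack.datasets.Dataset(
    [(row["text"], row["label"]) for row in raw_train]
)
```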
Great. Many thanks. I really appreciate it.
I am running the following code to test IMDB on the WordCNN model, and it gives me this error: `NameError: name 'model_wrapper' is not defined`

```python
!pip install textattack

# We only use DeepWordBugGao2018 for demonstration purposes.
attack = textattack.attack_recipes.DeepWordBugGao2018.build(model_wrapper)

# Train for 3 epochs with 1 initial clean epoch, 1000 adversarial examples per epoch,
# a learning rate of 5e-5, and an effective batch size of 32 (8x4).
training_args = textattack.TrainingArgs(
```
uhh, yeah, you still need this piece of the code (it has to run before the `DeepWordBugGao2018.build(model_wrapper)` call, since that call needs `model_wrapper` to already be defined):

```python
model = transformers.AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")
tokenizer = transformers.AutoTokenizer.from_pretrained("bert-base-uncased")
model_wrapper = textattack.models.wrappers.HuggingFaceModelWrapper(model, tokenizer)
```
That worked. Many thanks!
I ran the training on LSTM using this command:

```
textattack train --model-name-or-path lstm --dataset yelp_polarity --epochs 50 --learning-rate 1e-5
```
Pretty sure you have to create a model wrapper file and use the ...
When I try to run an attack using my saved model, I use this command:

```
!textattack attack --recipe textfooler --num-examples 100 --model ./outputs/2021-09-15-06-37-33-327512/best_model --dataset-from-huggingface imdb --dataset-split test
```

but it gives me this error:

```
ValueError: Error: unsupported TextAttack model ./outputs/2021-09-15-06-37-33-327512/best_model
```

Do you know what could be going wrong?
You're using ...
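For reference, the usual workaround for the "unsupported TextAttack model" error with a locally saved checkpoint is to load the model yourself in a small Python file and point the CLI at that file. The sketch below assumes the checkpoint is a HuggingFace-format model (as in the bert-base-uncased example earlier in this thread) and that the saved directory also contains the tokenizer files; the file name `my_model.py` is just an example:

```python
# my_model.py
# The attack CLI expects this file to define `model` and `tokenizer`
# variables, which it then wraps automatically.
import transformers

checkpoint = "./outputs/2021-09-15-06-37-33-327512/best_model"
model = transformers.AutoModelForSequenceClassification.from_pretrained(checkpoint)
tokenizer = transformers.AutoTokenizer.from_pretrained(checkpoint)
```

Then, if I'm remembering the flag correctly, the attack would be run with `--model-from-file` instead of `--model`:

```
textattack attack --recipe textfooler --num-examples 100 --model-from-file my_model.py --dataset-from-huggingface imdb --dataset-split test
```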
I am trying to run an attack on a pretrained, fine-tuned model as follows:

but it's giving me the following error:

I am not sure why it would not take the pretrained model above. Is there anything I am doing wrong here?
I am running adversarial training on NLP models and I am getting the error "ValueError: Unsupported dataset schema" when I run the following code:

```python
import textattack
import transformers
from textattack.datasets import HuggingFaceDataset

model = transformers.AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")
tokenizer = transformers.AutoTokenizer.from_pretrained("bert-base-uncased")
model_wrapper = textattack.models.wrappers.HuggingFaceModelWrapper(model, tokenizer)

# We only use DeepWordBugGao2018 for demonstration purposes.
attack = textattack.attack_recipes.DeepWordBugGao2018.build(model_wrapper)

train_dataset = HuggingFaceDataset('squad', split='train')
eval_dataset = HuggingFaceDataset('squad', split='validation')

# Train for 3 epochs with 1 initial clean epoch, 1000 adversarial examples per epoch,
# a learning rate of 5e-5, and an effective batch size of 32 (8x4).
training_args = textattack.TrainingArgs(
    num_epochs=3,
    num_clean_epochs=1,
    num_train_adv_examples=1000,
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=4,
    log_to_tb=True,
)

trainer = textattack.Trainer(
    model_wrapper,
    "classification",
    attack,
    train_dataset,
    eval_dataset,
    training_args,
)
trainer.train()
```
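As a side note on the error itself: squad is a question-answering dataset with no classification label column, so (as far as I can tell) it cannot be mapped onto the (text, label) schema the classification trainer expects, which is what raises "Unsupported dataset schema" at the `HuggingFaceDataset('squad', ...)` lines. A minimal change that sidesteps this, using the rotten_tomatoes dataset suggested above, would be:

```python
from textattack.datasets import HuggingFaceDataset

# rotten_tomatoes is a binary sentiment dataset with plain (text, label)
# columns, so it maps cleanly onto the classification schema.
train_dataset = HuggingFaceDataset('rotten_tomatoes', split='train')
eval_dataset = HuggingFaceDataset('rotten_tomatoes', split='validation')
```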
@jxmorris12