Installing textattack

In [1]:
!pip install --upgrade textattack

Collecting textattack
  Downloading textattack-0.3.10-py3-none-any.whl.metadata (38 kB)
Collecting bert-score>=0.3.5 (from textattack)
  Downloading bert_score-0.3.13-py3-none-any.whl.metadata (15 kB)
Collecting flair (from textattack)
  Downloading flair-0.14.0-py3-none-any.whl.metadata (12 kB)
Collecting language-tool-python (from textattack)
  Downloading language_tool_python-2.8.1-py3-none-any.whl.metadata (12 kB)
Collecting lemminflect (from textattack)
  Downloading lemminflect-0.2.3-py3-none-any.whl.metadata (7.0 kB)
Collecting lru-dict (from textattack)
  Downloading lru_dict-1.3.0-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (4.5 kB)
Collecting datasets>=2.4.0 (from textattack)
  Downloading datasets-3.1.0-py3-none-any.whl.metadata (20 kB)
Collecting terminaltables (from textattack)
  Downloading terminaltables-3.1.10-py2.py3-none-any.whl.metadata (3.5 kB)
Collecting word2number (from textattack)
  Downloading word2

Data augmentation

Augment the dataset in `examples.csv` using the EmbeddingAugmenter,
swapping out 4% of the words and generating 2 augmentations per example,
while withholding the original samples from the output CSV.

In [14]:
%%writefile examples.csv
text,label
"This is the first example sentence.",positive
"Here's another example with a different sentiment.",negative
"And one more for good measure.",neutral
"The movie was fantastic!",positive
"I absolutely hated the book.",negative
"The food was okay, nothing special.",neutral
"This product is amazing!",positive
"I'm very disappointed with the service.",negative
"The weather is pleasant today.",neutral
"What a wonderful experience!",positive
"That was the worst performance ever.",negative
"The presentation was informative.",neutral
"I'm so happy with my purchase!",positive
"This is a complete waste of time.",negative
"The scenery is breathtaking.",neutral
"This song is incredible!",positive
"I'm feeling really down today.",negative
"Everything is going according to plan.",neutral
"The concert was electrifying!",positive
"I'm having a terrible day.",negative
"The traffic is terrible today.",neutral
"The coffee is delicious.",positive
"I can't stand the noise.",negative
"The view from here is stunning.",neutral
"This is my favorite restaurant.",positive
"I'm not impressed with this product.",negative
"The service here is excellent.",neutral
"The flowers are beautiful.",positive
"I'm feeling really stressed out.",negative
"Life is good.",neutral
"This is an inspiring story.",positive
"I'm feeling really anxious.",negative
"The world is a beautiful place.",neutral
"This is the best day ever!",positive
"I'm so frustrated right now.",negative
"Everything will be alright.",neutral
"The music is calming.",positive
"I'm feeling lonely.",negative
"The sunset is gorgeous.",neutral
"This is a great opportunity.",positive
"I'm feeling overwhelmed.",negative
"The future is bright.",neutral
"This is a dream come true.",positive
"I'm feeling heartbroken.",negative
"The stars are shining bright.",neutral
"This is a magical moment.",positive
"I'm feeling lost and confused.",negative
"Everything happens for a reason.",neutral
"This is a once-in-a-lifetime experience.",positive
"I'm feeling scared and alone.",negative
"The journey is the destination.",neutral

Writing examples.csv


In [17]:
import pandas as pd
from textattack.augmentation import EmbeddingAugmenter

# Load the dataset
df = pd.read_csv('examples.csv')

# Create the augmenter
augmenter = EmbeddingAugmenter(
    pct_words_to_swap=0.04,
    transformations_per_example=2
)

# Augment the data. augment() returns only the new variants, so the
# original sentences are not carried into the output.
augmented_texts = []
for text in df['text']:  # 'text' is the column holding the sentences
    augmented_texts.extend(augmenter.augment(text))

# Create a new dataframe with the augmented data (text only; the 'label'
# column is not carried over)
augmented_df = pd.DataFrame({'text': augmented_texts})

# Save the augmented data to a new CSV file
augmented_df.to_csv('augmented_examples.csv', index=False)
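The cell above writes only the `text` column, so the sentiment labels are lost. If the labels are needed downstream, one option is to copy every other column onto each augmented variant. The sketch below is an illustration, not part of the original notebook: `augment_rows` and `fake_augment` are hypothetical names, and `fake_augment` is a stand-in for `augmenter.augment` so the example runs without TextAttack or the embedding download.

```python
import csv
import io


def augment_rows(rows, augment_fn, text_key="text"):
    """Return one new row per augmented text, copying all other columns."""
    out = []
    for row in rows:
        for new_text in augment_fn(row[text_key]):
            new_row = dict(row)           # keep label and any other columns
            new_row[text_key] = new_text  # replace only the text
            out.append(new_row)
    return out


def fake_augment(text):
    """Stand-in for augmenter.augment: returns two fixed variants (assumption)."""
    return [
        text.replace("fantastic", "super"),
        text.replace("fantastic", "exceptional"),
    ]


# One row taken from examples.csv, read the same way a CSV file would be.
sample = io.StringIO('text,label\n"The movie was fantastic!",positive\n')
rows = list(csv.DictReader(sample))

augmented_rows = augment_rows(rows, fake_augment)
for row in augmented_rows:
    print(row)
```

In the notebook itself, passing `augmenter.augment` as `augment_fn` and `df.to_dict('records')` as `rows` should give a list of records that `pd.DataFrame` can turn back into a labelled table.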

The augmented data, saved to `augmented_examples.csv`, is:

|text|
|---|
|This is the first case sentence\.|
|This is the first instance sentence\.|
|Here's another example with a assorted sentiment\.|
|Here's another example with a disparate sentiment\.|
|And one more for good measurements\.|
|And one more for good measuring\.|
|The movie was exceptional\!|
|The movie was super\!|
|I absolutely hating the book\.|
|I completely hated the book\.|
|The alimentary was okay, nothing special\.|
|The dietary was okay, nothing special\.|
|This product is remarkable\!|
|This product is wonderful\!|
|I'm very disenchanted with the service\.|
|I'm very disillusioned with the service\.|
|The climactic is pleasant today\.|
|The weather is congenial today\.|
|What a resplendent experience\!|
|What a splendid experience\!|
|That was the hardest performance ever\.|
|That was the pire performance ever\.|
|The introductions was informative\.|
|The presentation was illuminating\.|
|I'm so delighted with my purchase\!|
|I'm so happy with my buys\!|
|This is a complete waste of period\.|
|This is a complete waste of times\.|
|The scenery is staggering\.|
|The scenery is unbelievable\.|
|This song is staggering\!|
|This song is startling\!|
|I'm feeling really down thursday\.|
|I'm feeling truthfully down today\.|
|Everything is going accordance to plan\.|
|Everything is going according to systems\.|
|The concerted was electrifying\!|
|The concerto was electrifying\!|
|I'm having a horrendous day\.|
|I'm having a horrible day\.|
|The traffic is abominable today\.|
|The traffic is atrocious today\.|
|The coffee is delightful\.|
|The coffee is yummy\.|
|I can't stand the ruckus\.|
|I can't standing the noise\.|
|The view from here is amazing\.|
|The view from here is striking\.|
|This is my favored restaurant\.|
|This is my favorite lunchroom\.|
|I'm not impressed with this commodity\.|
|I'm not impressed with this products\.|
|The service here is beautiful\.|
|The service here is fantastic\.|
|The blossoms are beautiful\.|
|The flowering are beautiful\.|
|I'm feeling really highlighting out\.|
|I'm feeling really underline out\.|
|Life is alright\.|
|Living is good\.|
|This is an inspiring stories\.|
|This is an inspiring storytelling\.|
|I'm feeling genuinely anxious\.|
|I'm feeling really keen\.|
|The world is a fantastic place\.|
|The world is a splendid place\.|
|This is the finest day ever\!|
|This is the optimum day ever\!|
|I'm so foiled right now\.|
|I'm so frustrated rights now\.|
|Any will be alright\.|
|Everything will be okay\.|
|The music is pacify\.|
|The musicians is calming\.|
|I'm feeling loneliness\.|
|I'm feeling solitaire\.|
|The sunset is fantastic\.|
|The sunset is ravishing\.|
|This is a great chance\.|
|This is a great likelihood\.|
|I'm feeling swamped\.|
|I'm sentiment overwhelmed\.|
|The futur is bright\.|
|The futuristic is bright\.|
|This is a dream come authentic\.|
|This is a dream come veritable\.|
|I'm impression heartbroken\.|
|I'm sense heartbroken\.|
|The celebrity are shining bright\.|
|The stars are shining radiant\.|
|This is a magic moment\.|
|This is a magical time\.|
|I'm feeling lost and puzzled\.|
|I'm sentiment lost and confused\.|
|Everything arrives for a reason\.|
|Eveything happens for a reason\.|
|This is a once-in-a-lifetime enjoying\.|
|This is a once-in-a-lifetime experiences\.|
|I'm feeling fearful and alone\.|
|I'm feeling scared and lone\.|
|The traveling is the destination\.|
|The voyager is the destination\.|
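Every row in the table differs from its source sentence by a single word. That is consistent with the augmenter swapping at least one word even when 4% of a short sentence rounds down to zero, although this is an inference from the output and not something the notebook checks. Some swaps also change the meaning ("anxious" to "keen") or break the grammar ("stars are" to "celebrity are"), so it is worth reviewing the output before training on it. A minimal sketch for listing the swapped words follows; `changed_words` is a hypothetical helper and assumes whitespace tokenization and an unchanged token count.

```python
def changed_words(original, augmented):
    """Pair up whitespace tokens and return the (old, new) pairs that differ."""
    orig_tokens = original.split()
    aug_tokens = augmented.split()
    if len(orig_tokens) != len(aug_tokens):
        raise ValueError("token counts differ; a positional diff does not apply")
    return [(o, a) for o, a in zip(orig_tokens, aug_tokens) if o != a]


# Original sentences from examples.csv paired with rows from the table above.
pairs = [
    ("The movie was fantastic!", "The movie was super!"),
    ("I'm feeling really anxious.", "I'm feeling really keen."),
    ("The stars are shining bright.", "The celebrity are shining bright."),
]

for original, augmented in pairs:
    print(changed_words(original, augmented))
```

Punctuation stays attached to the tokens because the split is on whitespace only, which is enough for spotting which word was replaced.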

Augment a string in Python

In [3]:
from textattack.augmentation import EmbeddingAugmenter

augmenter = EmbeddingAugmenter()
s = 'What I cannot create, I do not understand.'
augmenter.augment(s)

textattack: Downloading https://textattack.s3.amazonaws.com/word_embeddings/paragramcf.
100%|██████████| 481M/481M [00:56<00:00, 8.48MB/s]
textattack: Unzipping file /root/.cache/textattack/tmpt8rtkdr7.zip to /root/.cache/textattack/word_embeddings/paragramcf.
textattack: Successfully saved word_embeddings/paragramcf to cache.


['What I cannot create, I do not realise.']