Skip to content

tatHi/cwr

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

5 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Compositional Word Replacement

Official implementation of Word-level Perturbation Considering Word Length and Compositional Subwords (Findings of ACL2022).

This repository includes modules for the perturbation:

  • wordReplacer.py
    • Word replacement
    • Word replacement with a restriction considering subword length
  • compositWordReplacer.py
    • Compositional word replacement
    • Compositional word replacement with a restriction considering subword length

example.py overviews the usage of each perturbation module. You can see the outputs of each perturbation just by running this script as the following (the results depends on random seeds):

$ python example.py
Vocabulary:
>>> {'thing': 0, '▁book': 1, '▁ti': 2, '▁read': 3, '▁do': 4, 'er': 5, '▁beg': 6, '▁sist': 7, 'y': 8, 't': 9, '▁hav': 10, '▁it': 11, '▁to': 12, '▁she': 13, '▁no': 14, '▁sit': 15, 'r': 16, '▁picture': 17, 'ed': 18, '▁her': 19, '▁or': 20, '▁bank': 21, '▁twice': 22, 'tions': 23, '▁was': 24, '▁wa': 25, 'in': 26, 'e': 27, '▁get': 28, '▁sis': 29, '▁ver': 30, '▁into': 31, 's': 32, '▁peep': 33, '▁of': 34, '▁the': 35, 'd': 36, '▁:': 37, '▁he': 38, '▁ha': 39, '▁ali': 40, '▁th': 41, '▁but': 42, 'ing': 43, 'ter': 44, '▁conver': 45, '▁in': 46, '▁had': 47, 'a': 48, '▁by': 49, '▁and': 50, 'ce': 51, 'n': 52, '▁on': 53}
Input Text:
>>> ['▁hav', 'ing', '▁no', 'thing', '▁to', '▁do']
Input Ids:
>>> [10, 43, 14, 0, 12, 4]
### Original Word Replacement (p=0.5)###
Trial-1:
	TEXT >>> ['▁hav', '▁book', '▁no', 'thing', '▁to', '▁do']
	IDS  >>> [10, 1, 14, 0, 12, 4]
Trial-2:
	TEXT >>> ['▁hav', '▁th', '▁her', 'thing', '▁into', '▁and']
	IDS  >>> [10, 41, 19, 0, 31, 50]
Trial-3:
	TEXT >>> ['▁he', '▁but', '▁no', 'er', '▁to', '▁sis']
	IDS  >>> [38, 42, 14, 5, 12, 29]
Trial-4:
	TEXT >>> ['▁hav', 'ing', '▁to', 'thing', '▁to', '▁do']
	IDS  >>> [10, 43, 12, 0, 12, 4]
Trial-5:
	TEXT >>> ['n', 'ing', '▁no', 'thing', '▁sis', '▁do']
	IDS  >>> [52, 43, 14, 0, 29, 4]
### Word Replacement + Poisson (p=0.5)###
>>> BUILD SAMPLING TABLE
>>> BUILD CAHCE TABLE
Trial-1:
	TEXT >>> ['▁hav', 'ce', '▁no', 'thing', '▁to', '▁do']
	IDS  >>> [10, 51, 14, 0, 12, 4]
Trial-2:
	TEXT >>> ['▁hav', 'ce', '▁no', '▁the', '▁bank', '▁do']
	IDS  >>> [10, 51, 14, 35, 21, 4]
Trial-3:
	TEXT >>> ['ed', '▁sit', 'a', '▁and', 'er', '▁do']
	IDS  >>> [18, 15, 48, 50, 5, 4]
Trial-4:
	TEXT >>> ['▁she', '▁her', '▁no', 'thing', '▁to', '▁do']
	IDS  >>> [13, 19, 14, 0, 12, 4]
Trial-5:
	TEXT >>> ['▁hav', 'e', '▁no', 'thing', '▁by', '▁do']
	IDS  >>> [10, 27, 14, 0, 49, 4]
### Compositional Word Replacement (p=0.5)###
BUILD...
DONE
BUILD SAMPLING TABLE...
DONE
Trial-1:
	TEXT >>> ['▁hav', 'ing', 'n', 'thing', 't', '▁do']
	IDS  >>> [10, 43, 52, 0, 9, 4]
Trial-2:
	TEXT >>> ['▁ha', 'in', '▁no', 'thing', 't', 'd']
	IDS  >>> [39, 26, 14, 0, 9, 36]
Trial-3:
	TEXT >>> ['▁ha', 'n', '▁no', 'thing', 't', '▁do']
	IDS  >>> [39, 52, 14, 0, 9, 4]
Trial-4:
	TEXT >>> ['a', 'in', '▁no', 'n', '▁to', '▁do']
	IDS  >>> [48, 26, 14, 52, 12, 4]
Trial-5:
	TEXT >>> ['▁hav', 'ing', '▁no', 'thing', 't', 'd']
	IDS  >>> [10, 43, 14, 0, 9, 36]
### Compositional Word Replacement + Poisson (p=0.5)###
BUILD...
DONE
BUILD SAMPLING TABLE...
DONE
Trial-1:
	TEXT >>> ['▁ha', 'in', '▁no', 'thing', 't', 'd']
	IDS  >>> [39, 26, 14, 0, 9, 36]
Trial-2:
	TEXT >>> ['▁ha', 'in', 'n', 'ing', '▁to', '▁do']
	IDS  >>> [39, 26, 52, 43, 12, 4]
Trial-3:
	TEXT >>> ['▁hav', 'n', 'n', 'n', 't', 'd']
	IDS  >>> [10, 52, 52, 52, 9, 36]
Trial-4:
	TEXT >>> ['▁hav', 'in', '▁no', 'thing', '▁to', '▁do']
	IDS  >>> [10, 26, 14, 0, 12, 4]
Trial-5:
	TEXT >>> ['▁ha', 'ing', 'n', 'in', 't', 'd']
	IDS  >>> [39, 43, 52, 26, 9, 36]

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages