Merge pull request #543 from QData/doc-minor

add custom dataset API use example in doc
QData · Oct 8, 2021 · 3f0d529
2 parents caacc1c + 42d0192
commit 3f0d529

Showing 2 changed files with 19 additions and 6 deletions.
diff --git a/README.md b/README.md
@@ -499,15 +499,21 @@ dataset = [('Today was....', 1), ('This movie is...', 0), ...]
 You can then run attacks on samples from this dataset by adding the argument `--dataset-from-file my_dataset.py`.
 
 
-#### Dataset via AttackedText class
 
-To allow for word replacement after a sequence has been tokenized, we include an `AttackedText` object
-which maintains both a list of tokens and the original text, with punctuation. We use this object in favor of a list of words or just raw text.
+#### Dataset loading via other mechanisms, see: [more details here](https://textattack.readthedocs.io/en/latest/api/datasets.html)
 
+```python
+import textattack
+my_dataset = [("Today was....", 1), ("This movie is...", 0)]  # list of (text, label) pairs
+new_dataset = textattack.datasets.Dataset(my_dataset)
+```
 
 
-#### Dataset loading via other mechanism, see: [here](https://textattack.readthedocs.io/en/latest/api/datasets.html)
 
+#### Dataset via AttackedText class
+
+To allow for word replacement after a sequence has been tokenized, we include an `AttackedText` object
+which maintains both a list of tokens and the original text, with punctuation. We use this object in favor of a list of words or just raw text.
 
 
 ### Attacks and how to design a new attack 

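The README hunk above expects a list of `(text, label)` pairs, either in `my_dataset.py` or passed to `textattack.datasets.Dataset`. As a rough sketch of how such a list might be built from tabular data, the following uses only the standard library. The `load_pairs` helper, the `text` and `label` column names, and the sample rows are hypothetical and are not part of TextAttack.

```python
import csv
import io


def load_pairs(csv_text):
    """Parse CSV text with 'text' and 'label' columns into (text, label) tuples.

    Hypothetical helper for illustration; not a TextAttack function.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    # Labels are cast to int so they match the 0/1 labels shown in the README.
    return [(row["text"], int(row["label"])) for row in reader]


# Hypothetical sample data standing in for a real CSV file.
SAMPLE_CSV = "text,label\nToday was a good day,1\nThis movie is dull,0\n"

# A module-level `dataset` variable mirrors the `my_dataset.py` convention.
dataset = load_pairs(SAMPLE_CSV)
```

A list built this way has the same shape as `my_dataset` in the added snippet, so it could presumably be handed to `textattack.datasets.Dataset(...)` in the same manner.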
diff --git a/docs/1start/FAQ.md b/docs/1start/FAQ.md
@@ -110,14 +110,21 @@ You can then run attacks on samples from this dataset by adding the argument `--
 
 
 
+#### Dataset loading via other mechanisms, see: [more details here](https://textattack.readthedocs.io/en/latest/api/datasets.html)
+
+```python
+import textattack
+my_dataset = [("Today was....", 1), ("This movie is...", 0)]  # list of (text, label) pairs
+new_dataset = textattack.datasets.Dataset(my_dataset)
+```
+
+
 #### Custom Dataset via AttackedText class
 
 To allow for word replacement after a sequence has been tokenized, we include an `AttackedText` object
 which maintains both a list of tokens and the original text, with punctuation. We use this object in favor of a list of words or just raw text.
 
 
-#### Custome Dataset via Data Frames or other python data objects (*coming soon*)
-
 
 ### 4. Benchmarking Attacks
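
Both files describe `AttackedText` as an object that keeps the original text, punctuation included, alongside a list of tokens so that words can be replaced after tokenization. The idea can be sketched in a few lines of plain Python. The `SimpleAttackedText` class below is a hypothetical stand-in written for illustration, not TextAttack's implementation, and its regex tokenizer and method names are assumptions.

```python
import re


class SimpleAttackedText:
    """Hypothetical stand-in: keeps the original text plus its list of words."""

    # Assumed tokenization rule: runs of letters, digits, and apostrophes.
    _WORD = re.compile(r"[A-Za-z0-9']+")

    def __init__(self, text):
        self.text = text
        # Record (start, end) spans so punctuation and spacing survive edits.
        self._spans = [m.span() for m in self._WORD.finditer(text)]

    @property
    def words(self):
        """Return the words of the text, without punctuation."""
        return [self.text[start:end] for start, end in self._spans]

    def replace_word_at_index(self, index, new_word):
        """Return a new object with one word swapped and all else unchanged."""
        start, end = self._spans[index]
        return SimpleAttackedText(self.text[:start] + new_word + self.text[end:])


original = SimpleAttackedText("This movie is great, truly!")
perturbed = original.replace_word_at_index(3, "awful")
```

Tracking character spans rather than a bare word list is what lets the comma and exclamation mark survive the replacement, which a plain list of words or raw text alone would not give you.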