Merge pull request #99 from QData/quick-grad-rename

Quick grad rename
jxmorris12 committed May 15, 2020
2 parents cd28217 + 92bb29e commit c61d840
Showing 15 changed files with 439 additions and 357 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -39,7 +39,7 @@ environment variable `TA_CACHE_DIR`.

### Running Attacks

The [`examples/`](examples/) folder contains notebooks walking through examples of basic usage of TextAttack, including building a custom transformation and a custom constraint.
The [`examples/`](docs/examples/) folder contains notebooks walking through examples of basic usage of TextAttack, including building a custom transformation and a custom constraint. These examples can also be viewed through the [documentation website](https://textattack.readthedocs.io/en/latest).

We also have a command-line interface for running attacks. See help info and list of arguments with `python -m textattack --help`.

7 changes: 4 additions & 3 deletions docs/conf.py
@@ -21,7 +21,7 @@
author = 'UVA QData Lab'

# The full version, including alpha/beta/rc tags
release = '0.0.1'
release = '0.0.1.9'

# Set master doc to `index.rst`.
master_doc = 'index'
@@ -35,7 +35,8 @@
'sphinx.ext.viewcode',
'sphinx.ext.autodoc',
'sphinx.ext.napoleon',
"sphinx_rtd_theme"
'sphinx_rtd_theme',
'nbsphinx'
]

# Add any paths that contain templates here, relative to this directory.
@@ -48,7 +49,7 @@


# Mock language_check to stop issues with Sphinx not loading it
autodoc_mock_imports = ["language_check"]
autodoc_mock_imports = []



6 changes: 3 additions & 3 deletions docs/constraints/constraint.rst
@@ -8,11 +8,11 @@ Constraints determine whether a given transformation is valid. Since transformat

We split constraints into three main categories:

:ref:`semantics`: Check meaning of sentence
:ref:`semantics`: Based on the meaning of input and perturbation

:ref:`syntactical`: Check part-of-speech and grammar
:ref:`grammaticality`: Based on syntactic properties like part-of-speech and grammar

:ref:`overlap`: Measure edit distance
:ref:`overlap`: Based on character-based properties, like edit distance

.. automodule:: textattack.constraints.constraint
:members:
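
To make the interface concrete before diving into the categories, here is a minimal sketch of a constraint in this spirit. The `__call__(x, x_adv)` signature comes from the notebook tutorial later in this diff; the class itself is a toy illustration, not TextAttack's actual base class:

```python
class MaxWordsChangedConstraint:
    """Toy constraint: a perturbation `x_adv` is valid only if it changes
    at most `max_changed` words of the original input `x`.
    (Illustrative sketch only, not part of textattack.constraints.)"""

    def __init__(self, max_changed=1):
        self.max_changed = max_changed

    def __call__(self, x, x_adv):
        original, perturbed = x.split(), x_adv.split()
        if len(original) != len(perturbed):
            return False  # only word-for-word substitutions are comparable here
        changed = sum(o != p for o, p in zip(original, perturbed))
        return changed <= self.max_changed
```
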
docs/constraints/grammaticality.rst
@@ -1,10 +1,11 @@
.. _syntactical:

==============================
Constraints based on Syntax
Grammaticality
==============================

Syntactic constraints determine if a transformation is valid based on the resulting syntax.
Grammaticality constraints determine if a transformation is valid based on
syntactic properties of the perturbation.
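
A simple proxy for grammaticality, separate from the language-model and LanguageTool constraints documented below, is to require that a word swap preserve the sentence's part-of-speech sequence. A minimal sketch using NLTK (which the notebook in this commit also uses):

```python
import nltk

# Requires: nltk.download('punkt') and nltk.download('averaged_perceptron_tagger')

def pos_tags(sentence):
    """Part-of-speech tag for each token in `sentence`."""
    return [tag for _, tag in nltk.pos_tag(nltk.word_tokenize(sentence))]

def preserves_pos_sequence(x, x_adv):
    """Treat a perturbation as grammatical if the POS sequence is unchanged."""
    return pos_tags(x) == pos_tags(x_adv)
```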

Language Models
################
@@ -14,14 +15,9 @@ Language Models
.. automodule:: textattack.constraints.grammaticality.language_models.gpt2
:members:

Google Language Models
************************
.. automodule:: textattack.constraints.grammaticality.language_models.google_language_model.google_language_model
:members:

.. automodule:: textattack.constraints.grammaticality.language_models.google_language_model.alzantot_goog_lm
:members:

Language Tool
##############
.. automodule:: textattack.constraints.grammaticality.language_tool
2 changes: 1 addition & 1 deletion docs/constraints/overlap.rst
@@ -1,7 +1,7 @@
.. _overlap:

==========================================
Constraints based on Overlap
Overlap
==========================================

Overlap constraints determine if a transformation is valid based on character-level analysis.
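
Edit distance is the canonical example; a self-contained sketch of the Levenshtein distance such a constraint might threshold:

```python
def edit_distance(a, b):
    """Levenshtein distance between strings `a` and `b` (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # delete ca
                            curr[j - 1] + 1,             # insert cb
                            prev[j - 1] + (ca != cb)))   # substitute ca -> cb
        prev = curr
    return prev[-1]
```

A constraint would then reject any `x_adv` whose distance from `x` exceeds a fixed bound.
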
5 changes: 3 additions & 2 deletions docs/constraints/semantics.rst
@@ -1,10 +1,11 @@
.. _semantics:

================================
Constraints based on Semantics
Semantics
================================

Semantic constraints determine if a transformation is valid based on similarity of the semantics between the original input and the transformed input.
Semantic constraints determine if a transformation is valid based on similarity
of the semantics between the original input and the transformed input.
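
The constraints notebook in this commit describes one such check: encode the original and perturbed inputs with a sentence encoder and require the encodings to stay close. A minimal sketch of that idea, where `encode` is a hypothetical stand-in for any sentence encoder rather than a TextAttack class:

```python
import numpy as np

def cosine_similarity(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

class SentenceEncoderConstraint:
    """Accept `x_adv` only if its encoding stays close to that of `x`.
    `encode` maps a sentence to a fixed-size numpy vector (hypothetical)."""

    def __init__(self, encode, threshold=0.8):
        self.encode = encode
        self.threshold = threshold

    def __call__(self, x, x_adv):
        return cosine_similarity(self.encode(x), self.encode(x_adv)) >= self.threshold
```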

Word Embedding Distance
########################
File renamed without changes.
328 changes: 328 additions & 0 deletions docs/examples/1_Introduction_and_Transformtions.ipynb

Large diffs are not rendered by default.

docs/examples/2_Constraints.ipynb
@@ -14,7 +14,7 @@
"\n",
"- **Overlap constraints** determine if a perturbation is valid based on character-level analysis. For example, some attacks are constrained by edit distance: a perturbation is only valid if it perturbs some small number of characters (or fewer).\n",
"\n",
"- **Syntactic constraints** filter inputs based on their syntax. For example, an attack may require that adversarial perturbations do not introduce grammatical errors.\n",
"- **Grammaticality constraints** filter inputs based on syntactical information. For example, an attack may require that adversarial perturbations do not introduce grammatical errors.\n",
"\n",
"- **Semantic constraints** try to ensure that the perturbation is semantically similar to the original input. For example, we may design a constraint that uses a sentence encoder to encode the original and perturbed inputs, and enforce that the sentence encodings be within some fixed distance of one another. (This is what happens in subclasses of `textattack.constraints.semantics.sentence_encoders`.)"
]
@@ -35,7 +35,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## A custom constraint\n",
"### A custom constraint\n",
"\n",
"\n",
"For fun, we're going to see what happens when we constrain an attack to only allow perturbations that substitute out a named entity for another. In linguistics, a **named entity** is a proper noun, the name of a person, organization, location, product, etc. Named Entity Recognition is a popular NLP task (and one that state-of-the-art models can perform quite well). \n",
@@ -45,21 +45,59 @@
"\n",
"**NLTK**, the Natural Language Toolkit, is a Python package that helps developers write programs that process natural language. NLTK comes with predefined algorithms for lots of linguistic tasks– including Named Entity Recognition.\n",
"\n",
"First, we're going to write a constraint class. In the `__call__` method, we're going to use NLTK to find the named entities in both `x` and `x_adv`. We will only return `True` (that is, our constraint is met) if `x_adv` has substituted one named entity in `x` for another."
"First, we're going to write a constraint class. In the `__call__` method, we're going to use NLTK to find the named entities in both `x` and `x_adv`. We will only return `True` (that is, our constraint is met) if `x_adv` has substituted one named entity in `x` for another.\n",
"\n",
"Let's import NLTK and download the required modules:"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"[nltk_data] Downloading package punkt to /u/jm8wx/nltk_data...\n",
"[nltk_data] Package punkt is already up-to-date!\n",
"[nltk_data] Downloading package maxent_ne_chunker to\n",
"[nltk_data] /u/jm8wx/nltk_data...\n",
"[nltk_data] Package maxent_ne_chunker is already up-to-date!\n",
"[nltk_data] Downloading package words to /u/jm8wx/nltk_data...\n",
"[nltk_data] Package words is already up-to-date!\n"
]
},
{
"data": {
"text/plain": [
"True"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import nltk\n",
"nltk.download('punkt') # The NLTK tokenizer\n",
"nltk.download('maxent_ne_chunker') # NLTK named-entity chunker\n",
"nltk.download('words') # NLTK list of words"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## NLTK NER Example\n",
"### NLTK NER Example\n",
"\n",
"Here's an example of using NLTK to find the named entities in a sentence:"
]
},
{
"cell_type": "code",
"execution_count": 34,
"execution_count": 8,
"metadata": {},
"outputs": [
{
@@ -90,8 +128,6 @@
}
],
"source": [
"import nltk\n",
"\n",
"sentence = ('In 2017, star quarterback Tom Brady led the Patriots to the Super Bowl, '\n",
" 'but lost to the Philadelphia Eagles.')\n",
"\n",
@@ -115,7 +151,7 @@
},
{
"cell_type": "code",
"execution_count": 51,
"execution_count": 9,
"metadata": {},
"outputs": [
{
@@ -145,14 +181,14 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Putting it all together: getting a list of Named Entity Labels from a sentence\n",
"### Putting it all together: getting a list of Named Entity Labels from a sentence\n",
"\n",
"Now that we know how to tokenize, parse, and detect named entities using NLTK, let's put it all together into a single helper function. Later, when we implement our constraint, we can query this function to easily get the entity labels from a sentence. We can even use `@functools.lru_cache` to try and speed this process up."
]
},
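
The code cell that follows is collapsed in this view. A sketch of what such a cached NER helper could look like, with the function name and cache size as illustrative guesses rather than the notebook's exact code:

```python
import functools

import nltk

@functools.lru_cache(maxsize=2**14)
def get_entities(sentence):
    """Tokenize, POS-tag, and NE-chunk `sentence`, caching repeated queries."""
    tokens = nltk.word_tokenize(sentence)
    tagged = nltk.pos_tag(tokens)
    # ne_chunk returns a tree whose subtrees are labeled with entity types
    # like PERSON, ORGANIZATION, or GPE. (Uses the NLTK resources
    # downloaded earlier in the notebook.)
    return nltk.ne_chunk(tagged)
```
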
{
"cell_type": "code",
"execution_count": 36,
"execution_count": 10,
"metadata": {},
"outputs": [],
"source": [
@@ -178,7 +214,7 @@
},
{
"cell_type": "code",
"execution_count": 37,
"execution_count": 11,
"metadata": {},
"outputs": [
{
@@ -200,7 +236,7 @@
" ('.', '.')]"
]
},
"execution_count": 37,
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
@@ -221,14 +257,14 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Creating our NamedEntityConstraint\n",
"### Creating our NamedEntityConstraint\n",
"\n",
"Now that we know how to detect named entities using NLTK, let's create our custom constraint."
]
},
{
"cell_type": "code",
"execution_count": 38,
"execution_count": 12,
"metadata": {},
"outputs": [],
"source": [
@@ -273,16 +309,28 @@
"collapsed": true
},
"source": [
"## Testing our constraint\n",
"### Testing our constraint\n",
"\n",
"We need to create an attack and a dataset to test our constraint on. We went over all of this in the first tutorial, so let's gloss over this part for now."
]
},
{
"cell_type": "code",
"execution_count": 39,
"execution_count": 13,
"metadata": {},
"outputs": [],
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"\u001b[34;1mtextattack\u001b[0m: Downloading https://textattack.s3.amazonaws.com/models/classification/lstm/yelp_polarity.\n",
"100%|██████████| 297M/297M [00:06<00:00, 48.3MB/s] \n",
"\u001b[34;1mtextattack\u001b[0m: Unzipping file path_to_zip_file to unzipped_folder_path.\n",
"\u001b[34;1mtextattack\u001b[0m: Successfully saved models/classification/lstm/yelp_polarity to cache.\n",
"\u001b[34;1mtextattack\u001b[0m: Goal function <class 'textattack.goal_functions.classification.untargeted_classification.UntargetedClassification'> matches model LSTMForYelpSentimentClassification.\n"
]
}
],
"source": [
"# Import the dataset.\n",
"from textattack.datasets.classification import YelpSentiment\n",
Expand All @@ -296,7 +344,7 @@
},
{
"cell_type": "code",
"execution_count": 40,
"execution_count": 14,
"metadata": {},
"outputs": [
{
@@ -319,7 +367,7 @@
],
"source": [
"from textattack.transformations import WordSwapEmbedding\n",
"from textattack.attack_methods import GreedyWordSwap\n",
"from textattack.search_methods import GreedyWordSwap\n",
"\n",
"# We're going to the `WordSwapEmbedding` transformation. Using the default settings, this\n",
"# will try substituting words with their neighbors in the counter-fitted embedding space. \n",
29 changes: 24 additions & 5 deletions docs/index.rst
@@ -11,10 +11,22 @@ TextAttack
Features
-----------

- **Search Methods**: Explores the transformation space and attempts to find a successful attack
- **Transformations**: Takes a text input and transforms it by replacing words and phrases while attempting to retain the meaning
- **Constraints**: Determines if a given transformation is valid
- **Built-in Datasets** and **Pre-trained Models** for ease of use
TextAttack isn't just a Python library; it's a framework for constructing adversarial attacks in NLP. TextAttack builds attacks from four components (sketched below):

- **Goal Functions** stipulate the goal of the attack, such as changing the prediction score of a classification model or changing all of the words in a translation output
- **Search Methods** explore the space of transformations and attempt to find a successful perturbation
- **Transformations** take a text input and transform it by replacing characters, words, or phrases
- **Constraints** determine whether a potential perturbation is valid with respect to the original input

TextAttack provides a set of **attack recipes** that assemble attacks from the literature out of these four components.
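
As a rough illustration of how the components fit together, here is a minimal sketch; the class names below appear elsewhere in this diff, but the import paths, constructor signatures, and argument names are assumptions rather than documented API:

```python
# Illustrative assembly of the four components. The class names appear in
# this commit's notebook, but import paths and signatures are assumptions.
from textattack.datasets.classification import YelpSentiment
from textattack.goal_functions.classification.untargeted_classification import (
    UntargetedClassification,
)
from textattack.models.classification.lstm import LSTMForYelpSentimentClassification
from textattack.search_methods import GreedyWordSwap
from textattack.transformations import WordSwapEmbedding

model = LSTMForYelpSentimentClassification()     # pre-trained Yelp sentiment LSTM
goal_function = UntargetedClassification(model)  # succeed when the predicted label flips
transformation = WordSwapEmbedding()             # swap words for embedding-space neighbors
attack = GreedyWordSwap(goal_function, transformation)  # greedy search over word swaps
dataset = YelpSentiment()                        # built-in dataset to attack
```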

TextAttack has some other features that make it a pleasure to use:

- **Data augmentation** using transformations & constraints
- **Built-in Datasets** for running attacks without supplying your own data
- **Pre-trained Models** for testing attacks and evaluating constraints
- **Built-in tokenizers** so you don't have to worry about tokenizing the inputs
- **Visualization options** like Visdom and Weights & Biases


.. toctree::
@@ -23,6 +35,13 @@ Features

quickstart/installation
quickstart/overview

.. toctree::
:maxdepth: 2
:caption: Examples

examples/1_Introduction_and_Transformtions.ipynb
examples/2_Constraints.ipynb


.. toctree::
@@ -44,7 +63,7 @@

constraints/constraint
constraints/semantics
constraints/syntax
constraints/grammaticality
constraints/overlap


1 change: 1 addition & 0 deletions docs/requirements.txt
@@ -0,0 +1 @@
nbsphinx
