This repository has been archived by the owner on Nov 3, 2023. It is now read-only.

Dialcrowd #4387

Merged
merged 16 commits into from May 2, 2022
1 change: 1 addition & 0 deletions .flake8
@@ -10,6 +10,7 @@ extend-ignore =
W503
F403
F541
E305
select = C,E,F,W,B,B950,RST,PAI,BLK

max-line-length=80
173 changes: 173 additions & 0 deletions parlai/crowdsourcing/tasks/dialcrowd/README.md
@@ -0,0 +1,173 @@
# DialCrowd
DialCrowd is a dialogue crowdsourcing toolkit that helps requesters write clear HITs, view and analyze results, and obtain higher-quality data. This integration brings DialCrowd's requester, worker, and analysis interfaces into ParlAI, so requesters can use ParlAI's tools alongside DialCrowd's.

## Example Walkthrough

### Configuration

![screenshot](images/config1.png)
We start with a general configuration section. Here, you can indicate the background for your study, general instructions, enable markdown, specify the time each HIT should take, the payment per HIT, number of utterances per HIT, number of annotations per utterance, and number of sentences per page on the HIT. You can also upload your data in .txt format where each new line is a new utterance to be annotated.

***

![screenshot](images/config2.png)
Next is the quality control section. Here, you can specify how many duplicate task units and how many golden task units each worker should annotate (we suggest that fewer than 10% of the units in each HIT be quality control units).
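
As a rough illustration of the 10% suggestion, the sketch below computes the share of quality control units in a HIT. The function names are hypothetical, and it assumes the duplicate and golden units are added on top of the regular utterances; DialCrowd itself may count them differently.

```python
def qc_fraction(n_regular: int, n_duplicate: int, n_golden: int) -> float:
    """Share of a HIT's task units that are quality control units.

    Assumption: duplicate and golden units are added on top of the
    regular utterances, so the HIT has n_regular + n_qc units in total.
    """
    n_qc = n_duplicate + n_golden
    total = n_regular + n_qc
    if total <= 0:
        raise ValueError("a HIT needs at least one task unit")
    return n_qc / total


def within_suggested_budget(
    n_regular: int, n_duplicate: int, n_golden: int, limit: float = 0.10
) -> bool:
    """True if quality control units make up less than `limit` of the HIT."""
    return qc_fraction(n_regular, n_duplicate, n_golden) < limit
```

For example, 20 regular utterances with one duplicate and one golden unit stay under the suggested share, while 8 regular utterances with the same two quality control units do not.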

***

![screenshot](images/config3.png)
We allow you to upload a consent form for the workers that they will have to agree to before accessing the HIT. This form allows you to inform workers of possible risks and also asks for their explicit consent for participation in your data collection.

***

![screenshot](images/config4.png)
For each of your intents, we provide areas where you can add any additional instructions for each intent, as well as add examples and counterexamples along with explanations. It is important to provide illustrative examples for the workers, so they will be able to provide the annotations according to the definitions you set.

***

![screenshot](images/config5.png)
This allows workers to provide feedback in an open-response input box so that you may improve future iterations of your task.

***

![screenshot](images/config6.png)
You can customize the colors, fonts, and text size associated with your HIT to highlight any important information.

### Annotation Page
![screenshot](images/annotation1.png)
We show the worker the background for your study, as well as the instructions and the table of intents with their respective definitions, examples, counterexamples, and explanations.

***

![screenshot](images/annotation2.png)
Each worker will have a dropdown menu with all the intents, as well as an option to reshow the instructions and examples if they wish to refer back. A confidence score is also provided so that if workers are unsure, they can indicate that.

### Results Page
![screenshot](images/results1.png)
We record the time each worker spends on each annotation and report the average time per annotation, whether the annotations show any abnormality (e.g., a worker selecting one intent for all utterances), agreement, agreement with the golden questions, and inter-user agreement.
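
Two of these checks can be sketched in a few lines of Python. This is only an illustration, not the code behind the results page: the function names and the `{utterance_id: intent}` input shape are assumptions.

```python
from typing import Dict, Hashable, Optional


def selects_single_intent(labels: Dict[Hashable, str], min_items: int = 2) -> bool:
    """True if a worker gave the same intent to every utterance they annotated."""
    return len(labels) >= min_items and len(set(labels.values())) == 1


def golden_agreement(
    labels: Dict[Hashable, str], golden: Dict[Hashable, str]
) -> Optional[float]:
    """Fraction of golden units the worker labeled like the reference answer.

    Returns None if the worker annotated none of the golden units.
    """
    shared = [uid for uid in golden if uid in labels]
    if not shared:
        return None
    hits = sum(labels[uid] == golden[uid] for uid in shared)
    return hits / len(shared)
```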

***

![screenshot](images/results2.png)
We calculate Fleiss' kappa for each of the questions, as well as overall kappa.
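
For reference, Fleiss' kappa can be computed from a table of per-utterance category counts. The sketch below is a generic textbook implementation, not necessarily the one used by the results page; the function name and the input layout (one row per utterance, one column per intent, each row summing to the number of annotators) are assumptions.

```python
from typing import Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa for a table of category counts.

    counts[i][j] is the number of annotators who assigned category j to
    item i. Every item must have the same number of annotators (>= 2).
    """
    if not counts:
        raise ValueError("need at least one item")
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2:
        raise ValueError("need at least two annotators per item")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every item must have the same number of annotators")

    n_categories = len(counts[0])
    # Proportion of all assignments that went to each category.
    p_j = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    # Observed agreement for each item.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_items          # mean observed agreement
    p_e = sum(p * p for p in p_j)       # agreement expected by chance
    if p_e == 1.0:
        # Only one category was ever used; kappa is undefined here.
        return float("nan")
    return (p_bar - p_e) / (1 - p_e)
```

A value of 1 means perfect agreement, and values at or below 0 mean agreement no better than chance.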

***

![screenshot](images/results3.png)

We then provide a graph of the time taken by each worker for the HIT, so you are able to pinpoint and check the results of any workers that may have spent an extremely long or short time on the HIT.
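
One simple way to pick out such workers programmatically is to compare each worker's time against the median. The rule below (flag anything more than `factor` times longer or shorter than the median) is an assumption chosen for illustration, not DialCrowd's own criterion, and the function name is hypothetical.

```python
from statistics import median
from typing import Dict


def flag_time_outliers(times: Dict[str, float], factor: float = 3.0) -> Dict[str, str]:
    """Flag workers whose HIT time is far from the median.

    times maps a worker id to the seconds spent on the HIT. A worker is
    flagged "long" if they took more than factor * median and "short" if
    they took less than median / factor.
    """
    if not times:
        return {}
    mid = median(times.values())
    flagged = {}
    for worker, seconds in times.items():
        if seconds > mid * factor:
            flagged[worker] = "long"
        elif seconds < mid / factor:
            flagged[worker] = "short"
    return flagged
```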

## Usage

### Configuration

Run `./config.sh`

This will walk you through configuring the task (instructions, examples, payment, etc.).

### Preview Task Locally

```
python run.py
```

If you want to modify the webpage and see your changes immediately, run the following command:
```
python run.py mephisto.blueprint.link_task_source=true
```
and
```
cd webapp
npm run dev:watch
```

### Push the Task to AMT Sandbox

1. Obtain the API tokens by following the instructions on [this page](https://requestersandbox.mturk.com/developer).
2. Register the API tokens:
```
mephisto register mturk_sandbox name=mturk_sandbox access_key_id=[KEY_ID] secret_access_key=[SECRET_KEY]
```
3. Execute:
```
python run.py mephisto/architect=heroku mephisto/provider=mturk_sandbox mephisto.provider.requester_name=mturk_sandbox
```

Troubleshooting:

1. If you register an incorrect token, you may need to remove the database used by Mephisto and register the correct token again. Check `Mephisto/data/` for more information.
2. If `python run.py` fails with a Heroku-related error, try running `heroku login` before running `python run.py`. Also check whether you have enough quota to create a new instance on Heroku.

### Results Page

In `webapp-results/server.js`, set `mephisto_path` to the path of the Mephisto results for your task, and set `workers` to an array of `<task_run_id>/<assignment_id>/<agent_id>` strings.

Run `./configquality.sh`

This will show the resulting data along with quality control metrics (outliers due to time, duplicate data checks, Fleiss' Kappa calculation, etc).
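
To avoid typing the `workers` entries by hand, they can be collected from the results directory. The sketch below assumes the directory layout given in the "Note on DB" section (`<mephisto_root_dir>/data/data/runs/NO_PROJECT/<task_run_id>/<assignment_id>/<agent_id>/agent_data.json`); the function name is hypothetical.

```python
from pathlib import Path
from typing import List


def list_worker_entries(mephisto_root: str, project: str = "NO_PROJECT") -> List[str]:
    """Return '<task_run_id>/<assignment_id>/<agent_id>' for every saved agent."""
    runs_dir = Path(mephisto_root) / "data" / "data" / "runs" / project
    entries = []
    for agent_file in sorted(runs_dir.glob("*/*/*/agent_data.json")):
        # The three directories above agent_data.json identify the agent.
        task_run_id, assignment_id, agent_id = agent_file.parts[-4:-1]
        entries.append(f"{task_run_id}/{assignment_id}/{agent_id}")
    return entries
```

The returned strings can then be pasted into the `workers` array in `webapp-results/server.js`.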

## Code Structure

### Configuration Page

- `config.sh`: Build the front-end webpage; launch the backend; open the browser.
- `webapp-config/`: Source for the configuration webpage.
- `webapp-config/server.js`: A tiny backend to host the webpage.

### Annotation Page

- `webapp/`: Source for the annotation webpage.

### Quality Check Page

- `webapp-results/`: Source for the quality check page
- `webapp-results/server.js`: A tiny backend to host the webpage and pull results from Mephisto local files.

#### Frontend

- `webapp/src/components/task_components.jsx`: The DialCrowd component `WorkerCategory` is used here.
- `webapp/src/components/dialcrowd/worker_category.js`: `WorkerCategory` is defined here.

The `WorkerCategory` element takes in three attributes passed by ParlAI:

- `taskData`: The data to be annotated in this HIT. It is provided by the ParlAI backend.
- `taskConfig`: The task config loaded by the ParlAI backend.
- `onSubmit`: A function that takes one argument. The data passed as that argument is sent to ParlAI and saved in the backend. When running locally with `python run.py`, this function saves nothing and only shows a pop-up window. When running on AMT, it shows no pop-up window, and the data is passed to the backend and saved.

#### Backend (ParlAI Scripts)

- `dialcrowd_blueprint.py`: Loads the data and configuration files. The data is loaded into `self.raw_data` in `DialCrowdStaticBlueprint.__init__`. The configuration file is loaded in `DialCrowdStaticBlueprint.get_frontend_args`, whose return value is the data passed to the `taskConfig` attribute of `WorkerCategory`.
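
After loading, the blueprint's `__init__` also groups the raw rows into chunks of `subtasks_per_unit`, so each HIT unit contains several utterances. A standalone sketch of that grouping follows; `chunk_rows` is a hypothetical helper, and it raises `ValueError` where the blueprint raises a plain `Exception`.

```python
from typing import Any, Dict, List


def chunk_rows(
    rows: List[Dict[str, Any]], subtasks_per_unit: int
) -> List[List[Dict[str, Any]]]:
    """Group rows into consecutive chunks of at most subtasks_per_unit items."""
    if subtasks_per_unit <= 0:
        raise ValueError(
            f"subtasks_per_unit must be greater than zero but was {subtasks_per_unit}"
        )
    # The last chunk is shorter unless len(rows) is an exact multiple.
    return [
        rows[i : i + subtasks_per_unit]
        for i in range(0, len(rows), subtasks_per_unit)
    ]
```

With the ten example rows in `data.jsonl` and `subtasks_per_unit: 4`, this yields three units containing 4, 4, and 2 utterances.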


#### Configurations

- `task_config/config.json`: Configuration file from DialCrowd.
- `hydra_configs/conf/example.yaml`: Configuration used by ParlAI.
- `data.jsonl`: Where the data to be annotated is stored. The format is as follows:

```json
{"id": 1, "sentences": ["please tell me my in-person transactions for the last three days using my debit card"], "category": []}
{"id": 2, "sentences": ["send $5 from savings to checking"], "category": []}
{"id": 3, "sentences": ["is there enough money in my bank of hawaii for vacation"], "category": []}
{"id": 4, "sentences": ["i need to pay my cable bill"], "category": []}
{"id": 5, "sentences": ["read my bill balances"], "category": []}
{"id": 6, "sentences": ["please tell me all of my recent transactions"], "category": []}
{"id": 7, "sentences": ["please transfer $100 from my checking to my savings account"], "category": []}
{"id": 8, "sentences": ["could you check my bank balance for me"], "category": []}
{"id": 9, "sentences": ["i need help paying my electric bill"], "category": []}
{"id": 10, "sentences": ["what is the amount of balance i have to pay on my bill"], "category": []}
```
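
If your utterances are in a plain-text file with one utterance per line, rows in this format can be generated with a short script. This is a sketch under that assumption; the function name is hypothetical, and it leaves `category` empty as in the example above.

```python
import json
from typing import Iterable, List


def utterances_to_jsonl_rows(utterances: Iterable[str], start_id: int = 1) -> List[str]:
    """Turn utterances into JSONL rows shaped like the example data.jsonl."""
    rows = []
    next_id = start_id
    for text in utterances:
        text = text.strip()
        if not text:
            continue  # skip blank lines
        rows.append(json.dumps({"id": next_id, "sentences": [text], "category": []}))
        next_id += 1
    return rows
```

Writing `"\n".join(rows)` to `data.jsonl` produces a file in the same shape as the example.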


## Note on DB

The annotations done by the workers are saved in `<mephisto_root_dir>/data/data/runs/NO_PROJECT/<task_run_id>/<assignment_id>/<agent_id>/agent_data.json`.

Information about pushed tasks can be found in `Mephisto/data/database.db`, which is a SQLite database.

Information about the tasks can be found in the table `assignments`. It includes `task_run_id`, which can be used to locate the directory containing the annotations done by the workers.
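
The lookup can be done with Python's built-in `sqlite3` module. The sketch below is an assumption-laden illustration: the `assignments` table and its `task_run_id` column come from the note above, while the `assignment_id` column name and the function name are assumed, so check the actual schema before relying on it.

```python
import sqlite3
from typing import List, Tuple


def assignments_by_run(db_path: str) -> List[Tuple[str, str]]:
    """Return (task_run_id, assignment_id) pairs from the assignments table.

    Assumption: the table has a column named assignment_id.
    """
    conn = sqlite3.connect(db_path)
    try:
        cursor = conn.execute(
            "SELECT task_run_id, assignment_id FROM assignments "
            "ORDER BY task_run_id, assignment_id"
        )
        return [(str(run_id), str(asg_id)) for run_id, asg_id in cursor.fetchall()]
    finally:
        conn.close()
```

Each pair corresponds to a directory `<task_run_id>/<assignment_id>/` under the runs folder.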

## Contributors

Jessica Huynh, Ting-Rui Chiang, Kyusong Lee
Carnegie Mellon University 2022
5 changes: 5 additions & 0 deletions parlai/crowdsourcing/tasks/dialcrowd/__init__.py
@@ -0,0 +1,5 @@
#!/usr/bin/env python3

# Copyright (c) Facebook, Inc. and its affiliates.
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.
9 changes: 9 additions & 0 deletions parlai/crowdsourcing/tasks/dialcrowd/config.sh
@@ -0,0 +1,9 @@
# /*********************************************
# @ Jessica Huynh, Ting-Rui Chiang, Kyusong Lee
# Carnegie Mellon University 2022
# *********************************************/

cd ./webapp-config/
npm install
npm run dev
node server.js
9 changes: 9 additions & 0 deletions parlai/crowdsourcing/tasks/dialcrowd/configquality.sh
@@ -0,0 +1,9 @@
# /*********************************************
# @ Jessica Huynh, Ting-Rui Chiang, Kyusong Lee
# Carnegie Mellon University 2022
# *********************************************/

cd ./webapp-results/
npm install
npm run dev
node server.js
10 changes: 10 additions & 0 deletions parlai/crowdsourcing/tasks/dialcrowd/data.jsonl
@@ -0,0 +1,10 @@
{"id": 1, "sentences": ["please tell me my in-person transactions for the last three days using my debit card"], "category": []}
{"id": 2, "sentences": ["send $5 from savings to checking"], "category": []}
{"id": 3, "sentences": ["is there enough money in my bank of hawaii for vacation"], "category": []}
{"id": 4, "sentences": ["i need to pay my cable bill"], "category": []}
{"id": 5, "sentences": ["read my bill balances"], "category": []}
{"id": 6, "sentences": ["please tell me all of my recent transactions"], "category": []}
{"id": 7, "sentences": ["please transfer $100 from my checking to my savings account"], "category": []}
{"id": 8, "sentences": ["could you check my bank balance for me"], "category": []}
{"id": 9, "sentences": ["i need help paying my electric bill"], "category": []}
{"id": 10, "sentences": ["what is the amount of balance i have to pay on my bill"], "category": []}
107 changes: 107 additions & 0 deletions parlai/crowdsourcing/tasks/dialcrowd/dialcrowd_blueprint.py
@@ -0,0 +1,107 @@
#!/usr/bin/env python3

# Copyright (c) Facebook, Inc. and its affiliates.
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.

import json
import logging
import os
from dataclasses import dataclass, field
from typing import Any, Dict, TYPE_CHECKING

from mephisto.operations.registry import register_mephisto_abstraction
from mephisto.abstractions.blueprint import SharedTaskState
from mephisto.abstractions.blueprints.static_react_task.static_react_blueprint import (
    StaticReactBlueprint,
    StaticReactBlueprintArgs,
)
from omegaconf import DictConfig

if TYPE_CHECKING:
    from mephisto.data_model.task import TaskRun


def get_task_path():
    return os.path.dirname(__file__)


STATIC_BLUEPRINT_TYPE = 'dialcrowd_static_blueprint'


@dataclass
class DialCrowdStaticBlueprintArgs(StaticReactBlueprintArgs):
    _blueprint_type: str = STATIC_BLUEPRINT_TYPE
    _group: str = field(
        default="DialCrowdStaticBlueprint",
        metadata={
            'help': """This task renders conversations from a file and asks for turn by turn annotations of them."""
        },
    )
    subtasks_per_unit: int = field(
        default=-1, metadata={"help": "Number of subtasks/comparisons to do per unit"}
    )


@register_mephisto_abstraction()
class DialCrowdStaticBlueprint(StaticReactBlueprint):
    """
    This Blueprint has a subtasks number option to combine multiple conversations into
    "sub-HITs".

    It also has options for the onboarding data answers and the annotation bucket
    definitions.
    """

    ArgsClass = DialCrowdStaticBlueprintArgs
    BLUEPRINT_TYPE = STATIC_BLUEPRINT_TYPE

    def __init__(
        self, task_run: "TaskRun", args: "DictConfig", shared_state: "SharedTaskState"
    ):
        super().__init__(task_run, args=args, shared_state=shared_state)
        self.subtasks_per_unit = self.args.blueprint.subtasks_per_unit

        if self.subtasks_per_unit <= 0:
            raise Exception(
                f'subtasks_per_unit must be greater than zero but was {self.subtasks_per_unit}'
            )

        self.raw_data = self._initialization_data_dicts

        # Now chunk the data into groups of <num_subtasks>
        grouped_data = []
        logging.info(
            f'Raw data length: {len(self.raw_data)}. self.subtasks_per_unit: {self.subtasks_per_unit}'
        )
        for i in range(0, len(self._initialization_data_dicts), self.subtasks_per_unit):
            chunk = self._initialization_data_dicts[i : i + self.subtasks_per_unit]
            grouped_data.append(chunk)
        self._initialization_data_dicts = grouped_data
        # Last group may have less unless an exact multiple
        logging.info(
            f'Grouped data into {len(self._initialization_data_dicts)} tasks with {self.subtasks_per_unit} subtasks each.'
        )

    def get_frontend_args(self) -> Dict[str, Any]:
        """
        Specifies what options within a task_config should be forwarded to the client
        for use by the task's frontend.
        """

        # load the task configuration
        with open(os.path.join(get_task_path(), 'task_config/config.json')) as f:
            task_config = json.load(f)

        # combine the task configuration loaded from json with the settings
        # required by ParlAI.
        task_config.update(
            {
                "task_description": self.args.task.get('task_description', None),
                "task_title": self.args.task.get('task_title', None),
                "frame_height": '100%',
                "num_subtasks": self.args.blueprint.subtasks_per_unit,
                "block_mobile": True,
            }
        )
        return task_config
@@ -0,0 +1,26 @@
#@package _global_
defaults:
  - /mephisto/blueprint: dialcrowd_static_blueprint
  - /mephisto/architect: local
  - /mephisto/provider: mock
mephisto:
  blueprint:
    data_jsonl: ${task_dir}/data.jsonl
    extra_source_dir: ${task_dir}/webapp/src/static
    subtasks_per_unit: 4
    task_source: ${task_dir}/webapp/build/bundle.js
    units_per_assignment: 5
  task:
    allowed_concurrent: 1
    assignment_duration_in_seconds: 450
    max_num_concurrent_units: 0
    maximum_units_per_worker: 1
    task_description: Our goal is to build an AI chat bot that can help people complete
      certain tasks. To achieve this goal and train the bot, we need some sentences
      labeled by human annotators. Please help us classify the sentences below.
    task_name: turn_annotations_static
    task_reward: 0.65
    task_tags: chat,conversation,dialog,partner
    task_title: Annotate Sentences For Intent
mturk:
  worker_blocklist_paths: null
5 changes: 5 additions & 0 deletions parlai/crowdsourcing/tasks/dialcrowd/images/__init__.py
@@ -0,0 +1,5 @@
#!/usr/bin/env python3

# Copyright (c) Facebook, Inc. and its affiliates.
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.
3 changes: 3 additions & 0 deletions parlai/crowdsourcing/tasks/dialcrowd/package-lock.json

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.