This repository has been archived by the owner on Sep 27, 2023. It is now read-only.

google/stable-retraining-conversational-agents

Random and Systematic Noisy Data for Stable Re-training of Conversational Agents

This repository contains the synthetic data for the paper:

  • Reducing Model Churn: Stable Re-training of Conversational Agents

The data is derived from the following datasets:

  • TOP
  • TOPv2
  • MTOP
  • SNIPS

Data

The generated synthetic data is in the following directories, one per source dataset:

top_data/

topv2_data/

mtop_data/

snips_data/

Each dataset directory has two subdirectories:

swap_top/

distant_top/

These subdirectories contain the random and systematic noisy datasets described in Section 6 of the paper.

Each dataset has a train.tsv file with "query" and "label" columns.
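
The layout described above can be walked programmatically. The following is a minimal sketch, not part of the repository: the function name find_train_files and the root argument are hypothetical, and because the README does not say how deep train.tsv sits inside each subdirectory, the sketch searches recursively rather than assuming a fixed path.

```python
from pathlib import Path

# Directory names as listed in this README.
DATASET_DIRS = ["top_data", "topv2_data", "mtop_data", "snips_data"]
NOISE_DIRS = ["swap_top", "distant_top"]


def find_train_files(root="."):
    """Map (dataset_dir, noise_dir) to the train.tsv files found beneath it.

    Hypothetical helper: searches recursively because the exact depth of
    train.tsv inside each subdirectory is an assumption, not a documented fact.
    """
    root = Path(root)
    found = {}
    for dataset_dir in DATASET_DIRS:
        for noise_dir in NOISE_DIRS:
            base = root / dataset_dir / noise_dir
            # Missing directories map to an empty list instead of raising.
            found[(dataset_dir, noise_dir)] = (
                sorted(base.rglob("train.tsv")) if base.is_dir() else []
            )
    return found
```

Calling find_train_files on a local checkout returns one entry per dataset and subdirectory pair, so an empty list signals a directory that is missing or laid out differently than assumed.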

The dev and test sets are the same as in the original datasets and are therefore not included here.

Each distant_top directory contains an additional file, distant_labeled_train.tsv, which holds the held-out 10% of the training data, labeled by a model.
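
Reading one of these files might look like the sketch below. It is an illustration under stated assumptions: the helper read_query_label_tsv is hypothetical, the columns are assumed to appear in the order query then label, and whether the files start with a header row is not stated here, so that is left as a parameter. The same helper would apply to distant_labeled_train.tsv if it shares the two-column layout, which is also an assumption.

```python
import csv


def read_query_label_tsv(path, has_header=True):
    """Return a list of (query, label) pairs from a two-column TSV file.

    Assumptions (not confirmed by the README): column order is query, label,
    and has_header tells whether the first row holds column names.
    """
    pairs = []
    with open(path, newline="", encoding="utf-8") as f:
        # QUOTE_NONE keeps quotation marks inside queries as literal text.
        reader = csv.reader(f, delimiter="\t", quoting=csv.QUOTE_NONE)
        if has_header:
            next(reader, None)  # skip the header row if one is present
        for row in reader:
            if len(row) >= 2:  # ignore blank or malformed lines
                pairs.append((row[0], row[1]))
    return pairs
```

Inspecting the first line of a file by hand before choosing has_header avoids silently dropping the first training example.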
