Meta Dialog Platform (MDP)

Meta Dialog Platform: a toolkit platform for NLP Few-Shot Learning tasks of:

  • Text Classification
  • Sequence Labeling

It also provides the baselines for:



State-of-the-art solutions for Few-shot NLP:

Easy-to-start & flexible framework:

  • Provide tools for easy training & testing.
  • Support various few-shot models with unified and extendable interfaces, such as ProtoNet and TapNet.
  • Support easy-to-switch similarity-metrics and logits-scaling methods.
  • Provide tools for generating episode-style data for meta-learning.
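
To illustrate what switchable similarity metrics and logits scaling mean for a prototype-based model, here is a minimal pure-Python sketch. It is not the platform's implementation: the function names (`build_prototypes`, `classify`, and the three similarity functions) and the toy 2-dimensional vectors are assumptions made for this example only.

```python
import math
from collections import defaultdict


def build_prototypes(support_vectors, support_labels):
    """Average the support vectors of each label into one prototype per label."""
    grouped = defaultdict(list)
    for vec, label in zip(support_vectors, support_labels):
        grouped[label].append(vec)
    return {
        label: [sum(dim) / len(vecs) for dim in zip(*vecs)]
        for label, vecs in grouped.items()
    }


def dot_similarity(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    norm = math.sqrt(dot_similarity(a, a)) * math.sqrt(dot_similarity(b, b))
    return dot_similarity(a, b) / norm if norm else 0.0


def neg_euclidean_similarity(a, b):
    # Negated squared distance, so that "closer" means "larger".
    return -sum((x - y) ** 2 for x, y in zip(a, b))


def classify(query, prototypes, similarity=cosine_similarity, scale=1.0):
    """Return (predicted_label, logits); logits are similarities times `scale`."""
    logits = {
        label: scale * similarity(query, proto)
        for label, proto in prototypes.items()
    }
    return max(logits, key=logits.get), logits


# Hypothetical 2-d "embeddings" for a 2-way, 2-shot support set.
support = [[1.0, 0.0], [0.8, 0.2], [0.0, 1.0], [0.1, 0.9]]
labels = ["statement", "statement", "query", "query"]
prototypes = build_prototypes(support, labels)

prediction, _ = classify([0.9, 0.1], prototypes, similarity=neg_euclidean_similarity)
print(prediction)
```

Swapping the `similarity` argument changes the metric without touching the rest of the pipeline, and `scale` rescales the logits before they would be passed to a softmax.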


Please cite code and data:

	title={FewJoint: A Few-shot Learning Benchmark for Joint Language Understanding},
	author={Yutai Hou, Jiafeng Mao, Yongkui Lai, Cheng Chen, Wanxiang Che, Zhigang Chen, Ting Liu},
	journal={arXiv preprint},

Get Started

Environment Requirement


Example for Sequence Labeling

Here, we take the few-shot slot tagging and NER task from (Hou et al., 2020) as quick start examples.

Step1: Prepare pre-trained embedding

  • Download the PyTorch BERT model, or convert the TensorFlow parameters yourself with scripts.
  • Set BERT path in the ./scripts/ to your setting:

Step2: Prepare data

  • Download the compatible few-shot data here: download

  • Set the train, dev, and test data file paths in ./scripts/ to your setting.

For simplicity, you only need to set the root path for the data as follows:


Step3: Train and test the main model

  • Create a folder to collect running logs
mkdir result
  • Execute cross-evaluation script with two params: -[gpu id] -[dataset name]
Example for 1-shot slot tagging:
source ./scripts/ 0 snips
Example for 1-shot NER:
source ./scripts/ 0 ner

To run 5-shot experiments, use ./scripts/

Other detailed functions and options:

You can experiment freely by passing parameters to choose different model architectures, hyperparameters, etc.

To view the detailed options and their descriptions, run the following command:

python --h

We provide scripts for general few-shot classification and sequence labeling tasks, respectively:

  • classification
  • sequence labeling

The usage of these scripts is similar to the process in Get Started.

Run with FewJoint/SMP data

  • Get the reformatted FewJoint data here, or construct episode-style data yourself with our tool.
  • Use the scripts ./scripts/ and ./scripts/ to perform few-shot intent detection or few-shot slot filling, respectively.
  • Notice that:
    1. Change the train/dev/test paths in the scripts before running.
    2. Find the predicted results at the trained_model_path set in the running scripts.

Few-shot Data Construction Tool

We also provide a generation tool for converting normal data into few-shot/meta-episode style. The tool is located at: scripts/other_tool/

Run the following command to view the detailed interface:

python --h

For simplicity, we provide an example script to help generate few-shot data: ./scripts/

The following key parameters control the generation process:

  • input_dir: raw data path
  • output_dir: output data path
  • episode_num: the number of episodes to generate
  • support_shots_lst: the support shot size for each episode; multiple values can be specified to generate several settings at once
  • query_shot: the query shot size for each episode
  • seed_lst: list of random seeds to control random generation
  • use_fix_support: use a fixed support set in the dev dataset
  • dataset_lst: the dataset types that the tool can handle; the choices are stanford, SLU, TourSG, and SMP
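
As a rough illustration of how parameters such as episode_num, a support shot size, query_shot, and a seed interact, the sketch below samples episodes from labeled classification examples. It is a simplified assumption, not the tool's actual algorithm: the function `generate_episodes`, its signature, and the toy data are hypothetical, and it samples a fixed number of examples per label, which may differ from how the tool handles sequence labeling data.

```python
import random
from collections import defaultdict


def generate_episodes(examples, episode_num, support_shot, query_shot, seed):
    """Sample `episode_num` episodes from (tokens, label) pairs.

    Each episode holds `support_shot` support and `query_shot` query examples
    per label. Raises ValueError if a label has too few examples.
    """
    rng = random.Random(seed)  # a seeded generator makes the output reproducible
    by_label = defaultdict(list)
    for tokens, label in examples:
        by_label[label].append(tokens)

    episodes = []
    for _ in range(episode_num):
        episode = {
            "support": {"seq_ins": [], "labels": []},
            "query": {"seq_ins": [], "labels": []},
        }
        for label in sorted(by_label):
            # Draw support and query examples together so they never overlap.
            picked = rng.sample(by_label[label], support_shot + query_shot)
            parts = (("support", picked[:support_shot]),
                     ("query", picked[support_shot:]))
            for part, chunk in parts:
                for tokens in chunk:
                    episode[part]["seq_ins"].append(tokens)
                    episode[part]["labels"].append([label])
        episodes.append(episode)
    return episodes


# Hypothetical raw data: two examples per label.
raw = [
    (["we", "are", "friends", "."], "statement"),
    (["it", "is", "late", "."], "statement"),
    (["how", "are", "you", "?"], "query"),
    (["who", "is", "there", "?"], "query"),
]
episodes = generate_episodes(raw, episode_num=3, support_shot=1, query_shot=1, seed=0)
print(len(episodes), episodes[0]["support"]["labels"])
```

Running the function once per value in a list of shot sizes or seeds would mirror the idea behind support_shots_lst and seed_lst.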

If you want to handle other types of datasets, you can add your own code for loading the raw dataset in meta_dataset_generator/

few-shot/meta-episode style data example
{
  "domain_name": [
    {  // episode
      "support": {  // support set
        "seq_ins": [["we", "are", "friends", "."], ["how", "are", "you", "?"]],  // input sequence
        "seq_outs": [["O", "O", "O", "O"], ["O", "O", "O", "O"]],  // output sequence in sequence labeling task
        "labels": [["statement"], ["query"]]  // output labels in classification task
      },
      "query": {  // query set
        "seq_ins": [["we", "are", "friends", "."], ["how", "are", "you", "?"]],
        "seq_outs": [["O", "O", "O", "O"], ["O", "O", "O", "O"]],
        "labels": [["statement"], ["query"]]
      }
    }
  ]
}
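
When constructing episode-style data yourself, a quick consistency check on the layout shown above can catch misaligned fields early. The following is a minimal sketch; the helper `check_episode` is hypothetical and not part of the platform, and it only assumes the three keys shown in the example.

```python
def check_episode(episode):
    """Return True if the support and query parts are internally aligned.

    Raises ValueError when the three fields differ in length or when a
    token sequence and its tag sequence have different lengths.
    """
    for part in ("support", "query"):
        data = episode[part]
        n = len(data["seq_ins"])
        if len(data["seq_outs"]) != n or len(data["labels"]) != n:
            raise ValueError(f"{part}: seq_ins, seq_outs and labels differ in length")
        for tokens, tags in zip(data["seq_ins"], data["seq_outs"]):
            if len(tokens) != len(tags):
                raise ValueError(f"{part}: token/tag length mismatch for {tokens}")
    return True


# The episode from the example, written as a Python dict.
episode = {
    "support": {
        "seq_ins": [["we", "are", "friends", "."], ["how", "are", "you", "?"]],
        "seq_outs": [["O", "O", "O", "O"], ["O", "O", "O", "O"]],
        "labels": [["statement"], ["query"]],
    },
    "query": {
        "seq_ins": [["we", "are", "friends", "."], ["how", "are", "you", "?"]],
        "seq_outs": [["O", "O", "O", "O"], ["O", "O", "O", "O"]],
        "labels": [["statement"], ["query"]],
    },
}
print(check_episode(episode))
```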


The platform is developed by HIT-SCIR. If you have any questions or advice, please contact us (Yutai Hou or Yongkui Lai).

