- Python 3
- TensorFlow 1.8.0
- Keras 2.2.2
- jieba 0.39
- msgpack 0.5.6
- scikit-learn 0.19.1
- sklearn_crfsuite 0.3.6
- spacy 2.0.17
- rasa-nlu 0.13.8
- rasa-core 0.10.4
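- FastText pretrained word vectors trained on Wikipedia. Two versions are available; this example uses the smaller wiki-trained version. Download: https://s3-us-west-1.amazonaws.com/fasttext-vectors/wiki.zh.vec
- The rasa-nlu and rasa-core example configuration and corpus come from https://github.com/zqhZY/_rasa_chatbot. This repository only reorganizes them into one complete example covering NLU and dialogue training and testing, plus online learning and use of the ChatBot.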
- git clone https://github.com/Ma-Dan/rasa_bot
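- Download the FastText pretrained word vectors wiki.zh.vec, then run the following command to prepare the vectors: python -m spacy init-model zh rasa_bot/spacy/wiki_zh --vectors-loc wiki.zh.vec
- The jieba_tokenizer in the current rasa-nlu release loads the user dictionary repeatedly. Overwrite jieba_tokenizer.py under site-packages/rasa_nlu/tokenizers with the latest version.
- (Optional) Use FastText word vectors you trained yourself, or word vectors trained with other tools.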
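
Before running spacy init-model on a downloaded or self-trained vector file, it can help to confirm that the file is in the FastText text format: a header line with the vocabulary size and the dimension, followed by one word and its values per line. The following is a minimal sketch of such a check. The function names and the in-memory sample are hypothetical and are not part of this repository.

```python
import io
from itertools import islice
from typing import Dict, List, TextIO, Tuple


def read_vec_header(handle: TextIO) -> Tuple[int, int]:
    """Read the "<vocab_size> <dimension>" header line of a .vec file."""
    count, dim = handle.readline().split()
    return int(count), int(dim)


def read_vectors(handle: TextIO, limit: int) -> Dict[str, List[float]]:
    """Read the header and at most `limit` vectors, checking each row's length."""
    _, dim = read_vec_header(handle)
    vectors = {}
    for line in islice(handle, limit):
        # Split from the right so that a token containing spaces stays intact.
        parts = line.rstrip().rsplit(" ", dim)
        word, values = parts[0], parts[1:]
        if len(values) != dim:
            raise ValueError("expected {} values for {!r}".format(dim, word))
        vectors[word] = [float(v) for v in values]
    return vectors


# Hypothetical two-word sample in the same layout as a .vec file.
SAMPLE = "2 3\n你好 0.1 0.2 0.3 \n话费 0.4 0.5 0.6 \n"
sample_vectors = read_vectors(io.StringIO(SAMPLE), limit=2)
```

For the real file, pass open("wiki.zh.vec", encoding="utf-8") in place of the in-memory sample and keep limit small, since the full file is large.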
python bot.py train-nlu
> INFO:rasa_nlu.training_data.loading:Training data format of data/nlu.json is rasa_nlu
> INFO:rasa_nlu.training_data.training_data:Training data stats:
> - intent examples: 169 (13 distinct intents)
> - Found intents: 'inform_other_phone', 'inform_current_phone', 'unknown_intent', 'goodbye', 'thanks', 'inform_time', 'inform_item', 'greet', 'request_management', 'confirm', 'inform_package', 'deny', 'request_search'
> - entity examples: 102 (4 distinct entities)
> ...
> Part I: train segmenter
> ...
> Part II: train segment classifier
python -m rasa_nlu.server -c data/nlu_model_config.json --path models
curl -XPOST localhost:5000/parse -d '{"q":"你好", "project":"ivr", "model":"demo"}'
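
The same request can be sent from Python using only the standard library. This is a sketch that mirrors the curl call above: the URL, the "q", "project" and "model" fields and their values are taken from that command, while the helper names are hypothetical. The shape of the JSON response depends on the rasa-nlu version and is not assumed here.

```python
import json
import urllib.request

NLU_URL = "http://localhost:5000/parse"  # host and port from the curl command


def build_parse_request(text, project="ivr", model="demo", url=NLU_URL):
    """Build the POST request that the curl command sends to the NLU server."""
    payload = {"q": text, "project": project, "model": model}
    data = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(url, data=data, method="POST")


def parse(text, **kwargs):
    """Send the request and decode the JSON reply (needs a running server)."""
    request = build_parse_request(text, **kwargs)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server from the previous command running, parse("你好") returns the decoded reply as a dictionary.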
python bot.py train-dialogue
python -m rasa_core.server -p 5005 -d models/dialogue -u models/ivr/demo -o out.log
curl -XPOST localhost:5005/conversations/default/parse -d '{"query":"帮我查话费"}'
curl -XPOST localhost:5005/conversations/default/continue -d '{"executed_action": "utter_greet", "events": []}'
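
The two curl calls above form one dialogue turn: parse sends the user message, and continue reports the action that was executed. A sketch of that loop in Python follows. It assumes that each reply carries a "next_action" field and that the turn ends when that field is "action_listen"; check this against the rasa-core version in use. The helper names are hypothetical, and the HTTP call is passed in as a parameter so the loop can be tried without a server.

```python
import json
import urllib.request

CORE_URL = "http://localhost:5005"  # port from the rasa_core.server command


def http_post(path, payload):
    """POST a JSON payload to the dialogue server and decode the JSON reply."""
    data = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    request = urllib.request.Request(CORE_URL + path, data=data, method="POST")
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


def run_turn(query, post=http_post, sender="default", max_steps=10):
    """Send one user message, then report actions until the server listens again.

    Assumption: replies contain "next_action", and "action_listen" ends the turn.
    Returns the names of the actions reported back to the server.
    """
    base = "/conversations/{}".format(sender)
    reply = post(base + "/parse", {"query": query})
    executed = []
    for _ in range(max_steps):
        action = reply.get("next_action")
        if action is None or action == "action_listen":
            break
        executed.append(action)
        reply = post(base + "/continue", {"executed_action": action, "events": []})
    return executed
```

With the dialogue server running, run_turn("帮我查话费") drives one turn. In a real client, each action would be carried out (for example, showing the utterance to the user) before it is reported through continue.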
python bot.py run
python bot.py run online-train