QA

Information Retrieval (Sub model for Reinforcement Learning)

Environment

Python 2.7

TensorFlow 1.8.0

Files

Data Preprocessing --------- Data.py Data_Visualization.ipynb

TFIDF model ---------------------- TFIDF.py

LM model ---------------------- LM.py

CNN model ---------------------- CNN_model.py CNN_train.py

Inference ------------------ Inference.py

Data preprocessing

Process the original data (Chinese stop-word removal, QA-pair --> pred_QA-pair.csv)

Generate CNN data (pad each sentence to 32 words and apply word embeddings)
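The two preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not the code in Data.py: it assumes sentences are already segmented into words, and the names `remove_stop_words`, `pad_sentence`, and `PAD_TOKEN`, the tiny stop-word set, and the choice to truncate sentences longer than 32 words are all assumptions made for the example.

```python
PAD_TOKEN = "<PAD>"  # hypothetical padding symbol
MAX_LEN = 32         # sentence length stated in this README


def remove_stop_words(tokens, stop_words):
    """Drop every token that appears in the stop-word set."""
    return [t for t in tokens if t not in stop_words]


def pad_sentence(tokens, max_len=MAX_LEN, pad_token=PAD_TOKEN):
    """Truncate (an assumption) or right-pad a token list to exactly max_len."""
    clipped = tokens[:max_len]
    return clipped + [pad_token] * (max_len - len(clipped))


stop_words = {"的", "了", "吗"}  # tiny illustrative set, not the real list
tokens = ["今天", "的", "天气", "好", "吗"]

cleaned = remove_stop_words(tokens, stop_words)
padded = pad_sentence(cleaned)
```

After this step every sentence has the same length, so a batch of sentences can be mapped to a fixed-size matrix of word indices before the embedding lookup.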

Models

TFIDF

Preprocess the data and read the pred data

Generate a TF-IDF representation for each sentence

Calculate the cosine similarity for each question-question pair

Return the answer paired with the most similar question
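The TFIDF steps above might look like the sketch below. It is a standard-library illustration rather than the contents of TFIDF.py: the function names, the smoothed IDF formula (one common variant), and the toy English question-answer pairs are all assumptions.

```python
import math
from collections import Counter


def build_idf(docs):
    """Smoothed inverse document frequency for every term in the corpus."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {t: math.log((1.0 + n) / (1.0 + c)) + 1.0 for t, c in df.items()}


def tfidf_vector(tokens, idf):
    """Sparse TF-IDF vector (dict); terms unseen in the corpus are skipped."""
    tf = Counter(t for t in tokens if t in idf)
    return {t: c * idf[t] for t, c in tf.items()}


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def best_answer(query, questions, answers):
    """Return the answer whose stored question is most similar to the query."""
    idf = build_idf(questions)
    q_vec = tfidf_vector(query, idf)
    scores = [cosine(q_vec, tfidf_vector(q, idf)) for q in questions]
    best = max(range(len(scores)), key=scores.__getitem__)
    return answers[best], scores[best]


questions = [
    ["how", "reset", "password"],
    ["where", "track", "order"],
    ["how", "cancel", "order"],
]
answers = ["Use the reset link.", "Open the orders page.", "Contact support."]

answer, score = best_answer(["reset", "my", "password"], questions, answers)
```

Here the query shares two terms with the first stored question and none with the others, so the first answer is selected.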

LM

Preprocess the data and read the pred data

Calculate the LM similarity for each question-question pair

Return the answer paired with the most similar question
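This README does not say how the LM similarity is defined. One common choice is the query-likelihood score of a unigram language model, and the sketch below assumes that choice together with Jelinek-Mercer smoothing. The function names, the smoothing weight `lam`, and the toy data are hypothetical and may differ from what LM.py does.

```python
import math
from collections import Counter


def query_log_likelihood(query, doc, coll_counts, coll_len, lam=0.5):
    """log P(query | doc): unigram doc model interpolated with the collection model."""
    doc_counts = Counter(doc)
    doc_len = len(doc)
    score = 0.0
    for term in query:
        p_doc = doc_counts[term] / float(doc_len) if doc_len else 0.0
        p_coll = coll_counts[term] / float(coll_len)
        p = lam * p_doc + (1.0 - lam) * p_coll
        if p == 0.0:
            continue  # term unseen in the whole collection is skipped (assumption)
        score += math.log(p)
    return score


def rank_by_lm(query, questions):
    """Return question indices sorted from most to least likely, plus raw scores."""
    coll_counts = Counter(t for q in questions for t in q)
    coll_len = sum(coll_counts.values())
    scores = [
        query_log_likelihood(query, q, coll_counts, coll_len) for q in questions
    ]
    order = sorted(range(len(questions)), key=lambda i: scores[i], reverse=True)
    return order, scores


questions = [
    ["how", "reset", "password"],
    ["where", "track", "order"],
    ["how", "cancel", "order"],
]
order, scores = rank_by_lm(["cancel", "order"], questions)
```

The answer to return is the one paired with `questions[order[0]]`. Smoothing keeps a stored question from scoring negative infinity just because it lacks one query word.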

CNN

Preprocess the data and read the pred data

Generate the word-embedding representation (initialized with Baidu Baike vectors)

Train the CNN model on question-answer pairs

Feed in question-answer pairs and get the most similar answer
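The architecture in CNN_model.py is not described here, so the following is only a toy forward pass showing the general idea of scoring a question against candidate answers with a convolutional sentence encoder. Everything in it is an assumption: random vectors stand in for the Baidu Baike embeddings and for trained filter weights, the sizes are tiny, padding to 32 words is omitted, and a dot product is used as the similarity. It is written in plain Python rather than TensorFlow so it can run without dependencies.

```python
import random

EMBED_DIM = 4    # toy size; pretrained vectors would be much larger
WINDOW = 2       # words covered by each filter (hypothetical)
NUM_FILTERS = 3  # hypothetical

rng = random.Random(0)


def random_vector(size):
    """Stand-in for pretrained embeddings or trained filter weights."""
    return [rng.uniform(-1.0, 1.0) for _ in range(size)]


def encode(tokens, embeddings, filters):
    """Embed tokens, slide each filter over word windows, ReLU, max-pool over time."""
    embedded = [embeddings[t] for t in tokens]
    features = []
    for filt in filters:
        best = 0.0  # starting at 0 implements the ReLU floor
        for start in range(len(embedded) - WINDOW + 1):
            window = [x for vec in embedded[start:start + WINDOW] for x in vec]
            best = max(best, sum(w * x for w, x in zip(filt, window)))
        features.append(best)
    return features


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


vocab = ["how", "reset", "password", "use", "the", "link", "open", "orders", "page"]
embeddings = {w: random_vector(EMBED_DIM) for w in vocab}
filters = [random_vector(EMBED_DIM * WINDOW) for _ in range(NUM_FILTERS)]

question = ["how", "reset", "password"]
candidates = [["use", "the", "reset", "link"], ["open", "the", "orders", "page"]]

q_vec = encode(question, embeddings, filters)
scores = [dot(q_vec, encode(c, embeddings, filters)) for c in candidates]
best_index = max(range(len(scores)), key=scores.__getitem__)
```

With untrained weights the ranking is meaningless; training on question-answer pairs is what would make the highest-scoring candidate a sensible answer.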

Inference

Input: question, top_k

Output: top-k responses from the TFIDF, LM, and CNN models

Note: The data is not uploaded for privacy reasons.
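The input/output contract above could be wired together along these lines. This is a hedged sketch, not the interface of Inference.py: `inference`, `top_k_responses`, and the `scorers` mapping are hypothetical names, and the lambda scorers return fixed numbers in place of the real models.

```python
def top_k_responses(scores, answers, k):
    """Return the k answers with the highest scores, best first."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [answers[i] for i in ranked[:k]]


def inference(question, top_k, scorers, answers):
    """Run every scorer on the question and collect its top-k answers by model name."""
    return {
        name: top_k_responses(score_fn(question), answers, top_k)
        for name, score_fn in scorers.items()
    }


answers = ["A", "B", "C"]

# Stub scorers: each returns one score per stored answer (placeholder values).
scorers = {
    "TFIDF": lambda q: [0.1, 0.9, 0.4],
    "LM": lambda q: [-3.0, -5.0, -1.0],
    "CNN": lambda q: [0.2, 0.1, 0.7],
}

result = inference("example question", 2, scorers, answers)
```

Keeping each model behind the same "question in, one score per answer out" function makes it easy to compare the three rankings side by side.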
