QuestionPairs - ALTEGRAD Project (MVA 2017/2018)

Alexandre Attia, Dan Constantini, Sharone Dayan, Tom Hayat

The aim is to predict which of the provided question pairs contain two questions with the same meaning. We tackle this challenge by combining several kinds of features: word embeddings such as Word2Vec, information from the underlying graph, and hand-crafted text features. These features are then fed to a well-chosen classifier.

To reproduce the pipeline, run the scripts in this order: create_features.py, then xgboost.py, then lstm.py, and finally averaging.py.
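The run order above can be sketched as a small driver script. This is only an illustration: the `PIPELINE` list and the `run_pipeline` helper are hypothetical names that are not part of the repository, and the sketch assumes each script runs without command-line arguments from the repository root.

```python
import subprocess
import sys

# Scripts in the order given in this README.
PIPELINE = [
    "create_features.py",
    "xgboost.py",
    "lstm.py",
    "averaging.py",
]


def run_pipeline(runner=subprocess.run):
    """Run each script in order and stop at the first failure.

    `runner` defaults to subprocess.run; it is a parameter only so the
    order can be checked without executing the real scripts.
    """
    for script in PIPELINE:
        # check=True makes subprocess.run raise CalledProcessError
        # when a script exits with a non-zero status.
        runner([sys.executable, script], check=True)
```

Calling `run_pipeline()` from the repository root would execute the four steps with the current Python interpreter, so a failure in feature creation prevents the models from training on missing inputs.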

Feature engineering

We derive several families of features from the underlying data: text mining features, embedding features, TF-IDF features and PageRank features.
The code for this pre-processing can be found in create_features.py.
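To make two of these feature families concrete, here is a minimal sketch of one text mining feature and one PageRank feature. It is not the code from create_features.py. The function names are hypothetical, the word-overlap measure (Jaccard similarity) is one plausible text mining feature, and the graph construction (one node per question, one undirected edge per question pair) is an assumption about how the underlying graph might be built.

```python
import re
from collections import defaultdict


def word_overlap(question1, question2):
    """Jaccard similarity between the sets of lowercase word tokens."""
    tokens1 = set(re.findall(r"\w+", question1.lower()))
    tokens2 = set(re.findall(r"\w+", question2.lower()))
    if not tokens1 and not tokens2:
        return 0.0
    return len(tokens1 & tokens2) / len(tokens1 | tokens2)


def pagerank(pairs, damping=0.85, iterations=50):
    """PageRank by power iteration on an undirected question graph.

    Assumption: `pairs` is an iterable of (question_id, question_id)
    tuples, and each pair contributes one undirected edge.
    """
    neighbors = defaultdict(set)
    for a, b in pairs:
        neighbors[a].add(b)
        neighbors[b].add(a)

    nodes = list(neighbors)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}

    for _ in range(iterations):
        new_rank = {}
        for node in nodes:
            # Each neighbor shares its rank equally among its own neighbors.
            incoming = sum(rank[nb] / len(neighbors[nb]) for nb in neighbors[node])
            new_rank[node] = (1.0 - damping) / n + damping * incoming
        rank = new_rank
    return rank


# Toy data: question q1 is paired with three other questions.
example_pairs = [("q1", "q2"), ("q1", "q3"), ("q1", "q4")]
example_ranks = pagerank(example_pairs)
```

Under these assumptions, the PageRank scores of the two questions in a pair could be appended to that pair's feature vector alongside the overlap score, so that frequently paired questions are distinguishable from rare ones.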

Model and Comparison

We tried several models to classify each question pair as having the same meaning or a different meaning:

1D CNN
Hand-crafted features + Random Forest
Hand-crafted features + XGBoost
Hand-crafted features + XGBoost and LSTM
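The last combination can be illustrated by blending the predicted probabilities of the two models. The sketch below is an assumption about what such an averaging step might look like, not the content of averaging.py: the function names, the `xgb_weight` parameter and the 0.5 decision threshold are all hypothetical.

```python
def average_predictions(xgb_probs, lstm_probs, xgb_weight=0.5):
    """Weighted average of two lists of duplicate probabilities.

    xgb_weight=0.5 gives a plain arithmetic mean of the two models.
    """
    if len(xgb_probs) != len(lstm_probs):
        raise ValueError("both models must score the same number of pairs")
    if not 0.0 <= xgb_weight <= 1.0:
        raise ValueError("xgb_weight must lie in [0, 1]")
    return [
        xgb_weight * x + (1.0 - xgb_weight) * l
        for x, l in zip(xgb_probs, lstm_probs)
    ]


def to_labels(probs, threshold=0.5):
    """Turn probabilities into 0/1 labels (1 = same meaning)."""
    return [1 if p >= threshold else 0 for p in probs]
```

For instance, `to_labels(average_predictions(xgb_probs, lstm_probs))` would yield one label per question pair. Averaging probabilities rather than hard labels keeps each model's confidence in play, which is the usual motivation for this kind of ensemble.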
