graph embedding + deep learning for multi-label text classification

This project attempts to combine graph embedding and deep learning for the purpose of multi-label text classification.

I compared three methods on Stack Exchange datasets, where the goal is to predict the tags of posts.
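
To make the task concrete, here is a minimal sketch of how post tags could be turned into multi-label targets: each post gets a binary vector with one position per tag. The helper names (`build_tag_index`, `to_multi_hot`) and the example posts are hypothetical and are not taken from this repository's code.

```python
def build_tag_index(tagged_posts):
    """Map every distinct tag to a column index (sorted for reproducibility)."""
    tags = sorted({tag for _, post_tags in tagged_posts for tag in post_tags})
    return {tag: i for i, tag in enumerate(tags)}


def to_multi_hot(post_tags, tag_index):
    """Encode the tags of one post as a 0/1 vector over all known tags."""
    row = [0] * len(tag_index)
    for tag in post_tags:
        row[tag_index[tag]] = 1
    return row


# Hypothetical posts, for illustration only: (text, tags)
posts = [
    ("How do I merge two dicts?", ["python", "dictionary"]),
    ("Segfault when freeing a pointer", ["c", "memory"]),
    ("Reading a file line by line", ["python"]),
]

tag_index = build_tag_index(posts)
targets = [to_multi_hot(tags, tag_index) for _, tags in posts]
```

A post can have several positions set to 1 at once, which is what distinguishes multi-label from ordinary multi-class classification.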

If you want to know more, here are some slides: project-slides.pdf

utility scripts

scripts/preprocessing_pipeline.sh: all the preprocessing, data splitting, feature extraction, etc.
sample_random_walks.py: sample random walks on a graph
extract_embedding_labels.py: extract labels for embedding visualization
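
As a rough illustration of what sampling random walks on a graph involves, the sketch below draws uniform random walks from an adjacency-list graph. It is not the code in `sample_random_walks.py`: the function signature, the parameter names (`walk_length`, `walks_per_node`), and the toy graph are assumptions made for this example.

```python
import random


def sample_random_walks(adjacency, walk_length, walks_per_node, seed=None):
    """Sample uniform random walks starting from every node.

    adjacency: dict mapping each node to a list of its neighbors (assumed format).
    A walk stops early if it reaches a node with no neighbors.
    """
    rng = random.Random(seed)
    walks = []
    for start in adjacency:
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_length:
                neighbors = adjacency[walk[-1]]
                if not neighbors:
                    break  # dead end: keep the shorter walk
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks


# Toy question graph, for illustration only
graph = {"q1": ["q2", "q3"], "q2": ["q1"], "q3": ["q1"], "q4": []}
walks = sample_random_walks(graph, walk_length=5, walks_per_node=2, seed=0)
```

Walks like these are typically treated as sentences and fed to a skip-gram model to learn node embeddings, which is the idea behind DeepWalk.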
