this projects attempts to combine:
- graph embedding
- ConvNet
for the purpose of multi-label text classification.
I compared three methods on stackexchange datasets, where the goal is to predict the tags of posts.
If you wan to know more, here are some slides
scripts/preprocessing_pipeline.sh
: all the preprocessing, data splitting, feature extractio, etcsample_random_walks.py
: sample random walks on a graphextract_embedding_labels.py
: extract labels for embedding visualization
fastxml_experiment.py
: experiment for fastxmlkim_cnn_experiment.py
: experiment for cnncombined_model_experiment.py
: experiment for cnn + deepwalk