Branch: master
Commits on Apr 8, 2019
  1. Remove document classification code that's been moved to hedwig (#186; close #185)

    achyudh authored and daemon committed Apr 8, 2019
    * Remove document classification datasets
    
    * Remove ReutersEvaluator
    
    * Remove ReutersTrainer
    
    * Remove document classification models
    
    * Remove document classification from README.md
Commits on Mar 6, 2019
Commits on Feb 25, 2019
  1. Update README.md (#180)

    achyudh authored and daemon committed Feb 25, 2019
  2. Update README.md (#179)

    achyudh authored and daemon committed Feb 25, 2019
  3. Delete baseline_results.tsv (#178)

    achyudh authored and daemon committed Feb 25, 2019
Commits on Feb 6, 2019
  1. Tidy up some code (#174)

    daemon committed Feb 6, 2019
Commits on Jan 29, 2019
  1. Add ESIM model (#169)

    Victor0118 committed Jan 29, 2019
    * runnable
    
    * update mask
    
    * minor update
    
    * minor update
    
    * Update README.md
    
    * fix multi GPU issue
    
    * add visualize argument
    
    * fix more comments, retab
    
    * remove util
Commits on Jan 25, 2019
  1. Add TAR and AR (#172)

    Ashutosh-Adhikari authored and daemon committed Jan 25, 2019
    * Add TAR and AR
  2. Add document classification models and datasets (#171)

    Ashutosh-Adhikari authored and daemon committed Jan 25, 2019
    * Add ReutersTrainer, ReutersEvaluator options in Factory classes
    
    * Add Reuters to Kim-CNN command line arguments
    
    * Fix SST dataset path according to changes in Kim-CNN args
    
    The dataset path in args.py was made to point at the dataset folder rather than dataset/SST folder. Hence SST folder was added to paths in the SST dataset class
    
    * Add Reuters dataset class, and support in __main__
    
    * Add Reuters dataset trainers and evaluators
    
    * Remove debug print statement in reuters_evaluator
    
    * Fix rounding bug in reuters_trainer and reuters_evaluator
    
    * Add LSTM for baseline text classification measurements
    
    * Add eval metrics for lstm_baseline
    
    * Set batch_first param in lstm_baseline
    
    * Remove onnx args from lstm_baseline
    
    * Pack padded sequences in LSTM_baseline
    
    * Add TensorBoardX support for Reuters trainer
    
    * Add Arxiv Academic Paper Dataset (AAPD)
    
    * Add Hidden Bottleneck Layer to BiLSTM
    
    * Fix packing of padded tensors in Reuters
    
    * Add cmdline args for Hidden Bottleneck Layer for BiLSTM
    
    * Include pre-padding lengths in AAPD dataset
    
    * Remove duplication of preprocessing code in AAPD
    
    * Remove batch_size condition in ReutersTrainer
    
    * Add ignore_lengths option to ReutersTrainer and ReutersEvaluator
    
    * Add AAPDCharQuantized and ReutersCharQuantized
    
    * Rename Reuters_hierarchical to ReutersHierarchical
    
    * Add CharacterCNN for document classification
    
    * Update README.md for CharacterCNN
    
    * Fix table in README.md for CharacterCNN
    
    * Add AAPDHierarchical for HAN
    
    * Update HAN for changes in Reuters dataset endpoints
    
    * Fix bug in CharCNN when running on CPU
    
    * Add AAPD dataset support for KimCNN
    
    * Fix dataset paths for SST-1
    
    * Fix dimensions of FC1 in CharCNN
    
    * Add model checkpointing for Reuters based on F1
    
    * Refactor LSTM baseline __main__
    
    * Add precision, recall and F1 to Reuters evaluator
    
    * Checkpoint only at the end of an epoch for ReutersTrainer
    
    Add detailed log printing for dev evaluations
    
    * Fix log_template and dev_log_template in ReutersTrainer
    
    * Add IMDB dataset
    
    * Fix duplicate printing of header in ReutersTrainer
    
    * Add support for single_label datasets in ReutersTrainer
    
    * Add support for IMDB dataset in lstm_baseline and lstm_reg
    
    * Fix evaluator call in main method of HAN
    
    * Add IMDB for HAN
    
    * Fix for single_label
    
    * Fix evaluate_dataset method for single_label datasets
    
    * Reduce default patience to 5 epochs before early stopping
    
    * Revert change to save_state rather than the entire model
    
    * Add Yelp 2018 dataset
    
    * Integrate Yelp2018 with LSTM baseline
    
    * Replace Yelp2018 with Yelp2014 dataset
    
    * Add Yelp2014 to LSTM Baseline
    
    * Integrate Yelp14 into LSTM Regularization
    
    * Remove dropout in HBL for LSTM Baseline and Reg
    
    * Add Yelp for HAN
    
    * Fix the saving issue for HAN
    
    * Fix loading for HAN
    
    * Fix typo in ReutersEvaluator
    
    * Print to STDOUT rather than logger
    
    * Print XML-CNN eval to STDOUT rather than logger
    
    * Update max_length for IMDB dataset
    
    * Add single_label support for char_cnn
    
    * Fix evaluation method for char_cnn
    
    * Remove unwanted parameters from ReutersTrainer and ReutersEval
    
    * Fix code formatting in lstm_reg/args
    
    * Add support for IMDB and Yelp in KimCNN
    
    * Fix single_label incorporation
    
    * Remove unnecessary conditions
    
    * Fix num_classes in Yelp2014
    
    * Add single_label support for XML-CNN
    
    * Fix call to evaluator in XML-CNN
    
    * Address PEP8 issues
    
    * Address PEP8 issues
    
    * Address PEP8 issues
    
    * Address PEP8 issues
Commits on Dec 18, 2018
  1. Add SSE model (#168)

    Victor0118 committed Dec 18, 2018
    * runnable
    
    * add util file
    
    * update readme
    
    * update final layer and add model name
    
    * update argument
    
    * update readme, delete useless args
    
    * fix comments
    
    * fix more comments
  2. add DecAtt model (#170)

    Victor0118 committed Dec 18, 2018
    * add DecAtt model
    
    * update readme, add dropout
    
    * fix more comments
    
    * add trecqa, wikiqa results
    
    * remove extraneous comment
Commits on Dec 3, 2018
  1. add separator (#167)

    Victor0118 committed Dec 3, 2018
Commits on Nov 13, 2018
Commits on Nov 11, 2018
  1. Neural Document Classification (#159)

    Achyudh Ram authored and Impavidity committed Nov 11, 2018
    * Add ReutersTrainer, ReutersEvaluator options in Factory classes
    
    * Add Reuters to Kim-CNN command line arguments
    
    * Fix SST dataset path according to changes in Kim-CNN args
    
    The dataset path in args.py was made to point at the dataset folder rather than dataset/SST folder. Hence SST folder was added to paths in the SST dataset class
    
    * Add Reuters dataset class, and support in __main__
    
    * Add Reuters dataset trainers and evaluators
    
    * Remove debug print statement in reuters_evaluator
    
    * Fix rounding bug in reuters_trainer and reuters_evaluator
    
    * Add LSTM for baseline text classification measurements
    
    * Add eval metrics for lstm_baseline
    
    * Set batch_first param in lstm_baseline
    
    * Remove onnx args from lstm_baseline
    
    * Pack padded sequences in LSTM_baseline
    
    * Add TensorBoardX support for Reuters trainer
    
    * Add Arxiv Academic Paper Dataset (AAPD)
    
    * Add Hidden Bottleneck Layer to BiLSTM
    
    * Fix packing of padded tensors in Reuters
    
    * Add cmdline args for Hidden Bottleneck Layer for BiLSTM
    
    * Include pre-padding lengths in AAPD dataset
    
    * Remove duplication of preprocessing code in AAPD
    
    * Remove batch_size condition in ReutersTrainer
    
    * Add ignore_lengths option to ReutersTrainer and ReutersEvaluator
    
    * Add AAPDCharQuantized and ReutersCharQuantized
    
    * Rename Reuters_hierarchical to ReutersHierarchical
    
    * Add CharacterCNN for document classification
    
    * Update README.md for CharacterCNN
    
    * Fix table in README.md for CharacterCNN
    
    * Add AAPDHierarchical for HAN
    
    * Update HAN for changes in Reuters dataset endpoints
    
    * Fix bug in CharCNN when running on CPU
    
    * Add AAPD dataset support for KimCNN
    
    * Fix dataset paths for SST-1
    
    * Fix dimensions of FC1 in CharCNN
    
    * Add model checkpointing for Reuters based on F1
    
    * Refactor LSTM baseline __main__
    
    * Add precision, recall and F1 to Reuters evaluator
    
    * Checkpoint only at the end of an epoch for ReutersTrainer
    
    Add detailed log printing for dev evaluations
    
    * Fix log_template and dev_log_template in ReutersTrainer
    
    * Add IMDB dataset
    
    * Add support for single_label datasets in ReutersTrainer
    
    * Add support for IMDB dataset in lstm_baseline and lstm_reg
Commits on Nov 10, 2018
  1. Fix HAN for batch_size 1 (#161)

    Ashutosh-Adhikari authored and daemon committed Nov 10, 2018
  2. Add AAPD for XML_CNN (#160)

    Ashutosh-Adhikari authored and daemon committed Nov 10, 2018
    * Add AAPD for XMLCNN
    
    * Add kwargs for XML
Commits on Nov 7, 2018
  1. Add model checkpointing to ReutersTrainer (#158)

    Achyudh Ram authored and daemon committed Nov 7, 2018
    * Add ReutersTrainer, ReutersEvaluator options in Factory classes
    
    * Add Reuters to Kim-CNN command line arguments
    
    * Fix SST dataset path according to changes in Kim-CNN args
    
    The dataset path in args.py was made to point at the dataset folder rather than dataset/SST folder. Hence SST folder was added to paths in the SST dataset class
    
    * Add Reuters dataset class, and support in __main__
    
    * Add Reuters dataset trainers and evaluators
    
    * Remove debug print statement in reuters_evaluator
    
    * Fix rounding bug in reuters_trainer and reuters_evaluator
    
    * Add LSTM for baseline text classification measurements
    
    * Add eval metrics for lstm_baseline
    
    * Set batch_first param in lstm_baseline
    
    * Remove onnx args from lstm_baseline
    
    * Pack padded sequences in LSTM_baseline
    
    * Add TensorBoardX support for Reuters trainer
    
    * Add Arxiv Academic Paper Dataset (AAPD)
    
    * Add Hidden Bottleneck Layer to BiLSTM
    
    * Fix packing of padded tensors in Reuters
    
    * Add cmdline args for Hidden Bottleneck Layer for BiLSTM
    
    * Include pre-padding lengths in AAPD dataset
    
    * Remove duplication of preprocessing code in AAPD
    
    * Remove batch_size condition in ReutersTrainer
    
    * Add ignore_lengths option to ReutersTrainer and ReutersEvaluator
    
    * Add AAPDCharQuantized and ReutersCharQuantized
    
    * Rename Reuters_hierarchical to ReutersHierarchical
    
    * Add CharacterCNN for document classification
    
    * Update README.md for CharacterCNN
    
    * Fix table in README.md for CharacterCNN
    
    * Add AAPDHierarchical for HAN
    
    * Update HAN for changes in Reuters dataset endpoints
    
    * Fix bug in CharCNN when running on CPU
    
    * Add AAPD dataset support for KimCNN
    
    * Fix dataset paths for SST-1
    
    * Fix dimensions of FC1 in CharCNN
    
    * Add model checkpointing for Reuters based on F1
    
    * Refactor LSTM baseline __main__
    
    * Add precision, recall and F1 to Reuters evaluator
Commits on Nov 6, 2018
  1. Add regularization modules for LSTM baseline (#156)

    Ashutosh-Adhikari authored and daemon committed Nov 6, 2018
    * Add Regularization Modules for LSTM
    
    * Update Reuters Trainer and Evaluator for regularization
    
    * Remove unnecessary comments
    
    * Comply with PEP8
    
    * Comply import order with PEP8
    
    * Fix typos in README.md
    
    * Comply with PEP8
    
    * Add BSD 3-Clause Licence
    
    * Remove deprecated call to Variable for PyTorch 0.4
    
    * Update dataset selection in main
    
    * Remove block comments
Commits on Nov 5, 2018
  1. Fix KimCNN for SST, AAPD datasets (#157)

    Achyudh Ram authored and Impavidity committed Nov 5, 2018
    * Add ReutersTrainer, ReutersEvaluator options in Factory classes
    
    * Add Reuters to Kim-CNN command line arguments
    
    * Fix SST dataset path according to changes in Kim-CNN args
    
    The dataset path in args.py was made to point at the dataset folder rather than dataset/SST folder. Hence SST folder was added to paths in the SST dataset class
    
    * Add Reuters dataset class, and support in __main__
    
    * Add Reuters dataset trainers and evaluators
    
    * Remove debug print statement in reuters_evaluator
    
    * Fix rounding bug in reuters_trainer and reuters_evaluator
    
    * Add LSTM for baseline text classification measurements
    
    * Add eval metrics for lstm_baseline
    
    * Set batch_first param in lstm_baseline
    
    * Remove onnx args from lstm_baseline
    
    * Pack padded sequences in LSTM_baseline
    
    * Add TensorBoardX support for Reuters trainer
    
    * Add Arxiv Academic Paper Dataset (AAPD)
    
    * Add Hidden Bottleneck Layer to BiLSTM
    
    * Fix packing of padded tensors in Reuters
    
    * Add cmdline args for Hidden Bottleneck Layer for BiLSTM
    
    * Include pre-padding lengths in AAPD dataset
    
    * Remove duplication of preprocessing code in AAPD
    
    * Remove batch_size condition in ReutersTrainer
    
    * Add ignore_lengths option to ReutersTrainer and ReutersEvaluator
    
    * Add AAPDCharQuantized and ReutersCharQuantized
    
    * Rename Reuters_hierarchical to ReutersHierarchical
    
    * Add CharacterCNN for document classification
    
    * Update README.md for CharacterCNN
    
    * Fix table in README.md for CharacterCNN
    
    * Add AAPDHierarchical for HAN
    
    * Update HAN for changes in Reuters dataset endpoints
    
    * Fix bug in CharCNN when running on CPU
    
    * Add AAPD dataset support for KimCNN
    
    * Fix dataset paths for SST-1
Commits on Oct 28, 2018
  1. Add CharacterCNN for Document Classification (#155)

    Achyudh Ram authored and Impavidity committed Oct 28, 2018
    * Add ReutersTrainer, ReutersEvaluator options in Factory classes
    
    * Add Reuters to Kim-CNN command line arguments
    
    * Fix SST dataset path according to changes in Kim-CNN args
    
    The dataset path in args.py was made to point at the dataset folder rather than dataset/SST folder. Hence SST folder was added to paths in the SST dataset class
    
    * Add Reuters dataset class, and support in __main__
    
    * Add Reuters dataset trainers and evaluators
    
    * Remove debug print statement in reuters_evaluator
    
    * Fix rounding bug in reuters_trainer and reuters_evaluator
    
    * Add LSTM for baseline text classification measurements
    
    * Add eval metrics for lstm_baseline
    
    * Set batch_first param in lstm_baseline
    
    * Remove onnx args from lstm_baseline
    
    * Pack padded sequences in LSTM_baseline
    
    * Add TensorBoardX support for Reuters trainer
    
    * Add Arxiv Academic Paper Dataset (AAPD)
    
    * Add Hidden Bottleneck Layer to BiLSTM
    
    * Fix packing of padded tensors in Reuters
    
    * Add cmdline args for Hidden Bottleneck Layer for BiLSTM
    
    * Include pre-padding lengths in AAPD dataset
    
    * Remove duplication of preprocessing code in AAPD
    
    * Remove batch_size condition in ReutersTrainer
    
    * Add ignore_lengths option to ReutersTrainer and ReutersEvaluator
    
    * Add AAPDCharQuantized and ReutersCharQuantized
    
    * Rename Reuters_hierarchical to ReutersHierarchical
    
    * Add CharacterCNN for document classification
    
    * Update README.md for CharacterCNN
    
    * Fix table in README.md for CharacterCNN
    
    * Add AAPDHierarchical for HAN
    
    * Update HAN for changes in Reuters dataset endpoints
    
    * Fix bug in CharCNN when running on CPU
Commits on Oct 26, 2018
  1. Replication of SOTA for Reuters Dataset (#152)

    Achyudh Ram authored and Impavidity committed Oct 26, 2018
    * Add ReutersTrainer, ReutersEvaluator options in Factory classes
    
    * Add Reuters to Kim-CNN command line arguments
    
    * Fix SST dataset path according to changes in Kim-CNN args
    
    The dataset path in args.py was made to point at the dataset folder rather than dataset/SST folder. Hence SST folder was added to paths in the SST dataset class
    
    * Add Reuters dataset class, and support in __main__
    
    * Add Reuters dataset trainers and evaluators
    
    * Remove debug print statement in reuters_evaluator
    
    * Fix rounding bug in reuters_trainer and reuters_evaluator
    
    * Add LSTM for baseline text classification measurements
    
    * Add eval metrics for lstm_baseline
    
    * Set batch_first param in lstm_baseline
    
    * Remove onnx args from lstm_baseline
    
    * Pack padded sequences in LSTM_baseline
    
    * Add TensorBoardX support for Reuters trainer
    
    * Add Arxiv Academic Paper Dataset (AAPD)
    
    * Add Hidden Bottleneck Layer to BiLSTM
    
    * Fix packing of padded tensors in Reuters
    
    * Add cmdline args for Hidden Bottleneck Layer for BiLSTM
    
    * Include pre-padding lengths in AAPD dataset
    
    * Remove duplication of preprocessing code in AAPD
    
    * Remove batch_size condition in ReutersTrainer
Commits on Oct 25, 2018
  1. Add HAN and XML_CNN for Doc Classification (#154)

    Ashutosh-Adhikari authored and Victor0118 committed Oct 25, 2018
    * Add Reuters option in common.dataset
    
    * Add Reuters option in common.dataset
    
    * Add HAN model
    
    * Add XML-CNN
    
    * Add HAN
    
    * Add Hierarchical tokenization for Reuters
    
    * Add README for HAN
    
    * Add XML Readme
    
    * Update HAN Readme
Commits on Oct 11, 2018
  1. Baseline LSTM implementation (#150)

    Achyudh Ram authored and Impavidity committed Oct 11, 2018
    * Add ReutersTrainer, ReutersEvaluator options in Factory classes
    
    * Add Reuters to Kim-CNN command line arguments
    
    * Fix SST dataset path according to changes in Kim-CNN args
    
    The dataset path in args.py was made to point at the dataset folder rather than dataset/SST folder. Hence SST folder was added to paths in the SST dataset class
    
    * Add Reuters dataset class, and support in __main__
    
    * Add Reuters dataset trainers and evaluators
    
    * Remove debug print statement in reuters_evaluator
    
    * Fix rounding bug in reuters_trainer and reuters_evaluator
    
    * Add LSTM for baseline text classification measurements
    
    * Add eval metrics for lstm_baseline
    
    * Set batch_first param in lstm_baseline
    
    * Remove onnx args from lstm_baseline
Commits on Oct 9, 2018
  1. Add sts2014 and quora evaluators in common/evaluators/ (#151)

    likicode authored and Victor0118 committed Oct 9, 2018
    * add SNLI dataset
    
    * add STS-2014
    
    * add STS-2014
    
    * add trainers and evaluators for Quora
    
    * add quora in datasets/
    
    * process the merge conflict in common/dataset.py
    
    * add sts2014_evaluator.py and quora_evaluator.py
Commits on Oct 8, 2018
  1. add SNLI / STS-2014 / Quora dataset (#148)

    likicode authored and Impavidity committed Oct 8, 2018
    * add SNLI dataset
    
    * add STS-2014
    
    * add trainers and evaluators for Quora
    
    * add quora in datasets/
    
    * process the merge conflict in common/dataset.py
Commits on Oct 7, 2018
  1. Add Reuters dataset option for common.dataset (#149)

    Ashutosh-Adhikari authored and Impavidity committed Oct 7, 2018
    * Add Reuters option in common.dataset
Commits on Oct 3, 2018
  1. WIP: Add Reuters-21578 dataset (#147)

    Achyudh Ram authored and Impavidity committed Oct 3, 2018
    * Add ReutersTrainer, ReutersEvaluator options in Factory classes
    
    * Add Reuters to Kim-CNN command line arguments
    
    * Fix SST dataset path according to changes in Kim-CNN args
    
    The dataset path in args.py was made to point at the dataset folder rather than dataset/SST folder. Hence SST folder was added to paths in the SST dataset class
    
    * Add Reuters dataset class, and support in __main__
    
    * Add Reuters dataset trainers and evaluators
    
    * Remove debug print statement in reuters_evaluator
    
    * Fix rounding bug in reuters_trainer and reuters_evaluator
Commits on Sep 8, 2018
  1. Add twitter url dataset (#145)

    Victor0118 committed Sep 8, 2018
    * add prediction/qrel files dump option
    
    * fix comment
    
    * add twitter-url dataset, minor refactor
    
    * fix minor error
Commits on Aug 12, 2018
  1. Add PIT2015 dataset (#141)

    Victor0118 committed Aug 12, 2018
    * add prediction/qrel files dump option
    
    * fix comment
    
    * upgrade to torchtext 0.3
    
    * remove ngram
    
    * update device
    
    * minor update
    
    * add pit2015
    
    * revert update torchtext
    
    * revert update torchtext
    
    * revert update torchtext
Commits on Aug 4, 2018
  1. Make Kim CNN ONNX-exportable (#136)

    tuzhucheng committed Aug 4, 2018
    * Kim CNN - only set embedding for corresponding mode
    
    * Kim CNN ONNX Export
    
    * Specify dummy ONNX input size from command line
Commits on Jul 12, 2018
  1. Tune Kim CNN for SST-2 and Improve SST-1 Results with Dataset/Initialization Changes (#133)

    tuzhucheng committed Jul 12, 2018
    * SST change min_freq and Kim CNN init distribution
    
    * Add tuned results for SST-1 and SST-2
    
    * Fix typo
Commits on Jul 11, 2018
  1. Add Apache 2.0 license (#132)

    daemon authored and lintool committed Jul 11, 2018
Commits on Jul 3, 2018
  1. Kim CNN OOP Refactoring (#124)

    tuzhucheng committed Jul 3, 2018
    * Nuke obsolete artifacts
    
    * Refactor Kim CNN
    
    * Make kim_cnn a module
    
    * Fix bugs
    
    * Update README
    
    * Add choices to dataset arg
    
    * update for sst2
    
    update sst.py
    
    update sst.py
    
    * Add Kim CNN dataset choices to args.py
    
    * Update tuned SST-1 accuracy
  2. Update VDPWI README with new results (#131)

    likicode authored and daemon committed Jul 3, 2018
    - Update with results on SICK, TrecQA, and WikiQA datasets