From 19e19151d673ed440ea54761803996f659ff8d68 Mon Sep 17 00:00:00 2001
From: Chris Little
Date: Sun, 14 Oct 2018 18:16:11 -0700
Subject: [PATCH] more minor fixes (giving up on mybinder running classification)

---
 binder/Text Classification of Drug Reviews.ipynb | 2 ++
 binder/requirements.txt                          | 2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/binder/Text Classification of Drug Reviews.ipynb b/binder/Text Classification of Drug Reviews.ipynb
index ac5d51c15..e46afa6ee 100644
--- a/binder/Text Classification of Drug Reviews.ipynb
+++ b/binder/Text Classification of Drug Reviews.ipynb
@@ -11,6 +11,8 @@
     "The text classification task below uses customer review text to predict the condition for which the drug in question was prescribed. No other data (the drug name, for example) is used in this task.\n",
     "\n",
     "### Caveats\n",
+    "Unfortunately, this notebook crashes near the end when run on mybinder.org. But it runs fine on Google Colab, though you'll need to add a cell at the beginning to call `!pip install abydos`.\n",
+    "\n",
     "This is a toy problem. I have taken a dataset that was already divided into training & test sets and used the test set for validation, not as a genuine test set. On the other hand, I haven't done much hyperparameter tuning. Indeed, all of the classifiers used below have identical parameters: `LinearSVC(loss='hinge', C=1, max_iter=2000, random_state=1337)`.\n",
     "\n",
     "However, Abydos was used in a [winning submission](https://www.kaggle.com/c/anlp-2015-classification-assignment/leaderboard) to a Kaggle (InClass) competition in UC Berkeley's 2015 Applied NLP course. The same [notebook](https://gist.github.com/chrislit/3852eed7cce4b3544db2) (but with its Pseudo-SSK classifier disabled due to memory requirements) was applied to [the following year's competition](https://www.kaggle.com/c/anlp-2016-classification-assignment/leaderboard), after the competition deadline, and beat that year's leader (0.89535 to 0.89369) without any tuning. So... Abydos can be useful in generalizing text classification tasks.\n",
diff --git a/binder/requirements.txt b/binder/requirements.txt
index 28a7a1c8e..ba43f4375 100644
--- a/binder/requirements.txt
+++ b/binder/requirements.txt
@@ -2,6 +2,6 @@ abydos
 numpy
 pandas
 scikit-learn
+tensorflow
 keras
 nltk
-