reorder exercises
ogrisel committed Aug 9, 2011
1 parent dc27354 commit d2006f4
Showing 5 changed files with 23 additions and 14 deletions.
File renamed without changes.
File renamed without changes.
37 changes: 23 additions & 14 deletions tutorial/exercises.rst
@@ -2,7 +2,12 @@ Exercises
=========

To do the exercises, copy the content of the 'skeletons' folder as
a new folder named 'workspace'.
a new folder named 'workspace'::

% cp -r skeletons workspace

You can then edit the content of the workspace without fear of losing
the original exercise instructions.

Then fire an ipython shell and run the work-in-progress script with::

@@ -13,33 +18,37 @@ mortem ipdb session.

Refine the implementation and iterate until the exercise is solved.

**For each exercise, the skeleton file provides all the necessary import
statements, boilerplate code to load the data and sample code to evaluate
the predictive accuracy of the model.**

Exercise 1: Sentiment Analysis on movie reviews
-----------------------------------------------

- Write a text classification pipeline to classify movie reviews as either
positive or negative.
Exercise 1: Language identification
-----------------------------------

- Find a good set of parameters using grid search.
- Write a text classification pipeline using a custom preprocessor and
``CharNGramAnalyzer`` using data from Wikipedia articles as training set.

- Evaluate the performance on a held out test set.
- Evaluate the performance on some held out test set.

ipython command line::

%run workspace/exercise_01_sentiment.py data/movie_reviews/txt_sentoken/
%run workspace/exercise_01_language_train_model.py data/languages/paragraphs/


Exercise 2: Language identification
-----------------------------------
Exercise 2: Sentiment Analysis on movie reviews
-----------------------------------------------

- Write a text classification pipeline using a custom preprocessor and
``CharNGramAnalyzer`` using data from Wikipedia articles as training set.
- Write a text classification pipeline to classify movie reviews as either
positive or negative.

- Evaluate the performance on some held out test set.
- Find a good set of parameters using grid search.

- Evaluate the performance on a held out test set.

ipython command line::

%run workspace/exercise_02_language_train_model.py data/languages/paragraphs/
%run workspace/exercise_02_sentiment.py data/movie_reviews/txt_sentoken/


Exercise 3: CLI text classification utility