mstewart/kaggle-evergreen

Working on the Kaggle StumbleUpon evergreen classification challenge: http://www.kaggle.com/c/stumbleupon

1. Apply an SVM to the pre-existing features supplied with the data.
(Place every data point in a vector space, then find the maximum-margin hyperplane that separates the data points of the two classes.)
This probably won't give good results here, but it is a good exercise.
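
A minimal sketch of the idea, assuming a linear SVM trained by plain subgradient descent on the L2-regularised hinge loss (a stand-in for a real SVM solver, not the method used in this repo). The toy data, the function names `train_linear_svm` and `predict`, and the settings `lam`, `eta` and `epochs` are all made up for illustration.

```python
# Toy, made-up data: two features per page, label +1 (evergreen) or -1 (not).
X = [[2, 2], [-2, -2], [3, 3], [-3, -3], [2, 3], [-2, -3], [3, 2], [-3, -2]]
y = [1, -1, 1, -1, 1, -1, 1, -1]


def train_linear_svm(X, y, lam=0.01, eta=0.1, epochs=100):
    """Fit weights w and bias b for the separating plane w.x + b = 0."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            # Gradient of the L2 penalty: shrink the weights a little each step.
            w = [(1 - eta * lam) * wj for wj in w]
            # Subgradient of the hinge loss: only points with margin < 1 push back.
            if margin < 1:
                w = [wj + eta * yi * xj for wj, xj in zip(w, xi)]
                b += eta * yi
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on which side of the plane x falls."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1


w, b = train_linear_svm(X, y)
print("weights:", w, "bias:", b)
print("training predictions:", [predict(w, b, xi) for xi in X])
```

On the real data the supplied feature columns would replace `X`, and they would need scaling first, since the hinge loss is sensitive to feature magnitudes.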


2. Principal component discriminant analysis:
Check whether a single feature dominates the result.
(Oscar says this is isomorphic to LU decomposition. Need to check this against an example, since PCA is normally computed via an eigendecomposition or SVD.)
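
A rough sketch of the "does one feature dominate" check, covering only the principal component step: it builds the covariance matrix and finds the leading component by power iteration. The data and the names `covariance_matrix` and `first_component` are hypothetical, and the iteration count is an arbitrary choice.

```python
# Toy, made-up feature rows; column 0 has a much larger spread than the others.
rows = [
    [10, 1.00, 0.50],
    [20, 1.10, 0.40],
    [30, 0.90, 0.60],
    [40, 1.00, 0.50],
    [50, 1.05, 0.45],
]


def covariance_matrix(rows):
    """Sample covariance matrix of the columns of rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def first_component(cov, iters=200):
    """Leading eigenvector of cov by power iteration, plus its variance share."""
    d = len(cov)
    v = [1.0] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    # Rayleigh quotient v.Cv gives the eigenvalue for the unit vector v.
    eigenvalue = sum(
        v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)
    )
    total_variance = sum(cov[i][i] for i in range(d))
    return v, eigenvalue / total_variance


component, share = first_component(covariance_matrix(rows))
print("first component:", component)
print("share of variance explained:", share)
```

A component that loads almost entirely on one column means that column dominates the variance. Without standardising the columns first, this mostly reflects the units of the raw features.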


3. Extract additional features from the raw text.
Word frequency: see whether particular words are indicators of evergreen-ness.
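
One possible way to look for indicator words, sketched under assumptions: count how many evergreen and non-evergreen documents each word appears in, then rank words by a smoothed log-ratio of those rates. The sample documents, the labelling (1 = evergreen, 0 = not) and the names `tokenize` and `indicator_words` are invented for illustration.

```python
import math
import re
from collections import Counter

# Toy, made-up documents with labels: 1 = evergreen, 0 = not evergreen.
docs = [
    "easy chicken recipe with garlic",
    "classic bread recipe for beginners",
    "election results announced today",
    "breaking news: markets fall today",
]
labels = [1, 1, 0, 0]


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def indicator_words(docs, labels, top=3):
    """Rank words by how much more often they occur in evergreen documents."""
    pos, neg = Counter(), Counter()
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    for text, label in zip(docs, labels):
        words = set(tokenize(text))  # document frequency: count each word once
        (pos if label == 1 else neg).update(words)
    scores = {}
    for word in set(pos) | set(neg):
        # Add-one smoothing keeps the ratio finite for one-sided words.
        p = (pos[word] + 1) / (n_pos + 2)
        q = (neg[word] + 1) / (n_neg + 2)
        scores[word] = math.log(p / q)
    # Highest score first; ties are broken alphabetically.
    return sorted(scores, key=lambda word: (-scores[word], word))[:top]


print(indicator_words(docs, labels, top=3))
```

Sorting in the opposite direction would list the words that point away from evergreen-ness.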


4. Pull apart the URL.
At a minimum, split it into domain and non-domain parts.
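
A small sketch of the split using the standard library's `urllib.parse.urlsplit`. The function name `url_features`, the chosen feature names and the example URL are hypothetical. "Domain" here means the hostname with a leading `www.` dropped, not the registrable domain.

```python
from urllib.parse import urlsplit


def url_features(url):
    """Split a URL into a domain part and a few non-domain features."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    segments = [s for s in parts.path.split("/") if s]
    return {
        "domain": host,
        "path": parts.path,
        "path_depth": len(segments),  # number of non-empty path segments
        "has_query": bool(parts.query),
    }


# Made-up example URL.
features = url_features("http://www.example.com/recipes/2012/easy-bread?ref=home")
print(features)
```

The path segments could also be fed into the word-frequency step, since URL slugs often contain the title words.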


5. CART (classification and regression trees): an algorithm for building binary classification trees.
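
The core step of CART is choosing the split that leaves the two child nodes as pure as possible. Below is a sketch of that single step, assuming binary 0/1 labels and Gini impurity as the criterion. The data and the names `gini` and `best_split` are made up. A full tree would apply `best_split` recursively to each side.

```python
# Toy, made-up data: two numeric features per row, binary label.
rows = [[1, 5], [2, 3], [3, 8], [4, 1]]
labels = [0, 0, 1, 1]


def gini(labels):
    """Gini impurity of a list of 0/1 labels: 2p(1-p), p = share of 1s."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_split(rows, labels):
    """Return (weighted_gini, feature_index, threshold) of the best split."""
    best = None
    n = len(rows)
    for j in range(len(rows[0])):
        values = sorted(set(r[j] for r in rows))
        # Candidate thresholds are midpoints between consecutive values.
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[j] <= threshold]
            right = [y for r, y in zip(rows, labels) if r[j] > threshold]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[0]:
                best = (score, j, threshold)
    return best


print(best_split(rows, labels))
```

In this toy data, feature 0 separates the classes perfectly, so the best split has a weighted impurity of zero.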
