Source folder -

     Holds the paths to the training and test files. The git repository stores only the tokenized dictionaries, not the full XML files provided on the website.

     Contains utilities to mine the XML training files. For example: given a conversation id, get all the conversations.

     Creates a self-designed feature set with 4 parameters. These parameters were cross-validated and are hard-coded. The features are then used to train a naive Bayes classifier.
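
The training step described above can be sketched roughly as follows. This is only an illustration, not the code in this repository: it assumes each conversation has already been reduced to a vector of 4 binary features (the actual features are not described here), and the names `train_naive_bayes` and `predict` are hypothetical.

```python
import math
from collections import defaultdict


def train_naive_bayes(samples, labels, alpha=1.0):
    """Fit a Bernoulli naive Bayes model on binary feature vectors.

    samples: list of equal-length lists of 0/1 values (hypothetical features).
    labels:  list of class labels, one per sample.
    alpha:   Laplace smoothing constant (an assumption, not from the README).
    """
    n_features = len(samples[0])
    class_counts = defaultdict(int)
    feature_counts = defaultdict(lambda: [0] * n_features)
    for x, y in zip(samples, labels):
        class_counts[y] += 1
        for i, v in enumerate(x):
            feature_counts[y][i] += v

    total = len(samples)
    model = {}
    for y, n_y in class_counts.items():
        log_prior = math.log(n_y / total)
        # Smoothed estimate of P(feature_i = 1 | class y).
        probs = [
            (feature_counts[y][i] + alpha) / (n_y + 2 * alpha)
            for i in range(n_features)
        ]
        model[y] = (log_prior, probs)
    return model


def predict(model, x):
    """Return the class with the highest log posterior for feature vector x."""
    best_label, best_score = None, -math.inf
    for y, (log_prior, probs) in model.items():
        score = log_prior
        for v, p in zip(x, probs):
            score += math.log(p if v else 1.0 - p)
        if score > best_score:
            best_label, best_score = y, score
    return best_label
```

Working in log space avoids underflow when many small probabilities are multiplied, and the smoothing constant keeps a feature that never appears in one class from driving its probability to zero.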

Results folder -

     The best result so far is 93% true positives
                               25% false positives
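
     The README does not say how these percentages were computed. Assuming they are the usual true positive rate, TP / (TP + FN), and false positive rate, FP / (FP + TN), a minimal sketch of the calculation could look like this. The function name `rates` is hypothetical.

```python
def rates(y_true, y_pred, positive=1):
    """Return (true positive rate, false positive rate) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    # Guard against empty classes to avoid division by zero.
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr
```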

     Scope for improvement - make the features take frequency into account when the naive Bayes classifier is constructed.
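
     One common way to make a naive Bayes classifier frequency-aware is the multinomial variant, where each token's count contributes to the score. The sketch below is an assumption about what this improvement might look like, not something implemented in this repository. It takes each conversation as a list of tokens, and the names `train_multinomial_nb` and `predict_multinomial` are hypothetical.

```python
import math
from collections import Counter


def train_multinomial_nb(docs, labels, alpha=1.0):
    """Fit a multinomial naive Bayes model on tokenized documents.

    docs:   list of token lists (one per conversation).
    labels: list of class labels, one per document.
    alpha:  Laplace smoothing constant (an assumption).
    """
    vocab = {tok for doc in docs for tok in doc}
    class_docs = Counter(labels)
    token_counts = {y: Counter() for y in class_docs}
    for doc, y in zip(docs, labels):
        token_counts[y].update(doc)  # accumulates token frequencies per class

    model = {}
    for y in class_docs:
        denom = sum(token_counts[y].values()) + alpha * len(vocab)
        log_likelihood = {
            tok: math.log((token_counts[y][tok] + alpha) / denom)
            for tok in vocab
        }
        model[y] = (math.log(class_docs[y] / len(docs)), log_likelihood)
    return model


def predict_multinomial(model, doc):
    """Return the most likely class; tokens outside the vocabulary are ignored."""
    counts = Counter(doc)

    def score(y):
        log_prior, log_likelihood = model[y]
        # Each token's log likelihood is weighted by how often it occurs.
        return log_prior + sum(
            n * log_likelihood[tok]
            for tok, n in counts.items()
            if tok in log_likelihood
        )

    return max(model, key=score)
```

     Unlike a presence/absence model, a token repeated many times in a conversation shifts the score proportionally, which is the frequency effect mentioned above.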

For any doubts, contact elango at kth dot se