# Exercise 2 - Naive Bayes Multilabel Classification
(10 points)

Implement a Naive Bayes classifier that can assign multiple classes to a single document by completing the two given classes. The `MultiLabelLearner` class acts as a builder for `MultiLabelClassifier` instances: the learner receives the set of classes when it is created, and its `learnExample` method is called once for each document of the training set. Internally, the learner should gather all statistics the classifier needs while it processes the training examples.
After the learner has seen all training documents, the `createClassifier` method is called. It creates an instance of the `MultiLabelClassifier` class and initializes it with the gathered statistics.
The classification itself is carried out by the `classify` method, which takes an unseen document and assigns it a set of the classes learned before.
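
For reference, one common way to write the underlying decision rule is the multinomial naive Bayes model with add-one (Laplace) smoothing. The exercise does not prescribe this exact formulation, and the symbols are introduced here only for illustration: $P(c)$ is the fraction of training documents labelled $c$, $n_{w,d}$ is the number of times word $w$ occurs in document $d$, $N_{w,c}$ is the number of times $w$ occurs in the training documents of class $c$, and $V$ is the vocabulary.

```latex
\hat{c}(d) = \arg\max_{c} \; P(c) \prod_{w \in d}
  \left( \frac{N_{w,c} + 1}{\sum_{w' \in V} N_{w',c} + |V|} \right)^{n_{w,d}}
```

For the multilabel setting, one option is to train one such binary model per class ("class applies" versus "class does not apply") and to return every class whose "applies" score wins.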

#### Hints

- Please do not forget to preprocess your documents. What exactly the preprocessing does is up to you.
- The classification should be based on the naive Bayes classification. You may want to reuse code from exercise 1.
- In our datasets, each document has *at least one* class. You may want to take this information into account.
- The evaluation will use micro precision, micro recall and micro F1-measure (also called F1-score).
- The evaluation in the hidden tests has three stages. 
  1. Your solution will get 4 points as soon as it is better than the baselines. The baselines are:
     - For each class, a classifier that always returns this class.
     - A random guesser that returns a random class.
  2. If your solution has an F1-score >= 0.7, you will get 3 more points.
  3. If your solution has an F1-score >= 0.8, you will get 3 more points.
- You can download the [multi-class-train.tsv](https://hobbitdata.informatik.uni-leipzig.de/teaching/SNLP/classification/multi-class-train.tsv) file. It comprises one document per line. The first part comprises the classes (separated with a `", "` string), followed by a tab character (`\t`). The remaining content of the line is the text of the document.
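
The hint that every document has at least one class can serve as a fallback when none of the per-class decisions fires. The following is a minimal sketch and not part of the exercise template: the helper `LabelSelector`, its method `select`, and the idea of passing one margin per class (the score for "class applies" minus the score for "class does not apply") are assumptions made for illustration.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper; not part of the exercise template. */
final class LabelSelector {

    /**
     * Returns every class whose margin is positive. If no class qualifies,
     * falls back to the single class with the largest margin, because each
     * document is known to have at least one class.
     */
    static Set<String> select(Map<String, Double> margins) {
        Set<String> result = new HashSet<>();
        String best = null;
        double bestMargin = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, Double> e : margins.entrySet()) {
            if (e.getValue() > 0) {
                result.add(e.getKey());
            }
            // Track the strongest candidate for the fallback case.
            if (best == null || e.getValue() > bestMargin) {
                best = e.getKey();
                bestMargin = e.getValue();
            }
        }
        if (result.isEmpty() && best != null) {
            result.add(best);
        }
        return result;
    }
}
```

With this rule, an empty result can only occur when the map of margins itself is empty.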

#### Notes

- Do not add additional external libraries.
- Interface
  - You can use _[TAB]_ for autocompletion and _[SHIFT]_+_[TAB]_ for code inspection.
  - Use _Menu_ -> _View_ -> _Toggle Line Numbers_ for debugging.
  - Check _Menu_ -> _Help_ -> _Keyboard Shortcuts_.
- Finish
  - Save your solution by clicking on the _disk icon_.
  - Finally, choose _Menu_ -> _File_ -> _Close and Halt_.
  - Do not forget to _Submit_ your solution in the _Assignments_ view.

In [5]:
// YOUR CODE HERE


public class BayesianClassifier {
	// YOUR CODE HERE
	Set<String> uniqueWords = null;
	HashMap<String, Integer> updatedClassCount = null;
	HashMap<String, BigDecimal> updatedClassProb = null;
	HashMap<String, HashMap<String, Integer>> UpdatedClassWordCount = null;
	int globalTotalCount = 0;

	HashMap<String, Double> classProbabilites = new HashMap<>();

	public final static String STARTS_WITH_NUMBER = "[1-9]\\s*(\\w+)";

	BayesianClassifier(HashMap<String, BigDecimal> classProb, HashMap<String, HashMap<String, Integer>> classWordCount,
			HashMap<String, Integer> classCount, int totalCount, Set<String> uniqueWords) {
		this.updatedClassCount = classCount;
		this.updatedClassProb = classProb;
		this.UpdatedClassWordCount = classWordCount;
		this.globalTotalCount = totalCount;
		this.uniqueWords = uniqueWords;

	}

	public String preprocess(String text) {

		
		
		// Drop each digit 1-9 together with the word characters that follow it.
		text = text.replaceAll(STARTS_WITH_NUMBER, "");
		// Strip punctuation, keeping whitespace, word characters and apostrophes.
		text = text.replaceAll("[^\\s\\w']+", "");
		// Remove common stop words (case-sensitive, so capitalized forms are kept).
		text = text.replaceAll("\\b(?:the|and|a|is|its|from|it|for|in|to|of|has|had|have|was|are|at)\\b", "");
		// Remove any remaining digits.
		text = text.replaceAll("[0-9]+", "");
		
		return text;
	}

	/**
	 * Classifies the given document and returns the class name.
	 */
	public String classify(String text) {
		String clazz = null;

		text = preprocess(text);
		String[] words = text.toLowerCase().split("[^a-z0-9']+");
		HashMap<String, Integer> map = new HashMap<String, Integer>();

		int uniqueWordsSize = this.uniqueWords.size();

		for (String word : words) {

			this.uniqueWords.add(word);

			if (map.containsKey(word)) {
				int value = map.get(word);
				value++;
				map.put(word, value);
			} else {
				map.put(word, 1);
			}

		}

		BigDecimal minimum = BigDecimal.ZERO;

		for (String key : this.updatedClassCount.keySet()) {
			BigDecimal finalProbability = BigDecimal.ONE;

			BigDecimal classProb = BigDecimal.ZERO;

			if (updatedClassProb.containsKey(key)) {
				// System.out.println("Class "+key+ " has probability
				// "+updatedClassProb.get(key));
				classProb = updatedClassProb.get(key);
			}

			HashMap<String, Integer> temp = UpdatedClassWordCount.get(key);
			// System.out.println("Temp "+temp);
			int totalWordsInHapMapForThatClass = 0;

			for (int value : temp.values()) {
				totalWordsInHapMapForThatClass += value;
			}

			for (String k : map.keySet()) {
				// Add-one (Laplace) smoothing: a word unseen in this class counts as 0 + 1.
				BigDecimal wordFrequency = new BigDecimal(temp.getOrDefault(k, 0));
				BigDecimal denominator = new BigDecimal(totalWordsInHapMapForThatClass)
						.add(new BigDecimal(uniqueWordsSize));
				BigDecimal probByClass = wordFrequency.add(BigDecimal.ONE).divide(denominator,
						MathContext.DECIMAL128);

				// Raise the word probability to the number of occurrences in the document.
				BigDecimal power = probByClass.pow(map.get(k), MathContext.DECIMAL128);
				finalProbability = finalProbability.multiply(power, MathContext.DECIMAL128);
			}

			finalProbability = finalProbability.multiply(classProb,MathContext.DECIMAL128);

			// Keep the class with the highest score so far (compareTo > 0 means "greater").
			if (finalProbability.compareTo(minimum) > 0) {
				clazz = key;
				minimum = finalProbability;
			}

		}

		// Self-training step: update the class count map with the predicted class
		globalTotalCount++;

		if (updatedClassCount.containsKey(clazz)) {
			int value = updatedClassCount.get(clazz);
			value++;
			updatedClassCount.put(clazz, value);

			for (String key : updatedClassCount.keySet()) {
				BigDecimal value1 = (new BigDecimal(updatedClassCount.get(key)))
						.divide(new BigDecimal(globalTotalCount), MathContext.DECIMAL128);
				updatedClassProb.put(key, value1);
			}

		}

		if (UpdatedClassWordCount.containsKey(clazz)) {
			for (String key : map.keySet()) {
				if (UpdatedClassWordCount.get(clazz).containsKey(key)) {
					UpdatedClassWordCount.get(clazz).put(key, UpdatedClassWordCount.get(clazz).get(key) + map.get(key));
				} else {
					UpdatedClassWordCount.get(clazz).put(key, map.get(key));
				}
			}
		}

		return clazz;
	}
}


/**
 * Learner (or Builder) class for a naive Bayes classifier.
 */
public class BayesianLearner {
	// YOUR CODE HERE
	Set<String> uniqueWords = new HashSet<>();
	HashMap<String, Integer> classCount = new HashMap<String, Integer>();
	HashMap<String, BigDecimal> classProb = new HashMap<String, BigDecimal>();
	HashMap<String, HashMap<String, Integer>> classWordCount = new HashMap<String, HashMap<String, Integer>>();
	int totalCount = 0;

	
	public final static String STARTS_WITH_NUMBER = "[1-9]\\s*(\\w+)";

	/**
	 * Constructor taking the set of classes the classifier should be able to
	 * distinguish.
	 */
	public BayesianLearner(Set<String> classes) {
		// YOUR CODE HERE
		for (String c : classes) {
			this.classCount.put(c, 0);
			this.classProb.put(c, BigDecimal.ZERO);
		}

	}

	/**
	 * The method used to learn the training examples. It takes the name of the
	 * class as well as the text of the training document.
	 */
	public void learnExample(String clazz, String text) {
		// YOUR CODE HERE
		totalCount++;
		// line = line.toLowerCase().replaceAll("[^a-z0-9\\s]","");

		text = preprocess(text);
		String[] words = text.toLowerCase().split("[^a-z0-9']+");

		// Fetch (or lazily create) the word counts of this class and update them.
		HashMap<String, Integer> wordCount = classWordCount.computeIfAbsent(clazz, k -> new HashMap<>());
		for (String word : words) {
			uniqueWords.add(word);
			wordCount.merge(word, 1, Integer::sum);
		}

		// Count this document for its class; a previously unseen class starts at 1.
		classCount.merge(clazz, 1, Integer::sum);

	}

	public String preprocess(String text) {

		  
		// Drop each digit 1-9 together with the word characters that follow it.
		text = text.replaceAll(STARTS_WITH_NUMBER, "");
		// Strip punctuation, keeping whitespace, word characters and apostrophes.
		text = text.replaceAll("[^\\s\\w']+", "");
		// Remove common stop words (case-sensitive, so capitalized forms are kept).
		text = text.replaceAll("\\b(?:the|and|a|is|its|from|it|for|in|to|of|has|had|have|was|are|at)\\b", "");
		// Remove any remaining digits.
		text = text.replaceAll("[0-9]+", "");
		
		

		return text;
	}

	/**
	 * Creates a BayesianClassifier instance based on the statistics gathered
	 * from the training example.
	 */
	public BayesianClassifier createClassifier() {
		BayesianClassifier classifier = null;
		// YOUR CODE HERE

		for (String key : this.classCount.keySet()) {

			BigDecimal value = (new BigDecimal(classCount.get(key))).divide(new BigDecimal(totalCount),
					MathContext.DECIMAL128);
			this.classProb.put(key, value);
		}

		classifier = new BayesianClassifier(this.classProb, this.classWordCount, this.classCount, this.totalCount,
				this.uniqueWords);
		return classifier;
	}
}




/**
 * Classifier implementing naive Bayes classification for a multilabel
 * classification.
 */
public class MultiLabelClassifier {
    // YOUR CODE HERE
    
	Map<String, BayesianLearner> learner = null;
	Set<String> totalClass = null;
	Map<String, BayesianClassifier> classifier = null;
    
    MultiLabelClassifier(Map<String, BayesianLearner> learner, Set<String> totalClass, Map<String, BayesianClassifier> classifier) {
		this.learner = learner;
		this.totalClass = totalClass;
		this.classifier = classifier;

	}
    
   
    /**
     * Classifies the given document and returns the class names.
     */
    public Set<String> classify(String text) {
        Set<String> results = new HashSet<String>();
        for (String key : this.totalClass) {
            // Ask the binary classifier of this class whether the class applies.
            BayesianClassifier bc = this.classifier.get(key);
            String clazz = bc.classify(text);

            // Compare string contents; == would only compare object references.
            if ("yes".equals(clazz)) {
                results.add(key);
            }
        }

        return results;
    }
}


/**
 * Learner (or Builder) class for a naive Bayes multilabel classifier.
 */
public class MultiLabelLearner {
    // YOUR CODE HERE

    Map<String, BayesianLearner> learner = new HashMap<String, BayesianLearner>();
    Map<String, BayesianClassifier> classifier = new HashMap<String, BayesianClassifier>();
    Set<String> totalClass = new HashSet<String>();
   
    /**
     * Constructor taking the set of classes the classifier should be able to
     * distinguish.
     */
    public MultiLabelLearner(Set<String> classes) {
        // YOUR CODE HERE
        Set<String> baseClass = new HashSet<String>();
        this.totalClass = classes;
        
        baseClass.add("yes");
        baseClass.add("no");
        
        for (String c : classes) {
        	learner.put(c, new BayesianLearner(baseClass));
		}
        
    }

    /**
     * The method used to learn the training examples. It takes the names of the
     * classes as well as the text of the training document.
     */
    public void learnExample(Set<String> classes, String text) {
        // Each per-class binary learner sees the document as a positive ("yes")
        // or negative ("no") example, depending on the document's labels.
        for (String c : this.totalClass) {
            learner.get(c).learnExample(classes.contains(c) ? "yes" : "no", text);
        }
    }

    
    
    /**
     * Creates a MultiLabelClassifier instance based on the statistics gathered from
     * the training example.
     */
    public MultiLabelClassifier createClassifier() {
        MultiLabelClassifier classifier = null;
        
        for (String key : this.totalClass) {
        	BayesianLearner bl = learner.get(key);
        	this.classifier.put(key, bl.createClassifier());
			
		}

		classifier = new MultiLabelClassifier(this.learner, this.totalClass, this.classifier);
        
        // YOUR CODE HERE
        return classifier;
    }
}
// This line should make sure that compile errors are directly identified when executing this cell
// (the line itself does not produce any meaningful result)
new MultiLabelLearner(new HashSet<>(Arrays.asList("good","bad")));
System.out.println("compiled");

compiled
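
The solution above uses `BigDecimal` to keep the product of many small probabilities from underflowing. A common alternative is to add logarithms in plain `double` arithmetic, which preserves the ranking of the classes because the logarithm is monotonic. The sketch below is only an illustration under that assumption: `LogSpaceScorer` and `logScore` are hypothetical names that do not appear in the template, and the smoothing mirrors the add-one scheme used in the cell above.

```java
import java.util.Map;

/** Hypothetical helper illustrating log-space scoring; not part of the template. */
final class LogSpaceScorer {

    /**
     * Computes log P(c) + sum over words w of n(w) * log((count(w, c) + 1) / (total(c) + V)).
     *
     * @param prior          prior probability of the class, must be greater than 0
     * @param classCounts    word counts observed for the class during training
     * @param vocabularySize number of distinct words seen during training (V)
     * @param docCounts      word counts of the document to classify
     */
    static double logScore(double prior, Map<String, Integer> classCounts,
            int vocabularySize, Map<String, Integer> docCounts) {
        long totalWords = 0;
        for (int count : classCounts.values()) {
            totalWords += count;
        }
        double denominator = totalWords + vocabularySize;
        double score = Math.log(prior);
        for (Map.Entry<String, Integer> e : docCounts.entrySet()) {
            // Words unseen in this class contribute a smoothed count of 0 + 1.
            int seen = classCounts.getOrDefault(e.getKey(), 0);
            score += e.getValue() * Math.log((seen + 1) / denominator);
        }
        return score;
    }
}
```

The class with the largest log score is the same class that would win with the product of probabilities, so the comparison logic of a classifier would not need to change.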


# Evaluation

- Run the following cell to test your implementation.
- You can ignore the cells afterwards.
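
The evaluation pools the true positives, false positives and false negatives of all classes before computing the scores (micro-averaging), which is what `calcStats` in the following cell does. As a worked illustration, the counts reported for the simple example corpus in the recorded output (4 TP, 1 FP, 0 FN) give:

```latex
P = \frac{\sum_c TP_c}{\sum_c (TP_c + FP_c)} = \frac{4}{4 + 1} = 0.8, \qquad
R = \frac{\sum_c TP_c}{\sum_c (TP_c + FN_c)} = \frac{4}{4 + 0} = 1.0, \qquad
F_1 = \frac{2 P R}{P + R} = \frac{1.6}{1.8} \approx 0.88889
```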

In [6]:
%maven org.junit.jupiter:junit-jupiter-api:5.3.1
import org.junit.jupiter.api.Assertions;
import org.opentest4j.AssertionFailedError;
import java.util.stream.Collectors;
import java.util.Map.Entry;
import org.apache.commons.io.FileUtils;
import java.io.File;
import java.io.IOException;

/**
 * Simple structure to store the classes and the text of a document.
 */
public class ClassifiedDocument {
    public final Set<String> classes;
    public final String text;
    public ClassifiedDocument(Set<String> classes, String text) {
        this.classes = classes;
        this.text = text;
    }
}
/**
 * Simple method for reading classification examples from a file as a list of (classes, text) pairs.
 */
public static List<ClassifiedDocument> readClassData(String filename) throws IOException {
    return FileUtils.readLines(new File(filename), "utf-8").stream().map(s -> s.split("\t"))
            .filter(s -> s.length > 1)
            .map(s -> new ClassifiedDocument(new HashSet<>(Arrays.asList(s[0].split(", "))), s[1]))
            .collect(Collectors.toList());
}

public static void checkClassifier(List<ClassifiedDocument> trainingCorpus,
        List<ClassifiedDocument> evaluationCorpus, double minF1Score) {
    try {
        System.out.print("Training corpus size: ");
        System.out.println(trainingCorpus.size());
        System.out.print("Eval. corpus size   : ");
        System.out.println(evaluationCorpus.size());
        // Determine the classes
        Set<String> classes = Arrays.asList(trainingCorpus, evaluationCorpus).stream().flatMap(l -> l.stream())
                .map(d -> d.classes).flatMap(c -> c.stream()).distinct().collect(Collectors.toSet());
        // Determine the number of instances per class in the evaluation set
        Map<String, Long> evalClassCounts = evaluationCorpus.stream().map(d -> d.classes).flatMap(c -> c.stream())
                .collect(Collectors.groupingBy(c -> c, Collectors.counting()));
        for (String clazz : classes) {
            if (!evalClassCounts.containsKey(clazz)) {
                evalClassCounts.put(clazz, 0L);
            }
        }
        long expectedClassSum = evalClassCounts.entrySet().stream().mapToLong(e -> e.getValue()).sum();

        // Determine the expected accuracies of the baselines
        Map<String, double[]> f1ForClassGuessers = new HashMap<>();
        for (Entry<String, Long> e : evalClassCounts.entrySet()) {
            f1ForClassGuessers.put(e.getKey(), calcStats(e.getValue().intValue(),
                    evaluationCorpus.size() - e.getValue().intValue(), (int) (expectedClassSum - e.getValue())));
        }

        // Train the classifier
        long time1 = System.currentTimeMillis();
        MultiLabelLearner learner = new MultiLabelLearner(classes);
        for (ClassifiedDocument trainingExample : trainingCorpus) {
            learner.learnExample(trainingExample.classes, trainingExample.text);
        }
        MultiLabelClassifier classifier = learner.createClassifier();
        time1 = System.currentTimeMillis() - time1;
        System.out.println("Training took       : " + time1 + "ms");

        // Classify the evaluation corpus
        long time2 = System.currentTimeMillis();
        Map<String, int[]> classCounts = new HashMap<>();
        final int TP = 0, FP = 1, FN = 2, TN = 3;
        for (String clazz : classes) {
            classCounts.put(clazz, new int[4]);
        }
        int id = 0;
        Set<String> result;
        List<String[]> errorDetails = new ArrayList<>();
        boolean added;
        for (ClassifiedDocument evalExample : evaluationCorpus) {
            added = false;
            result = classifier.classify(evalExample.text);
            String resultAsString = result.toString();
            for (String clazz : classes) {
                if (evalExample.classes.contains(clazz)) {
                    if (result.contains(clazz)) {
                        ++classCounts.get(clazz)[TP];
                    } else {
                        ++classCounts.get(clazz)[FN];
                        if (!added) {
                            errorDetails.add(new String[] { Integer.toString(id), evalExample.classes.toString(),
                                    resultAsString });
                        }
                    }
                } else {
                    if (result.contains(clazz)) {
                        ++classCounts.get(clazz)[FP];
                        if (!added) {
                            errorDetails.add(new String[] { Integer.toString(id), evalExample.classes.toString(),
                                    resultAsString });
                        }
                    } else {
                        ++classCounts.get(clazz)[TN];
                    }
                }
            }
            result.removeAll(evalExample.classes);
            if ((result.size() > 0) && (!added)) {
                errorDetails.add(
                        new String[] { Integer.toString(id), evalExample.classes.toString(), result.toString() });
            }
            ++id;
        }
        time2 = System.currentTimeMillis() - time2;
        System.out.println("Classification took : " + time2 + "ms");
        int counts[] = new int[4];
        for (Entry<String, int[]> stats : classCounts.entrySet()) {
            counts[0] += stats.getValue()[0];
            counts[1] += stats.getValue()[1];
            counts[2] += stats.getValue()[2];
            counts[3] += stats.getValue()[3];
        }
        double solutionPerformance[] = calcStats(counts[TP], counts[FP], counts[FN]);

        System.out.println("classifiers           precision    recall  f1-score");
        for (Entry<String, double[]> baseResult : f1ForClassGuessers.entrySet()) {
            System.out.println(String.format("Always %-13s:   %-7.5f   %-7.5f   %-7.5f", baseResult.getKey(),
                    baseResult.getValue()[0], baseResult.getValue()[1], baseResult.getValue()[2]));
        }
        System.out.println(
                String.format("Your solution       :   %-7.5f   %-7.5f   %-7.5f (%d tp, %d tn, %d fp, %d fn)",
                        solutionPerformance[0], solutionPerformance[1], solutionPerformance[2], counts[TP],
                        counts[TN], counts[FP], counts[FN]));
        if (errorDetails.size() > 0) {
            System.out.println("  Wrong classifications are:");
            for (int i = 0; i < Math.min(errorDetails.size(), 20); ++i) {
                System.out.print("    id=");
                System.out.print(errorDetails.get(i)[0]);
                System.out.print(" expected=");
                System.out.print(errorDetails.get(i)[1]);
                System.out.print(" result=");
                System.out.println(errorDetails.get(i)[2]);
            }
            if (errorDetails.size() > 20) {
                System.out.println("    ...");
            }
        }

        // Make sure that the student's solution is better than all baselines
        for (Entry<String, double[]> baseResult : f1ForClassGuessers.entrySet()) {
            if (baseResult.getValue()[2] >= solutionPerformance[2]) {
                StringBuilder builder = new StringBuilder();
                builder.append("Your solution is not better than a classifier that always chooses the \"");
                builder.append(baseResult.getKey());
                builder.append("\" class.");
                Assertions.fail(builder.toString());
            }
        }
        if ((minF1Score > 0) && (minF1Score > solutionPerformance[2])) {
            Assertions.fail("Your solution did not reach the expected F1-score of " + minF1Score);
        }
        System.out.println("Test successfully completed.");
    } catch (AssertionFailedError e) {
        throw e;
    } catch (Throwable e) {
        System.err.println("Your solution caused an unexpected error:");
        throw e;
    }
}
/**
 * Simple method for calculating micro precision, recall and F1-measure.
 */
public static double[] calcStats(int tp, int fp, int fn) {
    double precision = tp / (double) (tp + fp);
    double recall = tp / (double) (tp + fn);
    return new double[] { precision, recall, (2 * precision * recall) / (precision + recall) };
}

System.out.println("---------- Simple example corpus ----------");
List<ClassifiedDocument> exampleCorpusTrain = Arrays.asList(
        new ClassifiedDocument(new HashSet<String>(Arrays.asList("chess")),
                "white king, black rook, black queen, white pawn, black knight, white bishop."),
        new ClassifiedDocument(new HashSet<String>(Arrays.asList("history")),
                "knight person granted honorary title knighthood"),
        new ClassifiedDocument(new HashSet<String>(Arrays.asList("history")),
                "knight order eligibility, knighthood, head of state, king, prelate, middle ages."),
        new ClassifiedDocument(new HashSet<String>(Arrays.asList("chess", "game")),
                "Defense knight king pawn opening game opponent."),
        new ClassifiedDocument(new HashSet<String>(Arrays.asList("game")),
                "Game. player opponent victory. draw."));
List<ClassifiedDocument> exampleCorpusTest = Arrays.asList(
        new ClassifiedDocument(new HashSet<String>(Arrays.asList("history")), "Knighthood Middle Ages."),
        new ClassifiedDocument(new HashSet<String>(Arrays.asList("game", "chess")),
                "player black knight opponent pawn queen checkmate game draw victory."),
        // document with unknown words
        new ClassifiedDocument(new HashSet<String>(Arrays.asList("game")), "player opponent opening"));
checkClassifier(exampleCorpusTrain, exampleCorpusTest, 0);

System.out.println();
System.out.println("---------- Larger example corpus ----------");
List<ClassifiedDocument> classificationData = readClassData("/srv/distribution/multi-class-train.tsv");
checkClassifier(classificationData.subList(0, 600), classificationData.subList(600, classificationData.size()),
        0);

---------- Simple example corpus ----------
Training corpus size: 5
Eval. corpus size   : 3
Training took       : 27ms
Classification took : 12ms
classifiers           precision    recall  f1-score
Always game         :   0.66667   0.50000   0.57143
Always chess        :   0.33333   0.25000   0.28571
Always history      :   0.33333   0.25000   0.28571
Your solution       :   0.80000   1.00000   0.88889 (4 tp, 4 tn, 1 fp, 0 fn)
  Wrong classifications are:
    id=2 expected=[game] result=[game, chess]
    id=2 expected=[game] result=[chess]
Test successfully completed.

---------- Larger example corpus ----------
Training corpus size: 600
Eval. corpus size   : 195
Training took       : 11410ms
Classification took : 3850ms
classifiers           precision    recall  f1-score
Always money-fx     :   0.43590   0.29720   0.35343
Always nat-gas      :   0.03077   0.02098   0.02495
Always interest     :   0.10256   0.06993   0.08316
Always corn         :   0.06667   0.04545   0.05405
Always sh

In [None]:
// Ignore this cell
