# Machine Learning SoSe21 Practice Class

Dr. Timo Baumann, Dr. Özge Alaçam, Björn Sygo <br>
Email: baumann@informatik.uni-hamburg.de, alacam@informatik.uni-hamburg.de, 6sygo@informatik.uni-hamburg.de

## Exercise 4
**Description:** Analyse and work with a machine learning library for support vector machines <br>
**Deadline:** Saturday, 22.05.2021, 23:59 <br>
**Working together:** You can work in pairs or triples but no larger teams are allowed. <br>
&emsp;&emsp;&emsp; &emsp; &emsp; &emsp; &emsp; Please adhere to the honor code discussed in class. <br>
&emsp;&emsp;&emsp; &emsp; &emsp; &emsp; &emsp; All members of the team must get involved in understanding and coding the solution.

## Submission: 
**Put your names here**

*Also put high-level comments that should be read before looking at your code and results.*

### Goal
 1. analyze existing code of machine learning algorithms (in this case for training SVMs)
 2. experiment with integrating and using code that is available in libraries.

You will be working with two libraries, one implemented in C++ (LibSVM, see https://github.com/cjlin1/libsvm) and the other in Java (part of the WEKA library, see https://github.com/Waikato/weka-3.8).

Note that if you are running Windows, doing the C++ parts of this exercise in the Windows Subsystem for Linux may (or may not) simplify your life.

## Task 1: Mapping Code to Machine Learning Concepts

In the following, we ask you to relate concepts used in SVMs to the code in the two implementations. When we ask you to identify where in the code a given concept is given, please report the filename(s) and line(s) as well as the class/function/variable names.


### Kernels
1. Find the code that defines the abstract kernel definition and report what types of kernels are available in WEKA/LibSVM respectively (as well as where/how they are implemented).
2. How would you go about implementing an additional kernel?
3. Describe the process involved in computing the dot product (`svm.cpp:294ff.`, operation `Kernel::dot` for LibSVM; `weka.classifiers.functions.supportVector.CachedKernel#dotProd`, lines 292ff. for WEKA) including what are the arguments to the respective functions.
4. Describe the caching that is going on in both libraries (LRU refers to least-recently-used cache). What is being cached and why is this relevant?

### Sequential Minimal Optimization

1. As described in the lectures, SMO alternates between finding two parameters α to optimize, and optimizing their values. Find this loop in both implementations and name the methods used for performing the two steps.
2. Identify the code that performs the 'clipping' of Lagrange multipliers as described in the lecture notes on page 25. Can you find the optimization computations involved in finding the new values for $α_i$ and $α_j$?
3. Describe where in the code the mapping function $ϕ$ is being computed.

### Hints
* Download both WEKA and LibSVM from the URLs given above (preferably use `git clone`).
* For LibSVM, you'll primarily look at `svm.cpp`. Any editor with syntax highlighting will do.
* Open the WEKA code in the IDE of your choice (IntelliJ, Eclipse, ...). WEKA comes with an Eclipse project definition that IntelliJ can import as well.
* Sometimes it may be easier to start the analysis with LibSVM, sometimes with WEKA -- although C++ may be less familiar, it's easy to get lost in the WEKA class hierarchies; conversely, the class hierarchy may also help understanding.

### Kernels

1. 
 * libsvm  
 The abstract kernel is defined as a class (svm.cpp:202). Specific kernels are implemented by deciding which kernel function to use in the constructor of the Kernel class (svm.cpp:253). The available kernels are LINEAR, POLY, RBF, SIGMOID, and PRECOMPUTED.

 * Weka  
 The abstract kernel class is implemented in `supportVector\Kernel.java`. The kernel implementations are classes in other files (e.g. `supportVector\RBFKernel.java`) that extend the Kernel class and override its methods as needed, e.g. `buildKernel` and `evaluate`. The available kernels are Poly, NormalizedPoly, RBF, Subsequence (`StringKernel.java`), PrecomputedKernel, and the Pearson universal kernel (`Puk.java`).

2.
 * libsvm  
 Define a new kernel function in the Kernel class (svm.cpp:250)  
 Add the new function to the kernel type switch case (svm.cpp:273)  
 * Weka  
 Create a new class in the supportVector folder which extends Kernel or CachedKernel.

3.
 * libsvm  
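 The dot product takes two pointers as arguments. Each points to the start of a sparse vector, stored as an array of `svm_node` entries (index and value) that is terminated by an `svm_node` with index -1. The function walks both arrays until either one reaches its terminator. When the two current indices are equal, the product of the two values is added to the running sum and both pointers advance; otherwise only the pointer with the smaller index advances. The sum is then returned.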

 * weka 
 The dot product takes two Instances as arguments. An Instance stores indices and their corresponding values. The implementation follows the same principle as that of libsvm: values are multiplied and summed wherever the indices of the two vectors are equal.

4.
 * libsvm

 * weka  
 The implementation caches the evaluation results of the kernel function. This means that results for repeating values can be read from the cache and do not need to be recalculated from scratch, saving time.
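
The two ideas described above (merging two sparse vectors by index, and caching kernel evaluations with least-recently-used eviction) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `sparse_dot` and `LRUKernelCache` are hypothetical, vectors are assumed to be lists of `(index, value)` pairs sorted by index, and the cache stores single kernel values, which does not necessarily match the cache layout of either library.

```python
from collections import OrderedDict


def sparse_dot(x, y):
    """Dot product of two sparse vectors given as index-sorted (index, value) lists."""
    total = 0.0
    i = j = 0
    while i < len(x) and j < len(y):
        xi, xv = x[i]
        yi, yv = y[j]
        if xi == yi:
            # Both vectors have an entry at this index: accumulate the product.
            total += xv * yv
            i += 1
            j += 1
        elif xi < yi:
            i += 1  # index only present in x, contributes nothing
        else:
            j += 1  # index only present in y, contributes nothing
    return total


class LRUKernelCache:
    """Hypothetical helper: caches kernel values keyed by a pair of example ids."""

    def __init__(self, kernel, data, capacity):
        self.kernel = kernel
        self.data = data
        self.capacity = capacity
        self.store = OrderedDict()
        self.misses = 0

    def evaluate(self, a, b):
        # Assumes a symmetric kernel, so (a, b) and (b, a) share one entry.
        key = (a, b) if a <= b else (b, a)
        if key in self.store:
            self.store.move_to_end(key)  # mark as most recently used
            return self.store[key]
        self.misses += 1
        value = self.kernel(self.data[a], self.data[b])
        self.store[key] = value
        if len(self.store) > self.capacity:
            self.store.popitem(last=False)  # evict the least recently used entry
        return value


if __name__ == "__main__":
    data = [[(1, 1.0), (3, 2.0)], [(1, 4.0), (2, 5.0), (3, 0.5)]]
    cache = LRUKernelCache(sparse_dot, data, capacity=2)
    print(cache.evaluate(0, 1), cache.evaluate(1, 0), cache.misses)
```

Caching matters because SMO evaluates the kernel for the same pairs of training examples many times, and each evaluation costs time proportional to the number of non-zero features.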
 

### Sequential Minimal Optimization

1. 

 * libsvm  
 The optimization loop happens at `svm.cpp:564`. `select_working_set()` is used to find the two alphas i and j to optimize. They are then updated...

 * weka  
 The optimization loop is located at `SMO.java:591`. This is called from the training loop `SMO.java:427` with `smo.buildClassifier`, taking a number of training examples as the argument. `examineExample` is called for an instance, which then performs one optimization step by calling `takeStep`. 

2.

 * libsvm  

 * weka  
 The clipping happens at `SMO.java:968` or `SMO.java:987`, depending on the sign of the second derivative. The new value of $a_2$ is computed at `SMO.java:965` or `SMO.java:976-986` and then clipped; $a_1$ is then updated from the new $a_2$ at `SMO.java:1007`. The new values are stored for each step at `SMO.java:1104-1105`.
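
For orientation, the analytic update and clipping for one pair of multipliers can be sketched as follows. This follows the textbook (Platt-style) formulation, not a transcription of the WEKA or LibSVM code: the function name `smo_pair_update` is hypothetical, the errors are assumed to be $E_i = f(x_i) - y_i$, and the second derivative is taken as $\eta = K_{11} + K_{22} - 2K_{12}$ (the libraries may use a different sign convention). The case $\eta \le 0$ is deliberately left out.

```python
def smo_pair_update(a1, a2, y1, y2, e1, e2, k11, k12, k22, c):
    """One Platt-style update for a pair of multipliers.

    Returns (a1_new, a2_new), or None if this sketch cannot make progress.
    """
    # Bounds that keep both multipliers inside the box [0, C] while
    # preserving the equality constraint y1*a1 + y2*a2 = const.
    if y1 != y2:
        low = max(0.0, a2 - a1)
        high = min(c, c + a2 - a1)
    else:
        low = max(0.0, a1 + a2 - c)
        high = min(c, a1 + a2)
    if low >= high:
        return None

    # Second derivative of the objective along the constraint line.
    eta = k11 + k22 - 2.0 * k12
    if eta <= 0.0:
        return None  # degenerate case needs separate handling, omitted here

    # Unconstrained optimum for a2, then clipping to [low, high].
    a2_new = a2 + y2 * (e1 - e2) / eta
    a2_new = min(high, max(low, a2_new))

    # a1 moves by the same amount in the direction that keeps the constraint.
    a1_new = a1 + y1 * y2 * (a2 - a2_new)
    return a1_new, a2_new


if __name__ == "__main__":
    # Two examples of opposite class, both multipliers starting at zero.
    print(smo_pair_update(0.0, 0.0, 1, -1, -1.0, 1.0, 1.0, 0.0, 1.0, c=0.5))
```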



## Task 2: Working with LibSVM

In the second part, you will use the previously analysed code on the examples from the previous face classification task.

### Setup

First, you will have to make the LibSVM library available to Python.

You can use a pre-built Python package for this; instructions are at

https://pypi.org/project/libsvm/

Alternatively, you can compile it from the GitHub project you already analyzed for the first part. Further instructions are in the `README`.

To use the library, read the documentation in the GitHub project's `README`. You can also look at the example there to get a grasp of how it works in Python. It shows you the necessary functions, but to adapt them to the tasks below, you may have to look up the parameters in the documentation.

Note: we are aware that the instructions above are sketchy. Get used to it and ask questions on the exercise forum. Once you're a machine learning practitioner, you'll get worse instructions on more badly maintained software.

### Testing different kernels for face classification

Now use the feature vectors you generated in the previous exercise submission on GDA and apply the library's SVM algorithm to classify them. You may first have to convert them to the necessary text-file format to use the library. Try different kernels, namely the linear, polynomial (degrees d = 1,2,3,4), sigmoid and radial basis kernel.
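
As a starting point, the conversion and the kernel settings could look roughly like the sketch below. The helper names `to_libsvm_lines` and `kernel_option_strings` are hypothetical; the text format (`<label> <index>:<value> ...`, indices starting at 1) and the `-t` / `-d` options are those of LibSVM's `svm-train`. The commented-out training call assumes the Python interface is importable as `libsvm.svmutil`, which may differ depending on how you installed it.

```python
def to_libsvm_lines(features, labels):
    """Format dense feature vectors as LIBSVM text lines."""
    lines = []
    for vector, label in zip(features, labels):
        # Feature indices start at 1; zero-valued entries may be omitted.
        pairs = [f"{i}:{v:g}" for i, v in enumerate(vector, start=1) if v != 0]
        lines.append(" ".join([str(label)] + pairs))
    return lines


def kernel_option_strings(degrees=(1, 2, 3, 4)):
    """Map a readable kernel name to the corresponding svm-train options."""
    options = {"linear": "-t 0", "rbf": "-t 2", "sigmoid": "-t 3"}
    for d in degrees:
        options[f"poly_d{d}"] = f"-t 1 -d {d}"
    return options


if __name__ == "__main__":
    # Made-up toy vectors, only to show the output format.
    for line in to_libsvm_lines([[0.5, 0.0, 2.0], [0.0, 1.5, 0.0]], [1, -1]):
        print(line)
    for name, opts in kernel_option_strings().items():
        print(name, opts)
        # Assumed usage, not executed here:
        # from libsvm.svmutil import svm_read_problem, svm_train, svm_predict
        # y, x = svm_read_problem("train.txt")
        # model = svm_train(y, x, opts + " -q")
```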

In this submission, please also report results on the full dataset, not only the small (balanced) 80-image set.

Note that you may of course use the GDA sample solution if you are unsure about your own feature extraction.

### Tune your parameters

To effectively use kernels on your task, you will have to tune the parameters $C$ and $\gamma$ for your model. Try different values to improve your results. You need not find the overall best parameters, but describe and apply a systematic approach.
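
One common systematic approach is a grid search over powers of two, scored by cross-validation. The sketch below is one possible way to organise it, not a required method: `grid_search` is a hypothetical helper, the exponent ranges are only an example, and `score` stands for any function you supply that returns a quality measure for a given pair of values (higher is better).

```python
from itertools import product


def grid_search(score, c_exponents, gamma_exponents):
    """Return (best_score, best_c, best_gamma) over a grid of powers of two."""
    best = None
    for c_exp, g_exp in product(c_exponents, gamma_exponents):
        c, gamma = 2.0 ** c_exp, 2.0 ** g_exp
        s = score(c, gamma)
        if best is None or s > best[0]:
            best = (s, c, gamma)
    return best


if __name__ == "__main__":
    # Toy scoring function with a known optimum, only to exercise the search.
    def toy_score(c, gamma):
        return -(abs(c - 8.0) + abs(gamma - 2.0 ** -5))

    print(grid_search(toy_score, range(-5, 16, 2), range(-15, 4, 2)))

    # Assumed usage with LibSVM (not executed here): with the -v option,
    # svm_train returns the cross-validation accuracy instead of a model.
    # def cv_score(c, gamma):
    #     return svm_train(y, x, f"-t 2 -c {c} -g {gamma} -v 5 -q")
```

After a coarse pass, you can repeat the search with a finer grid around the best cell.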

### Report Submission

Prepare a report of your solution as a commented Jupyter notebook (using markdown for your results and comments); include figures and results.
If you must, you can also upload a PDF document with the report annexed with your Python code.

Upload your report file to the Machine Learning Moodle Course page. Please make sure that your submission team corresponds to the team's Moodle group that you're in.