# Binary Classifier - Active Learning

#### Objectives:

This is an interactive interface, built with IPython widgets, for exploring active learning with our binary classification model. The goal is to have experienced reviewers check the false negatives and false positives produced by the model, since we believe some of the initial data we were given was mislabeled. Manually inspecting these results and correcting any mislabeled data will improve the label quality of our data and, in turn, should improve the performance of our model.

Using a 95% confidence score threshold, there are 551 misclassified 200x200 cutouts from the 47 full 4200x2000 images in the test set. There are two types of misclassification:
1. False negatives: cutouts that are labeled as containing an artifact (`1`) but are predicted as containing no artifact (`0`) by our model
2. False positives: cutouts that are labeled as containing no artifact (`0`) but are predicted as containing an artifact (`1`) by our model
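
The two error types above can be expressed as a small sketch. It is illustrative only and not taken from the `active_learning` module: the names `predict` and `error_type` are hypothetical, and it assumes the confidence score is the model's confidence that a cutout contains an artifact, with scores at or above the threshold mapped to `1`.

```python
from typing import Optional

THRESHOLD = 0.95  # the 95% confidence score threshold quoted above


def predict(score: float, threshold: float = THRESHOLD) -> int:
    """Assumption: a score at or above the threshold means 'artifact' (1)."""
    return 1 if score >= threshold else 0


def error_type(label: int, pred: int) -> Optional[str]:
    """Return 'FN', 'FP', or None when label and prediction agree."""
    if label == 1 and pred == 0:
        return "FN"  # labeled artifact, predicted no artifact
    if label == 0 and pred == 1:
        return "FP"  # labeled no artifact, predicted artifact
    return None
```

Only cutouts for which `error_type` returns `"FN"` or `"FP"` would count toward the 551 misclassified cutouts.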

#### Interface Overview:
1. `image_name` drop-down list: The names of the 47 full images in the test set.
2. `error_id` drop-down list: The unique identifiers of the misclassified cutouts in the selected `image_name`. The identifiers are a running count over the 551 cutouts from all 47 images.
3. `Prev` and `Next` buttons: Navigate through the `error_id` values.
    1. If `Next` is clicked at the last `error_id` of the current `image_name`, the selection moves to the first `error_id` of the next `image_name`.
    2. If `Prev` is clicked at the first `error_id` of the current `image_name`, the selection moves to the last `error_id` of the previous `image_name`.
4. Table output: The table of information relating to the `error_id` selected
    1. `label`: original label of the cutout in our data.  
    2. `preds`: predicted label of the cutout based on our model.
    3. `label_new`: correct label of the cutout as determined by manual inspection. It is initially set to the same value as `label` and can be changed with the `artifact` and `no artifact` toggle.
5. `grid_x` and `grid_y` sliders: The x-coordinate and y-coordinate of the top left corner of the cutout for the selected `error_id`.
    1. Moving the slider views other cutouts along the x-axis and y-axis of the whole image and updates the output table accordingly. This is only necessary if you would like to visualize the surrounding cutouts of the current `error_id`.
    2. If you navigate to another misclassified cutout, `error_id` will be updated.
6. `artifact` and `no artifact` toggle: The binary indicator that records whether the cutout contains an artifact and updates the `label_new` column in the table output. Its initial value reflects the current value of `label_new`.
7. `image` output: The 200x200 cutout of the current image at the `grid_x` and `grid_y` coordinates. The image can be changed either by selecting a new `image_name` and `error_id` or by using the `grid_x` or `grid_y` sliders.
    1. False positives are marked with a purple box and an `FP` label in the bottom right corner.
    2. False negatives are marked with a blue box and an `FN` label in the bottom right corner.
    3. True positives are marked with a green and red box.
    4. For false positives and true positives, the confidence score of the prediction is shown in the top left corner.
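
The `Prev` and `Next` behaviour described above amounts to stepping through one flat, ordered list of `(image_name, error_id)` pairs. The following is a minimal sketch of that idea, not the widget's actual code: the function `step`, the `errors_by_image` mapping, and the example image names are hypothetical, and stopping at either end of the list (rather than wrapping around) is an assumption.

```python
from typing import Dict, List, Tuple


def step(errors_by_image: Dict[str, List[int]],
         image_name: str, error_id: int, delta: int) -> Tuple[str, int]:
    """Move delta steps (+1 for Next, -1 for Prev) through all error ids."""
    # Flatten in image order; dicts keep insertion order in Python 3.7+.
    flat = [(name, eid)
            for name, ids in errors_by_image.items()
            for eid in ids]
    pos = flat.index((image_name, error_id)) + delta
    # Assumption: clamp at the first and last cutout instead of wrapping.
    pos = max(0, min(pos, len(flat) - 1))
    return flat[pos]


# Hypothetical example data: running error ids across two images.
errors = {"image_a": [1, 2], "image_b": [3]}
next_image, next_id = step(errors, "image_a", 2, +1)  # crosses into image_b
```

Because the identifiers are a running count across images, crossing an image boundary needs no special case once the list is flattened.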

#### Instructions:
1. Choose an image from the `image_name` drop-down list that has not yet been manually checked.
2. Navigate through the image’s misclassified 200x200 cutouts, identified with `error_id`, using the `Prev` and `Next` buttons.
3. Once an `error_id` is chosen, both the cutout image and the table containing its information will be refreshed.
4. The `artifact` and `no artifact` toggle indicates if there is an artifact (`1`) or no artifact (`0`) in the current cutout based on original labels in the data. If the indicator is correct, feel free to move to the next `error_id`. If you want to make changes, do the following: 
    1. For false negatives, if there is an artifact in the cutout image, click `artifact`. You should see the `label_new` column in the table change to 1. 
    2. For false positives, if there is no artifact, click `no artifact`. You should see the `label_new` column change to 0.
5. When you are done with the manual inspection, click `save` to export the updated table to `test_outputs.csv`.
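
As a rough illustration of the relabel-and-save steps above, the sketch below updates `label_new` for one cutout and serializes the table as CSV. It is not the implementation behind the `save` button: the helpers `relabel`, `to_csv_text`, and `save` are hypothetical, and the table is assumed to be a list of dictionaries with the column names used in this document.

```python
import csv
import io
from typing import Dict, List

Row = Dict[str, int]


def relabel(rows: List[Row], error_id: int, has_artifact: bool) -> None:
    """Set label_new for one cutout: 1 for artifact, 0 for no artifact."""
    for row in rows:
        if row["error_id"] == error_id:
            row["label_new"] = 1 if has_artifact else 0
            return
    raise KeyError(f"unknown error_id: {error_id}")


def to_csv_text(rows: List[Row]) -> str:
    """Serialize the table, using the first row's keys as the header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def save(rows: List[Row], path: str = "test_outputs.csv") -> None:
    """Write the updated table to disk."""
    with open(path, "w", newline="") as f:
        f.write(to_csv_text(rows))
```

The original `label` column is left untouched, so the exported file keeps both the original and the corrected label for each cutout.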


#### Notes:
1. The `grid_x` and `grid_y` sliders allow you to navigate along the x and y axes of the image if you would like to see the surrounding cutouts.
2. You may also manually inspect cutouts that are not among the 551 misclassified ones. If, while using the sliders, you come across a cutout that appears mislabeled, update the `artifact` and `no artifact` toggle; the change will be reflected in the table output and captured with `save`.
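
To make the slider coordinates concrete, here is a small sketch of the cutout grid. It rests on assumptions not stated in the text: that 4200 is the image width and 2000 the height, and that cutouts tile the image without overlap, so each slider moves in steps of 200 pixels. The helper names `grid_positions` and `neighbours` are hypothetical.

```python
from typing import List, Tuple

CUTOUT = 200                            # cutout side length in pixels
IMAGE_WIDTH, IMAGE_HEIGHT = 4200, 2000  # assumption: width x height


def grid_positions(size: int, cutout: int = CUTOUT) -> List[int]:
    """Valid top-left coordinates along one axis for non-overlapping cutouts."""
    return list(range(0, size - cutout + 1, cutout))


def neighbours(grid_x: int, grid_y: int,
               width: int = IMAGE_WIDTH, height: int = IMAGE_HEIGHT,
               cutout: int = CUTOUT) -> List[Tuple[int, int]]:
    """Top-left corners of the (up to 8) cutouts surrounding one cutout."""
    result = []
    for dy in (-cutout, 0, cutout):
        for dx in (-cutout, 0, cutout):
            x, y = grid_x + dx, grid_y + dy
            inside = 0 <= x <= width - cutout and 0 <= y <= height - cutout
            if (dx, dy) != (0, 0) and inside:
                result.append((x, y))
    return result
```

Under these assumptions each image has 21 positions along x and 10 along y, and a cutout in the image corner has only three neighbours to inspect.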

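```python
# Import the entry point explicitly rather than with a wildcard import.
from active_learning import active_learning

# Build and display the interactive interface described above.
active_learning()
```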
