# Welcome to the Machine Learning for IDE Students application!

This is a Jupyter Notebook created to introduce IDE students to using machine learning for prototype improvement. It might seem like a lot if you don't have any coding experience, but if you follow the steps and read the instructions, it should be a breeze!

## Setting up

This application will take you through the preprocessing, labeling and classification steps of the machine learning process. Make sure you've read through the [User Guide](https://docs.google.com/document/d/1J9c5sHokh8Rj-4lO4yKX1Tv7LwY2_r-mfQ9YKviiXZk/edit?usp=sharing) before you continue.

To access the code, we need to import packages. We wrote these packages ourselves, and they contain the main functions of the program. The actual Python code lives in separate files, so you get a clean interface without a lot of code in the notebook.

**Make sure you run the two cells below!** To run your first cell, just click inside the cell and press play. You can consult [this tutorial](https://www.dataquest.io/blog/jupyter-notebook-tutorial/) if you experience difficulty using Anaconda.

In [1]:
from AI_for_Designers.data_processing import Preprocessing, empty_files
from AI_for_Designers.active_learning import ActiveLearning

In [6]:
Activity = 'Walking3'

## Pre-processing
During preprocessing, the raw data is divided into frames: short time windows that are analyzed one at a time. This makes for more confident predictions and a faster model.

There are **4 variables** that are important for you to edit:

 - Frame size (size=...): the length of each frame in seconds. Set it larger for slow movements and smaller for very quick movements. Default = 2 sec
 - Frame offset (offset=...): increase this if you set a bigger frame size
 - Start-chop (start_offset=...)
 - End-chop (stop_offset=...)
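To get a feel for how these four values interact, here is a minimal sketch of cutting a recording into frames. It is an illustration only, not the code this application runs and not a cell you need to edit. It assumes that `offset` is the time between the starts of two consecutive frames, and that `start_offset` and `stop_offset` are the number of seconds chopped off the beginning and end of the recording. The function name `make_frames` and the `duration` parameter are made up for this example.

```python
def make_frames(duration, size=2.0, offset=0.2, start_offset=0.0, stop_offset=0.0):
    """Return (start, end) times in seconds for each frame.

    Hypothetical helper for illustration; not the package's actual code.
    Assumes `offset` is the step between the starts of consecutive frames.
    """
    first = start_offset            # skip the first seconds of the recording
    last = duration - stop_offset   # ignore the final seconds of the recording
    frames = []
    i = 0
    # Keep adding frames while a whole frame still fits before `last`
    while first + i * offset + size <= last + 1e-9:
        start = first + i * offset
        frames.append((round(start, 6), round(start + size, 6)))
        i += 1
    return frames


# A 10-second recording with 2.5 s chopped off the start and 5 s off the end
frames = make_frames(duration=10.0, size=1.0, offset=0.5,
                     start_offset=2.5, stop_offset=5.0)
print(len(frames), "frames:", frames)
```

Under these assumptions, a smaller offset gives more frames that overlap more, while a larger frame size gives fewer frames that each cover more of the movement.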

Enter the values below, and run the cell to start the preprocessing. 

After preprocessing, the application will have extracted features such as the standard deviation and the most present (dominant) frequency. These features are used to analyze the characteristics of a data point and classify it as an activity.
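If you are curious what such features look like in code, the sketch below computes the two features named above for a single frame of one sensor. It is a simplified illustration using only the Python standard library, not the implementation inside `AI_for_Designers`. The function name `frame_features` and the `sample_rate` parameter are assumptions made for this example, and the test signal is synthetic.

```python
import cmath
import math
import statistics


def frame_features(samples, sample_rate):
    """Return (standard deviation, dominant frequency in Hz) for one frame.

    Hypothetical helper for illustration; not the package's actual code.
    """
    std = statistics.pstdev(samples)
    mean = statistics.fmean(samples)
    centered = [x - mean for x in samples]  # remove the constant part of the signal
    n = len(centered)
    best_k, best_mag = 0, 0.0
    # Naive discrete Fourier transform over the positive frequencies
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        if abs(coeff) > best_mag:
            best_k, best_mag = k, abs(coeff)
    # Convert the strongest frequency bin to Hz
    return std, best_k * sample_rate / n


# Synthetic test signal: one second of a 2 Hz sine wave sampled at 50 Hz
rate = 50
signal = [math.sin(2 * math.pi * 2 * t / rate) for t in range(rate)]
std, freq = frame_features(signal, rate)
print(round(std, 3), freq)
```

A slow, large movement would typically show up as a low dominant frequency, while a quick, shaky movement would show up as a higher one.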

In [9]:
empty_files([f'Preprocessed-data/{Activity}/features_{Activity}.txt',
             f'Preprocessed-data/{Activity}/features_{Activity}_scaled.csv',
             f'Preprocessed-data/{Activity}/processed_data_files.txt'])

pre = Preprocessing(Activity)
pre.windowing([r"Data/data-lopen/Walking_part_1.csv", r"Data/data-lopen/Walking_part_1_gyro.csv"],
              r"Data/data-lopen/Walking_part_1.mp4",
              start_offset=2.5, stop_offset=5, size=1, offset=0.2, epsilon=0.01,
              do_plot=False, do_scale=False)
pre.windowing([r"Data/data-lopen/Walking_part_2.csv", r"Data/data-lopen/Walking_part_2_gyro.csv"],
              r"Data/data-lopen/Walking_part_2.mp4",
              start_offset=2.5, stop_offset=5, size=1, offset=0.2, epsilon=0.01,
              do_plot=False, do_scale=True)

Amount of sensors: 6, amount of features per sensor: 8


## Labeling and training

If you've set up your preprocessing correctly, this step should be quite easy. The application will show you part of your recording and ask you to label it according to the action performed on screen. You can add a new label if the activity you see is not one of the options you expected.

First, enter the activities that you think will be in the video. Don't worry about being exhaustive; you can always add more later.

An example:

    labels = ['walking', 'running', 'stairs_up', 'stairs_down']

Enter your labels inside the code block below. **Only change the text inside the brackets, and remember to put the values in quotation marks and separate them with commas.**

Next up is training the model. After entering your labels, run the cell and you will be shown a GIF and asked to label the activity you see. Take your time to label the data correctly, as the results fully rely on accurate labels. You get the option to delete a data point if you are not completely sure about the label or if it's a faulty sample. 
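When the cell runs, the prompt asks you to enter either the index or the name of a label. The sketch below shows one way such an answer could be matched against your list, and why a typo leads to a "Label does not exist!" message. It is an illustration only: the function name `resolve_label` is made up for this example, and the package's own matching logic may differ.

```python
def resolve_label(answer, labels):
    """Return the label matching `answer`, or None if there is no match.

    Hypothetical helper for illustration; not the package's actual code.
    """
    answer = answer.strip()
    if answer.isdigit():
        index = int(answer)  # the menu is numbered from 1, not 0
        if 1 <= index <= len(labels):
            return labels[index - 1]
        return None
    # Otherwise the answer must match a label name exactly
    return answer if answer in labels else None


labels = ['walking', 'running', 'stairs_up', 'stairs_down']
print(resolve_label('2', labels))          # matched by index
print(resolve_label('stairs_up', labels))  # matched by name
print(resolve_label('jumping', labels))    # no match
```

In this sketch names must match exactly, so 'Walking' or 'stairs up' would not be recognized; typing the index is usually the quickest way to avoid such typos.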

In [10]:
labels = ['walking', 'running', 'stairs_up', 'stairs_down']
# vid = VideoLabeler(labels)
# vid.labeling

AL = ActiveLearning(fr'Preprocessed-data/{Activity}/features_{Activity}_scaled.csv', Activity, labels, 1)
AL.training(5)
AL.plotting()
AL.write_to_file()

10.000000000000004 1


Enter the index or the name of one of the following labels. Enter 'n' to add a new label:
1. walking
2. running
3. stairs_up
4. stairs_down
Label does not exist! Try again
Enter the index or the name of one of the following labels. Enter 'n' to add a new label:
1. walking
2. running
3. stairs_up
4. stairs_down


In [5]:
AL.testing(10)

59.80000000000028 1


TESTING
Enter the index or the name of one of the following labels. Enter 'n' to add a new label:
1. walking
2. running
3. stairs_up
4. stairs_down
5. open_door
Error rate: 0.2 (10 samples)


(2, 10)
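The error rate reported during testing is most naturally read as the fraction of test samples for which the model's prediction differed from the label you gave. The sketch below shows that calculation under this assumption. The function name `error_rate` and the ten example labels are invented for illustration; they are not taken from the package or from a real recording.

```python
def error_rate(predicted, actual):
    """Return the fraction of samples where the prediction was wrong.

    Hypothetical helper for illustration; not the package's actual code.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    wrong = sum(p != a for p, a in zip(predicted, actual))
    return wrong / len(actual)


# Made-up example: 10 test samples, 2 of which the model got wrong
actual = ['walking'] * 6 + ['running'] * 4
predicted = ['walking'] * 5 + ['running'] * 5
predicted[-1] = 'walking'
rate = error_rate(predicted, actual)
print(f"Error rate: {rate} ({len(actual)} samples)")
```

With only 10 test samples, each mistake changes the error rate by 0.1, so testing on more samples gives a more reliable picture of how well the model performs.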


## Congratulations!

You have just trained a machine-learning model. It's now ready to make predictions on real-world data, detect unexpected use of your product, and show you analytics about usage. This will help you to create a better product.


Now you can feed the trained model more data, for example data recorded during real-life use of your product.