# AI Air Conditioner Detector

We will be using the `pyAudioAnalysis` library to train an AI model to classify audio samples.  
For demonstration purposes, we will train it to classify a variety of sounds from the UrbanSounds dataset, which in turn makes it easier to distinguish an air conditioner from other environmental noise.

First we import the library:

In [1]:
from pyAudioAnalysis import audioTrainTest as aT

#### Update description with actual dataset in `audio_training_dataset`
Next we need some examples of each class we want to train the model on. In this case we have a directory, `AC_sounds/`, that contains 31 audio recordings of air conditioners, and another directory, `dog_sounds/`, that contains 28 audio recordings of dogs barking.  
These files were obtained from the <a href='https://www.kaggle.com/mehmetokuyar/urbansounds'>UrbanSounds dataset</a>, which can be found on <a href='https://www.kaggle.com'>Kaggle.com</a>. Kaggle is a great resource for finding datasets for machine learning purposes.

The `pyAudioAnalysis` library provides a convenient wrapper function, `extract_features_and_train()`, that extracts audio features and trains a classifier on them. This function takes the following parameters:

1. **paths**: a list of directory paths. Each directory contains a single audio class whose samples are stored in separate WAV files.
2. **mid_window**, **mid_step**: mid-term window length and step (in seconds)
3. **short_window**, **short_step**: short-term window length and step (in seconds)
4. **classifier_type**: "svm" or "knn" or "randomforest" or "gradientboosting" or "extratrees"
5. **model_name**: name of the model to be saved
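
The **paths** argument can be typed out by hand, but it can also be assembled from a parent directory. The following is a minimal sketch, assuming every class lives in its own sub-directory of a common root; `collect_class_dirs` is a hypothetical helper written for this illustration and is not part of `pyAudioAnalysis`.

```python
from pathlib import Path


def collect_class_dirs(root):
    """Return sub-directories of `root` that contain at least one WAV file.

    Hypothetical helper (not part of pyAudioAnalysis). Each returned
    directory is meant to be used as one audio class.
    """
    class_dirs = []
    # Sort so the result is the same on every run.
    for entry in sorted(Path(root).iterdir()):
        if not entry.is_dir():
            continue
        # Keep the directory only if it holds at least one .wav file.
        has_wav = any(
            f.is_file() and f.suffix.lower() == ".wav" for f in entry.iterdir()
        )
        if has_wav:
            class_dirs.append(str(entry))
    return class_dirs
```

The result of something like `collect_class_dirs("./archive/UrbanSound8K")` could then be passed as **paths**. Keep in mind that the helper returns the directories in alphabetical order, and, judging by the classification output later in this notebook, the class labels follow the order of the **paths** list.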

This function does not return anything; instead, the resulting classifier and its model parameters are saved to files.

We call the function as follows (the final `False` is the optional `compute_beat` argument, which leaves beat features turned off):

In [6]:
aT.extract_features_and_train(["./archive/UrbanSound8K/air_conditioner_files",
                               "./archive/UrbanSound8K/dog_bark_files",
                               "./archive/UrbanSound8K/car_horn_files",
                               "./archive/UrbanSound8K/children_playing_files",
                               "./archive/UrbanSound8K/drilling_files",
                               "./archive/UrbanSound8K/engine_idling_files",
                               "./archive/UrbanSound8K/gun_shot_files",
                               "./archive/UrbanSound8K/jackhammer_files",
                               "./archive/UrbanSound8K/siren_files",
                               "./archive/UrbanSound8K/street_music_files"], 
                              1.0, 1.0, aT.shortTermWindow, aT.shortTermStep, "svm", "urbanSounds", False)

Analyzing file 1 of 606: ./archive/UrbanSound8K/air_conditioner_files/100852-0-0-0.wav
Analyzing file 2 of 606: ./archive/UrbanSound8K/air_conditioner_files/100852-0-0-1.wav
Analyzing file 3 of 606: ./archive/UrbanSound8K/air_conditioner_files/100852-0-0-10.wav
Analyzing file 4 of 606: ./archive/UrbanSound8K/air_conditioner_files/100852-0-0-11.wav
Analyzing file 5 of 606: ./archive/UrbanSound8K/air_conditioner_files/100852-0-0-12.wav
Analyzing file 6 of 606: ./archive/UrbanSound8K/air_conditioner_files/100852-0-0-13.wav
Analyzing file 7 of 606: ./archive/UrbanSound8K/air_conditioner_files/100852-0-0-14.wav
Analyzing file 8 of 606: ./archive/UrbanSound8K/air_conditioner_files/100852-0-0-15.wav
Analyzing file 9 of 606: ./archive/UrbanSound8K/air_conditioner_files/100852-0-0-16.wav
Analyzing file 10 of 606: ./archive/UrbanSound8K/air_conditioner_files/100852-0-0-17.wav
Analyzing file 11 of 606: ./archive/UrbanSound8K/air_conditioner_files/100852-0-0-18.wav
Analyzing file 12 of 606: ./arch

And that's it, we now have a trained AI model!

Training writes the classifier files to the current directory, which is why the `urbanSounds*` files are there.
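
To confirm that training actually wrote something, we can list the files whose names start with the model name. This is a small sketch using only the standard library; `list_model_files` is a hypothetical helper, and the exact set of files produced depends on the classifier type.

```python
from pathlib import Path


def list_model_files(model_name, directory="."):
    """Return the names of files in `directory` that start with `model_name`."""
    matches = Path(directory).glob(model_name + "*")
    return sorted(p.name for p in matches if p.is_file())


# Prints an empty list if no model has been trained in this directory yet.
print(list_model_files("urbanSounds"))
```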

To test it out, we can use the `file_classification()` function, which takes three parameters:
1. **path**: Path to a WAV file to be classified
2. **model_name**: Name of the model we want to use
3. **classifier_type**: "svm" or "knn" or "randomforest" or "gradientboosting" or "extratrees"

We will pass it a WAV file of what we know is an air conditioner:

In [3]:
aT.file_classification("./AC_sounds/100852-0-0-0.wav", "urbanSounds","svm")

(0.0,
 array([0.71248313, 0.00300407, 0.04405805, 0.01606195, 0.0874116 ,
        0.05001789, 0.00231831, 0.06691664, 0.00359959, 0.01412878]),
 ['air_conditioner_files',
  'dog_bark_files',
  'car_horn_files',
  'children_playing_files',
  'drilling_files',
  'engine_idling_files',
  'gun_shot_files',
  'jackhammer_files',
  'siren_files',
  'street_music_files'])

Now let's examine what this function returned.  
First is the class ID (`0.0` here), which is the index in the class list of the class the file most closely matches.  
Second is an array of probabilities; each entry is the estimated probability that the file belongs to the class at the same index in the last list.  
Finally, there is the list of class names that the trained model is able to distinguish.
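
The three return values are easier to read once they are combined. The sketch below pairs each class name with its probability and ranks them; `summarize_prediction` is a hypothetical helper, and the numbers are copied by hand from the output above rather than produced by a fresh call to `file_classification()`.

```python
def summarize_prediction(class_id, probabilities, class_names):
    """Return the winning class name and (name, probability) pairs, best first."""
    # class_id comes back as a float (0.0 above), so cast it before indexing.
    winner = class_names[int(class_id)]
    pairs = zip(class_names, (float(p) for p in probabilities))
    ranked = sorted(pairs, key=lambda pair: pair[1], reverse=True)
    return winner, ranked


# Values copied from the output shown above.
class_id = 0.0
probabilities = [0.71248313, 0.00300407, 0.04405805, 0.01606195, 0.0874116,
                 0.05001789, 0.00231831, 0.06691664, 0.00359959, 0.01412878]
class_names = ["air_conditioner_files", "dog_bark_files", "car_horn_files",
               "children_playing_files", "drilling_files", "engine_idling_files",
               "gun_shot_files", "jackhammer_files", "siren_files",
               "street_music_files"]

winner, ranked = summarize_prediction(class_id, probabilities, class_names)
print("Predicted class:", winner)
for name, prob in ranked[:3]:
    print(f"{name}: {prob:.3f}")
```

For this file the predicted class is `air_conditioner_files` with a probability of about 0.71, followed at a distance by `drilling_files` and `jackhammer_files`.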