###  Created by Luis Alejandro (alejand@umich.edu)

Builds datasets by writing one `.npy` file per observation. You can define each class by merging any set of aircraft types, and you can choose the feature extraction process along with its related digital signal processing options.
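To make the "one `.npy` file per observation" idea concrete, here is a minimal sketch using only NumPy. The directory layout (`<root>/<label>/<obs_id>.npy`), the helper names `save_observation` and `load_dataset`, and the array shape are assumptions for illustration; they are not taken from `npy_dataset` and may differ from what the builder actually writes.

```python
import tempfile
from pathlib import Path

import numpy as np


def save_observation(root, label, obs_id, features):
    """Write one observation's array to <root>/<label>/<obs_id>.npy (assumed layout)."""
    class_dir = Path(root) / label
    class_dir.mkdir(parents=True, exist_ok=True)
    path = class_dir / f"{obs_id}.npy"
    np.save(path, features)
    return path


def load_dataset(root):
    """Yield (label, array) pairs by walking the per-class directories."""
    for path in sorted(Path(root).glob("*/*.npy")):
        yield path.parent.name, np.load(path)


with tempfile.TemporaryDirectory() as tmp:
    rng = np.random.default_rng(0)
    # Hypothetical feature arrays: 12 signals x 8 features per observation.
    save_observation(tmp, "Airbus", 101, rng.normal(size=(12, 8)))
    save_observation(tmp, "Boeing", 202, rng.normal(size=(12, 8)))
    loaded = [(label, arr.shape) for label, arr in load_dataset(tmp)]
```

With a layout like this, class labels that contain a path separator (for example `ERJ170/175 (CF34-8E)`) would need to be sanitized before being used as directory names.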

**Install the following packages:**

pip install npTDMS<br>
pip install mysql-connector-python<br>

In [None]:
import mysql.connector
import pandas as pd
from npy_dataset import AircraftFeaturesExtractor

In [None]:
# Connect to DB
mydb = mysql.connector.connect(
    host = "localhost",
    user = "root",
    passwd = "cuba",
    database = "airnoise"
)
dbcursor = mydb.cursor()   

### Initial exploration on the dataset
The query below counts how many measurements are available per aircraft type. Note that each measurement contains 12 signals sampled simultaneously. To build your own dataset, choose which aircraft IDs belong to which class using a dictionary, as shown below.

In [None]:
sql = '''SELECT a.id_aircraft, a.identifier, COUNT(m.aircraft) AS observations, a.description
FROM measurements m, aircrafts a WHERE a.id_aircraft = m.aircraft AND m.quality IN (1,2,3)
GROUP BY m.aircraft ORDER BY observations;'''

dbcursor.execute(sql)
results = dbcursor.fetchall()
df = pd.DataFrame(results, columns=['aircraft', 'identifier', 'observations', 'description'])
df
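Before committing to a grouping, it can help to estimate how large each class would be. The sketch below sums the per-aircraft counts for a candidate `classes` dictionary and multiplies by the 12 signals per measurement mentioned above. The `counts` table holds made-up numbers standing in for the query result, so only the arithmetic is meaningful here.

```python
import pandas as pd

# Hypothetical counts standing in for the query result; real values come from the database.
counts = pd.DataFrame({
    "aircraft": [1, 2, 4, 9, 10],
    "observations": [120, 45, 30, 80, 95],
})

classes = {"Airbus": [1, 2, 4], "Boeing": [10, 9]}
SIGNALS_PER_MEASUREMENT = 12  # each measurement holds 12 simultaneous signals

per_aircraft = counts.set_index("aircraft")["observations"]

# Sum the measurements of every aircraft ID in a class; IDs missing from the table count as 0.
measurements = pd.Series(
    {
        label: int(per_aircraft.reindex(ids, fill_value=0).sum())
        for label, ids in classes.items()
    },
    name="measurements",
)
summary = measurements.to_frame()
summary["signals"] = summary["measurements"] * SIGNALS_PER_MEASUREMENT
```

A strongly unbalanced `summary` suggests merging or dropping aircraft types before building the dataset.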

### We manually build the classes
* Group aircraft into classes. You are free to group more than one aircraft type into the same class.
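Since the grouping is written by hand, one easy mistake is listing the same aircraft ID under two classes. The helper below, `invert_classes`, is a hypothetical sanity check and not part of `npy_dataset`; whether the builder performs a similar check itself is not known from this notebook.

```python
def invert_classes(classes):
    """Map each aircraft ID to its class label, rejecting IDs listed twice."""
    id_to_label = {}
    for label, aircraft_ids in classes.items():
        for aircraft_id in aircraft_ids:
            if aircraft_id in id_to_label:
                raise ValueError(
                    f"aircraft {aircraft_id} is in both "
                    f"{id_to_label[aircraft_id]!r} and {label!r}"
                )
            id_to_label[aircraft_id] = label
    return id_to_label


# Example grouping with the same shape as the dictionaries used in this notebook.
id_to_label = invert_classes({"Airbus": [1, 2, 4], "Boeing": [10, 9]})
```

The inverted mapping is also convenient for labelling individual measurements by their aircraft ID.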

In [None]:
classes = {
    'Airbus': [1,2,4],
    'Boeing': [10,9]
}
builder = AircraftFeaturesExtractor(dbcursor, classes, 51200)
result = builder.build()

In [None]:
classes = {
    'A320-2xx (CFM56-5)': [1],
    'B737-7xx (CF56-7B22-)': [10],
    'ERJ190 (CF34-10E)': [13],
    'B737-8xx (CF56-7B22+)': [9],
    'ERJ145 (AE3007)': [11],
    'A320-2xx (V25xx)': [2],
    'A319-1xx (V25xx)': [4],
    'ERJ170/175 (CF34-8E)': [12]
}
builder = AircraftFeaturesExtractor(dbcursor, classes, 51200)
result = builder.build()

In [None]:
classes = {
    'A320-2xx (CFM56-5)': [1],
    'B737-7xx (CF56-7B22-)': [10],
    'ERJ190 (CF34-10E)': [13],
    'B737-8xx (CF56-7B22+)': [9],
}

builder = AircraftFeaturesExtractor(dbcursor, classes, 51200, segmentation='tmid')
result = builder.build()