# Machine Learning for DOA - Dregon Dataset

The goal here is to use regression to determine the direction of arrival (DOA). To that end, I have already extracted the delays for every pair of microphones in the Dregon Dataset - Clean Speech.

The delays were extracted with the delayDatasetCreator.py script. The full set of delays may be redundant, since the deterministic algorithm uses only the delays relative to the microphone at the origin of the vector space.
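
The redundancy mentioned above can be illustrated with a minimal sketch. It assumes the 28 feature columns are ordered by microphone pair (i, j) with i < j, and that microphone 0 is the reference; neither is confirmed by the dataset description. The names `referenceDelays` and `rebuildAllDelays` are hypothetical helpers. In the ideal (noise-free) case, the delay between microphones i and j equals the delay (0, j) minus the delay (0, i), so the 7 reference delays determine the other 21.

```python
from itertools import combinations

import numpy as np

numMics = 8
# Assumed column order: (0,1), (0,2), ..., (0,7), (1,2), ..., (6,7)
pairs = list(combinations(range(numMics), 2))
# Columns holding the delays between microphone 0 and every other microphone
refColumns = [k for k, (i, j) in enumerate(pairs) if i == 0]

def referenceDelays(x):
    """Keep only the 7 columns with delays relative to microphone 0."""
    return np.asarray(x)[:, refColumns]

def rebuildAllDelays(xRef):
    """Rebuild the 28 pairwise delays from the 7 reference delays (ideal case only)."""
    xRef = np.asarray(xRef, dtype=float)
    # The delay of microphone 0 to itself is zero; prepend it so columns match mic numbers
    d0 = np.hstack([np.zeros((xRef.shape[0], 1)), xRef])
    return np.stack([d0[:, j] - d0[:, i] for i, j in pairs], axis=1)
```

With measured delays the reconstruction is only approximate, so the extra columns may still carry useful averaging information for a regressor.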

In [1]:
import pandas as pd
import numpy as np
from math import sqrt, cos, sin, pi
from sklearn.linear_model import BayesianRidge, LassoLars, LinearRegression, RANSACRegressor
from sklearn.model_selection import train_test_split as tts
from sklearn.metrics import mean_squared_error as mse, r2_score as r2

#### Opening the CSV with Pandas and separating the features from the targets

The dataset has the following columns:

WAV file name, features, true azimuth, true elevation

In [2]:
dataset       = pd.read_csv("/home/dimi/Programming/IC2019/DOA/Datasets/dregonDelaysDataset.csv")
xReal         = np.array(dataset[dataset.columns[1:-2]])
yRealAzimutal = np.array(dataset[dataset.columns[-2]])
yRealElevacao = np.array(dataset[dataset.columns[-1]])

In [3]:
print(xReal[0], yRealAzimutal[0], yRealElevacao[0])

[ 12  12  22  18  17   7   6  -1  10   6   4  -5  -6  11   7   5  -5  -6
  -5  -6 -16 -17  -1 -11 -12  -9 -10  -1] 45 -30


#### Defining some helper functions I will need

In [4]:
def radParaGrau(angulo):
    return (angulo*180)/pi

def grauParaRad(angulo):
    return (angulo*pi)/180

def vetorUnitario(vetor):
    return vetor/np.linalg.norm(vetor)

def tempoParaAmostras(tempo, freqAmostragem):
    return tempo * freqAmostragem

## Training with synthetic data

#### Creating the synthetic training data

The training data will not be real data. I will use the inner-product calculation to work out the ideal delay between each pair of microphones. That lets me create synthetic data representing the ideal case and train the regressors on it.

http://www.labbookpages.co.uk/audio/beamforming/delayCalc.html
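
The inner-product calculation can be sketched in vectorized form as follows. The function name `idealDelays` and its parameters are hypothetical; the default sampling rate of 44100 Hz and speed of sound of 340 m/s are the values used in this notebook. For each microphone pair (i, j) with i < j, the delay in samples is the dot product of the position difference with the unit wavefront vector, divided by the speed of sound and multiplied by the sampling rate.

```python
import numpy as np

def idealDelays(micPositions, azimuthDeg, elevationDeg,
                sampleRate=44100, soundSpeed=340.0):
    """Ideal far-field delays, in samples, for every microphone pair (i < j)."""
    micPositions = np.asarray(micPositions, dtype=float)
    az = np.radians(azimuthDeg)
    el = np.radians(elevationDeg)
    # Unit wavefront vector for the given azimuth and elevation
    w = np.array([np.cos(az) * np.cos(el),
                  np.sin(az) * np.cos(el),
                  np.sin(el)])
    # Pair indices in the order (0,1), (0,2), ..., (1,2), ...
    i, j = np.triu_indices(len(micPositions), k=1)
    diffs = micPositions[i] - micPositions[j]
    # Inner product gives a path difference; convert to seconds, then to samples
    return diffs @ w / soundSpeed * sampleRate
```

For an 8-microphone array this returns 28 values, one per pair, as floating-point sample counts that can be rounded or truncated afterwards.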

In [5]:
# ORIGINAL COORDINATES
coordenadasMics = np.array([
    [0.0420, 0.0615, -0.0410],
    [-0.0420, 0.0615, 0.0410],
    [-0.0615, 0.0420, -0.0410],
    [-0.0615, -0.0420, 0.0410],
    [-0.0420, -0.0615, -0.0410],
    [0.0420, -0.0615, 0.0410],
    [0.0615, -0.0420, -0.0410],
    [0.0615, 0.0420, 0.0410]
])

In [6]:
# CREATING THE XTRAIN AND YTRAIN ARRAYS
xTrain         = []
yTrainAzimutal = []
yTrainElevacao = []

# I WILL NEED THE DATASET SAMPLING FREQUENCY
freqAmostragem = 44100

# I WILL COMPUTE THE DELAYS AS IF THE SOUND HAD COME FROM THE FOLLOWING
# COMBINATIONS OF AZIMUTHS AND ELEVATIONS

# azimutaisDesejados = [45,60,75,90]
# elevacoesDesejadas = [-30,-15,0]
# for azimutalAtual in azimutaisDesejados:
#     for elevacaoAtual in elevacoesDesejadas:
for azimutalAtual in np.arange(0, 91, 10):
    for elevacaoAtual in np.arange(-90, 1, 10):

        # APPENDING TO THE YTRAIN ARRAYS
        yTrainAzimutal.append(azimutalAtual)
        yTrainElevacao.append(elevacaoAtual)

        # CONVERT TO RADIANS FOR THE CALCULATIONS
        azimutalAtualRad = grauParaRad(azimutalAtual)
        elevacaoAtualRad = grauParaRad(elevacaoAtual)

        # FOR EACH PAIR OF MICROPHONES
        linhaAtualXTrain = []
        for micI in range(0, 8):
            for micJ in range(micI + 1, 8):

                # COORDINATES OF THE DIFFERENCE BETWEEN THE MICS
                coordenadasDiferenca = coordenadasMics[micI] - coordenadasMics[micJ]

                # COORDINATES OF THE WAVEFRONT VECTOR (ALREADY NORMALIZED TO UNIT LENGTH)
                w = vetorUnitario(np.array([
                    cos(azimutalAtualRad)*cos(elevacaoAtualRad),
                    sin(azimutalAtualRad)*cos(elevacaoAtualRad),
                    sin(elevacaoAtualRad)
                ]))

                # COMPUTING THE INNER PRODUCT AND DIVIDING BY THE SPEED OF SOUND
                delayTemporal = coordenadasDiferenca[0] * w[0]
                delayTemporal += coordenadasDiferenca[1] * w[1]
                delayTemporal += coordenadasDiferenca[2] * w[2]
                delayTemporal /= 340

                # APPENDING EACH RESULT TO THE CURRENT XTRAIN ROW
                linhaAtualXTrain.append(tempoParaAmostras(delayTemporal, freqAmostragem))

        # NOW THE WHOLE ROW CAN GO INTO XTRAIN
        xTrain.append(linhaAtualXTrain)

# CONVERTING EVERYTHING TO NUMPY (astype(int) TRUNCATES TOWARD ZERO)
xTrain = np.array(xTrain).astype(int)
yTrainAzimutal = np.array(yTrainAzimutal)
yTrainElevacao = np.array(yTrainElevacao)

Checking that the synthetic data resemble the real data

In [7]:
indexReal     = 0
indexFicticio = 0

print(yRealAzimutal[indexReal], yRealElevacao[indexReal])
print(yTrainAzimutal[indexFicticio], yTrainElevacao[indexFicticio])
print("\n\n\n")

for i, feature in enumerate(xReal[indexReal]):
    print(feature, "\t", xTrain[indexFicticio][i])

45 -30
0 -90




12 	 10
12 	 0
22 	 10
18 	 0
17 	 10
7 	 0
6 	 10
-1 	 -10
10 	 0
6 	 -10
4 	 0
-5 	 -10
-6 	 0
11 	 10
7 	 0
5 	 10
-5 	 0
-6 	 10
-5 	 -10
-6 	 0
-16 	 -10
-17 	 0
-1 	 10
-11 	 0
-12 	 10
-9 	 -10
-10 	 0
-1 	 10
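
The comparison above pairs a real sample at (45, -30) with the synthetic sample at (0, -90), so the two columns describe different directions. A minimal sketch of picking the synthetic row closest to a given real direction is shown below. The helper `nearestGridIndex` and the names `gridAzimuths` and `gridElevations` are hypothetical; the grid is rebuilt here with the same ranges used for the synthetic set, in the same loop order (azimuth outer, elevation inner).

```python
import numpy as np

def nearestGridIndex(azimuths, elevations, targetAzimuth, targetElevation):
    """Index of the grid point closest to the target (sum of absolute angle differences)."""
    azimuths = np.asarray(azimuths, dtype=float)
    elevations = np.asarray(elevations, dtype=float)
    distance = np.abs(azimuths - targetAzimuth) + np.abs(elevations - targetElevation)
    # argmin returns the first minimum, so ties go to the earlier grid point
    return int(np.argmin(distance))

# Same grid as the synthetic training set: azimuth 0..90, elevation -90..0, step 10
gridAz, gridEl = np.meshgrid(np.arange(0, 91, 10), np.arange(-90, 1, 10), indexing="ij")
gridAzimuths = gridAz.ravel()
gridElevations = gridEl.ravel()

indexFicticio = nearestGridIndex(gridAzimuths, gridElevations, 45, -30)
print(gridAzimuths[indexFicticio], gridElevations[indexFicticio])
```

Because 45 degrees is not on the 10-degree grid, the closest synthetic row is still 5 degrees away in azimuth, so some mismatch between the columns is expected even in the ideal case.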


#### Training the regressor and predicting data

In [8]:
# AZIMUTH FIRST
regressorAzimutal = LinearRegression().fit(xTrain, yTrainAzimutal)
yPredAzimutal = regressorAzimutal.predict(xReal)

for i, predicao in enumerate(yPredAzimutal):
    print(yRealAzimutal[i], predicao)

45 55.63241747408091
90 88.49189769645378
60 56.58551317587522
75 82.86782914247749
60 68.36803847514051
45 39.07923320253231
60 72.95501880296953
45 52.08603112423505
90 84.42178409114487
90 87.4703663990154
60 66.66448537074182
60 62.20721456267512
75 73.13524307962288
45 55.231774415445386
75 78.65299849082857
90 89.95863974509201
60 62.11187134438064
60 73.1628294606847
60 71.10490486153508
75 76.48982982144986
90 78.0740372478586
75 87.35105452116466
75 83.59675182221622
60 55.21050115438214
90 95.31575827521371
45 43.00687052474884
90 91.33365176658502
75 71.36106802538504
90 101.02772560564114
75 92.08734226947251
75 84.48528132038851
90 101.23327741640904
45 45.41114054407722
45 44.604280751775796
60 73.56904619435072
75 74.99403194913967
60 73.79144439988454
45 42.18947745767998
45 52.28524554419343
75 81.33096677835161
75 74.72630939629734
90 74.16493595429603
75 73.93029476722072
60 52.06431678126988
45 48.440663321039665
75 78.65299849082857
75 73.13524307962288
90 93.43155
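
The printed pairs are hard to judge by eye. A minimal sketch of summarizing them with a few error figures follows. The helper `errorSummary` is hypothetical, and the sample values are just the first five rows printed above; the `mse` and `r2` functions imported at the top of the notebook could be used for the same purpose.

```python
import numpy as np

def errorSummary(yTrue, yPred):
    """Root-mean-square, mean absolute and maximum absolute error, in degrees."""
    err = np.asarray(yPred, dtype=float) - np.asarray(yTrue, dtype=float)
    return {
        "rmse": float(np.sqrt(np.mean(err ** 2))),
        "mae": float(np.mean(np.abs(err))),
        "maxAbs": float(np.max(np.abs(err))),
    }

# First five (real, predicted) azimuth pairs from the output above
yTrueSample = [45, 90, 60, 75, 60]
yPredSample = [55.63241747408091, 88.49189769645378, 56.58551317587522,
               82.86782914247749, 68.36803847514051]
print(errorSummary(yTrueSample, yPredSample))
```

In the notebook itself the call would be `errorSummary(yRealAzimutal, yPredAzimutal)`, which covers every sample instead of this small excerpt.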

In [9]:
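# NOW THE ELEVATION
regressorElevacao = LinearRegression().fit(xTrain, yTrainElevacao)
# PREDICT WITH THE ELEVATION REGRESSOR (THE ORIGINAL CELL CALLED regressorAzimutal HERE,
# WHICH IS WHY THE RECORDED OUTPUT BELOW REPEATS THE AZIMUTH PREDICTIONS)
yPredElevacao = regressorElevacao.predict(xReal)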

for i, predicao in enumerate(yPredElevacao):
    print(yRealElevacao[i], "\t", predicao)

-30 	 55.63241747408091
-15 	 88.49189769645378
-30 	 56.58551317587522
-15 	 82.86782914247749
0 	 68.36803847514051
-30 	 39.07923320253231
0 	 72.95501880296953
-15 	 52.08603112423505
0 	 84.42178409114487
-15 	 87.4703663990154
-15 	 66.66448537074182
-15 	 62.20721456267512
-30 	 73.13524307962288
-15 	 55.231774415445386
-15 	 78.65299849082857
0 	 89.95863974509201
-30 	 62.11187134438064
-15 	 73.1628294606847
0 	 71.10490486153508
0 	 76.48982982144986
-15 	 78.0740372478586
0 	 87.35105452116466
-15 	 83.59675182221622
-30 	 55.21050115438214
-15 	 95.31575827521371
-15 	 43.00687052474884
0 	 91.33365176658502
0 	 71.36106802538504
0 	 101.02772560564114
0 	 92.08734226947251
-15 	 84.48528132038851
-15 	 101.23327741640904
-15 	 45.41114054407722
0 	 44.604280751775796
-15 	 73.56904619435072
-15 	 74.99403194913967
0 	 73.79144439988454
-15 	 42.18947745767998
0 	 52.28524554419343
-30 	 81.33096677835161
0 	 74.72630939629734
-30 	 74.16493595429603
-30 	 73.930294767220

## Training with real data

### Linear Regression

In [10]:
objLS = LinearRegression()

xTrain, xTest, yTrain, yTest = tts(xReal, yRealAzimutal, test_size=0.90)
    
objLS.fit(xTrain, yTrain)
    
yPred = objLS.predict(xTest)

for i, predicao in enumerate(yPred):
    print(yTest[i], predicao)

75 76.17855844887744
90 89.96574547535639
90 90.4614716945006
75 75.97443752787783
45 44.10892803585473
90 88.91025365053218
90 89.39382981318862
60 61.400188011853615
90 89.89451213931358
45 45.000000000000014
75 74.59055637172678
60 62.31465248346205
60 60.29780044287071
90 90.2159336217386
90 89.64095548333617
75 74.94967533138261
60 64.22607629860636
60 60.293020017383284
60 59.47866436657129
60 60.60448047467416
60 59.1699963475711
75 75.4573720322617
90 86.2158991417883
60 56.99179829567764
45 44.01715420332897
75 76.7242151676936
45 51.55320717053421
60 60.697935438259414
75 74.5489816310147
45 45.9447273546635
45 45.94816062168144
90 86.97173499758969
90 89.62440344952205
90 86.92045538987486
45 44.623625568889324
90 90.96905718272082
45 47.56582078022957
60 62.66119599275334
75 75.98257272716828
45 43.75492834782071
45 44.982831806538364
75 74.89143454683004
90 90.92148949717395
60 60.779278264067244
60 63.74713996601269
60 63.54787878683197
45 46.42849332698286
45 43.92259199

### Lasso Lars

In [11]:
objLL = LassoLars()

xTrain, xTest, yTrain, yTest = tts(xReal, yRealAzimutal, test_size=0.5)
    
objLL.fit(xTrain, yTrain)
    
yPred = objLL.predict(xTest)

for i, predicao in enumerate(yPred):
    print(yTest[i], predicao)

60 66.02957682314374
45 60.09022322429775
60 66.02957682314374
75 72.91403847263383
60 66.02957682314374
90 78.81545387341629
45 60.2841063485772
60 65.91157009499136
45 60.12816142236129
90 76.57121758767099
75 72.91403847263383
90 75.39009608072213
45 60.09022322429775
60 66.02957682314374
60 65.91157009499136
60 66.02957682314374
45 61.347221127373665
45 60.2841063485772
45 60.12816142236129
75 71.85092369383736
60 65.91157009499136
75 71.85092369383736
90 77.63433236646745
45 60.09022322429775
45 60.12816142236129
45 60.09022322429775
45 61.465227855526045
60 66.06751502120729
90 77.63433236646745
45 60.09022322429775
60 65.91157009499136
60 65.91157009499136
60 66.02957682314374
45 60.12816142236129
75 72.91403847263383
60 66.06751502120729
45 60.12816142236129
75 71.85092369383736
45 60.09022322429775
45 60.12816142236129
45 60.09022322429775
45 60.2841063485772
45 61.347221127373665
90 78.81545387341629
90 77.63433236646745
60 66.02957682314374
75 71.85092369383736
75 72.9140384

### Bayesian Ridge

In [12]:
objBR = BayesianRidge()

xTrain, xTest, yTrain, yTest = tts(xReal, yRealAzimutal, test_size=0.9)
    
objBR.fit(xTrain, yTrain)
    
yPred = objBR.predict(xTest)

for i, predicao in enumerate(yPred):
    print(yTest[i], predicao)

90 87.64364480829146
45 44.6945823149338
45 44.126254712032804
45 46.09251985377163
90 84.90331491145034
60 60.99911111220389
60 58.741589101017
75 76.21845291188238
90 86.06874777855737
75 73.87794330834443
90 86.62179326302979
45 46.458732016700004
75 75.04188614607247
60 58.179734653581114
60 59.59117031765773
75 74.97598992853493
90 89.94116246611793
60 58.700158454693764
90 87.22801378379543
90 87.68378812273406
60 59.29291288824884
45 43.84783762583142
90 89.72220971074427
90 87.79536230314514
45 50.4675660002913
75 72.98662177166662
45 45.295417819862564
45 44.06999911725845
60 61.25144902048494
60 60.36029227161646
45 46.58050166811576
90 90.07341072916446
45 46.54300143862203
45 44.44624269506662
45 48.016182345358885
60 58.5964370098458
90 89.9781293771043
90 86.56884428919638
75 74.73823640927264
45 44.30647034164142
45 43.84783762583142
90 91.18956793846121
45 43.49859066363915
75 75.33205365120175
45 47.1162107463104
75 74.99999604273498
60 60.91085038303035
60 62.00781731
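
The three cells above each draw a different random split, and the Lasso Lars cell uses `test_size=0.5` while the others use 0.9, so their outputs are not directly comparable. A minimal sketch of scoring all three regressors on the same folds is given below. The helper `compareRegressors` is hypothetical, and `xDemo`/`yDemo` are made-up stand-in data used only so the block runs on its own; in the notebook they would be replaced by `xReal` and `yRealAzimutal`.

```python
import numpy as np
from sklearn.linear_model import BayesianRidge, LassoLars, LinearRegression
from sklearn.model_selection import KFold, cross_val_score

def compareRegressors(x, y, nSplits=5, seed=0):
    """Cross-validated RMSE for each regressor, using identical folds for all of them."""
    models = {
        "LinearRegression": LinearRegression(),
        "LassoLars": LassoLars(),
        "BayesianRidge": BayesianRidge(),
    }
    folds = KFold(n_splits=nSplits, shuffle=True, random_state=seed)
    results = {}
    for name, model in models.items():
        # scikit-learn reports negated MSE so that larger is always better
        negMse = cross_val_score(model, x, y, cv=folds, scoring="neg_mean_squared_error")
        results[name] = float(np.sqrt(-negMse.mean()))
    return results

# Stand-in data (NOT the Dregon delays): 120 rows of 28 integer-valued features
rng = np.random.default_rng(0)
xDemo = rng.integers(-20, 21, size=(120, 28)).astype(float)
yDemo = xDemo[:, :3] @ np.array([1.0, -0.5, 0.25]) + rng.normal(0, 1, size=120)

for name, rmse in compareRegressors(xDemo, yDemo).items():
    print(name, rmse)
```

Fixing `random_state` keeps the folds reproducible between runs, which the `tts` calls above do not do.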