In [1]:
%reload_ext autoreload
%autoreload 2

## Data series

The measurements used in this study were obtained at the Antarctic station Marambio (\ang{64;15;}S; \ang{56;38;}W), at 198 m a.s.l., with a Differential Mobility Particle Sizer (DMPS) \cite{Asmi2018}.

A laboratory for aerosol studies, set up in 2013, continuously measures several optical and chemical properties of atmospheric particles \cite{Asmi2018}. From the complete series, a subset of the 2017 data was selected: 157 days on which the measurements were considered representative of the "real" conditions of the atmosphere. Periods when the instrument was not working, or was operating outside its nominal parameters, were removed, and data affected by local contamination were also filtered out.
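The selection criteria described above can be illustrated with a minimal sketch. It is not the procedure actually applied to the Marambio series: the `Scan` record, its `instrument_ok` and `contaminated` flags, and the sample values are all hypothetical and serve only to show the kind of filter involved.

```python
from dataclasses import dataclass


@dataclass
class Scan:
    """Hypothetical record for one DMPS scan (illustrative fields only)."""
    timestamp: str
    instrument_ok: bool   # instrument running within nominal parameters
    contaminated: bool    # scan flagged as affected by local contamination


def keep_valid(scans):
    """Keep scans taken with a healthy instrument and free of contamination."""
    return [s for s in scans if s.instrument_ok and not s.contaminated]


# Made-up example values, not real measurements
scans = [
    Scan("2017-01-01T00:00", instrument_ok=True, contaminated=False),
    Scan("2017-01-01T00:10", instrument_ok=False, contaminated=False),
    Scan("2017-01-01T00:20", instrument_ok=True, contaminated=True),
]
valid = keep_valid(scans)
print([s.timestamp for s in valid])
```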

I create the datasets used to train and evaluate the models.

In [30]:
from npfd import data

X_train, X_val, y_train, y_val = data.dataset.make_dataset(dataset_name='dmps', test_size=0.2, seed=7)

INFO:root:Converting real raw files to HTK format ...
INFO:root:Generating Master Label File (Train)...
INFO:root:Generating Master Label File (Test)...


In [31]:
print(f"Total number of files: {X_train['count'] + X_val['count']}\n\
Number of training files: {X_train['count']}\n\
Number of validation files: {X_val['count']}")

Total number of files: 1237
Number of training files: 991
Number of validation files: 246
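For reference, a reproducible hold-out split driven by a `test_size` fraction and a `seed` can be sketched as follows. This is only an illustration of the general technique under my own assumptions; it is not the implementation of `make_dataset`, and it is not expected to reproduce the exact counts reported above. The function name `split_files` and the file names are hypothetical.

```python
import random


def split_files(file_names, test_size=0.2, seed=7):
    """Shuffle the file names with a fixed seed and hold out a fraction."""
    shuffled = sorted(file_names)          # fixed starting order
    random.Random(seed).shuffle(shuffled)  # private RNG, so the split is repeatable
    n_val = round(len(shuffled) * test_size)
    return shuffled[n_val:], shuffled[:n_val]  # (train, validation)


# Hypothetical file names, for illustration only
files = [f"sample_{i:04d}" for i in range(10)]
train_files, val_files = split_files(files)
print(len(train_files), len(val_files))
```

Using a dedicated `random.Random(seed)` instance keeps the split independent of any other code that touches the global random state.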


I generate plots of the data to be used, together with their labels.

In [27]:
from npfd import visualization as viz

viz.visualize.generate_plots('train', X_train, y_train)
viz.visualize.generate_plots('test', X_val, y_val)

## HTK

In [53]:
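from npfd.models.base import HiddenMarkovModel

# Hyperparameters forwarded to initialization and training
vf = 20    # passed as variance_floor when initializing the model
mv = 0.02  # passed as minimum_variance on every training run

model = HiddenMarkovModel()
model.initialize(X_train, variance_floor=vf)

# Re-estimate the model parameters on the training set
model.train(X_train, y_train, minimum_variance=mv, trace=4)

# Evaluate the trained model on the validation set
results = model.test(X_val, y_val)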

INFO:root:Initializing model...
INFO:root:/home/gfogwil/Documentos/Facultad/Tesis/models/bdb/npfd/models/HTK/misc
INFO:root:Training the model...



/home/gfogwil/Documentos/Facultad/Tesis/models/bdb/notebooks/RPIC
Pruning-Off
Starting Model Update
Model e[1] to be updated with 132 examples
Model ne[2] to be updated with 1116 examples

Pruning-Off
Starting Model Update
Model e[1] to be updated with 132 examples
Model ne[2] to be updated with 1116 examples



INFO:root:Testing model: 3


Pruning-Off
Starting Model Update
Model e[1] to be updated with 132 examples
Model ne[2] to be updated with 1116 examples

/home/gfogwil/Documentos/Facultad/Tesis/programs/htk/HTKTools/HVite -C /home/gfogwil/Documentos/Facultad/Tesis/models/bdb/npfd/models/HTK/misc/config -H /home/gfogwil/Documentos/Facultad/Tesis/models/bdb/models/hmm/3/macros -H /home/gfogwil/Documentos/Facultad/Tesis/models/bdb/models/hmm/3/hmmdefs -p 0.0000000000 -s 1.0000000000 -A -T 0 -S /home/gfogwil/Documentos/Facultad/Tesis/models/bdb/data/interim/dmps_test_D_A.scp -i /home/gfogwil/Documentos/Facultad/Tesis/models/bdb/data/interim/results.mlf -w /home/gfogwil/Documentos/Facultad/Tesis/models/bdb/npfd/models/HTK/misc/wdnet /home/gfogwil/Documentos/Facultad/Tesis/models/bdb/npfd/models/HTK/misc/dict /home/gfogwil/Documentos/Facultad/Tesis/models/bdb/npfd/models/HTK/misc/monophones 

  Date: Mon Jul 26 19:07:59 2021
  Ref : >ocumentos/Facultad/Tesis/models/bdb/data/interim/dmps_test_labels.mlf
  Rec : >gfogwil/Do

In [54]:
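# HHEd-style `AT i j prob itemList` commands: add transitions between states 2
# and 4 (both directions, probability 0.2) to the transition matrix of `ne`.
# The doubled braces in the f-strings produce literal `{` and `}`.
model.edit([f'AT 2 4 0.2 {{ne.transP}}',
            f'AT 4 2 0.2 {{ne.transP}}'])

model.train(X_train, y_train, minimum_variance=mv)
results = model.test(X_val, y_val)

# HHEd-style `MU n itemList` command: raise the number of Gaussian mixture
# components in states 2-4 of every model to n = 2, 4, 8, 16, retraining and
# re-testing after each increase.
gaussian_duplication_times = 4
for i in range(1, gaussian_duplication_times+1):
    print(2**i)
    model.edit([f'MU {2**i} {{*.state[2-4].mix}}'])
    model.train(X_train, y_train, minimum_variance=mv)

    results = model.test(X_val, y_val)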

INFO:root:Editing model 3
INFO:root:Training the model...



Pruning-Off

Pruning-Off



INFO:root:Testing model: 7


Pruning-Off

/home/gfogwil/Documentos/Facultad/Tesis/programs/htk/HTKTools/HVite -C /home/gfogwil/Documentos/Facultad/Tesis/models/bdb/npfd/models/HTK/misc/config -H /home/gfogwil/Documentos/Facultad/Tesis/models/bdb/models/hmm/7/macros -H /home/gfogwil/Documentos/Facultad/Tesis/models/bdb/models/hmm/7/hmmdefs -p 0.0000000000 -s 1.0000000000 -A -T 0 -S /home/gfogwil/Documentos/Facultad/Tesis/models/bdb/data/interim/dmps_test_D_A.scp -i /home/gfogwil/Documentos/Facultad/Tesis/models/bdb/data/interim/results.mlf -w /home/gfogwil/Documentos/Facultad/Tesis/models/bdb/npfd/models/HTK/misc/wdnet /home/gfogwil/Documentos/Facultad/Tesis/models/bdb/npfd/models/HTK/misc/dict /home/gfogwil/Documentos/Facultad/Tesis/models/bdb/npfd/models/HTK/misc/monophones 



INFO:root:Editing model 7
INFO:root:Training the model...


  Date: Mon Jul 26 19:08:02 2021
  Ref : >ocumentos/Facultad/Tesis/models/bdb/data/interim/dmps_test_labels.mlf
  Rec : >gfogwil/Documentos/Facultad/Tesis/models/bdb/data/interim/results.mlf
------------------------ Overall Results --------------------------
SENT: %Correct=64.63 [H=159, S=87, N=246]
WORD: %Corr=71.56, Acc=64.69 [H=229, D=58, S=33, I=22, N=320]
------------------------ Confusion Matrix -------------------------
       e   n 
           e  Del [ %c / %e]
   e  32   0    6
  ne  33  197  52 [85.7/10.3]
Ins   22   0

2

Pruning-Off

Pruning-Off



INFO:root:Testing model: 11


Pruning-Off



INFO:root:Editing model 11
INFO:root:Training the model...


/home/gfogwil/Documentos/Facultad/Tesis/programs/htk/HTKTools/HVite -C /home/gfogwil/Documentos/Facultad/Tesis/models/bdb/npfd/models/HTK/misc/config -H /home/gfogwil/Documentos/Facultad/Tesis/models/bdb/models/hmm/11/macros -H /home/gfogwil/Documentos/Facultad/Tesis/models/bdb/models/hmm/11/hmmdefs -p 0.0000000000 -s 1.0000000000 -A -T 0 -S /home/gfogwil/Documentos/Facultad/Tesis/models/bdb/data/interim/dmps_test_D_A.scp -i /home/gfogwil/Documentos/Facultad/Tesis/models/bdb/data/interim/results.mlf -w /home/gfogwil/Documentos/Facultad/Tesis/models/bdb/npfd/models/HTK/misc/wdnet /home/gfogwil/Documentos/Facultad/Tesis/models/bdb/npfd/models/HTK/misc/dict /home/gfogwil/Documentos/Facultad/Tesis/models/bdb/npfd/models/HTK/misc/monophones 

  Date: Mon Jul 26 19:08:06 2021
  Ref : >ocumentos/Facultad/Tesis/models/bdb/data/interim/dmps_test_labels.mlf
  Rec : >gfogwil/Documentos/Facultad/Tesis/models/bdb/data/interim/results.mlf
------------------------ Overall Results --------------------

INFO:root:Testing model: 15


Pruning-Off



INFO:root:Editing model 15
INFO:root:Training the model...


/home/gfogwil/Documentos/Facultad/Tesis/programs/htk/HTKTools/HVite -C /home/gfogwil/Documentos/Facultad/Tesis/models/bdb/npfd/models/HTK/misc/config -H /home/gfogwil/Documentos/Facultad/Tesis/models/bdb/models/hmm/15/macros -H /home/gfogwil/Documentos/Facultad/Tesis/models/bdb/models/hmm/15/hmmdefs -p 0.0000000000 -s 1.0000000000 -A -T 0 -S /home/gfogwil/Documentos/Facultad/Tesis/models/bdb/data/interim/dmps_test_D_A.scp -i /home/gfogwil/Documentos/Facultad/Tesis/models/bdb/data/interim/results.mlf -w /home/gfogwil/Documentos/Facultad/Tesis/models/bdb/npfd/models/HTK/misc/wdnet /home/gfogwil/Documentos/Facultad/Tesis/models/bdb/npfd/models/HTK/misc/dict /home/gfogwil/Documentos/Facultad/Tesis/models/bdb/npfd/models/HTK/misc/monophones 

  Date: Mon Jul 26 19:08:12 2021
  Ref : >ocumentos/Facultad/Tesis/models/bdb/data/interim/dmps_test_labels.mlf
  Rec : >gfogwil/Documentos/Facultad/Tesis/models/bdb/data/interim/results.mlf
------------------------ Overall Results --------------------

INFO:root:Testing model: 19


Pruning-Off



INFO:root:Editing model 19
INFO:root:Training the model...


/home/gfogwil/Documentos/Facultad/Tesis/programs/htk/HTKTools/HVite -C /home/gfogwil/Documentos/Facultad/Tesis/models/bdb/npfd/models/HTK/misc/config -H /home/gfogwil/Documentos/Facultad/Tesis/models/bdb/models/hmm/19/macros -H /home/gfogwil/Documentos/Facultad/Tesis/models/bdb/models/hmm/19/hmmdefs -p 0.0000000000 -s 1.0000000000 -A -T 0 -S /home/gfogwil/Documentos/Facultad/Tesis/models/bdb/data/interim/dmps_test_D_A.scp -i /home/gfogwil/Documentos/Facultad/Tesis/models/bdb/data/interim/results.mlf -w /home/gfogwil/Documentos/Facultad/Tesis/models/bdb/npfd/models/HTK/misc/wdnet /home/gfogwil/Documentos/Facultad/Tesis/models/bdb/npfd/models/HTK/misc/dict /home/gfogwil/Documentos/Facultad/Tesis/models/bdb/npfd/models/HTK/misc/monophones 

  Date: Mon Jul 26 19:08:21 2021
  Ref : >ocumentos/Facultad/Tesis/models/bdb/data/interim/dmps_test_labels.mlf
  Rec : >gfogwil/Documentos/Facultad/Tesis/models/bdb/data/interim/results.mlf
------------------------ Overall Results --------------------

INFO:root:Testing model: 23


Pruning-Off

/home/gfogwil/Documentos/Facultad/Tesis/programs/htk/HTKTools/HVite -C /home/gfogwil/Documentos/Facultad/Tesis/models/bdb/npfd/models/HTK/misc/config -H /home/gfogwil/Documentos/Facultad/Tesis/models/bdb/models/hmm/23/macros -H /home/gfogwil/Documentos/Facultad/Tesis/models/bdb/models/hmm/23/hmmdefs -p 0.0000000000 -s 1.0000000000 -A -T 0 -S /home/gfogwil/Documentos/Facultad/Tesis/models/bdb/data/interim/dmps_test_D_A.scp -i /home/gfogwil/Documentos/Facultad/Tesis/models/bdb/data/interim/results.mlf -w /home/gfogwil/Documentos/Facultad/Tesis/models/bdb/npfd/models/HTK/misc/wdnet /home/gfogwil/Documentos/Facultad/Tesis/models/bdb/npfd/models/HTK/misc/dict /home/gfogwil/Documentos/Facultad/Tesis/models/bdb/npfd/models/HTK/misc/monophones 

  Date: Mon Jul 26 19:08:35 2021
  Ref : >ocumentos/Facultad/Tesis/models/bdb/data/interim/dmps_test_labels.mlf
  Rec : >gfogwil/Documentos/Facultad/Tesis/models/bdb/data/interim/results.mlf
------------------------ Overall Results -------
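The only complete "Overall Results" block in the output above (model 7) can be checked by hand. The sketch below recomputes its figures using the standard HResults definitions: `%Corr = H / N`, `Acc = (H - I) / N` with `N = H + D + S`, and, for a confusion-matrix row, `%c` as the correct share of the recognized labels in that row and `%e` as that row's substitutions over `N`. The function name `htk_scores` is my own and is not part of HTK or `npfd`.

```python
def htk_scores(hits, deletions, substitutions, insertions):
    """Return (%Corr, Acc) following the HResults definitions."""
    n = hits + deletions + substitutions         # N: labels in the reference
    percent_correct = 100.0 * hits / n           # ignores insertions
    accuracy = 100.0 * (hits - insertions) / n   # penalizes insertions
    return percent_correct, accuracy


# WORD line for model 7: H=229, D=58, S=33, I=22, N=320
corr, acc = htk_scores(hits=229, deletions=58, substitutions=33, insertions=22)
print(f"%Corr={corr:.2f}, Acc={acc:.2f}")  # prints %Corr=71.56, Acc=64.69

# SENT line: 159 of the 246 validation files were recognized without error
sent_correct = 100.0 * 159 / 246
print(f"%Correct={sent_correct:.2f}")  # prints %Correct=64.63

# Confusion-matrix row `ne`: 197 correct, 33 recognized as `e`
row_c = 100.0 * 197 / (197 + 33)
row_e = 100.0 * 33 / 320
print(f"[{row_c:.1f}/{row_e:.1f}]")  # prints [85.7/10.3]
```

The gap between `%Corr` and `Acc` comes entirely from the 22 inserted `e` labels, which `%Corr` ignores.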

In [15]:
from npfd import visualization as viz

# Plot the recognition results against the validation data created above
viz.visualize.generate_plots('results', X_val, y_val, results)