# Machine Learning

## Mario Graff (mgraffg@ieee.org, mario.graff@infotec.mx)  
## [https://github.com/ingeotec](https://github.com/ingeotec)
## [https://github.com/mgraffg](https://github.com/mgraffg)
## CONACYT - INFOTEC  

# Topics

1. Introduction
2. Supervised learning
3. Parametric methods
4. Non-parametric methods
5. Kernel machines
6. Unconventional learning methods
7. Design and analysis of learning experiments
8. Applications

# Image Classification

## Read the training files

In [None]:
from glob import glob
train = glob('data_tarea/train/*/features/*.npy')

## View an image

In [None]:
%pylab inline
from skimage import io
img = io.imread('data_tarea/train/forest/image_0063.jpg')
_ = io.imshow(img)

## Read the descriptors of each image

In [None]:
import numpy as np
from sklearn import cluster
# One array of local descriptors per training image
D = [np.load(x) for x in train]

## Use K-Means

In [None]:
X = np.concatenate(D)

In [None]:
kmeans = cluster.MiniBatchKMeans(n_clusters=1000, random_state=0, init_size=3000).fit(X)

In [None]:
import numpy as np
r = kmeans.predict(D[10])
a = np.histogram(r, bins=np.arange(0, 1001))[0]
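
The cell above turns the descriptors of one image into a histogram of cluster assignments, with one bin per cluster. A minimal NumPy-only sketch of the same idea on synthetic data follows. The centroids, the descriptor values, and the helper name `bovw_histogram` are hypothetical, and a plain nearest-centroid assignment stands in for `kmeans.predict`:

```python
import numpy as np


def bovw_histogram(descriptors, centroids):
    """Count how many descriptors fall closest to each centroid."""
    # Squared Euclidean distance between every descriptor and every centroid.
    dist = ((descriptors[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    # Index of the nearest centroid for each descriptor.
    words = dist.argmin(axis=1)
    # One bin per centroid, equivalent to np.histogram with integer bin edges.
    return np.bincount(words, minlength=len(centroids))


# Toy data: 3 centroids in 2-D and 5 descriptors (made-up values).
centroids = np.array([[0.0, 0.0], [10.0, 10.0], [-10.0, 5.0]])
descriptors = np.array([[0.1, -0.2], [9.5, 10.2], [0.3, 0.4],
                        [-9.0, 5.5], [10.1, 9.9]])
hist = bovw_histogram(descriptors, centroids)
```

Every image ends up with a vector of the same length (the number of clusters), regardless of how many descriptors it has, which is what allows a standard classifier to be trained on it.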

In [None]:
train[0], train[-1]

In [None]:
a = np.histogram(kmeans.predict(D[0]) , bins=np.arange(0, 1001))[0]
b = np.histogram(kmeans.predict(D[-1]) , bins=np.arange(0, 1001))[0]

In [None]:
plot(a, b, '.')

## Create a classifier

In [None]:
Xp = [np.histogram(kmeans.predict(x) , bins=np.arange(0, 1001))[0] for x in D]

In [None]:
Xp = np.array(Xp)

In [None]:
from sklearn.preprocessing import LabelEncoder
y = [(x.split('/train/')[1]).split('/')[0] for x in train]
l = LabelEncoder().fit(y)
y = l.transform(y)
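
The labels come from the directory name that follows `train/` in each path. A small sketch of that extraction with `pathlib` is shown below, together with a hand-rolled integer encoding that mimics what `LabelEncoder` does (sorted unique classes mapped to consecutive integers). The example paths and the helper name `label_from_path` are hypothetical:

```python
from pathlib import PurePosixPath


def label_from_path(path):
    """Return the directory name that follows 'train' in a path."""
    parts = PurePosixPath(path).parts
    return parts[parts.index('train') + 1]


# Hypothetical paths following the data_tarea/train/<class>/features/ layout.
paths = ['data_tarea/train/forest/features/a.npy',
         'data_tarea/train/coast/features/b.npy',
         'data_tarea/train/forest/features/c.npy']
labels = [label_from_path(p) for p in paths]
# Sorted unique classes mapped to integers, as LabelEncoder would do.
classes = sorted(set(labels))
encoded = [classes.index(lab) for lab in labels]
```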

In [None]:
from sklearn.svm import LinearSVC
from sklearn.metrics import recall_score
from sklearn.model_selection import StratifiedKFold
st = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
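score = []
for tr, vs in st.split(Xp, y):
    # Train on the training folds and predict the held-out fold
    hy = LinearSVC().fit(Xp[tr], y[tr]).predict(Xp[vs])
    # average=None keeps one recall value per class
    score.append(recall_score(y[vs], hy, average=None))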
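
With `average=None`, `recall_score` reports one value per class: the fraction of the samples of that class that were predicted correctly. A NumPy sketch of that computation on made-up labels is given below; the helper name `per_class_recall` is hypothetical, and it assumes every class of interest appears in `y_true`:

```python
import numpy as np


def per_class_recall(y_true, y_pred):
    """Recall of each class: correct predictions / samples of that class."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    classes = np.unique(y_true)
    return np.array([np.mean(y_pred[y_true == k] == k) for k in classes])


# Made-up labels for three classes.
y_true = [0, 0, 0, 1, 1, 2]
y_pred = [0, 0, 1, 1, 1, 0]
recall = per_class_recall(y_true, y_pred)
```

Looking at recall per class instead of a single accuracy figure makes it visible when a classifier ignores a small class, as happens with class 2 in this toy example.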

In [None]:
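# Class names and number of training images per class
# (kept in a separate name so the encoded y is not overwritten)
labels = [(x.split('/train/')[1]).split('/')[0] for x in train]
print(np.unique(labels, return_counts=True))
# Mean recall per class over the 10 folds
np.mean(score, axis=0)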

# Time Series
## Laboratorio Nacional de Internet del Futuro (LaNIF)
## Dr. Hugo Estrada


In [None]:
import json
with open('aire/indice_2017.JSON') as fpt:
    data = json.load(fpt)

In [None]:
pollution = data['pollutionMeasurements']

In [None]:
data = pollution['date']
keys = list(data.keys())
# Variable names taken from the first record, in alphabetical order
vars = sorted(data[keys[0]].keys())

In [None]:
import numpy as np
def convert(a):
    try:
        return float(a)
    except ValueError:
        return np.nan
D = np.array([[convert(data[k][v]) for v in vars] for k in keys])
# m = np.all(np.isfinite(D), axis=1)
# D = D[m]

# How do I make a prediction?

* $y_{t+1} = ay_t + \sum_i b_i x^i_t$
* $y_{t+1} = f(y_t, x^1_t, x^2_t, \cdots)$
* $c^1_{t+1} = f(c^1_t, c^2_t, \cdots, c^{25}_t)$
* $c^1_{t+1} = f(c^1_t, c^2_t, \cdots, c^{25}_t, c^1_{t-1}, c^2_{t-1}, \cdots, c^{25}_{t-1})$
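
The last two formulas differ only in how many past time steps are fed to the predictor. One way such lagged inputs could be assembled with NumPy is sketched below. The helper name `make_lagged` and the toy series are hypothetical, and the sketch assumes the rows are already ordered in time:

```python
import numpy as np


def make_lagged(series, lags=1):
    """Build (X, Y) so that each row of X holds the `lags` most recent
    observations and the matching row of Y holds the next time step."""
    series = np.asarray(series, dtype=float)
    n = series.shape[0]
    # Most recent lag first: c_t, then c_{t-1}, and so on.
    blocks = [series[lags - k - 1:n - k - 1] for k in range(lags)]
    X = np.concatenate(blocks, axis=1)
    Y = series[lags:]
    return X, Y


# Toy series with 5 time steps and 2 variables (made-up values).
series = np.arange(10).reshape(5, 2)
X1, Y1 = make_lagged(series, lags=1)   # X1 is series[:-1], Y1 is series[1:]
X2, Y2 = make_lagged(series, lags=2)   # each row of X2: [c_t, c_{t-1}]
```

With `lags=1` this reproduces the one-step setup of the third formula; each extra lag adds another block of columns and drops one row from the start of the series.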

In [None]:
X = D[:-1]
Y = D[1:]
# Keep only the rows where both the inputs and the next-step targets are finite
mask = np.all(np.isfinite(X), axis=1) & np.all(np.isfinite(Y), axis=1)
X = X[mask]
Y = Y[mask]

In [None]:
c = 3
# rcond=None selects NumPy's current default cutoff and avoids the FutureWarning
coef = np.linalg.lstsq(X, Y[:, c], rcond=None)[0]
print(vars[c])
_ = plot(Y[:, c], np.dot(X, coef), '.')
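
`np.linalg.lstsq` returns the coefficient vector that minimizes the squared error of `X @ coef` against the target, which corresponds to the linear form in the first formula (without an intercept term). A short sketch on synthetic, noise-free data, where the coefficients are known in advance; all values here are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic inputs: 50 samples, 3 variables.
X = rng.normal(size=(50, 3))
true_coef = np.array([2.0, -1.0, 0.5])
# Noise-free target, so least squares should recover true_coef.
y = X @ true_coef

coef, residuals, rank, sv = np.linalg.lstsq(X, y, rcond=None)
prediction = X @ coef
```

On real measurements the fit is not exact, and plotting the target against the prediction, as in the cell above, shows how far the points fall from the diagonal.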

In [None]:
from EvoDAG.utils import RSE
RSE(Y[:, c], np.dot(X, coef))
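
Assuming `RSE` stands for relative squared error (an assumption about `EvoDAG.utils` that is not verified here), the measure divides the sum of squared errors by the sum of squared deviations of the target from its mean. A NumPy sketch under that assumption; the helper name `relative_squared_error` is hypothetical:

```python
import numpy as np


def relative_squared_error(y, hy):
    """Squared error of hy relative to always predicting the mean of y."""
    y = np.asarray(y, dtype=float)
    hy = np.asarray(hy, dtype=float)
    return ((y - hy) ** 2).sum() / ((y - y.mean()) ** 2).sum()


y = np.array([1.0, 2.0, 3.0, 4.0])
# A perfect prediction has zero error.
perfect = relative_squared_error(y, y)
# Predicting the mean everywhere is the reference point of the measure.
baseline = relative_squared_error(y, np.full(4, y.mean()))
```

Under this definition, a value below 1 means the model does better than predicting the mean of the target, and a value above 1 means it does worse.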