# Miscellaneous

Assorted commonly used functions.

## Neural networks

### Geometric pyramid rule

The geometric pyramid rule helps determine the size of the hidden layers from the sizes of the input and output layers and the number of hidden layers.

This approach was proposed by Masters (1993):
> "For a three layer network with n input and m output neurons, the hidden layer would have sqrt(n * m) neurons."
>
> -- <cite>Masters, Timothy. Practical neural network recipes in C++. Morgan Kaufmann, 1993.</cite>

[Link to the article](https://eulertech.wordpress.com/2018/01/02/how-to-choose-the-number-of-hidden-layers-and-nodes-in-a-feedforward-neural-network/)
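
The quoted formula covers a single hidden layer. One way to extend it to $k$ hidden layers, offered here as an assumption rather than something the quote states, is to keep a constant ratio $r$ between the sizes of consecutive layers, with $n$ inputs, $m$ outputs, and $h_i$ neurons in hidden layer $i$:

$$
r = \left(\frac{n}{m}\right)^{\frac{1}{k+1}}, \qquad h_i = m \, r^{\,k+1-i}, \quad i = 1, \dots, k
$$

For $k = 1$ this reduces to $h_1 = m \sqrt{n/m} = \sqrt{n \, m}$, which agrees with the quote. With $n = 3072$ and $m = 20$ that is $\sqrt{61440} \approx 248$.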

In [8]:
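def pyramid_rule(h_layers, input_size, output_size):
    """Print hidden-layer sizes that shrink geometrically from input to output."""
    if h_layers < 1:
        print("No layers")
        return
    print("Layers for input %d and output %d:" % (input_size, output_size))
    # Constant ratio between the sizes of consecutive layers.
    rate = (input_size / output_size) ** (1 / (h_layers + 1))
    for l in range(h_layers):
        # Layer l sits (h_layers - l) geometric steps above the output layer.
        layer_size = round(output_size * rate ** (h_layers - l))
        print("Layer %d: %d neurons" % (l + 1, layer_size))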

In [9]:
if __name__ == "__main__":
    pyramid_rule(1, 3072, 20)
    pyramid_rule(2, 3072, 20)
    pyramid_rule(3, 3072, 20)
    pyramid_rule(4, 3072, 20)

Layers for input 3072 and output 20:
Layer 1: 248 neurons
Layers for input 3072 and output 20:
Layer 1: 574 neurons
Layer 2: 107 neurons
Layers for input 3072 and output 20:
Layer 1: 873 neurons
Layer 2: 248 neurons
Layer 3: 70 neurons
Layers for input 3072 and output 20:
Layer 1: 1122 neurons
Layer 2: 410 neurons
Layer 3: 150 neurons
Layer 4: 55 neurons


### EMD (Earth Mover's Distance)

EMD is a distance measure between probability distributions. It treats both distributions as piles of earth and defines the distance as the minimum amount of work needed to turn one pile into the other. Mathematically, EMD is known as the Wasserstein metric.
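
The "piles of earth" picture can be made concrete in one dimension. The sketch below is a minimal illustration, not part of the original notebook: it assumes both distributions are histograms over the same bins, spaced one unit apart, and `emd_1d` is a hypothetical helper name. Under those assumptions the minimum work equals the sum of the absolute differences between the two cumulative distributions.

```python
import numpy as np

def emd_1d(p, q):
    """Earth Mover's Distance between two histograms on the same unit-spaced bins."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    if p.shape != q.shape:
        raise ValueError("p and q must have the same shape")
    # Normalize so both piles hold the same total amount of earth.
    p = p / p.sum()
    q = q / q.sum()
    # The running surplus at each bin is the earth that has to cross into the next bin.
    return float(np.abs(np.cumsum(p - q)).sum())

p = [0.5, 0.5, 0.0]
q = [0.0, 0.5, 0.5]
print(emd_1d(p, q))  # 1.0
```

In the example, half of the earth has to travel two bins (or, equivalently, all of it has to travel one bin), so the work is 1.0.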

According to [this article](https://machinelearningmastery.com/how-to-implement-wasserstein-loss-for-generative-adversarial-networks/), this metric can be applied as a loss in the following way:

In [10]:
import tensorflow as tf

def wasserstein_loss(y_true, y_pred):
    return tf.reduce_mean(y_true * y_pred)
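
To see what this loss computes, here is a small NumPy sketch. It assumes the labels in `y_true` are encoded as -1 and +1 (which class gets which sign is a convention of the training setup, not something the function fixes). The name `wasserstein_loss_np` and the critic scores are hypothetical and used only for illustration.

```python
import numpy as np

def wasserstein_loss_np(y_true, y_pred):
    """NumPy stand-in for the loss above: mean of label * critic score."""
    return float(np.mean(np.asarray(y_true) * np.asarray(y_pred)))

# Hypothetical critic scores for two batches, one per class.
scores_a = np.array([2.0, 1.0])
scores_b = np.array([-1.0, 0.5])

# Assumed label encoding: -1 for the first class, +1 for the second.
loss_a = wasserstein_loss_np(-np.ones(2), scores_a)  # -(2.0 + 1.0) / 2 = -1.5
loss_b = wasserstein_loss_np(np.ones(2), scores_b)   # (-1.0 + 0.5) / 2 = -0.25
print(loss_a, loss_b)
```

Because the label only flips the sign, minimizing this loss pushes the average score up for the class labeled -1 and down for the class labeled +1.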

## Dataset

In [11]:
import tensorflow as tf

In [12]:
dataset = tf.data.Dataset.range(8)
list(dataset.as_numpy_iterator())


[0, 1, 2, 3, 4, 5, 6, 7]

In [13]:
dataset = dataset.shuffle(8)
list(dataset.as_numpy_iterator())

[7, 5, 4, 3, 1, 2, 0, 6]

In [14]:
dataset = dataset.batch(3)
list(dataset.as_numpy_iterator())

[array([1, 4, 3], dtype=int64),
 array([2, 7, 0], dtype=int64),
 array([5, 6], dtype=int64)]

In [15]:
list(dataset.as_numpy_iterator())

[array([5, 6, 2], dtype=int64),
 array([0, 4, 1], dtype=int64),
 array([7, 3], dtype=int64)]

In [16]:
for step, batch in enumerate(dataset):
    print(step, batch)

0 tf.Tensor([2 1 0], shape=(3,), dtype=int64)
1 tf.Tensor([3 4 6], shape=(3,), dtype=int64)
2 tf.Tensor([5 7], shape=(2,), dtype=int64)


In [17]:
for step, batch in enumerate(dataset):
    print(step, batch)

0 tf.Tensor([3 4 7], shape=(3,), dtype=int64)
1 tf.Tensor([0 2 5], shape=(3,), dtype=int64)
2 tf.Tensor([6 1], shape=(2,), dtype=int64)


In [18]:
# https://www.cs.toronto.edu/~kriz/cifar.html
import pickle

def unpickle(file):
    # Load one CIFAR batch file; 'latin' decoding yields str keys such as "data".
    with open(file, 'rb') as fo:
        data = pickle.load(fo, encoding='latin')
    return data

In [19]:
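# Forward slashes avoid accidental backslash escapes in the path.
# Rows are flat 3072-value images: reshape to (N, C, H, W), then move channels last.
batch1 = unpickle("Input/cifar-10-batches-py/data_batch_1")["data"].reshape(10000, 3, 32, 32).transpose(0, 2, 3, 1)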

In [20]:
batch1 = tf.data.Dataset.from_tensor_slices(batch1)
list(batch1.as_numpy_iterator())

[array([[[ 59,  62,  63],
         [ 43,  46,  45],
         [ 50,  48,  43],
         ...,
         [158, 132, 108],
         [152, 125, 102],
         [148, 124, 103]],
 
        [[ 16,  20,  20],
         [  0,   0,   0],
         [ 18,   8,   0],
         ...,
         [123,  88,  55],
         [119,  83,  50],
         [122,  87,  57]],
 
        [[ 25,  24,  21],
         [ 16,   7,   0],
         [ 49,  27,   8],
         ...,
         [118,  84,  50],
         [120,  84,  50],
         [109,  73,  42]],
 
        ...,
 
        [[208, 170,  96],
         [201, 153,  34],
         [198, 161,  26],
         ...,
         [160, 133,  70],
         [ 56,  31,   7],
         [ 53,  34,  20]],
 
        [[180, 139,  96],
         [173, 123,  42],
         [186, 144,  30],
         ...,
         [184, 148,  94],
         [ 97,  62,  34],
         [ 83,  53,  34]],
 
        [[177, 144, 116],
         [168, 129,  94],
         [179, 142,  87],
         ...,
         [216, 184, 140],
  

In [21]:
features = {"Data": [], "Label": []}
features

{'Data': [], 'Label': []}

In [43]:
features["Data"] = [1, 2, 3, 4]
features["Label"] = ["one", "two", "three", "four"]
features

{'Data': [1, 2, 3, 4], 'Label': ['one', 'two', 'three', 'four']}

In [44]:
# Each dictionary key becomes a named component of every dataset element.
dataset = tf.data.Dataset.from_tensor_slices(features)

In [45]:
list(dataset.as_numpy_iterator())

[{'Data': 1, 'Label': b'one'},
 {'Data': 2, 'Label': b'two'},
 {'Data': 3, 'Label': b'three'},
 {'Data': 4, 'Label': b'four'}]

In [46]:
dataset = dataset.shuffle(8)
dataset = dataset.batch(2)

In [53]:
list(dataset.as_numpy_iterator())

[{'Data': array([3, 2]), 'Label': array([b'three', b'two'], dtype=object)},
 {'Data': array([1, 4]), 'Label': array([b'one', b'four'], dtype=object)}]

In [59]:
for step, data in enumerate(dataset):
    print(data["Data"])
    print(data["Label"])

tf.Tensor([2 3], shape=(2,), dtype=int32)
tf.Tensor([b'two' b'three'], shape=(2,), dtype=string)
tf.Tensor([1 4], shape=(2,), dtype=int32)
tf.Tensor([b'one' b'four'], shape=(2,), dtype=string)


In [9]:
def func(param1, param2=2, param3=3):
    print(param1, param2, param3)

def runfunc(ff, args, kwargs):
    # Unpack the tuple as positional arguments and the dict as keyword arguments.
    ff(*args, **kwargs)

runfunc(func, ("param1", "param2"), {"param3": "param3"})
runfunc(func, (), {"param1": "p1", "param3": "param3"})
runfunc(func, (), {"param1": "p1", "param2": "param2", "param3": "param3"})
runfunc(func, ("param1",), {"param3": "param3"})


param1 param2 param3
p1 2 param3
p1 param2 param3
param1 2 param3


In [20]:
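import numpy as np

# Toy dataset: four rows of features with one label per row.
data = {"data": np.array([[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]), "labels": np.array([4, 3, 2, 1])}
# Draw 3 row indices at random (with replacement) and apply them to every field.
ids = np.random.randint(0, data["data"].shape[0], 3)
subset = {k: v[ids] for k, v in data.items()}
print(ids)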

[3 3 1]
