# Weekly Report: 06/07/2019 -- 06/17/2019

## Brief: What was done previously
Previous work discovered multiple mechanisms to improve the accuracy of prediction. This week we quantify the performance of those methods and test them against a standard baseline.

## Hypotheses

1. Hypothesis 1: Prediction accuracy vs increasing magnitude of sensor noise (fixed per-pixel Gaussian noise)
2. Hypothesis 2: Prediction accuracy vs increasing number of missing sensor samples (entire patch, randomized drop-out)
3. Hypothesis 3: Prediction accuracy vs increasing number of missing sensor values (per-pixel randomized drop-out)
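
The two drop-out variants in hypotheses 2 and 3 could be realized as masks over the input sequence. The sketch below is illustrative only: the helper names, the `(20, 50, 50)` shape (borrowed from the 20-step, 50x50 shapes in the training log), and the choice to represent a missing value as zero are all assumptions, not the project's loader.

```python
import numpy as np


def drop_patches(frames, drop_prob, rng):
    """Zero out whole frames (every pixel of a time step) with probability drop_prob."""
    # One keep/drop decision per time step (assumed layout: time, height, width)
    keep = rng.random(frames.shape[0]) >= drop_prob
    return frames * keep[:, None, None]


def drop_pixels(frames, drop_prob, rng):
    """Zero out individual pixels independently with probability drop_prob."""
    # One keep/drop decision per pixel
    keep = rng.random(frames.shape) >= drop_prob
    return frames * keep


rng = np.random.default_rng(0)
frames = np.ones((20, 50, 50))  # hypothetical sequence: 20 steps of 50x50 patches

patch_dropped = drop_patches(frames, 0.25, rng)  # hypothesis 2 style
pixel_dropped = drop_pixels(frames, 0.25, rng)   # hypothesis 3 style

print("frames kept:", int(patch_dropped.reshape(20, -1).max(axis=1).sum()))
print("pixel fraction kept:", pixel_dropped.mean())
```

Representing a missing value as zero is only one option; a separate mask channel or `NaN` would let the model distinguish "missing" from a true zero reading.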



## Summary of Main Results and Discussions



### Experiment 1: results and discussion
Put main result and conclusions here. Discuss importance/impact in terms of the project goals.

### Experiment 2: results and discussion
Put main result and conclusions here. Discuss importance/impact in terms of the project goals.


## Plan for next effort
What will be tested/extended from this week?
    

    


In [1]:
# import packages 
import numpy as np
import numpy.linalg as la
import matplotlib.pyplot as plt
import os

# Hypothesis 1

Deep recurrent networks are tolerant to sensor noise below a certain magnitude.

## Validation accuracy over increasing fixed sample-level sensor noise
We add decreasing magnitudes of Gaussian noise to the input and predict the clean future signal. For high levels
of noise, we expect the model to over-fit to the noise. Below some threshold, however, we expect the model to learn
to recover from small perturbations by learning the underlying distribution.
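
As a minimal, self-contained sketch of this setup (noisy input, clean target), the snippet below uses a toy sine signal in place of the turbulence data. The helper `add_fixed_noise`, the toy signal, and the 20/20 split are assumptions made for illustration; they are not the project's `Turbulence` loader.

```python
import numpy as np


def add_fixed_noise(clean, scale, seed=0):
    """Return clean + zero-mean Gaussian noise with standard deviation `scale`.

    A fixed seed makes the noise realization reproducible across runs.
    """
    rng = np.random.default_rng(seed)
    return clean + rng.normal(loc=0.0, scale=scale, size=clean.shape)


# Toy stand-in for sensor data: 40 time steps, 4 "pixels" (hypothetical)
t = np.linspace(0, 4 * np.pi, 40)
clean = np.sin(t)[:, None] * np.ones((1, 4))

# The first 20 steps are the input, the next 20 are the prediction target
inputs_clean, target = clean[:20], clean[20:]

# Only the input is corrupted; the target stays clean
noisy_inputs = add_fixed_noise(inputs_clean, scale=0.05)

print("input perturbation std:", (noisy_inputs - inputs_clean).std())
```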

In [None]:
# Gaussian noise study
import numpy as np
import tensorflow as tf

from src.predict_turbulence_recurrent import train
from src.dataLoader.turbulence import Turbulence, RANDOM_SEED, LARGE_DATASET

# Use a fixed seed for noise    
np.random.seed(RANDOM_SEED)

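# Train one model per noise level, from the largest to the smallest standard deviation
for scale in [0.75, 0.5, 0.25, 0.1, 0.05, 0.025, 0.01, 0.005, 0.0025, 0.0001, 0]:
    # Static noise field, drawn once per noise level
    noise_data = np.random.normal(size=(360, 279, 1000), scale=scale)

    loader = Turbulence(pred_length=20, dataset_idx=LARGE_DATASET, input_noise=noise_data, debug=False)

    train(
        loader=loader,
        dataset_idx=LARGE_DATASET,
        num_batches=100000,
        net_name='lstm_3_cells_20_static_noise_{}'.format(scale),
    )

    # Clear the TensorFlow graph before building the next model
    tf.reset_default_graph()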


C:\Users\brandon\source\orbitalMechanics
Instructions for updating:
Colocations handled automatically by placer.
Input shape: (20, ?, 2500)
Output shape: (20, ?, 50, 50)
Encoder input shape: [20, None, 2500]
Instructions for updating:
This class is equivalent as tf.keras.layers.LSTMCell, and will be replaced by that in Tensorflow 2.0.
Instructions for updating:
This class is equivalent as tf.keras.layers.StackedRNNCells, and will be replaced by that in Tensorflow 2.0.
Instructions for updating:
Please use `keras.layers.RNN(cell)`, which is equivalent to this API
Padded input shape: [20, 64, 2500]
[20, 64, 250]
Instructions for updating:
Use keras.layers.dense instead.
Instructions for updating:
Use tf.cast instead.
INFO:tensorflow:Summary name Loss Histogram is illegal; using Loss_Histogram instead.
INFO:tensorflow:Summary name Mean Abs Error is illegal; using Mean_Abs_Error instead.
( 0.283889 0 )
( 0.008285959 500 )
0.0033416222 1000
0.

## Explore results

In this test, given a 3-layer encoder/decoder with 250 units per layer, the model is resistant to noise of up to 5%,
with little degradation in predictive power. Even with 100% noise, the model learned to reject some amount of noise.
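
One way to make this comparison concrete is to express each validation MSE relative to the lowest-noise run. The sketch below reuses the values listed in this report's plotting cell; treating the 0.01 run as the reference point is an assumption made here for illustration, since no zero-noise result is listed.

```python
import numpy as np

# Noise standard deviations and validation MSE values from the plotting cell
noise_std = np.array([1, 0.75, 0.5, 0.25, 0.1, 0.05, 0.025, 0.01])
val_mse = np.array([8.1668e-4, 5.475e-4, 2.8722e-4, 1.0855e-4,
                    4.5386e-5, 3.1529e-5, 2.7214e-5, 2.0735e-5])

# Ratio of each MSE to the MSE of the lowest-noise run (last entry)
relative = val_mse / val_mse[-1]

for std, mse, rel in zip(noise_std, val_mse, relative):
    print(f"std={std:<6} mse={mse:.4e} relative={rel:.2f}")
```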

In [1]:
# Compare validation error of the model with increasing fixed noise
import os
import plotly.offline
import plotly.graph_objs as go

plotly.offline.init_notebook_mode(connected=True)

# Validation MSE vs magnitude of noise
noise = [1, 0.75, 0.5, 0.25, 0.1, 0.05, 0.025, 0.01]
validation_mse = [8.1668e-4, 5.475e-4, 2.8722e-4, 1.0855e-4, 4.5386e-5, 3.1529e-5, 2.7214e-5, 2.0735e-5]

# Create a trace
trace = go.Scatter(
    x=noise,
    y=validation_mse,
    name=""
)

data = [trace]
layout = go.Layout(
    title="Magnitude of Sensor Noise vs Prediction Error",
    xaxis=dict(
        type='log',
        autorange=True,
        title='Standard Deviation of Added Gaussian Noise',
    ),
    yaxis=dict(
        type='log',
        autorange=True,
        title='Mean Squared Validation Error over 20 Steps',
    )
)
fig = go.Figure(data=data, layout=layout)
# Render inline using the offline mode initialised above
plotly.offline.iplot(fig, filename='static_noise_model')







In [6]:
# Visualize learned data
import os
import numpy as np

exp_dir = './experiments/turbulence/recurrent_scaled_mse'
label_file = 'label_100000_0.0008166771149262786.npy'
pred_file = 'pred_100000_0.0008166771149262786.npy'

for directory in os.listdir(exp_dir):
    try:
        # np.load on a .npy file returns an ndarray, not a context manager,
        # so it is loaded directly rather than inside a `with` block.
        # Assumption: each file holds a pickled dict, hence allow_pickle=True and .item().
        labels = np.load(os.path.join(exp_dir, directory, label_file), allow_pickle=True).item()
        predictions = np.load(os.path.join(exp_dir, directory, pred_file), allow_pickle=True).item()
    except FileNotFoundError:
        continue

    # Print the shape of every saved entry
    for name, saved in (('label', labels), ('pred', predictions)):
        for key, value in saved.items():
            print(name, key, ':', np.shape(value))

# trace = go.Heatmap(
#     z=frame,
#     colorscale='Viridis')
# data=[trace]
# py.iplot(data, filename='basic-heatmap')
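
For reference, a minimal sketch of how a Python dict round-trips through a `.npy` file. It assumes the label and prediction files were written by calling `np.save` on a dict, which is not confirmed here; the payload, the key `'2000'`, and the file name are hypothetical.

```python
import os
import tempfile

import numpy as np

# Hypothetical payload: a dict mapping a step label to an array
payload = {'2000': np.zeros((20, 50, 50))}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'pred_example.npy')
    np.save(path, payload)  # the dict is pickled into a 0-d object array

    # For a .npy file, np.load returns a plain ndarray (not a context manager)
    loaded = np.load(path, allow_pickle=True)
    restored = loaded.item()  # unwrap the 0-d object array back into the dict

shapes = {key: value.shape for key, value in restored.items()}
print(shapes)
```

Only `.npz` archives return an object that supports the `with` statement; a `.npy` file yields the array itself.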

# Hypothesis 2

Sample level dropout

# Test 1

Description of test 1 of hypothesis 2.


In [2]:
# Test 1, hyp 2 code