### This IPython notebook defines several variations of convolutional neural networks for channel estimation. The training inputs are the preamble + preamble passed through channel; the predicted output is the channel taps that correspond to the input. We explore several ideas here:
#### (A) multi-scale convolution (learned) filters applied separately to the [preamble input] and to the [preamble thorugh channel] input
#### (B) multi-scale convolution (learned) filters applied to both (e.g., 2D convolution filters)

<pre>
model 1:
   preamble -> conv1_1 -> conv2_1 -> conv3_1 ->\
                                                concat -> conv1 -> fc2 -> channel
   received -> conv1_2 -> conv2_2 -> conv3_2 ->/
   
model 2:
   preamble -> conv1 -> conv2 -> conv3 ->\
                                          concat -> conv1 -> fc2 -> channel
   received -> conv1 -> conv2 -> conv3 ->/
   
model 3:
     [   |        |    ]
     [preamble received] -> conv1 -> conv2 -> conv3 -> conv4 -> fc2 -> channel
     [   |        |    ]
</pre>

We make several assumptions about the channel model here as well:
* Channel length is <= 20
* Channel energy (am I saying this correctly?) is 1 (also, does normalizing channel taps by l2 norm ensure this?)
* Channel is sparse (most entries near 0, except for a few spikes)
  * Potential simplifying assumption (maybe include initially?) first entry of channel is 'large'
  
  
Questions: 
1. for my preamble, I am using +/- 1; Nikhil used 1/0 .. which is correct? (It should not matter really for training/testing since it is a simple affine transform between the two, but I want to do the "correct" thing)
2. do my assumptions make sense? for a real model I mean
3. am I adding noise correctly for the SNR I am setting
4. More of a "TODO" but...I am only training and testing on preamble inputs, not additional data -- the reasoning is that for additional data, we really want something that handles sequences (e.g., and RNN) in my opinion and this is more of an exploratoration of convolutional layers here

## ALSO NOTE: I am making a lot of things very modular on purpose..I want to discuss with everyone the problem statement again (I still feel like a lot of things are unclear/ambiguous) and then we can move a lot of this modular code to a rigid "util.py" file that everyone should import from so that we can more easily guarantee correctness and consistency and speed up development time.

In [1]:
# standard imports
import numpy as np
import scipy.signal as sig
import matplotlib.pyplot as plt
import tensorflow as tf
%matplotlib inline

  from ._conv import register_converters as _register_converters


In [2]:
# utility functions...we really should standardize this in a Python file [TODO!!!!]
"""Generates random sequence [1 1 1 -1 1 -1 -1 ...] of length LENGTH."""
def gen_preamble(length=100):
    return np.random.randint(2, size=(1,length))*2 - 1

"""Generates N channels of length LENGTH, each with NUM_TAPS taps. This
   means that NUM_TAPS of the entries will be non-zero, and the rest will
   be 'close' to 0 (e.g., noise). 
   Example below.
   
   >>> np.around(gen_channel(),2)
   >>> array([[-0.08,  0.  , -0.06,  0.02,  0.  ,  0.02, -0.85,  0.05, -0.03,
        -0.07,  0.5 , -0.02, -0.  , -0.05, -0.  ,  0.03, -0.07, -0.04,
        -0.01,  0.08]])"""
def gen_channel(N=1,num_taps=2,length=20):
    ret = np.zeros((N, length))
    tap_idxs = np.random.randint(length, size=(N, num_taps))
    tap_vals = ((np.random.randint(10, size=(N, num_taps))+1)*\
                (np.random.randint(2, size=(N, num_taps))*2 - 1))\
                / 10.
    for i in range(N):
        np.put(ret[i], tap_idxs[i], tap_vals[i])
    ret += 5e-2*np.random.randn(N,length)
    return ret / np.linalg.norm(ret,axis=1,keepdims=True)

"""Simulates passing data through a noisy channel.
   If SNR == -1, then no noise. Otherwise, uses AWGN model.
   
   Returned value has shape (1, len(channel.T) + len(data.T) - 1).
   With default settings, this means it is (1, 119)."""
def apply_channel(channel, data, snr=-1):
    ret = sig.convolve(data, channel, mode='full')
    if snr > 0:
        ret += (1./np.sqrt(snr)) * np.random.randn(len(ret))
    return ret

In [3]:
# functions for networks..should also put this in util.py!
"""Run before building a new network. Rests randomization for repeatability."""
def reset():
    tf.reset_default_graph()
    np.random.seed(0)
    tf.set_random_seed(0)
    
"""Defines the loss function."""
def define_loss(placeholders, loss_type):
    output, correct_output = placeholders
    return tf.reduce_mean(tf.reduce_sum((output-correct_output)**2, axis=1))
    
"""Defines the optimizer."""
def define_optimizer(loss, trainable_weights, optimizer, lr):
    opt = tf.train.AdamOptimizer(lr)
    gradients = opt.compute_gradients(loss, trainable_weights)
    train_step = opt.apply_gradients(gradients)
    return train_step

"""Defines a trainable variable with truncated normal initialization."""
def define_variable(name, shape, stddev):
    var = tf.get_variable(name, shape, initializer=
                    tf.truncated_normal_initializer(stddev=stddev, dtype=tf.float32),
                    dtype=tf.float32)
    return var

In [103]:
# define the networks
"""Builds the network [model 1] -- a basic convolution network; use as a base
   for the next network models.
   
   preamble -> conv1_1 -> conv2_1 -> conv3_1 ->\
                                                concat -> conv1 -> fc2 -> channel
   received -> conv1_2 -> conv2_2 -> conv3_2 ->/
   
   
   Elements in PARAMS:
   
   * 'preamble_len' : length of preamble; [default = 100]
   * 'channel_len' : length of channel; [default = 20]
   * 'use_max_pool': True to use max pooling in first part of net; [default = False]
   * 'loss' : loss function to use
   * 'optimizer' : optimizer to use
   * 'lr' : base learning rate
   
   """
def build_network1(params=None):
    default_params = {'preamble_len':100, 'channel_len':20,
                  'use_max_pool':False, 'loss':"", 'optimizer':"", 'lr':4e-5}
    if params == None:
        params = {}
    for k in default_params.keys():
        if k not in params:
            params[k] = default_params[k]
            
    preamble = tf.placeholder(tf.float32, [1, params['preamble_len'], 1], name="preamble_input")
    # use same length as preamble as per discussion on April 12
    received = tf.placeholder(tf.float32, [None, params['preamble_len'], 1], name="received_preamble")
    channel_true = tf.placeholder(tf.float32, [None, params['channel_len']])
    batch_size = tf.shape(received)[0]
    
    inputs=[preamble,received,channel_true]
    outputs=[]
    weights=[]
    
    nets=[preamble,received]
    
    # Process PREAMBLE and RECEIVED separately through convolutions
    num_filters = [1, 30, 30, 10]
    for i in [1,2]:
        net = nets[i-1]
        for j in range(1, len(num_filters)):
            num_filter = num_filters[j]
            prev = num_filters[j-1]
            with tf.variable_scope("conv%d_%d" % (j+1, i)) as scope:
                # use same weight initializer for all, and always use 3x_ convolutions
                kernel = define_variable('conv_weights', [3, prev, num_filter], 5e-2)
                biases = define_variable('conv_biases', [num_filter], 5e-3)
                weights.extend([kernel, biases])
                # apply network
                net = tf.nn.conv1d(net, kernel, stride=1, padding='SAME')
                net = tf.nn.bias_add(net, biases)
                net = tf.nn.relu(net)
                if params['use_max_pool']:
                    net = tf.nn.max_pool(net, [1, 3, 1], [1, 2, 1], padding='SAME')
        nets[i-1] = net
        
    # Concatenate
    nets[0] = tf.tile(nets[0], [batch_size, 1, 1])
    output = tf.concat(nets, axis=1)
    with tf.variable_scope("conv1_concat") as scope:
        kernel = define_variable('conv_weights', [3, num_filters[-1], 10], 5e-2)
        biases = define_variable('conv_biases', [10], 5e-3)
        weights.extend([kernel, biases])
        # apply network
        net = tf.nn.conv1d(net, kernel, stride=1, padding='SAME')
        net = tf.nn.bias_add(net, biases)
        net = tf.nn.relu(net)
    with tf.variable_scope("fc2_concat") as scope:
        dim = output.get_shape()[1].value*output.get_shape()[2].value
        batch_size = tf.shape(output)[0]
        
        kernel = define_variable('conv_weights', [dim, params['channel_len']], 5e-2)
        biases = define_variable('conv_biases', [params['channel_len']], 5e-3)
        weights.extend([kernel, biases])
        # apply network
        output = tf.reshape(output, [batch_size, -1])
        output = tf.matmul(output, kernel) + biases
    
    outputs=[output]
    
    loss = define_loss([output, channel_true], params['loss'])
    train = define_optimizer(loss, weights, params['optimizer'], params['lr'])

    return inputs, outputs, weights, loss, train

"""Builds the network [model 2] -- a basic convolution network; use as a base
   for the next network models. [difference from model 1: share initial convolutions]
   
   preamble -> conv1 -> conv2 -> conv3 ->\
                                          concat -> conv1 -> fc2 -> channel
   received -> conv1 -> conv2 -> conv3 ->/
   
   Elements in PARAMS:
   
   * 'preamble_len' : length of preamble; [default = 100]
   * 'channel_len' : length of channel; [default = 20]
   * 'use_max_pool': True to use max pooling in first part of net; [default = False]
   * 'loss' : loss function to use
   * 'optimizer' : optimizer to use
   * 'lr' : base learning rate
   
   """
def build_network2(params=None):
    default_params = {'preamble_len':100, 'channel_len':20,
                  'use_max_pool':False, 'loss':"", 'optimizer':"", 'lr':4e-5}
    if params == None:
        params = {}
    for k in default_params.keys():
        if k not in params:
            params[k] = default_params[k]
            
    preamble = tf.placeholder(tf.float32, [1, params['preamble_len'], 1], name="preamble_input")
    # use same length as preamble as per discussion on April 12
    received = tf.placeholder(tf.float32, [None, params['preamble_len'], 1], name="received_preamble")
    channel_true = tf.placeholder(tf.float32, [None, params['channel_len']])
    batch_size = tf.shape(received)[0]
    
    inputs=[preamble,received,channel_true]
    outputs=[]
    weights=[]
    
    nets=[preamble,received]
    
    # Process PREAMBLE and RECEIVED separately through convolutions
    num_filters = [1, 30, 30, 10]
    
    for j in range(1, len(num_filters)):
        num_filter = num_filters[j]
        prev = num_filters[j-1]
        with tf.variable_scope("conv%d" % j) as scope:
            # use same weight initializer for all, and always use 3x_ convolutions
            kernel = define_variable('conv_weights', [3, prev, num_filter], 5e-2)
            biases = define_variable('conv_biases', [num_filter], 5e-3)
            weights.extend([kernel, biases])
            # apply network
            for i in range(2):
                net = nets[i]
                net = tf.nn.conv1d(net, kernel, stride=1, padding='SAME')
                net = tf.nn.bias_add(net, biases)
                net = tf.nn.relu(net)
                if params['use_max_pool']:
                    net = tf.nn.max_pool(net, [1, 3, 1], [1, 2, 1], padding='SAME')
                nets[i] = net

    # Concatenate
    nets[0] = tf.tile(nets[0], [batch_size, 1, 1])
    output = tf.concat(nets, axis=1)
    with tf.variable_scope("conv1_concat") as scope:
        kernel = define_variable('conv_weights', [3, num_filters[-1], 10], 5e-2)
        biases = define_variable('conv_biases', [10], 5e-3)
        weights.extend([kernel, biases])
        # apply network
        net = tf.nn.conv1d(net, kernel, stride=1, padding='SAME')
        net = tf.nn.bias_add(net, biases)
        net = tf.nn.relu(net)
    with tf.variable_scope("fc2_concat") as scope:
        dim = output.get_shape()[1].value*output.get_shape()[2].value
        batch_size = tf.shape(output)[0]
        
        kernel = define_variable('conv_weights', [dim, params['channel_len']], 5e-2)
        biases = define_variable('conv_biases', [params['channel_len']], 5e-3)
        weights.extend([kernel, biases])
        # apply network
        output = tf.reshape(output, [batch_size, -1])
        output = tf.matmul(output, kernel) + biases
    
    outputs=[output]
    
    loss = define_loss([output, channel_true], params['loss'])
    train = define_optimizer(loss, weights, params['optimizer'], params['lr'])

    return inputs, outputs, weights, loss, train

"""Builds the network [model 3] -- a basic convolution network; use as a base
   for the next network models. [difference from model 1: wide convolutions]
   
     [   |        |    ]
     [preamble received] -> conv1 -> conv2 -> conv3 -> conv4 -> fc2 -> channel
     [   |        |    ]
   
   Elements in PARAMS:
   
   * 'preamble_len' : length of preamble; [default = 100]
   * 'channel_len' : length of channel; [default = 20]
   * 'use_max_pool': True to use max pooling in first part of net; [default = False]
   * 'loss' : loss function to use
   * 'optimizer' : optimizer to use
   * 'lr' : base learning rate
   
   """
def build_network3(params=None):
    default_params = {'preamble_len':100, 'channel_len':20,
                  'use_max_pool':False, 'loss':"", 'optimizer':"", 'lr':4e-5}
    if params == None:
        params = {}
    for k in default_params.keys():
        if k not in params:
            params[k] = default_params[k]
            
    preamble = tf.placeholder(tf.float32, [1, params['preamble_len'], 1], name="preamble_input")
    # use same length as preamble as per discussion on April 12
    received = tf.placeholder(tf.float32, [None, params['preamble_len'], 1], name="received_preamble")
    channel_true = tf.placeholder(tf.float32, [None, params['channel_len']])
    batch_size = tf.shape(received)[0]
    
    temp = tf.tile(preamble, [batch_size, 1, 1])
    network_input = tf.concat([temp, received], axis=2)
    
    inputs=[preamble,received,channel_true]
    outputs=[]
    weights=[]
    
    output = network_input
    
    # Process PREAMBLE and RECEIVED separately through convolutions
    num_filters = [2, 30, 30, 10, 10]
    
    for j in range(1, len(num_filters)):
        num_filter = num_filters[j]
        prev = num_filters[j-1]
        with tf.variable_scope("conv%d" % j) as scope:
            # use same weight initializer for all, and always use 3x_ convolutions
            kernel = define_variable('conv_weights', [3, prev, num_filter], 5e-2)
            biases = define_variable('conv_biases', [num_filter], 5e-3)
            weights.extend([kernel, biases])
            # apply network
            output = tf.nn.conv1d(output, kernel, stride=1, padding='SAME')
            output = tf.nn.bias_add(output, biases)
            output = tf.nn.relu(output)
            if params['use_max_pool']:
                output = tf.nn.max_pool(output, [1, 3, 1], [1, 2, 1], padding='SAME')
                
    with tf.variable_scope("fc5") as scope:
        dim = output.get_shape()[1].value*output.get_shape()[2].value
        batch_size = tf.shape(output)[0]
        
        kernel = define_variable('conv_weights', [dim, params['channel_len']], 5e-2)
        biases = define_variable('conv_biases', [params['channel_len']], 5e-3)
        weights.extend([kernel, biases])
        # apply network
        output = tf.reshape(output, [batch_size, -1])
        output = tf.matmul(output, kernel) + biases
    
    outputs=[output]
    
    loss = define_loss([output, channel_true], params['loss'])
    train = define_optimizer(loss, weights, params['optimizer'], params['lr'])

    return inputs, outputs, weights, loss, train

In [119]:
# training a network
def train(params=None):
    if params == None:
        params = {}
    if 'network_option' not in params:
        params['network_option'] = build_network1
    reset()
    inputs, outputs, weights, loss, train = params['network_option']()
    num_iter=3000
    batch_size=10
    # use a single fixed preamble
    preamble=gen_preamble(length=100)
    sess = tf.InteractiveSession()
    sess.run(tf.global_variables_initializer())
    for i in range(0,num_iter):
        # generate data
        channels = gen_channel(N=batch_size)
        received = apply_channel(channels, preamble, snr=-1)
        #channels = channels.reshape((batch_size,-1,1))
        received = received.reshape((batch_size,-1,1))[:,:100,:]

        # train
        sess.run(train, feed_dict={inputs[0]:preamble.reshape((1, 100, 1)),
                         inputs[1]:received,
                         inputs[2]:channels})
        if i % 100 == 0:
            l = sess.run(loss, feed_dict={inputs[0]:preamble.reshape((1, 100, 1)),
                         inputs[1]:received,
                         inputs[2]:channels})
            print(i,l)
    return inputs, outputs, weights, loss, train, preamble, sess

In [105]:
inputs, outputs, weights, loss, train, preamble, sess = train({'network_option':build_network1})

0 1.0034059
100 0.99406207
200 0.9641431
300 0.9146236
400 0.7659405
500 0.5151045
600 0.25332564
700 0.105047725
800 0.05494233
900 0.05332415
1000 0.03433824
1100 0.022247884
1200 0.016355796
1300 0.015447098
1400 0.013221271
1500 0.012774525
1600 0.008903926
1700 0.009098433
1800 0.010346154
1900 0.0095137935
2000 0.0077663898
2100 0.007662937
2200 0.0054184003
2300 0.0065629496
2400 0.0056787883
2500 0.0059009115
2600 0.0035318478
2700 0.004153458
2800 0.0039181514
2900 0.0034711831


In [110]:
channel = gen_channel(N=1)
received = apply_channel(channel, preamble, snr=-1)
received = received.reshape((1,-1,1))[:,:100,:]

In [111]:
np.around(channel,5)

array([[ 0.10041,  0.01058, -0.01333, -0.10365, -0.2791 , -0.02789,
        -0.92421,  0.06079,  0.00245, -0.1089 , -0.06566, -0.01948,
        -0.03377, -0.07654,  0.03636, -0.00687, -0.04569, -0.10998,
         0.00376, -0.05686]])

In [116]:
sess.run(outputs[0], feed_dict={inputs[0]:preamble.reshape((1, 100, 1)),
                         inputs[1]:received})

array([[ 0.08661211,  0.01858502, -0.0070053 , -0.08288264, -0.26393795,
        -0.02926837, -0.9157137 ,  0.06948219,  0.00326868, -0.09695397,
        -0.06386734, -0.02889397, -0.02585243, -0.07243113,  0.03938943,
        -0.01084725, -0.0369459 , -0.08873605,  0.00550403, -0.04088632]],
      dtype=float32)

In [117]:
inputs, outputs, weights, loss, train, preamble, sess = train({'network_option':build_network2})

0 1.0120399
100 0.9935743
200 0.96209985
300 0.91719896
400 0.7838737
500 0.57799083
600 0.39312035
700 0.20717505
800 0.12511323
900 0.07955359
1000 0.046352156
1100 0.037004977
1200 0.026565403
1300 0.02128685
1400 0.013435662
1500 0.015796253
1600 0.009331873
1700 0.007815642
1800 0.008455646
1900 0.007679333
2000 0.007691738
2100 0.0071474984
2200 0.006544511
2300 0.0076162503
2400 0.0057881684
2500 0.0063463757
2600 0.0042829304
2700 0.0045123696
2800 0.0032021254
2900 0.003537111


In [120]:
inputs, outputs, weights, loss, train, preamble, sess = train({'network_option':build_network3})

0 0.996899
100 1.0017035
200 0.98909837
300 0.9746598
400 0.9332038
500 0.7685693
600 0.474307
700 0.24476442
800 0.14162117
900 0.1199242
1000 0.052551784
1100 0.03371287
1200 0.025724152
1300 0.01534312
1400 0.009300559
1500 0.009421677
1600 0.0051653897
1700 0.005025029
1800 0.004409372
1900 0.004913197
2000 0.004300531
2100 0.0043088626
2200 0.003134552
2300 0.002657528
2400 0.0031706714
2500 0.0035061655
2600 0.002363956
2700 0.0022497396
2800 0.001941032
2900 0.0016652367


In [123]:
preamble=gen_preamble()
channels = gen_channel(N=30)
received = apply_channel(channels, preamble, snr=-1)
print(preamble.shape, channels.shape, received.shape)

(1, 100) (30, 20) (30, 119)
