### This IPython notebook defines several variations of convolutional neural networks for channel estimation. The training inputs are the preamble + preamble passed through channel; the predicted output is the channel taps that correspond to the input. We explore several ideas here:
#### (A) multi-scale convolution (learned) filters applied separately to the [preamble input] and to the [preamble thorugh channel] input
#### (B) multi-scale convolution (learned) filters applied to both (e.g., 2D convolution filters)

<pre>
model 1:
   preamble -> conv1_1 -> conv2_1 -> conv3_1 ->\
                                                concat -> conv1 -> fc2 -> channel
   received -> conv1_2 -> conv2_2 -> conv3_2 ->/
   
model 2:
   preamble -> conv1 -> conv2 -> conv3 ->\
                                          concat -> conv1 -> fc2 -> channel
   received -> conv1 -> conv2 -> conv3 ->/
   
model 3:
     [   |        |    ]
     [preamble received] -> conv1 -> conv2 -> conv3 -> conv4 -> fc2 -> channel
     [   |        |    ]
</pre>

We make several assumptions about the channel model here as well:
* Channel length is <= 20
* Channel energy (am I saying this correctly?) is 1 (also, does normalizing channel taps by l2 norm ensure this?)
* Channel is sparse (most entries near 0, except for a few spikes)
  * Potential simplifying assumption (maybe include initially?) first entry of channel is 'large'
  
  
Questions: 
1. for my preamble, I am using +/- 1; Nikhil used 1/0 .. which is correct? (It should not matter really for training/testing since it is a simple affine transform between the two, but I want to do the "correct" thing)
2. do my assumptions make sense? for a real model I mean
3. am I adding noise correctly for the SNR I am setting
4. More of a "TODO" but...I am only training and testing on preamble inputs, not additional data -- the reasoning is that for additional data, we really want something that handles sequences (e.g., and RNN) in my opinion and this is more of an exploratoration of convolutional layers here

## ALSO NOTE: I am making a lot of things very modular on purpose..I want to discuss with everyone the problem statement again (I still feel like a lot of things are unclear/ambiguous) and then we can move a lot of this modular code to a rigid "util.py" file that everyone should import from so that we can more easily guarantee correctness and consistency and speed up development time.

In [1]:
# standard imports
import numpy as np
import scipy.signal as sig
import matplotlib.pyplot as plt
import tensorflow as tf
%matplotlib inline

  from ._conv import register_converters as _register_converters


In [2]:
# utility functions...we really should standardize this in a Python file [TODO!!!!]
"""Generates random sequence [1 1 1 -1 1 -1 -1 ...] of length LENGTH."""
def gen_preamble(length=100):
    return np.random.randint(2, size=(1,length))*2 - 1

"""Generates N channels of length LENGTH, each with NUM_TAPS taps. This
   means that NUM_TAPS of the entries will be non-zero, and the rest will
   be 'close' to 0 (e.g., noise). 
   Example below.
   
   >>> np.around(gen_channel(),2)
   >>> array([[-0.08,  0.  , -0.06,  0.02,  0.  ,  0.02, -0.85,  0.05, -0.03,
        -0.07,  0.5 , -0.02, -0.  , -0.05, -0.  ,  0.03, -0.07, -0.04,
        -0.01,  0.08]])"""
def gen_channel(N=1,num_taps=2,length=20):
    ret = np.zeros((N, length))
    tap_idxs = np.random.randint(length, size=(N, num_taps))
    tap_vals = ((np.random.randint(10, size=(N, num_taps))+1)*\
                (np.random.randint(2, size=(N, num_taps))*2 - 1))\
                / 10.
    for i in range(N):
        np.put(ret[i], tap_idxs[i], tap_vals[i])
    ret += 5e-2*np.random.randn(N,length)
    return ret / np.linalg.norm(ret,axis=1,keepdims=True)

"""Simulates passing data through a noisy channel.
   If SNR == -1, then no noise. Otherwise, uses AWGN model.
   
   Returned value has shape (1, len(channel.T) + len(data.T) - 1).
   With default settings, this means it is (1, 119)."""
def apply_channel(channel, data, snr=-1):
    ret = sig.convolve(data, channel, mode='full')
    if snr > 0:
        ret += (1./np.sqrt(snr)) * np.random.randn(len(ret))
    return ret

In [3]:
# functions for networks..should also put this in util.py!
"""Run before building a new network. Rests randomization for repeatability."""
def reset():
    tf.reset_default_graph()
    np.random.seed(0)
    tf.set_random_seed(0)
    
"""Defines the loss function."""
def define_loss(placeholders, loss_type):
    output, correct_output = placeholders
    return tf.reduce_mean(tf.reduce_sum((output-correct_output)**2, axis=1))
    
"""Defines the optimizer."""
def define_optimizer(loss, trainable_weights, optimizer, lr):
    opt = tf.train.AdamOptimizer(lr)
    gradients = opt.compute_gradients(loss, trainable_weights)
    train_step = opt.apply_gradients(gradients)
    return train_step

"""Defines a trainable variable with truncated normal initialization."""
def define_variable(name, shape, stddev):
    var = tf.get_variable(name, shape, initializer=
                    tf.truncated_normal_initializer(stddev=stddev, dtype=tf.float32),
                    dtype=tf.float32)
    return var

In [21]:
# define the networks
"""Builds the network [model 1] -- a basic convolution network; use as a base
   for the next network models.
   
   preamble -> conv1_1 -> conv2_1 -> conv3_1 ->\
                                                concat -> conv1 -> fc2 -> channel
   received -> conv1_2 -> conv2_2 -> conv3_2 ->/
   
   
   Elements in PARAMS:
   
   * 'preamble_len' : length of preamble; [default = 100]
   * 'channel_len' : length of channel; [default = 20]
   * 'use_max_pool': True to use max pooling in first part of net; [default = False]
   * 'loss' : loss function to use
   * 'optimizer' : optimizer to use
   * 'lr' : base learning rate
   
   """
def build_network1(params=None):
    default_params = {'preamble_len':100, 'channel_len':20,
                  'use_max_pool':False, 'loss':"", 'optimizer':"", 'lr':4e-5}
    if params == None:
        params = {}
    for k in default_params.keys():
        if k not in params:
            params[k] = default_params[k]
            
    preamble = tf.placeholder(tf.float32, [1, params['preamble_len'], 1], name="preamble_input")
    # use same length as preamble as per discussion on April 12
    received = tf.placeholder(tf.float32, [None, params['preamble_len'], 1], name="received_preamble")
    channel_true = tf.placeholder(tf.float32, [None, params['channel_len']])
    batch_size = tf.shape(received)[0]
    
    inputs=[preamble,received,channel_true]
    outputs=[]
    weights=[]
    
    nets=[preamble,received]
    
    # Process PREAMBLE and RECEIVED separately through convolutions
    num_filters = [1, 30, 30, 10]
    for i in [1,2]:
        net = nets[i-1]
        for j in range(1, len(num_filters)):
            num_filter = num_filters[j]
            prev = num_filters[j-1]
            with tf.variable_scope("conv%d_%d" % (j+1, i)) as scope:
                # use same weight initializer for all, and always use 3x_ convolutions
                kernel = define_variable('conv_weights', [3, prev, num_filter], 5e-2)
                biases = define_variable('conv_biases', [num_filter], 5e-3)
                weights.extend([kernel, biases])
                # apply network
                net = tf.nn.conv1d(net, kernel, stride=1, padding='SAME')
                net = tf.nn.bias_add(net, biases)
                net = tf.nn.relu(net)
                if params['use_max_pool']:
                    net = tf.nn.max_pool(net, [1, 3, 1], [1, 2, 1], padding='SAME')
        nets[i-1] = net
        
    # Concatenate
    nets[0] = tf.tile(nets[0], [batch_size, 1, 1])
    output = tf.concat(nets, axis=1)
    with tf.variable_scope("conv1_concat") as scope:
        kernel = define_variable('conv_weights', [3, num_filters[-1], 10], 5e-2)
        biases = define_variable('conv_biases', [10], 5e-3)
        weights.extend([kernel, biases])
        # apply network
        net = tf.nn.conv1d(net, kernel, stride=1, padding='SAME')
        net = tf.nn.bias_add(net, biases)
        net = tf.nn.relu(net)
    with tf.variable_scope("fc2_concat") as scope:
        dim = output.get_shape()[1].value*output.get_shape()[2].value
        batch_size = tf.shape(output)[0]
        
        kernel = define_variable('conv_weights', [dim, params['channel_len']], 5e-2)
        biases = define_variable('conv_biases', [params['channel_len']], 5e-3)
        weights.extend([kernel, biases])
        # apply network
        output = tf.reshape(output, [batch_size, -1])
        output = tf.nn.sigmoid(tf.matmul(output, kernel) + biases)
    
    outputs=[output]
    
    loss = define_loss([output, channel_true], params['loss'])
    train = define_optimizer(loss, weights, params['optimizer'], params['lr'])

    return inputs, outputs, weights, loss, train

"""Builds the network [model 2] -- a basic convolution network; use as a base
   for the next network models. [difference from model 1: share initial convolutions]
   
   preamble -> conv1 -> conv2 -> conv3 ->\
                                          concat -> conv1 -> fc2 -> channel
   received -> conv1 -> conv2 -> conv3 ->/
   
   Elements in PARAMS:
   
   * 'preamble_len' : length of preamble; [default = 100]
   * 'channel_len' : length of channel; [default = 20]
   * 'use_max_pool': True to use max pooling in first part of net; [default = False]
   * 'loss' : loss function to use
   * 'optimizer' : optimizer to use
   * 'lr' : base learning rate
   
   """
def build_network2(params=None):
    default_params = {'preamble_len':100, 'channel_len':20,
                  'use_max_pool':False, 'loss':"", 'optimizer':"", 'lr':4e-5}
    if params == None:
        params = {}
    for k in default_params.keys():
        if k not in params:
            params[k] = default_params[k]
            
    preamble = tf.placeholder(tf.float32, [1, params['preamble_len'], 1], name="preamble_input")
    # use same length as preamble as per discussion on April 12
    received = tf.placeholder(tf.float32, [None, params['preamble_len'], 1], name="received_preamble")
    channel_true = tf.placeholder(tf.float32, [None, params['channel_len']])
    batch_size = tf.shape(received)[0]
    
    inputs=[preamble,received,channel_true]
    outputs=[]
    weights=[]
    
    nets=[preamble,received]
    
    # Process PREAMBLE and RECEIVED separately through convolutions
    num_filters = [1, 30, 30, 10]
    
    for j in range(1, len(num_filters)):
        num_filter = num_filters[j]
        prev = num_filters[j-1]
        with tf.variable_scope("conv%d" % j) as scope:
            # use same weight initializer for all, and always use 3x_ convolutions
            kernel = define_variable('conv_weights', [3, prev, num_filter], 5e-2)
            biases = define_variable('conv_biases', [num_filter], 5e-3)
            weights.extend([kernel, biases])
            # apply network
            for i in range(2):
                net = nets[i]
                net = tf.nn.conv1d(net, kernel, stride=1, padding='SAME')
                net = tf.nn.bias_add(net, biases)
                net = tf.nn.relu(net)
                if params['use_max_pool']:
                    net = tf.nn.max_pool(net, [1, 3, 1], [1, 2, 1], padding='SAME')
                nets[i] = net

    # Concatenate
    nets[0] = tf.tile(nets[0], [batch_size, 1, 1])
    output = tf.concat(nets, axis=1)
    with tf.variable_scope("conv1_concat") as scope:
        kernel = define_variable('conv_weights', [3, num_filters[-1], 10], 5e-2)
        biases = define_variable('conv_biases', [10], 5e-3)
        weights.extend([kernel, biases])
        # apply network
        net = tf.nn.conv1d(net, kernel, stride=1, padding='SAME')
        net = tf.nn.bias_add(net, biases)
        net = tf.nn.relu(net)
    with tf.variable_scope("fc2_concat") as scope:
        dim = output.get_shape()[1].value*output.get_shape()[2].value
        batch_size = tf.shape(output)[0]
        
        kernel = define_variable('conv_weights', [dim, params['channel_len']], 5e-2)
        biases = define_variable('conv_biases', [params['channel_len']], 5e-3)
        weights.extend([kernel, biases])
        # apply network
        output = tf.reshape(output, [batch_size, -1])
        output = tf.nn.sigmoid(tf.matmul(output, kernel) + biases)
    
    outputs=[output]
    
    loss = define_loss([output, channel_true], params['loss'])
    train = define_optimizer(loss, weights, params['optimizer'], params['lr'])

    return inputs, outputs, weights, loss, train

"""Builds the network [model 3] -- a basic convolution network; use as a base
   for the next network models. [difference from model 1: wide convolutions]
   
     [   |        |    ]
     [preamble received] -> conv1 -> conv2 -> conv3 -> conv4 -> fc2 -> channel
     [   |        |    ]
   
   Elements in PARAMS:
   
   * 'preamble_len' : length of preamble; [default = 100]
   * 'channel_len' : length of channel; [default = 20]
   * 'use_max_pool': True to use max pooling in first part of net; [default = False]
   * 'loss' : loss function to use
   * 'optimizer' : optimizer to use
   * 'lr' : base learning rate
   
   """
def build_network3(params=None):
    default_params = {'preamble_len':100, 'channel_len':20,
                  'use_max_pool':False, 'loss':"", 'optimizer':"", 'lr':4e-5}
    if params == None:
        params = {}
    for k in default_params.keys():
        if k not in params:
            params[k] = default_params[k]
            
    preamble = tf.placeholder(tf.float32, [1, params['preamble_len'], 1], name="preamble_input")
    # use same length as preamble as per discussion on April 12
    received = tf.placeholder(tf.float32, [None, params['preamble_len'], 1], name="received_preamble")
    channel_true = tf.placeholder(tf.float32, [None, params['channel_len']])
    batch_size = tf.shape(received)[0]
    
    temp = tf.tile(preamble, [batch_size, 1, 1])
    network_input = tf.concat([temp, received], axis=2)
    
    inputs=[preamble,received,channel_true]
    outputs=[]
    weights=[]
    
    output = network_input
    
    # Process PREAMBLE and RECEIVED separately through convolutions
    num_filters = [2, 30, 30, 10, 10]
    
    for j in range(1, len(num_filters)):
        num_filter = num_filters[j]
        prev = num_filters[j-1]
        with tf.variable_scope("conv%d" % j) as scope:
            # use same weight initializer for all, and always use 3x_ convolutions
            kernel = define_variable('conv_weights', [3, prev, num_filter], 5e-2)
            biases = define_variable('conv_biases', [num_filter], 5e-3)
            weights.extend([kernel, biases])
            # apply network
            output = tf.nn.conv1d(output, kernel, stride=1, padding='SAME')
            output = tf.nn.bias_add(output, biases)
            output = tf.nn.relu(output)
            if params['use_max_pool']:
                output = tf.nn.max_pool(output, [1, 3, 1], [1, 2, 1], padding='SAME')
                
    with tf.variable_scope("fc5") as scope:
        dim = output.get_shape()[1].value*output.get_shape()[2].value
        batch_size = tf.shape(output)[0]
        
        kernel = define_variable('conv_weights', [dim, params['channel_len']], 5e-2)
        biases = define_variable('conv_biases', [params['channel_len']], 5e-3)
        weights.extend([kernel, biases])
        # apply network
        output = tf.reshape(output, [batch_size, -1])
        output = tf.nn.sigmoid(tf.matmul(output, kernel) + biases)
    
    outputs=[output]
    
    loss = define_loss([output, channel_true], params['loss'])
    train = define_optimizer(loss, weights, params['optimizer'], params['lr'])

    return inputs, outputs, weights, loss, train

In [28]:
# training a network
def train(params=None):
    if params == None:
        params = {}
    if 'network_option' not in params:
        params['network_option'] = build_network1
    reset()
    inputs, outputs, weights, loss, train = params['network_option']()
    num_iter=10000
    batch_size=10
    # use a single fixed preamble
    preamble=gen_preamble()
    sess = tf.InteractiveSession()
    sess.run(tf.global_variables_initializer())
    for i in range(0,num_iter):
        # generate data
        channels = gen_channel(N=batch_size)
        received = apply_channel(channels, preamble, snr=-1)
        #channels = channels.reshape((batch_size,-1,1))
        received = received.reshape((batch_size,-1,1))[:,:100,:]

        # train
        sess.run(train, feed_dict={inputs[0]:preamble.reshape((1, 100, 1)),
                         inputs[1]:received,
                         inputs[2]:channels})
        if i % 100 == 0:
            l = sess.run(loss, feed_dict={inputs[0]:preamble.reshape((1, 100, 1)),
                         inputs[1]:received,
                         inputs[2]:channels})
            print(i,l)

In [29]:
train({'network_option':build_network1})

0 5.9497995
100 5.5608754
200 2.5000556
300 1.2042078
400 1.0180413
500 1.055712
600 1.0114251
700 1.0015025
800 1.0162882
900 1.0142034
1000 0.99003667
1100 1.0115359
1200 1.0056458
1300 1.0010555
1400 1.0045365
1500 0.99774206
1600 1.0010484
1700 1.0026467
1800 0.9963769
1900 0.9995969
2000 1.0046695
2100 0.99699056
2200 1.0051769
2300 0.9978325
2400 1.0029647
2500 0.996123
2600 0.99439776
2700 0.99363124
2800 0.99960697
2900 0.9914936
3000 1.0069927
3100 0.99567187
3200 0.9916027
3300 0.92836636
3400 0.8294134
3500 0.74126995
3600 0.58991754
3700 0.62285465
3800 0.8311445
3900 0.67047155
4000 0.6616765
4100 0.6276475
4200 0.69173604
4300 0.5623376
4400 0.6010535
4500 0.2964145
4600 0.46129045
4700 0.7588934
4800 0.5523388
4900 0.498071
5000 0.41340083
5100 0.47271952
5200 0.4388209
5300 0.40014333
5400 0.69837344
5500 0.4745117
5600 0.4347182
5700 0.6253753
5800 0.68599856
5900 0.6995894
6000 0.58479536
6100 0.50548327
6200 0.6553062
6300 0.59846944
6400 0.5149986
6500 0.55119723
66

In [30]:
train({'network_option':build_network2})

0 5.9537416
100 5.5144796
200 2.6487117
300 1.229145
400 1.0241702
500 1.0570612
600 1.0078924
700 1.0016247
800 1.0141098
900 1.0134608
1000 0.98717594
1100 1.0131205
1200 1.0031105
1300 1.0005233
1400 1.0050215
1500 0.99412537
1600 1.0012652
1700 1.0020328
1800 0.99519217
1900 0.9996247
2000 1.0042546
2100 0.99511206
2200 1.0066788
2300 0.99723226
2400 1.0033973
2500 0.99491024
2600 0.99264127
2700 0.99336326
2800 1.0014827
2900 0.99426377
3000 1.008313
3100 0.99956703
3200 1.0009232
3300 0.99502146
3400 0.996088
3500 0.99943316
3600 0.9937148
3700 0.9903437
3800 0.99765426
3900 0.9999674
4000 0.9985086
4100 1.0005165
4200 1.0045638
4300 0.98722684
4400 0.9837267
4500 0.9652734
4600 0.96658695
4700 1.0003734
4800 0.98452556
4900 0.89373606
5000 0.89616984
5100 0.83555365
5200 0.7841519
5300 0.6744598
5400 0.89712447
5500 0.70996726
5600 0.57625747
5700 0.745232
5800 0.7323413
5900 0.7462337
6000 0.6045487
6100 0.5163443
6200 0.6833571
6300 0.63307345
6400 0.5472447
6500 0.55563325
66

In [31]:
train({'network_option':build_network3})

0 5.945472
100 5.860025
200 3.8928902
300 1.3568672
400 1.0253246
500 1.053317
600 1.0060494
700 1.0030323
800 1.0134175
900 1.011519
1000 0.99103147
1100 1.0103197
1200 1.0026392
1300 1.0007324
1400 1.0048769
1500 0.9968483
1600 1.001244
1700 1.0027201
1800 0.9963989
1900 0.999159
2000 1.0030798
2100 0.9969751
2200 1.0059751
2300 0.9978571
2400 1.0031506
2500 0.996288
2600 0.9954206
2700 0.99410266
2800 0.99964416
2900 0.99498004
3000 1.0056146
3100 1.0009472
3200 1.0019062
3300 0.9947456
3400 0.9943921
3500 1.0006399
3600 0.9932451
3700 0.98828566
3800 0.99624527
3900 0.99993944
4000 0.9971197
4100 0.9956706
4200 1.0029645
4300 0.9770211
4400 0.9747633
4500 0.9028694
4600 0.9429701
4700 0.99490106
4800 0.9266677
4900 0.89065266
5000 0.8614389
5100 0.8155961
5200 0.7712646
5300 0.69665504
5400 0.91341794
5500 0.7687214
5600 0.68674135
5700 0.8282282
5800 0.7966994
5900 0.8614728
6000 0.7318518
6100 0.7059592
6200 0.80043954
6300 0.81124496
6400 0.73199934
6500 0.69481206
6600 0.624460

In [123]:
preamble=gen_preamble()
channels = gen_channel(N=30)
received = apply_channel(channels, preamble, snr=-1)
print(preamble.shape, channels.shape, received.shape)

(1, 100) (30, 20) (30, 119)


In [134]:
for i in range(3):
    print (i, inputs[i].shape)

0 (1, 100, 1)
1 (?, 100, 1)
2 (?, 20)


In [139]:
print (channels.shape)

(100, 20)


In [138]:
print(received.shape)

(100, 100, 1)
