**Architecture**

<img src="http://drive.google.com/uc?export=view&id=12JomC2IswVbNGdE0IIvPpUk8vPjP-MBQ"  alt="architecture">


---



Figure 1: Architecture of the one-stream models. First seven layers are followed by parametric
softplus units [25], and the last layer is followed by sigmoid units. The architecture is similar to that
of AlexNet [26] except for the following modifications: (i) The number of convolutional kernels
are halved. (ii) The (convolutional and pooling) kernels and strides are flattened. That is, an $n \times n$
kernel is changed to an $n^2 \times 1$ kernel and an $m \times m$ stride is changed to an $m^2 \times 1$ stride. (iii)
Local response normalization is replaced with batch normalization [27]. (iv) Rectified linear units are
replaced with parametric softplus units with initial $\alpha = 0.2$ and initial $\beta = 0.5$. (v) Softmax units are
replaced with sigmoid units.
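
Modifications (i) and (ii) can be checked with a few lines of plain Python. This is only a sketch: the AlexNet kernel sizes, strides and kernel counts in `ALEXNET_LAYERS` are assumed from the original AlexNet paper, and the names `ALEXNET_LAYERS`, `to_one_stream` and `ONE_STREAM_LAYERS` are made up for illustration.

```python
# AlexNet's 2-D layer sizes (assumed from the original AlexNet paper):
# (layer name, kernel side n, stride side m, number of kernels or None for pooling)
ALEXNET_LAYERS = [
    ("conv1", 11, 4, 96),
    ("pool1", 3, 2, None),
    ("conv2", 5, 1, 256),
    ("pool2", 3, 2, None),
    ("conv3", 3, 1, 384),
    ("conv4", 3, 1, 384),
    ("conv5", 3, 1, 256),
    ("pool5", 3, 2, None),
]


def to_one_stream(layers):
    """Apply modifications (i) and (ii): halve kernel counts, flatten sizes."""
    flattened = []
    for name, n, m, kernels in layers:
        kernel_size = n * n  # n x n kernel -> n^2 x 1 kernel
        stride = m * m       # m x m stride -> m^2 x 1 stride
        filters = None if kernels is None else kernels // 2
        flattened.append((name, kernel_size, stride, filters))
    return flattened


ONE_STREAM_LAYERS = to_one_stream(ALEXNET_LAYERS)
for layer in ONE_STREAM_LAYERS:
    print(layer)
```

The resulting values (for example kernel size 121, stride 16 and 48 filters for conv1) are the ones passed to `Conv1D` and `MaxPooling1D` in the model definition in this notebook.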

We used Adam [28] with parameters $\alpha = 0.0002$, $\beta_1 = 0.5$, $\beta_2 = 0.999$, $\epsilon = 10^{-8}$ and a mini-batch
size of 36 to train the models by minimizing the binary cross-entropy loss function. Initial model
parameters were drawn from a uniform distribution as described in [29]. Songs in each training
mini-batch were randomly cropped to six seconds (96000 samples). The epoch in which the validation
performance was the highest was taken as the final model (53, 12 and 12 for T, F and TF models,
respectively). The DNN models were implemented in Keras [30].
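
The random cropping step could look roughly like the NumPy sketch below. It is not the authors' code: the helper names `random_crop` and `make_batch` are hypothetical, and the 16 kHz sample rate is inferred from 96000 samples per six seconds.

```python
import numpy as np

CROP_SECONDS = 6
SAMPLE_RATE = 16000                     # inferred: 96000 samples / 6 seconds
CROP_LEN = CROP_SECONDS * SAMPLE_RATE   # 96000 samples


def random_crop(song, crop_len=CROP_LEN, rng=None):
    """Return a random contiguous window of crop_len samples from a 1-D song."""
    if rng is None:
        rng = np.random.default_rng()
    if len(song) < crop_len:
        raise ValueError("song is shorter than the crop length")
    # The upper bound is exclusive, so the last valid start is len(song) - crop_len.
    start = rng.integers(0, len(song) - crop_len + 1)
    return song[start:start + crop_len]


def make_batch(songs, rng=None):
    """Crop every song and stack the crops into shape (batch, CROP_LEN, 1)."""
    crops = [random_crop(song, rng=rng) for song in songs]
    return np.stack(crops)[..., np.newaxis]


# Dummy songs of different lengths, standing in for real audio.
rng = np.random.default_rng(1000)
songs = [np.zeros(CROP_LEN + extra, dtype=np.float32) for extra in (0, 10, 5000, 123456)]
batch = make_batch(songs, rng=rng)
print(batch.shape)
```

A new crop is drawn each time a batch is built, so the model sees a different six-second window of the same song from one epoch to the next.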



---



(1) Import dependencies

In [0]:
import keras
from keras.models import Sequential
from keras.layers import Dense, Activation, Dropout, Flatten,\
                         Conv1D, MaxPooling1D
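# BatchNormalization is also exported from keras.layers; the
# keras.layers.normalization path is not importable in newer Keras releases.
from keras.layers import BatchNormalization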
import numpy as np
np.random.seed(1000)

Using TensorFlow backend.


(2) Get Data

In [0]:
# x, y = oxflower17.load_data(one_hot=True)
# from sklearn.model_selection import train_test_split
# x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3,
#                                                     random_state=42)
# print(x_train.shape, x_test.shape, y_train.shape, y_test.shape)


# code test setting
input_shape = (96000, 1)

sample_x = np.zeros([100]+list(input_shape))
sample_y = np.zeros([100]+[50])

print(sample_x.shape, sample_y.shape)

(100, 96000, 1) (100, 50)


(3) Create a sequential model

In [0]:
### Define Parametric Softplus

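# alpha * log(1 + exp(beta * x))
# Note: alpha and beta are fixed constants here, not trainable weights.
# The defaults follow the initial values in the Figure 1 caption.
def ParametricSoftplus(alpha=0.2, beta=0.5):
    return lambda x: alpha * keras.activations.softplus(beta * x)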
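
As a quick numeric check of the formula `alpha * log(1 + exp(beta * x))`, the following standard-library sketch evaluates it for scalars. The function name `parametric_softplus` is made up for this check and is independent of the Keras activation defined above.

```python
import math


def parametric_softplus(x, alpha=0.2, beta=0.5):
    """alpha * log(1 + exp(beta * x)), written to avoid overflow for large x."""
    z = beta * x
    # Stable identity: log(1 + exp(z)) = max(z, 0) + log1p(exp(-|z|))
    return alpha * (max(z, 0.0) + math.log1p(math.exp(-abs(z))))


for x in (-10.0, 0.0, 10.0, 1000.0):
    print(x, parametric_softplus(x))
```

At `x = 0` the value is `alpha * log(2)`. For large positive `x` the output approaches `alpha * beta * x`, so with the initial values the unit behaves like a line of slope 0.1. For large negative `x` it decays smoothly towards zero instead of cutting off like a rectified linear unit.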

In [0]:
model=Sequential()

# 1st Convolutional Layer (conv1)
model.add(Conv1D (kernel_size=121, filters=48, strides=16, padding='same',
                  input_shape=input_shape))
model.add(Activation(ParametricSoftplus(alpha=0.2, beta=0.5)))
model.add(BatchNormalization())

# Pooling (pool1)
model.add(MaxPooling1D(pool_size=9, strides=4, padding='same'))

# 2nd Convolutional Layer (conv2)
model.add(Conv1D (kernel_size=25, filters=128, padding='same'))
model.add(Activation(ParametricSoftplus(alpha=0.2, beta=0.5)))
model.add(BatchNormalization())

# Pooling (pool2)
model.add(MaxPooling1D(pool_size=9, strides=4, padding='same'))

# 3rd Convolutional Layer (conv3)
model.add(Conv1D (kernel_size=9, filters=192, padding='same'))
model.add(Activation(ParametricSoftplus(alpha=0.2, beta=0.5)))

# 4th Convolutional Layer (conv4)
model.add(Conv1D (kernel_size=9, filters=192, padding='same'))
model.add(Activation(ParametricSoftplus(alpha=0.2, beta=0.5)))

# 5th Convolutional Layer (conv5)
model.add(Conv1D (kernel_size=9, filters=128, padding='same'))
model.add(Activation(ParametricSoftplus(alpha=0.2, beta=0.5)))

# Pooling (pool5)
model.add(MaxPooling1D(pool_size=9, strides=4, padding='same'))

# 1st Dense Layer (full6)
model.add(Flatten())
model.add(Dense(4096))
model.add(Activation(ParametricSoftplus(alpha=0.2, beta=0.5)))
model.add(Dropout(0.5)) # Drop-out value is not specified in the paper

# 2nd Dense Layer (full7)
model.add(Dense(4096))
model.add(Activation(ParametricSoftplus(alpha=0.2, beta=0.5)))
model.add(Dropout(0.5))

# Output Layer (full8)
model.add(Dense(50))
model.add(Activation('sigmoid'))

model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv1d_1 (Conv1D)            (None, 6000, 48)          5856      
_________________________________________________________________
activation_1 (Activation)    (None, 6000, 48)          0         
_________________________________________________________________
batch_normalization_1 (Batch (None, 6000, 48)          192       
_________________________________________________________________
max_pooling1d_1 (MaxPooling1 (None, 1500, 48)          0         
_________________________________________________________________
conv1d_2 (Conv1D)            (None, 1500, 128)         153728    
_________________________________________________________________
activation_2 (Activation)    (None, 1500, 128)         0         
_________________________________________________________________
batch_normalization_2 (Batch (None, 1500, 128)         512       
__________
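
The shapes and parameter counts in the first rows of the summary can be reproduced by hand. The sketch below assumes the usual rule for `padding='same'` (output length is `ceil(input_length / stride)`) and that the batch normalization count includes four values per channel. The helper names are hypothetical.

```python
import math


def same_out_len(length, stride=1):
    """Output length of a 1-D conv or pooling layer with padding='same'."""
    return math.ceil(length / stride)


def conv1d_params(kernel_size, in_channels, filters):
    """Weights (kernel_size * in_channels * filters) plus one bias per filter."""
    return kernel_size * in_channels * filters + filters


def batchnorm_params(channels):
    """gamma, beta, moving mean and moving variance for every channel."""
    return 4 * channels


conv1_len = same_out_len(96000, stride=16)     # conv1, stride 16
pool1_len = same_out_len(conv1_len, stride=4)  # pool1, stride 4
conv2_len = same_out_len(pool1_len)            # conv2, default stride 1

print(conv1_len, conv1d_params(121, 1, 48), batchnorm_params(48))
print(pool1_len)
print(conv2_len, conv1d_params(25, 48, 128), batchnorm_params(128))
```

These values agree with the `conv1d_1`, `batch_normalization_1`, `max_pooling1d_1`, `conv1d_2` and `batch_normalization_2` rows shown above.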

(4) Compile 

In [0]:
# Adam settings from the paper: alpha=0.0002, beta_1=0.5, beta_2=0.999, epsilon=1e-8
adam_with_params = keras.optimizers.Adam(lr=0.0002, beta_1=0.5, beta_2=0.999,
                                         epsilon=1e-8)


model.compile(loss='binary_crossentropy', optimizer=adam_with_params,
              metrics=['accuracy'])
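
For reference, the binary cross-entropy used as the loss treats each of the 50 outputs as an independent yes/no prediction. The NumPy sketch below shows the idea. It is an illustration rather than the Keras implementation, and the clipping constant `eps` is an assumption.

```python
import numpy as np


def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy over all samples and all labels."""
    # Clip predictions so that log(0) never occurs.
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    per_label = -(y_true * np.log(y_pred) + (1.0 - y_true) * np.log(1.0 - y_pred))
    return per_label.mean()


# Two dummy samples with 50 labels each; three labels are switched on.
y_true = np.zeros((2, 50))
y_true[0, [3, 7]] = 1.0
y_true[1, 10] = 1.0

# A model that outputs 0.5 for every label is maximally uncertain.
y_pred = np.full((2, 50), 0.5)
loss = binary_crossentropy(y_true, y_pred)
print(loss)
```

With every output at 0.5 the loss equals `log(2)` whatever the targets are, which makes it a handy baseline when reading training logs.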

(5) Train

In [0]:
model.fit(sample_x, sample_y, batch_size=36, epochs=10, verbose=1, validation_split=0.2, shuffle=True)

Train on 80 samples, validate on 20 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x7f9b7ef03128>
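
The paper keeps the model from the epoch with the highest validation performance. One way to pick that epoch from a list of per-epoch validation scores is sketched below. The helper `best_epoch` and the scores are hypothetical and for illustration only; they are not results from this notebook.

```python
def best_epoch(val_scores, higher_is_better=True):
    """Return the 1-based epoch with the best validation score.

    On ties, the earliest epoch wins.
    """
    if not val_scores:
        raise ValueError("val_scores is empty")
    pick = max if higher_is_better else min
    best_index = pick(range(len(val_scores)), key=val_scores.__getitem__)
    return best_index + 1


# Hypothetical validation scores, one per epoch (made up for illustration).
val_scores = [0.61, 0.70, 0.74, 0.73, 0.72]
print(best_epoch(val_scores))  # 3

# The same helper works for a loss, where lower is better.
val_losses = [0.69, 0.41, 0.38, 0.40]
print(best_epoch(val_losses, higher_is_better=False))  # 3
```

In Keras, the `History` object returned by `model.fit` stores such per-epoch lists in its `history` dictionary, for example under the key `val_loss` when validation data is used.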

In [0]:
# %time pred = model.predict_classes(np.expand_dims(x_test[0], axis=0))
# print('Predict:', pred)

# [loss, accuracy] = model.evaluate(x_test, y_test)
# print('Loss:', loss, 'Accuracy:', accuracy)