The functional API makes it easy to manipulate a large number of intertwined datastreams.

Let's consider the following model. We seek to predict how many retweets and likes a news headline will receive on Twitter. The main input to the model will be the headline itself, as a sequence of words, but to spice things up, our model will also have an auxiliary input, receiving extra data such as the time of day when the headline was posted. The model will also be supervised via two loss functions: the main loss at the top of the model, and an auxiliary loss attached earlier. Applying the loss earlier in the model is a good regularization mechanism for deep models.

The main input will receive the headline, as a sequence of integers (each integer encodes a word). The integers will be between 1 and 10,000 (a vocabulary of 10,000 words) and the sequences will be 100 words long.

![multi-input-multi-output-graph.png](multi-input-multi-output-graph.png)

In [6]:
from keras.layers import Input, Embedding, LSTM, Dense
from keras.models import Model
import keras

In [3]:
# Headline input: meant to receive sequences of 100 integers, between 1 and 10000.
# Note that we can name any layer by passing it a "name" argument.
main_input = Input(shape=(100,), dtype='int32', name='main_input')

# This embedding layer will encode the input sequence
# into a sequence of dense 512-dimensional vectors.
# input_dim must be the largest integer index + 1, so 10001 for indices 1..10000.
x = Embedding(output_dim=512, input_dim=10001, input_length=100)(main_input)

# An LSTM will transform the vector sequence into a single vector,
# containing information about the entire sequence.
lstm_out = LSTM(32)(x)
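
To make the shapes concrete, here is a minimal NumPy sketch of what an embedding lookup does to one batch of headlines. It is not Keras code: `embedding_table`, `headlines` and the batch size of 4 are hypothetical stand-ins, and the table is filled with random values rather than trained weights.

```python
import numpy as np

rng = np.random.default_rng(0)

vocab_size = 10000    # words are encoded as integers 1..10000
embedding_dim = 512
seq_len = 100
batch_size = 4        # arbitrary, for illustration only

# Hypothetical stand-in for the Embedding layer's weight matrix.
# One extra row so that index 10000 is valid (row 0 is unused here).
embedding_table = rng.normal(size=(vocab_size + 1, embedding_dim))

# A batch of integer-encoded headlines, shape (batch, 100).
headlines = rng.integers(1, vocab_size + 1, size=(batch_size, seq_len))

# The lookup replaces every integer with the matching row of the table.
embedded = embedding_table[headlines]

print(embedded.shape)  # (4, 100, 512)
```

The `LSTM(32)` layer then reduces each `(100, 512)` sequence to a single 32-dimensional vector, so `lstm_out` has shape `(batch, 32)`.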

Here we insert the auxiliary loss, allowing the LSTM and Embedding layers to be trained smoothly even though the main loss is attached much higher up in the model.



In [4]:
auxiliary_output = Dense(1, activation='sigmoid', name='aux_output')(lstm_out)

At this point, we feed our auxiliary input data into the model by concatenating it with the LSTM output:

In [7]:
auxiliary_input = Input(shape=(5,), name='aux_input')
x = keras.layers.concatenate([lstm_out, auxiliary_input])

# We stack a deep densely-connected network on top
x = Dense(64, activation='relu')(x)
x = Dense(64, activation='relu')(x)
x = Dense(64, activation='relu')(x)

# And finally we add the main logistic regression layer
main_output = Dense(1, activation='sigmoid', name='main_output')(x)
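
The concatenation step joins the two tensors along the feature axis, so the dense stack sees 32 + 5 = 37 features per sample. A small NumPy sketch of the same operation, where `lstm_features` and `aux_features` are hypothetical placeholder arrays and the batch size of 4 is arbitrary:

```python
import numpy as np

batch_size = 4  # arbitrary, for illustration only

# Hypothetical stand-ins for the two tensors being merged.
lstm_features = np.ones((batch_size, 32))   # same shape as the LSTM(32) output
aux_features = np.zeros((batch_size, 5))    # the 5 auxiliary values per sample

# Join along the last (feature) axis; the batch axis is left untouched.
merged = np.concatenate([lstm_features, aux_features], axis=-1)

print(merged.shape)  # (4, 37)
```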

### This defines a model with two inputs and two outputs:

In [10]:
model = Model(inputs=[main_input, auxiliary_input], outputs=[main_output, auxiliary_output])

### We compile the model and assign a weight of 0.2 to the auxiliary loss.

In [9]:
model.compile(optimizer='rmsprop', loss='binary_crossentropy', loss_weights=[1.0, 0.2])
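
With these weights, the quantity being minimized is the weighted sum of the two per-output losses. The arithmetic can be sketched in NumPy as follows; the predictions are made up, and `binary_crossentropy` here is a hand-written helper for illustration, not the Keras implementation.

```python
import numpy as np

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy over a batch (illustrative helper)."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)  # avoid log(0)
    per_sample = -(y_true * np.log(y_pred) + (1.0 - y_true) * np.log(1.0 - y_pred))
    return float(np.mean(per_sample))

# Made-up targets and predictions for the two outputs.
y_true = np.array([1.0, 0.0, 1.0, 0.0])
main_pred = np.array([0.9, 0.2, 0.7, 0.1])
aux_pred = np.array([0.6, 0.4, 0.5, 0.5])

main_loss = binary_crossentropy(y_true, main_pred)
aux_loss = binary_crossentropy(y_true, aux_pred)

# Same weights as in the compile call: 1.0 for main, 0.2 for auxiliary.
total_loss = 1.0 * main_loss + 0.2 * aux_loss

print(main_loss, aux_loss, total_loss)
```

Because the auxiliary term is scaled by 0.2, it nudges the early layers without dominating the gradient from the main output.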

### Train

In [None]:
model.fit([headline_data, additional_data], [labels, labels], epochs=50, batch_size=32)
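
The call above assumes that `headline_data`, `additional_data` and `labels` already exist. To try the model without a real dataset, one option is to generate random placeholder arrays with matching shapes, as sketched below. The sample count of 1000 is arbitrary, and the values carry no signal, so the resulting loss numbers are meaningless.

```python
import numpy as np

rng = np.random.default_rng(0)
num_samples = 1000  # arbitrary, for illustration only

# Integer-encoded headlines: 100 word indices per sample, each in 1..10000.
headline_data = rng.integers(1, 10001, size=(num_samples, 100), dtype=np.int32)

# Five auxiliary features per sample, matching Input(shape=(5,)).
additional_data = rng.random((num_samples, 5)).astype("float32")

# Binary targets, shared by the main and auxiliary outputs.
labels = rng.integers(0, 2, size=(num_samples, 1)).astype("float32")

print(headline_data.shape, additional_data.shape, labels.shape)
```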

Since our inputs and outputs are named (we passed them a "name" argument), we could also have compiled the model via:

In [13]:
#model.compile(optimizer='rmsprop',
#              loss={'main_output': 'binary_crossentropy', 'aux_output': 'binary_crossentropy'},
#              loss_weights={'main_output': 1., 'aux_output': 0.2})

In [None]:
# And trained it via:
#model.fit({'main_input': headline_data, 'aux_input': additional_data},
#          {'main_output': labels, 'aux_output': labels},
#          epochs=50, batch_size=32)