<a href="https://colab.research.google.com/github/jeet-yadav27/Deep_learning/blob/main/8_How_to_Develop_CNN_MODEL.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

1. Univariate CNN Models
2. Multivariate CNN Models
3. Multi-step CNN Models
4. Multivariate Multi-step CNN Models

### Univariate CNN Architecture

1. Input Layer: 1D sequence of univariate time series values.
2. Convolutional Layers: 1D convolutional filters scan the input sequence.
3. Activation Functions: ReLU, Tanh, or others.
4. Pooling Layers: Downsample the feature maps.
5. Flatten Layer: Convert feature maps to 1D.
6. Dense Layers: Output forecasted values.

Key Components

1. Convolutional Filters: Learn local patterns and features.
2. Stride: Control the filter's movement along the sequence.
3. Padding: Handle boundary effects.
4. Dilation: Increase receptive field.
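
The effect of kernel size, stride, padding, and dilation on the length of a feature map can be checked with a little arithmetic. The sketch below is a plain-Python helper (the name `conv1d_output_length` is an assumption for illustration, not a Keras function) that applies the usual output-length rule for `'valid'` and `'same'` padding. The last lines apply it to a 3-step input with a kernel of size 2 followed by pooling of size 2, the shapes used by the small models in this notebook.

```python
import math

def conv1d_output_length(n_steps, kernel_size, stride=1, padding="valid", dilation=1):
    """Length of the time axis after a 1D convolution or pooling layer (hypothetical helper)."""
    if padding == "same":
        # 'same' pads the input so that only the stride shortens the sequence
        return math.ceil(n_steps / stride)
    if padding == "valid":
        # A dilated kernel spans dilation * (kernel_size - 1) + 1 input steps
        span = dilation * (kernel_size - 1) + 1
        if n_steps < span:
            raise ValueError("kernel span is longer than the input sequence")
        return (n_steps - span) // stride + 1
    raise ValueError("padding must be 'valid' or 'same'")

# Conv1D(kernel_size=2) on 3 time steps, then MaxPooling1D(pool_size=2)
conv_len = conv1d_output_length(3, kernel_size=2)
pool_len = conv1d_output_length(conv_len, kernel_size=2, stride=2)
print(conv_len, pool_len)  # 2 1
```

With only three input steps, the convolution leaves two positions and the pooling layer reduces them to one, so the `Flatten` layer receives one value per filter.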

Univariate CNN Advantages

1. Automatic Feature Extraction: Learn relevant features.
2. Handling Non-Linear Relationships: Non-linear activation functions.
3. Robustness to Noise: Pooling and convolutional operations.

Theoretical Considerations

1. Receptive Field: Number of input values seen by each filter.
2. Filter Size: Controls the number of learnable parameters.
3. Number of Filters: Controls the number of feature maps.
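
The three quantities above can be made concrete with two small helpers. This is a sketch under stated assumptions: both function names are hypothetical, the parameter count assumes a standard Conv1D layer with one bias per filter, and the receptive field uses the common layer-by-layer recurrence for stacked convolution and pooling layers.

```python
def conv1d_param_count(kernel_size, in_channels, filters, use_bias=True):
    """Learnable parameters in one Conv1D layer (hypothetical helper)."""
    weights = kernel_size * in_channels * filters
    return weights + (filters if use_bias else 0)

def receptive_field(layers):
    """Input steps seen by one output unit of a stack of conv/pool layers.

    `layers` is a list of (kernel_size, stride, dilation) tuples, first layer first.
    """
    field, jump = 1, 1
    for kernel_size, stride, dilation in layers:
        # Each layer widens the field by its dilated kernel extent,
        # scaled by the cumulative stride of the layers before it
        field += (kernel_size - 1) * dilation * jump
        jump *= stride
    return field

# Conv1D(filters=64, kernel_size=2) on a single input feature
print(conv1d_param_count(kernel_size=2, in_channels=1, filters=64))  # 192

# Conv1D(kernel_size=2) followed by MaxPooling1D(pool_size=2, stride 2)
print(receptive_field([(2, 1, 1), (2, 2, 1)]))  # 3
```

A receptive field of 3 means that each pooled unit in the small models of this notebook sees the whole 3-step input window.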

Common Architectures

1. CNN-FCN (Convolutional Neural Network-Fully Convolutional Network)
2. TCN (Temporal Convolutional Network)
3. ResCNN (Residual Convolutional Neural Network)

Training Considerations

1. Loss Functions: Mean Squared Error (MSE), Mean Absolute Error (MAE).
2. Optimization Algorithms: Adam, SGD.
3. Regularization Techniques: Dropout, L1/L2 regularization.
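
As a quick illustration of the two loss functions named above, the following plain-Python sketch computes MSE and MAE by hand. The target and prediction values are made up for the example and do not come from any model in this notebook.

```python
def mean_squared_error(y_true, y_pred):
    """Average of squared differences; penalizes large errors more heavily."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mean_absolute_error(y_true, y_pred):
    """Average of absolute differences; less sensitive to outliers."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Illustrative values only
y_true = [100.0, 110.0, 120.0]
y_pred = [101.0, 110.0, 116.0]

print(mean_squared_error(y_true, y_pred))   # (1 + 0 + 16) / 3
print(mean_absolute_error(y_true, y_pred))  # (1 + 0 + 4) / 3
```

The single error of 4 accounts for most of the MSE but only part of the MAE, which is the practical difference between the two.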

Real-World Applications

1. Time Series Forecasting: Financial markets, weather, traffic.
2. Anomaly Detection: Fault detection, intrusion detection.
3. Signal Processing: Audio, image processing.

In [None]:
!pip install tensorflow



In [None]:
 !pip install tensorflow==2.13.0


In [None]:

# Univariate CNN example

from numpy import array
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv1D, MaxPooling1D, Flatten, Dense

# Split a univariate sequence into samples
def split_sequence(sequence, n_steps):
    X, y = list(), list()
    for i in range(len(sequence)):
        # Find the end of this pattern
        end_ix = i + n_steps

        # Check if we are beyond the sequence
        if end_ix > len(sequence) - 1:
            break

        # Gather input and output parts of the pattern
        seq_x, seq_y = sequence[i:end_ix], sequence[end_ix]
        X.append(seq_x)
        y.append(seq_y)

    return array(X), array(y)

# Define input sequence
raw_seq = [10, 20, 30, 40, 50, 60, 70, 80, 90]

# Choose a number of time steps
n_steps = 3

# Split into samples
X, y = split_sequence(raw_seq, n_steps)

# Reshape from [samples, timesteps] into [samples, timesteps, features]
n_features = 1
X = X.reshape((X.shape[0], X.shape[1], n_features))

# Define model
model = Sequential()
model.add(Conv1D(filters=64, kernel_size=2, activation='relu', input_shape=(n_steps, n_features)))
model.add(MaxPooling1D(pool_size=2))
model.add(Flatten())
model.add(Dense(50, activation='relu'))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')

# Fit model
model.fit(X, y, epochs=1000, verbose=0)

# Demonstrate prediction
x_input = array([70, 80, 90])
x_input = x_input.reshape((1, n_steps, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)

[[101.41136]]
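
To see what the windowing step in the cell above feeds to the model, the sketch below lists the training pairs for the same sequence. It is a plain-Python restatement for illustration; the name `sliding_windows` is an assumption and is not used elsewhere in this notebook.

```python
def sliding_windows(sequence, n_steps):
    """Pair each run of n_steps values with the value that follows it."""
    pairs = []
    for start in range(len(sequence) - n_steps):
        window = sequence[start:start + n_steps]
        target = sequence[start + n_steps]
        pairs.append((window, target))
    return pairs

raw_seq = [10, 20, 30, 40, 50, 60, 70, 80, 90]
pairs = sliding_windows(raw_seq, n_steps=3)
for window, target in pairs:
    print(window, "->", target)
```

The nine values yield six training pairs, from `[10, 20, 30] -> 40` to `[60, 70, 80] -> 90`. The window `[70, 80, 90]` never appears as a training input, which is why it is used for the prediction; the sequence suggests a next value of 100.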


Second approach to univariate TSF: multi-headed CNN model

Here's a theoretical explanation of a multi-headed CNN model for univariate time series forecasting (TSF):

Architecture:

1. Input Layer: 1D sequence of univariate time series values.
2. Convolutional Block:
    - Multiple parallel convolutional layers (heads) with different kernel sizes.
    - Each head extracts features at different scales.
3. Concatenation Layer: Combine feature maps from all heads.
4. Pooling Layer: Downsample the concatenated feature maps.
5. Flatten Layer: Convert feature maps to 1D.
6. Dense Layers: Output forecasted values.

Multi-Headed CNN Components:

1. Multiple Heads: Each head is a convolutional layer with a different kernel size, allowing the model to capture features at multiple scales.
2. Kernel Sizes: Typically, small (e.g., 3), medium (e.g., 5), and large (e.g., 11) kernel sizes are used.
3. Feature Concatenation: Combining features from all heads allows the model to leverage information from multiple scales.
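
One practical detail of feature concatenation is that heads with different kernel sizes produce feature maps of different lengths unless padding is used. The sketch below checks this with shape arithmetic only; the helper names, the 24-step input, and the 64 filters per head are assumptions chosen for illustration.

```python
def head_output_shape(n_steps, kernel_size, filters, padding):
    """(time steps, channels) produced by one Conv1D head with stride 1."""
    if padding == "same":
        return (n_steps, filters)
    return (n_steps - kernel_size + 1, filters)

def concatenated_shape(n_steps, kernel_sizes, filters, padding):
    """Shape after joining all heads along the channel axis."""
    shapes = [head_output_shape(n_steps, k, filters, padding) for k in kernel_sizes]
    lengths = {steps for steps, _ in shapes}
    if len(lengths) != 1:
        # Channel-wise concatenation needs the same number of time steps per head
        raise ValueError(f"heads disagree on sequence length: {sorted(lengths)}")
    return (lengths.pop(), sum(channels for _, channels in shapes))

print(concatenated_shape(24, kernel_sizes=(3, 5, 11), filters=64, padding="same"))  # (24, 192)

try:
    concatenated_shape(24, kernel_sizes=(3, 5, 11), filters=64, padding="valid")
except ValueError as err:
    print("valid padding:", err)
```

With `'same'` padding the three heads can be stacked along the channel axis. Without padding, each head would need to be flattened first and the flat vectors concatenated instead.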

Theoretical Advantages:

1. Multi-Scale Feature Extraction: Captures both short-term and long-term dependencies.
2. Increased Receptive Field: Larger kernel sizes increase the receptive field, allowing the model to consider more historical values.
3. Improved Feature Representation: Concatenating features from multiple heads provides a richer representation of the input data.

Training Considerations:

1. Loss Function: Mean Squared Error (MSE) or Mean Absolute Error (MAE) are common choices.
2. Optimization Algorithm: Adam or SGD with momentum.
3. Regularization Techniques: Dropout, L1/L2 regularization.

Real-World Applications:

1. Financial Forecasting: Predicting stock prices, currency exchange rates.
2. Weather Forecasting: Predicting temperature, precipitation.
3. Traffic Forecasting: Predicting traffic volume.

Comparison to Other Models:

1. Single-Headed CNN: Limited to a single scale, may miss important features.
2. RNNs/LSTMs: May struggle with long-term dependencies, require more parameters.
3. ARIMA: Limited to linear relationships.


In [None]:
# Multivariate Multi-Headed 1D CNN example

from numpy import array
from numpy import hstack
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import Flatten
from tensorflow.keras.layers import Conv1D
from tensorflow.keras.layers import MaxPooling1D
from tensorflow.keras.layers import concatenate

# Split a multivariate sequence into samples
def split_sequences(sequences, n_steps):
    X, y = list(), list()
    for i in range(len(sequences)):
        # Find the end of this pattern
        end_ix = i + n_steps

        # Check if we are beyond the dataset
        if end_ix > len(sequences):
            break

        # Gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :-1], sequences[end_ix-1, -1]
        X.append(seq_x)
        y.append(seq_y)

    return array(X), array(y)

# Define input sequence
in_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])
out_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])

# Convert to [rows, columns] structure
in_seq1 = in_seq1.reshape((len(in_seq1), 1))
in_seq2 = in_seq2.reshape((len(in_seq2), 1))
out_seq = out_seq.reshape((len(out_seq), 1))

# Horizontally stack columns
dataset = hstack((in_seq1, in_seq2, out_seq))

# Choose a number of time steps
n_steps = 3

# Convert into input/output
X, y = split_sequences(dataset, n_steps)

# One time series per head
n_features = 1

# Separate input data
X1 = X[:, :, 0].reshape(X.shape[0], X.shape[1], n_features)
X2 = X[:, :, 1].reshape(X.shape[0], X.shape[1], n_features)

# First input model
visible1 = Input(shape=(n_steps, n_features))
cnn1 = Conv1D(filters=64, kernel_size=2, activation='relu')(visible1)
cnn1 = MaxPooling1D(pool_size=2)(cnn1)
cnn1 = Flatten()(cnn1)

# Second input model
visible2 = Input(shape=(n_steps, n_features))
cnn2 = Conv1D(filters=64, kernel_size=2, activation='relu')(visible2)
cnn2 = MaxPooling1D(pool_size=2)(cnn2)
cnn2 = Flatten()(cnn2)

# Merge input models
merge = concatenate([cnn1, cnn2])
dense = Dense(50, activation='relu')(merge)
output = Dense(1)(dense)

model = Model(inputs=[visible1, visible2], outputs=output)
model.compile(optimizer='adam', loss='mse')

# Fit model
model.fit([X1, X2], y, epochs=1000, verbose=0)

# Demonstrate prediction
x_input = array([[80, 85], [90, 95], [100, 105]])  # 3 time steps, 2 input series (one per head)
x1 = x_input[:, 0].reshape((1, n_steps, n_features))
x2 = x_input[:, 1].reshape((1, n_steps, n_features))
yhat = model.predict([x1, x2], verbose=0)
print(yhat)

[[205.75526]]


Multiple Parallel Series

A theoretical overview of univariate multiple parallel series forecasting:

Problem Statement:

Given a univariate time series dataset with multiple parallel series, forecast future values.

Definition:

- Univariate: Single variable or feature.
- Multiple Parallel Series: Multiple time series with the same frequency and related to the same phenomenon.

Example:

- Daily temperature readings from multiple cities.
- Monthly sales data from multiple stores.

Theoretical Framework:

1. Single-Model Approach: Train a single model on all parallel series.
    - Advantages: Simplifies training and prediction.
    - Disadvantages: May not capture series-specific patterns.
2. Multi-Model Approach: Train separate models for each parallel series.
    - Advantages: Captures series-specific patterns.
    - Disadvantages: Increases model complexity and training time.
3. Hybrid Approach: Combine single-model and multi-model approaches.
    - Advantages: Balances simplicity and series-specific pattern capture.
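
The single-model approach can be sketched as a data-preparation step: windows from every series are pooled into one training set, so one model sees all of them. The code below is a minimal plain-Python illustration; the store names, the toy values, and both helper names are hypothetical.

```python
def make_windows(series, n_steps):
    """Split one series into (input window, next value) pairs."""
    X, y = [], []
    for start in range(len(series) - n_steps):
        X.append(series[start:start + n_steps])
        y.append(series[start + n_steps])
    return X, y

def pool_parallel_series(series_by_name, n_steps):
    """Stack the windows of every series into one training set."""
    X_all, y_all, owner = [], [], []
    for name, series in series_by_name.items():
        X, y = make_windows(series, n_steps)
        X_all.extend(X)
        y_all.extend(y)
        # Remember which series each sample came from
        owner.extend([name] * len(y))
    return X_all, y_all, owner

# Toy data: two parallel series with the same frequency
stores = {
    "store_a": [10, 20, 30, 40, 50],
    "store_b": [15, 25, 35, 45, 55],
}
X_all, y_all, owner = pool_parallel_series(stores, n_steps=3)
print(len(X_all), y_all, owner)
```

For the multi-model approach, the same `make_windows` call would instead be run once per series and each result passed to its own model. Keeping the `owner` list makes it possible to evaluate the pooled model per series afterwards.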

Architectures:

1. Shared-Weights CNN: Single CNN with shared weights across all series.
2. Separate-Weights CNN: Separate CNNs for each series.
3. Hierarchical CNN: Shared CNN followed by series-specific CNNs.

Training Strategies:

1. Joint Training: Train all models simultaneously.
2. Sequential Training: Train each model separately.
3. Transfer Learning: Pre-train on one series, fine-tune on others.

Evaluation Metrics:

1. Mean Absolute Error (MAE).
2. Mean Squared Error (MSE).
3. Root Mean Squared Percentage Error (RMSPE).
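
RMSPE is less commonly built into libraries than MAE and MSE, so a short sketch may help. The definition used here (root of the mean squared relative error, reported in percent) is one common convention and should be treated as an assumption; the sample values are made up.

```python
import math

def rmspe(y_true, y_pred):
    """Root mean squared percentage error, in percent.

    Undefined when a true value is zero, so that case is rejected.
    """
    if any(t == 0 for t in y_true):
        raise ValueError("RMSPE is undefined when a true value is zero")
    ratios = [((t - p) / t) ** 2 for t, p in zip(y_true, y_pred)]
    return 100.0 * math.sqrt(sum(ratios) / len(ratios))

# Both predictions are off by 10% of the true value, on different scales
print(rmspe([100.0, 200.0], [110.0, 180.0]))  # roughly 10 (percent)
```

Because each error is divided by its true value, series on different scales contribute equally, which can be useful when comparing parallel series of different magnitudes.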

Real-World Applications:

1. Climate Modeling: Forecast temperature, precipitation.
2. Economics: Forecast GDP, inflation.
3. Finance: Forecast stock prices.

Comparison to Other Methods:

1. ARIMA: Limited to linear relationships.
2. LSTM: May struggle with long-term dependencies.
3. Prophet: May not capture complex patterns.

Theoretical Considerations:

1. Series Correlation: Account for correlations between series.
2. Series Heterogeneity: Handle varying series characteristics.
3. Overfitting: Regularization techniques.

Future Research Directions:

1. Deep Learning Architectures: Explore novel architectures.
2. Transfer Learning: Investigate pre-training strategies.
3. Explainability: Develop techniques for interpreting forecasts.

This theoretical overview provides a foundation for understanding univariate multiple parallel series forecasting.


In [18]:


# Multivariate Output 1D CNN Example

from numpy import array
from numpy import hstack
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import Flatten
from tensorflow.keras.layers import Conv1D
from tensorflow.keras.layers import MaxPooling1D

# Split a multivariate sequence into samples
def split_sequences(sequences, n_steps):
    X, y = list(), list()
    for i in range(len(sequences)):
        # Find the end of this pattern
        end_ix = i + n_steps

        # Check if we are beyond the dataset
        if end_ix > len(sequences) - 1:
            break

        # Gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :], sequences[end_ix, :]
        X.append(seq_x)
        y.append(seq_y)

    return array(X), array(y)

# Define input sequence
in_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])
out_seq = array([in_seq1[i] + in_seq2[i] for i in range(len(in_seq1))])

# Convert to [rows, columns] structure
in_seq1 = in_seq1.reshape((len(in_seq1), 1))
in_seq2 = in_seq2.reshape((len(in_seq2), 1))
out_seq = out_seq.reshape((len(out_seq), 1))

# Horizontally stack columns
dataset = hstack((in_seq1, in_seq2, out_seq))

# Choose a number of time steps
n_steps = 3

# Convert into input/output
X, y = split_sequences(dataset, n_steps)

# The dataset knows the number of features, e.g. 3
n_features = X.shape[2]

# Define model
model = Sequential()
model.add(Conv1D(filters=64, kernel_size=2, activation='relu', input_shape=(n_steps, n_features)))
model.add(MaxPooling1D(pool_size=2))
model.add(Flatten())
model.add(Dense(50, activation='relu'))
model.add(Dense(n_features))
model.compile(optimizer='adam', loss='mse')

# Fit model
model.fit(X, y, epochs=3000, verbose=0)

# Demonstrate prediction
x_input = array([[70, 75, 145], [80, 85, 165], [90, 95, 185]])
x_input = x_input.reshape((1, n_steps, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)

[[100.16192 105.37202 205.47534]]


Here's a theoretical explanation of a Univariate Multi-Output CNN model:

Problem Statement:

Given a univariate time series dataset, predict multiple future values simultaneously.

Architecture:

1. Input Layer: 1D sequence of univariate time series values.
2. Convolutional Block:
    - Multiple convolutional layers with different kernel sizes.
    - Each layer extracts features at different scales.
3. Flatten Layer: Convert feature maps to 1D.
4. Dense Layers:
    - Multiple dense layers with different output dimensions.
    - Each layer predicts a specific future value.
5. Output Layer: Multiple outputs, each corresponding to a predicted future value.

Theoretical Advantages:

1. Multi-Scale Feature Extraction: Captures both short-term and long-term dependencies.
2. Simultaneous Prediction: Predicts multiple future values, reducing computational complexity.
3. Shared Knowledge: Shared convolutional features across outputs improve overall performance.

Key Components:

1. Convolutional Layers: Extract local patterns and trends.
2. Dilation: Increases receptive field, capturing longer-term dependencies.
3. Residual Connections: Enhance feature propagation and training stability.
4. Output Layers: Separate dense layers for each predicted value.

Training Considerations:

1. Loss Function: Combine losses for each output (e.g., MSE, MAE).
2. Optimization Algorithm: Adam, SGD with momentum.
3. Regularization Techniques: Dropout, L1/L2 regularization.
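
Combining losses across outputs usually means taking a weighted sum of the per-output losses, which, as far as we know, is also how Keras treats a multi-output model compiled with a single loss. The sketch below reproduces that arithmetic in plain Python; the function names and all numbers are illustrative assumptions.

```python
def mse(y_true, y_pred):
    """Mean squared error for one output."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def combined_loss(targets, predictions, loss_weights=None):
    """Weighted sum of per-output MSE values (weights default to 1)."""
    if loss_weights is None:
        loss_weights = [1.0] * len(targets)
    per_output = [mse(t, p) for t, p in zip(targets, predictions)]
    total = sum(w * loss for w, loss in zip(loss_weights, per_output))
    return total, per_output

# One sample, three outputs (made-up targets and predictions)
targets = [[100.0], [105.0], [205.0]]
predictions = [[101.0], [107.0], [208.0]]

total, per_output = combined_loss(targets, predictions)
print(per_output, total)  # [1.0, 4.0, 9.0] 14.0
```

Outputs on a larger scale tend to dominate an unweighted sum, as the third output does here. Passing smaller `loss_weights` for such outputs is one way to rebalance training.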

Real-World Applications:

1. Financial Forecasting: Predict stock prices, currency exchange rates.
2. Weather Forecasting: Predict temperature, precipitation.
3. Energy Demand Forecasting: Predict energy consumption.

Comparison to Other Models:

1. ARIMA: Limited to linear relationships.
2. LSTM: May struggle with long-term dependencies.
3. Prophet: May not capture complex patterns.

Theoretical Considerations:

1. Output Correlation: Account for correlations between predicted values.
2. Output Heterogeneity: Handle varying output characteristics.
3. Overfitting: Regularization techniques.

In [19]:
# Multi-output 1D CNN example (one output layer per series)

from numpy import array
from numpy import hstack
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import Flatten
from tensorflow.keras.layers import Conv1D
from tensorflow.keras.layers import MaxPooling1D

# Split a multivariate sequence into samples
def split_sequences(sequences, n_steps):
    X, y = list(), list()
    for i in range(len(sequences)):
        # Find the end of this pattern
        end_ix = i + n_steps

        # Check if we are beyond the dataset
        if end_ix > len(sequences) - 1:
            break

        # Gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :], sequences[end_ix, :]
        X.append(seq_x)
        y.append(seq_y)

    return array(X), array(y)

# Define input sequence
in_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])
out_seq = array([in_seq1[i] + in_seq2[i] for i in range(len(in_seq1))])

# Convert to [rows, columns] structure
in_seq1 = in_seq1.reshape((len(in_seq1), 1))
in_seq2 = in_seq2.reshape((len(in_seq2), 1))
out_seq = out_seq.reshape((len(out_seq), 1))

# Horizontally stack columns
dataset = hstack((in_seq1, in_seq2, out_seq))

# Choose a number of time steps
n_steps = 3

# Convert into input/output
X, y = split_sequences(dataset, n_steps)

# The dataset knows the number of features, e.g. 3
n_features = X.shape[2]

# Separate output
y1 = y[:, 0].reshape((y.shape[0], 1))
y2 = y[:, 1].reshape((y.shape[0], 1))
y3 = y[:, 2].reshape((y.shape[0], 1))

# Define model
visible = Input(shape=(n_steps, n_features))
cnn = Conv1D(filters=64, kernel_size=2, activation='relu')(visible)
cnn = MaxPooling1D(pool_size=2)(cnn)
cnn = Flatten()(cnn)
cnn = Dense(50, activation='relu')(cnn)

# Define outputs
output1 = Dense(1)(cnn)
output2 = Dense(1)(cnn)
output3 = Dense(1)(cnn)

# Tie together
model = Model(inputs=visible, outputs=[output1, output2, output3])
model.compile(optimizer='adam', loss='mse')

# Fit model
model.fit(X, [y1, y2, y3], epochs=2000, verbose=0)

# Demonstrate prediction
x_input = array([[70, 75, 145], [80, 85, 165], [90, 95, 185]])
x_input = x_input.reshape((1, n_steps, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)

[array([[101.35368]], dtype=float32), array([[107.16554]], dtype=float32), array([[208.35492]], dtype=float32)]


Multi-step CNN models are a type of deep learning architecture used for time series forecasting and sequence prediction tasks. Here's a theoretical explanation:

Architecture:

1. Input Layer: 1D sequence of values.
2. Convolutional Block:
    - Multiple convolutional layers with different kernel sizes.
    - Each layer extracts features at different scales.
3. Flatten Layer: Convert feature maps to 1D.
4. Dense Layers:
    - Multiple dense layers with different output dimensions.
    - Each layer predicts a specific future value.
5. Output Layer: Multiple outputs, each corresponding to a predicted future value.

Theory:

1. Convolutional Neural Networks (CNNs): CNNs are designed to extract local patterns and trends in data.
2. Multi-Step Prediction: Instead of predicting a single future value, multi-step models predict multiple future values.
3. Sequence-to-Sequence (Seq2Seq) Modeling: Multi-step CNNs can be viewed as Seq2Seq models, where the input sequence is mapped to multiple output sequences.
4. Temporal Hierarchies: Multi-step CNNs can capture temporal hierarchies by using multiple convolutional layers with different kernel sizes.
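
For multi-step prediction, each training target is a short vector of future values and not a single number. The plain-Python sketch below shows the resulting input and output pairs for a toy sequence; `multi_step_windows` is a hypothetical name used only for this illustration.

```python
def multi_step_windows(sequence, n_steps_in, n_steps_out):
    """Pair each input window with the n_steps_out values that follow it."""
    pairs = []
    last_start = len(sequence) - n_steps_in - n_steps_out
    for start in range(last_start + 1):
        split = start + n_steps_in
        pairs.append((sequence[start:split], sequence[split:split + n_steps_out]))
    return pairs

raw_seq = [10, 20, 30, 40, 50, 60, 70, 80, 90]
pairs = multi_step_windows(raw_seq, n_steps_in=3, n_steps_out=2)
for x, y in pairs:
    print(x, "->", y)
```

With three input steps and two output steps, the nine values give five samples, and the final dense layer needs two units, one per forecast step.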

Key Components:

1. Convolutional Layers: Extract local patterns and trends.
2. Dilation: Increases receptive field, capturing longer-term dependencies.
3. Residual Connections: Enhance feature propagation and training stability.
4. Output Layers: Separate dense layers for each predicted value.

Training Considerations:

1. Loss Function: Combine losses for each output (e.g., MSE, MAE).
2. Optimization Algorithm: Adam, SGD with momentum.
3. Regularization Techniques: Dropout, L1/L2 regularization.

Advantages:

1. Improved Accuracy: Multi-step models can capture complex patterns.
2. Increased Interpretability: Separate outputs provide insights into future values.
3. Flexibility: Can be used for various sequence prediction tasks.

Challenges:

1. Increased Complexity: Multi-step models require more parameters.
2. Overfitting: Regularization techniques are crucial.
3. Computational Cost: Increased computational requirements.

Real-World Applications:

1. Financial Forecasting: Predict stock prices, currency exchange rates.
2. Weather Forecasting: Predict temperature, precipitation.
3. Energy Demand Forecasting: Predict energy consumption.

Comparison to Other Models:

1. ARIMA: Limited to linear relationships.
2. LSTM: May struggle with long-term dependencies.
3. Prophet: May not capture complex patterns.

In [20]:
# Univariate Multi-Step Vector-Output 1D CNN Example

from numpy import array
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import Flatten
from tensorflow.keras.layers import Conv1D
from tensorflow.keras.layers import MaxPooling1D

# Split a univariate sequence into samples
def split_sequence(sequence, n_steps_in, n_steps_out):
    X, y = list(), list()
    for i in range(len(sequence)):
        # Find the end of this pattern
        end_ix = i + n_steps_in
        out_end_ix = end_ix + n_steps_out

        # Check if we are beyond the sequence
        if out_end_ix > len(sequence):
            break

        # Gather input and output parts of the pattern
        seq_x, seq_y = sequence[i:end_ix], sequence[end_ix:out_end_ix]
        X.append(seq_x)
        y.append(seq_y)

    return array(X), array(y)

# Define input sequence
raw_seq = array([10, 20, 30, 40, 50, 60, 70, 80, 90])

# Choose a number of time steps
n_steps_in, n_steps_out = 3, 2

# Split into samples
X, y = split_sequence(raw_seq, n_steps_in, n_steps_out)

# Reshape from [samples, timesteps] into [samples, timesteps, features]
n_features = 1
X = X.reshape((X.shape[0], X.shape[1], n_features))

# Define model
model = Sequential()
model.add(Conv1D(filters=64, kernel_size=2, activation='relu', input_shape=(n_steps_in, n_features)))
model.add(MaxPooling1D(pool_size=2))
model.add(Flatten())
model.add(Dense(50, activation='relu'))
model.add(Dense(n_steps_out))
model.compile(optimizer='adam', loss='mse')

# Fit model
model.fit(X, y, epochs=2000, verbose=0)

# Demonstrate prediction
x_input = array([70, 80, 90])
x_input = x_input.reshape((1, n_steps_in, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)



[[102.53493 115.45038]]


### Multivariate Multi-Step CNN Model
Here's a theoretical overview of 1D Multivariate Multi-Step CNN models:

Architecture:

1. Input Layer: 1D multivariate sequence (e.g., time series data with multiple variables).
2. Convolutional Block:
    - Multiple convolutional layers with different kernel sizes.
    - Each layer extracts features from different scales.
3. Flatten Layer: Convert feature maps to 1D.
4. Dense Layers:
    - Multiple dense layers with different output dimensions.
    - Each layer predicts a specific future value.
5. Output Layer: Multiple outputs, each corresponding to a predicted future value.

Theory:

1. Multivariate Input: Model takes multiple variables as input, capturing relationships between them.
2. Multi-Step Prediction: Model predicts multiple future values, considering temporal dependencies.
3. Convolutional Neural Networks (CNNs): Effective for extracting local patterns and trends.
4. 1D CNNs: Suitable for sequential data, such as time series.
5. Dilation: Increases receptive field, capturing longer-term dependencies.
6. Residual Connections: Enhance feature propagation and training stability.

Key Components:

1. Convolutional Layers: Extract local patterns and trends.
2. Max Pooling: Shortens the time axis of each feature map, keeping the strongest activations.
3. Flatten Layer: Converts feature maps to 1D.
4. Dense Layers: Predict future values.
5. Output Layers: Separate dense layers for each predicted value.

Training Considerations:

1. Loss Function: Combine losses for each output (e.g., MSE, MAE).
2. Optimization Algorithm: Adam, SGD with momentum.
3. Regularization Techniques: Dropout, L1/L2 regularization.

Advantages:

1. Improved Accuracy: Captures complex patterns and relationships.
2. Increased Interpretability: Separate outputs provide insights into future values.
3. Flexibility: Suitable for various multivariate time series forecasting tasks.

Multiple Input Multi-Step Output models:

Architecture:

1. Multiple Input Layers: Each input layer processes a different input sequence.
2. Concatenate or Merge Layer: Combines input features from different sequences.
3. Convolutional Block:
    - Multiple convolutional layers with different kernel sizes.
    - Each layer extracts features from different scales.
4. Flatten Layer: Convert feature maps to 1D.
5. Dense Layers:
    - Multiple dense layers with different output dimensions.
    - Each layer predicts a specific future value.
6. Output Layer: Multiple outputs, each corresponding to a predicted future value.

Theory:

1. Multiple Input: Model takes multiple input sequences, capturing relationships between them.
2. Multi-Step Prediction: Model predicts multiple future values, considering temporal dependencies.
3. Feature Fusion: Combines input features from different sequences to capture complex relationships.
4. Convolutional Neural Networks (CNNs): Effective for extracting local patterns and trends.

Key Components:

1. Input Layers: Process different input sequences.
2. Concatenate or Merge Layer: Combines input features.
3. Convolutional Layers: Extract local patterns and trends.
4. Dense Layers: Predict future values.
5. Output Layers: Separate dense layers for each predicted value.

Training Considerations:

1. Loss Function: Combine losses for each output (e.g., MSE, MAE).
2. Optimization Algorithm: Adam, SGD with momentum.
3. Regularization Techniques: Dropout, L1/L2 regularization.

Advantages:

1. Improved Accuracy: Captures complex relationships between input sequences.
2. Increased Interpretability: Separate outputs provide insights into future values.
3. Flexibility: Suitable for various multi-input multi-step forecasting tasks.

Challenges:

1. Increased Complexity: Requires careful architecture design.
2. Overfitting: Regularization techniques are crucial.
3. Computational Cost: Increased computational requirements.

In [21]:
# Multivariate Multi-Step 1D CNN Example

from numpy import array
from numpy import hstack
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import Flatten
from tensorflow.keras.layers import Conv1D
from tensorflow.keras.layers import MaxPooling1D

# Split a multivariate sequence into samples
def split_sequences(sequences, n_steps_in, n_steps_out):
    X, y = list(), list()
    for i in range(len(sequences)):
        # Find the end of this pattern
        end_ix = i + n_steps_in
        out_end_ix = end_ix + n_steps_out - 1

        # Check if we are beyond the dataset
        if out_end_ix > len(sequences):
            break

        # Gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :-1], sequences[end_ix-1:out_end_ix, -1]
        X.append(seq_x)
        y.append(seq_y)

    return array(X), array(y)

# Define input sequence
in_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])
out_seq = array([in_seq1[i] + in_seq2[i] for i in range(len(in_seq1))])

# Convert to [rows, columns] structure
in_seq1 = in_seq1.reshape((len(in_seq1), 1))
in_seq2 = in_seq2.reshape((len(in_seq2), 1))
out_seq = out_seq.reshape((len(out_seq), 1))

# Horizontally stack columns
dataset = hstack((in_seq1, in_seq2, out_seq))

# Choose a number of time steps
n_steps_in, n_steps_out = 3, 2

# Convert into input/output
X, y = split_sequences(dataset, n_steps_in, n_steps_out)

# The dataset knows the number of features, e.g. 2
n_features = X.shape[2]

# Define model
model = Sequential()
model.add(Conv1D(filters=64, kernel_size=2, activation='relu', input_shape=(n_steps_in, n_features)))
model.add(MaxPooling1D(pool_size=2))
model.add(Flatten())
model.add(Dense(50, activation='relu'))
model.add(Dense(n_steps_out))
model.compile(optimizer='adam', loss='mse')

# Fit model
model.fit(X, y, epochs=2000, verbose=0)

# Demonstrate prediction
x_input = array([[70, 75], [80, 85], [90, 95]])
x_input = x_input.reshape((1, n_steps_in, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)




[[185.61382 207.25533]]


Multiple Parallel Input and Multi-step Output

Multiple Parallel Input and Multi-step Output models are a type of deep learning architecture designed for sequential data with multiple input variables and multiple output variables. Here's a theoretical overview:

Architecture:

1. Multiple Input Layers: Each input layer processes a different input sequence.
2. Concatenate or Merge Layer: Combines input features from different sequences.
3. Encoder Layer: Extracts relevant features from input sequences.
4. Decoder Layer: Generates output sequences based on extracted features.
5. Output Layer: Produces multiple output sequences.

Theory:

1. Multiple Input: Model takes multiple input sequences, capturing relationships between them.
2. Multi-step Output: Model predicts multiple future values, considering temporal dependencies.
3. Feature Fusion: Combines input features from different sequences to capture complex relationships.
4. Sequence-to-Sequence (Seq2Seq) Modeling: Encoder-decoder architecture for sequential data.

Key Components:

1. Input Layers: Process different input sequences.
2. Encoder Layer: Extracts relevant features from input sequences.
3. Decoder Layer: Generates output sequences based on extracted features.
4. Output Layer: Produces multiple output sequences.
5. Activation Functions: Used in encoder and decoder layers (e.g., ReLU, tanh).

Training Considerations:

1. Loss Function: Combine losses for each output sequence (e.g., MSE, MAE).
2. Optimization Algorithm: Adam, SGD with momentum.
3. Regularization Techniques: Dropout, L1/L2 regularization.

Advantages:

1. Improved Accuracy: Captures complex relationships between input sequences.
2. Increased Interpretability: Separate output sequences provide insights into future values.
3. Flexibility: Suitable for various multi-input multi-step forecasting tasks.

Challenges:

1. Increased Complexity: Requires careful architecture design.
2. Overfitting: Regularization techniques are crucial.
3. Computational Cost: Increased computational requirements.

Real-World Applications:

1. Financial Forecasting: Predict stock prices, currency exchange rates.
2. Weather Forecasting: Predict temperature, precipitation.
3. Energy Demand Forecasting: Predict energy consumption.

Example Architecture:


                      +-----------------+
                      |   Input Layer   |
                      +-----------------+
                               |
                               |
                               v
                      +-----------------+
                      |  Encoder Layer  |
                      |  (LSTM/Conv1D)  |
                      +-----------------+
                               |
                               |
                               v
                      +-----------------+
                      |  Decoder Layer  |
                      |  (LSTM/Conv1D)  |
                      +-----------------+
                               |
                               |
                               v
                      +-----------------+
                      |  Output Layer   |
                      |     (Dense)     |
                      +-----------------+



In [22]:

# Multivariate Output Multi-Step 1D CNN Example

from numpy import array
from numpy import hstack
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import Flatten
from tensorflow.keras.layers import Conv1D
from tensorflow.keras.layers import MaxPooling1D

# Split a multivariate sequence into samples
def split_sequences(sequences, n_steps_in, n_steps_out):
    X, y = list(), list()
    for i in range(len(sequences)):
        # Find the end of this pattern
        end_ix = i + n_steps_in
        out_end_ix = end_ix + n_steps_out

        # Check if we are beyond the dataset
        if out_end_ix > len(sequences):
            break

        # Gather input and output parts of the pattern
        seq_x, seq_y = sequences[i:end_ix, :], sequences[end_ix:out_end_ix, :]
        X.append(seq_x)
        y.append(seq_y)

    return array(X), array(y)

# Define input sequence
in_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])
out_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])

# Convert to [rows, columns] structure
in_seq1 = in_seq1.reshape((len(in_seq1), 1))
in_seq2 = in_seq2.reshape((len(in_seq2), 1))
out_seq = out_seq.reshape((len(out_seq), 1))

# Horizontally stack columns
dataset = hstack((in_seq1, in_seq2, out_seq))

# Choose a number of time steps
n_steps_in, n_steps_out = 3, 2

# Convert into input/output
X, y = split_sequences(dataset, n_steps_in, n_steps_out)

# Flatten output
n_output = y.shape[1] * y.shape[2]
y = y.reshape((y.shape[0], n_output))

# The dataset knows the number of features
n_features = X.shape[2]

# Define model
model = Sequential()
model.add(Conv1D(filters=64, kernel_size=2, activation='relu', input_shape=(n_steps_in, n_features)))
model.add(MaxPooling1D(pool_size=2))
model.add(Flatten())
model.add(Dense(50, activation='relu'))
model.add(Dense(n_output))
model.compile(optimizer='adam', loss='mse')

# Fit model
model.fit(X, y, epochs=7000, verbose=0)

# Demonstrate prediction
x_input = array([[60, 65, 125], [70, 75, 145], [80, 85, 165]])
x_input = x_input.reshape((1, n_steps_in, n_features))
yhat = model.predict(x_input, verbose=0)
print(yhat)



[[ 90.56951  96.18933 187.33656 101.28927 106.82775 208.40642]]
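
The six numbers printed above are a flattened forecast. Because the targets were flattened from `[samples, n_steps_out, n_features]` before training, reshaping the prediction the same way recovers one row per future time step. The sketch below does this for the printed output; the variable name `forecast` is an assumption for illustration.

```python
import numpy as np

n_steps_out, n_features = 2, 3

# Prediction copied from the output above: shape (1, n_steps_out * n_features)
yhat = np.array([[90.56951, 96.18933, 187.33656, 101.28927, 106.82775, 208.40642]])

# Undo the flattening applied to y before training
forecast = yhat.reshape((yhat.shape[0], n_steps_out, n_features))

for step, row in enumerate(forecast[0], start=1):
    print(f"t+{step}:", row)
```

Each row now holds the three series for one forecast step, which can be compared with the values the toy sequences suggest (90, 95, 185 and 100, 105, 205).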



This code defines a multivariate output multi-step 1D CNN model, trains it on a sample sequence, and demonstrates prediction. The model takes 3 input time steps and predicts 2 output time steps for 3 variables.