Gated Recurrent Unit (GRU)

Introduction

A Gated Recurrent Unit (GRU) is a recurrent neural network (RNN) architecture widely used in deep learning for sequence-based tasks. GRUs are designed to mitigate the vanishing gradient problem, which hampers traditional RNNs and makes it hard for them to learn long-term dependencies.

Key Features

  • Gates: GRUs have gating units that control the flow of information. The main gates are the update gate and the reset gate.
  • Update Gate: This gate controls how much of the previous hidden state is carried forward and how much of the new candidate state replaces it.
  • Reset Gate: This gate determines how much of the past information to forget.
  • No Separate Memory Cell: Unlike LSTM (Long Short-Term Memory), GRUs do not have a separate memory cell. The hidden state itself carries the information.

How GRUs Work

A GRU's architecture can be summarized with the following steps (a minimal from-scratch sketch follows the list):

  1. Update Gate: The update gate decides what information to keep from the past and what new information to add. It is computed as:

    z_t = σ(W_z * [h_{t-1}, x_t])

    where z_t is the update gate, W_z is the weight matrix, h_{t-1} is the previous hidden state, x_t is the current input, and σ is the sigmoid function.

  2. Reset Gate: The reset gate decides what part of the past information to forget. It is computed as:

    r_t = σ(W_r * [h_{t-1}, x_t])

    where r_t is the reset gate, and W_r is the weight matrix.

  3. New Memory Content: The new memory content is created using the reset gate and the new input:

    h'_t = tanh(W * [r_t * h_{t-1}, x_t])

    where h'_t is the new memory content (the candidate hidden state) and W is its weight matrix.

  4. Final Memory at Current Time Step: The final memory is a combination of the previous memory and the new memory content, controlled by the update gate:

    h_t = (1 - z_t) * h_{t-1} + z_t * h'_t
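
The four steps above map directly to code. Below is a minimal NumPy sketch of a single GRU forward step; the function name gru_cell_forward, the concatenated weight matrices W_z, W_r, W_h, and the omission of bias terms are illustrative assumptions, not the repository's actual implementation. In practice a bias vector is usually added inside each gate, e.g. z_t = σ(W_z · [h_{t-1}, x_t] + b_z).

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_cell_forward(x_t, h_prev, W_z, W_r, W_h):
    """One GRU step following the equations above (biases omitted for brevity).

    x_t    : (input_size,)  current input
    h_prev : (hidden_size,) previous hidden state h_{t-1}
    W_z, W_r, W_h : (hidden_size, hidden_size + input_size) weight matrices
    """
    concat = np.concatenate([h_prev, x_t])          # [h_{t-1}, x_t]

    z_t = sigmoid(W_z @ concat)                     # update gate
    r_t = sigmoid(W_r @ concat)                     # reset gate

    # new memory content: the reset gate scales the previous hidden state
    concat_reset = np.concatenate([r_t * h_prev, x_t])
    h_tilde = np.tanh(W_h @ concat_reset)

    # final hidden state: interpolation controlled by the update gate
    h_t = (1.0 - z_t) * h_prev + z_t * h_tilde
    return h_t
```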

Applications

  • Language Modeling: Predicting the next word in a sentence.
  • Machine Translation: Translating text from one language to another.
  • Speech Recognition: Converting spoken language into text.
  • Time Series Prediction: Forecasting future values in a sequence of data points (a toy unrolling of the cell follows this list).
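
All of these applications come down to running the cell over a sequence, one time step at a time. Here is a toy unrolling of the gru_cell_forward sketch from above on random data; the sizes, the random weights, and the idea of feeding the final hidden state to a downstream predictor are assumptions for illustration only.

```python
import numpy as np

# toy setup: a length-20 sequence of 4-dimensional inputs, 8 hidden units
rng = np.random.default_rng(0)
input_size, hidden_size, seq_len = 4, 8, 20

# small random weights for the update gate, reset gate, and candidate state
W_z = 0.1 * rng.standard_normal((hidden_size, hidden_size + input_size))
W_r = 0.1 * rng.standard_normal((hidden_size, hidden_size + input_size))
W_h = 0.1 * rng.standard_normal((hidden_size, hidden_size + input_size))

xs = rng.standard_normal((seq_len, input_size))   # the input sequence
h = np.zeros(hidden_size)                         # initial hidden state

hidden_states = []
for x_t in xs:                                    # unroll the GRU over time
    h = gru_cell_forward(x_t, h, W_z, W_r, W_h)
    hidden_states.append(h)

# the final hidden state summarizes the whole sequence and could feed a
# linear layer that predicts, e.g., the next value in a time series
print(hidden_states[-1].shape)                    # (8,)
```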

Advantages

  • Simpler Architecture: Compared to LSTMs, GRUs have a simpler structure with fewer gates.
  • Efficient Training: GRUs are computationally efficient and train faster because they have fewer parameters (a quick parameter count is sketched below).
  • Effective for Sequential Data: GRUs handle sequential data and long-term dependencies well.
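
To make the "fewer parameters" point concrete, here is a rough per-layer count, assuming the common formulation with three weight blocks for a GRU (update, reset, candidate) and four for an LSTM (input, forget, output, candidate), each with a bias vector; the layer sizes are arbitrary example values.

```python
def gru_param_count(input_size, hidden_size):
    # 3 blocks: each a (hidden x (hidden + input)) matrix plus a bias vector
    return 3 * (hidden_size * (hidden_size + input_size) + hidden_size)

def lstm_param_count(input_size, hidden_size):
    # 4 blocks: each a (hidden x (hidden + input)) matrix plus a bias vector
    return 4 * (hidden_size * (hidden_size + input_size) + hidden_size)

print(gru_param_count(128, 256))   # 295680
print(lstm_param_count(128, 256))  # 394240
```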

Conclusion

GRUs are powerful and efficient recurrent neural networks that address the limitations of traditional RNNs, making them suitable for various sequential data tasks. Their ability to manage long-term dependencies with a simpler architecture has made them a popular choice in the field of deep learning.
