Overview
This project implements a learnable positional encoding method in PyTorch. Positional encoding is critical for Transformer models because self-attention has no inherent sense of token order. Instead of fixed encodings (such as the sine/cosine scheme), this solution uses learnable embeddings, which adapt to the data distribution and can potentially improve performance.
The project also includes a basic Transformer-based model applied to a dummy dataset for training and evaluation.
In Transformer architectures, stacking self-attention layers with positional encoding can introduce the following challenges:
- Loss of positional information in deeper layers.
- Difficulty optimizing long sequences or large datasets due to computational costs.
- The rigidity of fixed positional encodings in handling diverse sequence lengths or structures.
This project explores a learnable positional encoding method to mitigate these issues and integrates it into a Transformer model.
The solution consists of:
- Learnable Positional Encoding Layer:
Adds position-specific embeddings to input sequences.
Can generalize better than fixed sine/cosine encodings.
- Transformer Encoder:
Stacked self-attention layers for sequence modeling.
- Fully Connected Layer:
Aggregates sequence-level information for classification.
Learnable Positional Encoding:
A layer that adds position-specific embeddings to the input sequence. Implemented using PyTorch nn.Parameter.
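A minimal sketch of such a layer is shown below; the class name LearnablePositionalEncoding, the initialization scale, and the default shapes are illustrative assumptions rather than the exact code in this repository.

```python
import torch
import torch.nn as nn

class LearnablePositionalEncoding(nn.Module):
    """Adds a trainable position embedding to each time step of the input."""

    def __init__(self, seq_len: int, d_model: int):
        super().__init__()
        # One learnable vector per position, initialized with small random values.
        self.pos_embedding = nn.Parameter(torch.randn(1, seq_len, d_model) * 0.02)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model); broadcasting adds the same
        # position embeddings to every sequence in the batch.
        return x + self.pos_embedding[:, : x.size(1), :]
```

Because the embeddings are an `nn.Parameter`, they are updated by backpropagation alongside the rest of the model instead of being fixed at construction time.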
Transformer Model:
Composed of multiple self-attention layers for sequence representation. Uses the learnable positional encoding layer before passing data to attention layers.
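The overall wiring might look like the following sketch, which builds on the positional encoding layer above and uses PyTorch's built-in nn.TransformerEncoder for the self-attention stack; the hyperparameters and the mean-pooling step before the fully connected layer are assumptions for illustration.

```python
class TransformerClassifier(nn.Module):
    """Learnable positional encoding -> Transformer encoder -> linear classifier."""

    def __init__(self, seq_len: int = 10, d_model: int = 16,
                 nhead: int = 4, num_layers: int = 2, num_classes: int = 2):
        super().__init__()
        self.pos_encoding = LearnablePositionalEncoding(seq_len, d_model)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=num_layers)
        self.fc = nn.Linear(d_model, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        x = self.pos_encoding(x)      # inject positional information
        x = self.encoder(x)           # stacked self-attention layers
        # Aggregate across the sequence dimension before classification.
        return self.fc(x.mean(dim=1))
```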
Dummy Dataset:
Randomly generated data simulates sequential inputs of length 10 with 16 features per step.
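Data of this shape can be generated as in the sketch below; the number of samples, the two-class labels, and the batch size are assumed values.

```python
from torch.utils.data import DataLoader, TensorDataset

# 1000 random sequences of length 10 with 16 features per step, plus random labels.
features = torch.randn(1000, 10, 16)
labels = torch.randint(0, 2, (1000,))
loader = DataLoader(TensorDataset(features, labels), batch_size=32, shuffle=True)
```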
Training:
The model is trained on the dummy dataset for 5 epochs with cross-entropy loss and the Adam optimizer.
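A minimal training loop consistent with that setup might look as follows; the learning rate is an assumed value.

```python
model = TransformerClassifier()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

for epoch in range(5):
    total_loss = 0.0
    for batch_x, batch_y in loader:
        optimizer.zero_grad()
        loss = criterion(model(batch_x), batch_y)
        loss.backward()
        optimizer.step()
        total_loss += loss.item()
    print(f"epoch {epoch + 1}: loss {total_loss / len(loader):.4f}")
```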
The project successfully demonstrates:
- Integration of learnable positional encodings in a Transformer model.
- Training on a dummy dataset with minimal overfitting.