# Decoder-Only Transformer
This repository contains an implementation of a Decoder-Only Transformer, a foundational architecture for large language models (LLMs). The model processes input text by passing it through several key components, sketched in code after the list:
- Word Embedding: Converts input tokens (e.g., words) into dense vector representations.
- Positional Encoding: Adds positional information to embeddings, allowing the model to understand token order.
- Masked Self-Attention: Restricts each token to attending only to earlier tokens, enabling autoregressive next-token prediction.
- Residual Connections and Fully Connected Layers: Residual connections help gradients flow through deeper stacks, while the position-wise fully connected (feed-forward) layers add nonlinear transformation capacity.
- Softmax Layer: Outputs probabilities for the next token in the sequence.
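
The sketch below shows roughly how these pieces fit together in PyTorch as a single decoder block. It is a minimal illustration, not the repository's exact code: the class names, dimensions, and the use of `nn.MultiheadAttention` and sinusoidal positional encodings are assumptions.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F


class PositionalEncoding(nn.Module):
    """Fixed sinusoidal positional encoding added to token embeddings (illustrative)."""

    def __init__(self, d_model: int, max_len: int = 512):
        super().__init__()
        position = torch.arange(max_len).unsqueeze(1)  # (max_len, 1)
        div_term = torch.exp(torch.arange(0, d_model, 2) * (-math.log(10000.0) / d_model))
        pe = torch.zeros(max_len, d_model)
        pe[:, 0::2] = torch.sin(position * div_term)
        pe[:, 1::2] = torch.cos(position * div_term)
        self.register_buffer("pe", pe)

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, seq, d_model)
        return x + self.pe[: x.size(1)]


class DecoderOnlyTransformer(nn.Module):
    """Single-block sketch: embedding -> positional encoding -> masked
    self-attention -> residual + feed-forward -> vocabulary logits."""

    def __init__(self, vocab_size: int, d_model: int = 64, n_heads: int = 4, max_len: int = 512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.pos_enc = PositionalEncoding(d_model, max_len)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ffn = nn.Sequential(
            nn.Linear(d_model, 4 * d_model), nn.ReLU(), nn.Linear(4 * d_model, d_model)
        )
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:  # token_ids: (batch, seq)
        x = self.pos_enc(self.embed(token_ids))
        # Causal mask: position i may only attend to positions <= i.
        seq_len = token_ids.size(1)
        causal_mask = torch.triu(
            torch.ones(seq_len, seq_len, dtype=torch.bool, device=token_ids.device), diagonal=1
        )
        attn_out, _ = self.attn(x, x, x, attn_mask=causal_mask)
        x = x + attn_out          # residual connection around attention
        x = x + self.ffn(x)       # residual connection around the feed-forward layers
        return self.out(x)        # logits over the vocabulary


# Example: probabilities for the next token after a short (toy) prompt.
model = DecoderOnlyTransformer(vocab_size=100)
logits = model(torch.tensor([[1, 5, 7]]))               # shape (1, 3, 100)
next_token_probs = F.softmax(logits[:, -1, :], dim=-1)  # softmax over the vocabulary
```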
This implementation is ideal for experimenting with LLM concepts and serves as a learning resource for understanding decoder-only transformer models. The model is built and optimized with PyTorch and PyTorch Lightning.
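
Since the repository pairs PyTorch with Lightning, training could be wrapped in a `LightningModule` along the lines below. This is a hedged sketch under assumptions: the wrapper class, loss, and optimizer settings here are illustrative and may differ from the actual code.

```python
import torch
import torch.nn.functional as F
import lightning as L


class LitDecoder(L.LightningModule):
    """Hypothetical Lightning wrapper for next-token prediction training."""

    def __init__(self, model: torch.nn.Module, lr: float = 1e-3):
        super().__init__()
        self.model = model
        self.lr = lr

    def training_step(self, batch, batch_idx):
        token_ids, target_ids = batch                    # targets = inputs shifted by one position
        logits = self.model(token_ids)                   # (batch, seq, vocab)
        loss = F.cross_entropy(logits.transpose(1, 2), target_ids)
        self.log("train_loss", loss)
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=self.lr)


# Usage sketch (assumes a DataLoader of (input_ids, target_ids) pairs):
# trainer = L.Trainer(max_epochs=10)
# trainer.fit(LitDecoder(DecoderOnlyTransformer(vocab_size=100)), train_dataloader)
```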
Feel free to explore, modify, and contribute!