My 'tiny-scale' implementation of a decoder (ChatGPT-style)
Description/Docs
I have spent most of the last two summers (2022, 2023) studying artificial intelligence. For the past two months (June/July 2023) I have been studying deep learning, especially the very popular transformer architecture. This project is the inauguration of my studies: a simple, small-scale implementation of the decoder part of the transformer.
It can be trained on .txt files. This is a very primitive approach, but it nonetheless produces visible results. I trained a small 100M-parameter model on old Polish texts, and the output was striking. It wasn't perfectly logical or beautiful, but:
- it was Polish
- it sounded like archaic Polish (which was the goal)
- it proved that my code works
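The repository's exact data pipeline isn't reproduced here, but as a rough illustration of what training on a plain .txt file can look like, here is a character-level setup in PyTorch. All names below are hypothetical sketches, not taken from the repo:

```python
import torch

# Hypothetical sketch: character-level encoding of a .txt corpus.
# None of these names come from the repo; they only illustrate the idea.
with open("corpus.txt", encoding="utf-8") as f:
    text = f.read()

chars = sorted(set(text))                     # vocabulary = unique characters
stoi = {ch: i for i, ch in enumerate(chars)}  # char -> integer id
itos = {i: ch for ch, i in stoi.items()}      # integer id -> char

data = torch.tensor([stoi[c] for c in text], dtype=torch.long)

def get_batch(block_size=256, batch_size=32):
    """Sample random (input, target) windows for next-token prediction."""
    ix = torch.randint(len(data) - block_size - 1, (batch_size,))
    x = torch.stack([data[i : i + block_size] for i in ix])
    y = torch.stack([data[i + 1 : i + block_size + 1] for i in ix])
    return x, y
```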
I used Python and its machine learning library PyTorch, although it's important to note that I didn't use the stock implementation for everything, mainly so I could get more experience and a better understanding.
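To give an idea of what "not using the stock implementation" can mean in practice, here is a minimal, hand-rolled causal self-attention block written directly in PyTorch instead of calling `nn.MultiheadAttention` or `nn.TransformerDecoder`. This is a sketch of the technique, not the repo's actual code:

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalSelfAttention(nn.Module):
    """Hand-rolled masked (causal) self-attention.

    Purely illustrative; not the code from this repository.
    """

    def __init__(self, d_model, n_heads, block_size):
        super().__init__()
        assert d_model % n_heads == 0
        self.n_heads = n_heads
        self.qkv = nn.Linear(d_model, 3 * d_model)   # joint Q, K, V projection
        self.proj = nn.Linear(d_model, d_model)       # output projection
        # Lower-triangular mask: position t may only attend to positions <= t.
        self.register_buffer(
            "mask", torch.tril(torch.ones(block_size, block_size)).bool()
        )

    def forward(self, x):
        B, T, C = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # Reshape to (batch, heads, time, head_dim).
        q = q.view(B, T, self.n_heads, C // self.n_heads).transpose(1, 2)
        k = k.view(B, T, self.n_heads, C // self.n_heads).transpose(1, 2)
        v = v.view(B, T, self.n_heads, C // self.n_heads).transpose(1, 2)
        # Scaled dot-product attention with the causal mask applied.
        att = (q @ k.transpose(-2, -1)) / math.sqrt(k.size(-1))
        att = att.masked_fill(~self.mask[:T, :T], float("-inf"))
        att = F.softmax(att, dim=-1)
        out = (att @ v).transpose(1, 2).contiguous().view(B, T, C)
        return self.proj(out)
```

Stacking blocks like this with feed-forward layers, layer norms, and residual connections is the standard way to build up a full decoder.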
To tinker with my code, first clone the repository:

```sh
git clone https://github.com/CptNemo0/MyFirstTransformer
```

Then install the requirements from the txt file:

```sh
pip install -r requirements.txt
```
You're good to go!!
Roadmap (work in progress, TBA):
- Barebones logic
- Working small-scale models
- Designed GUI
- Large Language Model
- Research more pretraining techniques
- Research finetuning (chain-of-thought, instruction finetuning, meta-learning (few-shot))
- Working GUI
- Training GUI
- Inference GUI
- Documentation
- Usage section of README
Paweł Stus - pawel.j.stus@gmail.com
Project Link: https://github.com/CptNemo0/MyFirstTransformer
- Attention Is All You Need
- MIT Transformer Course - especially scientific papers listed in the "Recommended Reading" sections
- MIT Intro to Deep Learning course
- Andrej Karpathy aka the "Big Boss"
- OpenAI
Distributed under the MIT License. See LICENSE.txt for more information.