Intelli GEN

This repository contains code for improving generation performance through Intelligent Generative Training of a Transformer model. Transformers are typically trained with Teacher Forcing, which lets the attention mechanism process every target position in parallel. At inference time, however, the model conditions only on its own predictions from previous time steps. This mismatch between training and inference makes it hard to improve inference performance. Training with the same step-by-step logic as inference would remove the mismatch, but it would also give up the parallelism that makes Transformers efficient to train. A more intelligent approach to generative training is therefore needed. The training strategies are described below.

Fine-Tuning Strategy

1. Standard Fine-Tuning

Teacher Forcing is the most fundamental training method for Transformer Seq2Seq models in natural language generation. The decoder receives the ground-truth target tokens as input, and masking keeps each position from seeing the tokens that come after it, which keeps training stable.
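The idea can be illustrated with a minimal sketch in plain Python. It only shows how the decoder input and a causal mask are typically built for Teacher Forcing; the token ids and the helper names `shift_right` and `causal_mask` are assumptions for illustration and are not taken from this repository's code.

```python
def shift_right(target_ids, bos_id):
    """Decoder input for Teacher Forcing: BOS followed by the gold tokens, minus the last one."""
    return [bos_id] + target_ids[:-1]


def causal_mask(length):
    """mask[i][j] is True when position i is allowed to attend to position j (j <= i)."""
    return [[j <= i for j in range(length)] for i in range(length)]


# Toy gold target sequence (hypothetical ids: 1 = BOS, 2 = EOS).
target = [11, 12, 13, 2]
decoder_input = shift_right(target, bos_id=1)
mask = causal_mask(len(decoder_input))

# At position i the decoder sees decoder_input[:i + 1] and is trained to predict target[i].
for i, (seen, gold) in enumerate(zip(decoder_input, target)):
    visible = [tok for tok, ok in zip(decoder_input, mask[i]) if ok]
    print(f"step {i}: visible={visible} -> predict {gold}")
```

Because every position has its gold prefix available up front, all steps can be computed in one parallel pass, which is the advantage the introduction refers to.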

2. Generative Fine-Tuning

Generative Training follows an auto-regressive method without Teacher Forcing: the model generates each token from its own previous predictions. To keep the generation process efficient, it uses caching.
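As a rough sketch of why caching helps, the toy example below generates a sequence from the model's own predictions in two ways: recomputing from the full prefix at every step, and reusing a cached running state. The "decoder" here is a hypothetical arithmetic stand-in, not a Transformer and not this repository's implementation; it only illustrates that both loops yield the same tokens while the cached one processes a single new token per step.

```python
def predict_from_prefix(prefix):
    """Toy stand-in for a decoder that scores the whole prefix from scratch."""
    return sum((pos + 1) * tok for pos, tok in enumerate(prefix)) % 10


class CachedToyDecoder:
    """Same toy prediction, but keeps a running state between steps."""

    def __init__(self):
        self.position = 0
        self.running_sum = 0

    def step(self, token):
        # Only the newest token is processed; earlier work lives in the cached state.
        self.position += 1
        self.running_sum += self.position * token
        return self.running_sum % 10


def generate_uncached(bos_id, max_new_tokens):
    tokens = [bos_id]
    for _ in range(max_new_tokens):
        # The model's own previous outputs form the prefix (no Teacher Forcing).
        tokens.append(predict_from_prefix(tokens))
    return tokens


def generate_cached(bos_id, max_new_tokens):
    decoder = CachedToyDecoder()
    tokens = [bos_id]
    for _ in range(max_new_tokens):
        tokens.append(decoder.step(tokens[-1]))
    return tokens


uncached = generate_uncached(bos_id=1, max_new_tokens=5)
cached = generate_cached(bos_id=1, max_new_tokens=5)
print(uncached == cached)
```

In a real Transformer decoder the cached state would usually be the attention keys and values of earlier positions rather than a single running sum.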

3. Slow Sequence GAN Fine-Tuning

Generative Training follows an auto-regressive method without Teacher Forcing: the model generates each token from its own previous predictions. To keep the generation process efficient, it uses caching.

4. Intelli GEN Fine-Tuning

Generative Training follows an auto-regressive method without Teacher Forcing: the model generates each token from its own previous predictions. To keep the generation process efficient, it uses caching.

Experimental Setups

Data Setup	Model Setup	Training Setup
`Dataset:` WMT14 En-De	`Architecture:` Transformer	`Num Epochs:` 10

Results

Strategy	BLEU Score	Epoch Time
Baseline	-	-
Standard Fine-Tuning	-	-
Generative Fine-Tuning	-	-
SlowGAN Fine-Tuning	-	-
IntelliGEN Fine-Tuning	-	-

How to use

Clone the repository to your local environment

git clone https://github.com/moon23k/GEN_Training.git

Set up the dataset and tokenizer with setup.py

python3 setup.py

Run training, fine-tuning, testing, or inference with run.py

python3 run.py -mode ['train', 'finetune', 'test', 'inference']
               -strategy ['std', 'gen', 'slow', 'intelli']
               -search(Optional) ['greedy', 'beam']
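For readers who want to see how the options above fit together, here is a hypothetical sketch of an argument parser that accepts the same flags and values. It is not the actual contents of run.py. Making `-mode` and `-strategy` required and defaulting `-search` to `greedy` are assumptions made for this illustration.

```python
import argparse


def build_parser():
    # Hypothetical parser mirroring the flags listed in the usage text above.
    parser = argparse.ArgumentParser(description="Sketch of the run.py command-line flags")
    parser.add_argument("-mode", required=True,
                        choices=["train", "finetune", "test", "inference"])
    parser.add_argument("-strategy", required=True,
                        choices=["std", "gen", "slow", "intelli"])
    # Assumption: greedy search is used when -search is omitted.
    parser.add_argument("-search", default="greedy", choices=["greedy", "beam"])
    return parser


# Equivalent to: python3 run.py -mode finetune -strategy gen
args = build_parser().parse_args(["-mode", "finetune", "-strategy", "gen"])
print(args.mode, args.strategy, args.search)
```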

Reference

Attention Is All You Need
