Intelli GEN

This repository contains code for enhancing performance through Intelligent Generative Training of Transformer models. Transformers lend themselves to large-scale parallel training thanks to the attention mechanism, and they rely on Teacher Forcing during that process. At inference time, however, the model conditions only on its own predictions from previous time steps. This discrepancy between training and inference makes it hard to improve inference performance, yet applying the inference procedure directly during training would forfeit the Transformer's parallelism. A more intelligent approach to generative learning is therefore needed. The detailed training strategies are discussed below.



Fine-Tuning Strategies

1. Standard Fine-Tuning

Standard fine-tuning uses Teacher Forcing, the most fundamental training method for Transformer Seq2Seq models in natural language generation. The decoder receives the ground-truth tokens at every position, with a causal mask preserving left-to-right dependencies, which keeps training stable and fully parallel.
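
A minimal sketch of one such teacher-forcing step in PyTorch is shown below; `model` and its call signature are hypothetical stand-ins, not this repository's actual interface.

```python
import torch
import torch.nn.functional as F

def teacher_forcing_step(model, src, trg, pad_id=0):
    """One teacher-forcing step; `model(src, dec_in, tgt_mask)` is a
    hypothetical Seq2Seq interface returning (batch, seq, vocab) logits."""
    # Shift the target: the decoder sees trg[:, :-1] and predicts
    # trg[:, 1:], so every position is conditioned on gold tokens.
    dec_in, labels = trg[:, :-1], trg[:, 1:]

    # The causal mask blocks attention to future positions, which is
    # what lets the whole sequence be trained in one parallel pass.
    seq_len = dec_in.size(1)
    causal_mask = torch.triu(
        torch.full((seq_len, seq_len), float("-inf")), diagonal=1
    )

    logits = model(src, dec_in, causal_mask)          # (B, T, V)
    return F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),
        labels.reshape(-1),
        ignore_index=pad_id,                          # skip padding
    )
```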


2. Generative Fine-Tuning

Generative fine-tuning drops Teacher Forcing and trains the model auto-regressively on its own previous predictions, mirroring the inference procedure. To keep generation affordable, it reuses cached key/value states instead of reprocessing the growing prefix at every step, as sketched below.
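
The following sketch illustrates the cached generation loop; `model.encode` and `model.decode_step` are hypothetical methods assumed to expose the decoder's key/value cache.

```python
import torch

def generate(model, src, bos_id, max_len=128):
    """Greedy auto-regressive decoding with a key/value cache."""
    memory = model.encode(src)            # encode the source once
    tokens = torch.full((src.size(0), 1), bos_id, dtype=torch.long)
    cache = None                          # holds past keys/values

    for _ in range(max_len - 1):
        # Thanks to the cache, each step feeds only the newest token
        # instead of re-running the decoder over the whole prefix.
        logits, cache = model.decode_step(tokens[:, -1:], memory, cache)
        next_tok = logits[:, -1].argmax(dim=-1, keepdim=True)
        tokens = torch.cat([tokens, next_tok], dim=1)

    return tokens
```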


3. Slow Sequence GAN Fine-Tuning

Slow Sequence GAN fine-tuning adds an adversarial signal in the spirit of SeqGAN: a discriminator learns to distinguish model-generated sequences from references, and the generator is fine-tuned toward outputs the discriminator judges realistic. Generating complete sequences for every update is what makes this strategy slow.
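
Assuming the strategy follows the usual sequence-GAN recipe, a rough sketch of the discriminator update might look as follows; `discriminator` and the sequence tensors are hypothetical.

```python
import torch
import torch.nn.functional as F

def discriminator_step(discriminator, real_seqs, fake_seqs):
    """Train the discriminator to score references as real (1) and
    model-generated sequences as fake (0)."""
    real_logits = discriminator(real_seqs)   # (B,) scores for references
    fake_logits = discriminator(fake_seqs)   # (B,) scores for samples
    real_loss = F.binary_cross_entropy_with_logits(
        real_logits, torch.ones_like(real_logits)
    )
    fake_loss = F.binary_cross_entropy_with_logits(
        fake_logits, torch.zeros_like(fake_logits)
    )
    # The generator would in turn be updated to raise the scores the
    # discriminator assigns to its own samples.
    return real_loss + fake_loss
```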


4. Intelli GEN Fine-Tuning

Intelli GEN fine-tuning is the intelligent generative strategy this repository proposes. It aims to keep the stability and parallelism of Teacher Forcing while closing the gap to inference-time auto-regressive generation.
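
The README does not spell out the algorithm, so the sketch below shows scheduled sampling, a well-known way to blend Teacher Forcing with the model's own predictions, purely as a reference point; `model`, the two-pass scheme, and `sample_prob` are assumptions, not IntelliGEN's confirmed method.

```python
import torch

def mixed_decoder_inputs(model, src, trg, sample_prob=0.25):
    """Two-pass scheduled sampling: a first parallel pass collects the
    model's own predictions, then some gold decoder inputs are swapped
    for them before the actual training pass."""
    dec_in = trg[:, :-1].clone()
    with torch.no_grad():
        # Causal mask omitted for brevity; `model` is hypothetical.
        preds = model(src, dec_in).argmax(dim=-1)   # model's own tokens

    # preds[:, t] is the model's guess for dec_in[:, t + 1], so swap
    # positions 1..T-1 with probability `sample_prob` (never the BOS).
    swap = torch.rand(dec_in.shape) < sample_prob
    swap[:, 0] = False
    dec_in[:, 1:][swap[:, 1:]] = preds[:, :-1][swap[:, 1:]]
    return dec_in   # feed this to the usual cross-entropy step
```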



Experimental Setups

| Data Setup | Model Setup | Training Setup |
|---|---|---|
| Dataset: WMT14 En-De | Architecture: Transformer | Num Epochs: 10 |



Results

| Strategy | BLEU Score | Epoch Time |
|---|---|---|
| Baseline | - | - |
| Standard Fine-Tuning | - | - |
| Generative Fine-Tuning | - | - |
| SlowGAN Fine-Tuning | - | - |
| IntelliGEN Fine-Tuning | - | - |



How to use

Clone the repository to your local environment:

git clone https://github.com/moon23k/GEN_Training.git

Set up the dataset and tokenizer via the setup.py file:

python3 setup.py

Run the actual process via the run.py file:

python3 run.py -mode ['train', 'finetune', 'test', 'inference']
               -strategy ['std', 'gen', 'slow', 'intelli']
               -search ['greedy', 'beam']    (optional)


