Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Showing 11 changed files with 833 additions and 480 deletions.
This file was deleted.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | @@ -1,4 +1,5 @@ |
| | | ## AdaVAE: Exploring Adaptive GPT-2s in VAEs for Language Modeling |
| | | |
| | | - **[Repo In progress]** Official implementation for **AdaVAE**, check the paper on arxiv [https://arxiv.org/pdf/2205.05862.pdf](https://arxiv.org/pdf/2205.05862.pdf). |
| | | + **[Repo In Progress]** Official implementation for **AdaVAE**, check the paper on arxiv [https://arxiv.org/pdf/2205.05862.pdf](https://arxiv.org/pdf/2205.05862.pdf). |
| | | |
| | | *This repo takes some practices from [https://github.com/fangleai/TransformerCVAE](https://github.com/fangleai/TransformerCVAE)*. Many thanks !! |
Binary file not shown.
Binary file not shown.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | @@ -1,4 +1,3 @@ |
| | | - python adaVAE.py --batch-sizes 100 --dataset yelp_data --max_length 32 --pre_enc_iter start --add_attn --beta_0 1 --fb 1 --adapter_size 128 --iterations 15000 --weighted_sample --latent_size 32 --encoder_n_layer 8 --decoder_n_layer 12 --adapter_init bert --attn_mode none --kl_rate 0.50 &&\ |
| | | - python adaVAE.py --batch-sizes 100 --dataset yahoo_data --max_length 32 --pre_enc_iter start --add_attn --beta_0 1 --fb 1 --adapter_size 128 --iterations 22000 --weighted_sample --latent_size 32 --encoder_n_layer 8 --decoder_n_layer 12 --adapter_init bert --attn_mode none --latent_gen linear --kl_rate 0.50 &&\ |
| | | - python adaVAE.py --batch-sizes 100 --dataset yahoo_data --max_length 32 --pre_enc_iter start --add_attn --beta_0 1 --fb 1 --adapter_size 128 --iterations 22000 --weighted_sample --latent_size 32 --encoder_n_layer 8 --decoder_n_layer 12 --adapter_init bert --attn_mode none --latent_gen linear --add_mem --kl_rate 0.50 &&\ |
| | | - python adaVAE.py --batch-sizes 100 --dataset yahoo_data --max_length 32 --pre_enc_iter start --add_mem --beta_0 1 --fb 1 --adapter_size 128 --iterations 22000 --weighted_sample --latent_size 32 --encoder_n_layer 8 --decoder_n_layer 12 --adapter_init bert --attn_mode none --latent_gen linear --kl_rate 0.50 |
| | | + python adaVAE.py --batch-sizes 90 --dataset yelp_data --max_length 32 --pre_enc_iter start --add_attn --add_z2adapters --beta_0 1 --fb 1 --adapter_size 128 --iterations 22000 --weighted_sample --latent_size 32 --encoder_n_layer 8 --decoder_n_layer 12 --adapter_init bert --attn_mode none --kl_rate 0.50 &&\ |
| | | + python adaVAE.py --batch-sizes 90 --dataset yelp_data --max_length 32 --pre_enc_iter start --add_attn --beta_0 1 --fb 1 --adapter_size 128 --iterations 22000 --latent_gen mean_max_linear --weighted_sample --latent_size 32 --encoder_n_layer 8 --decoder_n_layer 12 --adapter_init bert --attn_mode none --kl_rate 0.50 &&\ |
| | | + python adaVAE.py --batch-sizes 90 --dataset yelp_data --max_length 32 --pre_enc_iter start --add_attn --beta_0 1 --fb 1 --adapter_size 128 --iterations 22000 --latent_gen linear --weighted_sample --latent_size 32 --encoder_n_layer 8 --decoder_n_layer 12 --adapter_init bert --attn_mode none --kl_rate 0.50 |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| | | @@ -1,2 +1 @@ |
| | | - python adaVAE.py --batch-sizes 100 --dataset cola --max_length 32 --pre_enc_iter start --add_attn --beta_0 1 --fb 1 --adapter_size 128 --iterations 15000 --weighted_sample --latent_size 768 --encoder_n_layer 8 --decoder_n_layer 12 --adapter_init bert --attn_mode none --kl_rate 0.10 &&\ |
| | | - python adaVAE.py --batch-sizes 100 --dataset sst-2 --max_length 32 --pre_enc_iter start --add_attn --beta_0 1 --fb 1 --adapter_size 128 --iterations 15000 --weighted_sample --latent_size 768 --encoder_n_layer 8 --decoder_n_layer 12 --adapter_init bert --attn_mode none --kl_rate 0.10 |
| | | + python adaVAE.py --batch-sizes 90 --dataset yelp_data --max_length 32 --pre_enc_iter start --add_attn --add_z2adapters --beta_0 1 --fb 1 --adapter_size 128 --iterations 22000 --weighted_sample --latent_size 32 --encoder_n_layer 8 --decoder_n_layer 12 --adapter_init bert --attn_mode none --kl_rate 0.50 |
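
The three `--batch-sizes 90` `yelp_data` commands in the first script hunk share every flag except one: `--add_z2adapters`, `--latent_gen mean_max_linear`, or `--latent_gen linear`. The following is only an illustrative sketch of how such a chain could be generated rather than written by hand; it is not part of the commit. The helper names `build_command` and `build_chain` and the constants `BASE_ARGS` and `VARIANTS` are hypothetical, and the sketch assumes that `adaVAE.py` parses its flags independently of their order, since the generated commands place the varying flag last.

```python
import shlex

# Flags common to the three yelp_data runs, copied from the commands in the diff.
BASE_ARGS = [
    "--batch-sizes", "90", "--dataset", "yelp_data", "--max_length", "32",
    "--pre_enc_iter", "start", "--add_attn", "--beta_0", "1", "--fb", "1",
    "--adapter_size", "128", "--iterations", "22000", "--weighted_sample",
    "--latent_size", "32", "--encoder_n_layer", "8", "--decoder_n_layer", "12",
    "--adapter_init", "bert", "--attn_mode", "none", "--kl_rate", "0.50",
]

# The only flags that differ between the three runs.
VARIANTS = [
    ["--add_z2adapters"],
    ["--latent_gen", "mean_max_linear"],
    ["--latent_gen", "linear"],
]


def build_command(extra_args):
    """Return the argv list for one run: shared flags followed by the variant flags."""
    # Assumption: flag order does not matter to adaVAE.py.
    return ["python", "adaVAE.py", *BASE_ARGS, *extra_args]


def build_chain(variants):
    """Join one command per variant with '&&' so a failed run stops the chain."""
    return " && ".join(shlex.join(build_command(v)) for v in variants)


if __name__ == "__main__":
    # Print the chained command line instead of executing it.
    print(build_chain(VARIANTS))
```

Printing the chain rather than running it keeps the sketch free of side effects; the output can be pasted into a shell or passed to a job scheduler.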