Fine-tuning Hugging Face models

Fine-tuning on parallel data

First, we fine-tune the model on the limited parallel data we have. In this case we train two separate models: one that translates from negative to positive and one that translates from positive to negative.
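As a rough illustration of how one parallel corpus can feed both models, the sketch below builds two source/target training sets by swapping the direction of each pair. The toy sentences, the `build_direction_sets` helper, and the `source`/`target` field names are assumptions made for this example and are not taken from the repository.

```python
from typing import Dict, List, Tuple

# Hypothetical toy corpus: (negative sentence, positive sentence) pairs.
PARALLEL_PAIRS: List[Tuple[str, str]] = [
    ("the food was terrible", "the food was delicious"),
    ("the staff were rude", "the staff were friendly"),
]


def build_direction_sets(
    pairs: List[Tuple[str, str]],
) -> Dict[str, List[Dict[str, str]]]:
    """Turn one parallel corpus into two source/target training sets."""
    # Negative sentence is the input, positive sentence is the label.
    neg_to_pos = [{"source": neg, "target": pos} for neg, pos in pairs]
    # Same pairs with the direction swapped.
    pos_to_neg = [{"source": pos, "target": neg} for neg, pos in pairs]
    return {"neg_to_pos": neg_to_pos, "pos_to_neg": pos_to_neg}


datasets = build_direction_sets(PARALLEL_PAIRS)
```

Each of the two lists could then be tokenized and passed to a separate sequence-to-sequence fine-tuning run, one per direction.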

Negative to positive
Positive to negative

Using reinforcement learning to improve the model further

Fine-tuning already improves the model considerably, but reinforcement learning can improve it further. We use the fine-tuned model to generate translations for the test set and compute the reward from the BLEU score. The reward function also checks whether the output has the correct sentiment; if it does not, the reward is negative. The reward is then used to train the model further.
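The reward described above could be sketched roughly as follows. This is a minimal illustration and not the repository's implementation: `simple_bleu` is a simplified, smoothed sentence-level stand-in for a full BLEU implementation, `toy_classifier` is a hypothetical keyword-based stand-in for a real sentiment model, and the penalty value of `-1.0` is an assumption.

```python
import math
from collections import Counter
from typing import Callable, List


def ngram_counts(tokens: List[str], n: int) -> Counter:
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def simple_bleu(candidate: str, reference: str, max_n: int = 2) -> float:
    """Simplified BLEU: clipped n-gram precision, add-one smoothing, brevity penalty."""
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precision = 0.0
    for n in range(1, max_n + 1):
        cand_ngrams = ngram_counts(cand, n)
        ref_ngrams = ngram_counts(ref, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref_ngrams[g]) for g, c in cand_ngrams.items())
        total = sum(cand_ngrams.values())
        # Add-one smoothing avoids log(0) when there is no overlap.
        log_precision += math.log((overlap + 1) / (total + 1)) / max_n
    # Penalize candidates that are shorter than the reference.
    brevity = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return brevity * math.exp(log_precision)


def toy_classifier(text: str) -> str:
    """Hypothetical stand-in for a sentiment model: keyword lookup only."""
    positive_words = {"delicious", "friendly", "great"}
    return "positive" if any(w in positive_words for w in text.split()) else "negative"


def reward(
    candidate: str,
    reference: str,
    target_sentiment: str,
    classify: Callable[[str], str],
    penalty: float = -1.0,  # assumed value for wrong-sentiment outputs
) -> float:
    """Return a negative reward for wrong sentiment, otherwise the BLEU-like score."""
    if classify(candidate) != target_sentiment:
        return penalty
    return simple_bleu(candidate, reference)


good = reward("the food was great", "the food was delicious", "positive", toy_classifier)
bad = reward("the food was terrible", "the food was delicious", "positive", toy_classifier)
```

In this sketch `good` is a score between 0 and 1 and `bad` is the penalty. Checking sentiment before scoring overlap keeps an output that merely copies the input from earning a high reward.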

dominuszagare/RL_style_transfer

Folders and files

Latest commit

History

Repository files navigation

Fine tunig hugingface models

Fine tuning on parallel data

Using reinforcement learning to improve the model further

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages