Is mT0 suitable for continued training on span corruption task? #2

Closed
junwang-wish opened this issue Nov 4, 2022 · 2 comments
@junwang-wish

Is mT0 suitable / recommended for continued training on a mixture of denoising tasks (span corruption, extreme span corruption, prefix LM) similar to UL2? For example:

# span_corruption
{
"text_input": "The <extra_id_0> walks in <extra_id_1> park", 
"text_output": "<extra_id_0> cute dog <extra_id_1> the <extra_id_2>"
}

# extreme_span_corruption
{
"text_input": "The <extra_id_0> park", 
"text_output": "<extra_id_0> cute dog walks in the <extra_id_1>"
}

# prefix LM
{
"text_input": "The cute <extra_id_0>", 
"text_output": "<extra_id_0> dog walks in the park"
}

My domain text is quite different from internet text, so I assume a span corruption task would help mT0 learn the special syntax and semantics of my domain.
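
For concreteness, here is a minimal Python sketch of how records like the ones above can be generated. This is my own illustration, not the official T5/seqio preprocessing; the noise_density and mean_span_length defaults follow the T5 convention, and everything else (the function name, the greedy span placement) is illustrative.

import random

def span_corrupt(tokens, noise_density=0.15, mean_span_length=3.0, rng=None):
    """Return (text_input, text_output) strings in the sentinel format above."""
    rng = rng or random.Random()
    n = len(tokens)
    num_noise_tokens = max(1, round(n * noise_density))
    num_spans = max(1, round(num_noise_tokens / mean_span_length))

    # Pick random start positions, then greedily keep non-overlapping spans.
    starts = sorted(rng.sample(range(n), min(num_spans, n)))
    spans, prev_end = [], 0
    for start in starts:
        if start < prev_end:
            continue
        length = max(1, round(rng.gauss(mean_span_length, 1.0)))
        end = min(n, start + length)
        spans.append((start, end))
        prev_end = end

    # Replace each masked span with a sentinel in the input; the target lists
    # each sentinel followed by the tokens it replaced.
    inp, out, cursor = [], [], 0
    for i, (start, end) in enumerate(spans):
        sentinel = f"<extra_id_{i}>"
        inp.extend(tokens[cursor:start])
        inp.append(sentinel)
        out.append(sentinel)
        out.extend(tokens[start:end])
        cursor = end
    inp.extend(tokens[cursor:])
    out.append(f"<extra_id_{len(spans)}>")  # closing sentinel, as in the first example
    return " ".join(inp), " ".join(out)

# span_corrupt("The cute dog walks in the park".split()) might yield:
#   ("The <extra_id_0> walks in the park", "<extra_id_0> cute dog <extra_id_1>")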

@sbmaruf

sbmaruf commented Nov 8, 2022

I think BLOOM might be a good candidate for that. After UL2 training you might want to try instruction tuning in the style of BLOOMZ, FLAN, or T0. A good workaround could be to (i) include instruction-tuning samples (xP3mt, P3, etc.) in the "prefix LM" objective, and (ii) include other objectives such as span corruption, then continue UL2 training.
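
A rough sketch of how (i) and (ii) could be combined, reusing span_corrupt from the sketch above; the prefix_lm helper and the mixing rates are illustrative assumptions, not values from the UL2 paper or this repo.

import random

def prefix_lm(tokens, rng):
    # Prefix-LM objective: a random prefix as input, the continuation as target.
    cut = rng.randint(1, max(1, len(tokens) - 1))
    return (" ".join(tokens[:cut]) + " <extra_id_0>",
            "<extra_id_0> " + " ".join(tokens[cut:]))

# Objectives with sampling rates (rates here are illustrative, not tuned).
MIXTURE = [
    (lambda t, r: span_corrupt(t, noise_density=0.15, mean_span_length=3.0, rng=r), 0.4),
    (lambda t, r: span_corrupt(t, noise_density=0.5, mean_span_length=8.0, rng=r), 0.3),  # "extreme"
    (prefix_lm, 0.3),
]

def make_example(tokens, rng=None):
    rng = rng or random.Random()
    fns, rates = zip(*MIXTURE)
    objective = rng.choices(fns, weights=rates, k=1)[0]
    text_input, text_output = objective(tokens, rng)
    return {"text_input": text_input, "text_output": text_output}

# Instruction-tuning records (xP3mt, P3) already come as (input, target)
# pairs, so they can be emitted unchanged alongside the prefix-LM stream.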

@junwang-wish
Author

Thanks!
