generated from fastai/nbdev_template
Fix QA task preprocessing #19
Comments
Yeah, if you want to make a PR, go for it.
The project is built on nbdev, so the process for developing and
submitting PRs is the same as for libraries like fastai. See
https://docs.fast.ai/dev-setup.
In particular, make sure you run `nbdev_install_git_hooks` right after you
git clone the library. If you want to add some tests that would be great
too. Check out the nbdev docs for how to do that and work on any project
based on it: https://nbdev.fast.ai/.
Thanks and lmk if you have any questions.
…-wg
On Sun, Nov 8, 2020 at 12:59 AM RKorzeniowski ***@***.***> wrote:
Hi,
very cool lib. Just wanted to say that the pre_process_squad
<https://github.com/ohmeow/blurr/blob/master/blurr/data/question_answering.py>
function does not work correctly when following the docs
<https://ohmeow.github.io/blurr/modeling-question-answering/>. There are
two problems when huggingface datasets (the updated nlp package) is used like
this: nlp.load_dataset('squad_v2')
<https://huggingface.co/docs/datasets/package_reference/loading_methods.html>.
- The column names differ, to be exact "answers" and "answer_text".
- Answers are given in dict(list(str)) format, but the tokenization that
sets the start and end token targets works as if the format were dict(str). This ends up
setting all targets to (0,0).
I had to fix that for my use case, so if you want I can make a PR with the fixes.
Let me know if there is anything I should do beforehand, like running tests.
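For readers hitting the same mismatch, here is a minimal sketch of one way to flatten the nested answers before preprocessing. It assumes rows shaped like the Hugging Face `squad_v2` records, where `answers` is a dict holding the lists `text` and `answer_start`. The helper name `flatten_squad_answers` and the output column names `answer_text` and `answer_start` are illustrative assumptions, not the fix that was actually merged into blurr.

```python
from typing import Any, Dict


def flatten_squad_answers(row: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: copy the first answer of a SQuAD-style row
    into flat, scalar columns."""
    answers = row.get("answers", {})
    texts = answers.get("text", [])
    starts = answers.get("answer_start", [])

    flat = dict(row)  # shallow copy so the input row is left untouched
    if texts and starts:
        # Keep only the first annotated answer.
        flat["answer_text"] = texts[0]
        flat["answer_start"] = int(starts[0])
    else:
        # squad_v2 includes unanswerable questions with empty lists;
        # the empty string and -1 are placeholder values chosen for this sketch.
        flat["answer_text"] = ""
        flat["answer_start"] = -1
    return flat


# Hand-written rows in the assumed layout (not real dataset records).
answerable = {
    "context": "Paris is the capital of France.",
    "question": "What is the capital of France?",
    "answers": {"text": ["Paris"], "answer_start": [0]},
}
unanswerable = {
    "context": "Paris is the capital of France.",
    "question": "What is the capital of Spain?",
    "answers": {"text": [], "answer_start": []},
}

flat_answerable = flatten_squad_answers(answerable)
flat_unanswerable = flatten_squad_answers(unanswerable)
```

With the answer stored as a plain string and an integer character offset, code that expects dict(str)-style input can locate the answer span instead of defaulting every target to (0,0).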
I think this is fixed now so I'm closing it out. If you're still seeing issues, feel free to reopen.
Closed
Hi,
very cool lib. Just wanted to say that the pre_process_squad function is not working correctly when following the docs. There are two problems when the nlp package is used like this: nlp.load_dataset('squad_v2').
I had to fix that for my use case, so if you want I can make a PR with fixes. Let me know if there are things that I should do before, like running tests.