Fix QA task preprocessing #19

Closed
RKorzeniowski opened this issue Nov 8, 2020 · 2 comments

Comments


RKorzeniowski commented Nov 8, 2020

Hi,
very cool lib. Just wanted to say that the pre_process_squad function does not work correctly when following the docs. There are two problems when the nlp package is used like this: nlp.load_dataset('squad_v2').

  • Column names differ, specifically "anwsers" and "anwser_text".
  • Answers are given in dict(list(str)) format, but the tokenization that sets the start and end token targets works as if the format were dict(str). This ends up setting all targets to (0,0).
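
To illustrate the second point, here is a minimal sketch of flattening one row before preprocessing. It assumes each row stores its answers as a dict of parallel lists under the keys "text" and "answer_start", which is how I understand the squad_v2 schema; the helper name flatten_answers and the output keys are hypothetical and are not part of the library.

```python
def flatten_answers(row):
    """Return a copy of `row` with the first answer pulled out as scalars.

    Assumes row["answers"] looks like {"text": [...], "answer_start": [...]}.
    The output keys below are hypothetical names chosen for this sketch.
    """
    answers = row.get("answers") or {}
    texts = answers.get("text") or []
    starts = answers.get("answer_start") or []

    # Copy everything except the nested answers dict.
    flat = {key: value for key, value in row.items() if key != "answers"}

    if texts and starts:
        # Answerable question: keep the first answer as plain str/int values.
        flat["answer_text"] = texts[0]
        flat["answer_start"] = int(starts[0])
        flat["is_impossible"] = False
    else:
        # Unanswerable question: the answer lists are empty.
        flat["answer_text"] = ""
        flat["answer_start"] = -1
        flat["is_impossible"] = True
    return flat


example = {
    "id": "q1",
    "question": "Where is it?",
    "context": "It is in Paris.",
    "answers": {"text": ["Paris"], "answer_start": [9]},
}
flat_example = flatten_answers(example)
```

With scalar values, a character span can be computed as answer_start to answer_start + len(answer_text) and then mapped to token positions, instead of every target collapsing to (0,0).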

I had to fix that for my use case, so if you want I can make a PR with the fixes. Let me know if there is anything I should do beforehand, like running tests.

RKorzeniowski changed the title from "Fix QA" to "Fix QA task preprocessing" on Nov 8, 2020
Owner

ohmeow commented Nov 8, 2020 via email

Owner

ohmeow commented Dec 26, 2020

I think this is fixed now so I'm closing it out. If you're still seeing issues, feel free to reopen.
