Smaller text fragments for question annotations #105

Open
nyxpho opened this issue Jun 11, 2021 · 5 comments

nyxpho commented Jun 11, 2021

Problem description
Currently, when we create questions to enlarge our dataset, the annotation platform presents us with a very long text. For the first round of annotations, we agreed to look only at roughly the first 5 sentences and to create a question whose answer lies within that passage. It would be better, however, if the text came already split into fragments, perhaps with some overlap between consecutive fragments so that context is preserved. Next week a very junior researcher will join our team, and he could help us with some additional questions.

Solution description
I can do this; just let me know which file you are currently using as input for the annotation platform.
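
To make the proposal concrete, here is a minimal sketch of the splitting step, assuming fragments of about 5 sentences with a 1-sentence overlap. The function names, the default `size` and `overlap` values, and the regex-based sentence splitter are illustrative assumptions and not part of the existing code base; a proper sentence tokenizer would likely work better on real texts, where abbreviations and decimal numbers would trip up this naive pattern.

```python
import re


def split_sentences(text):
    """Naive splitter (assumption): break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def overlapping_fragments(text, size=5, overlap=1):
    """Return fragments of up to `size` sentences; consecutive fragments share `overlap` sentences."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("size must be positive and 0 <= overlap < size")
    sentences = split_sentences(text)
    step = size - overlap  # how far the window advances each time
    fragments = []
    for start in range(0, len(sentences), step):
        fragments.append(" ".join(sentences[start:start + size]))
        if start + size >= len(sentences):
            break  # the last sentence is already covered
    return fragments
```

Each fragment could then be loaded into the annotation platform as its own context, so that annotators only see a short passage at a time.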

@guillim @Rob192 @psorianom @AbdenourCh

guillim added the enhancement label (New feature or request) on Jun 14, 2021
Author

nyxpho commented Jul 29, 2021

Update on this: an intern on our team annotated 9 more texts, with 5 questions each. Because of issues with the annotation platform, I could not proceed with splitting the text into smaller parts. The new SQuAD file, containing the old questions plus the new ones, is attached below. The documents with new questions are: 953, 1109, 1268, 1606, 1881, 1974, 2062, 2383, 2833. Does the file look OK to you? Should I open a pull request to add it?
squad++.zip

Contributor

guillim commented Jul 30, 2021

Are we talking about the dataset from service-public.fr?

Author

nyxpho commented Jul 30, 2021

Yes

Contributor

guillim commented Jul 30, 2021

Does your intern plan to make more annotations in August? (I am thinking we could open a single PR once everything is finished.)

Author

nyxpho commented Aug 2, 2021

No, he will not make more annotations. Sure, we can make just one PR.
