Could you provide some parts of the data? #1

Closed
GuangChen2016 opened this issue Dec 28, 2016 · 5 comments

Comments

@GuangChen2016

@marekrei Hello, marekrei. This is very nice work, and I would like to reproduce it, but I am having trouble preparing data in the required format. Could you please provide a sample of the data? Thank you very much.

@marekrei
Owner

Hi @GuangChen2016
I'm in the process of releasing the error detection dataset. It should be online by next week.
Until then, the input files should look like this:

This        c
is          c
an          i
sentence    c

In this case the task is error detection, and there are two possible labels: c (correct) and i (incorrect). You can replace these with POS tags, NER tags, etc. Make sure to leave an empty line between sentences.
I hope this helps.
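
For anyone preparing their own files, a minimal Python sketch of reading and writing this layout might look like the following. It is not taken from the repository: the names `read_sentences` and `format_sentences` are hypothetical, it assumes columns are separated by whitespace with the token first and the label last, and the second sample sentence is made up for illustration.

```python
def read_sentences(lines):
    """Group (token, label) pairs into sentences, splitting on blank lines.

    Assumption: columns are whitespace-separated, token first, label last.
    """
    sentences, current = [], []
    for line in lines:
        parts = line.split()
        if not parts:
            # A blank line closes the current sentence, if there is one.
            if current:
                sentences.append(current)
                current = []
            continue
        if len(parts) < 2:
            raise ValueError(f"expected a token and a label, got: {line!r}")
        current.append((parts[0], parts[-1]))
    if current:
        # The last sentence may not be followed by a blank line.
        sentences.append(current)
    return sentences


def format_sentences(sentences):
    """Write sentences back out, one token per line, blank line between sentences.

    Assumption: a single tab between the columns is acceptable.
    """
    blocks = ["\n".join(f"{tok}\t{lab}" for tok, lab in sent) for sent in sentences]
    return "\n\n".join(blocks) + "\n"


# First sentence is the example from this thread; the second is invented.
sample = """This        c
is          c
an          i
sentence    c

Another     c
one         c
"""

parsed = read_sentences(sample.splitlines())
for sentence in parsed:
    print(sentence)
```

Taking the label from the last column rather than the second is a deliberate choice in this sketch, so that a file with extra columns in the middle would still parse. Whether that matches what the training script expects is something to check against the repository.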

@GuangChen2016
Author

@marekrei Thank you.

@liuyichaosoftware

Hi marekrei, I would like to reproduce your work, but I am having trouble getting the FCE data. Could you please help me with the data? Thank you very much.

@marekrei
Owner

The error detection data is now available here:
http://www.ilexir.co.uk/datasets/index.html

You might also be interested in the blog post I wrote about this topic:
http://www.marekrei.com/blog/attending-to-characters-in-neural-sequence-labeling-models/

@liuyichaosoftware

thank you very much~
