Review of submission 17tizard #18
The data is as described in the attached README.md and contains three tables (POS, labelled_sentences, verbs). The description of the labelled_sentences table is consistent with the actual dataset. However, the POS and verbs tables lack documentation even though they contain more than just an ID column; in particular, their column names are missing from the README.md. In addition, it might be beneficial to give the POS table a more descriptive name. Of course, some of these details might be covered in the paper itself (to which I do not yet have access).
In addition, perhaps an entity relationship diagram would help with the documentation of the tables, since they are relational -- this should include the primary/foreign keys for each table.
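To illustrate the level of key documentation I have in mind, here is a minimal sketch using Python's built-in sqlite3 module. Only the three table names come from the README; every column name and relationship below is a hypothetical placeholder, since the actual columns are not documented.

```python
import sqlite3

# Hypothetical schema: only the table names (POS, labelled_sentences, verbs)
# come from the README. All column names and relationships are assumptions.
SCHEMA = """
CREATE TABLE labelled_sentences (
    sentence_id INTEGER PRIMARY KEY,
    text        TEXT NOT NULL,
    label       TEXT NOT NULL
);
CREATE TABLE verbs (
    verb_id INTEGER PRIMARY KEY,
    lemma   TEXT NOT NULL UNIQUE
);
CREATE TABLE POS (
    pos_id      INTEGER PRIMARY KEY,
    sentence_id INTEGER NOT NULL REFERENCES labelled_sentences(sentence_id),
    verb_id     INTEGER REFERENCES verbs(verb_id),
    tag         TEXT NOT NULL
);
"""


def list_foreign_keys(conn, table):
    """Return (column, referenced_table, referenced_column) for each foreign key."""
    rows = conn.execute(f"PRAGMA foreign_key_list({table})").fetchall()
    # PRAGMA columns: id, seq, table, from, to, on_update, on_delete, match
    return sorted((r[3], r[2], r[4]) for r in rows)


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
for table in ("labelled_sentences", "verbs", "POS"):
    print(table, list_foreign_keys(conn, table))
```

Listing the keys this way (or drawing them in a diagram) would let a reader see at a glance how the tables join.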
Overall, the data appear to be interesting, although they could benefit from being normalized (the industry standard for relational databases), mainly for consistency:
Finally, I do not think the availability aspect of the artifacts is met, as they are only available on THIS GitHub repository (as opposed to an archival service such as Zenodo). If the authors upload the artifacts to a publicly available archival location, then perhaps we can consider upgrading the badge to available.
For now, provided that the documentation is updated to be "very carefully documented" as per my aforementioned comments, I recommend the badge of reusable.
Sorry for the slow reply.
I have updated the README on the original repo with the following additions:
We have made the data available on Zenodo with a link provided in the paper (https://zenodo.org/record/3315707#%23.XSZ8_-gzZPYz).
Feedback classifications: We propose a new set of classifications in our paper, and the data follows these.
Please let me know if anything additional is needed.