How to print one final label per data instance #6

Closed
ghost opened this issue Nov 23, 2018 · 4 comments

Comments

@ghost

ghost commented Nov 23, 2018

Hey,
I am successfully running the command python scripts/fast_dawid_skene.py --dataset toy --mode aggregate --algorithm FDS --print_result on my dataset, but when I check the results (in either the output file or the shell output), I see one final annotation per annotator. I expected one label per document. I looked at the printing code at the end of main.py and at utils.to_csv, but I am unsure what to change. Could you please explain how to get the final labels? Thanks!

@sukrutrao
Owner

Hi,
The output should indeed be a label per document. Could you please check if the first column in your dataset is annotator ID, and the second is document ID, and not the reverse? If it is correct, could you please provide a small example input where the problem arises? Thanks!
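One rough way to check the column order suggested above is to count the distinct values in the first two columns. This is only a sketch: the helper name distinct_counts is hypothetical, the sample data is made up, and it assumes a comma-separated file with a header row. It does not prove which column is which; it only gives numbers to compare against how many annotators and documents you expect.

```python
import csv
import io


def distinct_counts(csv_file, has_header=True):
    """Return the number of distinct values in the first and second columns."""
    reader = csv.reader(csv_file)
    if has_header:
        next(reader, None)  # skip the header row (assumed to be present)
    first, second = set(), set()
    for row in reader:
        if len(row) < 2:
            continue  # ignore blank or malformed lines
        first.add(row[0])
        second.add(row[1])
    return len(first), len(second)


# Made-up sample: 2 annotators (a1, a2) labelling 3 documents (q1..q3).
sample = io.StringIO(
    "annotator_id,question_id,annotation_id\n"
    "a1,q1,0\n"
    "a2,q1,0\n"
    "a1,q2,1\n"
    "a2,q2,1\n"
    "a1,q3,0\n"
    "a2,q3,1\n"
)
n_first, n_second = distinct_counts(sample)
print(f"distinct values: column 1 = {n_first}, column 2 = {n_second}")
```

If the first count matches your number of documents rather than your number of annotators, the two columns are probably swapped.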

@ghost
Author

ghost commented Nov 24, 2018

Yes, that was the problem. Thanks! I was generating the input CSV file for fast_dawid_skene.py with the csv package, but the column order I specified was not preserved. The corrected header is now

annotator_id,question_id,annotation_id

and the output of the algorithm is OK. The answers here helped: https://stackoverflow.com/questions/15653688/preserving-column-order-in-python-pandas-dataframe
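For anyone hitting the same problem, here is a minimal sketch of writing the input file with a fixed column order using the standard csv module. The function name write_annotations and the sample rows are hypothetical, and writing a header row is an assumption based on the header shown above; the point is only that passing fieldnames explicitly pins the order regardless of how the row dictionaries were built.

```python
import csv
import io

# Column order from this thread: annotator first, then document, then label.
COLUMN_ORDER = ["annotator_id", "question_id", "annotation_id"]


def write_annotations(rows, out, column_order=COLUMN_ORDER):
    """Write rows (dicts) to the file-like object `out` in a fixed column order."""
    writer = csv.DictWriter(out, fieldnames=column_order, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


# Made-up rows whose keys are deliberately not in the target order.
rows = [
    {"question_id": 10, "annotation_id": 1, "annotator_id": 0},
    {"question_id": 10, "annotation_id": 0, "annotator_id": 1},
]

buffer = io.StringIO()
write_annotations(rows, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write to disk instead, open the target with open(path, "w", newline="") and pass that file object as out.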

@sukrutrao
Owner

sukrutrao commented Nov 24, 2018

Thanks for the update. To clarify, is there a bug in the CSV conversion, either in the repository code or in the pandas version you used? The link you shared suggests that pandas 0.20.2 should not have this issue. If there is one, please let us know. Thanks!

@sukrutrao
Owner

Closing since the issue was resolved. Please comment/reopen if the problem persists. Thanks.
