This repository has been archived by the owner on Jan 29, 2022. It is now read-only.
Create crowdsourcing project in PyBossa for deduplicating trials #398
vitorbaptista added a commit to opentrials/processors that referenced this issue on Sep 29, 2016
vitorbaptista changed the title from "Create crowdsourcing task for deduplicating trials" to "Create crowdsourcing project in PyBossa for deduplicating trials" on Sep 30, 2016
vitorbaptista added a commit to opentrials/processors that referenced this issue on Sep 30, 2016
We're comparing trials two by two. To understand how we create the tasks, consider a database with trials A, B, C and D. The tasks created will be (A, B), (B, C) and (C, D). This won't test all possible pairs, because they number in the millions with our current database. This is just an initial pass. With this logic, we'll create NUMBER_OF_TRIALS - 1 tasks. There's a challenge in uploading this to CrowdCrafting, as it only allows 300 requests per 15 minutes. At that rate, it'll take more than 10 days to add the ~330k tasks. Not to mention that 330k tasks is already a lot. We'll need to filter out more trials to make it feasible, especially considering that deciding whether two trials are the same isn't a trivial task. opentrials/opentrials#398
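The pairing scheme and the upload estimate from the commit message can be sketched roughly as follows. This is only an illustration, not the code from opentrials/processors: the function names `adjacent_pairs` and `upload_days` are hypothetical, and the estimate assumes one API request per task.

```python
import math


def adjacent_pairs(trials):
    """Pair each trial with the next one in the list.

    For [A, B, C, D] this yields (A, B), (B, C), (C, D),
    i.e. len(trials) - 1 tasks.
    """
    return list(zip(trials, trials[1:]))


def upload_days(num_tasks, requests_per_window=300, window_minutes=15):
    """Estimate the days needed to upload all tasks under a rate limit.

    Assumption: every task costs exactly one request.
    """
    windows = math.ceil(num_tasks / requests_per_window)
    return windows * window_minutes / (60 * 24)


tasks = adjacent_pairs(["A", "B", "C", "D"])
days = upload_days(330_000)
```

With four trials this produces three tasks, and for roughly 330k tasks at 300 requests per 15 minutes the estimate comes out above 10 days, in line with the figure quoted in the commit message.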
@vitorbaptista WONTFIX or doing?
@pwalsh This is already done in opentrials/processors#64, but it wasn't merged because there are too many tasks for crowdsourcing. We would need a way to filter the tasks, which we don't have yet. I'll close this and the related PR, but we should revisit it when we have a way to filter for potentially wrongly deduplicated trials.
This issue will be done to fix #75 and #76
The task should show the user links to two trials, asking her if they are the same.