-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Avoid duplicate records in poll answers #5539
Open
javierm
wants to merge
7
commits into
poll_duplicate_voters
Choose a base branch
from
add_option_id_to_poll_answers
base: poll_duplicate_voters
Could not load branches
Branch not found: {{ refName }}
Could not load tags
Nothing to show
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
javierm
added
Polls
security
Pull requests that address a security vulnerability
labels
May 14, 2024
javierm
force-pushed
the
add_option_id_to_poll_answers
branch
3 times, most recently
from
May 14, 2024 04:16
295ddcf
to
08bd5ee
Compare
javierm
force-pushed
the
poll_duplicate_voters
branch
from
May 14, 2024 17:04
1756667
to
bf87bb7
Compare
javierm
force-pushed
the
add_option_id_to_poll_answers
branch
from
May 14, 2024 17:08
08bd5ee
to
f2f9e6e
Compare
javierm
force-pushed
the
poll_duplicate_voters
branch
2 times, most recently
from
May 14, 2024 17:39
386b39d
to
845f031
Compare
javierm
force-pushed
the
add_option_id_to_poll_answers
branch
3 times, most recently
from
May 14, 2024 18:08
c6561b3
to
0e15140
Compare
javierm
force-pushed
the
poll_duplicate_voters
branch
from
May 14, 2024 18:10
845f031
to
a8583b7
Compare
javierm
force-pushed
the
add_option_id_to_poll_answers
branch
from
May 14, 2024 18:34
0e15140
to
3686f17
Compare
javierm
force-pushed
the
poll_duplicate_voters
branch
2 times, most recently
from
May 15, 2024 16:33
9a2a393
to
ffdb562
Compare
javierm
force-pushed
the
add_option_id_to_poll_answers
branch
2 times, most recently
from
May 15, 2024 17:22
60014d9
to
9a004e6
Compare
javierm
force-pushed
the
poll_duplicate_voters
branch
from
May 15, 2024 18:35
ffdb562
to
d7efc4b
Compare
javierm
force-pushed
the
add_option_id_to_poll_answers
branch
from
May 15, 2024 18:36
9a004e6
to
6424024
Compare
javierm
force-pushed
the
add_option_id_to_poll_answers
branch
from
May 15, 2024 18:47
6424024
to
dab5fef
Compare
javierm
force-pushed
the
add_option_id_to_poll_answers
branch
from
May 15, 2024 22:58
34a4647
to
bb61b36
Compare
javierm
force-pushed
the
add_option_id_to_poll_answers
branch
2 times, most recently
from
May 15, 2024 23:26
4243798
to
3c2920c
Compare
javierm
force-pushed
the
add_option_id_to_poll_answers
branch
3 times, most recently
from
May 16, 2024 15:45
03d7b49
to
111cabd
Compare
javierm
force-pushed
the
add_option_id_to_poll_answers
branch
from
May 16, 2024 16:17
111cabd
to
c9afa38
Compare
javierm
force-pushed
the
add_option_id_to_poll_answers
branch
2 times, most recently
from
May 16, 2024 20:52
5860e9b
to
98aec84
Compare
javierm
force-pushed
the
add_option_id_to_poll_answers
branch
from
May 17, 2024 20:51
98aec84
to
df1b744
Compare
The routes for poll questions were accidentally deleted in commit 5bb831e when deleting the `:show` action, and restored in commit 9871503. However, the deleted code was: ``` resources :questions, only: [:show], controller: 'polls/questions' (...) ``` While the restored code was: ``` resources :questions, controller: 'polls/questions' (...) ``` Meaning we forgot to add the `only: []` option when restoring the routes. We also forgot to remove the `before_action` code when deleting the `:show` action, so we're removing it now.
It was confusing to have the action to create an answer in `QuestionsController#answer` while the action to destroy it was `AnswersController#destroy`.
Until now, we've stored the text of the answer somebody replied to. The idea was to handle the scenarios where the user voters for an option but then that option is deleted and restored, or the texts of the options are accidentally edited and so the option "Yes" is now "Now" and vice versa. However, since commit 3a6e99c, options can no longer be edited once the poll starts, so there's no risk of the option changing once somebody has voted. This means we can now store the ID of the option that has been voted. That'll also help us deal with a bug introduced int 673ec07, since answers in different locales are not counted as the same answer. Note we aren't dealing with this bug right now. We're still keeping (and storing) the answer as well. There are two reasons for that. First, we might add an "open answer" type of questions in the future and use this column for it. Second, we've still got logic depending on the answer, and we need to be careful when changing it because there are existing installations where the answer is present but the option_id is not. Note that we're using `dependent: nullify`. The reasoning is that, since we're storing both the option_id and the answer text, we can still use the answer text when removing the option. In practice, this won't matter much, though, since we've got a validation rule that makes it impossible to destroy options once the poll has started. Also note we're still allowing duplicate records when the option is nil. We need to do that until we've removed every duplicate record in the database.
Note: to avoid confusion, "answer" will mean a row in the poll_answers table and "choice" will mean whatever is in the "answer" column of that table (I'm applying the same convention in the code of the task). In order make this task perform reasonably on installations with millions of votes, we're using `update_all` to update all the answers with the same choice at once. In order to do that, we first need to check the existing choices and what are the possible option_ids for those choices. Note that, in order for this task to work, we need to remote the duplicate answers first. Otherwise, we will run into a RecordNotUnique exception when trying to add the same option_id to two duplicate answers. So we're making this task depend on the one that removes duplicate answers. That means we no longer need to specify the task to remove duplicate answers in the release tasks; it will automatically be executed when running the task to add an option_id.
javierm
force-pushed
the
poll_duplicate_voters
branch
2 times, most recently
from
May 17, 2024 21:18
63b25c4
to
be02718
Compare
javierm
force-pushed
the
add_option_id_to_poll_answers
branch
from
May 17, 2024 21:20
df1b744
to
091b247
Compare
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
References
Objectives
Notes
Since the unique index is added on a new column, the migration will run even if there are existing duplicate answers in the database at the time of running it. We're also adding a task to remove existing duplicate answers.
Even if all duplicate answers were removed here, there's still one scenario we don't handle in this pull request (and not in #5532 either): until now, due to race conditions, it was possible to select multiple different options in a single-choice question. We aren't providing a task to delete these records, although we might do so in the future. Note this only applies to single-choice questions; mutiple-choice questions shouldn't have this issue because we added a pessimistic lock when we added multiple-choice questions.
Release notes
option_id
field to thepoll_answers
table; a task that populates this field is part of the release tasks. If you've added custom votation types to your code, you might need to modify this task. Also, this task doesn't handle an edge case where the options for a question were renamed after the poll started (which isn't possible since pull request #4904) and an extremely rare case that only applies to multi-language polls and that we're not even sure is possible in practice. For either of these cases, you'll get a warning when running the task, so you can choose what to do about it (you should handle these cases, since future version will use theoption_id
column to calculate results and stats). The task doesn't change records which already have anoption_id
, so it can be run as many times as you'd like to. Also not that, if you've got custom code to save records in thepoll_answers
table, you'll have to make sure that, from now on, that code populates both theanswer
column and theoption_id
column (unless you've implemented open answers in your custom code; in that case, you can skip theoption_id
when dealing with open answers).