Show all combinations prompt selection algorithm #43

Open. lukebaker wants to merge 2 commits into master.

lukebaker
Member

@lukebaker commented Jun 29, 2018

@msalganik we had a client ask for this customization to All Our Ideas and also asked that we see if it makes sense to merge into the main repository.

This adds a new prompt selection algorithm called "all-combos". The goal of this selection algorithm is to show a particular visitor all possible combinations before showing any duplicates. This algorithm is concerned with combinations not permutations, so it considers a comparison between choice A and B equivalent to a comparison between choice B and A.

After a visitor has seen all combinations, the algorithm essentially starts over with a new "round". The visitor will see duplicates from the previous round, but the new round will complete with no duplicates within that round. There's randomness built into the algorithm so that each round is not identical and each visitor will see the prompts in a different order.

The mechanics of the algorithm are as follows:

  1. It counts how many times each choice has been seen by this visitor.
  2. It randomly selects one of the choices that has been seen least often.
  3. It counts how many times each choice has been paired with the choice selected in step 2.
  4. It randomly selects one of the choices that has been least often paired with the choice selected in step 2.
  5. The choices selected in step 2 and step 4 make up the prompt to show the visitor.
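
A minimal Python sketch of these five steps, for illustration only. It is not the code in this pull request; the function name `pick_prompt`, the list-of-tuples `history`, and the `rng` parameter are all hypothetical, and it assumes at least two choices.

```python
import random
from collections import Counter


def pick_prompt(choices, history, rng=random):
    """Pick the next pair of choices for a single visitor.

    choices: list of choice ids (hypothetical representation).
    history: list of (left, right) pairs this visitor has already seen.
    rng:     the random module or a random.Random instance.
    """
    # Step 1: count how often this visitor has seen each choice.
    seen = Counter({c: 0 for c in choices})
    for left, right in history:
        seen[left] += 1
        seen[right] += 1

    # Step 2: randomly pick one of the least-seen choices.
    fewest_seen = min(seen[c] for c in choices)
    first = rng.choice([c for c in choices if seen[c] == fewest_seen])

    # Step 3: count how often every other choice was paired with `first`,
    # ignoring left/right order (combinations, not permutations).
    others = [c for c in choices if c != first]
    paired = Counter({c: 0 for c in others})
    for left, right in history:
        if left == first:
            paired[right] += 1
        elif right == first:
            paired[left] += 1

    # Step 4: randomly pick one of the choices least often paired with `first`.
    fewest_paired = min(paired[c] for c in others)
    second = rng.choice([c for c in others if paired[c] == fewest_paired])

    # Step 5: the two picks make up the prompt.
    return first, second
```

Calling `pick_prompt` repeatedly and appending each result to `history` walks one visitor through a round. Both counting passes look only at that visitor's own history.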

This algorithm is only used when the API caller (All Our Ideas) explicitly requests this algorithm by name: all-combos. Related All Our Ideas pull request: allourideas/allourideas.org#60

Does this look like the sort of thing that you'd like merged into the main repository?

@msalganik
Member

Cool. Thanks @lukebaker. This sounds great, and I'd be interested in having it merged in, but I have a few questions first.

  1. When you first described the algorithm in the text, I had thought that it would generate all pairs, randomly sort them, and then draw them in order. This seems like the easiest way to implement sampling without replacement. But then the steps below seem more complicated. Is it fair to say that this is like the catch-up algorithm with sampling without replacement?
  2. Do you think that using this on the server will cause load problems? What tests have been run? I remember that with our current algorithm we have to do caching and stuff like that.
  3. What tests are in place to make sure that it is working correctly?
  4. Would you or the client be willing to write a blog post about it? That's how we document new stuff and then we can link to that blog post from the admin page.

Thanks.

@lukebaker
Member Author

> When you first described the algorithm in the text, I had thought that it would generate all pairs, randomly sort them, and then draw them in order. This seems like the easiest way to implement sampling without replacement.

I was hesitant to go down the route of generating all pairs because the number of pairs grows quadratically as the number of choices increases. I recalled the catchup algorithm having trouble with surveys with lots of choices. IIRC, our largest question has somewhere in the range of 18,000 choices. I think this algorithm scales linearly with the number of choices.
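
To put a number on that concern, here is a rough Python sketch of the generate-shuffle-draw approach together with the pair count it would have to hold in memory. The function names are hypothetical, and this is the alternative being discussed, not what this pull request implements.

```python
import itertools
import math
import random


def shuffled_pairs(choices, rng=random):
    """Materialize every unordered pair of choices, then shuffle once.

    Drawing from the returned list in order gives sampling without
    replacement, at the cost of storing n * (n - 1) / 2 pairs.
    """
    pairs = list(itertools.combinations(choices, 2))
    rng.shuffle(pairs)
    return pairs


def pair_count(n):
    """Number of unordered pairs among n choices."""
    return math.comb(n, 2)


print(pair_count(18_000))  # 161991000
```

For a question with 18,000 choices that is roughly 162 million pairs per visitor, which is why materializing the full list looked unattractive.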

> But, then the steps below seem more complicated. Is it fair to say that this is like the catch-up algorithm with sampling without replacement?

You bring up a good point. This algorithm doesn't fare as well when new ideas are added during a survey. The use-case is more targeted at the survey having all the ideas seeded initially and then folks are told to vote on them. This was the particular use-case of this client and similar to others that we've heard where creators want people to vote on all the options.

> Do you think that using this on the server will cause load problems? What tests have been run? I remember that with our current algorithm we have to do caching and stuff like that.

I haven't tested this on questions with lots of ideas, but I did keep performance in mind when creating the algorithm.

> What tests are in place to make sure that it is working correctly?

There is a test to verify that, when the all-combos algorithm is requested, the generated appearance includes the proper algorithm name (since we store some details about the algorithm that was used to generate an appearance with the appearance).

There's also a test that votes on all the combinations, confirms that it saw no duplicates and votes one more time and confirms that it did see a duplicate.
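
The duplicate check in a test like that has to treat a prompt of A versus B as the same as B versus A. A small Python sketch of such a check follows; the helper name `first_duplicate_index` is hypothetical and this is not the test code from the pull request.

```python
def first_duplicate_index(prompts):
    """Return the index of the first repeated pair, or None if all are unique.

    prompts is a sequence of (left, right) tuples. Order within a pair is
    ignored, so ("a", "b") and ("b", "a") count as the same combination.
    """
    seen = set()
    for index, (left, right) in enumerate(prompts):
        key = frozenset((left, right))
        if key in seen:
            return index
        seen.add(key)
    return None
```

With n choices, a passing run would return `None` for the first n * (n - 1) / 2 prompts and a non-`None` index once one more prompt is appended.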

> Would you or the client be willing to write a blog post about it? That's how we document new stuff and then we can link to that blog post from the admin page.

Sure. I'll see if the client is interested, otherwise I can.

@msalganik
Member

Sorry for the slow reply. This all sounds good. Once we have the blog post up, let's merge and deploy.

@akshitkrnagpal

@lukebaker

> This adds a new prompt selection algorithm called "all-combos". The goal of this selection algorithm is to show a particular visitor all possible combinations before showing any duplicates. This algorithm is concerned with combinations not permutations, so it considers a comparison between choice A and B equivalent to a comparison between choice B and A.

This is exactly what my requirement was modulo duplicates. Thanks for your work on this.

> After a visitor has seen all combinations, the algorithm essentially starts over with a new "round". The visitor will see duplicates from the previous round, but the new round will complete with no duplicates within that round. There's randomness built into the algorithm so that each round is not identical and each visitor will see the prompts in a different order.

Can you point me to the part of code where I can modify this to show the user that you have voted for all the combinations? Is it a simple change?

@lukebaker
Member Author

> Can you point me to the part of code where I can modify this to show the user that you have voted for all the combinations? Is it a simple change?

@akshitkrnagpal, the algorithm here will not notify the user of this. This sort of change would happen in the pairwise client (e.g., the allourideas.org code). If I remember correctly, there is a total vote count for a visitor that gets incremented and displayed to the user. You could trigger the message when their total vote count indicates that they've seen all the combinations. Use the following formula to determine how many votes they need to see all of them, where n is the total number of choices available:

n! / (2! * (n - 2)!)
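
The formula simplifies to n * (n - 1) / 2. A client-side check might look like the Python sketch below; the function names are hypothetical, and it assumes that every prompt shown to the visitor increments the vote count (skipped prompts, if they are not counted, would need separate handling).

```python
def votes_to_see_all(n):
    """Number of unordered pairs among n choices: n! / (2! * (n - 2)!)."""
    return n * (n - 1) // 2


def has_seen_all(total_votes, n):
    """True once the visitor's vote count covers every combination."""
    return total_votes >= votes_to_see_all(n)


print(votes_to_see_all(10))  # 45
```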

@akshitkrnagpal

@lukebaker
Thanks for giving me the approach. Much appreciated.


3 participants