In [3]:
import numpy as np
import pandas as pd
import random

# User Input Generator

The purpose of this notebook is to create functions that generate (fake) user inputs based on a given `agreement` value. The `agreement` value is a probability (between 0 and 1, inclusive): the chance that a simulated user answers with the consensus.
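As a quick illustration of what `agreement` means, the sketch below makes one random draw per simulated user and checks that the share of agreeing users lands near the requested probability. The helper name `agrees` and the seeded generator are assumptions made for this example only; the functions in this notebook draw with `np.random.rand()` instead.

```python
import numpy as np

# Seeded only so that the sketch is reproducible; not part of the notebook's functions.
rng = np.random.default_rng(seed=0)

def agrees(agreement, rng):
    """Hypothetical helper: return True with probability `agreement`."""
    return bool(rng.random() < agreement)

agreement = 0.8
draws = [agrees(agreement, rng) for _ in range(10_000)]
empirical_rate = sum(draws) / len(draws)
print(empirical_rate)  # should land close to 0.8
```

With `agreement = 1.0` every simulated user agrees, and with `agreement = 0.0` none do.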

## Functions

### A single question and a single (pseudo) user answer

Inputs
- `choices`: the list of possible answer choices as strings
- `consensus`: the consensus answer(s) as a list. Every value must appear in `choices`
- `agreement`: the probability that a user agrees with the consensus (a float between 0 and 1)
- `numAnswers`: the range for the number of answers a user may select, as a tuple of 2 integers `(min, max)`. The minimum must be at least 1 and the maximum cannot exceed the total number of choices. Example: (1, 1) means exactly 1 answer; (2, 5) means 2 to 5 answers.

Output
- the (pseudo) user's answer as a list

In [7]:
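def singleQuestionSingleUser(choices, consensus, agreement, numAnswers):
    # TODO: Rethink where to put the checks/function design so that they are only run once for optimal runtime
    if type(numAnswers[0]) != int or type(numAnswers[1]) != int:
        raise ValueError("Number of answers must be integers.")
    if numAnswers[0] > numAnswers[1] or numAnswers[1] > len(choices) or numAnswers[0] <= 0:
        raise ValueError("Please check the range of number of possible answers.")
        
    for c in consensus:
        if c not in choices:
            raise ValueError("Value(s) in `consensus` must be in `choices`.")
        
    # With probability `agreement`, the user gives the consensus answer.
    value = np.random.rand()
    if value <= agreement:
        return list(consensus)  # return a copy so callers never share the input list
    
    # Otherwise, drop one randomly chosen consensus value so that a disagreeing
    # answer can never equal the full consensus, then sample from what is left.
    # Work on a copy: the caller's `choices` must not be mutated between calls.
    remaining = list(choices)
    remaining.remove(random.choice(consensus))
    # One choice was removed, so cap the sample size at what is still available.
    numAns = min(int(np.random.randint(numAnswers[0], numAnswers[1] + 1)), len(remaining))
    return random.sample(remaining, numAns)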
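One possible way to address the TODO about running the checks only once is to move them into a separate helper that is called a single time per question, before any users are simulated. The sketch below is only an illustration of that idea: `validate_question` is a hypothetical name, and it uses `isinstance` rather than the `type(...) != int` comparison above.

```python
def validate_question(choices, consensus, num_answers):
    """Hypothetical helper: run the input checks once per question."""
    low, high = num_answers
    if not isinstance(low, int) or not isinstance(high, int):
        raise ValueError("Number of answers must be integers.")
    if low <= 0 or low > high or high > len(choices):
        raise ValueError("Please check the range of number of possible answers.")
    if any(c not in choices for c in consensus):
        raise ValueError("Value(s) in `consensus` must be in `choices`.")

choices = ["red", "green", "blue"]
validate_question(choices, ["red"], (1, 2))  # valid input: returns without raising

try:
    validate_question(choices, ["purple"], (1, 2))  # "purple" is not a choice
except ValueError as err:
    error_message = str(err)
    print(error_message)
```

The sampling function could then assume its inputs are valid, so looping over many users would not repeat the same checks.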

### A single question with multiple (pseudo) user answers -- aka multiple users

Loop over `singleQuestionSingleUser`.

Inputs
- `choices`: the list of possible answer choices as strings
- `consensus`: the consensus answer(s) as a list. Every value must appear in `choices`
- `agreement`: a list of probabilities, one per user, that the user agrees with the consensus (each a float between 0 and 1)
- `numAnswers`: the range for the number of answers a user may select, as a tuple of 2 integers `(min, max)`. The minimum must be at least 1 and the maximum cannot exceed the total number of choices. Example: (1, 1) means exactly 1 answer; (2, 5) means 2 to 5 answers.

Output
- the user answers as a list of lists

In [16]:
def singleQuestionMultUsers(choices, consensus, agreement, numAnswers):
    answers = []
    for a in agreement:
        answers.append(singleQuestionSingleUser(choices, consensus, a, numAnswers))
    return answers

### Multiple Questions & Multiple Users

Loop over `singleQuestionSingleUser` for every user and every question.

Inputs
- `choices`: the list of lists of possible answer choices as strings
- `consensus`: the consensus answer(s) as a list of lists, one inner list per question. Every value must appear in the matching entry of `choices`
- `agreement`: a list of probabilities, one per user, that the user agrees with the consensus (each a float between 0 and 1)
- `numQuestions`: the number of questions; must equal `len(choices)`, `len(consensus)`, and `len(numAnswers)`
- `numAnswers`: the range for the number of answers a user may select, given as a list of tuples of 2 integers `(min, max)`, one tuple per question. Each minimum must be at least 1 and each maximum cannot exceed the number of choices for that question. Example: (1, 1) means exactly 1 answer; (2, 5) means 2 to 5 answers.

Output
- the user answers as a list of lists of lists

In [17]:
def MultQuestionMultUsers(choices, consensus, agreement, numQuestions, numAnswers):
    if numQuestions != len(choices) or len(choices) != len(consensus) or len(consensus) != len(numAnswers):
        raise ValueError("Number of questions is not reflected in either `choices` or `consensus` or `numAnswers` or some combination.")
    
    answers = []
    for a in agreement:
        # One inner list per user, so that answers[user][question] is a list of choices.
        userAnswers = []
        for i in range(numQuestions):
            userAnswers.append(singleQuestionSingleUser(choices[i], consensus[i], a, numAnswers[i]))
        answers.append(userAnswers)
    return answers
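To make the "list of lists of lists" output described above easier to picture, the sketch below uses a small hand-written stand-in (not generated data) and assumes the outer index is the user and the next index is the question. The helper `agreement_rate` is a hypothetical name introduced only for this example.

```python
# Hand-written stand-in for the documented output shape (not generated data):
# answers[user][question] is that user's list of selected choices.
answers = [
    [["red"], ["cat", "dog"]],  # user 0: question 0, question 1
    [["blue"], ["cat"]],        # user 1: question 0, question 1
]
consensus = [["red"], ["cat", "dog"]]

def agreement_rate(user_answers, consensus):
    """Hypothetical helper: share of questions where the user matches the consensus exactly."""
    matches = sum(set(a) == set(c) for a, c in zip(user_answers, consensus))
    return matches / len(consensus)

rates = [agreement_rate(user_answers, consensus) for user_answers in answers]
print(rates)  # [1.0, 0.0]
```

Comparing as sets ignores the order in which a user listed the choices.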

### Notes

- Consider a DataFrame or Series as output to the functions? OR write a toDataFrame function that will convert the list of lists to a DataFrame
- How to redesign the functions so that the checks are only done once?
- Consider condensing the functions (only one single user function -- an agreement list of length one is equivalent to only 1 user)
- Consider the output of MultQMultU: a list of lists of lists is very confusing
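The `toDataFrame` idea from the notes could look roughly like the sketch below. It is one possible layout rather than a settled design: it assumes the nested input is indexed as `answers[user][question]`, and the column names `user`, `question`, and `answer` are hypothetical.

```python
import pandas as pd

def toDataFrame(answers):
    """Hypothetical converter: one row per (user, question) pair."""
    rows = [
        {"user": u, "question": q, "answer": list(selected)}
        for u, per_user in enumerate(answers)
        for q, selected in enumerate(per_user)
    ]
    return pd.DataFrame(rows, columns=["user", "question", "answer"])

# Hand-written stand-in for generated answers (not real output).
answers = [
    [["red"], ["cat", "dog"]],  # user 0
    [["blue"], ["cat"]],        # user 1
]
df = toDataFrame(answers)
print(df)
```

A long layout like this keeps one answer list per row, which makes it straightforward to filter or group by `user` or `question`.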

## Example Pseudo Data