Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multisets under constraint with integer vectors #12

Closed
jwood000 opened this issue Jan 28, 2020 · 1 comment
Closed

Multisets under constraint with integer vectors #12

jwood000 opened this issue Jan 28, 2020 · 1 comment

Comments

@jwood000
Copy link
Owner

@jwood000 jwood000 commented Jan 28, 2020

When trying to replicate the results of this question: Remove repeating numbers from for loop, I found an issue with how version 2.3.5 handles multisets and their respective frequencies when the vector passed isn't sorted. Observe:

set.seed(42)
vals <- sample(-50:50, 20)
vals
 [1]  42  43 -22  31  12  -1  19 -38  11  14  -9  41  33 -28 -10  30
[17]  38 -41 -11  -5

vals <- c(0L, vals)
myfreqs <- c(length(vals) - 1, rep(1, length(vals) - 1))

myfreqs
 [1] 20  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1

combs <- comboGeneral(vals, length(vals), freqs = myfreqs, 
                                            constraintFun = "sum", 
                                            comparisonFun = "==", 
                                            limitConstraints = 170)

dim(combs)
[1] 6561   21

## Brute force verification
allCombs <- comboGeneral(sort(vals), length(vals),
                         freqs = myfreqs[order(vals)], constraintFun = "sum")

verifiedCombs <- allCombs[which(allCombs[,22] == 170), ]

dim(verifiedCombs)
[1] 2135   22

Clearly, something is not right. We obtained 2135 solutions with brute force and 6561 with the constraint algorithm. Let's have a look at the actual results.

head(combs[,1:12])
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
[1,]  -41  -38  -28  -22  -11  -10   -5    0   11    11    11    11
[2,]  -41  -38  -28  -22  -11  -10   -5   11   11    11    11    11
[3,]  -41  -38  -28  -22  -11  -10   -5   11   11    11    11    11
[4,]  -41  -38  -28  -22  -11  -10   -1   11   11    11    11    11
[5,]  -41  -38  -28  -22  -11  -10    0   11   11    11    11    11
[6,]  -41  -38  -28  -22  -11  -10    0   11   11    11    11    11

head(verifiedCombs)
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
[1,]  -41  -38  -28  -22  -10   -5    0    0    0     0    11    12
[2,]  -41  -38  -28  -22   -9   -5   -1    0    0     0    11    12
[3,]  -41  -38  -28  -22   -1    0    0    0    0     0     0    11
[4,]  -41  -38  -28  -11  -10   -5    0    0    0     0     0    12
[5,]  -41  -38  -28  -11   -9   -5   -1    0    0     0     0    12
[6,]  -41  -38  -28  -11   -9   -5    0    0    0     0     0    11
     [,13] [,14] [,15] [,16] [,17] [,18] [,19] [,20] [,21] [,22]
[1,]    14    19    30    31    33    38    41    42    43   170
[2,]    14    19    30    31    33    38    41    42    43   170
[3,]    12    19    30    31    33    38    41    42    43   170
[4,]    14    19    30    31    33    38    41    42    43   170
[5,]    14    19    30    31    33    38    41    42    43   170
[6,]    14    19    30    31    33    38    41    42    43   170

We can see that the brute force solution correctly repeats zero multiple times whereas the constraint algo repeats 11 multiple times. This sorting is taken care of in GetPartitionCase :

if (IsMult) {
for (int i = 0; i < (lenV - 1); ++i) {
for (int j = (i + 1); j < lenV; ++j) {
if (v[i] > v[j]) {
std::swap(v[i], v[j]);
std::swap(Reps[i], Reps[j]);
}
}
}
} else {
std::sort(v.begin(), v.end());
}

And we see that it is a templated function that can accept vectors of different types:

template <typename typeVector>
void GetPartitionCase(const std::vector<std::string> &compFunVec, std::vector<typeVector> &v,
const std::string &mainFun, const std::vector<typeVector> &target,
PartitionType &PartType, distinctType &distinctTest, const SEXP &Rlow,
std::vector<int> &Reps, int lenV, int &m, double tolerance, bool IsMult,
bool IsRep, bool IsBet, bool mIsNull) {

Now, ConstraintsMaster.cpp calls GetPartitionCase here:

Rtolerance, compFunVec, tolerance, mainFun, vNum);
GetPartitionCase(compFunVec, vNum, mainFun, targetVals,
PartType, distinctTest, Rlow, myReps, n, m,
tolerance, IsMult, IsRep, IsBetweenComp, Rf_isNull(Rm));
}

And here is the problem. We are only taking care of numeric vectors (vNum and targetVals are vectors of type double). We can verify this by simply wrapping vals above with as.numeric:

combsNum <- comboGeneral(as.numeric(vals), length(vals),
                         freqs = myfreqs, 
                         constraintFun = "sum", 
                         comparisonFun = "==", 
                         limitConstraints = 170)

dim(combsNum)
[1] 2135   21

all.equal(combsNum, verifiedCombs[, 1:21])
[1] TRUE

We simply need to ensure we handle the integer case appropriately.

@jwood000
Copy link
Owner Author

@jwood000 jwood000 commented Jan 29, 2020

Fixed by e0b4909

@jwood000 jwood000 closed this Jan 29, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
1 participant
You can’t perform that action at this time.