Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mode is not calculated correctly #18

Closed
ginodeis opened this issue Dec 16, 2015 · 3 comments
Closed

Mode is not calculated correctly #18

ginodeis opened this issue Dec 16, 2015 · 3 comments
Labels

Comments

@ginodeis
Copy link
Contributor

Hi,
First of all, congratulations and thanks for your work.

Using the function for calculate the mode, I've found that when the mode is a low value compared with the rest, and the data array is relatively long, an incorrect result occurs:

Example:

var data = []float64{1, 2, 3, 4, 4, 4, 4, 4, 5, 3, 6, 7, 5, 0, 8, 8, 7, 6, 9, 9}
mode, _ := stats.Mode(data)
fmt.Printf("%v\n", data)

// Result: [4 8]

As we can see the result is incorrect, because the function should return [4]

After analyze the results and study the function code, I've located the problem and this is the fix:

File: https://github.com/montanaflynn/stats/blob/master/mode.go

package stats

// Mode gets the mode [most frequent value(s)] of a slice of float64s
func Mode(input Float64Data) (mode []float64, err error) {
    // Return the input if there's only one number
    l := input.Len()
    if l == 1 {
        return input, nil
    } else if l == 0 {
        return nil, EmptyInput
    }

    c := sortedCopyDif(input)
    // Traverse sorted array,
    // tracking the longest repeating sequence
    mode = make([]float64, 5)
    cnt, maxCnt := 1, 1
    for i := 1; i < l; i++ {
        switch {
        case c[i] == c[i-1]:
            cnt++
        case cnt == maxCnt && maxCnt != 1:
            mode = append(mode, c[i-1])
            cnt = 1
        case cnt > maxCnt:
            mode = append(mode[:0], c[i-1])
            maxCnt, cnt = cnt, 1
        // :: the fix - reset the counter ::
        default:
            cnt = 1
        // :: end fix ::
        }
    }
    switch {
    case cnt == maxCnt:
        mode = append(mode, c[l-1])
    case cnt > maxCnt:
        mode = append(mode[:0], c[l-1])
        maxCnt = cnt    
    }


    // Since length must be greater than 1,
    // check for slices of distinct values
    if maxCnt == 1 {
        return Float64Data{}, nil
    }

    return mode, nil
}

I don't know if the solution convinces you, but it works. If you see it well, I can make the changes and emit a pull request to correct it sooner.

@Kunde21
Copy link
Contributor

Kunde21 commented Dec 17, 2015

You're absolutely correct. The counter is not being properly reset.

Nice catch. I'd suggest adding a test along with the code correction, then send a PR.

@montanaflynn
Copy link
Owner

Thanks for the bug report @ginodeis and figuring out the solution!

I welcome a PR and if possible a test case as suggested by @Kunde21.

@montanaflynn
Copy link
Owner

Closed with #19

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants