Navigation Menu

Skip to content
This repository has been archived by the owner on Mar 21, 2024. It is now read-only.

cub::DeviceHistogram::HistogramEven: Incorrect result when LevelT does not exactly match SampleT #479

Closed
Melirius opened this issue May 11, 2022 · 4 comments · Fixed by #487
Assignees
Labels
P1: should have Necessary, but not critical. release: breaking change Include in "Breaking Changes" section of release notes. repro: verified The provided repro has been validated. type: bug: functional Does not work as intended.
Milestone

Comments

@Melirius
Copy link

Passing into cub::DeviceHistogram::HistogramEven bounds of type different from type of samples leads to wrong histogramming. Example constructed from the library code is below. If you change type of lower_level and upper_level to float, all is OK, but int as in the example produces wrong histogram [1, 0, 0, 0, 4, 0]. Applicable to 1.16.

#include <cstdlib>
#include <vector>

#include <cub/cub.cuh>   // or equivalently <cub/device/device_histogram.cuh>

// Declare, allocate, and initialize device-accessible pointers for input samples and
// output histogram
int      num_samples = 10;//32 << 20;    // e.g., 10
float*   d_samples;      // e.g., [2.2, 6.1, 7.1, 2.9, 3.5, 0.3, 2.9, 2.1, 6.1, 999.5]
int*     d_histogram;    // e.g., [ -, -, -, -, -, -]
int      num_levels = 7;     // e.g., 7       (seven level boundaries for six bins)
int      lower_level = 0;
int      upper_level = 12;
//float    lower_level = 0.0;    // e.g., 0.0     (lower sample value boundary of lowest bin)
//float    upper_level = 12.0;    // e.g., 12.0    (upper sample value boundary of upper bin)

int main()
{
    std::vector<float> s {2.2, 6.1, 7.1, 2.9, 3.5, 0.3, 2.9, 2.1, 6.1, 999.5};
    std::vector<int> h(num_levels - 1);

    s.resize(num_samples);
    cudaMalloc(&d_samples, num_samples * sizeof(float));
    cudaMalloc(&d_histogram, (num_levels - 1) * sizeof(int));
    cudaMemcpy(d_samples, s.data(), num_samples * sizeof(float), cudaMemcpyHostToDevice);

    // Determine temporary device storage requirements
    void*    d_temp_storage = NULL;
    size_t   temp_storage_bytes = 0;
    cub::DeviceHistogram::HistogramEven(d_temp_storage, temp_storage_bytes,
        d_samples, d_histogram, num_levels, lower_level, upper_level, num_samples);

    // Allocate temporary storage
    cudaMalloc(&d_temp_storage, temp_storage_bytes);

    // Compute histograms
    cub::DeviceHistogram::HistogramEven(d_temp_storage, temp_storage_bytes,
        d_samples, d_histogram, num_levels, lower_level, upper_level, num_samples);

    // d_histogram   <-- [1, 5, 0, 3, 0, 0];

    cudaMemcpy(h.data(), d_histogram, (num_levels - 1) * sizeof(int), cudaMemcpyDeviceToHost);

    for(int a : h)
        std::cout << a << ", ";

    return 0;
}
@alliepiper alliepiper added type: bug: functional Does not work as intended. P1: should have Necessary, but not critical. release: breaking change Include in "Breaking Changes" section of release notes. labels May 11, 2022
@alliepiper alliepiper added this to the 2.0.0 milestone May 11, 2022
@alliepiper alliepiper added the repro: verified The provided repro has been validated. label May 11, 2022
@alliepiper
Copy link
Collaborator

We should either fix this behavior in the backend or require that LevelT is the same as SampleIteratorT's value type. Moving to 2.0 since it may be a breaking change.

@alliepiper alliepiper changed the title Error in cub::DeviceHistogram::HistogramEven cub::DeviceHistogram::HistogramEven: Incorrect result when LevelT does not exactly match SampleT May 16, 2022
@alliepiper
Copy link
Collaborator

alliepiper commented May 16, 2022

Let's update the API to remove the deduced LevelT and replace it with something like std::type_identity_t<...> of the SampleIteratorT's value type

@alliepiper alliepiper self-assigned this May 16, 2022
@Melirius
Copy link
Author

Let's update the API to remove the deduced LevelT and replace it with something like std::type_identity_t<...> of the SampleIteratorT's value type

That is an idea that have consequences, for example: when you histogram uint8_t fully - with 256 bins - upper level 256 cannot be represented by uint8_t. So there is no possibility to histogram full range, that is needed at least for shorts and chars.

@alliepiper
Copy link
Collaborator

That's a good point, since the upper bound is exclusive. We'll find a different fix.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
P1: should have Necessary, but not critical. release: breaking change Include in "Breaking Changes" section of release notes. repro: verified The provided repro has been validated. type: bug: functional Does not work as intended.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants