Skip to content

Conversation

edsavage
Copy link
Contributor

Also fixes a couple more warnings picked up by clang_tidy

closes #845

Also fixes a couple more warnings picked up by clang_tidy
* Reduce memory usage of {ml} native processes on Windows. (See {ml-pull}844[#844].)

=== Bug Fixes
* Fixes memory errors picked up by Valgrind. (See {ml-pull}852[#852].)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don’t think the release notes should mention Valgrind. The end user won’t care how the problem was discovered (or if they really care they can drill down to the GitHub issue).

Maybe something like, “Fixes potential memory corruption when determining seasonality.”


TComplexVec f;
f.reserve(n);
TComplexVec f(n, TComplex{});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The loop below contains push_back and emplace_back calls on this vector. Is the final size meant to be n? If so I think it will end up too big unless the loop contents are changed too.

Having said that, given that a couple of branches inside the loop set values to (0, 0) you’re probably right that it’s clearest to just initialize a correctly sized vector to all (0, 0) and then change the elements that need changing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I've stepped through this several times. The call to emplace_back happens after the vector has been truncated by resize.

Actually, on the face of it I think the code was probably fine as it stood but the change keeps Valgrind happy. The other option then is to leave this bit of the code alone and to simply add the offending stack trace to a Valgrind suppression file (that would need to be created and committed as part of this PR).

Which leads me to another suggestion. Should we include Valgrind checks as part of the Jenkins CI builds for PRs? Maybe we could also run a full Valgrind check once a week or so, over the weekend?

@edsavage
Copy link
Contributor Author

Looks like there's a few test case failures due to changed threshold values mainly, investigating.

Copy link
Contributor

@tveasey tveasey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change to the bound check looks right to me. However, I don't think it is necessary to pad the value vector with default initialised values.


TComplexVec f;
f.reserve(n);
TComplexVec f(n, TComplex{});
Copy link
Contributor

@tveasey tveasey Nov 27, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't seem right to me. Consider the case that count(value[i]) > 0 for all i then the vector contains a pad of n TComplex{} followed by n TComplex{CBasicStatistics::mean(values[i]) - mean, 0.0} values. Basically, I'm not sure why this is now resized. Also, I can't see why it could cause a valgrind error since we only resize or append to f in the following.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this change was just an attempt to shut Valgrind up about the following...

Conditional jump or move depends on uninitialised value(s)
std::__1::complex<double> std::__1::operator*<double>(std::__1::complex<double> const&, std::__1::complex<double> const&)
std::__1::complex<double>& std::__1::complex<double>::operator*=<double>(std::__1::complex<double> const&)
ml::maths::CSignal::hadamard(std::__1::vector<std::__1::complex, std::__1::allocator> const&, std::__1::vector<std::__1::complex, std::__1::allocator>&)
ml::maths::CSignal::fft(std::__1::vector<std::__1::complex, std::__1::allocator>&)
ml::maths::CSignal::autocorrelations(std::__1::vector<ml::maths::CBasicStatistics::SSampleCentralMoments, std::__1::allocator> const&, std::__1::vector<double, std::__1::allocator>&)
ml::maths::(anonymous namespace)::mostSignificantPeriodicComponent(std::__1::vector<ml::maths::CBasicStatistics::SSampleCentralMoments, std::__1::allocator>)
ml::maths::testForPeriods(ml::maths::CPeriodicityHypothesisTestsConfig const&, long, long, std::__1::vector<ml::maths::CBasicStatistics::SSampleCentralMoments, std::__1::allocator> const&)
ml::maths::CTimeSeriesDecompositionDetail::CPeriodicityTest::test(ml::maths::CTimeSeriesDecompositionDetail::SAddValue const&)
ml::maths::CTimeSeriesDecompositionDetail::CPeriodicityTest::handle(ml::maths::CTimeSeriesDecompositionDetail::SAddValue const&)
ml::maths::CTimeSeriesDecomposition::addPoint(long, double, std::__1::array<double, 4ul> const&, std::__1::function<void (std::__1::vector)> const&)
CTimeSeriesDecompositionTest::testMixedSmoothAndSpikeyDataProblemCase::test_method()
CTimeSeriesDecompositionTest::testMixedSmoothAndSpikeyDataProblemCase_invoker()
Uninitialised value was created by a heap allocation
malloc
operator new(unsigned long)
std::__1::__libcpp_allocate(unsigned long, unsigned long)
std::__1::allocator<std::__1::complex>::allocate(unsigned long, void const*)
std::__1::allocator_traits<std::__1::allocator>::allocate(std::__1::allocator<std::__1::complex>&, unsigned long)
std::__1::__split_buffer<std::__1::complex, std::__1::allocator&>::__split_buffer(unsigned long, unsigned long, std::__1::allocator<std::__1::complex>&)
std::__1::__split_buffer<std::__1::complex, std::__1::allocator&>::__split_buffer(unsigned long, unsigned long, std::__1::allocator<std::__1::complex>&)
std::__1::vector<std::__1::complex, std::__1::allocator>::reserve(unsigned long)
ml::maths::CSignal::autocorrelations(std::__1::vector<ml::maths::CBasicStatistics::SSampleCentralMoments, std::__1::allocator> const&, std::__1::vector<double, std::__1::allocator>&)
ml::maths::(anonymous namespace)::mostSignificantPeriodicComponent(std::__1::vector<ml::maths::CBasicStatistics::SSampleCentralMoments, std::__1::allocator>)
ml::maths::testForPeriods(ml::maths::CPeriodicityHypothesisTestsConfig const&, long, long, std::__1::vector<ml::maths::CBasicStatistics::SSampleCentralMoments, std::__1::allocator> const&)
ml::maths::CTimeSeriesDecompositionDetail::CPeriodicityTest::test(ml::maths::CTimeSeriesDecompositionDetail::SAddValue const&)

Maybe Valgrind just got confused? If so, as I said in a comment above, I can always just add a suppression for this warning.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me study the code a bit and have a think about this. I can't at present see any reason why the loop should be triggering an error so may be a false positive, but I think we would need an alternative fix if we wanted to tackle this. One possibility would be to presize then not extend, but just write values into place.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me study the code a bit and have a think about this. I can't at present see any reason why the loop should be triggering an error so may be a false positive, but I think we would need an alternative fix if we wanted to tackle this. One possibility would be to presize then not extend, but just write values into place.

Thanks Tom. Welcome back btw :-)

Copy link
Contributor

@tveasey tveasey Nov 28, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Along these lines, how about the following:

TComplexVec f(n, TComplex{0.0, 0.0});
for (std::size_t i = 0; i < n; ++i) {
    std::size_t j = i;
    for (/**/; j < n && CBasicStatistics::count(values[j]) == 0; ++j) {
        // no-op
    }
    if (i < j) {
        // Infer missing values by linearly interpolating.
        if (j == n) {
            break;
        } else if (i > 0) {
            for (std::size_t k = i; k < j; ++k) {
                double alpha{static_cast<double>(k - i + 1) / static_cast<double>(j - i + 1)};
                double fj{CBasicStatistics::mean(values[j]) - mean};
                f[k] = (1.0 - alpha) * f[i - 1] + alpha * TComplex{fj, 0.0};
            }
        }
        i = j;
    }
    f[i] = TComplex{CBasicStatistics::mean(values[i]) - mean, 0.0};
}

This is actually clearer IMO and may incidentally fix the valgrind error (which still seems spurious to me).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That also looked odd to me. The offending call was from fft, but it is applied to a and b which are initialised as

TComplexVec a(m, TComplex(0.0));
TComplexVec b(m, TComplex(0.0));

so must be of the same size. The only other thing I thought on this was does calling TComplex(0.0) leave the imaginary part uninitialised, I haven't checked? May be clearer anyway to change to TComplex{0.0, 0.0} in these definitions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you see the error above before or after the error in radix2fft was fixed?

@droberts195 I think the error was visible after the radix2fft was fixed (at least I didn't notice it in the initial Valgrind report)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also @tveasey, your suggested change above ^^^ looks good. No Valgrind errors in CTimeSeriesDecompositionTest/testMixedSmoothAndSpikeyDataProblemCase at least.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so must be of the same size

hadamard is a public method, so not necessarily called from there. Also, a bug in another method called in between initialization and the call to hadamard could accidentally change the size of one of the vectors. Saying "so must be of the same size" assumes the reader is intimately familiar with all possible code paths.

does calling TComplex(0.0) leave the imaginary part uninitialised

No, it initializes it to 0.0. I checked the library code when I first opened #845. But I agree it's clearer and equally efficient to explicitly pass both arguments.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hadamard is a public method, so not necessarily called from there.

But the stack trace includes CSignal::fft

Copy link
Contributor

@tveasey tveasey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@droberts195 droberts195 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@edsavage edsavage merged commit 8fd6ed3 into elastic:master Nov 28, 2019
edsavage added a commit to edsavage/ml-cpp that referenced this pull request Nov 28, 2019
 * Fixes Valgrind errors in `CSignal`
 * Also fixes a couple more warnings picked up by clang_tidy

(cherry picked from commit 8fd6ed3)
@edsavage edsavage deleted the valgrind_error branch November 28, 2019 17:28
edsavage added a commit that referenced this pull request Nov 28, 2019
 * Fixes Valgrind errors in CSignal
 * Also fixes a couple more warnings picked up by clang_tidy

Backports #852
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Valgrind error in CTimeSeriesDecompositionDetail
3 participants