Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

crashes in test-{smith-form-valence,regression} compiling with -D_FORTIFY_SOURCE=3 #304

Closed
collares opened this issue Jul 9, 2023 · 7 comments · Fixed by #307
Closed
Assignees

Comments

@collares
Copy link

collares commented Jul 9, 2023

NixOS now builds all packages with -D_FORTIFY_SOURCE=3. This causes two tests, test-smith-form-valence and test-regression, to fail with buffer overflows. To avoid duplication, I will only post the relevant details for test-smith-form-valence. The test prints

Expected smith form SL: {{1,8}{1440000,1}{0,2}}
Computed Smith form SL: {{1,8}{1440000,1}{0,2}}
PASSED.
*** buffer overflow detected ***: terminated

and the stack trace is

#0  0x00007ffff5f01a8c in __pthread_kill_implementation () from /nix/store/ayg065nw0xi1zsyi8glfh5pn4sfqd8xg-glibc-2.37-8/lib/libc.so.6
#1  0x00007ffff5eb2c86 in raise () from /nix/store/ayg065nw0xi1zsyi8glfh5pn4sfqd8xg-glibc-2.37-8/lib/libc.so.6
#2  0x00007ffff5e9c8ba in abort () from /nix/store/ayg065nw0xi1zsyi8glfh5pn4sfqd8xg-glibc-2.37-8/lib/libc.so.6
#3  0x00007ffff5e9d5f5 in __libc_message.cold () from /nix/store/ayg065nw0xi1zsyi8glfh5pn4sfqd8xg-glibc-2.37-8/lib/libc.so.6
#4  0x00007ffff5f91679 in __fortify_fail () from /nix/store/ayg065nw0xi1zsyi8glfh5pn4sfqd8xg-glibc-2.37-8/lib/libc.so.6
#5  0x00007ffff5f8fea4 in __chk_fail () from /nix/store/ayg065nw0xi1zsyi8glfh5pn4sfqd8xg-glibc-2.37-8/lib/libc.so.6
#6  0x00000000004c2483 in memcpy (__len=8, __src=0x7fffffff82e0, __dest=<optimized out>) at /nix/store/0ccvlygpc7p5zyfsyz8mmg9ycqkvrcp2-glibc-2.37-8-dev/include/bits/string_fortified.h:29
#7  LinBox::BlasMatrixApplyDomain<Givaro::ZRing<Givaro::Integer>, LinBox::BlasMatrix<Givaro::ZRing<Givaro::Integer>, std::vector<Givaro::Integer, std::allocator<Givaro::Integer> > > >::applyV (this=0x7fffffff8918, y=..., x=..., b=...) at ../linbox/blackbox/apply.h:638
#8  0x00000000004c3105 in LinBox::LiftingContainerBase<Givaro::ZRing<Givaro::Integer>, LinBox::BlasMatrix<Givaro::ZRing<Givaro::Integer>, std::vector<Givaro::Integer, std::allocator<Givaro::Integer> > > >::const_iterator::next (this=this@entry=0x7fffffff85b0, digit=...) at ../linbox/algorithms/lifting-container.h:229
#9  0x00000000004c3398 in LinBox::RationalReconstruction<LinBox::DixonLiftingContainer<Givaro::ZRing<Givaro::Integer>, Givaro::Modular<int, int, void>, LinBox::BlasMatrix<Givaro::ZRing<Givaro::Integer>, std::vector<Givaro::Integer, std::allocator<Givaro::Integer> > >, LinBox::BlasMatrix<Givaro::Modular<int, int, void>, std::vector<int, std::allocator<int> > > >, LinBox::RReconstruction<Givaro::ZRing<Givaro::Integer>, LinBox::ClassicMaxQRationalReconstruction<Givaro::ZRing<Givaro::Integer> > > >::getRational3<LinBox::BlasVector<Givaro::ZRing<Givaro::Integer>, std::vector<Givaro::Integer, std::allocator<Givaro::Integer> > > > (this=this@entry=0x7fffffff86f0, num=..., den=...) at ../linbox/algorithms/rational-reconstruction.h:790
#10 0x00000000004eb746 in LinBox::RationalReconstruction<LinBox::DixonLiftingContainer<Givaro::ZRing<Givaro::Integer>, Givaro::Modular<int, int, void>, LinBox::BlasMatrix<Givaro::ZRing<Givaro::Integer>, std::vector<Givaro::Integer, std::allocator<Givaro::Integer> > >, LinBox::BlasMatrix<Givaro::Modular<int, int, void>, std::vector<int, std::allocator<int> > > >, LinBox::RReconstruction<Givaro::ZRing<Givaro::Integer>, LinBox::ClassicMaxQRationalReconstruction<Givaro::ZRing<Givaro::Integer> > > >::getRational<LinBox::BlasVector<Givaro::ZRing<Givaro::Integer>, std::vector<Givaro::Integer, std::allocator<Givaro::Integer> > > > (switcher=0, den=..., num=..., this=0x7fffffff86f0) at ../linbox/algorithms/rational-reconstruction.h:136
#11 LinBox::DixonSolver<Givaro::ZRing<Givaro::Integer>, Givaro::Modular<int, int, void>, LinBox::PrimeIterator<LinBox::IteratorCategories::HeuristicTag>, LinBox::Method::DenseElimination>::solveNonsingular<LinBox::BlasMatrix<Givaro::ZRing<Givaro::Integer>, std::vector<Givaro::Integer, std::allocator<Givaro::Integer> > >, LinBox::BlasVector<Givaro::ZRing<Givaro::Integer>, std::vector<Givaro::Integer, std::allocator<Givaro::Integer> > >, LinBox::BlasVector<Givaro::ZRing<Givaro::Integer>, std::vector<Givaro::Integer, std::allocator<Givaro::Integer> > > > (
    this=this@entry=0x7fffffff8d20, num=..., den=..., A=..., b=..., oldMatrix=oldMatrix@entry=false, maxPrimes=5) at ../linbox/algorithms/./dixon-solver/./dixon-solver-dense.inl:134
#12 0x00000000004ecc25 in LinBox::DixonSolver<Givaro::ZRing<Givaro::Integer>, Givaro::Modular<int, int, void>, LinBox::PrimeIterator<LinBox::IteratorCategories::HeuristicTag>, LinBox::Method::DenseElimination>::solve<LinBox::BlasMatrix<Givaro::ZRing<Givaro::Integer>, std::vector<Givaro::Integer, std::allocator<Givaro::Integer> > >, LinBox::BlasVector<Givaro::ZRing<Givaro::Integer>, std::vector<Givaro::Integer, std::allocator<Givaro::Integer> > >, LinBox::BlasVector<Givaro::ZRing<Givaro::Integer>, std::vector<Givaro::Integer, std::allocator<Givaro::Integer> > > >
    (this=this@entry=0x7fffffff8d20, num=..., den=..., A=..., b=..., old=old@entry=false, maxP=5, level=LinBox::SL_LASVEGAS)
    at ../linbox/algorithms/./dixon-solver/./dixon-solver-dense.inl:42
#13 0x00000000004ece73 in LinBox::RationalSolverAdaptiveClass<Givaro::ZRing<Givaro::Integer>, LinBox::BlasVector<Givaro::ZRing<Givaro::Integer>, std::vector<Givaro::Integer, std::allocator<Givaro::Integer> > >, std::vector<int, std::allocator<int> > >::solveNonsingular (num=..., den=..., M=..., b=...) at ../linbox/algorithms/rational-solver-adaptive.h:62
#14 0x00000000004ed167 in LinBox::RationalSolverAdaptive::solveNonsingular<Givaro::ZRing<Givaro::Integer>, LinBox::BlasVector<Givaro::ZRing<Givaro::Integer>, std::vector<Givaro::Integer, std::allocator<Givaro::Integer> > >, std::vector<int, std::allocator<int> > > (b=..., M=..., den=..., num=...) at ../linbox/algorithms/rational-solver-adaptive.h:95
#15 LinBox::LastInvariantFactor<Givaro::ZRing<Givaro::Integer>, LinBox::RationalSolverAdaptive>::lastInvariantFactor_Bonus<LinBox::BlasMatrix<Givaro::ZRing<Givaro::Integer>, std::vector<Givaro::Integer, std::allocator<Givaro::Integer> > >, LinBox::BlasVector<Givaro::ZRing<Givaro::Integer>, std::vector<Givaro::Integer, std::allocator<Givaro::Integer> > > > (
    this=this@entry=0x7fffffff9538, lif=..., Bonus=..., A=..., PrimeL=...) at ../linbox/algorithms/last-invariant-factor.h:192
#16 0x00000000004edab6 in LinBox::OneInvariantFactor<Givaro::ZRing<Givaro::Integer>, LinBox::LastInvariantFactor<Givaro::ZRing<Givaro::Integer>, LinBox::RationalSolverAdaptive>, LinBox::SCompose, LinBox::RandomMatrix>::oneInvariantFactor_Bonus<LinBox::BlasMatrix<Givaro::ZRing<Givaro::Integer>, std::vector<Givaro::Integer, std::allocator<Givaro::Integer> > >, LinBox::BlasVector<Givaro::ZRing<Givaro::Integer>, std::vector<Givaro::Integer, std::allocator<Givaro::Integer> > > > (this=this@entry=0x7fffffff9500, oif=..., bonus=..., A=..., i=i@entry=9, PrimeL=...)
    at ../linbox/algorithms/one-invariant-factor.h:254
#17 0x00000000004edc50 in LinBox::OneInvariantFactor<Givaro::ZRing<Givaro::Integer>, LinBox::LastInvariantFactor<Givaro::ZRing<Givaro::Integer>, LinBox::RationalSolverAdaptive>, LinBox::SCompose, LinBox::RandomMatrix>::oneInvariantFactor_Bonus<LinBox::BlasMatrix<Givaro::ZRing<Givaro::Integer>, std::vector<Givaro::Integer, std::allocator<Givaro::Integer> > > > (
    this=this@entry=0x7fffffff9500, oif=..., bonus=..., A=..., i=9) at ../linbox/algorithms/one-invariant-factor.h:280
#18 0x0000000000504157 in LinBox::SmithFormAdaptive::smithForm<Givaro::ZRing<Givaro::Integer>, std::vector<Givaro::Integer, std::allocator<Givaro::Integer> > > (s=..., A=...)
    at ../linbox/algorithms/smith-form-adaptive.inl:550
#19 0x0000000000431711 in LinBox::smithForm (V=..., A=..., tag=..., M=...) at ../linbox/solutions/smith-form.h:224
#20 0x000000000050474f in LinBox::smithForm<LinBox::SparseMatrix<Givaro::ZRing<Givaro::Integer>, LinBox::SparseMatrixFormat::SparseSeq>, LinBox::Method::Auto> (V=..., A=..., M=...)
    at ../linbox/solutions/smith-form.h:134
#21 0x0000000000504804 in LinBox::smithForm<LinBox::SparseMatrix<Givaro::ZRing<Givaro::Integer>, LinBox::SparseMatrixFormat::SparseSeq> > (V=..., A=...)
    at ../linbox/solutions/smith-form.h:152
#22 0x0000000000431907 in testValenceSmith (name=name@entry=0x516174 "data/sms.matrix", correctSL=...) at test-smith-form-valence.C:63
#23 0x0000000000431be2 in main (argc=<optimized out>, argv=<optimized out>) at test-smith-form-valence.C:78

This happens on the latest release (tag v1.7.0 to be precise).

@jgdumas
Copy link
Member

jgdumas commented Sep 13, 2023

Thank you for the report.
Hum this seems to come form a memcpy in ./linbox/blackbox/apply.h:638,
This was introduced to solve another issue in the PR #290.
We'll investigate this further ...

@musicinmybrain
Copy link
Contributor

musicinmybrain commented Oct 17, 2023

I looked into this a bit, since we’re seeing this in Fedora Linux too. I was able to reproduce it with optimization flags -Og and show exactly how the memcpy is overrunning the buffer by one byte. What I haven’t done is reason through the capacity formula and determine why this is happening.

Detailed gdb output is available here, but in summary:

At the time of the segfault in test-smith-form-valence, combined was allocated with size (size_t)rc*_n*(size_t)rclen = 468 bytes,

unsigned char* combined = new unsigned char[(size_t)rc*_n*(size_t)rclen];

where rc = 4 (with chunk_size = 16),

int rc = int(52 / chunk_size) + 1; //constant at 4 for now

_n = 9, and rclen = 13 (with num_chunks = 4).

int rclen = (int)num_chunks*2 + 5;

At the segfault, bitDest is offset (size_t)rclen*((i % (size_t)rc)*_n+j) = 455 bytes into combined, where j = 8,

bitDest += (size_t)rclen*((i % (size_t)rc)*_n+j);

and then another 2*i = 6, where i = 3, for a total offset of 461 bytes.

bitDest += 2*i;

Since 8 bytes are accessed, this overruns the buffer by one byte.

Now the question is, where is the error?

@musicinmybrain
Copy link
Contributor

I can confirm that adjusting the formula for the size of combined from (size_t)rc*_n*(size_t)rclen to (size_t)rc*(_n+1)*(size_t)rclen is sufficient for the tests to pass with -D_FORTIFY_SOURCE=3, but I don’t have an analytical justification for that change, nor do I necessarily claim it’s the minimum correct size.

@jamesjer
Copy link
Contributor

To find the size needed, we need the maximum value of i % (size_t)rc, which is rc - 1, the maximum value of i, which is num_chunks - 1, and the maximum value of j, which is _n - 1. Then the size will need to be ((rc - 1) * _n + (_n - 1)) * rclen + 2 * (num_chunks - 1) + 8. That simplifies to (rc * _n - 1) * rclen + 2 * num_chunks + 6. Substituting the definition of rclen gives us (rc * _n - 1) * rclen + rclen + 1, which simplifies to (rc * _n) * rclen + 1. However, the actual allocated size of combined is missing the + 1. Just add that and be happy.

@collares
Copy link
Author

collares commented Oct 17, 2023

An alternative solution would be to add 1 to the value of rclen instead, given that 2 * num_chunks + 6 appeared in the expression. Is there a way to determine which one is preferable? The reason I ask is that the source comment on how large rclen needs to be ("* needs room to hold (max long long) << (num_chunks * chunksize)") doesn't explain the current choice of + 5; going by the explanation, I would really expect it to be 2*(num_chunks-1) + 8, where + 8 accounts for the number of bytes in a long long.

@collares
Copy link
Author

collares commented Jun 18, 2024

Just add that and be happy.

I see this was used as justification for the actual fix. Just checking, does this take care of possible overlaps inside the array? The argument might be correct in this case, but the same argument would also "prove" you only need 7*n+1 bits of space to store n 8-bit values, since in this case checkers would also only see a 1-bit-past-the-bounds access (in this contrived example, the factor multiplying n is meant to be analogous to rclen in the code, so what I am claiming is that we should check rclen is correct as well). The counterargument is that tests would probably catch this "inner overlap".

The comment indicates rclen needs enough space to hold (max long long) << (num_chunks * chunk_size). Since the max long log has 63 bits set and chunk_size is 16, that would be 63 + num_chunks * 16 bits, or 2*num_chunks + 8 bytes rounding up. My previous comment argued for 2*(num_chunks-1) + 8 based on another code comment (the one saying result = -cstd[i]; result <<= (num_chunks-1)*16;).

It might be that the upper bound indicated by the code is not tight, or ir could be that the upper bound is tight and just very unlikely to be witnessed by random tests. In the latter case there would be "inner overlap" in the array in extreme cases. That being said, this "inner overlap" probably would mix low-order bits with high-order bits of the previous number, and low-order bits are easy to test randomly, so it's likely the bound in the code comment is not tight, so I wouldn't worry about it.

cc @jgdumas (tagged jamesjer by accident in a previous version, sorry)

@collares
Copy link
Author

Ok, I think I see. max long long is not 63, but rather 53 because the code internally stores the numbers as doubles and probably enforces it all fits exactly in the 52+1 mantissa bits. So the correct bound for rclen in bits is 53 + 16*(num_chunks-1), which when divided by 8 and rounded up comes to 2*num_chunks + 5 as in the code.

Now that I think I'm convinced the current bound on rclen is valid, I agree that the implemented fix should be correct. Sorry for the noise @jamesjer and @jgdumas!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants