Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Utilize std::shuffle #9004

Closed
wants to merge 8 commits into from
Closed

Utilize std::shuffle #9004

wants to merge 8 commits into from

Conversation

krionbsd
Copy link
Contributor

@krionbsd krionbsd commented Apr 6, 2020

std::random_shuffle is deprecated and removed starting from C++17, so we
might better use std::shuffle which uses URBG:
https://en.cppreference.com/w/cpp/named_req/UniformRandomBitGenerator
under the hood for a better randoms generation.

Short description

utilize std::shuffle

Checklist

I have:

std::random_shuffle is deprecated and removed starting from C++17, so we
might better use std::shuffle which uses URBG:
https://en.cppreference.com/w/cpp/named_req/UniformRandomBitGenerator
under the hood for a better randoms generation.
@omoerbeek
Copy link
Member

@omoerbeek omoerbeek commented Apr 6, 2020

constant seed is not acceptable. Better not seeding at all.

pdns/misc.cc Outdated
@@ -597,7 +599,7 @@ void shuffle(vector<DNSZoneRecord>& rrs)
break;

if(second-first > 1)
random_shuffle(first,second);
shuffle(first, second, std::default_random_engine(seed));
Copy link
Collaborator

@zeha zeha Apr 6, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It appears default_random_engine is implementation defined, probably including the constructor signature. Should we really use this?

I'm also concerned about providing a static seed of zero to an implementation defined random engine.

https://www.cplusplus.com/reference/random/default_random_engine/ suggests that it could be a linear_congruential_engine
which has a default seed value (of 1u)...

@omoerbeek
Copy link
Member

@omoerbeek omoerbeek commented Apr 6, 2020

I much prefer to see a wrapper adding a random_engine interface to our existingdns_random function.

@omoerbeek
Copy link
Member

@omoerbeek omoerbeek commented Apr 6, 2020

Something like this:

struct dns_random_engine {

  typedef uint32_t result_type;
  static constexpr uint32_t min() { return 0; }
  static constexpr uint32_t max() { return std::numeric_limits<uint32_t>::max() - 1; }
  uint32_t operator()() { return dns_random(std::numeric_limits<uint32_t>::max()); }
}; 

Though using it in misc.cc will introduce some linker errors building the tools that have to be sorted out.

@omoerbeek
Copy link
Member

@omoerbeek omoerbeek commented Apr 8, 2020

This is how it could be done. I moved the shuffle functions to dns_random.cc since that avoid linking (indirectly) all auth tools with openssl.
If we don't think they belong in dns_random.hh they should probably move to a separate file.

Also, I'd really like a dns_random() without args that produces the full uint32_t range.

@rgacogne
Copy link
Member

@rgacogne rgacogne commented Apr 8, 2020

If we don't think they belong in dns_random.hh they should probably move to a separate file.

It seems wrong to pull in dnsparser.hh and the likes in the dns_random code indeed (which has not a lot to do with DNS, but meh).

Also, I'd really like a dns_random() without args that produces the full uint32_t range.

We used to have that and it led to parts of the code silently storing the result in a uint16_t, for example, not understanding that truncation is terrible for the entropy of the resulting value.

@omoerbeek
Copy link
Member

@omoerbeek omoerbeek commented Apr 8, 2020

I understand the concerns. It just feels wasteful to always compute de-biased value when we know the full 32-bits could be used used.

@krionbsd
Copy link
Contributor Author

@krionbsd krionbsd commented Apr 8, 2020

I understand the concerns. It just feels wasteful to always compute de-biased value when we know the full 32-bits could be used used.

We might use smth like:

    std::random_device rd;
    std::mt19937 g(rd());
    std::shuffle(v.begin(), v.end(), g);

which makes more sense than static seed of zero but it doesn't solve your wish to centralize this approach. I found out we have another brick in the wall in syncres.cc which utilizes random_shuffle as well.

@omoerbeek
Copy link
Member

@omoerbeek omoerbeek commented Apr 8, 2020

I understand the concerns. It just feels wasteful to always compute de-biased value when we know the full 32-bits could be used used.

We might use smth like:

    std::random_device rd;
    std::mt19937 g(rd());
    std::shuffle(v.begin(), v.end(), g);

which makes more sense than static seed of zero but it doesn't solve your wish to centralize this approach. I found out we have another brick in the wall in syncres.cc which utilizes random_shuffle as well.

An issue with random devices is that they might not be available in chroots.
Also, it also seems a bit wasteful to instantiate a mersenne twister (which has a lot of state) for just shuffling a few records. So using the existing random source has a big advantage imo.

@omoerbeek
Copy link
Member

@omoerbeek omoerbeek commented Apr 8, 2020

Checking what's going on with the unit test failures.

@omoerbeek
Copy link
Member

@omoerbeek omoerbeek commented Apr 8, 2020

The problem in the recurssor is coming from the a shuffleInSpeedOrder which trips this assertion:

basic_string& operator=(BOOST_RV_REF(basic_string) x)
      BOOST_NOEXCEPT_IF(allocator_traits_type::propagate_on_container_move_assignment::value
                                  || allocator_traits_type::is_always_equal::value)
   {
      //for move constructor, no aliasing (&x != this) is assummed.
      BOOST_ASSERT(this != &x);
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007ffff715c535 in __GI_abort () at abort.c:79
#2  0x00007ffff715c40f in __assert_fail_base (fmt=0x7ffff72beee0 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", 
    assertion=0x555555ac5fec "this != &x", file=0x555555ac5901 "/usr/include/boost/container/string.hpp", line=857, 
    function=<optimized out>) at assert.c:92
#3  0x00007ffff716a102 in __GI___assert_fail (assertion=0x555555ac5fec "this != &x", 
    file=0x555555ac5901 "/usr/include/boost/container/string.hpp", line=857, 
    function=0x555555ac5ff7 "basic_string<CharT, Traits, Allocator> &boost::container::basic_string<char, std::char_traits<char>, boost::container::new_allocator<char> >::operator=(basic_string<CharT, Traits, Allocator> &&) [Char"...) at assert.c:101
#4  0x0000555555a36dab in boost::container::basic_string<char, std::char_traits<char>, boost::container::new_allocator<char> >::operator= (this=<optimized out>, x=...) at /usr/include/boost/container/string.hpp:857
#5  DNSName::operator= (this=<optimized out>) at ./dnsname.hh:62
#6  std::swap<DNSName> (__a=..., __b=...) at /usr/bin/../lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8/bits/move.h:194
#7  0x0000555555a36906 in std::pair<DNSName, float>::swap (this=0x7fffac0cd410, __p=...)
    at /usr/bin/../lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8/bits/stl_pair.h:429
#8  std::swap<DNSName, float> (__x={...}, __y=...)
    at /usr/bin/../lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8/bits/stl_pair.h:497
#9  std::iter_swap<__gnu_cxx::__normal_iterator<std::pair<DNSName, float>*, std::vector<std::pair<DNSName, float>, std::allocator<std::pair<DNSName, float> > > >, __gnu_cxx::__normal_iterator<std::pair<DNSName, float>*, std::vector<std::pair<DNSName, float>, std::allocator<std::pair<DNSName, float> > > > > (__a=..., __b=...)
    at /usr/bin/../lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8/bits/stl_algobase.h:148
#10 std::shuffle<__gnu_cxx::__normal_iterator<std::pair<DNSName, float>*, std::vector<std::pair<DNSName, float>, std::allocator<std::pair<DNSName, float> > > >, pdns::dns_random_engine> (__first=..., __last=..., __g=...)
    at /usr/bin/../lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8/bits/stl_algo.h:3845
#11 0x0000555555a3284c in SyncRes::shuffleInSpeedOrder (this=0x7fffac09abb8, tnameservers=..., Python Exception <class 'gdb.error'> There is no member named _M_dataplus.: 
prefix=) at syncres.cc:1660

@omoerbeek
Copy link
Member

@omoerbeek omoerbeek commented Apr 8, 2020

What seems to be happening is a self-move that boost does not like. I'll comment out the offending syncres.cc change for now But we need to find out what's happening here. maybe this revealed some issue in DNSName?

@omoerbeek
Copy link
Member

@omoerbeek omoerbeek commented Apr 8, 2020

Rec bulk test still fails., with same assertion, but now coming from orderAndShuffle()

@omoerbeek
Copy link
Member

@omoerbeek omoerbeek commented Apr 8, 2020

not using boost::container::string in DNSName makes the problem go away.

@rgacogne
Copy link
Member

@rgacogne rgacogne commented Apr 8, 2020

Note that this assertion is gone in boost's trunk: boostorg/container@5f0ce60

@rgacogne
Copy link
Member

@rgacogne rgacogne commented Apr 8, 2020

What seems to be happening is a self-move that boost does not like. I'll comment out the offending syncres.cc change for now But we need to find out what's happening here. maybe this revealed some issue in DNSName?

It doesn't look like an issue in DNSName, AFAICT, most likely the algorithm is trying to swap an entry with itself (which is weird..) triggering a self-move assignment?

@omoerbeek
Copy link
Member

@omoerbeek omoerbeek commented Apr 8, 2020

What seems to be happening is a self-move that boost does not like. I'll comment out the offending syncres.cc change for now But we need to find out what's happening here. maybe this revealed some issue in DNSName?

It doesn't look like an issue in DNSName, AFAICT, most likely the algorithm is trying to swap an entry with itself (which is weird..) triggering a self-move assignment?

Agreed. I'm trying to push a workaround that avoids the self-assignment in DNSName buy github is giving me permission denied

@omoerbeek omoerbeek mentioned this pull request Apr 8, 2020
7 tasks
@omoerbeek
Copy link
Member

@omoerbeek omoerbeek commented Apr 8, 2020

Continued in own branch in #9016

@omoerbeek omoerbeek closed this Apr 22, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants