Squash merge of the multithreaded PM hash table construction #920

Merged
merged 2 commits into from Nov 5, 2019

Conversation

pleblanc1976
Contributor

I noticed a few diffs vs baseline in the working_tpch1_compareLogOnly/windowFunctions tests, but they have nothing to do with the join code.

This was written initially as a proof of concept. I've improved it since. Feel free to reject if there is something that needs further cleaning. :D Huh. GitHub has emojis? 💃

Squashed commit of the following:

commit fe4cc37
Author: Patrick LeBlanc patrick.leblanc@mariadb.com
Date: Fri Nov 1 13:38:11 2019 -0400

Added some code comments to the new join code.

commit a7a82d0
Author: Patrick LeBlanc patrick.leblanc@mariadb.com
Date: Fri Nov 1 13:17:47 2019 -0400

Fixed an error down a path I think is unused.

commit 4e6c7c2
Author: Patrick LeBlanc patrick.leblanc@mariadb.com
Date: Fri Nov 1 13:12:12 2019 -0400

std::atomic doesn't exist in C7, -> boost::atomic.

commit ed0996c
Author: Patrick LeBlanc patrick.leblanc@mariadb.com
Date: Wed Oct 16 12:47:32 2019 -0500

Addition to the previous fix (join dependency projection).

commit 97bb806
Author: Patrick LeBlanc patrick.leblanc@mariadb.com
Date: Tue Oct 15 15:22:09 2019 -0500

Found and fixed a bad mem access, which may have been there for 8 years.

commit d8b0432
Author: Patrick LeBlanc patrick.leblanc@mariadb.com
Date: Tue Oct 15 14:04:48 2019 -0500

Minor optimization in some code I happened to look at.

commit b6ec820
Author: Patrick LeBlanc patrick.leblanc@mariadb.com
Date: Tue Oct 15 14:04:11 2019 -0500

Fixed a compiler warning.

commit 0bf3e52
Author: Patrick LeBlanc patrick.leblanc@mariadb.com
Date: Tue Oct 15 10:11:09 2019 -0500

Undid part of the previous commit.

commit 5dfa1d2
Author: Patrick LeBlanc patrick.leblanc@mariadb.com
Date: Mon Oct 14 18:00:21 2019 -0500

Proofread the diff vs base, added some comments, removed some debugging stuff.

commit 411fd95
Author: Patrick LeBlanc patrick.leblanc@mariadb.com
Date: Fri Oct 11 13:55:39 2019 -0500

If a dev build (SKIP_OAM_INIT), made postConfigure exit before trying
to start the system, because that won't work.

commit 634b1b8
Author: Patrick LeBlanc patrick.leblanc@mariadb.com
Date: Mon Sep 30 14:55:45 2019 -0500

Reduced crit section of BPP::addToJoiner a little.

commit 31f30c6
Author: Patrick LeBlanc patrick.leblanc@mariadb.com
Date: Wed Sep 18 11:09:27 2019 -0500

Checkpointing.  make the add joiner stuff free tmp mem quickly.

commit 9b7e788
Author: Patrick LeBlanc patrick.leblanc@mariadb.com
Date: Wed Sep 18 10:38:57 2019 -0500

Checkpoint.  Removed tmp hardcoding of bucket count.

commit fda4d8b
Author: Patrick LeBlanc patrick.leblanc@mariadb.com
Date: Wed Sep 18 10:20:09 2019 -0500

Checkpoint.  Adjusted unproductive loop wait time.

commit 7b9a67d
Author: Patrick LeBlanc patrick.leblanc@mariadb.com
Date: Wed Sep 18 10:10:43 2019 -0500

Checkpointing add'l optimizations.

If we promote bpp::processorThreads / bucket count to a power of 2, we can
use a bitmask instead of a mod operation to decide a bucket.

Also, boosted utilization by not waiting for a bucket lock to become free.
There are likely more gains to be had there with a smarter strategy.
Maybe have each thread generate a random bucket access pattern to reduce
chance of collision.  TBD.
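
The bitmask idea in the commit message above can be sketched roughly as follows. This is only an illustration, not the code from this PR: `roundUpToPowerOf2` and `bucketFor` are hypothetical names, and the sketch assumes the bucket count fits in 31 bits.

```cpp
#include <cstdint>

// Hypothetical helper: round n up to the next power of two.
// Assumes 1 <= n <= 2^31 so the shift cannot overflow.
inline uint32_t roundUpToPowerOf2(uint32_t n)
{
    uint32_t p = 1;
    while (p < n)
        p <<= 1;
    return p;
}

// With a power-of-two bucket count, (hash & (count - 1)) gives the same
// result as (hash % count) without a division.
inline uint32_t bucketFor(uint64_t hash, uint32_t bucketCount)
{
    return static_cast<uint32_t>(hash & (bucketCount - 1));
}
```

For example, a thread count of 12 would be promoted to 16 buckets, and `bucketFor(h, 16)` then keeps only the low four bits of `h`.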

commit abe7dab
Author: Patrick LeBlanc patrick.leblanc@mariadb.com
Date: Tue Sep 17 16:15:51 2019 -0500

Multithreaded PM hash table construction likely works here.

A couple more fixes.
 - missed a mod after a hash in one place.
 - Made the PoolAllocator thread safe (small degree of performance hit
   there in threaded env).  May need to circle back to the table
   construction code to eliminate contention for the allocators instead.

commit ab30876
Author: Patrick LeBlanc patrick.leblanc@mariadb.com
Date: Tue Sep 17 12:14:14 2019 -0500

Checkpointing.  Did some initial testing, fixed a couple things.

Not done testing yet.

commit 3b161d7
Author: Patrick LeBlanc patrick.leblanc@mariadb.com
Date: Tue Sep 17 11:24:55 2019 -0500

Checkpointing.  First cut of multithreaded PM join table building.

Builds but is untested.

commit cb7e6e1
Author: Patrick LeBlanc patrick.leblanc@mariadb.com
Date: Mon Sep 16 13:03:50 2019 -0500

Increase the STLPoolAllocator window size to reduce destruction time.

commit b0ddaaa
Author: Patrick LeBlanc patrick.leblanc@mariadb.com
Date: Fri Sep 13 11:52:51 2019 -0500

Fixed a bug preventing parallel table loading.  works now.

commit b870396
Author: Patrick LeBlanc patrick.leblanc@mariadb.com
Date: Thu Sep 12 22:04:15 2019 -0500

Checkpointing some experimental changes.

 - Made the allocator type used by PM joins the STLPoolAllocator
 - Changed the default chunk size used by STLPoolAlloc based on a few test
    runs
 - Made BPP-JL interleave the PM join data by join # to take advantage
    of new locking env on PM.
 - While I was at it, fixed MCOL-1758.

commit fd4b09c
Author: Patrick LeBlanc patrick.leblanc@mariadb.com
Date: Thu Sep 12 16:03:30 2019 -0500

Speculative change.  Row estimator was stopping at 20 extents.

Removed that limitation.

commit 7dcdd5b
Author: Patrick LeBlanc patrick.leblanc@mariadb.com
Date: Thu Sep 5 09:10:28 2019 -0500

Inlined some hot simpleallocator fcns.

commit 6d84dac
Author: Patrick LeBlanc patrick.leblanc@mariadb.com
Date: Wed Sep 4 17:02:29 2019 -0500

Some optimizations to PM hash table creation.

- made locks more granular.
- reduced logic per iteration when adding elements.

commit b20bf54
Author: Patrick LeBlanc patrick.leblanc@mariadb.com
Date: Wed Sep 4 15:32:32 2019 -0500

Reduced granularity of djLock in PrimProc.

commit 6273a8f
Author: Patrick LeBlanc patrick.leblanc@mariadb.com
Date: Wed Sep 4 14:45:58 2019 -0500

Added a timer to PM hash table construction

signal USR1 will print cumulative wall time to stdout & reset the timer.
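
A timer of the kind described in the last commit message could look something like the sketch below. It is an assumption-laden illustration rather than the PR's implementation: `buildTimeUsec`, `addBuildTime`, `onUsr1`, and `installTimerSignal` are hypothetical names, the unit (microseconds) is chosen arbitrarily, and `std::atomic` is used on the assumption that it is lock-free for 64-bit integers on the target platform.

```cpp
#include <atomic>
#include <csignal>
#include <cstdint>
#include <unistd.h>

// Hypothetical accumulator: cumulative wall time spent building hash tables.
static std::atomic<uint64_t> buildTimeUsec(0);

// Called by worker code after timing one build step.
inline void addBuildTime(uint64_t usec)
{
    buildTimeUsec.fetch_add(usec, std::memory_order_relaxed);
}

// SIGUSR1 handler: print the cumulative time to stdout and reset it.
// Output goes through write() because printf is not async-signal-safe.
extern "C" void onUsr1(int)
{
    uint64_t total = buildTimeUsec.exchange(0);
    char buf[32];
    int pos = sizeof(buf);
    buf[--pos] = '\n';
    do
    {
        buf[--pos] = static_cast<char>('0' + total % 10);
        total /= 10;
    } while (total != 0);
    ssize_t rc = write(STDOUT_FILENO, buf + pos, sizeof(buf) - pos);
    (void) rc;  // nothing useful can be done about a failed write here
}

inline void installTimerSignal()
{
    std::signal(SIGUSR1, onUsr1);
}
```

With something like this in place, `kill -USR1 <pid>` from a shell would print the running total and start a fresh measurement window.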

}
// if this iteration did no useful work, everything we need is locked; wait briefly
// and try again.
if (!done && !didSomeWork)
Contributor

Would a condition variable with a timeout be better? (not that it really matters)

Contributor Author

Not sure what you have in mind. Do you mean using those instead of a 'back-off' timer when no work could be done? There are a lot of ways to improve this if we want. I would suggest tickets with ideas for now, which we can go back to after we've ticked the boxes on the release.
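
For readers following along, the 'back-off' pattern under discussion can be sketched like this. It is a simplified stand-in, not the code from the diff: `Bucket` and `drainIntoBuckets` are hypothetical names, and the 500-microsecond wait is an arbitrary value chosen for the example.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical bucket: a lock plus the values stored under it.
struct Bucket
{
    std::mutex lock;
    std::vector<uint64_t> values;
};

// Move pending[i] into buckets[i] for every i, skipping buckets whose lock
// is currently held by another thread. pending.size() must equal buckets.size().
inline void drainIntoBuckets(std::vector<std::vector<uint64_t>>& pending,
                             std::vector<Bucket>& buckets)
{
    bool done = false;
    while (!done)
    {
        done = true;
        bool didSomeWork = false;
        for (std::size_t i = 0; i < buckets.size(); ++i)
        {
            if (pending[i].empty())
                continue;
            if (buckets[i].lock.try_lock())
            {
                // Adopt the lock we already hold so it is released on scope exit.
                std::lock_guard<std::mutex> guard(buckets[i].lock, std::adopt_lock);
                buckets[i].values.insert(buckets[i].values.end(),
                                         pending[i].begin(), pending[i].end());
                pending[i].clear();
                didSomeWork = true;
            }
            else
                done = false;  // data remains for a bucket that was busy
        }
        // Every bucket we still need was locked: wait briefly and try again.
        if (!done && !didSomeWork)
            std::this_thread::sleep_for(std::chrono::microseconds(500));
    }
}
```

A condition variable with a timeout would replace the `sleep_for` call with a wait that another thread can cut short when it releases a bucket, at the cost of extra signalling on every unlock.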

@drrtuy
Collaborator

drrtuy commented Nov 4, 2019

Our develop code is supposed to be built with a C++11-capable compiler, so we should avoid boost elements as much as possible. CentOS 7 ships GCC 4.8.5 by default, so it needs an explicit flag to enable C++11 support.
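
As a minimal sketch of what that implies for the `boost::atomic` change: with C++11 enabled, `std::atomic` from `<atomic>` is available. The counter below is hypothetical (`rowsInserted` and `noteRowsInserted` are illustration-only names, not identifiers from this PR).

```cpp
// Build with C++11 enabled, e.g. g++ -std=c++11 (GCC 4.8.5 defaults to an older standard).
#include <atomic>
#include <cstdint>

// Hypothetical counter shared by several builder threads.
static std::atomic<uint64_t> rowsInserted(0);

// Adds n to the counter and returns the new total as seen by this call.
inline uint64_t noteRowsInserted(uint64_t n)
{
    // fetch_add returns the value held before the addition.
    return rowsInserted.fetch_add(n, std::memory_order_relaxed) + n;
}
```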

memUsage += size;
memInfo.mem.reset(new uint8_t[size]);
Collaborator

What happens if new fails? Do we have a try-catch on top of this?

Contributor Author

In this case, if nothing else then threadpool or BPPSeeder will catch it. I don't remember off the top of my head if/how the query is aborted at that point.
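
For context, a plain `new[]` reports failure by throwing `std::bad_alloc`, which is what an enclosing handler would have to catch. A local guard could be sketched as below; `tryAllocate` is a hypothetical helper, and `std::unique_ptr` is used here only for the example, since the smart-pointer type behind `memInfo.mem` is not shown in this thread.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <new>

// Hypothetical wrapper: returns false instead of letting std::bad_alloc
// propagate to the caller. On failure, 'out' is left unchanged.
inline bool tryAllocate(std::size_t size, std::unique_ptr<uint8_t[]>& out)
{
    try
    {
        out.reset(new uint8_t[size]);
        return true;
    }
    catch (const std::bad_alloc&)
    {
        return false;
    }
}
```

Whether catching locally is preferable to letting the exception reach a higher-level handler depends on how the query is supposed to be aborted, which the thread leaves open.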

- if ((extentsSampled >= fExtentsToSample) && (idx > 0))
+ // XXXPAT: Modified this fcn to sample all extents. Leaving this here due to level of arcana
+ // involved. :)
+ if (false && (extentsSampled >= fExtentsToSample) && (idx > 0))
Collaborator

We shouldn't forget to clean up this and all other commented-out blocks later.

pthread_mutex_unlock(&objLock);
return 0;
}

if (ot == ROW_GROUP)
for (i = 0; i < joinerCount; i++)
{
if (!typelessJoin[i])
Collaborator

I might have missed the difference b/w the if-else code blocks but do we need this if-else if they are the same?

Contributor Author

they're only 99.99% identical. :D One block references the 'typeless' structures, and the other references the normal kind (tlJoiners vs tJoiners).

Collaborator

@drrtuy drrtuy left a comment

I would suggest avoiding boost in the new code b/c I envision the happy moment when we don't have boost as a dependency.

@pleblanc1976 pleblanc1976 merged commit 0023277 into mariadb-corporation:develop Nov 5, 2019
@pleblanc1976 pleblanc1976 deleted the newPMjoin branch November 5, 2019 16:09