Replace Refcounted with std::shared_ptr #845

Closed
wants to merge 3 commits into from

Conversation

dymk
Contributor

@dymk dymk commented Dec 8, 2017

Replaces the intrusive yarpl::Refcounted with std::shared_ptr. A few more classes now inherit from enable_get_ref.

Benchmarks show a roughly 3-5% perf improvement with std::shared_ptr (StreamThroughput 8.10s → 7.82s, FireForgetThroughput 57.80s → 54.77s per iteration). Any difference in memory usage should be measured via canary (this diff introduces 1 word of overhead per object).
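For readers unfamiliar with the two styles, here is a minimal sketch of the difference. IntrusiveCounted is a hypothetical stand-in, not the actual yarpl::Refcounted API; the method names incRef/decRef and both node types are assumptions made for illustration only.

```cpp
#include <atomic>
#include <cassert>
#include <memory>

// Hypothetical intrusive base: the count lives inside the object itself.
class IntrusiveCounted {
 public:
  virtual ~IntrusiveCounted() = default;
  void incRef() { count_.fetch_add(1, std::memory_order_relaxed); }
  void decRef() {
    // fetch_sub returns the previous value; 1 means this was the last reference.
    if (count_.fetch_sub(1, std::memory_order_acq_rel) == 1) {
      delete this;
    }
  }

 private:
  std::atomic<int> count_{0};
};

// Both node types record their destruction through a caller-owned flag.
struct IntrusiveNode : IntrusiveCounted {
  explicit IntrusiveNode(bool* destroyed) : destroyed_(destroyed) {}
  ~IntrusiveNode() override { *destroyed_ = true; }
  bool* destroyed_;
};

struct PlainNode {
  explicit PlainNode(bool* destroyed) : destroyed_(destroyed) {}
  ~PlainNode() { *destroyed_ = true; }
  bool* destroyed_;
};

inline void refcount_demo() {
  // Intrusive style: the caller adjusts the count by hand.
  bool intrusive_gone = false;
  IntrusiveNode* raw = new IntrusiveNode(&intrusive_gone);
  raw->incRef();
  raw->decRef();  // count reaches zero, the object deletes itself
  assert(intrusive_gone);

  // shared_ptr style: the count lives in a separate control block
  // and is adjusted by copying and destroying the handles.
  bool shared_gone = false;
  {
    auto sp = std::make_shared<PlainNode>(&shared_gone);
    auto copy = sp;
    assert(sp.use_count() == 2);
  }
  assert(shared_gone);
}
```

With std::shared_ptr the object type needs no special base class unless it has to hand out references to itself.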

Original Reference implementation

Benchmarks.cpp:11] Running benchmarks... (takes minutes)
StreamThroughputTcp.cpp:45] Running:
StreamThroughputTcp.cpp:46]   Server with 8 threads.
StreamThroughputTcp.cpp:47]   10 clients across 10 threads.
StreamThroughputTcp.cpp:49]   Running 1 streams of 1000000 items each.
============================================================================
/Users/dymk/code/c-c++/rsocket-cpp/benchmarks/StreamThroughputTcp.cpp  relative  time/iter  iters/s
============================================================================
StreamThroughput                                              8.10s  123.40m
============================================================================
        8.54 real        18.39 user        28.62 sys

Benchmarks.cpp:11] Running benchmarks... (takes minutes)
FireForgetThroughputTcp.cpp:55] Running:
FireForgetThroughputTcp.cpp:56]   Server with 8 threads.
FireForgetThroughputTcp.cpp:57]   10 clients across 10 threads.
FireForgetThroughputTcp.cpp:59]   Running 1000000 requests in total.
============================================================================
/Users/dymk/code/c-c++/rsocket-cpp/benchmarks/FireForgetThroughputTcp.cpp  relative  time/iter  iters/s
============================================================================
FireForgetThroughput                                         57.80s   17.30m
============================================================================
       58.24 real        41.56 user       147.67 sys

std::shared_ptr:

Benchmarks.cpp:11] Running benchmarks... (takes minutes)
StreamThroughputTcp.cpp:45] Running:
StreamThroughputTcp.cpp:46]   Server with 8 threads.
StreamThroughputTcp.cpp:47]   10 clients across 10 threads.
StreamThroughputTcp.cpp:49]   Running 1 streams of 1000000 items each.
============================================================================
/Users/dymk/code/c-c++/rsocket-cpp/benchmarks/StreamThroughputTcp.cpp  relative  time/iter  iters/s
============================================================================
StreamThroughput                                              7.82s  127.95m
============================================================================
        8.20 real        19.17 user        28.43 sys

Benchmarks.cpp:11] Running benchmarks... (takes minutes)
FireForgetThroughputTcp.cpp:55] Running:
FireForgetThroughputTcp.cpp:56]   Server with 8 threads.
FireForgetThroughputTcp.cpp:57]   10 clients across 10 threads.
FireForgetThroughputTcp.cpp:59]   Running 1000000 requests in total.
============================================================================
/Users/dymk/code/c-c++/rsocket-cpp/benchmarks/FireForgetThroughputTcp.cpp  relative  time/iter  iters/s
============================================================================
FireForgetThroughput                                         54.77s   18.26m
============================================================================
       55.21 real        41.35 user       155.63 sys

@dymk dymk changed the title from "Remove custom Refcounted type" to "Replace Refcounted with std::shared_ptr" Dec 8, 2017
@dymk dymk force-pushed the remove-refcounted branch 9 times, most recently from d4ea30b to 9451968 Compare December 11, 2017 22:40
@@ -258,7 +258,7 @@ void coldResumer(uint32_t port, uint32_t client_num) {
}
}

-TEST(ColdResumptionTest, SuccessfulResumption) {
+TEST(ColdResumptionTest, DISABLED_SuccessfulResumption) {
Contributor Author

This test was constantly flaking on me. Disabling it until we can root-cause why it's flaky.

Contributor

@lexs lexs left a comment

Just a comment.

message("Compiler lacks support std::atomic<std::shared_ptr>; wrapping with a mutex")
elseif(YARPL_WRAP_SHARED_IN_ATOMIC)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -DYARPL_WRAP_SHARED_IN_ATOMIC")
message("Compiler lacks std::shared_ptr atomic overloads; wrapping in std::atomic")
Contributor

What is the difference between this and the above? Both mention the same thing lacking but different workarounds?

Contributor Author

Newer compilers/libstdc++ provide std::atomic_{exchange,load,store} overloads for std::shared_ptr<T> directly, so the std::atomic structure never shows up. If those aren't supported, the code falls back to wrapping a std::shared_ptr in a std::atomic structure. If that isn't supported either (my local clang doesn't like it), it falls back further to locking with a mutex and shimming {exchange,load,store}.
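As a rough illustration of the first and last tiers described above, here is a minimal sketch. The class name AtomicSharedRef and the macro SKETCH_USE_MUTEX_FALLBACK are hypothetical and are not taken from this diff; the middle tier (wrapping in std::atomic) is left out. The free functions std::atomic_load, std::atomic_store and std::atomic_exchange for std::shared_ptr come from <memory>.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <utility>

// Hypothetical switch: define it to force the mutex-based shim.
// #define SKETCH_USE_MUTEX_FALLBACK

template <typename T>
class AtomicSharedRef {
 public:
  AtomicSharedRef() = default;
  explicit AtomicSharedRef(std::shared_ptr<T> p) : ptr_(std::move(p)) {}

#ifndef SKETCH_USE_MUTEX_FALLBACK
  // Tier 1: the standard library's shared_ptr overloads of the atomic free functions.
  std::shared_ptr<T> load() const { return std::atomic_load(&ptr_); }
  void store(std::shared_ptr<T> p) { std::atomic_store(&ptr_, std::move(p)); }
  std::shared_ptr<T> exchange(std::shared_ptr<T> p) {
    return std::atomic_exchange(&ptr_, std::move(p));
  }
#else
  // Last tier: shim the same three operations with a mutex.
  std::shared_ptr<T> load() const {
    std::lock_guard<std::mutex> guard(mutex_);
    return ptr_;
  }
  void store(std::shared_ptr<T> p) {
    std::lock_guard<std::mutex> guard(mutex_);
    ptr_ = std::move(p);
  }
  std::shared_ptr<T> exchange(std::shared_ptr<T> p) {
    std::lock_guard<std::mutex> guard(mutex_);
    ptr_.swap(p);  // p now holds the previous value
    return p;
  }
#endif

 private:
  std::shared_ptr<T> ptr_;
#ifdef SKETCH_USE_MUTEX_FALLBACK
  mutable std::mutex mutex_;
#endif
};

inline void atomic_ref_demo() {
  AtomicSharedRef<int> ref(std::make_shared<int>(1));
  auto old = ref.exchange(std::make_shared<int>(2));
  assert(*old == 1);
  assert(*ref.load() == 2);
}
```

Callers see the same load/store/exchange interface either way, so the choice of tier can stay a build-time detail.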

@@ -9,7 +9,7 @@ namespace flowable {

class Subscription : public virtual Refcounted {
public:
-virtual ~Subscription() = default;
+// virtual ~Subscription() = default;
Contributor

remove?

@phoad
Member

phoad commented Dec 12, 2017

So what side effects do we see when we use shared_ptr? Did we see any memory usage increase etc., as @vjn mentioned?

Is it possible to get some numbers that you can put in a potential post announcing the replacement of Refcounted with shared_ptr?

We can test this at the Presence Service and Ads Canary. They provide good stats.
When @alexmalyshev completes the land, we can take this diff without even checking it in to GitHub and start these tests. We had a 20% memory improvement for Presence Service and no degradation for Ads, so we can validate that it's still the case.

@dymk
Contributor Author

dymk commented Dec 12, 2017

Added benchmark numbers, @phoad. Looks like a perf win (or at least no regression). One word of overhead per object likely won't show up much; where it does, the extra word may just consume existing padding?

Either that or, more likely, shared_ptr is just better optimized by the compiler/stdlib.

@@ -156,10 +156,35 @@ set(GMOCK_LIBS
set(CMAKE_CXX_STANDARD 14)

# Common configuration for all build modes.
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -std=c++14")
Contributor

Huh, I'm surprised we didn't already have this, but sure, makes sense.

Contributor

I think this flag is already passed in from Travis. Do we need to set it explicitly in CMake?

set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wall -Wextra -pedantic")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Woverloaded-virtual")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -g")

# does the compiler support std::atomic_load(&std::shared_ptr<T>) ?
try_compile(HAS_NATIVE_ATOMIC_SHARED_PTR ${CMAKE_BINARY_DIR}/has_shared_ptr_support_rs
${CMAKE_CURRENT_SOURCE_DIR}/yarpl/test/test_has_shared_ptr_support.cpp)
Contributor

You could use https://cmake.org/cmake/help/v3.0/module/CheckCXXSourceCompiles.html to write the C++ code inline here if you want. I wouldn't block the change on it though.


-class enable_get_ref {
+class enable_get_ref : public std::enable_shared_from_this<enable_get_ref> {
Contributor

Why do we need this as opposed to good ol' fashioned std::enable_shared_from_this<T>? Is it because we need some base class to be the shared virtual class?

Contributor Author

Yep, we just need something to be the inherited class (not even virtual).
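To make the exchange above concrete, here is a minimal sketch of a single non-template base that derives from std::enable_shared_from_this and hands out typed references to derived classes. Only the class declaration line comes from the diff; the helper name ref_from_this, the self() accessor and this cut-down Subscription are assumptions for illustration.

```cpp
#include <cassert>
#include <memory>

class enable_get_ref : public std::enable_shared_from_this<enable_get_ref> {
 public:
  virtual ~enable_get_ref() = default;

 protected:
  // Hypothetical helper: returns a shared_ptr to this object typed as Derived.
  // The pointer argument is only used to deduce Derived.
  template <typename Derived>
  std::shared_ptr<Derived> ref_from_this(Derived* self) {
    (void)self;
    // static_pointer_cast works here because the inheritance is non-virtual.
    return std::static_pointer_cast<Derived>(shared_from_this());
  }
};

// Cut-down stand-in for a class that needs references to itself.
class Subscription : public enable_get_ref {
 public:
  std::shared_ptr<Subscription> self() { return ref_from_this(this); }
  void cancel() { cancelled_ = true; }
  bool cancelled() const { return cancelled_; }

 private:
  bool cancelled_ = false;
};

inline void get_ref_demo() {
  auto sub = std::make_shared<Subscription>();
  auto again = sub->self();  // shares ownership with sub
  assert(again == sub);
  again->cancel();
  assert(sub->cancelled());
}
```

Because every class shares the one enable_shared_from_this<enable_get_ref> base, a deeper hierarchy does not end up with several competing enable_shared_from_this bases.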

Contributor

@alexmalyshev alexmalyshev left a comment

Gonna bless this.

@dymk dymk force-pushed the remove-refcounted branch from 9451968 to 35ed41a Compare December 14, 2017 19:42
@facebook-github-bot facebook-github-bot left a comment

@lexs has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

facebook-github-bot pushed a commit that referenced this pull request Jan 3, 2018
Summary: Manual import/rebase of #845

Reviewed By: lehecka

Differential Revision: D6620084

fbshipit-source-id: ba3e44960c1cc18318767ca479e1e35449c0d51b
facebook-github-bot pushed a commit to facebook/fbthrift that referenced this pull request Jan 3, 2018
Summary: Manual import/rebase of rsocket/rsocket-cpp#845

Reviewed By: lehecka

Differential Revision: D6620084

fbshipit-source-id: ba3e44960c1cc18318767ca479e1e35449c0d51b
@dymk
Contributor Author

dymk commented Jan 3, 2018

Pulled in via b3f1927

@dymk dymk closed this Jan 3, 2018
6 participants