[5.0] extend life of `http_plugin_state` to ensure no invalid memory accesses during nodeos shutdown #2042

spoonincode · 2024-01-05T04:03:11Z

During the destruction of beast_http_session,

leap/plugins/http_plugin/include/eosio/http_plugin/beast_http_session.hpp

Lines 270 to 281 in 04774eb

    
           virtual ~beast_http_session() { 
        
              is_send_exception_response_ = false; 
        
              plugin_state_.requests_in_flight -= 1; 
        
              if(plugin_state_.get_logger().is_enabled(fc::log_level::all)) { 
        
                 auto session_time = steady_clock::now() - session_begin_; 
        
                 auto session_time_us = std::chrono::duration_cast<std::chrono::microseconds>(session_time).count(); 
        
                 plugin_state_.get_logger().log(FC_LOG_MESSAGE(all, "session time    ${t}", ("t", session_time_us))); 
        
                 plugin_state_.get_logger().log(FC_LOG_MESSAGE(all, "        read    ${t}", ("t", read_time_us_))); 
        
                 plugin_state_.get_logger().log(FC_LOG_MESSAGE(all, "        handle  ${t}", ("t", handle_time_us_))); 
        
                 plugin_state_.get_logger().log(FC_LOG_MESSAGE(all, "        write   ${t}", ("t", write_time_us_))); 
        
              } 
        
           }

there are accesses to plugin_state_ which is simply a reference to an object that lives in http_plugin_impl,

leap/plugins/http_plugin/include/eosio/http_plugin/beast_http_session.hpp

Line 86 in 04774eb

http_plugin_state& plugin_state_;

Additionally, notice that beast_http_session's socket_ has been constructed with the io_context from http_plugin_impl::plugin_state's thread_pool. So destruction of the socket_ will also require valid access to the plugin_state object.

But it is possible for a beast_http_session to outlive http_plugin which causes these accesses during destruction to be invalid. Consider a situation where appbase's executor (priority_queue_executor) has a queued callback holding the sole reference count of a shared_ptr<beast_http_session> and appbase is commanded to quit(). appbase's exec() upon exiting its run loop will then call appbase's shutdown(). application_base::shutdown() will then effectively call http_plugin::plugin_shutdown() (stopping http threads) and then destroy http_plugin. However, priority_queue_executor will still be keeping alive a beast_http_session via some pending callback. As appbase continues to destruct eventually priority_queue_executor is destroyed which destroys all pending callbacks. It's at that time the last reference to the shared_ptr<beast_http_session> is removed causing it to be destroyed and expecting access to http_plugin_impl::plugin_state.

But wait! When appbase's io_context stops it first clears its pending queue before application_base::shutdown(), right?
https://github.com/AntelopeIO/appbase/blob/b75b31e14f966fa3de6246e120dcba36c6ce5264/include/appbase/application_base.hpp#L148-L150
afaict this is racy: there is nothing to stop code in some other non-main thread (like http_plugin's threads!) from queueing more callbacks on to appbase's executor between the time exec.clear() is called and, say, http_plugin::shutdown() is called which halts its threads.

Not immediately seeing any good approaches to resolve this issue (especially in a patch release) beyond migrating http_plugin_impl::plugin_state to a shared_ptr so that every beast_http_session can keep it alive as long as it needs to. The socket_ needing http_plugin_impl::plugin_state's thread_pool to remain alive is seemingly what makes this non-negotiable without more significant refactoring.

I am considering this resolving #2022 because while I can't get the exact fatal malloc error to print, the failure I can reproduce here is at the same point in time.

Was unable to reproduce the issue in 3.2 or 4.0. Curiously, those versions already do have a std::shared_ptr<http_plugin_state> so maybe that's why.

plugins/http_plugin/include/eosio/http_plugin/beast_http_session.hpp

linh2931

Like your detailed analysis.

ericpassmore · 2024-01-05T23:33:21Z

Note:start
group: STABILITY
category: INTERNALS
summary: Extend life of http_plugin_state to ensure no invalid memory accesses during nodeos shutdown.
Note:end

ericpassmore · 2024-01-17T03:54:22Z

Note:start
group: STABILITY
category: BUG
summary: Prevent error on shutdown by extending the life of http_plugin_state thereby ensuring no invalid memory accesses.
Note:end

hold a shared_ptr reference to http_plugin_state in beast_http_session

5fcf5b4

heifner approved these changes Jan 5, 2024

View reviewed changes

plugins/http_plugin/include/eosio/http_plugin/beast_http_session.hpp Outdated Show resolved Hide resolved

linh2931 approved these changes Jan 5, 2024

View reviewed changes

greg7mdp approved these changes Jan 5, 2024

View reviewed changes

avoid a copy on a shared_ptr in the ctor initializer list

89fb016

greg7mdp approved these changes Jan 12, 2024

View reviewed changes

heifner approved these changes Jan 13, 2024

View reviewed changes

spoonincode merged commit 76c3b63 into release/5.0 Jan 16, 2024
29 checks passed

spoonincode deleted the http_plugin_state_life_50 branch January 16, 2024 15:36

spoonincode mentioned this pull request Jan 16, 2024

[5.0 -> main] extend life of http_plugin_state to ensure no invalid memory accesses during nodeos shutdown #2088

Merged

spoonincode linked an issue Jan 16, 2024 that may be closed by this pull request

5.0.0-rc3 WAX: Error on shutdown: corrupted size vs. prev_size #2022

Closed

spoonincode mentioned this pull request Jan 16, 2024

5.0.0-rc3 WAX: Error on shutdown: corrupted size vs. prev_size #2022

Closed

ericpassmore mentioned this pull request Jan 17, 2024

[5.0 -> main] fix disabling EOS VM OC subjective limits in tests #2064

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[5.0] extend life of `http_plugin_state` to ensure no invalid memory accesses during nodeos shutdown #2042

[5.0] extend life of `http_plugin_state` to ensure no invalid memory accesses during nodeos shutdown #2042

spoonincode commented Jan 5, 2024

linh2931 left a comment

ericpassmore commented Jan 5, 2024

ericpassmore commented Jan 17, 2024

	virtual ~beast_http_session() {
	is_send_exception_response_ = false;
	plugin_state_.requests_in_flight -= 1;
	if(plugin_state_.get_logger().is_enabled(fc::log_level::all)) {
	auto session_time = steady_clock::now() - session_begin_;
	auto session_time_us = std::chrono::duration_cast<std::chrono::microseconds>(session_time).count();
	plugin_state_.get_logger().log(FC_LOG_MESSAGE(all, "session time ${t}", ("t", session_time_us)));
	plugin_state_.get_logger().log(FC_LOG_MESSAGE(all, " read ${t}", ("t", read_time_us_)));
	plugin_state_.get_logger().log(FC_LOG_MESSAGE(all, " handle ${t}", ("t", handle_time_us_)));
	plugin_state_.get_logger().log(FC_LOG_MESSAGE(all, " write ${t}", ("t", write_time_us_)));
	}
	}

[5.0] extend life of http_plugin_state to ensure no invalid memory accesses during nodeos shutdown #2042

[5.0] extend life of http_plugin_state to ensure no invalid memory accesses during nodeos shutdown #2042

Conversation

spoonincode commented Jan 5, 2024

linh2931 left a comment

Choose a reason for hiding this comment

ericpassmore commented Jan 5, 2024

ericpassmore commented Jan 17, 2024

[5.0] extend life of `http_plugin_state` to ensure no invalid memory accesses during nodeos shutdown #2042

[5.0] extend life of `http_plugin_state` to ensure no invalid memory accesses during nodeos shutdown #2042