leskin-in
left a comment
Note https://github.com/greenplum-db/gpdb/pull/10450 merges the commit you mentioned into 6X.
src/backend/cdb/motion/tupleremap.c
Outdated
I suppose an extra if (value == 0) should be added before the call to BuildTupleRemapInfo(). In the case in question, we do not need the result of the method, and it does not have any side-effects (except for draining space from remapper->mycontext).
Hmm, I cannot find this commit. Could you point to it?
@maksm90 can I ask you to do some additional research into why the current error does not occur on master?
You are right, the storage of external parameters has been refactored in the master branch since 6X: now we store them in QueryDispatchDesc. As a result, serialization was also refactored: on 6X we use serializeParamListInfo(), on master serializeParamsForDispatch(). But I don't see a big difference in the code here. The same goes for deserialization: deserializeParamListInfo() on 6X and deserializeExternParams() on master are very much alike, while TRRemapDatum() is identical on 6X and master (everything works without your current patch). So maybe we should look for the problem in serialization rather than add an extra check on the deserialization side?
Sure
Serialized representations of external and internal parameters are stored separately
Similarly deserialization of external and internal parameters on QE side are separated. But deserialization procedure for internal parameters doesn't use buggy |
darthunix
left a comment
@maksm90 thank you for the explanation above. I think it would be a very invasive modification to port the master branch refactoring (https://github.com/greenplum-db/gpdb/commit/53d12bd56fd124fa1b0bcd0d72ff7cf69f0bd441), so it is better to fix the problem with small modifications, as you have suggested.
Yes, I was incorrect about that. The code in question is not in the current 6X_STABLE.
The master node dispatches external and internal (gathered from the plan tree) parameters along with the query plan. Internal parameters might include uninitialized zero values, e.g., those used for storing the results of not yet evaluated initPlans. Those parameters are transmitted as zero values, which causes a segfault on segments during deserialization of a complex type value. The current fix intercepts the handling of a zero-value parameter on the QE side before any further deserialization.
Problem description: After sequential execution of isolation2 tests 'standby_replay_dtx_info' and 'ao_unique_index' the coordinator's standby postmaster process together with its children processes were terminated. Root cause: Test 'standby_replay_dtx_info' sets fault injection 'standby_gxacts_overflow' on coordinator's standby, which updates the global var 'max_tm_gxacts' (the limit of distributed transactions) to 1, but at the reset of this fault the value of 'max_tm_gxacts' was not updated to its original value. Therefore, on any next test that created more than 2 distributed transactions that were replayed on the standby, the standby encountered the fatal error "the limit of 1 distributed transactions has been reached" and it was terminated. Fix: Set 'max_tm_gxacts' to its original value when fault injection 'standby_gxacts_overflow' is not set. (cherry picked from commit 423cc57b779bfb8f048f47425b428091a7d959a9)
Problem description
The master node dispatches external and internal (gathered from the plan tree) parameters along with the query plan. Internal ones might include uninitialized values, e.g., those used for storing the results of not yet evaluated initPlans. Those parameters are transmitted as zero values, which causes a segfault on segments during deserialization of a complex type value.
The current fix intercepts the handling of a zero-value parameter on the QE side before any further deserialization.
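To illustrate the idea of the fix, here is a minimal, self-contained sketch in C. It is not the Greenplum code: `SketchDatum`, `SketchRecord`, `sketch_record_length()` and `sketch_remap_datum()` are hypothetical names invented for this example. It only assumes that a complex (pass-by-reference) value is carried as a pointer-sized integer, so a zero value would be dereferenced as a NULL pointer unless it is intercepted first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a datum: a pointer-sized integer. */
typedef uintptr_t SketchDatum;

/* Hypothetical pass-by-reference value with a length header. */
typedef struct SketchRecord
{
    uint32_t len;         /* number of payload bytes in use */
    char     payload[16]; /* payload storage */
} SketchRecord;

/* Reads the header of a by-reference value; this dereferences the datum,
 * so it must never be called with a zero datum. */
static uint32_t
sketch_record_length(SketchDatum value)
{
    const SketchRecord *rec = (const SketchRecord *) value;

    return rec->len;
}

/*
 * Hypothetical remap step. A zero datum stands for a parameter that has
 * not been evaluated yet, so it is returned untouched before anything
 * tries to look inside it. *inspected is set to 1 only when the value
 * was actually dereferenced.
 */
static SketchDatum
sketch_remap_datum(SketchDatum value, int *inspected)
{
    assert(inspected != NULL);
    *inspected = 0;

    if (value == 0)
        return value;     /* unset parameter: nothing to remap */

    (void) sketch_record_length(value); /* safe: value is non-zero here */
    *inspected = 1;
    return value;
}
```

Calling `sketch_remap_datum((SketchDatum) 0, &flag)` returns 0 and leaves `flag` at 0, while a datum pointing at a real `SketchRecord` is inspected as usual. Placing the check at the very top keeps the guard independent of whatever the later deserialization steps do.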
Scenario to reproduce
The faulty query must have the following plan:
The internal executor parameter here is $0, which is used in the “One-Time Filter” of the Result node.
Affected versions
On the master branch, the dispatching of executor parameters was heavily refactored in commit 53d12bd. In particular, it separates external and internal parameters in the serialized representation of the transmitted data and separates the deserialization procedures for these parameters. The master branch works correctly in this context.
6X and 5X are affected by this error.