New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PHPC-1179: Reimplement tests that start servers with Mongo Orchestration #913
Conversation
tests/utils/tools.php
Outdated
function failGetMore(MongoDB\Driver\Manager $manager) | ||
{ | ||
configureFailPoint($manager, 'failReceivedGetmore', 'alwaysOn'); | ||
configureFailPoint($manager, 'failCommand', 'alwaysOn', [ 'errorCode' => 237, 'failCommands' => ['getMore'] ]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We use 237
here, as that's the same original code that MongoD would throw if a cursor had already gone by the time we call getMore
. This allows us to make things consistent with the getMore OP behaviour from previous mongod versions. An errorCode
is required here for the failPoint to work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like something worth adding to a block comment for this line or method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
51421b0
to
4c12923
Compare
3ca1e29
to
da857a9
Compare
php_phongo.h
Outdated
@@ -97,6 +97,12 @@ typedef enum { | |||
* or command should select ExecutionTimeoutException. */ | |||
#define PHONGO_SERVER_ERROR_EXCEEDED_TIME_LIMIT 50 | |||
|
|||
/* These constants are used for determining if a server error is due to a | |||
* broken cursor. If so, we select a ConnectionException as this is what we did | |||
* when getMore wasn't a command yet. */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand correctly, a ConnectionException was previously thrown because we experienced an opcode failure instead of a mere command failure (with ok: 0
). I understand the desire for consistency in error reporting, but are we lying to users here?
I would think that a ConnectionException implies that the socket itself is discarded. It's possible that libmongoc may very well have done that for the old getmore failure (I don't know offhand so it bears investigation), but I don't think it would for a failed command.
Both of these errors seem very much like ServerExceptions. Although both ConnectionException and ServerException extend RuntimeException (and may report error labels), users can now rest assured that these error codes come from the server, which makes ServerException more sensible.
With the older getmore opcode failures, there was no server-side error code, so ConnectionException (or a base RuntimeException) would have made sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's why I had a separate commit for just this change. I think being more consistent is more helpful though...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But this comes at a cost of inconsistency with regard to error code reporting. We tell users that that any server-originating error codes are used in exceptions within the ServerException hierarchy.
Am I correct in assuming that the error codes for killed cursors using OP_GETMORE differ from the command's error code? I believe so, as MONGOC_ERROR_CURSOR_INVALID_CURSOR
was simply a sequential value in mongoc_error_code_t
and that is what is used in _mongoc_rpc_check_ok()
when inspecting an OP_REPLY. When the getMore command used, libmongoc now reports the error differently and uses the new server-originating error codes, which also changes how users can detect and differentiate each error.
IMO, the error reporting inconsistency is due to the server's API. Given that, I think the variation of exceptions is excusable based on the wire protocol we're using, and using ServerException is actually an improvement when it comes to distinguishing the errors via their code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've removed this now, and split the two test cases in a pre-3.6 and post-3.6 version.
tests/utils/tools.php
Outdated
$cmd = new MongoDB\Driver\Command($doc); | ||
$result = $manager->executeCommand("admin", $cmd); | ||
} catch (Exception $e) { | ||
/* don't care if the fail point can't be set, it's MongoD version |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you're now skipping tests when you know a failpoint isn't available, wouldn't you still want to keep the original behavior here (var_dump()
and throw on unexpected failure to set a failpoint)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it a programming error to call configureFailPoint()
without first checking that the fail point is supported? I thought that's the reason we have the various skip functions and special behavior like reporting that failpoints are never supported on mongos.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's complicated. Because we need to set two failPoints for getMore, which one is available depends on the exact mongod version. We can either ignore the exception, or, do lots of more complex checking in failGetMore
. I've decided that this now "good enough" for what we need.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I didn't notice the code for failGetMore()
when I left that last comment (assuming it was implemented as such the entire time this PR has been open).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that failGetMore()
only calls itself when setting a fail point should succeed, can we get rid of the exception swallowing? That was one of the reasons for my suggestion to change failGetMore()
in the first place.
tests/utils/tools.php
Outdated
function getMOUri() { | ||
if (!($HOST = getenv("MONGODB_ORCHESTRATION_HOST"))) { | ||
$HOST = "192.168.112.10"; | ||
function getMongoOrchestrationURI() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I reckon we can do away with this method entirely and simply define the appropriate constants in basic.inc
as we do for others. Consider:
define('MONGODB_ORCHESTRATION_HOST', getenv('MONGODB_ORCHESTRATION_HOST') ?: 'localhost');
define('MONGODB_ORCHESTRATION_PORT', getenv('MONGODB_ORCHESTRATION_HOST') ?: '8889');
define('MONGODB_ORCHESTRATION_URI', sprintf('http://%s:%s/v1', MONGODB_ORCHESTRATION_HOST, MONGODB_ORCHESTRATION_PORT));
If the individual host and port values aren't needed as their own constant, you could also use local variables with the same ?:
expression purely to ensure MONGODB_ORCHESTRATION_URI
gets set properly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no reason to have HOST and PORT as specific constants. And I also think that the name MONGODB_ORCHESTRATION_URI
is wrong, as the "DB" isn't part of the "Mongo Orchestration" name. I've updated it.
tests/utils/tools.php
Outdated
function failGetMore(MongoDB\Driver\Manager $manager) | ||
{ | ||
configureFailPoint($manager, 'failReceivedGetmore', 'alwaysOn'); | ||
configureFailPoint($manager, 'failCommand', 'alwaysOn', [ 'errorCode' => 237, 'failCommands' => ['getMore'] ]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like something worth adding to a block comment for this line or method.
f4eeb90
to
2f483a7
Compare
tests/utils/tools.php
Outdated
$cmd = new MongoDB\Driver\Command($doc); | ||
$result = $manager->executeCommand("admin", $cmd); | ||
} catch (Exception $e) { | ||
/* don't care if the fail point can't be set, it's MongoD version |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I didn't notice the code for failGetMore()
when I left that last comment (assuming it was implemented as such the entire time this PR has been open).
51dd9e4
to
4ef43d0
Compare
4ef43d0
to
0f0e65f
Compare
0f0e65f
to
da511cb
Compare
These tests could not have worked since we moved from a Wire Protocol getMore OP to the OP_COMMAND style of calling
getMore
. Beats me how we didn't manage to detect this.The second commit isn't strictly necessary, but it updates the code to be able to throw a
ConnectionException
if getMore fails.Previously, the
getMore
tests use a failPoint for the getMore OP, where the driver would then (throughlibmongoc
) throw aConnectionException
. With the shift to the command-based failPoint,libmongoc
/the driver now converted that to aServerException
. This second commit restores that behaviour. Alternatively, we can just update the test to check for aServerException
instead.