Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PHPC-1893: Implement SDAM event classes #1251

Closed
wants to merge 13 commits into from

Conversation

tanlisu
Copy link
Contributor

@tanlisu tanlisu commented Aug 18, 2021

@tanlisu tanlisu requested a review from jmikola August 18, 2021 15:39
src/phongo_apm.c Outdated Show resolved Hide resolved
php_phongo_structs.h Outdated Show resolved Hide resolved
src/MongoDB/Monitoring/ServerOpeningEvent.c Outdated Show resolved Hide resolved
src/MongoDB/Monitoring/ServerOpeningEvent.c Outdated Show resolved Hide resolved
tests/apm/monitoring-serverOpening-001.phpt Show resolved Hide resolved
tests/apm/monitoring-serverOpening-001.phpt Show resolved Hide resolved
@tanlisu tanlisu changed the title PHPC-1893: Implement ServerOpeningEvent class PHPC-1893: Implement ServerOpeningEvent, ServerChangedEvent classes Aug 18, 2021
@tanlisu
Copy link
Contributor Author

tanlisu commented Aug 18, 2021

I just realized that the return types in the function header comments of TopologyChangedEvent::getNewDescription() and TopologyChangedEvent::getPreviousDescription() are wrong:

/* {{{ proto MongoDB\Driver\Monitoring\TopologyChangedEvent TopologyChangedEvent::getNewDescription()

/* {{{ proto MongoDB\Driver\Monitoring\TopologyChangedEvent TopologyChangedEvent::getPreviousDescription()

Should I just fix that here or make a separate PR?

@jmikola
Copy link
Member

jmikola commented Aug 18, 2021

Should I just fix that here or make a separate PR?

Separate PR makes sense. You can probably just edit it directly through GitHub, since it's within the same file: https://github.com/mongodb/mongo-php-driver/blob/feature/sdam-monitoring/src/MongoDB/Monitoring/TopologyChangedEvent.c

Comment on lines +40 to +52
$command = new MongoDB\Driver\Command(['ping' => 1]);
$m->executeCommand(DATABASE_NAME, $command);

?>
===DONE===
<?php exit(0); ?>
--EXPECT--
- getHost() returns a string: yes
- getPort() returns an integer: yes
- getTopologyId() returns an ObjectId: yes
===DONE===
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The monitoring-serverClosed-001.phpt test fails on the GitHub CI (https://github.com/mongodb/mongo-php-driver/pull/1251/checks?check_run_id=3365180493) because I think no ServerClosedEvent is emitted from the 'ping' command - how should I make sure that we get a server closed event?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did this test ever pass locally for you, or was CI the first time you observed it in action?

According to the SDAM Monitoring spec, ServerClosedEvent is:

Published when the server monitor’s connection is closed and the server is shutdown.

You may need to use the disableClientPersistence driver option for Manager (third argument) to ensure the connection get closed when the manager object goes out of scope. Without that, the libmongoc client will outlive the Manager object.

While you're doing this, I'd also suggest using Manager::addSubscriber() for this test (ahead of PHPC-1959). Offhand, I'm not sure if adding a subscriber on a specific Manager will add a reference to it, which might prevent it from being garbage collected (and thus the libmongoc client being destroyed due to disableClientPersistence). I don't believe that's the case, as phongo_apm_copy_manager_for_client may be the only place that is done (and only for the lifetime of the command monitoring events being dispatched). Using Manager::addSubscriber() would allow us to confirm that's indeed the case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah ok, the test did pass locally for me but not on the CI.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm I added the disableClientPersistence option but the test seems to fail for the same reason as before (https://github.com/mongodb/mongo-php-driver/pull/1251/checks?check_run_id=3374366170).

src/phongo_apm.c Outdated Show resolved Hide resolved
src/phongo_apm.c Outdated Show resolved Hide resolved
@jmikola
Copy link
Member

jmikola commented Aug 19, 2021

I can confirm that monitoring-serverClosed-001.phpt passes locally with a replica set. I noted that CI is using a standalone 4.4 server. When I run the test with that, it does fail because a ServerClosedEvent is never emitted. I wonder if that may be the result of libmongoc emitting a TopologyClosedEvent (not yet implemented in PHPC) when the entire client shuts down rather than a ServerClosedEvent. I'd suggest looking through libmongoc to see if you can trace how these events are dispatched. Alternatively, ask in the libmongoc channel on Slack to see if @kevinAlbs has some quick insight. If nothing comes of that, I'll take a look later tonight when I get a chance. If the answer ends up being that standalone clusters only issue a TopologyClosedEvent instead of ServerClosedEvent, then perhaps we'll need to restrict this test to a replica set (I assume sharded clusters might behave similarly to standalones in that libmongoc may open mongos connections and just keep them open for the lifetime of the client).

On a separate note, I observed something strange when running the test on a replica set. While trying to diagnose a possible failure, I modified the event subscriber to dump all events:

class MySubscriber implements MongoDB\Driver\Monitoring\SDAMSubscriber
{
    public function serverChanged(MongoDB\Driver\Monitoring\ServerChangedEvent $event) { var_dump($event); }

    public function serverClosed(MongoDB\Driver\Monitoring\ServerClosedEvent $event) { var_dump($event); }

    public function serverOpening(MongoDB\Driver\Monitoring\ServerOpeningEvent $event) { var_dump($event); }
    
    public function topologyChanged(MongoDB\Driver\Monitoring\TopologyChangedEvent $event) { var_dump($event); }

    public function topologyOpening(MongoDB\Driver\Monitoring\TopologyOpeningEvent $event) { var_dump($event); }
}

I had a 4.4 replica set running locally and configured MONGODB_URI as mongodb://127.0.0.1:27017/?replicaSet=rs0. Running the test and dumping all events resulted in some very strange output for the final TopologyChangedEvent event:

object(MongoDB\Driver\Monitoring\TopologyChangedEvent)#4 (3) {
  ["topologyId"]=>
  string(24) "611e9dcded56d969d04f6591"
  ["newTopologyDescription"]=>
  array(2) {
    ["servers"]=>
    array(2) {
      ["servers"]=>
      array(0) {
      }
      ["type"]=>
      string(19) "ReplicaSetNoPrimary"
    }
    ["type"]=>
    string(21) "ReplicaSetWithPrimary"
  }
  ["oldTopologyDescription"]=>
  array(2) {
    ["servers"]=>
    array(0) {
    }
    ["type"]=>
    string(19) "ReplicaSetNoPrimary"
  }
}

The output for servers definitely looks wrong, so please see if you can sort that out.

@tanlisu
Copy link
Contributor Author

tanlisu commented Aug 19, 2021

The output for servers definitely looks wrong, so please see if you can sort that out.

Should be fixed now - I had some extra lines inphp_phongo_topology_description_to_zval that shouldn't have been there.

@tanlisu tanlisu changed the title PHPC-1893: Implement ServerOpeningEvent, ServerChangedEvent classes PHPC-1893: Implement SDAM event classes Aug 19, 2021
@tanlisu
Copy link
Contributor Author

tanlisu commented Aug 19, 2021

Notes: Currently the ServerClosedEvent and TopologyClosedEvent tests fail because those two events aren't being emitted. I also added the ServerHeartbeatFailedEvent class, but it's missing the ServerHeartbeatFailedEvent::getError() function because I wasn't sure how to deal with the bson_error_t. I also haven't looked into how to produce a ServerHeartbeatFailedEvent for testing.
The only SDAM events left are ServerHeartbeatStartedEvent and ServerHeartbeatSucceededEvent.

@jmikola
Copy link
Member

jmikola commented Aug 19, 2021

I also added the ServerHeartbeatFailedEvent class, but it's missing the ServerHeartbeatFailedEvent::getError() function because I wasn't sure how to deal with the bson_error_t.

phongo_apm_command_failed may be a useful reference, since it also has code that creates a PHP exception (stored as php_phongo_commandfailedevent_t.z_error) from a bson_error_t.

I also haven't looked into how to produce a ServerHeartbeatFailedEvent for testing.

Per the spec, this is emitted when a hello command (or isMaster on pre-5.0 servers) fails. You should be able to use the configureFailPoint command (also used within PHPC's test suite via the configureFailPoint() utility function) to coerce such a failure. Server Fail Points in the Unified Test Format spec may be helpful reading there. Additionally, you can grep various spec tests in that repository for a failPoint operation to see how other tests configure fail points for different purposes. There are definitely examples of setting fail points on hello/isMaster.

Since setting the fail point requires connecting to the server, there are at least two ways to go about this:

  • Create a two Managers with disableClientPersistence to ensure they use separate libmongoc clients. Use one to configure the fail point and the other to issue a ping command with SDAM monitoring enabled. You may get a heartbeat event when the hello/isMaster fails during discovery.
  • Create one Manager and configure the fail point. Wait a bit longer than heartbeatFrequencyMS, which is configurable as low as 500ms, before issuing a ping command.

@jmikola
Copy link
Member

jmikola commented Dec 30, 2021

Superseded by #1287.

@jmikola jmikola closed this Dec 30, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants