This repository was archived by the owner on Dec 22, 2021. It is now read-only.

Conversation

@TBSliver
Contributor

@TBSliver TBSliver commented Jul 5, 2018

First pass of adding support for OP_MSG

@coveralls

coveralls commented Jul 5, 2018


Coverage decreased (-0.03%) to 74.628% when pulling b250098 on shadow-dot-cat:TBSliver/PERL-789 into c10809c on mongodb:master.

Contributor

@xdg xdg left a comment

Overall, a good start, but I'd like to see this done without OO, as that's going to be expensive to add to every operation.

my @sections = MongoDB::_Protocol::prepare_sections( $self->{bson_codec}, $self->{query} );
$self->{query} = \@sections;
my $encoded_sections = MongoDB::_Protocol::join_sections( @sections );
( $op_bson, $request_id ) = MongoDB::_Protocol::write_msg( $encoded_sections, undef );
Contributor

Can write_msg just take an array of sections?

my ( $op_bson, $request_id );

if ( $ENV{DO_OP_MSG} ) {#$link->supports_op_msg ) {
push @{$self->{query}}, ( '$db', $self->db_name );
Contributor

This won't necessarily be an array; a Document can be other things.

Contributor Author

Ah yeah, that was a 'this works for now during testing' bit - I think the three things it needs to cover are Array, Hash, and Tie::IxHash?

Contributor

I think you have to coerce to Tie::IxHash (to_IxHash) and work with that.
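
For illustration, a minimal sketch of the kind of coercion being suggested (this is not the driver's actual to_IxHash helper; the sub name is made up):

use Tie::IxHash;
use Scalar::Util qw(blessed reftype);

# Hypothetical example: accept an array ref, hash ref, or existing
# Tie::IxHash object and always hand back a Tie::IxHash.
sub coerce_to_ixhash {
    my ($doc) = @_;
    return $doc if blessed($doc) && $doc->isa('Tie::IxHash');
    return Tie::IxHash->new(@$doc) if reftype($doc) && reftype($doc) eq 'ARRAY';
    return Tie::IxHash->new(%$doc) if reftype($doc) && reftype($doc) eq 'HASH';
    die "Cannot coerce document of type " . ref($doc);
}

Note that coercing from a plain hash ref cannot recover key order, which is part of why ordered command documents matter here.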

if ( $ENV{DO_OP_MSG} ) {#$link->supports_op_msg ) {
push @{$self->{query}}, ( '$db', $self->db_name );
my @sections = MongoDB::_Protocol::prepare_sections( $self->{bson_codec}, $self->{query} );
$self->{query} = \@sections;
Contributor

I'm not sure why you're replacing this here. If you leave it alone, then command monitoring should still do the right thing and you don't have to reassemble it. Think about it this way: OP_MSG is just a way of laying out the query in bytes -- the actual query doesn't change (except for the addition of metadata like $db and the read preference).

Contributor Author

The issue is being able to test that OP_MSG specifically has been used - the test plan says to ensure each document sequence is built properly. Also, the command-monitoring spec states that "When a driver sends an OP_MSG with a document sequence, it MUST include the document sequence as a BSON array in CommandStartedEvent.command."

Contributor

I think you're reading the command monitoring spec backwards. It's saying that in the case of OP_MSG -- which might split a field out of the command into a sequence -- command monitoring must present it reassembled, so that apps consuming events see it recombined.
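
In other words, the shape on the monitoring side is roughly this (a sketch only, using a plain hash for illustration and ignoring key ordering; $cmd, $ident, and $docs are hypothetical names for the type 0 command document, the extracted field name, and the document-sequence documents):

# Recombine a type 1 document sequence back into the command document
# before publishing it as CommandStartedEvent.command.
sub command_for_monitoring {
    my ( $cmd, $ident, $docs ) = @_;

    my %full = %$cmd;
    $full{$ident} = $docs if defined $ident;    # e.g. documents => [ ... ]

    return \%full;
}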

@@ -0,0 +1,202 @@
package MongoDB::Protocol::_Section;
Contributor

I think modeling this generically as an object is too expensive for the wire protocol pieces.

return \%out;
}

sub _maybe_expand_sections {
Contributor

Per my comment earlier, if you don't modify the query, then this shouldn't be needed.

=cut

sub prepare_sections {
Contributor

Instead of washing this through objects, I think you can just directly return 1 or more encoded strings. The payload type 0 part is a type byte concatenated with either an encoded document verbatim, or an encoded document with a key removed. The payload type 1 part is either missing, or is type and length bytes concatenated with all the encoded documents of the extracted field.

In hash terms (not that you can use hashes for this) I think it's along these lines:

my $copy = { %$cmd };
my $extracted = delete $copy->{$special_key};
my @sections = pack(..., 0, $codec->encode_one($copy)); # payload type 0
if ( $extracted ) {
    my $payload1 = pack(..., 1, 0, $special_key, join('', map { $codec->encode_one($_) } @$extracted)); 
    # fix up bytes 2-5 with payload length
    push @sections, $payload1;
}
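
For reference, a more concrete version of the same sketch, assuming a BSON codec with an encode_one method and the OP_MSG section layout from the wire protocol (payload type 0 is a kind byte followed by one BSON document; payload type 1 is a kind byte, an int32 section length, a cstring identifier, then the documents back to back). The sub name and variables are illustrative only:

sub build_sections {
    my ( $codec, $cmd, $special_key ) = @_;

    my %copy      = %$cmd;
    my $extracted = delete $copy{$special_key};

    # Payload type 0: kind byte 0x00 followed by the command document.
    my @sections = ( pack( "C", 0 ) . $codec->encode_one( \%copy ) );

    if ($extracted) {
        # Payload type 1: kind byte 0x01, then an int32 length covering
        # itself, the identifier and the documents, then the cstring
        # identifier, then each document back to back.
        my $docs = join '', map { $codec->encode_one($_) } @$extracted;
        my $body = "$special_key\x00" . $docs;
        push @sections, pack( "C", 1 ) . pack( "l<", 4 + length $body ) . $body;
    }

    return @sections;
}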

=cut

sub split_sections {
Contributor

As with encoding the message, I think there's a simpler way to handle this.

Contributor

I think this is fine now in the latest iteration.

$number_returned
) = unpack( P_REPLY_HEADER, $msg );
) = unpack( P_MSG, $msg );
Contributor

I don't think this works because replies can be either OP_REPLY or OP_MSG. You need to unpack just the header and do opcode inspection early and take different paths.
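
Roughly this shape, as a sketch (the opcode values come from the wire protocol: OP_REPLY is 1, OP_MSG is 2013; the helper subs here are hypothetical):

use constant {
    OP_REPLY => 1,
    OP_MSG   => 2013,
};

sub parse_any_reply {
    my ($msg) = @_;

    # Standard message header: messageLength, requestID, responseTo, opCode,
    # all little-endian int32s.
    my ( $len, $request_id, $response_to, $opcode ) = unpack( "l<4", $msg );

    return _parse_op_msg($msg)   if $opcode == OP_MSG;      # hypothetical helper
    return _parse_op_reply($msg) if $opcode == OP_REPLY;    # hypothetical helper

    die "Unexpected reply opcode $opcode";
}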

Contributor

@xdg xdg left a comment

Looks pretty close to done. I have only some relatively minor notes.

if ( $split_commands{ $command } eq $ident ) {
# Assumes only a single split on the commands
if ( defined $ident
&& $split_commands{ $command } eq $ident ) {
Contributor

I think line 148 is a tautology and can be removed. Maybe you could check $cmd->EXISTS($ident), but I suspect it always will exist and the check would just add overhead.

Contributor Author

Ah yeah, that was fairly stupid - reduced it to just the defined check.

my $docs = $cmd->FETCH( $ident );
# my ( undef, $collection ) = $cmd->Shift;
# my ( undef, $docs ) = $cmd->Shift;
# # Assumes only a single split on the commands
Contributor

Leftover comments. (And yes, only a single split is currently supported.)

Contributor Author

Will leave the single-split comment there, just in case we come back later for a multi-split bit and wonder why it's not working...

),
{
type => 0,
documents => [ [
Contributor

Seeing "document" here (and later) is confusing. Let's call it "payload" to match the spec.

Contributor Author

I actually named it documents to keep it in line with the specification - until they're actually packed they are not a payload object, but the documents which are the content of the payload. If it were going to be more accurate to the spec it would be something like:

{
  type => $type,
  payload => {
    (defined $ident ? (identifier => $ident) : () ),
    documents => $docs
  }
}

but payload there seems a bit of a redundant key...

{
type => 0,
documents => [ [
$command,
Contributor

Is there a reason for using $command and $collection instead of just letting them get mapped below? Not having to check for and exclude them in the map would make the logic more apparent.

Contributor Author

Hmm, that may have come from my initial testing with just raw hashes, which obviously are unordered. The only issue I can see with using the Tie::IxHash->Keys bit is if it comes back in a different order - for example, if it was converted from a hash, the command isn't guaranteed to be first, whereas this guarantees it is first for later encoding. Have removed it for now, but left a comment explaining the assumption.


my $type = $section->{type};
my $ident = $section->{identifier};
my @raw_docs = @{ $section->{documents} };
Contributor

Let's avoid a temporary variable here and just inline it in the map call on line 203. The logic is still simple and clear.

Contributor Author

Ah yep, that was left over from debugging earlier on - have removed it.

=cut

sub join_sections {
Contributor

I'd prefer to inline this for efficiency rather than use a subroutine call for it.



if ( $opcode == OP_MSG ) {
MongoDB::InternalError->throw('OP_MSG requires codec for parse_reply') unless defined $codec;
Contributor

If we did it correctly, this should never happen, so we can remove the check. If we messed up, this will die in testing. UPDATE: or we don't need the codec anyway, but the general point applies.

Contributor Author

Yep, have removed it now that it's not needed for that whole part.

=cut

sub split_sections {
Contributor

I think this is fine now in the latest iteration.

return {
flags => {
checksum_present => vec( $bitflags, MSG_FB_CHECKSUM, 1 ),
query_failure => vec( $bitflags, MSG_FB_MORE_TO_COME, 1 ),
Contributor

"More to come" isn't the same thing as a query failure. Have this be more_to_come.

Contributor Author

Ah whoops, copy pasta error

checksum_present => vec( $bitflags, MSG_FB_CHECKSUM, 1 ),
query_failure => vec( $bitflags, MSG_FB_MORE_TO_COME, 1 ),
},
docs => $sections[0]->{documents}->[0]
Contributor

Add a "XXX" comment that this assumes the server never sends a type 1 payload (which it currently doesn't).

When the server adds that capability (as it may in 4.2), we'll need to combine the type 1 payload into the type 0 payload. (I.e. the opposite of prepare). We don't need to do that now (since we have no tests for it), but I want a marker in the code to remind us for later.

Contributor

@xdg xdg left a comment

Super close. Just one open question to address!

## XXX We dont seem to need this decoded till later? flag here for testing
if ( $flag ) {
## XXX MongoDB::Role::_OpReplyParser normally does decoding - this is used for testing mainly
if ( $codec ) {
Contributor

I'm not sure we actually need a codec at all. What code path uses it? (I.e. if you put "die" in this conditional, would anything hit it?) If not, let's take it out.

@TBSliver
Contributor Author

TBSliver commented Sep 3, 2018

Have removed the codec and implemented a decoding sub in the test which used it. An odd thing has happened though: locally, when I run tests against a replica set >= 3.6, t/readpref.t fails with

...
    not ok 1 - count_documents on polygon:48768 (RSSecondary) succeeds
    #   Failed test 'count_documents on polygon:48768 (RSSecondary) succeeds'
    #   at t/readpref.t line 168.
    #          got: 'MongoDB::DatabaseError: not master and slaveOk=false'
    #     expected: undef
...

But no other test fails - and this test passed before. Not quite sure what's going on there - is this something that would require a rebase first?

@TBSliver TBSliver changed the title from "DRAFT: PERL-789 OP_MSG support" to "PERL-789 OP_MSG support" on Sep 3, 2018
@xdg
Contributor

xdg commented Sep 4, 2018

I'll rebase and test it.

@xdg
Contributor

xdg commented Sep 4, 2018

Merged as 3f4e395 with a fixup in fdb215b

@xdg xdg closed this Sep 4, 2018
@TBSliver TBSliver deleted the TBSliver/PERL-789 branch September 14, 2018 14:52
