Skip to content

Created a decoder handler to deserialize a MemcachedResponse #11

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

dkz2
Copy link
Contributor

@dkz2 dkz2 commented May 26, 2023

Created a decoder handler that can transform a ByteBuffer received from the Memcached server into a MemcachedResponse. This PR will close #5.

Motivation:

This marks the beginning for future implementations of decoding a MemcachedResponse.

Modifications:

Implemented definition of a basic MemcachedResponse.
Implemented a basic decoder to deserialize a MemcachedResponse.
Added Unit and updated our Integration test.

Result:

We can now successfully decode a MemcachedResponse received from the Memcached server.

@dkz2 dkz2 changed the title created a MemcachedResponse decoder Implement memcached response decoder and memcached response May 26, 2023
@dkz2 dkz2 changed the title Implement memcached response decoder and memcached response Created a decoder handler to deserialize a MemcachedResponse May 26, 2023
Copy link
Contributor

@FranzBusch FranzBusch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great start! I left some comments inline

import NIOCore
import NIOPosix

struct MemcachedResponseDecoder: ByteToMessageDecoder {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of using ByteToMessageDecoder can you use NIOSingleStepByteToMessageDecoder instead. This makes the implementation way easier normally.

import NIOCore
import NIOPosix

struct MemcachedResponseDecoder: ByteToMessageDecoder {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add this as a doc comment to this struct

Responses look like:

<RC> <datalen*> <flag1> <flag2> <...>\r\n

Where <RC> is a 2 character return code. The number of flags returned are
based off of the flags supplied.

<datalen> is only for responses with payloads, with the return code 'VA'.

Flags are single character codes, ie 'q' or 'k' or 'I', which adjust the
behavior of the command. If a flag requests a response flag (ie 't' for TTL
remaining), it is returned in the same order as they were in the original
command, though this is not strict.

Flags are single character codes, ie 'q' or 'k' or 'O', which adjust the
behavior of a command. Flags may contain token arguments, which come after the
flag and before the next space or newline, ie 'Oopaque' or 'Kuserkey'. Flags
can return new data or reflect information, in the same order they were
supplied in the request. Sending an 't' flag with a get for an item with 20
seconds of TTL remaining, would return 't20' in the response.

All commands accept a tokens 'P' and 'L' which are completely ignored. The
arguments to 'P' and 'L' can be used as hints or path specifications to a
proxy or router inbetween a client and a memcached daemon. For example, a
client may prepend a "path" in the key itself: "mg /path/foo v" or in a proxy
token: "mg foo Lpath/ v" - the proxy may then optionally remove or forward the
token to a memcached daemon, which will ignore them.

Syntax errors are handled the same as noted under 'Error strings' section
below.

For usage examples beyond basic syntax, please see the wiki:
https://github.com/memcached/memcached/wiki/MetaCommands


typealias ResponseStatus = UInt16

extension ResponseStatus {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's make this an enum called ReturnCode and nest it inside the MemcachedResponse type

static let exists = generateCode(from: (.init(ascii: "E"), .init(ascii: "X")))
static let notFound = generateCode(from: (.init(ascii: "N"), .init(ascii: "F")))

init?(asciiValues: (UInt8, UInt8)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of things here:

  • First let's take a UInt16 here instead and just do a switch over the values. We have 4 values that we want to match so we can just spell the out
  • Let's remove the ? from the init. We precondition if we don't know the return code

preconditionFailure("Response code could not be read.")
}

// Check if there's a whitespace character, this indicates flags are present
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not only flags but also datalen. Just invert the comment to say if there is not a whitespace then we are at the end of the line.

}

// Check if there's a whitespace character, this indicates flags are present
if buffer.readableBytes > 2, buffer.getInteger(at: buffer.readerIndex, as: UInt8.self) == UInt8.whitespace {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should just read the next byte here and then base our decision of that. Also return needMoreData if there is not a single byte readable.

buffer.moveReaderIndex(forwardBy: 1)

// -2 for \r\n
_ = buffer.readSlice(length: buffer.readableBytes - 2)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should precondition here that we really read \r\n


import NIOCore

enum MemcachedResponse {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if modelling this as an enum is correct. From the return codes we can't know what command was issued e.g. HD and EN is used by both get and set commands. Let's just model this out as a struct that has a returnCode. We are going to do the request-response matching in the handler in the end.

Copy link
Contributor

@FranzBusch FranzBusch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are getting closer. I just added a larger comment with how we have to change this around to make sure we can handle multiple responses and better handle partial responses in the buffer.

//===----------------------------------------------------------------------===//

struct MemcachedResponse {
enum ReturnCode: UInt16 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need to back this with the raw value right now

Suggested change
enum ReturnCode: UInt16 {
enum ReturnCode {

}
}

let returnCode: ReturnCode
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's make this a var

Suggested change
let returnCode: ReturnCode
var returnCode: ReturnCode

import NIOPosix

struct MemcachedResponseDecoder: NIOSingleStepByteToMessageDecoder {
/// Responses look like:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we move this to the doc of the type. So two lines above

Comment on lines 61 to 62
guard let firstReturnCode = buffer.readInteger(as: UInt8.self),
let secondReturnCode = buffer.readInteger(as: UInt8.self) else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we just read a single UInt16 here?

Comment on lines 55 to 58
// Ensure the buffer has at least 3 bytes (minimum for a response code and newline)
guard buffer.readableBytes >= 3 else {
return nil // Need more data
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we don't have to check this here we can just try to read and return nil further down.

Comment on lines 70 to 90
// If there is not a whitespace, then we are at the end of the line.
guard buffer.readableBytes > 0, let nextByte = buffer.getInteger(at: buffer.readerIndex, as: UInt8.self) else {
return nil // Need more dat
}

if nextByte != UInt8.whitespace {
// We're at the end of the line
buffer.moveReaderIndex(forwardBy: 1)
} else {
// We have additional data or flags to read
buffer.moveReaderIndex(forwardBy: 1)

// Assert that we really read \r\n
guard buffer.readableBytes >= 2,
buffer.getInteger(at: buffer.readerIndex, as: UInt8.self) == UInt8.carriageReturn,
buffer.getInteger(at: buffer.readerIndex + 1, as: UInt8.self) == UInt8.newline else {
preconditionFailure("Response ending '\r\n' not found.")
}

buffer.moveReaderIndex(forwardBy: 2)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part can be written better and checking that we also read the \s. Instead of checking for readable bytes we can always just try to read the next UInt8. Additionally we need to account for multiple flags so we should rather add a loop in here. Can you try to rewrite this using a loop where do the following logic

  1. Read the next byte
  2. Check if the byte is a whitespace
  • If whitespace then try to read a flag (we can stub out this code for now)
  • If not-whitespace then exit the loop
  1. Try to read the next 3 bytes and check if all four bytes are now equal to \r\n

/// https://github.com/memcached/memcached/wiki/MetaCommands
typealias InboundOut = MemcachedResponse

func decode(buffer: inout ByteBuffer) throws -> InboundOut? {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is almost right however you have to account for the fact that there are multiple responses inside the buffer. Furthermore, we are doing a bunch of duplicated work here if we get partial responses e.g. if we get everything except the \r\n we are going to re-parse the return code and flags every time. To avoid this we should introduce a little state machine to this decoder. Some rough code how this might look

    private enum NextStep: Hashable {
        /// The initial step.
        case returnCode
        /// Decode the data length, flags or check if we are the end
        case dataLengthOrFlag(MemcachedResponse.ReturnCode)
        /// Decode the next flag
        case decodeNextFlag(MemcachedResponse.ReturnCode , UInt64, [Flag])
        /// Decode end of line
        case decodeEndOfLine(MemcachedResponse.ReturnCode, UInt64?, [Flag])
       etc...
    }

    private enum NextDecodeAction {
        /// We need more bytes to decode the next step.
        case waitForMoreBytes
        /// We can continue decoding.
        case continueDecodeLoop
        /// We have decoded the next response and need to return it.
        case returnDecodedResponse(MemcachedResponse)
    }

    /// The next step in decoding.
    private var nextStep: NextStep = .decodeReturnCode

    public mutating func decode(buffer: inout ByteBuffer) throws -> MemcachedResponse? {
        while self.nextStep != .none {
            switch try self.next(buffer: &buffer) {
            case .returnDecodedResponse(let response):
                return response

            case .waitForMoreBytes:
                return nil

            case .continueDecodeLoop:
                ()
            }
        }
        return nil
    }

    private mutating func next(buffer: inout ByteBuffer) throws -> NextDecodeAction {
        switch self.nextStep {
        case .decodeReturnCode:
           // Try to decode the return code and move the state
        case .decodeDataLengthOrFlag(let returnCode):
          // Check if the return code is `VA` then decode the data length otherwise move state to flag decoding
        }
    }

    public mutating func decodeLast(buffer: inout ByteBuffer, seenEOF: Bool) throws -> MemcachedResponse? {
        // Try to decode what is left in the buffer.
        if let output = try self.decode(buffer: &buffer) {
            return output
        }

        guard buffer.readableBytes == 0 || seenEOF else {
            // If there are still readable bytes left and we haven't seen an EOF
            // then something is wrong with the message or how we called the decoder.
            throw SomEError
        }

        switch self.nextStep {
           // We have to decide what to do here. Some steps are fine to be in.
        }
    }


case .decodeNextFlag(let returnCode, let dataLength, var flags):
if let nextByte = buffer.readInteger(as: UInt8.self), nextByte != UInt8.whitespace {
flags.append(nextByte)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is going to allocate a new array each time, I could ignore flags for the time being

Copy link
Contributor

@FranzBusch FranzBusch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks pretty good now. One last round of comments and then we are good to go!

/// Decode the next flag
case decodeNextFlag(MemcachedResponse.ReturnCode, UInt64?)
/// Decode end of line
case decodeEndOfLine(MemcachedResponse.ReturnCode, UInt64?)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a TODO here that we need to add a step for decoding the response data if the return code is VA

return .continueDecodeLoop

case .dataLengthOrFlag(let returnCode):
if returnCode == .stored {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not correct we should only decode the data length if the returnCode is VA

// TODO: Implement decoding of data length
}

// Check if the next bytes are \r\n
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of checking if the next two bytes are something we should rather read the next byte and just check if it is a whitespace. If is a whitespace we have to decode flags otherwise we check if it is a \ and continue to read the next 3 bytes. It is fine if we return waitForMoreBytes when we got a \ and don't have 3 more bytes in the buffer. This should be rarely hit and we don't have to introduce a new state.

Comment on lines 21 to 22
var decoder: MemcachedResponseDecoder!
var channel: EmbeddedChannel!
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need a Channel here we can just call the methods on the decoder without the channel.

Comment on lines 36 to 38
var buffer = ByteBufferAllocator().buffer(capacity: 8)
buffer.writeBytes(returnCode)
buffer.writeBytes([UInt8.carriageReturn, UInt8.newline])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we create a helper method called makeMemcachedResponseByteBuffer that takes a MemcachedResponse and encodes it for us. We will have a bunch more tests down the line and this will help us write them easily

}
}

func testDecodeSetStoredResponse() throws {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we remove the Set from the method names since the responses don't indicate the request methods.

@dkz2 dkz2 requested a review from FranzBusch June 1, 2023 14:52
Copy link
Contributor

@FranzBusch FranzBusch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So close! Last few minor comments!

/// processes the current state of the ByteBuffer.
enum NextStep: Hashable {
/// No further steps are needed, the decoding process is complete for the current message.
case none
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we never transition to that state and since our state machine is looping we can get rid of this

self.nextStep = .returnCode
return .returnDecodedResponse(response)
} else {
preconditionFailure("Expected whitespace or \\r")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably throw here instead of preconditioning since the input is from the network and this might be a denial of service vector otherwise.


// If there's a data length, write it to the buffer.
if let dataLength = response.dataLength, response.returnCode == .VA {
buffer.writeInteger(dataLength, as: UInt64.self)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are missing writing a whitespace here

Copy link
Contributor

@FranzBusch FranzBusch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good job!

@FranzBusch FranzBusch merged commit 783ca54 into swift-server:main Jun 1, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement MemcachedResponseDecoder and MemcachedResponse
2 participants