Inability to extend message with both repeating groups and vardata #37

Closed
da4089 opened this issue Nov 18, 2016 · 7 comments

Comments

@da4089

da4089 commented Nov 18, 2016

In v1.0 draft 2016/06/16, section 5.1.1, it states:

A repeating group may be added, but only after existing groups and if there are no subsequent variable data elements at the end of the message.

This seems like a significant limitation of the protocol, and essentially prevents the use of both variable-length data and repeating groups in a message where schema extensions might be required.

If both the variable-length data "header" (s2.8.2), and the repeating group "header" (aka group dimensions, s3.4.10), mandated a leading byte that distinguished between vardata and a repeating group, this restriction could be lifted:

Variable data would then be preceded by

  • uint8 blockType
    • Bits 0-1: power of two for number of octets in size value (0 = uint8, 1 = uint16, 2 = uint32, 3 = uint64)
    • Bits 2-3: reserved, must be encoded as zero, and ignored on reception
    • Bits 4-5: a standard value VARDATA_BLOCK = 1
    • Bits 6-7: reserved, must be encoded as zero, and ignored on reception
  • unsigned integer length: the number of octets of data following, encoded as an unsigned integer

Repeating groups would be preceded by

  • uint8 blockType
    • Bits 0-1: power of two for number of octets in block length value (0 = uint8, 1 = uint16, 2 = uint32, 3 = uint64)
    • Bits 2-3: power of two for number of octets in number in group value (0 = uint8, 1 = uint16, 2 = uint32, 3 = uint64)
    • Bits 4-5: a standard value GROUP_BLOCK = 2
    • Bits 6-7: reserved, must be encoded as zero, and ignored on reception
  • unsigned integer blockLength: size in octets of a group entry
  • unsigned integer numInGroup: number of group entries

(These could of course be expanded to 2 (vardata) and 3 (repeating groups) uint8 octets to avoid bit-twiddling, if that's seen as a better approach).
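As a rough illustration of the bit-packed variant proposed above, here is a minimal Python sketch of reading such a header. It is not part of any SBE specification. It assumes that bit 0 is the least significant bit of blockType and that the size values are little-endian; the function names `pack_block_type` and `read_block_header` are hypothetical.

```python
import struct

# Block kinds from the proposal (bits 4-5 of blockType).
VARDATA_BLOCK = 1
GROUP_BLOCK = 2

# Width code (0..3) -> struct format for uint8/16/32/64.
# Little-endian is an assumption made for this sketch.
_UINT_FORMATS = {0: "<B", 1: "<H", 2: "<I", 3: "<Q"}


def pack_block_type(kind, first_width, second_width=0):
    """Build the blockType octet, assuming bit 0 is the least significant bit."""
    return (first_width & 0x3) | ((second_width & 0x3) << 2) | ((kind & 0x3) << 4)


def _read_uint(buf, offset, width_code):
    """Read one unsigned integer of the coded width; return (value, new_offset)."""
    fmt = _UINT_FORMATS[width_code]
    (value,) = struct.unpack_from(fmt, buf, offset)
    return value, offset + struct.calcsize(fmt)


def read_block_header(buf, offset=0):
    """Return (kind, fields, offset_after_header) for the block at offset."""
    block_type = buf[offset]
    offset += 1
    kind = (block_type >> 4) & 0x3
    if kind == VARDATA_BLOCK:
        # Bits 2-3 are reserved for vardata and are ignored on reception.
        length, offset = _read_uint(buf, offset, block_type & 0x3)
        return kind, {"length": length}, offset
    if kind == GROUP_BLOCK:
        block_length, offset = _read_uint(buf, offset, block_type & 0x3)
        num_in_group, offset = _read_uint(buf, offset, (block_type >> 2) & 0x3)
        return kind, {"blockLength": block_length, "numInGroup": num_in_group}, offset
    raise ValueError(f"unknown block kind {kind}")
```

Because the decoder learns from the leading octet whether a group or a vardata element comes next, it no longer needs the schema to fix their order, which is what would allow the restriction in section 5.1.1 to be lifted.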

I appreciate that this scheme would compromise the "all description is in the stubs, not on the wire" nature of the protocol as it is today, but as a tradeoff for enabling schema evolution with both vardata and repeating groups, it seems a small price to pay.

@kleihan
Member

kleihan commented Nov 18, 2016

I do not see this as a significant limitation when using SBE for high-performance trading, which was the main driver for the design. Fixed length of messages was seen as a key feature. Repeating groups cannot be avoided completely, e.g. to report back multiple partial fills of an order in a single message. Variable-length fields are only needed for verbose error or other free-text fields used for display purposes and should be avoided as much as possible. What are the use cases on a business level that suffer from this rule that you perceive as a significant limitation?

There are other binary encodings for FIX that support variable length messages (e.g. ASN.1, FAST) and may fit better for an interface that depends heavily on variable-length data. The different encodings all have their pros and cons and SBE is about small, fixed length messages. Any encoding for FIX needs to be able to support the complete FIX Repository. Some of the FIX messages contain heavily nested structures, hence the existence of repeating groups as a concept in SBE. In practice, nested structures should be avoided as much as possible when designing FIX messages together with an SBE encoding. Backward compatibility is not a key concern for firms active in the high speed trading arena.

Last but not least, SBE is now a Draft Standard, which means that changes to the wire format of Version 1.0 are only possible if something fundamental arises during the implementations undertaken by the community. I suggest keeping your request for a future version of SBE.

@da4089
Author

da4089 commented Nov 18, 2016

@kleihan -- thanks for your response.

I understand that this is a draft standard, and that my comments are almost certainly way too late to be useful. I decided to report the issue so that it was publicly documented, at least.

It's obvious from the result that a key tension in the design of SBE was the desire to support the FIX application layer on one hand, and a desire to "compete" with OUCH-like protocols on the other (fixed length, easy FPGA parsing, etc).

Fixed-structure protocols evolve by (a) adding fields on the end of existing messages, or (b) adding new messages. SBE has chosen to support a more complex model of evolution, but that model is then compromised, and the result is that neither the evolution model nor variable-length data and groups are "nice": both have limitations and inconsistencies.

Variable-length data and groups introduce the need to specify the length and group dimensions in the wire message (unlike the rest of the protocol where the layout is fixed by the schema). Having made that concession, it seems unfortunate not to have added an extra octet that would avoid the restriction from s5.1.1, and allow new vardata and groups to be added as desired.

@kleihan
Member

kleihan commented Nov 18, 2016

I do see another evolution option for fixed-length messages: the use of templates, where (e.g. as an exchange) one issues a new template and supports the old one for a limited time. That allows fields to be inserted at any point in the new template, with messages provided under the old template converted as a convenience. These days, the new stuff is often of a regulatory nature and users have no choice but to support/populate the fields as soon as the exchange makes them available.

I do not want to discard your issue in general and am looking for more people to join this discussion, which I deem important for positioning SBE and its strengths/weaknesses correctly.

@mjpt777

mjpt777 commented Nov 18, 2016

If messages and repeating groups had fields in the block header to indicate the number of repeating groups and var data items, then the protocol would have been much more flexible without adding significant overhead. Better code reuse would have more than offset the performance cost of the extra fields. OTF code would also have been simpler.

This would have allowed for full extension capabilities, more generic and cleaner code in decoders, plus useful facilities like skipping to the end of the message without requiring a framing protocol.
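To make the "skip to the end of the message" point concrete, here is a minimal Python sketch. The header layouts are hypothetical and invented purely for illustration (every field a little-endian uint16); they are not taken from any version of the SBE specification. The function `message_end` is likewise an assumed name.

```python
import struct

# Hypothetical layouts, every field a little-endian uint16 (not from any spec):
#   message header: blockLength, numGroups, numVarData
#   group header:   blockLength, numInGroup, numGroups, numVarData
#   vardata header: length
MESSAGE_HEADER = struct.Struct("<HHH")
GROUP_HEADER = struct.Struct("<HHHH")
VARDATA_HEADER = struct.Struct("<H")


def _skip_tail(buf, offset, num_groups, num_var_data):
    """Skip the groups, then the vardata items, that follow a fixed block."""
    for _ in range(num_groups):
        block_length, num_in_group, nested_groups, nested_var = (
            GROUP_HEADER.unpack_from(buf, offset)
        )
        offset += GROUP_HEADER.size
        for _ in range(num_in_group):
            offset += block_length  # fixed fields of one group entry
            offset = _skip_tail(buf, offset, nested_groups, nested_var)
    for _ in range(num_var_data):
        (length,) = VARDATA_HEADER.unpack_from(buf, offset)
        offset += VARDATA_HEADER.size + length
    return offset


def message_end(buf, offset=0):
    """Return the offset just past the message, without consulting a schema."""
    block_length, num_groups, num_var_data = MESSAGE_HEADER.unpack_from(buf, offset)
    offset += MESSAGE_HEADER.size + block_length
    return _skip_tail(buf, offset, num_groups, num_var_data)
```

With the counts carried on the wire, one generic routine can walk any message, including groups or vardata added by a newer schema, so the decoder does not depend on a framing layer to find the next message.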

These are lessons learned and cannot be predicted up front without building a full implementation. Now that the lessons have been learned, further revisions of the protocol would be remiss to ignore them.

SBE could be a codec that is suitable for most financial needs rather than supporting more disparate "standards".

New templates have been suggested a number of times as an alternative. This is not practical for many organisations and technology practices. Forcing upgrades and rework is not a community-focused or considerate viewpoint. Why do we keep seeing suggestions like this in the finance world? ;-) Finance needs to evolve to be more customer-focused and collaborative.

@donmendelson
Member

Duplicate of issue #26. Keeping open for comments.

@donmendelson
Member

Note: this enhancement changes wire format and therefore is a breaking change with version 1.0.

adkapur mentioned this issue Sep 5, 2017

@donmendelson
Member

The proposed implementation was not accepted but the goal was met in this release.
