Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Common Schema 4.0 support #118

Merged
merged 72 commits into from
Jul 2, 2020
Merged

Common Schema 4.0 support #118

merged 72 commits into from
Jul 2, 2020

Conversation

maxgolov
Copy link
Contributor

@maxgolov maxgolov commented Sep 20, 2019

Add CS4 support:

  • instructions how to generate the serializer/deserializer using OneCollector schema
  • scripts to simplify the process
  • server-side classic bond schema file snapshot (CommonSchemaEvent.bond)
  • client-side "bond lite"" cleaned-up schema file (CsProtocol.bond)
  • bond_const.json - generated from common compact bond definitions using gbc.exe
  • IDecorator moved to public headers
  • CsRecord type moved to public headers (raw record can be altered via custom IDecorator)

PR adds support for all new CS4 fields on wire, but without corresponding semantic APIs.

Some CS4 extensions have been disabled (not compiled-in) due to size optimization.

Further work needed (not in this PR):

  • optional: semantic APIs for new fields and extensions.
  • required: regenerate 1DS Fiddler Inspector schema and refresh the binary on internal website.
  • required: adjust C++ Bond decoder implementation.

Sorry for submitting the gbc.exe binary in the repo. That is the only trusted way to get the matching working bond codegen. This binary is not run on GitHub Actions. Generated code have been checked in.

Updating versions:

  • bump-up SDK version from 3.2 to 3.3
  • Common Schema version from 3.0 to 4.0

- instructions howto generate the serializer/deserializer using OneCollector schema
- scripts to simplify the process
- server-side classic bond schema file (CommonSchemaEvent.bond)
- client-side "bond lite"" schema file (CsProtocol.bond)
- bond_const.json - generated from common compact bond definitions using gbc.exe

This PR adds support for all new CS4 fields on wire, but without corresponding semantic APIs.

Further work needed (not in this PR):
- either add semantic APIs for all new fields...
- or illustrate how to use a custom IModule for runtime decoration of certain common Part A record props
- regenerate 1DS Fiddler Inspector schema and refresh the binary on internal website

I am sorry for submitting the gbc.exe binary in the repo, but that is the only trusted way to get the matching working bond codegen.
Without that key it is very confusing to debug the sandbox app.
We can clean this later with bfg if needed.
…unctionality, i.e. events without a level are getting blocked by default.

Temporarily disable it in common shared codebase. Office guys should enable in their build.
- bump-up SDK version from 3.2 to 3.3
- Common Schema version from 3.0 to 4.0
@maxgolov maxgolov changed the title Common Schema 4.0 support [WIP] Common Schema 4.0 support Sep 20, 2019
@maxgolov maxgolov added work in progress PRs that are not ready for review do not merge PRs that are not ready to merge labels Sep 20, 2019
Update CS version marker from 3 to 4
Update Common Schema version marker from 3 to 4
@maxgolov
Copy link
Contributor Author

This is more or less close to final state, but I'd like to #ifdef the unused schema parts as a memory optimization:

  • some extensions in the record are for Web, Services, XboxLive, etc. - not used by 1DS C++ client SDK
  • if we go with C++ server SDK and/or ever deploy in Xbox or in a Service, we can enable the full set

Correspondingly the payload decoder code (C++) for the client would have just the essential set, while the Fiddler Inspector (C#) would have the full proper set, and so can be used for other SDKs, including services.

@maxgolov maxgolov changed the title [WIP] Common Schema 4.0 support Common Schema 4.0 support Sep 21, 2019
@maxgolov maxgolov removed do not merge PRs that are not ready to merge work in progress PRs that are not ready for review labels Sep 21, 2019
@reyang reyang added enhancement New feature or request design Architecture and design labels Sep 23, 2019
@maxgolov maxgolov requested a review from larvacea as a code owner May 6, 2020 22:31
@maxgolov
Copy link
Contributor Author

dependencies on CS4.0 extensions that are not approved yet.

This code uses extension definitions identical to Collector, same stuff runs in production for over a year. I agree with the sentiment that this is not formally Approved - as per new process. But I feel like we're a bit stalling on this:

  • JS SDK - emits CS4.0
  • UTC - emits CS4.0
  • Collector - claims to implement same schema as here as CS4.0

This PR is just trying to play a catch-up game to get where the rest of other parts are.
We can rename this PR to be CS4.0-release-candidate-01 :)

Copy link
Member

@reyang reyang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@maxgolov
Copy link
Contributor Author

LGTM.

@reyang - I will hold off on merging this until we coordinate the DDV CS4 upgrade story with @sid-dahiya . As we discussed, SDK and legacy decoder seems to work , based on Sid, but we need to spend a bit of time on it.

@maxgolov maxgolov added the do not merge PRs that are not ready to merge label Jun 26, 2020
@maxgolov
Copy link
Contributor Author

maxgolov commented Jun 26, 2020

@sid-dahiya @reyang @mkoscumb @bliptec - what we can also do is as follows:

  • put cs3 schema generated in one dir
  • put cs4 schema generated in another dir
    Side-by-side.

Then at compile time decide what proto flavor we build with. At least that way we would allow the master branch to continue supporting CS3. It will require a minor adjustment, but I think I can do this in a day. I do not want to support both protocols in one build at runtime. But at least having an option to do "emergency" switch back to CS3, without having to go back on the master - could be a nice-to-have.

Thoughts? (compile-time ---- YES, adding runtime ability to plug in both CS3 and CS4 is an overkill ---- NO)

@maxgolov maxgolov requested a review from lalitb June 26, 2020 19:52
@sid-dahiya
Copy link
Contributor

@sid-dahiya @reyang @mkoscumb @bliptec - what we can also do is as follows:

  • put cs3 schema generated in one dir
  • put cs4 schema generated in another dir
    Side-by-side.

Then at compile time decide what proto flavor we build with. At least that way we would allow the master branch to continue supporting CS3. It will require a minor adjustment, but I think I can do this in a day. I do not want to support both protocols in one build at runtime. But at least having an option to do "emergency" switch back to CS3, without having to go back on the master - could be a nice-to-have.

Thoughts? (compile-time ---- YES, adding runtime ability to plug in both CS3 and CS4 is an overkill ---- NO)

As an interim approach, and given this change has been pending for a while, I support this side-by-side idea. This will let the SDK users to decide if they're ready to move to CS4 or not.

Copy link
Contributor

@sid-dahiya sid-dahiya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would really like us to not share explicit internal links here. Anytime SharePoint moves to a different location, or something else changes, the links will become obsolete. Instead, we should be using aka.ms links wherever possible.

Other than that, everything looks good, I'm ok for this to be merged.


1. Download OneCollector Server bond definition - CommonSchemaEvent.bond from server repo:

https://msazure.visualstudio.com/OneDsCollector/_git/Collector?path=%2FServices%2FModels%2FCommonSchema%2FBond%2FCommonSchemaEvent.bond&version=GBmaster
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should obfuscate this in an aka.ms link.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can hide that.. The challenge is that this will be OSS SDK, but Collector won't be public. We can probably document this process in private lib/modules , without elaborating in public repo.

@@ -426,10 +426,6 @@ namespace ARIASDK_NS_BEGIN
tenantTokenToId(m_tenantToken).c_str(), record.baseType.c_str());
return;
}
// TODO:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this still pending?

@@ -0,0 +1,709 @@
namespace Microsoft.OneDs.Models.CommonSchema.Bond

// TODO(pawelc): should all/most of the int fields be nullable? Is it ok to use 0 as null?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we get an answer to this before this is finished?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nah, that's fine - this has been there for 2-3 years now. I can remove all these comments completely. Currently it's just a verbatim copy of Collector definition... They also have the same comment in there in Azure VSO 😁

[Description("Name the client application")]
8: optional string name = nothing;

[Description("Identifier to mark session. Unless overridden, a session cookie will be used to set this value based on 30 min inactivity")]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

::Nit: Extra space before [Description.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd probably NOT change it other than removing silly comments, trying to keep it as close to original, so that it's easier to diff.. Unless we want to copy that bond definition to private lib/modules, and only maintain the generated code in OSS, without elaborating how this code generation process works.

[Description("Upload buffer sequence number in the format <buffer identifier>:<sequence number>")]
11: optional string bSeq = nothing;

// See "https://microsoft.sharepoint.com/teams/CommonSchema/Shared Documents/Schema Specs/SequenceField_PartA_CS2.0.docx?web=1"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a general rule of thumb, can we replace all internal links with aka.ms links? Not only does it obfuscate it, but if the links get moved, there won't be broken links in the comments.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Somewhat agree, somewhat not, currently it's as-is copy of the mainline definition. I agree that maybe we can scrub it to remove internal links completely in our copy of the protocol definition.

[Version("1.0")]
struct Javascript
{
[Description("Logging Library version.")]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The example values being listed below, are they an exhaustive list of values (seem likely)? If not, can we make the lists smaller?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, that's again - Andrew and Ram are maintaining this part. This part of the protocol is not even implemented by 1DS C++ SDK. Keeping it as-is.

Comment on lines 358 to 361
[Description("Describes whether the user has given consent for Cookies. See https://microsoft.sharepoint.com/teams/celaeucookie")]
210: required bool userConsent;

[Description("Language with highest quality score from http header accept-language")]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

::Nit: Bonus spaces at the start of the line.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are quite a few places with the spacing mix-up.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not changing it here. Keeping it with all the original flaws, same content as in the mainline Azure VSO.

@@ -1,5 +1,7 @@
//------------------------------------------------------------------------------
// This code was generated by a tool.
// This code was generated by a tool, then manually edited with performance
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we not update the tool to prevent the need for manual edits?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really. It's gonna be a serious challenge. I will try to explain the challenge:

  • there is a Bond protocol that details various extensions
  • there is a Bond compiler - code generator, that generates code that covers ALL extensions, including optional
  • there is no feature in that Bond compiler that allows you to omit or #ifdef certain extensions, thus the manual editing is required...

Now what happens if I don't do the manual edit? The code generated, structures, methods, are all going to have significant bloat due to unrelated Web, Javascript, Xbl, etc. extensions. We certainly don't want to degrade our perf by providing exhausting implementation of all of extensions in our SDK, we omit / disable the unrelated parts manually.

Implementing another separate compilation / translation layer that injects ifdef attributes over Bond (being kinda technology that we are gradually moving away from), is a wasted effort. Easier to add 10-20 ifdefs manually once a year.

Copy link
Contributor

@lalitb lalitb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was more of learning for me to go through the changes. Looks good to me :)

@maxgolov
Copy link
Contributor Author

maxgolov commented Jul 1, 2020

@sid-dahiya @lalitb @reyang - I made the following changes:

  • added conditional compilation around all places where CS4 is adding new fields. That way we can compile the build with CS3 or with CS4 support (not both).
  • by default for now I do not set the HAVE_CS4 and HAVE_CS4_FULL defines, means the change now turned off (I verified in my previous commit that the change passes all tests).

I will add a few tests, where I'm gonna try to build relevant portions with CS4, OR a full separate build loop to validate CS4. I am cautiously NOT making the CS4 the default yet. My further actions are to provide a document that details CS3-to-CS4 delta, field by field, the defines, as well as 1DS Fiddler Decoder that is capable of decoding both CS3 and CS4.

Due to extent of changes, I'm gonna merge the change - assuming it passes all tests, and submit further documentation + decoder in a follow-up PR. With this change merged - we are not yet switching to CS4 by default. This will be a separate minor commit.

@maxgolov maxgolov merged commit 1fc7e4e into master Jul 2, 2020
@maxgolov maxgolov linked an issue Jul 6, 2020 that may be closed by this pull request
maxgolov added a commit that referenced this pull request Oct 19, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
design Architecture and design do not merge PRs that are not ready to merge enhancement New feature or request PR sent to code review Pending code review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CS 4.0: compliance to Common Schema version 4.0
5 participants