Skip to content

Refactor data recorder#233

Merged
zhouzhuojie merged 1 commit into
masterfrom
zz/refactor-data-recorder
Mar 11, 2019
Merged

Refactor data recorder#233
zhouzhuojie merged 1 commit into
masterfrom
zz/refactor-data-recorder

Conversation

@zhouzhuojie
Copy link
Copy Markdown
Collaborator

@zhouzhuojie zhouzhuojie commented Mar 8, 2019

Description

Fix #232 #203

  • Refactor all data recorders
  • Add data_record_frame.go
  • Change the interface of DataRecorder to comply with DataRecordFrame

Motivation and Context

  • Add the option to set the output mode of DataRecordFrame
  • Consolidate and simplify adding new data recorders

How Has This Been Tested?

Tested locally, note that kt tool prints the value field of kafka messages with escape.

FLAGR_RECORDER_ENABLED=true FLAGR_RECORDER_FRAME_OUTPUT_MODE=payload_raw_json make run

kt consume -topic flagr-records -offsets all=newest-1:
{
  "partition": 0,
  "offset": 8,
  "key": "a1234",
  "value": "{\"payload\":{\"evalContext\":{\"enableDebug\":true,\"entityContext\":{\"hello\":\"world\"},\"entityID\":\"a1234\",\"entityType\":\"a123\",\"flagID\":1,\"flagKey\":\"kmmcd1nsd6\"},\"evalDebugLog\":{\"segmentDebugLogs\":[{\"msg\":\"argument: state not found\",\"segmentID\":1}]},\"flagID\":1,\"flagKey\":\"kmmcd1nsd6\",\"flagSnapshotID\":46,\"segmentID\":1,\"timestamp\":\"2019-03-08T23:45:37Z\",\"variantAttachment\":null,\"variantID\":null,\"variantKey\":null}}",
  "timestamp": "1969-12-31T15:59:59.999-08:00"
}


FLAGR_RECORDER_ENABLED=true make run

kt consume -topic flagr-records -offsets all=newest-1:
{
  "partition": 0,
  "offset": 9,
  "key": "a1234",
  "value": "{\"payload\":\"{\\\"evalContext\\\":{\\\"enableDebug\\\":true,\\\"entityContext\\\":{\\\"hello\\\":\\\"world\\\"},\\\"entityID\\\":\\\"a1234\\\",\\\"entityType\\\":\\\"a123\\\",\\\"flagID\\\":1,\\\"flagKey\\\":\\\"kmmcd1nsd6\\\"},\\\"evalDebugLog\\\":{\\\"segmentDebugLogs\\\":[{\\\"msg\\\":\\\"argument: state not found\\\",\\\"segmentID\\\":1}]},\\\"flagID\\\":1,\\\"flagKey\\\":\\\"kmmcd1nsd6\\\",\\\"flagSnapshotID\\\":46,\\\"segmentID\\\":1,\\\"timestamp\\\":\\\"2019-03-08T23:47:23Z\\\",\\\"variantAttachment\\\":null,\\\"variantID\\\":null,\\\"variantKey\\\":null}\",\"encrypted\":false}",
  "timestamp": "1969-12-31T15:59:59.999-08:00"
}



Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

@zhouzhuojie zhouzhuojie force-pushed the zz/refactor-data-recorder branch from 3dca95f to 5d9012b Compare March 8, 2019 03:23
@zhouzhuojie
Copy link
Copy Markdown
Collaborator Author

TODO, increase test coverage for date_record_frame.go

@zhouzhuojie zhouzhuojie mentioned this pull request Mar 8, 2019
Comment thread pkg/config/env.go Outdated
//
// payload mode - it respects the encryption settings, and it will stringify the payload to unify
// the type of the output for both plaintext and encrypted payload.
// {

This comment was marked as resolved.

@codecov-io
Copy link
Copy Markdown

codecov-io commented Mar 8, 2019

Codecov Report

Merging #233 into master will decrease coverage by 0.18%.
The diff coverage is 82.75%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #233      +/-   ##
==========================================
- Coverage   83.87%   83.69%   -0.19%     
==========================================
  Files          24       25       +1     
  Lines        1433     1441       +8     
==========================================
+ Hits         1202     1206       +4     
- Misses        174      177       +3     
- Partials       57       58       +1
Impacted Files Coverage Δ
pkg/handler/data_recorder.go 100% <ø> (ø) ⬆️
pkg/handler/eval.go 77.27% <0%> (-0.89%) ⬇️
pkg/handler/data_recorder_kafka.go 76.25% <77.27%> (-6.05%) ⬇️
pkg/handler/data_recorder_kinesis.go 81.57% <85.71%> (-1.4%) ⬇️
pkg/handler/data_record_frame.go 88.57% <88.57%> (ø)
pkg/handler/data_recorder_pubsub.go 50% <92.3%> (+2.5%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 65fd5b9...a18539c. Read the comment docs.

@zhouzhuojie zhouzhuojie force-pushed the zz/refactor-data-recorder branch from 5d9012b to b883882 Compare March 8, 2019 18:46
@zhouzhuojie zhouzhuojie force-pushed the zz/refactor-data-recorder branch from b883882 to 3e9a288 Compare March 8, 2019 18:52
Comment thread pkg/handler/data_record_frame.go Outdated
// DataRecordFrame represents the structure we can json.Marshal into data recorders
type DataRecordFrame struct {
Encrypted bool `json:"encrypted"`
Payload string `json:"payload"`
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that at the very least we should add omitempty to both Payload and PayloadRawJSON as there is only ever going to be one or the other. The more I've though about this however I've come to think that it would be nice to achieve a couple of different goals:

  1. Keep the output JSON message as payload whether or not options are set to use raw payload or not. In each case, the resulting data is the payload.
  2. Forgo the "encrypted" field if we have the raw JSON setting turned on. If we are outputting raw JSON then the result will by definition not be encrypted.

With these ideas in mind, I've created an alternative version of the this file that takes these scenarios into account. I'm sure there is room for some cleanup but I'd like to see what you think of this idea:

package handler

import (
	"encoding/base64"
	"encoding/json"

	"github.com/brandur/simplebox"
	"github.com/checkr/flagr/pkg/util"
	"github.com/checkr/flagr/swagger_gen/models"
)

type dataRecordEncryptor interface {
	Encrypt([]byte) (string, error)
}

type simpleboxEncryptor struct{ key [simplebox.KeySize]byte }

func (se *simpleboxEncryptor) Encrypt(b []byte) (string, error) {
	s := base64.StdEncoding.EncodeToString(
		simplebox.NewFromSecretKey(&se.key).Encrypt(b),
	)
	return s, nil
}

func newSimpleboxEncryptor(k string) dataRecordEncryptor {
	key := [simplebox.KeySize]byte{}
	copy(key[:], k)
	return &simpleboxEncryptor{key: key}
}

const (
	frameOutputModePayloadRawJSON = "payload_raw_json"
)

// DataRecordFrameOptions represents the options we can set to create a DataRecordFrame
type DataRecordFrameOptions struct {
	Encrypted       bool
	Encryptor       dataRecordEncryptor
	FrameOutputMode string
}

type rawPayload struct {
	Payload json.RawMessage `json:"payload"`
}

type stringPayload struct {
	Payload   string `json:"payload"`
	Encrypted bool
}

// DataRecordFrame represents the structure we can json.Marshal into data recorders
type DataRecordFrame struct {
	evalResult models.EvalResult
	options    DataRecordFrameOptions
}

func (drf *DataRecordFrame) MarshalJSON() ([]byte, error) {
	payload, err := drf.evalResult.MarshalBinary()
	if err != nil {
		return nil, err
	}

	if drf.options.FrameOutputMode == frameOutputModePayloadRawJSON {
		return json.Marshal(&rawPayload{
			Payload: payload,
		})
	}

	if drf.options.Encrypted && drf.options.Encryptor != nil {
		encryptedPayload, err := drf.options.Encryptor.Encrypt(payload)
		if err != nil {
			return nil, err
		}
		return json.Marshal(&stringPayload{
			Payload:   encryptedPayload,
			Encrypted: true,
		})
	}

	return json.Marshal(&stringPayload{
		Payload:   string(payload),
		Encrypted: false,
	})
}

// GetPartitionKey gets the partition key from entityID
func (drf *DataRecordFrame) GetPartitionKey() string {
	if drf.evalResult.EvalContext == nil {
		return ""
	}
	return util.SafeString(drf.evalResult.EvalContext.EntityID)
}

// Output sets the paylaod using its input and returns the json marshal bytes
func (drf *DataRecordFrame) Output() ([]byte, error) {
	return json.Marshal(drf)
}

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like it, much better!

@zhouzhuojie zhouzhuojie force-pushed the zz/refactor-data-recorder branch 2 times, most recently from 09cc758 to a18539c Compare March 8, 2019 23:31
Comment thread pkg/config/env.go Outdated
// "encrypted": false
// }
//
// payload_raw_json mode - ignores the encryption settings.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we have removed the MarshalIdent I think we should either put the example JSON in a single line (as in the string version) or make a note in the comment that the resulting output will consist of a single record per line.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch, I changed to show the single-line version in the comments.

@zhouzhuojie zhouzhuojie force-pushed the zz/refactor-data-recorder branch from a18539c to a550473 Compare March 11, 2019 20:53
@zhouzhuojie zhouzhuojie force-pushed the zz/refactor-data-recorder branch from a550473 to c952cb8 Compare March 11, 2019 20:54
// DataRecorder can record and produce the evaluation result
type DataRecorder interface {
AsyncRecord(*models.EvalResult)
AsyncRecord(models.EvalResult)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like there should be some standardization on whether EvalResult is used as a pointer or a value. It seems to be a bit of both right now. Any thoughts on this?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I refactored this interface, I had to make too many nil checks all over the places, and they didn't give much margin benefits especially when the struct is small enough. Instead, I will just assume that we can receive a struct from upstream. Which kind of makes sense: if there's nothing to log, it shouldn't even bother any DataRecorders.

And in the evaluation code path, the ultimate response is auto generated from https://github.com/checkr/flagr/blob/master/swagger_gen/restapi/operations/evaluation/post_evaluation_responses.go#L28, which is a pointer. Therefore the pointer reference is prepared inside eval.go

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, that makes sense. Well that's all I really have from my end. Looks good.

@crberube
Copy link
Copy Markdown
Contributor

Overall this looks pretty solid. I think at this point my only question has to do with the way EvalResult is treated (as noted above). Thanks for the help with this!

@zhouzhuojie zhouzhuojie merged commit 57fe2c4 into master Mar 11, 2019
@zhouzhuojie zhouzhuojie deleted the zz/refactor-data-recorder branch March 11, 2019 21:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Data Pipeline Format

3 participants