Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

logger: marshal integers as strings #1150

Closed
nhooyr opened this issue Sep 14, 2018 · 9 comments
Closed

logger: marshal integers as strings #1150

nhooyr opened this issue Sep 14, 2018 · 9 comments
Assignees
Labels
api: logging Issues related to the Cloud Logging API. type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design.

Comments

@nhooyr
Copy link

nhooyr commented Sep 14, 2018

In https://github.com/GoogleCloudPlatform/google-cloud-go/blob/c79f55251a11f4e0f64bb8c15f167392cb607dca/logging/logging.go#L695

Instead of unmarshalling into a float64, you can call UseNumber on a json decoder and decode into a string. Then instead of returning a number here, a structpb string value should be returned to prevent integer truncation.

@jeanbza jeanbza added api: logging Issues related to the Cloud Logging API. type: question Request for information or clarification. Not an issue. labels Sep 14, 2018
@jeanbza jeanbza self-assigned this Sep 14, 2018
@jeanbza
Copy link
Member

jeanbza commented Nov 30, 2018

@nhooyr Could you explain this a bit more? What are you trying to do, what do you expect to happen, and what do you see happening?

@jeanbza jeanbza added the needs more info This issue needs more information from the customer to proceed. label Nov 30, 2018
@jeanbza
Copy link
Member

jeanbza commented Dec 13, 2018

We've asked some clarifying questions but didn't get a response back in some time. Closing this. Feel free to ping back, though.

@jeanbza jeanbza closed this as completed Dec 13, 2018
@nhooyr
Copy link
Author

nhooyr commented Dec 24, 2018

The Go JSON library unmarshals all numbers into float64 by default. The issue with this is that float64's cannot hold all Go numbers e.g. the full range of int64s. So if you send this library a big int64, it will truncate it into a float64 because of the cited code. This should be fixed.

@jeanbza jeanbza reopened this Dec 27, 2018
@jeanbza jeanbza removed needs more info This issue needs more information from the customer to proceed. type: question Request for information or clarification. Not an issue. labels Dec 27, 2018
@JustinBeckwith JustinBeckwith added triage me I really want to be triaged. 🚨 This issue needs some love. labels Dec 27, 2018
@jeanbza jeanbza added type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design. and removed 🚨 This issue needs some love. triage me I really want to be triaged. labels Dec 27, 2018
@jeanbza
Copy link
Member

jeanbza commented Dec 27, 2018

Thanks @nhooyr. This requires a bit more investigation on our part, but I've re-opened it and will investigate when we free up some time.

@sinmetal
Copy link
Contributor

sinmetal commented Jan 7, 2019

I am bothered by the problem that large values are erroneous, so I'm happy if I am cured by this FR.

FYI

However, I tried it with gcloud logging, but I got an error, so I think that it is difficult to solve this problem at the client.

The following is the result I tried.

gcloud logging write projects/gcpugjp/logs/bignum --log-http --payload-type=json '{"num": 36028797018963966, "text": "36028797018963966"}'
=======================
==== request start ====
uri: https://logging.googleapis.com/v2/entries:write?alt=json
method: POST
== headers start ==
Authorization: --- Token Redacted ---
accept: application/json
accept-encoding: gzip, deflate
content-length: 201
content-type: application/json
user-agent: google-cloud-sdk x_Tw5K8nnjoRAqULM9PFAC2b gcloud/217.0.0 command/gcloud.logging.write invocation-id/c4ec4b9252264851b33d9ac029133c8d environment/None environment-version/None interactive/True from-script/False python/2.7.14 (Macintosh; Intel Mac OS X 16.7.0)
== headers end ==
== body start ==
{"entries": [{"jsonPayload": {"num": 36028797018963966, "text": "36028797018963966"}, "logName": "projects/gcpugjp/logs/bignum", "resource": {"type": "global"}, "severity": "DEFAULT"}]}
== body end ==
==== request end ====
---- response start ----
-- headers start --
-content-encoding: gzip
alt-svc: quic=":443"; ma=2592000; v="44,43,39,35"
cache-control: private
content-length: 3
content-type: application/json; charset=UTF-8
date: Tue, 02 Oct 2018 10:25:37 GMT
server: ESF
status: 200
transfer-encoding: chunked
vary: Origin, X-Origin, Referer
x-content-type-options: nosniff
x-frame-options: SAMEORIGIN
x-xss-protection: 1; mode=block
-- headers end --
-- body start --
{}

-- body end --
total round trip time (request+response): 0.182 secs
---- response end ----
----------------------
Created log entry.

Post Payload num and text are 36028797018963966 , but when you get it num is 36028797018963968 .

gcloud logging read projects/gcpugjp/logs/bignum
---
insertId: iljpr3f1gkkso
jsonPayload:
  num: 36028797018963968
  text: '36028797018963966'
logName: projects/gcpugjp/logs/bignum
receiveTimestamp: '2018-10-02T10:25:37.510646519Z'
resource:
  labels:
    project_id: gcpugjp
  type: global
timestamp: '2018-10-02T10:25:37.510646519Z'
---

@jeanbza jeanbza removed their assignment Jul 30, 2019
@odeke-em odeke-em self-assigned this Aug 4, 2019
@odeke-em
Copy link
Contributor

odeke-em commented Aug 4, 2019

Thank you for reporting this @nhooyr and for the chime in @sinmetal!

So to respond in reverse chronological order:
b) @sinmetal the value that you are seeing marshaled as 36028797018963968 is the float64 equivalent of converting from int64 36028797018963966 but the conversion might not retrieve the original value unfortunately and you can see this as per https://play.golang.org/p/bLCbhS-_dHU

package main

import (
	"fmt"
)

func main() {
	const orig = int64(36028797018963966)
	fmt.Printf("Int64->Float64:\nInt64:   %d\nFloat64: %.0f\n\n", orig, float64(orig))

	asFloat := float64(36028797018963966)
	fmt.Printf("Float64->Int64:\nFloat64: %.0f\nInt64:   %d\n\n", asFloat, int64(asFloat))
}

which produces

Int64->Float64:
Int64:   36028797018963966
Float64: 36028797018963968

Float64->Int64:
Float64: 36028797018963968
Int64:   36028797018963968

a) @nhooyr spot on! In deed, I think we should go on with marshalling integers/numbers as strings otherwise we'll suffer from precision losses as we've seen and here is a test that we can use to test and reproduce the problem but also to serve as a regression test when we mail a fix

package main

import (
	"context"
	"errors"
	"fmt"
	"math"
	"net"
	"testing"

	"cloud.google.com/go/logging"
        structpb "github.com/golang/protobuf/ptypes/struct"
	"github.com/golang/protobuf/ptypes/empty"
	"google.golang.org/api/option"
	logpb "google.golang.org/genproto/googleapis/logging/v2"
	"google.golang.org/grpc"
)

func TestIssue1150(t *testing.T) {
	ln, err := net.Listen("tcp", ":0")
	if err != nil {
		t.Fatalf("Failed to find an available port: %v", err)
	}
	defer ln.Close()
	srv := grpc.NewServer()

	var logsSvc logsService
	logpb.RegisterLoggingServiceV2Server(srv, &logsSvc)

	go srv.Serve(ln)

	conn, err := grpc.Dial(ln.Addr().String(), grpc.WithInsecure(), grpc.WithBlock())
	if err != nil {
		t.Fatalf("Failed to dial to the gRPC server: %v", err)
	}
	defer conn.Close()

	ctx := context.Background()
	lc, err := logging.NewClient(ctx, "odeke-sandbox", option.WithGRPCConn(conn))
	if err != nil {
		t.Fatalf("Failed to create new client: %v", err)
	}
	defer lc.Close()

	lgr := lc.Logger("1150")

	key := "max"
	value := uint64(math.MaxUint64)
	if err := lgr.LogSync(ctx, logging.Entry{Payload: map[string]interface{}{
		key: value,
	}}); err != nil {
		t.Fatalf("Failed to logSync: %v\n", err)
	}

	if len(logsSvc) == 0 {
		t.Fatal("Expected at least one log")
	}
	headEntry := logsSvc[0]
	if headEntry == nil || len(headEntry.Entries) == 0 {
		t.Fatal("Could not retrieve a single entry")
	}
	payload := headEntry.Entries[0].GetPayload()
	jsonPayload, ok := payload.(*logpb.LogEntry_JsonPayload)
	if !ok {
		t.Fatalf("Got %T, expected a *logpg.LogEntry_JsonPayloaa", payload)
	}

	st := jsonPayload.JsonPayload
	if st == nil {
		t.Fatal("Unexpectedly got a nil JSON payload")
	}
	gotValue := st.Fields[key]
        var asString string

        switch gv := gotValue.Kind.(type) {
        case *structpb.Value_NumberValue:
            asString = fmt.Sprintf("%f", gv.NumberValue)

        case *structpb.Value_StringValue:
            asString = gv.StringValue
        }

	wantString := fmt.Sprintf("%d", value)
	if asString != wantString {
		t.Fatalf("Value as string mismatch\nGot:  %q\nWant: %q", asString, wantString)
	}
}

type logsService []*logpb.WriteLogEntriesRequest

var _ logpb.LoggingServiceV2Server = (*logsService)(nil)

func (lgs *logsService) DeleteLog(ctx context.Context, req *logpb.DeleteLogRequest) (*empty.Empty, error) {
	return nil, errors.New("unimplemented")
}

func (lgs *logsService) WriteLogEntries(ctx context.Context, req *logpb.WriteLogEntriesRequest) (*logpb.WriteLogEntriesResponse, error) {
	*lgs = append(*lgs, req)
	return new(logpb.WriteLogEntriesResponse), nil
}

func (lgs *logsService) ListLogEntries(ctx context.Context, req *logpb.ListLogEntriesRequest) (*logpb.ListLogEntriesResponse, error) {
	return nil, errors.New("unimplemented")
}

func (lgs *logsService) ListLogs(ctx context.Context, req *logpb.ListLogsRequest) (*logpb.ListLogsResponse, error) {
	return nil, errors.New("unimplemented")
}

func (lgs *logsService) ListMonitoredResourceDescriptors(ctx context.Context, req *logpb.ListMonitoredResourceDescriptorsRequest) (*logpb.ListMonitoredResourceDescriptorsResponse, error) {
	return nil, errors.New("unimplemented")
}

Before

$ GO111MODULE=off go test -v
=== RUN   TestIssue1150
--- FAIL: TestIssue1150 (0.00s)
    repro_test.go:85: Value as string mismatch
        Got:  "18446744073709551616.000000"
        Want: "18446744073709551615"
FAIL
exit status 1
FAIL	_/Users/emmanuelodeke/Desktop/openSrc/bugs/golang/1150	0.032s

After

On applying the patch

diff --git a/logging/logging.go b/logging/logging.go
index c537675f..84ae6e80 100644
--- a/logging/logging.go
+++ b/logging/logging.go
@@ -727,8 +727,9 @@ func toProtoStruct(v interface{}) (*structpb.Struct, error) {
 		}
 	}
 	var m map[string]interface{}
-	err = json.Unmarshal(jb, &m)
-	if err != nil {
+	dec := json.NewDecoder(bytes.NewReader(jb))
+	dec.UseNumber()
+	if err = dec.Decode(&m); err != nil {
 		return nil, fmt.Errorf("logging: json.Unmarshal: %v", err)
 	}
 	return jsonMapToProtoStruct(m), nil
@@ -760,6 +761,8 @@ func jsonValueToStructValue(v interface{}) *structpb.Value {
 			vals = append(vals, jsonValueToStructValue(e))
 		}
 		return &structpb.Value{Kind: &structpb.Value_ListValue{ListValue: &structpb.ListValue{Values: vals}}}
+	case json.Number:
+		return &structpb.Value{Kind: &structpb.Value_StringValue{StringValue: x.String()}}
 	default:
 		panic(fmt.Sprintf("bad type %T for JSON value", v))
 	}

result

$ GO111MODULE=off go test -v
=== RUN   TestIssue1150
--- PASS: TestIssue1150 (0.00s)
PASS
ok  	_/Users/emmanuelodeke/Desktop/openSrc/bugs/golang/1150	0.022s

@odeke-em
Copy link
Contributor

odeke-em commented Aug 4, 2019

@odeke-em
Copy link
Contributor

odeke-em commented Aug 6, 2019

So Jean asked me some great questions that made me reevaluate on whether this even needs a chance.

By making json.Numbers strings in the serialized output, we lose comparability!

Despite a perceived loss of precision, actually users can compare their
integers with the transformed float64 values e.g. The search:

   resource.type="project"
   resource.labels.project_id="odeke-sandbox"
   jsonPayload.max = 18446044073709551615

will correctly yield the result:

    19-08-05 18:21:01.011 MDT
{"time":"2019-08-06T00:21:01.011231Z","max":18446044073709552000}

I am going to freeze this until we get the chance to speak to some folks from the Stackdriver Logging engineering team about these decisions and the behavior. I'd rather keep this working for the majority of users.

@odeke-em
Copy link
Contributor

So I think implementing this feature request will cause way more averse effects than the benefit of visually inspecting the same value. You can still search for those values as I've shown in #1150 (comment) and it is just a perceived precision loss but really those values are still uploaded as they are, and also this would require intervention from the Google Cloud Stackdriver team to change their protos.

Therefore, I am going to close this as won't fix.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: logging Issues related to the Cloud Logging API. type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design.
Projects
None yet
Development

No branches or pull requests

5 participants