Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

x/build/maintner: GerritMessage doesn't include inline comments #24863

Open
dmitshur opened this issue Apr 14, 2018 · 6 comments

Comments

@dmitshur
Copy link
Member

commented Apr 14, 2018

Consider CL 97058 as an example. It has reviews with inline comments. For example, here's one by Andrew:

image

GerritMessage structure has the Message string field:

// Message is the raw message contents from Gerrit (a subset
// of the raw git commit message), starting with "Patch Set
// nnnn".
Message string

But it doesn't expose the inline comments that were a part of that message. Here's the corresponding GerritMessage from maintner.Corpus:

(*maintner.GerritMessage)({
  Meta:    (*maintner.GitCommit)({GitCommit 9298d2c50518c3444f01b95aae9579f9d3bdb30d}),
  Version: (int32) 1,
  Message: (string) (len=26) "Patch Set 1:\n\n(4 comments)",
  Date:    (time.Time) 2018-02-26 05:16:52 +0000 +0000,
  Author:  (*maintner.GitPerson)({
    Str: (string) (len=61) "Andrew Bonventre <22285@62eb7196-b449-3ce5-99f1-c037f21e1705>"
  })
}),

This is a feature request and a tracking issue for it. I imagine this should be in scope, since CL comment bodies are already included. /cc @bradfitz @andybons

@dmitshur dmitshur added the Builders label Apr 14, 2018

@gopherbot gopherbot added this to the Unreleased milestone Apr 14, 2018

@bradfitz

This comment has been minimized.

Copy link
Member

commented Apr 15, 2018

It's a bit of work to get them and I haven't need them for anything yet, so it's just not done yet.

@dmitshur

This comment has been minimized.

Copy link
Member Author

commented Apr 15, 2018

Makes sense. I think it’s fair for whoever needs this first to work on the implementation. :) I’ll need this later, so I’m happy to work on it if it’s not done by then. I just wanted to file the issue for now.

@dmitshur

This comment has been minimized.

Copy link
Member Author

commented Apr 18, 2018

Posting some notes here after a quick investigation into where the inline comment data is.

It's available via Gerrit's git-based protocol (unsurprisingly). It's located inside the diff of the corresponding meta commit.

E.g., to see inline comments for that message from CL 97058 shown in the original issue, you can do the following in https://go.googlesource.com/tools git repo:

$ git fetch origin refs/changes/58/97058/meta
From https://go.googlesource.com/tools
 * branch              refs/changes/58/97058/meta -> FETCH_HEAD

Then follow the meta commit history to find the commit corresponding to that comment (this is what maintner already does to find all CL review messages):

$ git log FETCH_HEAD
...
    Tag: autogenerated:gerrit:newPatchSet
    Groups: 1f47dc6becad3ecc4a64a3c099003024529cb7db

commit 9298d2c50518c3444f01b95aae9579f9d3bdb30d
Author: Andrew Bonventre <22285@62eb7196-b449-3ce5-99f1-c037f21e1705>
Date:   Mon Feb 26 05:16:52 2018 +0000

    Update patch set 1
    
    Patch Set 1:
    
    (4 comments)
    
    Patch-set: 1

commit 1f917aaee1f32c46ecb0cea5d26c57700f1c0b71
Author: David Url <26506@62eb7196-b449-3ce5-99f1-c037f21e1705>
Date:   Sun Feb 25 17:46:13 2018 +0000

    Update patch set 1
    
    Patch Set 1: 
...

The inline comments are stored in a JSON encoded format in the commit diff:

$ git show 9298d2c50518c3444f01b95aae9579f9d3bdb30d
commit 9298d2c50518c3444f01b95aae9579f9d3bdb30d
Author: Andrew Bonventre <22285@62eb7196-b449-3ce5-99f1-c037f21e1705>
Date:   Mon Feb 26 05:16:52 2018 +0000

    Update patch set 1
    
    Patch Set 1:
    
    (4 comments)
    
    Patch-set: 1

diff --git a/cc5c776e597bc8903799f3e7894a387959ba1976 b/cc5c776e597bc8903799f3e7894a387959ba1976
new file mode 100644
index 00000000..2d4c287c
--- /dev/null
+++ b/cc5c776e597bc8903799f3e7894a387959ba1976
@@ -0,0 +1,96 @@
+{
+  "comments": [
+    {
+      "key": {
+        "uuid": "33bdaac8_f15ed164",
+        "filename": "cmd/present/static/styles.css",
+        "patchSetId": 1
+      },
+      "lineNbr": 418,
+      "author": {
+        "id": 22285
+      },
+      "writtenOn": "2018-02-26T05:16:52Z",
+      "side": 1,
+      "message": "while others do this, I would lose the element portion of the selector and just have it be:\n\n.pagenumber",
+      "range": {
+        "startLine": 418,
+        "startChar": 0,
+        "endLine": 418,
+        "endChar": 3
+      },
+      "revId": "cc5c776e597bc8903799f3e7894a387959ba1976",
+      "serverId": "62eb7196-b449-3ce5-99f1-c037f21e1705",
+      "unresolved": true
+    },
+    {
+      "key": {
+        "uuid": "5c6b57c3_5a1df3df",
+        "filename": "cmd/present/static/styles.css",
+        "patchSetId": 1
+      },
+      "lineNbr": 423,
+      "author": {
+        "id": 22285
+      },
+      "writtenOn": "2018-02-26T05:16:52Z",
+      "side": 1,
+      "message": "instead of adding padding, set right to be 10px instead of 0.",
+      "range": {
+        "startLine": 423,
+        "startChar": 2,
+        "endLine": 423,
+        "endChar": 21
+      },
+      "revId": "cc5c776e597bc8903799f3e7894a387959ba1976",
+      "serverId": "62eb7196-b449-3ce5-99f1-c037f21e1705",
+      "unresolved": true
+    },
+    {
+      "key": {
+        "uuid": "e9c57107_0ff68085",
+        "filename": "cmd/present/static/styles.css",
+        "patchSetId": 1
+      },
+      "lineNbr": 424,
+      "author": {
+        "id": 22285
+      },
+      "writtenOn": "2018-02-26T05:16:52Z",
+      "side": 1,
+      "message": "alphabetize these.\n\nalso space after :",
+      "range": {
+        "startLine": 419,
+        "startChar": 0,
+        "endLine": 424,
+        "endChar": 17
+      },
+      "revId": "cc5c776e597bc8903799f3e7894a387959ba1976",
+      "serverId": "62eb7196-b449-3ce5-99f1-c037f21e1705",
+      "unresolved": true
+    },
+    {
+      "key": {
+        "uuid": "de026b61_b933f428",
+        "filename": "cmd/present/templates/slides.tmpl",
+        "patchSetId": 1
+      },
+      "lineNbr": 65,
+      "author": {
+        "id": 22285
+      },
+      "writtenOn": "2018-02-26T05:16:52Z",
+      "side": 1,
+      "message": "make this a span instead of a div. since you don\u0027t need the padding (from the comment above) it won\u0027t matter.",
+      "range": {
+        "startLine": 65,
+        "startChar": 7,
+        "endLine": 65,
+        "endChar": 10
+      },
+      "revId": "cc5c776e597bc8903799f3e7894a387959ba1976",
+      "serverId": "62eb7196-b449-3ce5-99f1-c037f21e1705",
+      "unresolved": true
+    }
+  ]
+}
\ No newline at end of file

That is... quite a lot of bytes just to represent 4 inline comments of a single review comment.

The golang.org/x/build/maintner/maintpb package currently has:

type GitCommit struct {
	Sha1 string `protobuf:"bytes,1,opt,name=sha1" json:"sha1,omitempty"`
	// raw is the "git cat-file commit $sha1" output.
	Raw      []byte       `protobuf:"bytes,2,opt,name=raw,proto3" json:"raw,omitempty"`
	DiffTree *GitDiffTree `protobuf:"bytes,3,opt,name=diff_tree,json=diffTree" json:"diff_tree,omitempty"`
}

// git diff-tree --numstat oldtree newtree
type GitDiffTree struct {
	File []*GitDiffTreeFile `protobuf:"bytes,1,rep,name=file" json:"file,omitempty"`
}

// GitDiffTreeFile represents one line of `git diff-tree --numstat` output.
type GitDiffTreeFile struct {
	File    string `protobuf:"bytes,1,opt,name=file" json:"file,omitempty"`
	Added   int64  `protobuf:"varint,2,opt,name=added" json:"added,omitempty"`
	Deleted int64  `protobuf:"varint,3,opt,name=deleted" json:"deleted,omitempty"`
	Binary  bool   `protobuf:"varint,4,opt,name=binary" json:"binary,omitempty"`
}

The most obvious way to support this is to expand GitDiffTreeFile to have an optional Raw []byte field containing the actual git diff output, and populate it only for meta commits.

It's hard for me to judge if that is the appropriate solution. It feels like a lot of bytes to express just a few inline comments. Do we need it all?

@bradfitz What's the reason the maintpb.Mutation schema is so low level (for Gerrit)? I imagine it could've been higher level (more similar to how it is for GitHub issues; but that might just be because GitHub API is higher level). Is it so that the higher-level information that's exposed can be more easily modified, without having to alter historical data?

@bradfitz

This comment has been minimized.

Copy link
Member

commented Apr 18, 2018

@bradfitz What's the reason the maintpb.Mutation schema is so low level (for Gerrit)?

Because I knew that Gerrit (as of the past few years with its NoteDb backend) stored all its database in git, so simply mirroring its magic git refs was sufficient for having all its data. And then we just have to parse it.

@dmitshur

This comment has been minimized.

Copy link
Member Author

commented Apr 18, 2018

Fair enough, but the tradeoff is more data.

Are you okay with the size increase from including the diffs for meta commits?

Should we try to estimate or calculate how much it would cost (on disk and in-memory) before considering implementing this?

@bradfitz

This comment has been minimized.

Copy link
Member

commented Apr 18, 2018

Storing a bunch of diffs of JSON will be a pain to parse later. Doable I guess.

But yeah, we'd want to see how big it is first. (uncompressed and compressed)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.