Add tcpinfo v2 parser #1067

Merged
merged 16 commits into from
Mar 22, 2022

stephen-soltesz
Contributor

@stephen-soltesz stephen-soltesz commented Mar 15, 2022

This change updates the tcpinfo parser to support standard columns for the v2 data pipeline. It preserves the previous behavior of thinning snapshots 10:1, and it removes tcpinfo support for the v1 data pipeline.

This change moves structures previously defined in tcpinfo.go to the common schema.go. These structures (ServerInfo, ClientInfo, and ParseInfoV0) are now used only by v1 datatypes.
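The 10:1 snapshot thinning mentioned in the description can be pictured with a minimal sketch. This is not the code from parser/tcpinfo.go: the `Snapshot` type and the `thinEvery` name are hypothetical stand-ins, and keeping the final snapshot is an assumption made here so that the end state of a connection is not lost.

```go
package main

import "fmt"

// Snapshot is a hypothetical stand-in for the parser's real snapshot type.
type Snapshot struct {
	Seq int // position in the original connection log
}

// thinEvery keeps every n-th snapshot (indices 0, n, 2n, ...).
// Assumption: the final snapshot is always retained as well, so the
// last observed state of the connection survives thinning.
func thinEvery(snaps []Snapshot, n int) []Snapshot {
	if n <= 1 || len(snaps) == 0 {
		return snaps
	}
	out := make([]Snapshot, 0, len(snaps)/n+2)
	for i := 0; i < len(snaps); i += n {
		out = append(out, snaps[i])
	}
	// Append the last snapshot unless the loop already selected it.
	if last := len(snaps) - 1; last%n != 0 {
		out = append(out, snaps[last])
	}
	return out
}

func main() {
	snaps := make([]Snapshot, 25)
	for i := range snaps {
		snaps[i].Seq = i
	}
	// With 25 input snapshots and n = 10, four snapshots remain.
	for _, s := range thinEvery(snaps, 10) {
		fmt.Println(s.Seq)
	}
}
```

A fixed stride is cheap and predictable, but it can drop short-lived events between kept snapshots, which is presumably why the TODO in the parser mentions "smarter filtering".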

Design:

Part of:

Testing:

  • Unit tests, local development mode with local output, and a sandbox deployment (with sandbox gardener configs, to verify processing). See: mlab-sandbox.ndt.tcpinfo.

This change is Reviewable

@coveralls
Collaborator

coveralls commented Mar 15, 2022

Pull Request Test Coverage Report for Build 7261

  • 35 of 45 (77.78%) changed or added relevant lines in 4 files are covered.
  • 6 unchanged lines in 3 files lost coverage.
  • Overall coverage increased (+0.7%) to 65.286%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
| --- | --- | --- | --- |
| parser/ndt.go | 0 | 2 | 0.0% |
| parser/tcpinfo.go | 33 | 41 | 80.49% |

| Files with Coverage Reduction | New Missed Lines | % |
| --- | --- | --- |
| parser/tcpinfo.go | 1 | 77.04% |
| schema/schema.go | 2 | 92.59% |
| active/poller.go | 3 | 64.33% |

| Totals | Coverage Status |
| --- | --- |
| Change from base Build 7253 | 0.7% |
| Covered Lines | 3972 |
| Relevant Lines | 6084 |

💛 - Coveralls

Contributor

@cristinaleonr cristinaleonr left a comment

Sorry about the delay. I will take a look at this and the design tomorrow.

Reviewable status: 0 of 1 approvals obtained

Contributor

@cristinaleonr cristinaleonr left a comment

Reviewable status: 0 of 1 approvals obtained (waiting on @stephen-soltesz)


parser/tcpinfo.go, line 194 at r1 (raw file):

			Metadata: tcpMeta,
			// TODO - restore full snapshots, or implement smarter filtering.
			Snapshots: thinSnaps(snaps),

Do you want to add a link to the Snapshot Thinning section of the doc here for more context?
Coincidentally, they recently updated the TotT episode on our floor to TODOs and TODONTs.


parser/tcpinfo.go, line 200 at r1 (raw file):

	if err := p.Put(&row); err != nil {
		metrics.TestTotal.WithLabelValues(p.TableName(), "tcpinfo", "put error").Inc()
		metrics.ErrorCount.WithLabelValues(p.TableName(), "", "put error").Inc()

How come we pass in a value for the "filetype" label for TestTotal right above but not for ErrorCount? Same question for the other errors above (e.g., WarningCount).


schema/schema.go, line 25 at r1 (raw file):

}

// ServerInfo details various information about the server.

I would say "various kinds of information" or just "information." Same thing below.


schema/tcpinfo.go, line 34 at r1 (raw file):

	Parser ParseInfo         `json:"parser" bigquery:"parser"`
	Date   civil.Date        `json:"date" bigquery:"date"`
	Raw    *TCPInfoRawRecord `json:"raw" bigquery:"raw"`

I added a comment about this type to the design doc.


schema/descriptions/toplevel.yaml, line 19 at r1 (raw file):

  Description: Original measurement collection timestamp.

# Lower case top-level columns from for 'Standard Column' schemas.

From or for?


schema/descriptions/TCPInfoRow.yaml, line 407 at r1 (raw file):

  Kernel:
VegasInfo:
  Description: Instrumntation in Vegas TCP

Instrumentation*. I know you didn't add this but I saw it when I was looking at the table in BQ and now I can't unsee it.

Contributor Author

@stephen-soltesz stephen-soltesz left a comment

Thank you. PTAL?

Reviewable status: 0 of 1 approvals obtained (waiting on @cristinaleonr)


parser/tcpinfo.go, line 194 at r1 (raw file):

Previously, cristinaleonr (Cristina Leon) wrote…

Do you want to add a link to the Snapshot Thinning section of the doc here for more context?
Coincidentally, they recently updated the TotT episode on our floor to TODOs and TODONTs.

:-) I copied this TODO -- another instance of "don't change more than I have to". But, it's worth revisiting, so I support your calling attention to it. I've created a new issue to consider snapshot thinning.


parser/tcpinfo.go, line 200 at r1 (raw file):

Previously, cristinaleonr (Cristina Leon) wrote…

How come we pass in a value for the "filetype" label for TestTotal right above but not for ErrorCount? Same question for the other errors above (e.g., WarningCount).

Good question. I try to keep conventions as I find them (as they were in this file). But, looking across other parsers, I see the datatype is typically here. And, I found two uses in the ndt.go parser that had the wrong number of labels (which could panic if ever called).

I would also prefer to take a close second pass across the etl metrics in a future cleanup PR. I'm sure we can help here. (For example, ErrorCount is not named well, and these may not need to be counted twice now that the labels are identical; I'm sure we can find other improvements.)

I've fixed these labels to match the datatype.
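For context on the label-count problem mentioned above: in the Prometheus Go client, `CounterVec.WithLabelValues` panics when the number of values differs from the number of label names the metric was declared with. The sketch below is a stdlib-only stand-in, not the etl metrics package or the Prometheus client. The `counterVec` type, its methods, and the label names are hypothetical, and it returns an error where the real client would panic so the mismatch is easy to observe.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// errCardinality reports a mismatch between label names and label values.
var errCardinality = errors.New("inconsistent label cardinality")

// counterVec is a hypothetical stand-in for a labeled counter. It models
// one behavior only: the number of label values passed on each call must
// equal the number of label names declared up front.
type counterVec struct {
	labels []string
	counts map[string]int
}

func newCounterVec(labels ...string) *counterVec {
	return &counterVec{labels: labels, counts: map[string]int{}}
}

// inc increments the counter for the given label values. It returns an
// error on a count mismatch (the real client panics in that case).
func (c *counterVec) inc(values ...string) error {
	if len(values) != len(c.labels) {
		return fmt.Errorf("%w: expected %d label values, got %d",
			errCardinality, len(c.labels), len(values))
	}
	c.counts[strings.Join(values, "\x00")]++
	return nil
}

// get returns the current count for the given label values.
func (c *counterVec) get(values ...string) int {
	return c.counts[strings.Join(values, "\x00")]
}

func main() {
	// Label names are assumptions for illustration.
	errorCount := newCounterVec("table", "filetype", "kind")

	// Correct: three values, with the datatype in the "filetype" slot.
	fmt.Println(errorCount.inc("tcpinfo_table", "tcpinfo", "put error"))

	// Wrong: only two values for a three-label metric.
	fmt.Println(errorCount.inc("tcpinfo_table", "put error"))
}
```

Because such a call only fails when the error path is actually exercised, a mismatch can sit unnoticed until production, which is one argument for covering error branches in parser unit tests.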


schema/schema.go, line 25 at r1 (raw file):

Previously, cristinaleonr (Cristina Leon) wrote…

I would say "various kinds of information" or just "information." Same thing below.

Done.


schema/tcpinfo.go, line 34 at r1 (raw file):

Previously, cristinaleonr (Cristina Leon) wrote…

I added a comment about this type to the design doc.

Good idea. Updated.


schema/descriptions/toplevel.yaml, line 19 at r1 (raw file):

Previously, cristinaleonr (Cristina Leon) wrote…

From or for?

lol, "pick one". Done.


schema/descriptions/TCPInfoRow.yaml, line 407 at r1 (raw file):

Previously, cristinaleonr (Cristina Leon) wrote…

Instrumentation*. I know you didn't add this but I saw it when I was looking at the table in BQ and now I can't unsee it.

Fixes always welcome. Done.

Contributor

@cristinaleonr cristinaleonr left a comment

:lgtm:

Reviewable status: 0 of 1 approvals obtained

Contributor

@cristinaleonr cristinaleonr left a comment

:lgtm:

Reviewable status: :shipit: complete! 1 of 1 approvals obtained

@stephen-soltesz stephen-soltesz merged commit cf97719 into master Mar 22, 2022
@stephen-soltesz stephen-soltesz deleted the sandbox-soltesz-tcpinfo-v2 branch March 22, 2022 18:01
cristinaleonr pushed a commit that referenced this pull request Jun 3, 2022
* Update tcpinfo schema to use std columns
* Rename tcpinfo descriptions for standard columns
* Create v2 pipeline tables for tcpinfo
* Move a record description to toplevel.yaml
* Use datatype for ErrorCount metric
* Use native snapshot.ConnectionLog for raw record