Skip to content

Conversation

gfr10598
Copy link
Contributor

@gfr10598 gfr10598 commented Jul 8, 2020

This adds CLI parameters to allow filtering of files based on last mod time.

We hope that this will allow DataTransfer service to be more efficient than if it has to review the entire history.
The config file now imposes 12h age restrictions on 3 of the 4 transfers. The mid-day transfer has NO age
restriction, so it will handle any old files that were missed in recent transfers.
Q: Should we add a liberal limit for the mid-day transfer, maybe one week or one month?

The PR also includes, for now, commenting out of some lines in the test configs that appear to be superfluous.


This change is Reviewable

@gfr10598 gfr10598 requested a review from stephen-soltesz July 8, 2020 15:02
@coveralls
Copy link

coveralls commented Jul 8, 2020

Pull Request Test Coverage Report for Build 168

  • 16 of 16 (100.0%) changed or added relevant lines in 2 files are covered.
  • 8 unchanged lines in 1 file lost coverage.
  • Overall coverage decreased (-2.3%) to 97.654%

Files with Coverage Reduction New Missed Lines %
internal/stctl/sync.go 8 87.5%
Totals Coverage Status
Change from base Build 160: -2.3%
Covered Lines: 333
Relevant Lines: 341

💛 - Coveralls

Copy link
Contributor

@stephen-soltesz stephen-soltesz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 1 approvals obtained (waiting on @gfr10598 and @stephen-soltesz)


daily-archive-transfers.yaml, line 42 at r1 (raw file):

    'stctl', '-gcs.source=archive-mlab-oti',
             '-gcs.target=archive-measurement-lab',
             '-time=04:00:00',

04:00 is not like the others. Is that intentional?

p->a: 02:30, 08:30, 14:30, 20:30
a->a: 04:00, 11:30, 17:30, 23:30


internal/stctl/create.go, line 66 at r1 (raw file):

	if prefixes != nil {
		cond.IncludePrefixes = prefixes
		spec.ObjectConditions = cond

Is behavior identical for an empty condition as nil?


internal/stctl/create.go, line 69 at r1 (raw file):

	}
	if maxAge > 0 {
		cond.MaxTimeElapsedSinceLastModification = fmt.Sprintf("%ds", maxAge/1000000000)

maxAge.Seconds()?


internal/stctl/sync_test.go, line 92 at r1 (raw file):

								Description: getDesc("fake-source", "fake-target", flagx.Time{Hour: 1, Minute: 2, Second: 3}),
								Schedule:    &storagetransfer.Schedule{
									// ScheduleEndDate: nil,

Why are these unnecessary now?

@gfr10598 gfr10598 marked this pull request as ready for review July 8, 2020 19:19
@gfr10598 gfr10598 force-pushed the sandbox-old-recent branch from 8b65808 to f111be8 Compare July 8, 2020 19:22
@gfr10598 gfr10598 changed the base branch from master to sandbox-soltesz-cbif July 8, 2020 19:39
@gfr10598 gfr10598 changed the base branch from sandbox-soltesz-cbif to hack July 8, 2020 19:41
Copy link
Contributor Author

@gfr10598 gfr10598 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 1 approvals obtained (waiting on @stephen-soltesz)


daily-archive-transfers.yaml, line 42 at r1 (raw file):

Previously, stephen-soltesz (Stephen Soltesz) wrote…

04:00 is not like the others. Is that intentional?

p->a: 02:30, 08:30, 14:30, 20:30
a->a: 04:00, 11:30, 17:30, 23:30

Yes. That is commented in the NOTE line in the comments. May require adjustment later.


internal/stctl/create.go, line 66 at r1 (raw file):

Previously, stephen-soltesz (Stephen Soltesz) wrote…

Is behavior identical for an empty condition as nil?

I believe so. ObjectConditions were previously only added if prefixes were provided, so I wanted to preserve that behavior.


internal/stctl/create.go, line 69 at r1 (raw file):

Previously, stephen-soltesz (Stephen Soltesz) wrote…

maxAge.Seconds()?

Done.


internal/stctl/sync_test.go, line 92 at r1 (raw file):

Previously, stephen-soltesz (Stephen Soltesz) wrote…

Why are these unnecessary now?

Added comments and deleted lines. See comment below for this particular one.

Copy link
Contributor

@stephen-soltesz stephen-soltesz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 5 files at r3.
Reviewable status: 1 change requests, 0 of 1 approvals obtained (waiting on @gfr10598)


daily-archive-transfers.yaml, line 42 at r1 (raw file):

Previously, gfr10598 (Gregory Russell) wrote…

Yes. That is commented in the NOTE line in the comments. May require adjustment later.

Well, shame on me! I guess I don't read comments carefully even when I'm supposed to be reviewing them O.O

I recommend a consistent pattern so that it's easier to reason about and anticipate, and updating gardener and other dependencies as necessary.

A consistent drumbeat from all the work I do on uniform names, standard columns, and earlier naming conventions is to eliminate idiosyncrasies and special cases so that anyone knows what to expect with minimal cognitive load around all the exceptions. Exceptions add up, and they make large systems inscrutable and unmanageable over time.

If not in this PR, then please open an issue.


daily-archive-transfers.yaml, line 82 at r3 (raw file):

# 14:30:00 Configure daily pusher to local archive transfer.
# For this one, maxFileAge is set to 7 days, to catch stragglers.
# This likely means the calculating phase will take much longer.

Please also mention that this is "midday" relative to the "last transfer of the day" at 02:30, so the longer calculation should be okay.


internal/stctl/sync_test.go, line 177 at r3 (raw file):

								Description: getDesc("fake-source", "fake-target", flagx.Time{Hour: 1, Minute: 2, Second: 3}),
								// Schedule can be empty because there is no TransferSpec?
								Schedule: &storagetransfer.Schedule{},

If these changes are causing the coverage profile to drop, please restore them.

https://coveralls.io/builds/31948756/source?filename=internal/stctl/sync.go

It doesn't make sense that the other changes would cause these lines to be uncovered.

@gfr10598 gfr10598 force-pushed the sandbox-old-recent branch from 1552370 to 8be88d4 Compare July 9, 2020 14:40
@gfr10598 gfr10598 force-pushed the sandbox-old-recent branch from 8be88d4 to dfa2caf Compare July 9, 2020 14:47
Copy link
Contributor Author

@gfr10598 gfr10598 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 1 change requests, 0 of 1 approvals obtained (waiting on @stephen-soltesz)


daily-archive-transfers.yaml, line 42 at r1 (raw file):

Previously, stephen-soltesz (Stephen Soltesz) wrote…

Well, shame on me! I guess I don't read comments carefully even when I'm supposed to be reviewing them O.O

I recommend a consistent pattern so that it's easier to reason about and anticipate, and updating gardener and other dependencies as necessary.

A consistent drumbeat from all the work I do on uniform names, standard columns, and earlier naming conventions is to eliminate idiosyncrasies and special cases so that anyone knows what to expect with minimal cognitive load around all the exceptions. Exceptions add up, and they make large systems inscrutable and unmanageable over time.

If not in this PR, then please open an issue.

Ok. We will need to delay the parser trigger then, but I think that is needed anyway, as the current old gardener trigger is at 04:10 UTC.

I've updated the times to 02:30 and 04:30, and then at six hour intervals thereafter.


daily-archive-transfers.yaml, line 82 at r3 (raw file):

Previously, stephen-soltesz (Stephen Soltesz) wrote…

Please also mention that this is "midday" relative to the "last transfer of the day" at 02:30, so the longer calculation should be okay.

Done.


internal/stctl/sync_test.go, line 177 at r3 (raw file):

Previously, stephen-soltesz (Stephen Soltesz) wrote…

If these changes are causing the coverage profile to drop, please restore them.

https://coveralls.io/builds/31948756/source?filename=internal/stctl/sync.go

It doesn't make sense that the other changes would cause these lines to be uncovered.

Done.

Copy link
Contributor

@stephen-soltesz stephen-soltesz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One clarifying question and request for an extra comment. :lgtm:

Reviewed 2 of 6 files at r1, 1 of 5 files at r3, 2 of 3 files at r4.
Reviewable status: :shipit: complete! 1 of 1 approvals obtained (waiting on @gfr10598)


daily-archive-transfers.yaml, line 42 at r1 (raw file):

Previously, gfr10598 (Gregory Russell) wrote…

Ok. We will need to delay the parser trigger then, but I think that is needed anyway, as the current old gardener trigger is at 04:10 UTC.

I've updated the times to 02:30 and 04:30, and then at six hour intervals thereafter.

Is 2 hours sufficient?

Recent pusher-to-archive transfers take 2.5 to 3hrs. https://console.cloud.google.com/transfer/cloud?project=mlab-oti&folder&organizationId This is without the benefit of the min or maxFileAge -- You believe this will work?

Please add a summary note at the start of the file, so it's easy to see in one place, like:

p->a: 02:30, 08:30, 14:30, 20:30
a->a: 04:30, 10:30, 16:30, 22:30

Copy link
Contributor Author

@gfr10598 gfr10598 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 1 approvals obtained (waiting on @stephen-soltesz)


daily-archive-transfers.yaml, line 42 at r1 (raw file):

Previously, stephen-soltesz (Stephen Soltesz) wrote…

Is 2 hours sufficient?

Recent pusher-to-archive transfers take 2.5 to 3hrs. https://console.cloud.google.com/transfer/cloud?project=mlab-oti&folder&organizationId This is without the benefit of the min or maxFileAge -- You believe this will work?

Please add a summary note at the start of the file, so it's easy to see in one place, like:

p->a: 02:30, 08:30, 14:30, 20:30
a->a: 04:30, 10:30, 16:30, 22:30

Done.

@gfr10598 gfr10598 changed the base branch from hack to master July 10, 2020 20:27
@gfr10598 gfr10598 merged commit 65c9cad into master Jul 10, 2020
@gfr10598 gfr10598 deleted the sandbox-old-recent branch July 10, 2020 20:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants