
@gfr10598 gfr10598 commented Jul 7, 2020

Add separate transfers every six hours.

It is hoped that this will make the transfers faster, so we don't miss the start of parsing.

Also restricts the transfer to files modified in the past 5 days. Hopefully transfer service can use this to optimize the request.

NOTE: I haven't quite figured out how to create multiple requests. Changed the description, but that doesn't seem to be adequate.
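
To make the intent concrete, here is a rough sketch (not the PR's actual diff) of what four six-hourly jobs with a five-day modification window could look like when built with the google.golang.org/api/storagetransfer/v1 Go client. The bucket names, start times, and description format below are illustrative placeholders.

package sketch

import (
	"fmt"

	storagetransfer "google.golang.org/api/storagetransfer/v1"
)

// sixHourlyJobs builds one transfer job per six-hour start time.
// Buckets, times, and the description format are illustrative, not the PR's code.
func sixHourlyJobs(src, dest string, prefixes []string) []*storagetransfer.TransferJob {
	var jobs []*storagetransfer.TransferJob
	for _, hour := range []int64{2, 8, 14, 20} { // four starts per day, six hours apart
		jobs = append(jobs, &storagetransfer.TransferJob{
			Description: fmt.Sprintf("STCTL: transfer %s -> %s at %02d:00:00", src, dest, hour),
			Status:      "ENABLED",
			Schedule: &storagetransfer.Schedule{
				StartTimeOfDay: &storagetransfer.TimeOfDay{Hours: hour},
			},
			TransferSpec: &storagetransfer.TransferSpec{
				GcsDataSource: &storagetransfer.GcsData{BucketName: src},
				GcsDataSink:   &storagetransfer.GcsData{BucketName: dest},
				ObjectConditions: &storagetransfer.ObjectConditions{
					IncludePrefixes: prefixes,
					// Five days, expressed as a protobuf Duration string.
					MaxTimeElapsedSinceLastModification: "432000s",
				},
			},
		})
	}
	return jobs
}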


This change is Reviewable

@gfr10598 gfr10598 requested a review from stephen-soltesz July 7, 2020 21:20
@coveralls coveralls commented Jul 7, 2020

Pull Request Test Coverage Report for Build 146

  • 6 of 6 (100.0%) changed or added relevant lines in 2 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage remained the same at 100.0%

Totals Coverage Status

  • Change from base Build 134: 0.0%
  • Covered Lines: 334
  • Relevant Lines: 334

💛 - Coveralls


@stephen-soltesz stephen-soltesz left a comment


Are the "multiple jobs" failing due to the image names in daily-archive-transfers.yaml?

Reviewable status: 1 change request, 0 of 1 approvals obtained (waiting on @gfr10598 and @stephen-soltesz)


daily-archive-transfers.yaml, line 49 at r1 (raw file):

# 08:30:00 Configure daily pusher to local archive transfer.
- name: gcp-config-cbif-0830

name refers to the docker image built in the first step. The comment is sufficient to declare the time.


internal/stctl/create.go, line 50 at r1 (raw file):

// jobs. WARNING: Do not modify this format without adjusting existing configs to match.
func getDesc(src, dest string, start flagx.Time) string {
	return fmt.Sprintf("STCTL: transfer %s -> %s at %s", src, dest, start)

Hmm, this is too bad. The storagetransfer API is limited. The web UI more so. The structured description was chosen to help the 'sync' operation identify pre-existing transfer jobs, so they could be "disabled" (rather than deleted, so they remain in the historical web ui view) and a new one created with the correct settings.

This change would prevent the 'sync' operation from identifying earlier jobs that require updating the start time, and could erroneously leave earlier transfer schedules active if the start time were changed in the config.

(This change will require manually ending existing transfer jobs with the earlier names. But, maybe keeping the old ones would actually be valuable to address the comment below?)
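
To illustrate the matching that the 'sync' operation relies on, here is a hypothetical helper (not code from internal/stctl) that identifies an existing job by its structured description while ignoring the start time; folding the start time into the matched text, as the new getDesc does, is what breaks this kind of lookup.

package sketch

import (
	"fmt"
	"strings"
)

// descMatches reports whether an existing job's description names the same
// src -> dest pair, regardless of the start time embedded in it. This is a
// made-up helper, not code from internal/stctl.
func descMatches(desc, src, dest string) bool {
	return strings.HasPrefix(desc, fmt.Sprintf("STCTL: transfer %s -> %s at ", src, dest))
}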


internal/stctl/create.go, line 66 at r1 (raw file):

			IncludePrefixes: prefixes,
			// Only files modified in the last 5 days.
			MaxTimeElapsedSinceLastModification: "432000",

While possibly extraordinary, this leaves the possibility of missing archives. Perhaps a combination of a daily transfer for "all files" and the smaller, time-bounded transfers for expediency?


@stephen-soltesz stephen-soltesz left a comment


Reviewable status: 1 change request, 0 of 1 approvals obtained (waiting on @gfr10598)


internal/stctl/create.go, line 66 at r1 (raw file):

Previously, stephen-soltesz (Stephen Soltesz) wrote…

While possibly extraordinary, this leaves the possibility of missing archives. Perhaps a combination of a daily transfer for "all files" and the smaller, time-bounded transfers for expediency?

Oof, so for an example of how this API is limited (and darn strange), the deploy is failing b/c the value is not suffixed with a unit "s":

Failed to sync (error: googleapi: Error 400: Invalid value at 'transfer_job.transfer_spec.object_conditions.max_time_elapsed_since_last_modification' (type.googleapis.com/google.protobuf.Duration), Field 'maxTimeElapsedSinceLastModification', Illegal duration format; duration must end with 's', invalid)
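
For reference, the API expects a google.protobuf.Duration string: whole seconds with an "s" suffix, e.g. "432000s" for five days. A minimal, made-up helper for deriving such values:

package sketch

import (
	"fmt"
	"time"
)

// durationSeconds renders a Go duration in the whole-second, "s"-suffixed form
// that google.protobuf.Duration fields in this API accept.
// For example, durationSeconds(5*24*time.Hour) == "432000s".
func durationSeconds(d time.Duration) string {
	return fmt.Sprintf("%ds", int64(d.Seconds()))
}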


@gfr10598 gfr10598 left a comment


Reviewable status: 1 change request, 0 of 1 approvals obtained (waiting on @stephen-soltesz)


daily-archive-transfers.yaml, line 49 at r1 (raw file):

Previously, stephen-soltesz (Stephen Soltesz) wrote…

name refers to the docker image built in the first step. The comment is sufficient to declare the time.

Done.


internal/stctl/create.go, line 50 at r1 (raw file):

Previously, stephen-soltesz (Stephen Soltesz) wrote…

Hmm, this is too bad. The storagetransfer API is limited. The web UI more so. The structured description was chosen to help the 'sync' operation identify pre-existing transfer jobs, so they could be "disabled" (rather than deleted, so they remain in the historical web ui view) and a new one created with the correct settings.

This change would prevent the 'sync' operation from identifying earlier jobs that require updating the start time, and could erroneously leave earlier transfer schedules active if the start time were changed in the config.

(This change will require manually ending existing transfer jobs with the earlier names. But, maybe keeping the old ones would actually be valuable to address the comment below?)

I think this is an acceptable compromise. We can manually disable or delete jobs we no longer need. If we delete one by mistake, it will be restored by a future deployment.


internal/stctl/create.go, line 66 at r1 (raw file):

Previously, stephen-soltesz (Stephen Soltesz) wrote…

Oof, so for an example of how this API is limited (and darn strange), the deploy is failing b/c the value is not suffixed with a unit "s":

Failed to sync (error: googleapi: Error 400: Invalid value at 'transfer_job.transfer_spec.object_conditions.max_time_elapsed_since_last_modification' (type.googleapis.com/google.protobuf.Duration), Field 'maxTimeElapsedSinceLastModification', Illegal duration format; duration must end with 's', invalid)

Fixed the missing "s".

I like the idea of keeping one job that has no filter, but also is not time sensitive.

Or, we could have a job that has a minimum past time, like 48 hours, and the incremental jobs could have a maximum time like 12 hours or 24 hours. Then they would be handling different data, and not stepping on each other. The incremental one could be more restrictive, because the other one would handle anything in the past.

One issue is that, since I don't want to reprocess tcpinfo more than annually, any files from previous days wouldn't get reprocessed until a year later. They would be available in the archive, and we could detect that they were missing from the tables. Perhaps in future we could add automatic detection and correction of such missing archives.
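
A sketch of the non-overlapping split described above, using the ObjectConditions fields from the storagetransfer/v1 Go client; the 48-hour and 24-hour boundaries are just the values floated in this thread.

package sketch

import storagetransfer "google.golang.org/api/storagetransfer/v1"

// catchUpConditions would cover anything older than 48 hours, so a daily
// catch-up job eventually picks up whatever the incremental jobs missed.
func catchUpConditions(prefixes []string) *storagetransfer.ObjectConditions {
	return &storagetransfer.ObjectConditions{
		IncludePrefixes:                     prefixes,
		MinTimeElapsedSinceLastModification: "172800s", // 48 hours
	}
}

// incrementalConditions would cover only the last 24 hours, keeping the
// six-hourly transfers small and fast.
func incrementalConditions(prefixes []string) *storagetransfer.ObjectConditions {
	return &storagetransfer.ObjectConditions{
		IncludePrefixes:                     prefixes,
		MaxTimeElapsedSinceLastModification: "86400s", // 24 hours
	}
}

With this split, a file missed by the incremental runs would only be seen by the catch-up job once it is at least 48 hours old, which ties into the reprocessing concern above.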


@gfr10598 gfr10598 left a comment


Reviewable status: 1 change request, 0 of 1 approvals obtained (waiting on @stephen-soltesz)


internal/stctl/create.go, line 50 at r1 (raw file):

Previously, gfr10598 (Gregory Russell) wrote…

I think this is an acceptable compromise. We can manually disable or delete jobs we no longer need. If we delete one by mistake, it will be restored by a future deployment.

Alternatively, we could change the CLI to take multiple times, and make the sync more sophisticated, so that it deletes any jobs with the same buckets, but missing times.

I think I'm happy with this as is for now though.


@stephen-soltesz stephen-soltesz left a comment


Please fix the unit tests (Travis is failing). Then, :lgtm:

Reviewable status: :shipit: complete! 1 of 1 approvals obtained (waiting on @gfr10598)


internal/stctl/create.go, line 50 at r1 (raw file):

Previously, gfr10598 (Gregory Russell) wrote…

Alternatively, we could change the CLI to take multiple times, and make the sync more sophisticated, so that it deletes any jobs with the same buckets, but missing times.

I think I'm happy with this as is for now though.

Would you be willing to write up this idea in an issue in this repo? I like the idea of preserving the automatic 'sync' behavior, but I also agree that there are better uses of time.


internal/stctl/create.go, line 66 at r1 (raw file):

Previously, gfr10598 (Gregory Russell) wrote…

Fixed the missing "s".

I like the idea of keeping one job that has no filter, but also is not time sensitive.

Or, we could have a job that has a minimum past time, like 48 hours, and the incremental jobs could have a maximum time like 12 hours or 24 hours. Then they would be handling different data, and not stepping on each other. The incremental one could be more restrictive, because the other one would handle anything in the past.

One issue is that, since I don't want to reprocess tcpinfo more than annually, any files from previous days wouldn't get reprocessed until a year later. They would be available in the archive, and we could detect that they were missing from the tables. Perhaps in future we could add automatic detection and correction of such missing archives.

Yeah, I like the non-overlapping job idea too. 👍


@gfr10598 gfr10598 left a comment


Reviewable status: :shipit: complete! 1 of 1 approvals obtained (waiting on @stephen-soltesz)


internal/stctl/create.go, line 50 at r1 (raw file):

Previously, stephen-soltesz (Stephen Soltesz) wrote…

Would you be willing to write up this idea in an issue in this repo. I like the idea of preserving the automatic 'sync' behavior, but I also agree that there is better use of time.

Done.


internal/stctl/create.go, line 66 at r1 (raw file):

Previously, stephen-soltesz (Stephen Soltesz) wrote…

Yeah, I like the non-overlapping job idea too. 👍

I'll start another PR.

@gfr10598 gfr10598 merged commit c3ace7e into master Jul 8, 2020
@gfr10598 gfr10598 deleted the sandbox-six-hour-transfer branch July 8, 2020 13:42