-
Notifications
You must be signed in to change notification settings - Fork 1.2k
Add design proposal for workload history #30486
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
I couldn't figure out the best area label to add to this PR. If you have write-permissions please help me learn by adding exactly one area label. |
|
|
||
| ## `dotnet workload history` | ||
|
|
||
| The `dotnet workload history` command will show the history of workload installation commands. The output will be similar to the following format: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we include the success/failure indicator in the table as well?
|
|
||
| For example: | ||
|
|
||
| `dotnet workload update --from-history --before 3 --manifests-only` - This will roll back to the manifests before operation 3 in the workload history. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So before walks back and selects the first operation that was successful? For example, if you did --before 8 and 6 and 7 failed, but 5 succeeded, it will select 5? Would --after 2 select the last known successful state (if 3 and 5 were successful and 4 failed, it would pick 5)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd say --before picks the latest successful operation before the id specified, so in your case, it would pick 5, but if 1/2/3/4 were all successful, it would still pick 5. After would pick the first successful one after that, so 3 for your case or 2 if that one was successful. (There has to be some way to say "pick 2 if possible", and after seems more appropriate than before.) Does that sound reasonable to you?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, what I was going for was that --before n would give you the state immediately before command n was executed, regardless of whether it succeeded or failed. If command failed, then ideally it would roll back any changes, and --before n and --after n would mean the same thing.
Basically each record in the history records the state before and after the command was executed. The number specified selects which record to use, and whether you specify --before or --after selects which part of the state in that record to use. Whether a command was successful or not doesn't affect the logic.
Does this make sense? If this wasn't obvious from my description, maybe it's also going to be confusing to developers and we should try to find a different way to express these operations.
|
Should the standalone install include an entry to tell us when it updated baseline manifests? |
|
|
||
| When a CLI command is run which modifies the installed workloads or updates the manifests, we will write a record to the workload history log. The workload commands this applies to include `install`, `update`, `uninstall`, `repair`, and `restore`. Each workload history record will include the following information: | ||
|
|
||
| - Date/time command was run |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we really need all this information? I'm wondering why we need both time run and time completed, I'm wondering if we should be keeping failures at all (given that they don't actually update workload state and we aren't rolling back to them), and I'm a bit iffy on the rollback file/workload state parts, not because those aren't reasonable things to include but because those are potentially big, and a file that's intended to persist and never go is a major problem if it grows that fast. We presumably aren't intending to delete it at any point (or what's the point of having the file in the first place?), so I can easily imagine this becoming a problem, especially if we really do include all this information. In particular, imagine if I specify the wrong file by accident as my rollback file. It's 10 GB. I include the whole thing in this file, but the command fails. I don't immediately realize what my mistake was, so I run it 2-3 more times, adding another 20-30 GB to the history file. Do we really want a 40 GB file here? And I would argue that if the user then deletes the file as "why do I have this massive thing in my SDK?", we've failed. I think it should ideally be as streamlined as possible to prevent users from ever hitting that problem unless they run 100 million commands.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
None of this should be big. The workload state isn't supposed to include the actual contents of the workload packs or even the manifests. "Installed workloads" just means the list of names of the workloads that are installed. To roll back to a given state may still require downloading the workload packages from NuGet feeds, we aren't trying to avoid that.
Time run and time completed lets us know how long the command took to run, which could be useful. I think we should try to gather all this information as well as other information that we think might be useful. We'll be writing a separate data file for each command, and I would guess that each file would be 1 KB or less.
I think it's very unlikely that someone would accidentally specify a rollback file that was very large but still parsed correctly as a rollback file. We don't need to include the rollback file contents verbatim, just what we parsed out of it successfully. And if it fails to parse, then that's a failure that's early enough that I agree we wouldn't need to write a workload history record. If we actually start to try downloading something or installing something, I think we should write the record even if the operation ends up failing.
|
Is this superseded by #36809 ? @Forgind @dsplaisted |
Yes |
No description provided.