Refactor links status symlink dirs to be encapsulated in the LogManager, not stored on the FlowCfg / JobSpec objects#174
Conversation
c472796 to
618bc71
Compare
machshev
left a comment
There was a problem hiding this comment.
Thanks @AlexJones0, this is better than what we have now. Just wondering if we can fully sweep this all up into the LogManager using Lazy operations and the callbacks?
f21b5ad to
191a18a
Compare
The workspace config is just defined from attributes on the `FlowCfg` and then frozen, so there is no need for it to be constructed on each `Deploy`. We instead create a frozen configuration once in the base flow after all HJSON wildcard expansion, and use that configuration for every `Deploy` object. Signed-off-by: Alex Jones <alex.jones@lowrisc.org>
We do not need each of these independent call-sites; in reality, these status symlink dirs are used for jobs running in the scheduler and so we only need to make them when the scheduler is run. Likewise, these can be encapsulated in the `LogManager` scheduler observer with the rest of the status linking functionality. This also helps reduce duplication - we do not need to ensure that each flow that runs the scheduler remembers to create the directories, and we do not need to duplicate the same code across the different flows. Signed-off-by: Alex Jones <alex.jones@lowrisc.org>
Even when using the legacy Launchers to run jobs, they will now be going through the new async scheduler with symlinked dirs managed by the `LogManager`. So there is no need to do the linking in each of the launchers anymore. Remove this deprecated code. Signed-off-by: Alex Jones <alex.jones@lowrisc.org>
Make the scheduler tests no longer rely on the existence of `JobSpec.links`. The "links" defined by default on the `job_spec_factory` will be removed by a future commit when the attribute is dropped from the `JobSpec`. We can also simplify some tests - the new scheduler no longer makes assumptions that certain paths do or do not exist. Signed-off-by: Alex Jones <alex.jones@lowrisc.org>
These have now been migrated to the `LogManager` so that all the log status symlinking logic takes place in one centralized location. Signed-off-by: Alex Jones <alex.jones@lowrisc.org>
191a18a to
537355b
Compare
links status symlink dirs to be derived from the WorkspaceConfig, not stored on the FlowCfg / JobSpec objectslinks status symlink dirs to be encapsulated in the LogManager, not stored on the FlowCfg / JobSpec objects
rswarbrick
left a comment
There was a problem hiding this comment.
Wow, this is dramatically simpler! One minor question, but it looks like a great improvement.
|
|
||
| def status_symlink_dir(self, job: JobSpec, status: JobStatus) -> Path: | ||
| """Get the output dir for symlinking jobs of a given status.""" | ||
| return job.workspace_cfg.scratch_path / status.name.lower() |
There was a problem hiding this comment.
Should this have any check to make sure that the status is terminal/running? If neither is true, then presumably it will return a path that isn't used?
There was a problem hiding this comment.
I opted to take the approach that the caller is responsible for calling has_symlink_dir to check if a status is terminal/running. For this little helper, I think it making it able to return a directory for any status makes sense. If somebody wants to make a "scheduled" or "queued" directory for whatever reason, they can. But we just choose not to because it doesn't mean much for our use case.
WDYT?
Do some refactoring as a follow-up to address #138 (comment) and #163 (and hence #146 (comment)).
Rather than pre-compute the status symlink output directories on the different
FlowCfgobjects and pass them through theJobSpecobjects, move all this logic to be encapsulated within the newLogManageralong with the existing runtime status symlinking logic.See the relevant commit messages for more information.
Closes #163.