Updated OOD_WORKFLOW_SYNC_KEY usage in project manager tutorial#1349
Conversation
|
@Bubballoo3 Please help to review this |
Bubballoo3
left a comment
There was a problem hiding this comment.
I think this will work better if we weave it into the existing tutorial rather than add a section beneath it. First things first, we need to update the existing tutorial to use the OOD_WORKFLOW_SYNC_KEY instead of the COUNT variable. We can move your first section explaining the basic premise/behavior at the top of that section (where we currently have 'We also add a COUNT variable so that we can control when data is overwritten, an essential consideration if you plan to have more than one instance of a workflow run simultaneously.'). We will also need to add the step of enabling the sync key to that section.
I think it would also be helpful to demonstrate that the sync key overrides the value set by the launcher, which is useful for debugging. That is, we configure an environment variable for OOD_WORKFLOW_SYNC_KEY in both launchers, with a default value like 'test'. That way we can run each launcher independently to ensure that they work as expected. When they run in the workflow, that 'test' value gets overridden with the actual synchronization value. I think that process is super important to understand for debugging, so it would be great to include there.
I think the tempdir and lockfile examples are alright, but could be covered more briefly given that they are independent from the tangible example used in the rest of the tutorial. Maybe we can add a section like 'Alternative synchronization approaches' and then cover those briefly without presenting any code (so that all the code contributes to the main example).
Finally, I really like the conclusion of 'The Project Manager is designed to be flexible, so if the examples above aren't applicable to your needs, you can always design your own systems and conventions that fit your specific situation.' so let's keep that as the closing line.
|
Now that I think about it a little more, I am not sure that the lock file example you provide is actually the best way to do that. Since dependent launchers won't run if their parent job fails, it would be better to promote a pattern where a script checks its own output before finishing, and fails if something has gone wrong. That's a minor point, but it might be safer to just cover tmpdirs to show that you can name directories with the sync key in addition to files (which will be demonstrated in the main tutorial content). Since we already show files being named with the key, I would rather avoid the question of what people end up using those synchronized files for (leave it up to them). |
I thought of doing that, but it will cause confusion with the gif attached below. So kept it same. I believe better idea will be to add a sentence like - "OOD_WORKFLOW_SYNC_KEY can be used in place of COUNT in above example". Lmk if you still want to update the COUNT tutorial. |
That just means that the gif is out of date and will have to be re-recorded. The COUNT variable was only introduced as a crutch to get around this synchronization issue; now that we have a better approach there is no reason to leave it in there. |
|
Agreed I will do it |
|
4.2 is going to be releasing this week, so try to get to this later today. Even if you can't fully finish it, if you can get your in-progress changes pushed then I can finish it up without worrying about merge conflicts |
|
I will do this today and later once you agree upon changes I will make the gif. |
Bubballoo3
left a comment
There was a problem hiding this comment.
This looks great! I made a few suggestions for phrasing, but overall the content looks good. You should be good to record the gif, just make sure you show the part where the switch gets turned on.
A last point, I think the debugging advice on the bottom makes more sense as its own paragraph, but I left this out of the suggestion so that the word-by-word changes remained visible.
| ``OOD_WORKFLOW_SYNC_KEY`` variable so that we can control when data is overwritten, an essential consideration | ||
| if you plan to have more than one instance of a workflow run simultaneously. You need to submit workflow with | ||
| **Enable OOD_WORKFLOW_SYNC_KEY** turned on for above example. Thus every launcher in workflow run receives | ||
| *same* random 16-character token through the ``OOD_WORKFLOW_SYNC_KEY`` environment variable. The token stays | ||
| identical across launchers and unique to each run, which helps launchers to share intermediate files in this example. |
There was a problem hiding this comment.
| ``OOD_WORKFLOW_SYNC_KEY`` variable so that we can control when data is overwritten, an essential consideration | |
| if you plan to have more than one instance of a workflow run simultaneously. You need to submit workflow with | |
| **Enable OOD_WORKFLOW_SYNC_KEY** turned on for above example. Thus every launcher in workflow run receives | |
| *same* random 16-character token through the ``OOD_WORKFLOW_SYNC_KEY`` environment variable. The token stays | |
| identical across launchers and unique to each run, which helps launchers to share intermediate files in this example. | |
| ``OOD_WORKFLOW_SYNC_KEY`` variable so that we can prevent data from being overwritten, an essential consideration | |
| if you plan to have more than one instance of a workflow run simultaneously. To use this variable, we have to set | |
| **Enable OOD_WORKFLOW_SYNC_KEY** to 'on' for this workflow, which can be selected when the workflow is created or in the 'Edit' form of an existing workflow. When enabled, every launcher in the workflow receives the | |
| *same* random 16-character token through the ``OOD_WORKFLOW_SYNC_KEY`` environment variable. The token stays | |
| identical across launchers and unique to each run, which in this example allows the launchers to read and write data to intermediate files. |
|
Content looks great! I made two small edits but am totally happy with it now. As soon as you can record the updated gif, this is ready to move. Thanks so much! |
|
I will capture the video once OSC website is back online after maintenance. |
|
Updated the video and tested. Its good to go from my side. |
Bubballoo3
left a comment
There was a problem hiding this comment.
Looks good! Thanks so much for your work on this!
There was a problem hiding this comment.
It looks like this file may have been added by accident in place of the updated mp4 source file. This doesn't need to prevent us from merging, but would be nice to fix soon
There was a problem hiding this comment.
Since the DS_store is a metadata file, I will delete it for now and we can just add the new .mp4 in a separate PR, since it does not impact the site
There was a problem hiding this comment.
Oh I named the gif same as earlier; and old long video was irrelevant so I delete that. So the GIF is updated and we don't need 24FPS mp4 anyways. Lmk if you want me to upload that.
https://osc.github.io/ood-documentation-test/develop/
Added details on possible use of OOD_WORKFLOW_SYNC_KEY, not restricted to examples provided.
Feature was added in PR OSC/ondemand#5348
Let me know if you can think of any other use-cases, I will add those too.