Workflow guidelines in amor reduction #48
I think the guidelines are a bit unclear on the …. Ultimately, any file identifier (be it a repository + name, a hash, a SciCat ID, or whatever) will be resolved to a file path in the file system, so in my mind it's natural to call that path a …. It's also easier if we make ….
I'm not fond of the …. Generally, I don't think it's good to name anything ….
Alternatively, we could just get rid of the …. The ….
The … is just what it's called in other workflows, so I would like to standardize it all instead of having each package invent its own naming. About the …: in addition, as an example, say I have events that are binned in pixels, and then I apply masks on the pixels. Currently, such data is named …. I do get the point about ….
@SimonHeybrock this came out of a discussion I had with @jokasimr about how to handle different sources for files. Previously, we had settled on just having a …. His alternative solution was to have something like ….
So then, depending on which one you set in your workflow/params, it would pick the appropriate provider. I guess the question is: what was our plan for handling SciCat files with our current solution?
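The multiple-filename-types idea can be sketched roughly as follows. This is an illustration only, outside of sciline, and all type and function names here are invented for the example, not taken from any ESS package: each file source gets its own parameter type, and a dedicated provider resolves it to the common file-path type the rest of the workflow consumes.

```python
from pathlib import Path
from typing import NewType

# Hypothetical parameter types, one per file source (names invented for illustration).
PoochFilename = NewType("PoochFilename", str)  # file fetched from a remote registry
ScicatID = NewType("ScicatID", str)            # file identified by a SciCat dataset ID
FilePath = NewType("FilePath", Path)           # what the rest of the workflow consumes


def filepath_from_pooch(name: PoochFilename) -> FilePath:
    # A real provider would download and cache the file; here the cache
    # location is a placeholder.
    return FilePath(Path("/cache/pooch") / name)


def filepath_from_scicat(dataset_id: ScicatID) -> FilePath:
    # A real provider would query SciCat and download the dataset.
    return FilePath(Path("/cache/scicat") / dataset_id)
```

Depending on which parameter type the user sets, the matching provider would be inserted into the pipeline; both produce the same `FilePath` output. The open question in the thread is whether this per-source machinery is worth it compared to a single filename parameter.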
This seems unnecessary, since the "conversion" would be just the identity function?
What is wrong with the solution in ESSsans and the Guidelines (on …)? SciCat files could be handled like tutorial files, e.g., as here: https://scipp.github.io/esssans/user-guide/isis/zoom.html#Configuring-data-to-load. But no ultimate decision has been made. The solution with multiple filename types seems very invasive and non-obvious. It sounds like it is going back to the old approach, which was confusing.
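The approach in the linked guidelines can be illustrated like this (a sketch only; `get_path` and its behavior are stand-ins for a pooch-style fetch helper, not the actual ESSsans API): there is a single filename parameter, and remote or tutorial files are resolved to a local path up front, outside the pipeline, with the download stubbed out here.

```python
from pathlib import Path
from typing import NewType
import tempfile

Filename = NewType("Filename", str)


def get_path(name: str, cache_dir: Path) -> Path:
    """Return a local path for ``name``, fetching it on first use.

    Stand-in for a registry-based fetch; a real implementation would
    download the file and verify its checksum.
    """
    local = cache_dir / name
    if not local.exists():
        local.write_text("placeholder contents")  # stubbed "download"
    return local


# The user resolves files up front and passes plain local paths as parameters:
cache = Path(tempfile.mkdtemp())
params = {Filename: str(get_path("zoom_run.nxs", cache))}
```

The pipeline then only ever sees one filename type, regardless of where the file originally came from.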
@jokasimr Here is the conclusion from the previous discussion on this topic, based on which the decision for this approach was made: scipp/esssans#136 (comment). |
With the approach @nvaytet suggested, we get parallel file downloads for free with dask as part of the pipeline, which is quite nice. Another benefit is that we're working in the declarative framework of sciline instead of stepping outside of it. It also seems quite verbose to create a function for each file you're using in the tutorial workflows, but that's maybe less important. I still think ….
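The parallel-download benefit mentioned above can be demonstrated outside of dask or sciline with a plain thread pool (a sketch of the principle, not the actual implementation): if each file fetch is an independent task in the graph, a parallel scheduler overlaps the I/O.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def fetch(name: str) -> str:
    # Stand-in for a network download; the sleep emulates I/O latency.
    time.sleep(0.1)
    return f"/cache/{name}"


names = ["run_1.nxs", "run_2.nxs", "run_3.nxs"]

start = time.perf_counter()
with ThreadPoolExecutor() as pool:
    # The three 0.1 s "downloads" run concurrently instead of back to back.
    paths = list(pool.map(fetch, names))
elapsed = time.perf_counter() - start
```

With sequential fetching this would take about 0.3 s; with the pool the fetches overlap, which is the same effect a dask-backed pipeline gets when file-resolution providers are independent graph nodes.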
Wasn't the old approach confusing because it tried to hide the difference between local files and remote files? This approach doesn't do that. If it has been decided that we do it the other way (the one currently in the guidelines), then I'm fine with that.
Not clear this is positive? It also means side effects, potential race conditions, ...
... but you need to write multiple providers, and modify the workflow (by inserting different providers) depending on the file sources?
Maybe I do not understand what you suggest. If you have multiple filename types, I suppose you need to make dedicated providers for each, and then let the user (or the workflow wrapper?) insert the correct ones?
Great 👍
Not sure that would be a problem in practice if we design the file download providers sanely. And the benefits of parallel file downloads can be quite large.
Yes, one provider per file source. But not one per file.
I'd suggest inserting all providers from the start. Then the user doesn't have to insert the correct ones; the correct one will be used based on the parameters the user has provided.
@nvaytet pointed out the misunderstanding here. I was thinking a pipeline could have multiple providers producing the same output type, but that's of course not the case, and it never has been. Sorry about the confusion. The user would have to insert the right one manually.
Making a list of things to do before I forget:
- Use an `AmorWorkflow` with default parameters and providers, instead of talking about providers in the notebook.
- Naming (in accordance with SANS and diffraction workflows):
  - `Run` -> `RunType`
  - `Sample` -> `SampleRun`
  - `FilePath` vs dedicated functions to provide file names
  - `RawDetector` -> `LoadedNeXusDetector`?
  - `RawEvents` -> `RawDetectorData`
  - `ChopperCorrectedTofEvents` -> `ReducibleDetectorData`?
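The renaming in the list above can be sketched as plain domain types (a hedged illustration: the real ESS packages define these generically through sciline, and the exact base types may differ):

```python
from typing import NewType, TypeVar

# Run identifiers; RunType is the type variable the workflows are generic over.
SampleRun = NewType("SampleRun", int)
ReferenceRun = NewType("ReferenceRun", int)
RunType = TypeVar("RunType", SampleRun, ReferenceRun)

# Proposed standardized stages of detector data (names from the list above;
# `object` bases are placeholders for the actual data structures).
FilePath = NewType("FilePath", str)
LoadedNeXusDetector = NewType("LoadedNeXusDetector", object)
RawDetectorData = NewType("RawDetectorData", object)
ReducibleDetectorData = NewType("ReducibleDetectorData", object)
```

The point of the proposal is that every reduction package would use the same stage names, so a provider written against `RawDetectorData` reads the same in the SANS, diffraction, and reflectometry workflows.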