Refactor (WIP) #237
Looks great so far. I'm curious to hear more about how Thunder will abstract over parallelization engines. Regarding the new packages, would usage be something like |
Thanks @d-v-b! For engine switching, there's a new user-facing global context for switching between backends, and internally we condition on its state (it mainly only matters during loading). For example, you'll be able to load using Spark with

    import thunder
    thunder.setup(spark=True)
    data = thunder.series.fromBinary('path')

and load locally with

    import thunder
    thunder.setup()
    data = thunder.series.fromBinary('path')

or because

    import thunder
    data = thunder.series.fromBinary('path')

In all cases the returned object |
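To illustrate the "global context" idea described above, here is a minimal standalone sketch of how a loader can condition on a module-level engine setting. It is not taken from Thunder's source: `_context`, `current_engine`, and `from_list` are hypothetical names, and the Spark branch is only a placeholder.

```python
# Hypothetical sketch of a module-level engine context.
# None of these names come from Thunder itself.
_context = {"engine": "local"}  # assumed default: local execution


def setup(spark=False):
    """Select the backend used by subsequent loading calls."""
    _context["engine"] = "spark" if spark else "local"


def current_engine():
    """Return the name of the currently selected backend."""
    return _context["engine"]


def from_list(values):
    """Stand-in loader that branches on the global context."""
    if current_engine() == "spark":
        # A real implementation would distribute the values here;
        # this placeholder only records which branch was taken.
        return {"engine": "spark", "values": list(values)}
    return {"engine": "local", "values": list(values)}
```

With this shape, user code stays the same across backends: calling `setup(spark=True)` once changes what every later loading call does, and skipping `setup` falls back to the assumed local default.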
Oh and I haven't decided yet about the external libraries, could alias them as you suggest in which case
open to suggestions here! |
Regarding the external libraries, I'm not sure what would be best. If you envision people using the registration and source extraction methods outside of Thunder, then separate libs makes sense. Is this your intention? In the other case, I don't see a big difference between |
@d-v-b we'll get the "using it outside Thunder" thing regardless, that's the nice thing about making them separate, assuming we structure them correctly 👍 the aliasing is just sugar in case people forget the names. I'll look at how some other projects do it... |
@d-v-b awesome suggestion! Didn't realize it was included with skimage, that's perfect. Just made the change for reading, seems to work great. For writing, how do you think we should handle dimensions and bit-depth? After playing with the options, here's a rough proposal:
|
And while I'm renaming... the current plan is as follows: for two-word names where the first word is four characters or less, make it one word, e.g.
for names where the first word is longer than four characters, or there are more than two words, use snake case, e.g.
Exceptions would only be to ensure consistency with closely associated Python libraries (e.g. Let me know if anyone disagrees! |
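As a rough sketch of the renaming rule described above (ignoring the library-consistency exceptions), the helper below takes the words of a name and applies the four-character threshold. `pythonic_name` is a hypothetical helper written for illustration, and the sample names in the usage note are made up rather than taken from the actual rename list.

```python
def pythonic_name(words):
    """Apply the proposed naming rule to a list of words.

    Two words with a first word of four characters or fewer are
    joined into one word; everything else becomes snake case.
    """
    words = [w.lower() for w in words]
    if len(words) == 2 and len(words[0]) <= 4:
        return "".join(words)
    return "_".join(words)
```

For example, `pythonic_name(["from", "Binary"])` gives `frombinary`, while `pythonic_name(["median", "filter"])` gives `median_filter` and `pythonic_name(["to", "time", "series"])` gives `to_time_series`.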
@freeman-lab Regarding bit-depth and data type, we should be sure to allow everything that tifs can hold (or everything fiji can read...) -- I think tifs can contain 32-bit floats, and for fractional data (e.g. dff timeseries) this is pretty critical. Also I'm fairly sure signed integer types are allowed, so the may need to be nuanced. Numpy allows arrays with dtype Perhaps for recasting data, there could be some kwargs to specify how this should be done, e.g. |
Thanks @d-v-b , I was counting on your pedantry 👍
|
@freeman-lab, |
@d-v-b 👍 careful documentation and explanation of changes (esp. breaking ones!) will be a high priority as soon as this is done |
thunder and scikit-image integration? |
closing in favor of #243 |
This is a huge refactoring of Thunder, and will be the basis of an upcoming new release. We'd normally break it up into multiple PRs, but this touches so much of the code base that it was easier to do all at once.
There are three primary goals, based on a year of community experience and feedback, and consideration of the current ecosystem:
py.test for unit tests, and Pythonic naming conventions.
refactoring
new packages (inside thunder-project)
new packages (external)