Request for comment: road to scikit-image 1.0 #3263

Open
jni opened this Issue Jul 12, 2018 · 8 comments

6 participants
Contributor

jni commented Jul 12, 2018

Hi all,

@stefanv, @emmanuelle, and I have been working on a blog post about the future of scikit-image. We want a much broader input. The only reason we haven't gotten more of you involved yet is just lack of organisation on our part. But let's call this issue the moment where we get organised.

The post is live on my blog here:

https://ilovesymposia.com/2018/07/13/the-road-to-scikit-image-1-0/

Please provide comments, suggestions, corrections, insults ;), etc, either here or on the blog. Ok maybe reserve insults for the blog...

I'm looking forward to working with all of you to make an (even more) amazing library!

Member

hmaarrfk commented Jul 30, 2018

Hey @jni, great writeup. I'm really impressed with the way you summarized and explained the values of the project.

Here are my thoughts.

User data + magic

I think maybe the most important step for the 1.0 version will have to be deciding how "care for the users’ data" and "we don’t do magic" go hand in hand with the library being easy to use.

A rule that I am converging to in my own work is: img_as_* should never be called in algorithmic functions.

Funnily enough, img_as_* is imported in the top-level __init__.py, which I think warrants some rethinking. img_as_* is particularly useful for certain types of data, mainly those obtained from 8-bit cameras, and definitely useful for saving images in common "non-scientific" formats.

Helping users

I've been personally experimenting with the stacklevel parameter of warnings.warn. This helps point users to the line in their code that is causing the warning to appear, instead of the line that outputs the warning. I've managed to create pretty fancy decorators for deprecations. I'll share this work with the scikit-image team soon to try and get some feedback.
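
As a rough illustration of the stacklevel idea (a minimal sketch only: the decorator `deprecated` and the functions `old_threshold` / `new_threshold` are hypothetical names, not scikit-image API), passing `stacklevel=2` to `warnings.warn` attributes the warning to the caller of the wrapper rather than to the line that emits it:

```python
import functools
import warnings


def deprecated(replacement):
    """Hypothetical decorator that marks a function as deprecated."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            warnings.warn(
                f"{func.__name__} is deprecated; use {replacement} instead.",
                DeprecationWarning,
                # stacklevel=2 reports the line that called `wrapper`,
                # i.e. the user's code, not this warn() call.
                stacklevel=2,
            )
            return func(*args, **kwargs)
        return wrapper
    return decorator


@deprecated("new_threshold")
def old_threshold(value):
    """Toy function standing in for a deprecated API."""
    return value > 0.5


# Record warnings so the example does not depend on the active filters.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = old_threshold(0.7)  # the warning is attributed to this line
```

With the default `stacklevel=1`, the reported location would be inside `wrapper`, which is rarely useful to the person calling the deprecated function.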

Metadata + units

I've been experimenting quite a bit with xarray and dask since the sprint at BIDS (thank you all for pointing me to those two projects).

xarray actually does a good job at keeping things like units organized.
It is also transparent with many numpy operations, but I occasionally hit a few hiccups.

I expect that sticking to numpy might be a good idea seeing as many people are working toward making sure the __array_function__ protocol becomes a reality.

When xarray doesn't work, I often have to access the "data" attribute, or call np.asarray, which really freaks me out when I'm passing parameters to my low-level C libraries that need to write back to the array.

Contributor

grlee77 commented Aug 1, 2018

I think what you have posted is a nice summary and am in general agreement.

I agree that it could help to lower the barrier to inviting additional new core members to help broaden the base of regular contributors. Finding funding support to either hire new developers or allow existing contributors to spend more time on scikit-image would also help, but is obviously easier said than done. Have there been discussions of becoming a NumFocus sponsored project or otherwise applying for external funding?

If there are plans for an updated manuscript, I would be interested in contributing to that. Publications provide at least some justification of time spent for those of us at academic institutions, and I unfortunately joined a bit too late to be involved in the initial publication related to the library.

I will just add here a few items that attracted me as a user and eventual developer for scikit-image as opposed to other libraries such as OpenCV or ITK:

  1. Support for n-dimensional floating point data (present in ITK, but lacking in OpenCV)
  2. A lower barrier to quickly read and understand the underlying implementations.
  3. Being a Python library, its interface and documentation strings are more natural to a Python programmer than wrappers generated on top of C/C++ code.

Aside from the specific features above, the open and helpful nature of interactions with the core team on GitHub made a positive impression and added to my desire to contribute to the project.

@hmaarrfk hmaarrfk referenced this issue Aug 5, 2018

Open

Discussion: solution for large data files #3323

0 of 7 tasks complete

@sciunto sciunto referenced this issue Aug 22, 2018

Closed

scikit-image sprint at BIDS May 28 - Jun 1 #3086

4 of 4 tasks complete
Member

hmaarrfk commented Sep 9, 2018

I would like to see biweekly to monthly releases. This would help ensure that developers see a quick return on their investment in the scikit-image infrastructure and encourage them to contribute to scikit-image as opposed to other projects or their personal forks.

Xref mailing list: https://mail.python.org/pipermail/scikit-image/2018-September/005632.html

Contributor Author

jni commented Oct 6, 2018

Hi everyone,

I'm back on this topic after dropping it for a while. =)

First: I promised a way to submit anonymous comments, and now I finally have it:
https://PollEv.com/juannunezigl611

The site above assures me that comments are completely anonymous by design, with zero knowledge even from the developers/admins. Please share this link! Together with the link to the blog post.

About some of the existing comments:

@sciunto

the build/test/benchmark/docs infrastructure is getting more and more complex, which could be intimidating to new contributors

I am on board with this. I think all of it is extremely useful though, but PRs explicitly to simplify this process would be very welcome. (Thank you @hmaarrfk, for example, for your most recent PR removing the special-casing for PYAMG!)

PRs should be more collaborative

I think this speaks to the mentorship part of the values. For me, @stefanv's mentorship when I was first contributing to this project was life-changing, and I definitely want to pay this forward. Could you elaborate on your ideal collaboration model for PRs? I think the main challenge here is that we are so widely distributed around the globe, so real-time collaboration suffers.

One idea is that we could have some sort of direct link "organize a micro-sprint", that sends an email to the mailing list with a topic, a doodle poll for availability, and a request for help implementing a feature. This might make it easier to get from "idea for complex feature" to "merge" in a short time, rather than months or years, and also to get help with stuff you are not super familiar with.

As a side point, I also didn't touch on it but I think as core developers we should push the final "polishing" commits ourselves, rather than ask contributors to fix things.

I wish also to see more “immediate” features like “skimage, give me the information related to the histogram of the image” and not “skimage, take this image, use 256 bins, give me two arrays. Ok, now, matplotlib, please, draw an histogram of this…”.

This to me is the most controversial of the comments. =P It is in a bit of tension with "we don't do magic", as well as with the idea that we are, at our core, a scientific processing and analysis library. For example, @hmaarrfk has been making efforts to remove our core dependency on matplotlib, and I very much support these efforts: matplotlib is certainly a part of most interactive Python environments, but when it comes to producing a lean piece of software for an image analysis pipeline, it's just way too much.

HOWEVER, I think this sort of thing could find a very good place with the viewer sub-package, as something that ships separately to scikit-image.

I'd love to hear from other people on this topic: where do we want to fit in on the magic/convenience spectrum? Are they actually in tension or is this a false dichotomy?

@hmaarrfk

img_as_ and magic

Yes, this is a bugbear of mine also. I'd aim to deprecate importing img_as_ at the top level, favouring util. This should be pretty straightforward I think. We could also take the opportunity to rename the functions rescale_to_[type], which would be more indicative of the operation about to take place.
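
One possible way to deprecate a top-level name while keeping it reachable is a module-level `__getattr__` (PEP 562, Python 3.7+). This is only a sketch under stated assumptions: `fake_pkg` and `fake_pkg.util` are stand-in modules built on the fly, and the toy `img_as_float` is not the real scikit-image function.

```python
import types
import warnings

# Hypothetical stand-ins for a package and its `util` submodule.
util = types.ModuleType("fake_pkg.util")
util.img_as_float = lambda image: [float(v) / 255 for v in image]  # toy converter

pkg = types.ModuleType("fake_pkg")
pkg.util = util


def _module_getattr(name):
    """Module-level __getattr__: called only when normal lookup fails."""
    if name.startswith("img_as_") and hasattr(util, name):
        warnings.warn(
            f"fake_pkg.{name} is deprecated; use fake_pkg.util.{name} instead.",
            DeprecationWarning,
            stacklevel=2,  # point at the code that accessed the attribute
        )
        return getattr(util, name)
    raise AttributeError(f"module 'fake_pkg' has no attribute {name!r}")


# In a real package this function would simply be named __getattr__
# inside the package's __init__.py.
pkg.__getattr__ = _module_getattr

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    convert = pkg.img_as_float  # still works, but emits a DeprecationWarning
```

Access through `pkg.util.img_as_float` stays silent, so users get a migration path before the top-level name is removed.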

stacklevel

ditto. It requires careful thinking but I consider this very worthwhile.

metadata and sticking with NumPy

Yes, I will definitely never suggest using anything other than NumPy arrays to represent our images. (This was also noted by Royi Avital on the blog post.) Metadata could be useful on the viewer side of things. Perhaps we could simply provide a metadata submodule that contains functions for cases where metadata is modified, e.g. metadata.downscale_local_mean(meta: dict, factor, ...) -> dict, which would change the appropriate parameters within the metadata. See imageio/imageio#382 for efforts to standardise metadata fields. I suggest we would worry about those.
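
To make the metadata idea a little more concrete, here is a minimal sketch of what such a function might look like. Everything in it is an assumption for illustration: the `"spacing"` key (one physical pixel size per axis), the `"unit"` key, and the `factors` argument are hypothetical and not an agreed metadata standard.

```python
def downscale_local_mean(meta, factors):
    """Hypothetical helper: return metadata updated for a downscaled image.

    Assumes `meta["spacing"]` holds one physical pixel size per axis and
    `factors` holds one integer downscaling factor per axis.
    """
    spacing = meta["spacing"]
    if len(spacing) != len(factors):
        raise ValueError("need exactly one factor per axis")
    new_meta = dict(meta)  # shallow copy so the caller's dict is left untouched
    # Each output pixel covers `factor` input pixels along its axis.
    new_meta["spacing"] = tuple(s * f for s, f in zip(spacing, factors))
    return new_meta


meta = {"spacing": (0.5, 0.5), "unit": "um"}
downscaled = downscale_local_mean(meta, (2, 2))
```

The image itself stays a plain NumPy array; only the accompanying dict is transformed, in parallel with the pixel operation.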

monthly releases

I think this is something we should consider in the far future. As Stéfan pointed out in the mailing list, one advantage of slowish release cycles is that we have time to correct things when funky APIs or actual bugs make their way into master. When we have a larger number of people on master, the number of people testing cutting-edge stuff will increase, and this might be more viable.

In terms of encouraging new contributors, this is a big deal for me, but I want to put most effort into reducing PR -> merge time, rather than merge -> release time.

@grlee77

updated manuscript

Yes, my intention (which I am announcing here I guess =P) is to write papers periodically, and invite all contributors (core and not) to participate in the paper. I'd certainly like to write one to coincide with 1.0. I can attest from personal experience that my work with scikit-image was viewed quite differently at my university pre- and post-paper, and it is unfair that contributors who came after the paper don't get the same benefits. Of course I have many thoughts on the brokenness of the incentive system, but in the meantime, I think periodic papers offer the best way to credit new contributors.

NumFOCUS

Stéfan has briefly looked into NumFOCUS, but there has been no concerted effort to become sponsored by them. From what I can tell though, that funding is still small-potatoes, and not something that I think would be game-changing for skimage.

Part of the purpose of this roadmap, though, is to make it easier to get funding. So let's get it finished! I'm hoping to submit a PR with the "official" version (incorporating all this feedback) next month.

a few items that attracted me as a user and eventual developer for scikit-image

👍! This list made me very happy and was very much in line with the values I want us to portray. =) (Hopefully you agree that I articulated them in the post?)


Ok, that's my take on the existing comments. If anyone has further input, please speak now or forever hold your peace! At the start of November, I'll collect all the comments, try to incorporate them as best I can, and then submit a PR proposing the adoption of the roadmap. At that point, I hope that we only have to deal with minor issues with the phrasing, not the overall vision. =)

I'd also like to make the point that some of the comments may lean more in one or the other direction. Sometimes a consensus can be built, and I hope that will be the case here, but other times this is difficult. Even if you only write in to agree with particular items of this vision, this is useful for us to get an idea of community support for various aspects of it.

Contributor

GenevieveBuckley commented Oct 6, 2018

I would like to see many of the rough edges sanded down in chaining together scikit-image functions. There are quite a few places where individual functions are great, but you have to do a bunch of fiddling to turn the results into a format that matches the input of the next skimage function.

The way I interpreted François' comment was that it was more about 'better abstraction of ideas' rather than 'doing magic' or 'matplotlib', and I'd agree with that. It's similar to what I'm saying here too.


This comment was marked as off-topic.

Member

hmaarrfk commented Oct 15, 2018

@jni, a while back @NelleV mentioned she wanted to add a few posts about the BIDS Sprint. I think this one kinda fits that description. What do you two think?

Contributor Author

jni commented Oct 15, 2018

@hmaarrfk huh? Nelle proofread this post before it was posted, and the BIDS sprint is acknowledged twice in it. =) If you mean she might want to put it in a central location, then yes, she might. =P

I have so many more posts (in my head) from that though. Life is way too short. LOL


ctrueden commented Nov 22, 2018

I posted some thoughts on the Image.sc Forum thread here.

@sciunto sciunto added this to the 1.0 milestone Feb 1, 2019
