Update FlowSOM dependency #73

Open
mbuttner opened this issue May 20, 2024 · 10 comments

Comments

@mbuttner
Collaborator

Hi everyone,

following a brief discussion with @burtonrj in #71: the original authors have published a Python implementation of FlowSOM (https://github.com/saeyslab/FlowSOM_Python) that carries over the functionality of the FlowSOM R package and covers FlowSOM clustering comprehensively. It depends on scverse packages such as pytometry and MuData. The pytometry package currently uses @burtonrj's implementation of FlowSOM, which depends on the packages minisom and consensusclustering. This raises two points: first, we have two parallel implementations whose efforts could be better integrated; second, we would like to reduce the number of dependencies in pytometry (see #64) as part of the governance strategy.

Possible actions

@burtonrj suggested that we

  1. remove the FlowSOM functionality from pytometry and therefore
  2. remove the dependencies consensusclustering and minisom.
  3. Instead, point to the FlowSOM implementation of @saeyslab and add documentation accordingly.

Looking ahead, we should start a discussion about integrating the FlowSOM package into scverse.

I am happy with this suggestion in general and would like to suggest some modifications to provide continuity for all users who are already using the current FlowSOM implementation in pytometry:

  1. Make consensusclustering and minisom optional dependencies in the next version.
  2. Move the examples for FlowSOM to consensusclustering and replace the current example with a pointer to the FlowSOM Python package.
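
One way to make the first point concrete is a small availability check that runs only when the FlowSOM code path is used. This is a minimal sketch, not existing pytometry code: the helper names and the `pytometry[flowsom]` extra are hypothetical, and it assumes the two packages import under the names `minisom` and `consensusclustering`.

```python
from importlib.util import find_spec

# Assumption: these are the import names of the two optional packages.
OPTIONAL_FLOWSOM_DEPS = ("minisom", "consensusclustering")


def missing_optional_deps(names=OPTIONAL_FLOWSOM_DEPS):
    """Return the top-level modules from `names` that cannot be found."""
    return [name for name in names if find_spec(name) is None]


def check_flowsom_deps(names=OPTIONAL_FLOWSOM_DEPS):
    """Raise an ImportError with an install hint if any module is missing."""
    missing = missing_optional_deps(names)
    if missing:
        # Hypothetical extra name; the real one would be defined by the package.
        raise ImportError(
            "FlowSOM clustering needs optional dependencies that are not "
            f"installed: {', '.join(missing)}. "
            "Try: pip install 'pytometry[flowsom]'"
        )
```

Calling `check_flowsom_deps()` at the top of the clustering function keeps a plain `import pytometry` working for users who never touch FlowSOM.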

I'd like to hear @grst's and @quentinblampey's thoughts on this.

@grst
Collaborator

grst commented May 21, 2024

ping @berombau

I like the idea of using the official FlowSOM, but I'd be curious how it compares to @burtonrj's implementation in terms of speed.

We could keep a wrapper in pytometry for visibility and backwards-compatibility that requires FlowSOM as an optional dependency. Referring to it (including some of its nice visualizations) in the tutorial/documentation sounds good to me.
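
A rough sketch of what such a wrapper could look like is below. The function name `flowsom_clustering` and its `backend` parameter are hypothetical, and the sketch assumes that the backend module exposes a `FlowSOM` callable that takes the data as its first argument; the actual signature should be checked against the FlowSOM_Python documentation.

```python
import importlib


def flowsom_clustering(adata, *, backend="flowsom", **kwargs):
    """Hypothetical thin wrapper that delegates to an optional backend.

    The backend is imported inside the function, so importing the package
    that hosts this wrapper never fails when the backend is absent.
    """
    try:
        module = importlib.import_module(backend)
    except ImportError as err:
        raise ImportError(
            f"This wrapper needs the optional '{backend}' package. "
            f"Try: pip install {backend}"
        ) from err
    # Assumption: the backend provides FlowSOM(data, **options).
    return module.FlowSOM(adata, **kwargs)
```

All keyword arguments are forwarded unchanged, so the wrapper does not need to track the backend's options.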

@quentinblampey
Contributor

Since FlowSOM_Python is implemented by the original authors and integrated within the scverse ecosystem, it makes sense to use it. The solution proposed by @grst sounds good to me!

It would also be great to check whether the results of the two packages are consistent, but this requires quite some work.
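
As a starting point for such a check, the cluster labels produced by both packages on the same cells could be compared with the adjusted Rand index, which ignores how the clusters are numbered. This is a minimal pure-Python sketch under that assumption; the function and variable names are illustrative, and it says nothing about how the labels are obtained from either package.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two labelings of the same cells."""
    labels_a, labels_b = list(labels_a), list(labels_b)
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same cells")
    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two cells: trivially identical
    # Pairs of cells that share a cluster in both labelings.
    both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    # Pairs of cells that share a cluster within each labeling.
    pairs_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    pairs_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = pairs_a * pairs_b / total_pairs
    maximum = (pairs_a + pairs_b) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. a single cluster in both
    return (both - expected) / (maximum - expected)
```

A value of 1.0 means the two partitions are identical up to relabeling; values near 0 indicate agreement no better than chance.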

@berombau

Hi everyone, thank you for the discussion. We're ok with these proposed actions. The flowsom package itself currently depends on pytometry for one function, normalize_estimate_logicle, but I'll try to make this and other dependencies optional so you can easily integrate flowsom with minimal dependencies. The MuData dependency is minimal and will probably stay.

The current implementation depends on Numba for speed. There is ongoing work on a batched SOM training update that would further increase parallelization, which we hope to conclude by the summer.

Alternative versions can reuse the scverse integration and visualizations of our package by implementing flowsom.models.BaseFlowSOMEstimator. It's even possible to mix-and-match the models for overclustering and metaclustering, but this is mostly for benchmarking. We can add additional models if that would provide better continuity for users. We can try to make it as consistent as possible, but this is indeed not that trivial. Sometimes there are slight differences and a previous analysis will not be fully reproducible. In that case, it's easier to work with containers or an older package version.

@berombau

berombau commented Jul 5, 2024

I added some of these changes in https://github.com/saeyslab/FlowSOM_Python/tree/interop-pytometry in preparation for a 0.0.2 version. The pytometry package is now an additional install, as explained in the notebook. We do require pytometry version 0.1.5, which is not yet released on PyPI (see #69).

@mbuttner
Collaborator Author

mbuttner commented Jul 5, 2024

Hi @berombau,
thank you for the update. My PyPI account is still not operational, and PyPI has not responded to my account recovery request in the past two months. I'll keep working on it!

@mbuttner
Collaborator Author

mbuttner commented Aug 1, 2024

Hi @berombau,
I recovered access to my PyPI account and uploaded pytometry version 0.1.5: https://pypi.org/project/pytometry/0.1.5/

@berombau

So in FlowSOM_Python we need the 0.1.5 version for the pytometry function normalize_autologicle. A PyPI installation still does not work because of the pandas issue, which is now fixed in burtonrj/consensusclustering#1 and pytometry version 0.1.6.dev5. So with a PyPI 0.1.6 release, I think the installation issue will be resolved.

@Zethson
Member

Zethson commented Sep 23, 2024

@mbuttner would it kindly be possible to make that 0.1.6 release?

@mbuttner
Collaborator Author

@Zethson thanks for the nudge. I'll see to it tomorrow.

@mbuttner
Collaborator Author

I added a 0.1.6 release to PyPI and created a new release tag here on GitHub.


5 participants