Better compatibility for databricks #250
Comments
Thanks - I've created the branch There are a couple of other functions that save data as ZIP files, including:
Would we need to extend functionality for all of these, as well? |
Hi James, thanks, that was quick! I will try it now. |
Hi James, I tried your fix. It took some time to run, but it still did not complete. The error is below. It looks similar to the previous ones, but not exactly the same. Thanks!
|
Hmm... just to confirm, did you set the environment variable SF_ALLOW_ZIP=0? I'm not sure how this error could be encountered if that variable is set. |
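For anyone following along, here is a minimal sketch of setting that variable from a notebook cell. The variable name and value (SF_ALLOW_ZIP=0) come from this thread; the assumption that it should be set before slideflow is imported is mine, so treat the ordering as a precaution rather than a documented requirement.

```python
import os

# Disable ZIP output, as discussed in this thread (SF_ALLOW_ZIP=0).
# Assumption: set the variable before importing slideflow so the
# setting is visible when the library first reads it.
os.environ["SF_ALLOW_ZIP"] = "0"

# import slideflow as sf  # import only after the variable is set

# Quick sanity check that the variable is visible to this process.
print("SF_ALLOW_ZIP =", os.environ.get("SF_ALLOW_ZIP"))
```

Printing the value at the top of a long job is a cheap way to confirm the setting was not missed before waiting on a multi-hour run.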
It's likely. I just started rerunning and will let you know tomorrow morning once it's finished. Thanks! |
Hi James, it worked. I probably missed the environment variable. I'm going to train MIL models next. Attention-based MIL won't work yet because of the zip file issue, correct? |
I've just added a possible solution for attention-based MIL - give it a try and let me know if it works! |
Hi James, it worked like a charm! I did notice, though, that a few packages needed for training were not included in the installation, for example fastai. |
Glad to hear it! Re: dependencies - as you are aware, Slideflow is seeking to support a diverse set of deep learning tasks (segmentation, image generation, self-supervised learning, classification) and training paradigms. Some of these tasks have specific version requirements (e.g. StyleGAN requires PyTorch < 1.12) or dependencies. Rather than requiring all users to install all dependencies, the approach we have taken is to limit the auto-installed dependencies to only what all users will use, and then users can install additional dependencies based on their needs. For example, this will install only the base requirements of slideflow:
This will install dependencies for cell segmentation:
This will install all of the PyTorch-associated dependencies, including FastAI:
and so on. The installation instructions at https://slideflow.dev/installation/ do note that PyTorch users should install with We're definitely open to hearing suggestions for alternative approaches. We could also expand the discussion of this in the installation instructions. |
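Since optional dependencies are installed on a need basis, one way to catch a missing package before starting a long training run is to check for it up front. The following is only a sketch using the Python standard library; the helper name `missing_packages` and the package list are illustrative assumptions, not part of Slideflow.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the top-level package names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


# Hypothetical list for a PyTorch-based MIL workflow; adjust to your own needs.
wanted = ["torch", "fastai"]

missing = missing_packages(wanted)
if missing:
    print("Missing optional packages:", ", ".join(missing))
else:
    print("All optional packages found.")
```

This only checks that each package can be located; it does not verify versions (for example, a PyTorch version constraint).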
Got it, makes sense to make it need-based. I think it was also because I installed it from source so it might be a different experience if I use other methods like pip. I will definitely let you know if I have more thoughts about this. Thanks! |
Hi @jamesdolezal, I encountered the zip file issue again when running slide_map.save_umap('path'), even after defining the environment variable. I thought you might have missed this one. |
Feature
Solve compatibility issues with the Databricks platform, for example its zip file restrictions.
Pitch
Databricks has some unique restrictions that have caused compatibility issues with slideflow. A more Databricks-compatible version of slideflow would benefit this user group.
Alternatives
Additional context
Start with solving zip file generation problems on Databricks.