Compare to other ML e2e platforms #58

Closed · elgalu opened this issue Jun 18, 2018 · 7 comments
Labels
area/docs Documentation issues

Comments

@elgalu commented Jun 18, 2018

First of all, congratulations on releasing all this hard work to the public!

I went through the examples to see if I could figure out how exactly this project differentiates itself from others, but I only saw some minor technical differences.

Could you provide a summary of why you decided to create a completely new ML pipeline instead of joining some of the other ongoing efforts?

@mateiz (Contributor) commented Jun 18, 2018

Our blog post at https://databricks.com/blog/2018/06/05/introducing-mlflow-an-open-source-machine-learning-platform.html has the overall motivation, though it might be good to write more direct comparisons on the website at some point. (I personally don't like having those because they invariably get out of date, and then people worry that the website compared against an older version of their platform.)

In a nutshell, I think there are two main goals in MLflow that are different from several of the platforms you list:

  1. MLflow is meant to be an "open" platform in the sense that it's easy to bring in any ML library, existing code, existing deployment tools, etc., whereas a lot of the projects you mentioned are focused on a specific set of libraries (for example, TensorFlow and PyTorch) or a specific deployment environment (for example, Kubernetes). We want to allow people to string together workflows out of any component that some other team has used to implement an ML task. If some team across the world wrote a great classifier using a 25-year-old R library, that's awesome: you should be able to call it as easily as you can call the latest Spark or PyTorch release.
  2. The specific functions supported are different; in particular, MLflow focuses more on the ML lifecycle and less on the task of deploying jobs to a specific execution platform (such as Kubernetes). For instance, we've spent more time on an experiment tracking UI/API, on a project packaging format that's easy to share through Git, and on designing a multi-flavor model format that allows deployment to quite a few different tools (local serving, SageMaker, Azure ML, and Spark ML in the current release). On the other hand, many of these platforms focus specifically on deploying jobs to Kubernetes, which we don't currently try to do. (See the sketch after this list.)
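
A minimal sketch of the tracking and multi-flavor model workflow described in these two points, assuming MLflow and scikit-learn are installed; the parameter, metric, and artifact names are illustrative, while mlflow.start_run, log_param, log_metric, and mlflow.sklearn.log_model are the real MLflow APIs:

```python
import mlflow
import mlflow.sklearn
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

with mlflow.start_run():
    # Experiment tracking: parameters and metrics, independent of the ML library.
    mlflow.log_param("max_iter", 200)
    model = LogisticRegression(max_iter=200).fit(X, y)
    mlflow.log_metric("train_accuracy", model.score(X, y))

    # Multi-flavor model format: saved once, loadable later as plain sklearn
    # or as a generic python_function by the deployment tools mentioned above.
    mlflow.sklearn.log_model(model, "model")
```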

Basically, we didn't find anything that supported the large-scale, multi-library experimentation and deployment workflow that we saw people wanting to do, so we decided to focus on that.

In general though, MLflow should also be pretty complementary to many of the tools you listed. For example, you can deploy your jobs to Kubernetes using one of these but use MLflow Tracking to track experiments or MLflow Models as a format for deploying the model. MLflow's goal is mainly to let you manage the ML lifecycle regardless of which tools you use to train or run the model.
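
As a concrete sketch of that split: a training job running on Kubernetes, or anywhere else, only needs to be pointed at a remote MLflow Tracking server. The server URL and experiment name below are hypothetical, for illustration only; set_tracking_uri, set_experiment, and the logging calls are real MLflow APIs:

```python
import mlflow

# Hypothetical tracking server address; point this at wherever your
# MLflow server actually runs.
mlflow.set_tracking_uri("http://mlflow.example.com:5000")
mlflow.set_experiment("k8s-training-jobs")

with mlflow.start_run():
    mlflow.log_param("batch_size", 64)
    mlflow.log_metric("loss", 0.42)
```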

@rquintino commented Jun 21, 2018

Outsider view, but I agree with @mateiz. Polyaxon has some similarities, but the others above focus on completely different problems; they aren't really ML experimentation platforms for the ML development lifecycle.

I've been following pretty much every experimentation framework for quite some time, and actually using and evolving our own. It pretty much changes the way you do ML, for the better :) IMO.

I still find the MLflow work one of the best so far in open source. It was very refreshing to see this published and a lot of needs/ideas validated (I'll be grabbing some new ideas and contributing if possible).

Some of the ideas we were already using on our platform: local storage; experiments as a group of runs, without blocking comparison of pretty much anything to anything; everything gets stored; minimum overhead. On top of that: a job queue, scale-out, Docker, notifications, and saving a huge amount of metadata at huge scale (thousands up to millions of runs). We're currently targeting a better UI (here MLflow is better :) ).

Some references for similar work or workflows:

(Must read; this is the feeling & inspiration.)
http://blog.niland.io/how-we-conduct-research-at-niland/

This one is very recent:
https://machinelearningmastery.com/controlled-experiments-in-machine-learning/

Others:
https://www.wandb.com/
https://azuremarketplace.microsoft.com/en-us/marketplace/apps/Microsoft.MachineLearningExperimentation?tab=Overview
https://github.com/williamFalcon/test-tube
http://artemis-ml.readthedocs.io/en/latest/experiments.html
https://www.comet.ml/
https://kaixhin.github.io/FGLab/
https://github.com/IDSIA/sacred
https://mllg.github.io/batchtools/
https://neptune.ml/
https://docs.skymind.ai/docs/welcome
https://github.com/mitdbg/modeldb
https://pythonhosted.org/Sumatra/index.html
https://github.com/christiansch/pythia
https://modelchimp.com/
https://github.com/ucbrise/flor
http://vfx.ai/2017/11/machine-learning-labs/
RQ

@mateiz (Contributor) commented Jun 22, 2018

Glad you like it, Rui! We're still early on, so we'd love input on what to improve or what would make it easier to run. We've also tried to design MLflow in a fairly modular way, so that you can pick up some pieces but not others in your own platform.

@kirk86 commented Aug 20, 2019

@elgalu That's an ongoing issue that I see on GitHub with many other libraries as well. For instance, I see that MLflow is highly influenced by sacred, which in turn was influenced by sumatra, but it's a shame that people don't contribute to existing libraries. Even at this point I still find it hard to see the differences between MLflow and sacred, for instance (not meant as a criticism). Not only that, but some older libraries like sumatra still have features that I haven't seen in any of the new libraries being offered.

@hernanborre

> [@mateiz's reply of Jun 18, 2018, quoted in full]

Hi @mateiz, what a great reply!

I'm deciding what stack to adopt for my current employer, and I'm having a hard time figuring out if it's possible (or if it even makes sense) to have TFX on the model side but adopt MLflow to manage libraries, artifacts, lifecycle, etc.

What are your thoughts on this?

Thanks in advance,
Best regards,
Hernán

@mateiz (Contributor) commented Sep 6, 2019

Yup, it should be possible to do that. MLflow already supports saving and managing TensorFlow models, as well as automatic logging of metrics that you send to TensorBoard. Which other pieces of TFX are important to you? We might be able to add built-in integrations if needed, or you can just use them alongside the MLflow APIs.
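
For reference, a minimal sketch of that TensorFlow integration, assuming a recent MLflow version in which mlflow.tensorflow.autolog() covers tf.keras training; the toy model and data are purely illustrative:

```python
import numpy as np
import tensorflow as tf

import mlflow
import mlflow.tensorflow

# Autologging captures the parameters and metrics Keras reports during
# fit(), the same values that would be sent to TensorBoard.
mlflow.tensorflow.autolog()

model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])
model.compile(optimizer="adam", loss="mse")

X = np.random.rand(64, 4).astype("float32")
y = np.random.rand(64, 1).astype("float32")

with mlflow.start_run():
    model.fit(X, y, epochs=2, batch_size=16)
```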

@hernanborre

Thanks a lot for your quick reply @mateiz!

So basically the idea is to have a relatively common pipeline implemented, one we can use for a variety of applications (from tabular data to images or NLP). Since I'm building up the machine learning area in this company, we are still discussing with the stakeholders which use cases will be tackled first. However, I've been working in ML/DS for a while now, and I know the importance of defining a pipeline to be able to reuse and share models, data prep, and data validation.

My only fear about integrating MLflow and TFX is that, at some point, things might go out of control in one of the two tools.

jdlesage added a commit to jdlesage/mlflow that referenced this issue Dec 23, 2019
dbczumar referenced this issue in dbczumar/mlflow Apr 28, 2022