Commit

Some improvements to deep learning landing page (#10736)
Signed-off-by: chenmoneygithub <chen.qian@databricks.com>
chenmoneygithub committed Dec 26, 2023
1 parent 71dd7ed commit 3c98879
Showing 1 changed file with 56 additions and 62 deletions.
118 changes: 56 additions & 62 deletions docs/source/deep-learning/index.rst
@@ -1,45 +1,69 @@
Deep Learning
=============

The realm of deep learning has witnessed an unprecedented surge, revolutionizing numerous sectors with its ability to
process vast amounts of data and capture intricate patterns. From the real-time object detection in autonomous vehicles
to the generation of art through Generative Adversarial Networks, and from natural language processing applications in
The realm of deep learning has witnessed an unprecedented surge, revolutionizing numerous sectors with its ability to
process vast amounts of data and capture intricate patterns. From the real-time object detection in autonomous vehicles
to the generation of art through Generative Adversarial Networks, and from natural language processing applications in
chatbots to predictive analytics in e-commerce, deep learning models are at the forefront of today's AI-driven innovations.

MLflow acknowledges the profound impact and complexity of deep learning. With a keen focus on the unique challenges
posed by deep learning workflows, such as iterative model training and hyperparameter tuning, MLflow introduces a robust
suite of tools specifically designed for these advanced models. MLflow helps to facilitate seamless model development,
ensuring reproducibility, and provides enhanced monitoring capabilities with the concept of 'steps' for recording metrics at
various training iterations, with integrated UI features that enable you to easily visualize the iterative improvements
of key metrics during training epochs.

Key Benefits:
-------------
* **Iterative Model Training**: With the concept of 'steps', MLflow allows users to log metrics at various training iterations, offering a granular view of the model's progress.
* **Reproducibility**: Ensure that every model training run can be replicated with the exact same conditions.
* **Scalability**: Handle projects ranging from small-scale models to enterprise-level deployments with ease.
* **Traceability**: Keep track of every detail, from hyperparameters to the final model output.

Deep Autologging Integrations
In the deep learning realm, libraries such as PyTorch, Keras, and TensorFlow provide handy tools to build and train deep learning
models. MLflow, on the other hand, targets the problem of experiment tracking in deep learning, including logging your
experiment setup (learning rate, batch size, etc.) along with training metrics (loss, accuracy, etc.) and the model
(architecture, weights, etc.). MLflow provides native integrations with deep learning libraries, so you can plug MLflow
into your existing deep learning workflow with minimal changes to your code, and view your experiments in the MLflow UI.

Why MLflow for Deep Learning?
-----------------------------
One of the standout features of MLflow's deep learning support is its deep autologging integrations. These integrations automatically
capture and log intricate details during the training of deep learning models, ensuring that every nuance, from model parameters to
evaluation metrics, is meticulously recorded. This is especially prominent in frameworks like TensorFlow, PyTorch Lightning, base PyTorch,
and Keras, making the iterative training process more insightful and manageable.
MLflow offers a list of features that power your deep learning workflows:

* **Experiments Tracking**: MLflow tracks your deep learning experiments, including parameters, metrics, and models.
Your experiments will be stored in the MLflow server, so you can compare across different experiments and share them.
* **Model Registry**: You can register your trained deep learning models in the MLflow server, so you can easily
retrieve them later for inference.
* **Model Deployment**: After training, you can serve the trained model with MLflow as a REST API endpoint, so you can
easily integrate it with your application.

Experiments Tracking
^^^^^^^^^^^^^^^^^^^^
Tracking is the cornerstone of the MLflow ecosystem, and especially vital for the iterative nature of deep learning:

- **Experiments and Runs**: Organize your deep learning projects into experiments, with each experiment containing multiple runs.
Each run captures essential data like metrics at various training steps, hyperparameters, and the code state.
- **Artifacts**: Store vital outputs such as deep learning models, visualizations, or even TensorBoard logs. This artifact
repository ensures traceability and easy access.
- **Metrics at Steps**: With deep learning's iterative nature, MLflow allows logging metrics at various training steps,
offering a granular view of the model's progress.
- **Dependencies and Environment**: Capture the computational environment, including deep learning frameworks' versions,
ensuring reproducibility.
- **Input Examples and Model Signatures**: Define the expected format of the model's inputs, crucial for complex data like
images or sequences.
- **UI Integration**: The enhanced UI provides a visual overview of deep learning runs, facilitating comparison and insights
into training progress.
- **Search Functionality**: Efficiently navigate through your deep learning experiments using robust search capabilities.
- **APIs**: Interact with the tracking system programmatically, integrating deep learning workflows seamlessly.
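
As a rough illustration of the "Metrics at Steps" item above, the sketch below logs one metric value per training step. The helper names ``log_metric_per_step`` and ``run_with_mlflow``, the metric key ``train_loss``, and the ``learning_rate`` value are hypothetical and not part of this commit; only ``mlflow.start_run``, ``mlflow.log_param``, and ``mlflow.log_metric`` (with its ``step`` argument) are MLflow calls. It is a minimal sketch, assuming MLflow is installed and the default tracking location is acceptable.

```python
from typing import Callable, Iterable


def log_metric_per_step(
    values: Iterable[float],
    log_metric: Callable[..., None],
    key: str = "train_loss",
) -> int:
    """Log one value per training step and return the number of steps logged."""
    steps = 0
    for step, value in enumerate(values):
        # Passing `step` lets the tracking UI plot the metric against training progress.
        log_metric(key, value, step=step)
        steps += 1
    return steps


def run_with_mlflow(values: Iterable[float]) -> int:
    """Log the values to a new MLflow run (assumes MLflow is installed)."""
    import mlflow  # imported lazily so the helper above can be used without MLflow

    with mlflow.start_run():
        mlflow.log_param("learning_rate", 0.01)  # illustrative hyperparameter only
        return log_metric_per_step(values, mlflow.log_metric)
```

Calling ``run_with_mlflow([0.9, 0.6, 0.4])`` would record three ``train_loss`` points at steps 0, 1, and 2 in a single run. Taking the logging function as a parameter is just one design choice; it keeps the loop independent of the tracking backend.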

Native Library Support
----------------------
Deep learning in MLflow is enriched by its native support for a number of the most popular libraries. The native integration with
each of these libraries within MLflow help to streamline and simplify the training process, as well as saving, logging, loading, and
representing models as generic Python functions for inference use anywhere.

Opting for these native integrations brings forth a myriad of advantages:
Model Registry
^^^^^^^^^^^^^^
A centralized repository for your deep learning models:

* **Auto-logging Capabilities**: Automatically capture details without manual intervention.
* **Custom Serialization**: Streamline the model saving and loading process with custom methods tailored for each library.
* **Unified Interface**: Regardless of the underlying library, interact with a consistent MLflow interface.
- **Versioning**: Handle multiple iterations and versions of deep learning models, facilitating comparison or reversion.
- **Annotations**: Attach notes, training datasets, or other relevant metadata to models.
- **Lifecycle Stages**: Clearly define the stage of each model version, ensuring clarity in deployment and further fine-tuning.

Model Deployment
^^^^^^^^^^^^^^^^
Transition deep learning models from training to real-world applications:

- **Consistency**: Ensure models, especially those with GPU dependencies, behave consistently across different deployment environments.
- **Docker and GPU Support**: Deploy in containerized environments, ensuring all dependencies, including GPU support, are encapsulated.
- **Scalability**: From deploying a single model to serving multiple distributed deep learning models, MLflow scales as per
your requirements.

The officially supported integrations for deep learning libraries in MLflow encompass:
Native Library Support
----------------------
MLflow has native integrations with common deep learning libraries, such as PyTorch, Keras, and TensorFlow, so you can plug
MLflow into your workflow easily. The officially supported integrations for deep learning libraries in MLflow encompass:

.. raw:: html

@@ -124,33 +148,3 @@ For detailed guide on how to integrate MLflow with these libraries, refer to the
keras/index
tensorflow/index
pytorch/index

MLflow Tracking for Deep Learning
---------------------------------
Tracking remains a cornerstone of the MLflow ecosystem, especially vital for the iterative nature of deep learning:

- **Experiments and Runs**: Organize your deep learning projects into experiments, with each experiment containing multiple runs. Each run captures essential data like metrics at various training steps, hyperparameters, and the code state.
- **Artifacts**: Store vital outputs such as deep learning models, visualizations, or even tensorboard logs. This artifact repository ensures traceability and easy access.
- **Metrics at Steps**: With deep learning's iterative nature, MLflow allows logging metrics at various training steps, offering a granular view of the model's progress.
- **Dependencies and Environment**: Capture the computational environment, including deep learning frameworks' versions, ensuring reproducibility.
- **Input Examples and Model Signatures**: Define the expected format of the model's inputs, crucial for complex data like images or sequences.
- **UI Integration**: The enhanced UI provides a visual overview of deep learning runs, facilitating comparison and insights into training progress.
- **Search Functionality**: Efficiently navigate through your deep learning experiments using robust search capabilities.
- **APIs**: Interact with the tracking system programmatically, integrating deep learning workflows seamlessly.


Model Registry
--------------
A centralized repository for your deep learning models:

- **Versioning**: Handle multiple iterations and versions of deep learning models, facilitating comparison or reversion.
- **Annotations**: Attach notes, training datasets, or other relevant metadata to models.
- **Lifecycle Stages**: Clearly define the stage of each model version, ensuring clarity in deployment and further fine-tuning.

Deployment for Deep Learning Models
-----------------------------------
Transition deep learning models from training to real-world applications:

- **Consistency**: Ensure models, especially those with GPU dependencies, behave consistently across different deployment environments.
- **Docker and GPU Support**: Deploy in containerized environments, ensuring all dependencies, including GPU support, are encapsulated.
- **Scalability**: From deploying a single model to serving multiple distributed deep learning models, MLflow scales as per your requirements.
