docs: Clean up BentoCloud doc and add get started doc (#4472)
* Clean up BentoCloud doc and add get started doc

Signed-off-by: Sherlock113 <sherlockxu07@gmail.com>

* Minor wording change

Signed-off-by: Sherlock113 <sherlockxu07@gmail.com>

---------

Signed-off-by: Sherlock113 <sherlockxu07@gmail.com>
Sherlock113 committed Feb 2, 2024
1 parent 892ca93 commit e978121
Showing 11 changed files with 60 additions and 285 deletions.

This file was deleted.

6 changes: 0 additions & 6 deletions docs/source/bentocloud/best-practices/cost-optimization.rst
@@ -37,12 +37,6 @@ For seasoned users familiar with advanced use cases of BentoML, consider the fol
This helps you identify the most cost-effective setup and avoid overpaying or underutilizing resources.
* **Model parallelism**. Understand how parallelized your model can be. Efficiently parallelized models utilize resources better, reducing the need for more expensive compute power.
If your model can be highly parallelized, we recommend that you enable :doc:`adaptive batching </guides/batching>` to send inputs to your model.
* **Distributing tasks on Runners**. BentoML :doc:`Runners </concepts/runner>` are computation units that can be executed on remote Python workers
and scaled independently. Distributing tasks on Runners allows for more efficient resource usage, ensuring each Runner is fully utilized without being overloaded. Specifically, you can do the following:

* Relocate compute-heavy tasks (for example, model inference and heavy pre-processing) to dedicated Runners.
* Allocate different models or stages of a pipeline to separate Runners for better resource management.

* **Adaptive scaling**. Take a wide range of factors into account when setting your scaling strategies. Specifically, think about:

* **Traffic**. Configure your scaling strategy based on observed and predicted traffic patterns. Dynamically adjusting resources based on demand
7 changes: 0 additions & 7 deletions docs/source/bentocloud/best-practices/index.rst
@@ -15,14 +15,7 @@ This section contains a list of best practices for BentoCloud usage.

Optimize your overall costs on BentoCloud by applying best practices.

.. grid-item-card:: Bento building and deployment
:link: bento-building-and-deployment
:link-type: doc

Accelerate your Bento application delivery lifecycle.

.. toctree::
:hidden:

cost-optimization
bento-building-and-deployment
54 changes: 54 additions & 0 deletions docs/source/bentocloud/get-started.rst
@@ -0,0 +1,54 @@
===========
Get started
===========

BentoCloud offers serverless infrastructure tailored for AI inference, allowing you to efficiently deploy, manage, and scale any machine learning (ML) model in the cloud. It operates in conjunction with BentoML, an open-source model serving framework, to facilitate the easy creation and deployment of high-performance AI API services with custom code. As the original creators of BentoML and its ecosystem tools like OpenLLM, we seek to improve the cost efficiency of your inference workloads with our
serverless infrastructure optimized for GPUs and fast autoscaling.

Specifically, BentoCloud features:

- Optimized infrastructure for deploying any model, including the latest large language models (LLMs), Stable Diffusion models, and user-customized models built with various ML frameworks.
- Autoscaling with scale-to-zero support so you only pay for what you use.
- Flexible APIs for continuous integration and deployment (CI/CD).
- Built-in observability tools for monitoring model performance and troubleshooting.

Plans
-----

BentoCloud is available with the following two plans.

Starter
^^^^^^^

The Starter plan is designed for small teams of developers who want to focus on building AI applications without infrastructure management. With the autoscaling feature of BentoCloud, you only pay for the resources you use.

Enterprise
^^^^^^^^^^

The Enterprise plan includes all the features offered in the Starter plan. It is tailored for teams that want to use BentoCloud in :doc:`their own cloud or on-premises environment (BYOC) </bentocloud/how-tos/byoc>`, ensuring data security and compliance. If you prefer not to use your own cluster, we can provide a dedicated cloud environment for you. Either way, we take care of managing the infrastructure to ensure a scalable and secure model deployment experience.

Access BentoCloud
-----------------

To gain access to BentoCloud, sign up here:

.. raw:: html

<a href="https://kdyvd8c5ifq.typeform.com/to/eTujPAaE" class="custom-button demo">Schedule a Demo</a>
<a href="https://cloud.bentoml.com" class="custom-button trial">Start Free Trial</a>

Once you have your BentoCloud account, do the following to get started:

1. Install BentoML by running ``pip install bentoml``. See :doc:`/get-started/installation` for details.
2. Create an :doc:`API token with Developer Operations Access </bentocloud/how-tos/manage-access-token>`.
3. Log in to BentoCloud with the ``bentoml cloud login`` command, which is displayed in the BentoCloud console after you create the API token.
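
For automated environments such as CI jobs, the login step above can be scripted. The following is a minimal sketch, not an official BentoML API: the helper names and the ``MY_BENTOCLOUD_TOKEN`` and ``MY_BENTOCLOUD_ENDPOINT`` environment variables are hypothetical, and the ``--api-token`` and ``--endpoint`` flags are assumed to match the command shown in your BentoCloud console. Run ``bentoml cloud login --help`` to confirm the flags for your installed version.

```python
import os
import shutil
import subprocess


def build_login_command(api_token: str, endpoint: str) -> list[str]:
    """Return the argument list for a non-interactive ``bentoml cloud login``.

    The flag names are an assumption; compare them with the command that the
    BentoCloud console displays after you create the API token.
    """
    if not api_token or not endpoint:
        raise ValueError("both an API token and an endpoint URL are required")
    return [
        "bentoml", "cloud", "login",
        "--api-token", api_token,
        "--endpoint", endpoint,
    ]


def login_from_env() -> None:
    """Log in using credentials read from (hypothetical) environment variables."""
    token = os.environ.get("MY_BENTOCLOUD_TOKEN", "")
    endpoint = os.environ.get("MY_BENTOCLOUD_ENDPOINT", "")
    # The CLI is only available after `pip install bentoml` (step 1).
    if shutil.which("bentoml") is None:
        raise RuntimeError("bentoml CLI not found; run `pip install bentoml` first")
    # check=True raises CalledProcessError if the login command fails.
    subprocess.run(build_login_command(token, endpoint), check=True)
```

Calling ``login_from_env()`` from a CI step keeps the token out of the command history. Building the command as a list rather than a single string avoids shell quoting problems if the token contains special characters.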

Now, you can try an `example project and deploy it to BentoCloud <https://github.com/bentoml/quickstart>`_.

Resources
---------

If you are a first-time user of BentoCloud, we recommend you read the following documents to get familiar with BentoCloud:

- Deploy :doc:`example projects </use-cases/index>` to BentoCloud
- :doc:`/bentocloud/how-tos/manage-deployments`
28 changes: 0 additions & 28 deletions docs/source/bentocloud/getting-started/index.rst

This file was deleted.

70 changes: 0 additions & 70 deletions docs/source/bentocloud/getting-started/understand-bentocloud.rst

This file was deleted.

File renamed without changes.
9 changes: 5 additions & 4 deletions docs/source/bentocloud/how-tos/index.rst
@@ -1,16 +1,16 @@
=============
How-to guides
=============
======
Guides
======

How-to guides take the reader through the steps required to solve a problem.

They are recipes, directions to achieve a specific end result, and are wholly **goal-oriented**.

* :doc:`manage-deployments`
* :doc:`manage-access-token`
* :doc:`manage-models-and-bentos`
* :doc:`manage-users`
* :doc:`manage-cost-and-payment-info`
* :doc:`byoc`

.. toctree::
:hidden:
@@ -20,3 +20,4 @@ They are recipes, directions to achieve a specific end result, and are wholly **
manage-models-and-bentos
manage-users
manage-cost-and-payment-info
byoc
