# Deploying a model on Arcee Cloud

In this notebook, you will learn how to deploy a model on Arcee Cloud. This could be a pre-trained model available off-the-shelf, or a model you have tailored to your needs with a combination of merging, continuous pretraining and alignment.

You can run this demo for free thanks to the Arcee free tier. Your endpoint will be shut down automatically after 2 hours.

The Arcee documentation is available at [docs.arcee.ai](https://docs.arcee.ai/deployment/start-deployment).

## Prerequisites

Please [sign up](https://app.arcee.ai/account/signup) for Arcee Cloud and create an [API key](https://docs.arcee.ai/getting-arcee-api-key/getting-arcee-api-key).

Then, please update the cell below with your API key. Remember to keep this key safe, and **DON'T COMMIT IT to one of your repositories**.

In [None]:
%env ARCEE_API_KEY=YOUR_API_KEY
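If you'd rather not type the key into a notebook cell at all, one option is to prompt for it at run time. The sketch below assumes the client reads `ARCEE_API_KEY` from the environment, as the `%env` cell above suggests; `load_api_key` is a hypothetical helper, not part of `arcee-py`.

```python
import os
from getpass import getpass


def load_api_key(env_var="ARCEE_API_KEY", placeholder="YOUR_API_KEY", prompt=getpass):
    """Return the API key from the environment, prompting only if it is unset.

    `load_api_key` is a hypothetical helper written for this notebook.
    The key is treated as unset if it is empty or still equal to the placeholder.
    """
    key = os.environ.get(env_var, "")
    if not key or key == placeholder:
        # getpass hides the typed value, so the key never appears in the cell output.
        key = prompt(f"Enter your {env_var}: ")
        os.environ[env_var] = key
    return key


# Usage (uncomment to run interactively, before importing arcee):
# load_api_key()
```

Prompting keeps the key out of the saved notebook, which makes it harder to commit by accident.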

Create a new Python environment (optional but recommended) and install [arcee-python](https://github.com/arcee-ai/arcee-python).

In [None]:
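# Uncomment the next three lines to create a virtual environment.
# Note: each "!" command runs in its own subshell, so the activation below does not
# carry over to later cells; activate the environment before launching Jupyter instead.
#!pip install -q virtualenv
#!virtualenv -q arcee-cloud
#!source arcee-cloud/bin/activate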

%pip install -q arcee-py

In [None]:
import arcee

## Deploying a model

Let's pick the model we'd like to deploy, and set the name of this deployment.

In [None]:
model_name = "Llama-3-8B-Instruct"
deployment_name = "My Llama-3-8B-Instruct deployment"

We're now ready to deploy the model. We'll use the `start_deployment()` API and simply pass the model and deployment names.

Here, we deploy an off-the-shelf model with the `alignment` parameter. For a pretrained or a merged model, we would instead use the `pretraining` or the `merging` parameter, respectively.

In [None]:
help(arcee.start_deployment)

In [None]:
response = arcee.start_deployment(deployment_name=deployment_name, alignment=model_name)
print(response)
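The choice between the `alignment`, `pretraining` and `merging` parameters can be sketched as a small helper that builds the keyword arguments for `start_deployment()`. This is only an illustration: `deployment_kwargs` and its `source` argument are hypothetical names, and the sketch assumes that exactly one of the three parameters is passed per deployment.

```python
def deployment_kwargs(deployment_name, model_name, source="alignment"):
    """Build keyword arguments for start_deployment().

    `source` names the parameter that carries the model name. The three
    accepted values are the parameter names mentioned in this notebook.
    """
    allowed = ("alignment", "pretraining", "merging")
    if source not in allowed:
        raise ValueError(f"source must be one of {allowed}, got {source!r}")
    return {"deployment_name": deployment_name, source: model_name}


# Usage with the client (uncomment once arcee is imported):
# response = arcee.start_deployment(**deployment_kwargs(deployment_name, model_name))
# For a merged model, pass source="merging" instead.
```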

Let's wait for the endpoint to be provisioned. It should only take a few minutes.

The `deployment_status()` API lets us query the current state of the endpoint.

In [None]:
help(arcee.deployment_status)

In [None]:
from time import sleep

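# Poll the endpoint until it leaves the "pending" state.
while True:
    response = arcee.deployment_status(deployment_name)
    if response["deployment_processing_state"] == "pending":
        print("Deployment is in progress. Waiting 30 seconds before checking again.")
        sleep(30)
    else:
        # Provisioning is no longer pending: print the final status and stop polling.
        print(response)
        break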
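The loop above runs until the state changes, however long that takes. A possible variant adds a timeout so that a stuck deployment doesn't block the notebook forever. In this sketch, `wait_for_deployment` is a hypothetical helper, the 30-minute default is an arbitrary assumption, and the status call is passed in as a function so the helper doesn't depend on the client.

```python
import time


def wait_for_deployment(get_status, poll_seconds=30, timeout_seconds=1800,
                        sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until the deployment is no longer pending.

    get_status must return a dict with a "deployment_processing_state" key,
    like the response printed above. Raises TimeoutError if the state is
    still "pending" once timeout_seconds have elapsed.
    """
    deadline = clock() + timeout_seconds
    while True:
        response = get_status()
        if response["deployment_processing_state"] != "pending":
            return response
        if clock() >= deadline:
            raise TimeoutError(f"Deployment still pending after {timeout_seconds} seconds")
        sleep(poll_seconds)


# Usage with the client (uncomment once the deployment has been started):
# response = wait_for_deployment(lambda: arcee.deployment_status(deployment_name))
# print(response)
```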

## Generating text with our model

Now, let's test the endpoint with a simple prompt.

The `generate()` API requires the deployment name and the prompt.

In [None]:
help(arcee.generate)

In [None]:
response = arcee.generate(deployment_name=deployment_name, query="How did Alan Turing break the Enigma code?")
print(response["text"])

In [None]:
query = (
    "Please write a marketing pitch for a new SaaS AI platform called Arcee Cloud. "
    "Arcee Cloud makes it simple for enterprise users to tailor open-source small language models to their own domain knowledge, "
    "in order to build high-quality, cost-effective and secure AI solutions. Focus on facts, don't make up numbers. "
    "We will send this pitch by email to business and technical decision-makers, so make it sound exciting and convincing. "
    "The contact email is sales@arcee.ai. Feel free to use emojis as appropriate."
)

response = arcee.generate(deployment_name=deployment_name, query=query)
print(response["text"])

## Stopping our deployment

Once we're done working with our model, we should stop the deployment to avoid unwanted charges.

The `stop_deployment()` API only requires the deployment name.

In [None]:
help(arcee.stop_deployment)

In [None]:
arcee.stop_deployment(deployment_name=deployment_name)

In [None]:
arcee.deployment_status(deployment_name)
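Because a forgotten endpoint keeps running, it can help to tie the stop call to the code that uses the deployment. One way to do this, shown here as a sketch, is a context manager that always stops the deployment, even if a cell raises an error. `temporary_deployment` is a hypothetical helper; the start and stop calls are passed in as functions.

```python
from contextlib import contextmanager


@contextmanager
def temporary_deployment(start, stop):
    """Call start() on entry and always call stop() on exit.

    stop() runs even if the body of the `with` block raises an exception.
    If start() itself fails, stop() is not called.
    """
    start()
    try:
        yield
    finally:
        stop()


# Usage with the client (uncomment to run):
# with temporary_deployment(
#     lambda: arcee.start_deployment(deployment_name=deployment_name, alignment=model_name),
#     lambda: arcee.stop_deployment(deployment_name=deployment_name),
# ):
#     ...  # wait for the endpoint, then call arcee.generate() here
```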

This concludes the model deployment demonstration. Thank you for your time!

If you'd like to know more about using Arcee Cloud in your organization, please visit the [Arcee website](https://www.arcee.ai), or contact [sales@arcee.ai](mailto:sales@arcee.ai).

