## Deploy Falcon using JumpStart

In this notebook, we deploy the out-of-the-box Falcon-7B Instruct model as a SageMaker endpoint. This is an instruction-tuned version of Falcon with 7B parameters. We deploy it to an ml.g5.2xlarge instance, then interact with the model using [LangChain](https://www.langchain.com/) in the next notebook.

#### Imports 

In [None]:
from sagemaker.predictor import Predictor
from sagemaker import get_execution_role
from sagemaker.jumpstart.model import JumpStartModel
import sagemaker
import logging
import boto3
import time
import json

#### Setup essentials 

In [None]:
logger = logging.getLogger('sagemaker')
logger.setLevel(logging.DEBUG)
logger.addHandler(logging.StreamHandler())
logger.info(f'Using sagemaker=={sagemaker.__version__}')
logger.info(f'Using boto3=={boto3.__version__}')

In [None]:
MODEL_ID = 'huggingface-llm-falcon-7b-instruct-bf16'  # JumpStart model ID, fixed for this notebook
INSTANCE_TYPE = 'ml.g5.2xlarge'
MODEL_DATA_DOWNLOAD_TIMEOUT = 3600  # in seconds
CONTAINER_STARTUP_HEALTH_CHECK_TIMEOUT = 3600  # in seconds

# set up roles and clients 
client = boto3.client('sagemaker-runtime')
ROLE = get_execution_role()
logger.info(f'Role => {ROLE}')

In [None]:
unix_time = int(time.time())
endpoint_name = f'{MODEL_ID}-{unix_time}'
logger.info(f'Endpoint name: {endpoint_name}')
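
The endpoint name above is built by joining the model ID and a Unix timestamp. As a minimal sketch, the check below guards that name against the constraints documented for the SageMaker `CreateEndpoint` API (at most 63 characters; alphanumerics and hyphens only, starting and ending with an alphanumeric). The helper `build_endpoint_name` and the two constants are hypothetical names introduced here for illustration; they are not part of the SageMaker SDK.

```python
import re
import time

# Assumption: limits taken from the CreateEndpoint API reference; verify
# against the current documentation before relying on them.
MAX_ENDPOINT_NAME_LENGTH = 63
ENDPOINT_NAME_PATTERN = re.compile(r'[a-zA-Z0-9](-*[a-zA-Z0-9])*')


def build_endpoint_name(model_id, timestamp=None):
    """Hypothetical helper: return '<model_id>-<timestamp>' after validating it."""
    if timestamp is None:
        timestamp = int(time.time())
    name = f'{model_id}-{timestamp}'
    if len(name) > MAX_ENDPOINT_NAME_LENGTH:
        raise ValueError(
            f'Endpoint name is {len(name)} characters; '
            f'the limit is {MAX_ENDPOINT_NAME_LENGTH}: {name}'
        )
    if not ENDPOINT_NAME_PATTERN.fullmatch(name):
        raise ValueError(f'Endpoint name contains invalid characters: {name}')
    return name
```

With the model ID used in this notebook (39 characters) and a 10-digit timestamp, the name comes to 50 characters, so it fits; a longer model ID or suffix could exceed the limit and fail at deployment time rather than here.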

#### Deploy Falcon-7B-instruct model as a SageMaker endpoint

In [None]:
%%time

my_model = JumpStartModel(model_id=MODEL_ID)
predictor = my_model.deploy(
    instance_type=INSTANCE_TYPE,
    endpoint_name=endpoint_name,
    model_data_download_timeout=MODEL_DATA_DOWNLOAD_TIMEOUT,
    container_startup_health_check_timeout=CONTAINER_STARTUP_HEALTH_CHECK_TIMEOUT,
)
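
Once the deployment cell finishes, a quick smoke test can confirm the endpoint responds before moving on to the next notebook. The sketch below assumes the endpoint accepts the Hugging Face text-generation request shape (`inputs` plus a `parameters` dict) and returns a list of objects with a `generated_text` field; this is an assumption about the serving container, not something this notebook confirms. The helpers `build_payload` and `extract_text` are hypothetical names, and the parameter values are illustrative.

```python
import json


def build_payload(prompt, max_new_tokens=64, temperature=0.7):
    """Hypothetical helper: build a text-generation request body."""
    return {
        'inputs': prompt,
        'parameters': {
            'max_new_tokens': max_new_tokens,
            'temperature': temperature,
            'do_sample': True,
        },
    }


def extract_text(response):
    """Hypothetical helper: pull the generated text from a parsed response.

    Accepts either a list of result dicts or a single result dict.
    """
    first = response[0] if isinstance(response, list) else response
    return first['generated_text']


# The payload must be JSON-serializable before it is sent to the endpoint.
payload = build_payload('What is Amazon SageMaker?')
body = json.dumps(payload)

# Usage, once `predictor` from the cell above exists (not executed here):
# response = predictor.predict(payload)
# print(extract_text(response))
```

Keeping the payload construction and response parsing in small functions makes it easy to adjust them if the container expects a different schema.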

In [None]:
# Persist endpoint_name so the next notebook can restore it with `%store -r endpoint_name`
%store endpoint_name