# Deploying a Large Language Model Published on Hugging Face to SageMaker


* Target model
  
  cyberagent/open-calm-3b
  
  https://huggingface.co/cyberagent/open-calm-3b

* Hugging Face Inference Containers

  https://huggingface.co/docs/sagemaker/index


### Installing the SageMaker library

In [6]:
%pip install sagemaker --upgrade


Requirement already up-to-date: sagemaker in /home/vscode/.local/lib/python3.8/site-packages (2.181.0)
Note: you may need to restart the kernel to use updated packages.


---

### Specifying parameters

In [7]:
model_id = 'cyberagent/open-calm-3b'
instance_type = 'ml.g5.2xlarge'
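
As a rough sanity check on the instance choice, the weight footprint can be estimated from the parameter count. The sketch below assumes roughly 3 billion parameters (inferred from the "3b" in the model name, not an exact figure) and counts weights only; activations, the KV cache, and framework overhead are ignored. An `ml.g5.2xlarge` carries a single NVIDIA A10G GPU with 24 GB of memory, so under these assumptions the weights fit at either precision.

```python
# Rough weight-memory estimate; the parameter count is an assumption based on the model name.
BYTES_PER_PARAM = {"float32": 4, "float16": 2}


def weight_memory_gib(num_params: float, dtype: str) -> float:
    """Return the approximate size of the model weights in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


approx_params = 3e9  # assumed from the "3b" in the model name

for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: {weight_memory_gib(approx_params, dtype):.1f} GiB")
```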


### Imports

In [8]:
import sagemaker
import boto3


### Getting the IAM role

In [9]:
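try:
    # Works when the code runs under a SageMaker execution role (e.g. in a SageMaker notebook)
    role = sagemaker.get_execution_role()
except ValueError:
    # Otherwise (e.g. running locally as an IAM user), look the role up by name
    iam = boto3.client('iam')
    role_name = 'AmazonSageMaker-ExecutionRole-20230617T201891' # Role name with `AmazonSageMakerFullAccess` policy attached
    role = iam.get_role(RoleName=role_name)['Role']['Arn']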


Couldn't call 'get_role' to get Role ARN from role name inspiron14 to get Role path.


### Deploying to SageMaker

In [10]:
from sagemaker.huggingface.model import HuggingFaceModel

# Hub model configuration <https://huggingface.co/models>
hub = {
  'HF_MODEL_ID': model_id,            # model_id from hf.co/models
  'HF_TASK': 'text-generation'        # NLP task you want to use for predictions
}

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
  env=hub,                            # configuration for loading model from Hub
  role=role,                          # IAM role with permissions to create an endpoint
  transformers_version='4.28',        # Transformers version used
  pytorch_version='2.0',              # PyTorch version used
  py_version='py310',                 # Python version used
)

# deploy model to SageMaker Inference
predictor = huggingface_model.deploy(
  initial_instance_count=1,
  instance_type=instance_type,
  container_startup_health_check_timeout=300,
)


-----------!
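
The endpoint created here is a regular SageMaker endpoint, so it can also be called without the `predictor` object, for example from another process. The following is a minimal sketch using the low-level `sagemaker-runtime` client from boto3, with the same `inputs` / `parameters` payload shape as the inference cell in this notebook. `build_request_body` and `invoke` are hypothetical helper names introduced for illustration, and the call itself is left commented out because it needs a live endpoint and AWS credentials.

```python
import json


def build_request_body(prompt: str, **parameters) -> str:
    """Serialize a prompt and generation parameters into a JSON request body."""
    return json.dumps({"inputs": prompt, "parameters": parameters}, ensure_ascii=False)


def invoke(endpoint_name: str, body: str):
    """Call the endpoint through the runtime client and decode the JSON reply."""
    import boto3  # imported lazily so build_request_body works without AWS access

    runtime = boto3.client("sagemaker-runtime")
    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=body.encode("utf-8"),
    )
    return json.loads(response["Body"].read())


body = build_request_body("AIによって私達の暮らしは、", max_new_tokens=64, do_sample=True)
# result = invoke(predictor.endpoint_name, body)  # requires a live endpoint
```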

### Inference

In [11]:
from transformers import AutoTokenizer
# The tokenizer is loaded locally only to look up pad_token_id for the request below
tokenizer = AutoTokenizer.from_pretrained(model_id)

data = {
  'inputs': 'AIによって私達の暮らしは、',
  'parameters': {
    'max_new_tokens': 64,
    'do_sample': True,
    'temperature': 0.7,
    'top_p': 0.9,
    'repetition_penalty': 1.05,
    'pad_token_id': tokenizer.pad_token_id,
  }
}

# send the request to the endpoint
result = predictor.predict(data)

result




[{'generated_text': 'AIによって私達の暮らしは、大きく変わりつつあります。そんな中今回紹介するのは、「VRで未来の生活を体験できる」というイベント「HourFace2.0, VR Future Moment.」です。\n3D空間で展開される仮想現実を体感できそうな空間・時間軸の中で様々な体験ができるという内容になっていますが・・これはちょっと凄いかも!'}]
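
In the output above, the response is a list holding one dict, and `generated_text` starts with the prompt itself. A minimal sketch for separating the continuation from the prompt follows; `extract_continuation` is a hypothetical helper (not part of the SageMaker SDK), and the sample is a shortened copy of the response shown above.

```python
def extract_continuation(result, prompt: str) -> str:
    """Return the generated text with the leading prompt removed, if present."""
    text = result[0]["generated_text"]
    return text[len(prompt):] if text.startswith(prompt) else text


prompt = "AIによって私達の暮らしは、"
# Shortened copy of the response shown above, used as sample data
sample = [{"generated_text": "AIによって私達の暮らしは、大きく変わりつつあります。"}]

print(extract_continuation(sample, prompt))
```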

### Deleting the endpoint

In [12]:
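# Delete the endpoint but keep its endpoint configuration
predictor.delete_endpoint(delete_endpoint_config=False)
# Delete the model resource associated with the endpoint
predictor.delete_model()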
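
Because a running endpoint keeps incurring charges, it can help to guarantee that cleanup runs even if an inference call raises. One possible sketch wraps the predictor in a context manager. `cleanup_on_exit` is a hypothetical helper, and it only assumes an object exposing the `delete_endpoint` and `delete_model` methods used above.

```python
from contextlib import contextmanager


@contextmanager
def cleanup_on_exit(predictor):
    """Yield the predictor and always delete the endpoint and model afterwards."""
    try:
        yield predictor
    finally:
        # Same calls, in the same order, as the cell above
        predictor.delete_endpoint(delete_endpoint_config=False)
        predictor.delete_model()


# Usage with a live predictor (not executed here):
# with cleanup_on_exit(predictor) as p:
#     result = p.predict(data)
```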
