# Setup for Fine-tuning Korean ReRanker using Amazon SageMaker
- Container: conda_python3
- We recommend Python 3.10 or later.
- Version check: `!python -V`
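
The version requirement above can also be checked from inside the kernel rather than through the shell. The following is a minimal sketch; the helper name `python_is_recent_enough` and the constant `MIN_PYTHON` are illustrative and not part of the original notebook.

```python
import sys

MIN_PYTHON = (3, 10)  # recommended minimum stated above


def python_is_recent_enough(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True when the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


current = f"{sys.version_info.major}.{sys.version_info.minor}"
if python_is_recent_enough():
    print(f"Python {current} meets the recommendation")
else:
    print(f"Python {current} is older than the recommended 3.10")
```

Unlike `!python -V`, this inspects the interpreter that actually runs the notebook kernel, which may differ from the `python` found on the shell `PATH`.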

## 1. Install the Python SDK
- **The notebook kernel restarts after the packages are installed.**

In [None]:
install_needed = True

In [None]:
import sys
import IPython

if install_needed:
    print("installing deps and restarting kernel")
    !sudo curl -L "https://github.com/docker/compose/releases/download/v2.7.0/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
    !sudo chmod +x /usr/local/bin/docker-compose
    !{sys.executable} -m pip install -U pip
    !{sys.executable} -m pip install -U awscli
    !{sys.executable} -m pip install -U botocore
    !{sys.executable} -m pip install -U boto3
    !{sys.executable} -m pip install -U sagemaker 
    !{sys.executable} -m pip install -U termcolor
    !{sys.executable} -m pip install -U transformers
    !{sys.executable} -m pip install -U datasets
    !{sys.executable} -m pip install -U sentencepiece
    !{sys.executable} -m pip install -U FlagEmbedding

    IPython.Application.instance().kernel.do_shutdown(True)

## 2. Build the serving image
- The fine-tuned reranker model is served with the AWS `HuggingFace Inference Containers`.
    - For details on the native Deep Learning Containers (DLC), see the [list of available images](https://github.com/aws/deep-learning-containers/blob/master/available_images.md).
- Serving works reliably only with `transformers >= 4.36.2`, but the native container ships with transformers 4.28.1.
- For that reason, this example serves the model from a custom container image.
- **[Important] Using ECR requires the `AmazonEC2ContainerRegistryFullAccess` permission.**
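
To confirm that an environment satisfies the `transformers` minimum stated above, a check along the following lines could be run in the notebook or inside the built container. This is a sketch using only the standard library; `release_tuple` and `meets_minimum` are hypothetical helper names, and the comparison deliberately ignores pre-release suffixes such as `.dev0`.

```python
import re
from importlib import metadata

MIN_TRANSFORMERS = "4.36.2"  # minimum stated above


def release_tuple(version):
    """Extract the leading numeric release, e.g. '4.36.2.dev0' -> (4, 36, 2)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed, minimum=MIN_TRANSFORMERS):
    """Compare two version strings by their numeric release components."""
    return release_tuple(installed) >= release_tuple(minimum)


try:
    installed = metadata.version("transformers")
    status = "ok" if meets_minimum(installed) else f"older than {MIN_TRANSFORMERS}"
    print(f"transformers {installed}: {status}")
except metadata.PackageNotFoundError:
    print("transformers is not installed in this environment")
```

For strict handling of pre-releases and other edge cases, the `packaging` library's version parser would be a more robust choice than this simplified comparison.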

In [None]:
import boto3
from utils.ecr import ecr_handler

In [None]:
%%writefile src/serving/Dockerfile-serving

FROM 763104351884.dkr.ecr.us-east-1.amazonaws.com/huggingface-pytorch-inference:2.0.0-transformers4.28.1-gpu-py310-cu118-ubuntu20.04
RUN pip install -U pip
RUN pip install -U botocore
RUN pip install -U awscli
RUN pip install -U boto3
RUN pip install -U sagemaker
RUN pip install -U transformers
ENV PYTHONUNBUFFERED=TRUE

In [None]:
build_image = True

### **[Caution]** Do not change the region and account ID in the code below
`ecr.build_docker(docker_dir, dockerfile, repository_name, strRegionName="us-east-1", strAccountId="763104351884")`
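
The fixed values appear to match the registry that hosts the base image named in the Dockerfile's `FROM` line, not your own account, which would explain why they should stay unchanged. The sketch below splits an ECR image URI into its parts to make that visible; `parse_ecr_uri` is a hypothetical helper, and the pattern assumes the standard `<account>.dkr.ecr.<region>.amazonaws.com/<repository>[:<tag>]` layout.

```python
import re

# Assumed layout: <account>.dkr.ecr.<region>.amazonaws.com/<repository>[:<tag>]
ECR_URI_PATTERN = re.compile(
    r"^(?P<account_id>\d{12})\.dkr\.ecr\.(?P<region>[a-z0-9-]+)\.amazonaws\.com/"
    r"(?P<repository>[^:]+)(?::(?P<tag>.+))?$"
)


def parse_ecr_uri(uri):
    """Split an ECR image URI into account_id, region, repository and tag."""
    match = ECR_URI_PATTERN.match(uri)
    if match is None:
        raise ValueError(f"not a recognized ECR image URI: {uri!r}")
    return match.groupdict()


# Base image taken from the FROM line of Dockerfile-serving
base_image = (
    "763104351884.dkr.ecr.us-east-1.amazonaws.com/"
    "huggingface-pytorch-inference:2.0.0-transformers4.28.1-gpu-py310-cu118-ubuntu20.04"
)
parts = parse_ecr_uri(base_image)
print(parts["account_id"], parts["region"])
```

The account ID and region extracted here are the same values passed as `strAccountId` and `strRegionName`, while the image you push is registered under your own account and region.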

In [None]:
if build_image:

    ecr = ecr_handler()
    region = boto3.Session().region_name
    account_id = boto3.client("sts").get_caller_identity().get("Account")

    repository_name = "ko-reranker-serve"  ## <-- set your preferred Docker repository name
    repository_name = repository_name.lower()
    dockerfile = "Dockerfile-serving"
    docker_dir = "./src/serving/"
    tag = "latest"

    ecr.build_docker(docker_dir, dockerfile, repository_name, strRegionName="us-east-1", strAccountId="763104351884")
    ecr_repository_uri = ecr.register_image_to_ecr(region, account_id, repository_name, tag)
    
else:
    ecr_repository_uri = "<your ecr repo uri>"  # e.g. "419974056037.dkr.ecr.us-east-1.amazonaws.com/ko-reranker-serve"