# GPT-SoVITS on SageMaker

## build image

**Note**
- Building the Docker image from the notebook terminal is the recommended approach.
- Model downloads from ModelScope sometimes stall. If that happens, re-run the command that builds the Docker image.
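The re-run advice in the note can be automated. The following is a minimal sketch, assuming the build script exits with a non-zero code when a download fails; `run_with_retries` is a hypothetical helper name and is not part of this repository.

```python
import subprocess
import time


def run_with_retries(cmd, attempts=3, delay_seconds=30):
    """Run cmd until it exits with code 0 or the attempts are used up.

    Returns the exit code of the last run. Hypothetical helper for illustration.
    """
    returncode = 1
    for attempt in range(1, attempts + 1):
        returncode = subprocess.run(cmd).returncode
        if returncode == 0:
            break
        print(f"Attempt {attempt}/{attempts} failed with exit code {returncode}")
        if attempt < attempts:
            time.sleep(delay_seconds)
    return returncode


# Usage, from the directory that contains build_and_push.sh:
# run_with_retries(["bash", "./build_and_push.sh"])
```

A stalled download that never exits would not be caught by this loop; in that case interrupt the build manually and start it again.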

In [None]:
# Copy this command and execute it in a terminal
# !chmod +x ./*.sh && ./build_and_push.sh 

In [16]:
pip show sagemaker

Name: sagemaker
Version: 2.229.0
Summary: UNKNOWN
Home-page: UNKNOWN
Author: 
Author-email: 
License: UNKNOWN
Location: /home/ec2-user/anaconda3/envs/pytorch_p310/lib/python3.10/site-packages
Requires: attrs, boto3, cloudpickle, docker, google-pasta, importlib-metadata, jsonschema, numpy, packaging, pandas, pathos, platformdirs, protobuf, psutil, PyYAML, requests, schema, smdebug-rulesconfig, tblib, tqdm, urllib3
Required-by: sagemaker-ssh-helper
Note: you may need to restart the kernel to use updated packages.


In [2]:
!pip install boto3 sagemaker awscli sagemaker_ssh_helper -U

Collecting boto3
  Downloading boto3-1.35.2-py3-none-any.whl.metadata (6.6 kB)
Collecting sagemaker
  Using cached sagemaker-2.229.0-py3-none-any.whl.metadata (4.1 kB)
Collecting sagemaker_ssh_helper
  Using cached sagemaker_ssh_helper-2.2.0-py3-none-any.whl.metadata (3.1 kB)
Collecting botocore<1.36.0,>=1.35.2 (from boto3)
  Downloading botocore-1.35.2-py3-none-any.whl.metadata (5.7 kB)
Downloading boto3-1.35.2-py3-none-any.whl (139 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 139.1/139.1 kB 11.3 MB/s eta 0:00:00
Using cached sagemaker-2.229.0-py3-none-any.whl (1.5 MB)
Using cached sagemaker_ssh_helper-2.2.0-py3-none-any.whl (98 kB)
Downloading botocore-1.35.2-py3-none-any.whl (12.5 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12.5/12.5 MB 105.2 MB/s eta 0:00:00
Installing collected packages: botocore, boto3, sagemaker, sagemaker_ssh_helper
  Attempting uninstall: botocor

In [4]:
import boto3
import sagemaker
from sagemaker import Model, image_uris, serializers, deserializers

role = sagemaker.get_execution_role()  # execution role for the endpoint
sess = sagemaker.session.Session()  # sagemaker session for interacting with different AWS APIs
region = sess.boto_region_name  # region name of the current SageMaker session (public attribute)
account_id = sess.account_id()  # account_id of the current SageMaker Studio environment
bucket = sess.default_bucket()
image="gpt-sovits-inference-v2"
s3_client = boto3.client("s3")
sm_client = boto3.client("sagemaker")
smr_client = boto3.client("sagemaker-runtime")

full_image_uri=f"{account_id}.dkr.ecr.{region}.amazonaws.com/{image}:latest"
print(full_image_uri)


sagemaker.config INFO - Not applying SDK defaults from location: /etc/xdg/sagemaker/config.yaml
sagemaker.config INFO - Not applying SDK defaults from location: /home/ec2-user/.config/sagemaker/config.yaml
596899493901.dkr.ecr.us-east-1.amazonaws.com/gpt-sovits-inference-v2:latest
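The image URI printed above follows the standard ECR naming pattern. As a small sketch, the same string can be built by a helper that also covers the China regions, where the registry domain ends in `amazonaws.com.cn`. The function name `ecr_image_uri` and the account ID in the example are placeholders.

```python
def ecr_image_uri(account_id, region, repository, tag="latest"):
    """Build an ECR image URI; China regions use the amazonaws.com.cn domain."""
    domain = "amazonaws.com.cn" if region.startswith("cn-") else "amazonaws.com"
    return f"{account_id}.dkr.ecr.{region}.{domain}/{repository}:{tag}"


# 123456789012 is a placeholder account ID
print(ecr_image_uri("123456789012", "us-east-1", "gpt-sovits-inference-v2"))
# prints 123456789012.dkr.ecr.us-east-1.amazonaws.com/gpt-sovits-inference-v2:latest
```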


## Prepare and upload deploy codes

In [None]:
!rm -f GPT-SoVITS.tar.gz
!tar -czvf GPT-SoVITS.tar.gz --transform 's,^,GPT-SoVITS/,' . --exclude='*.ipynb' --exclude='serve' --exclude='docs' --exclude='GPT-SoVITS.*' --exclude='./.*'
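The `--transform` option in the command above is meant to place every archive entry under a top-level `GPT-SoVITS/` directory. A quick way to confirm this before uploading is sketched below; `members_outside_prefix` is a hypothetical helper name.

```python
import tarfile


def members_outside_prefix(archive_path, prefix="GPT-SoVITS/"):
    """Return the archive member names that do not start with the expected prefix."""
    with tarfile.open(archive_path, "r:gz") as archive:
        return [name for name in archive.getnames() if not name.startswith(prefix)]


# Usage: an empty list means every entry sits under GPT-SoVITS/
# print(members_outside_prefix("GPT-SoVITS.tar.gz"))
```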

In [None]:
s3_code_prefix = "gpt_sovits_codes"
bucket = sess.default_bucket()
code_artifact = sess.upload_data("GPT-SoVITS.tar.gz", bucket, s3_code_prefix)
print(f"S3 Code or Model tar ball uploaded to --- > {code_artifact}")

## Remote debug test 
Because GPT-SoVITS is deployed with the BYOC (Bring Your Own Container) method, you can deploy and debug the initial code with [SSH Helper](https://github.com/aws-samples/sagemaker-ssh-helper/blob/main/README.md). Once debugging succeeds, deploy with the regular method.

1. Deploy the model with SageMaker SSH Helper (see [Setting up your AWS account with IAM and SSM configuration](https://github.com/aws-samples/sagemaker-ssh-helper/blob/main/IAM_SSM_Setup.md)).
2. Once you have the instance ID, connect to the instance (for example with `aws ssm start-session --target <instance_id>`) and debug.

In [6]:
from sagemaker_ssh_helper.wrapper import SSHModelWrapper

# Assumption: the tarball uploaded above (code_artifact) serves as the model data
model = Model(
    image_uri=full_image_uri,
    model_data=code_artifact,
    role=role,
    dependencies=[SSHModelWrapper.dependency_dir()],
)

In [7]:
from time import gmtime, strftime

instance_type = "ml.g5.xlarge"
endpoint_name = sagemaker.utils.name_from_base("gpt-sovits-inference")

# Wrap the model so that SSH Helper is set up on the endpoint instance
ssh_wrapper = SSHModelWrapper.create(model, connection_wait_time_seconds=0)

predictor = model.deploy(
    initial_instance_count=1,
    instance_type=instance_type,
    endpoint_name=endpoint_name,
    wait=False
)


# instance_ids = ssh_wrapper.get_instance_ids(timeout_in_sec=900)  # <--NEW-- 
# print(f"To connect over SSM run: aws ssm start-session --target {instance_ids[0]}")

In [None]:
import time
sm_client = boto3.client("sagemaker")
resp = sm_client.describe_endpoint(EndpointName=endpoint_name)
status = resp["EndpointStatus"]
print("Status: " + status)

while status == "Creating":
    time.sleep(60)
    resp = sm_client.describe_endpoint(EndpointName=endpoint_name)
    status = resp["EndpointStatus"]
    print("Status: " + status)

print("Arn: " + resp["EndpointArn"])
print("Status: " + status)
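The polling loop above stops as soon as the status is no longer `Creating`, which also happens when the deployment fails. A slightly stricter variant is sketched below. It takes the boto3 SageMaker client as a parameter; `wait_for_endpoint` and its timeout values are illustrative choices, not part of the original notebook.

```python
import time


def wait_for_endpoint(sm_client, endpoint_name, poll_seconds=60, timeout_seconds=1800):
    """Poll describe_endpoint until the endpoint is InService; raise on failure or timeout."""
    transitional = ("Creating", "Updating", "SystemUpdating", "RollingBack")
    deadline = time.monotonic() + timeout_seconds
    while True:
        resp = sm_client.describe_endpoint(EndpointName=endpoint_name)
        status = resp["EndpointStatus"]
        print("Status: " + status)
        if status == "InService":
            return status
        if status not in transitional:
            # FailureReason is only present when the endpoint has failed
            reason = resp.get("FailureReason", "no reason reported")
            raise RuntimeError(f"Endpoint {endpoint_name} is {status}: {reason}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Endpoint {endpoint_name} still {status} after {timeout_seconds}s")
        time.sleep(poll_seconds)


# Usage with the client created earlier in this notebook:
# wait_for_endpoint(sm_client, endpoint_name)
```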

In [10]:
# aws ssm start-session --target <Your_instance_ids>
instance_ids = ssh_wrapper.get_instance_ids(timeout_in_sec=0)
print(instance_ids[0])

mi-0f428d4597ece3b03


## SM Endpoint deployment (regular method for production)

After debugging with SSH Helper is complete, use the following code blocks for the production deployment.

Remember to delete the endpoint created for SSH Helper promptly, because it keeps an instance running. The cleanup commands are at the end of this notebook.
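One way to make that cleanup harder to forget is to wrap the three delete calls in a helper. This is a sketch that takes a boto3 SageMaker client as its first argument; `delete_endpoint_resources` is a hypothetical name, and you need to pass the names that were actually used for the SSH Helper deployment.

```python
def delete_endpoint_resources(sm_client, endpoint_name, endpoint_config_name=None, model_name=None):
    """Delete an endpoint, then its endpoint config and model when their names are given.

    Returns a list of (resource_type, name) tuples for the resources deleted.
    """
    deleted = []
    sm_client.delete_endpoint(EndpointName=endpoint_name)
    deleted.append(("endpoint", endpoint_name))
    if endpoint_config_name:
        sm_client.delete_endpoint_config(EndpointConfigName=endpoint_config_name)
        deleted.append(("endpoint_config", endpoint_config_name))
    if model_name:
        sm_client.delete_model(ModelName=model_name)
        deleted.append(("model", model_name))
    return deleted


# Usage (names are placeholders):
# delete_endpoint_resources(sm_client, "my-debug-endpoint", "my-debug-endpoint", "my-debug-model")
```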

### create SageMaker model

In [8]:
import boto3
from time import gmtime, strftime  # gmtime is needed for the timestamped names below

sm_client = boto3.client(service_name='sagemaker')


def create_model():
    image=full_image_uri
    model_name="gpt-sovits-sagemaker-"+strftime("%Y-%m-%d-%H-%M-%S", gmtime())
    create_model_response = sm_client.create_model(
        ModelName=model_name,
        ExecutionRoleArn=role,
        Containers=[{"Image": image}],
    )
    print(create_model_response)
    return model_name

In [9]:
model_name=create_model()

{'ModelArn': 'arn:aws:sagemaker:us-east-1:596899493901:model/gpt-sovits-sagemaker-2024-08-22-15-37-26', 'ResponseMetadata': {'RequestId': '6e4fcfb7-47fa-4d18-8abb-5126cfe53a6f', 'HTTPStatusCode': 200, 'HTTPHeaders': {'x-amzn-requestid': '6e4fcfb7-47fa-4d18-8abb-5126cfe53a6f', 'content-type': 'application/x-amz-json-1.1', 'content-length': '102', 'date': 'Thu, 22 Aug 2024 15:37:26 GMT'}, 'RetryAttempts': 0}}


### create endpoint configuration

In [10]:
endpoint_name="gpt-sovits-sagemaker-endpoint-v2-"+strftime("%Y-%m-%d-%H-%M-%S", gmtime())
def create_endpoint_configuration():
    create_endpoint_config_response = sm_client.create_endpoint_config(     
        EndpointConfigName=endpoint_name,
        ProductionVariants=[
            {
                #"ModelName":"gpt-sovits-sagemaker-012024-03-28-04-00-03",
                "ModelName":model_name,
                "VariantName": "gpt-sovits-sagemaker"+"-variant",
                "InstanceType": "ml.g5.xlarge",  # use a g5.xlarge instance
                "InitialInstanceCount": 1,
                "ModelDataDownloadTimeoutInSeconds": 1200,
                "ContainerStartupHealthCheckTimeoutInSeconds": 1200,
            }
        ],
    )
    print(create_endpoint_config_response)
    return endpoint_name


In [None]:
create_endpoint_configuration()

### create endpoint

In [14]:
# endpointName="gpt-sovits-sagemaker-endpoint-v2-"+strftime("%Y-%m-%d-%H-%M-%S", gmtime())
def create_endpoint():
    create_endpoint_response = sm_client.create_endpoint(
        EndpointName=endpoint_name,
        #EndpointConfigName="gpt-sovits-sagemaker-configuration2024-03-28-04-03-53",
        EndpointConfigName=endpoint_name
    )
    print("Endpoint Arn: " + create_endpoint_response["EndpointArn"])
    resp = sm_client.describe_endpoint(EndpointName=endpoint_name)
    print("Endpoint Status: " + resp["EndpointStatus"])
    print("Waiting for {} endpoint to be in service".format("gpt-sovits-sagemaker-endpoint"))
    waiter = sm_client.get_waiter("endpoint_in_service")
    waiter.wait(EndpointName=endpoint_name)

In [15]:
endpoint_name

'gpt-sovits-sagemaker-endpoint-v2-2024-08-22-11-20-45'

In [16]:
create_endpoint()

Endpoint Arn: arn:aws:sagemaker:us-east-1:596899493901:endpoint/gpt-sovits-sagemaker-endpoint-v2-2024-08-22-11-20-45
Endpoint Status: Creating
Waiting for gpt-sovits-sagemaker-endpoint endpoint to be in service


## Realtime inference with SageMaker endpoint

Remember to replace the EndpointName passed into the invoke function with the actual endpoint name of your deployment.
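If the endpoint name is no longer at hand, it can be looked up by name fragment. The sketch below uses the boto3 `list_endpoints` call and only reads the first page of results; `find_in_service_endpoints` and the default search string are illustrative.

```python
def find_in_service_endpoints(sm_client, name_contains="gpt-sovits"):
    """Return names of InService endpoints containing name_contains, newest first."""
    response = sm_client.list_endpoints(
        NameContains=name_contains,
        StatusEquals="InService",
        SortBy="CreationTime",
        SortOrder="Descending",
    )
    return [item["EndpointName"] for item in response["Endpoints"]]


# Usage with a boto3 SageMaker client:
# print(find_in_service_endpoints(sm_client))
```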

In [16]:
import json
import boto3
runtime_sm_client = boto3.client(service_name="sagemaker-runtime")

def invoke_endpoint(request):
    content_type = "application/json"
    request_body = request
    payload = json.dumps(request_body)
    print(payload)
    response = runtime_sm_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType=content_type,
        Body=payload,
    )
    result = response['Body'].read().decode()
    print('Response:', result)
    return result

In [17]:

request = {"refer_wav_path":"s3://tts-xq/test-data/音质好.wav",
    "prompt_text": "脚下当心！这位客官，想照顾我们往生堂的生意，也不必这么心急嘛？你没什么事吧？嗯？麻烦的家伙。",
    "prompt_language":"zh",
    "text":"逃课上网，打架斗殴，上课睡觉，样样俱全；你、你真是…孺子不可教也！！！",
    "text_language" :"zh",
    "output_s3uri":"s3://tts-xq/gpt_sovits_output/wav/"}

result = invoke_endpoint(request)

{"refer_wav_path": "s3://tts-xq/test-data/\u97f3\u8d28\u597d.wav", "prompt_text": "\u811a\u4e0b\u5f53\u5fc3\uff01\u8fd9\u4f4d\u5ba2\u5b98\uff0c\u60f3\u7167\u987e\u6211\u4eec\u5f80\u751f\u5802\u7684\u751f\u610f\uff0c\u4e5f\u4e0d\u5fc5\u8fd9\u4e48\u5fc3\u6025\u561b\uff1f\u4f60\u6ca1\u4ec0\u4e48\u4e8b\u5427\uff1f\u55ef\uff1f\u9ebb\u70e6\u7684\u5bb6\u4f19", "prompt_language": "zh", "text": "\u9003\u8bfe\u4e0a\u7f51\uff0c\u6253\u67b6\u6597\u6bb4\uff0c\u4e0a\u8bfe\u7761\u89c9\uff0c\u6837\u6837\u4ff1\u5168\uff1b\u4f60\u3001\u4f60\u771f\u662f\u2026\u5b7a\u5b50\u4e0d\u53ef\u6559\u4e5f\uff01\uff01\uff01", "text_language": "zh", "output_s3uri": "s3://tts-xq/gpt_sovits_output/wav/"}
Response: {"result": "s3://tts-xq/gpt_sovits_output/wav/gpt_sovits_1724428901895.mp3"}


In [18]:
result

'{"result": "s3://tts-xq/gpt_sovits_output/wav/gpt_sovits_1724428901895.mp3"}'

In [20]:
results_audio = json.loads(result)["result"]  # parse the JSON response instead of evaluating it

In [21]:
# you can download it from s3 console
!aws s3 cp $results_audio ./

download: s3://tts-xq/gpt_sovits_output/wav/gpt_sovits_1724428901895.mp3 to ./gpt_sovits_1724428901895.mp3
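The same download can be done from Python instead of the AWS CLI. The sketch below splits the returned S3 URI into bucket and key, which is the form that the boto3 `download_file` call expects. `split_s3_uri` is a hypothetical helper, and the sample response is the one shown above.

```python
import json


def split_s3_uri(uri):
    """Split 's3://bucket/key' into (bucket, key); raise ValueError for anything else."""
    scheme = "s3://"
    if not uri.startswith(scheme):
        raise ValueError(f"Not an S3 URI: {uri}")
    bucket_name, _, key = uri[len(scheme):].partition("/")
    if not bucket_name or not key:
        raise ValueError(f"S3 URI needs both a bucket and a key: {uri}")
    return bucket_name, key


response_text = '{"result": "s3://tts-xq/gpt_sovits_output/wav/gpt_sovits_1724428901895.mp3"}'
bucket_name, key = split_s3_uri(json.loads(response_text)["result"])
print(bucket_name, key)

# With a boto3 S3 client, the file can then be fetched with:
# s3_client.download_file(bucket_name, key, key.rsplit("/", 1)[-1])
```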


### Replace the base GPT and SoVITS models at invocation time

In [None]:
# Change gpt_weights and sovits_weights

request = {"refer_wav_path":"s3://tts-xq/test-data/音质好.wav",
    "prompt_text": "脚下当心！这位客官，想照顾我们往生堂的生意，也不必这么心急嘛？你没什么事吧？嗯？麻烦的家伙。",
    "prompt_language":"zh",
    "text":"逃课上网，打架斗殴，上课睡觉，样样俱全；你、你真是…孺子不可教也！！！",
    "text_language" :"zh",
    "output_s3uri":"s3://tts-xq/gpt_sovits_output/wav/",
    "gpt_weights_path":"s3://asr-xq/gaoguai001-e15.ckpt",
    "sovits_weights_path":"s3://asr-xq/gaoguai001_e8_s96.pth"
}

result = invoke_endpoint(request)
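Since the two requests differ only in the optional weight paths, a small builder can keep them consistent. This is a sketch: the field names are the ones used in the requests above, while `build_request` itself is a hypothetical helper and does not check what the endpoint accepts.

```python
def build_request(text, text_language, refer_wav_path, prompt_text, prompt_language,
                  output_s3uri, gpt_weights_path=None, sovits_weights_path=None):
    """Assemble the request body; weight paths are included only when provided."""
    request = {
        "refer_wav_path": refer_wav_path,
        "prompt_text": prompt_text,
        "prompt_language": prompt_language,
        "text": text,
        "text_language": text_language,
        "output_s3uri": output_s3uri,
    }
    if gpt_weights_path:
        request["gpt_weights_path"] = gpt_weights_path
    if sovits_weights_path:
        request["sovits_weights_path"] = sovits_weights_path
    return request


# Usage with placeholder S3 locations:
# request = build_request("text to speak", "zh", "s3://my-bucket/ref.wav", "reference text", "zh",
#                         "s3://my-bucket/output/", gpt_weights_path="s3://my-bucket/model.ckpt",
#                         sovits_weights_path="s3://my-bucket/model.pth")
```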

## Streams test (only for stream branch deployment)

In [23]:
import json


def invoke_streams_endpoint(smr_client, endpointName, request):
    """Invoke the streaming endpoint and return one dict per received chunk."""
    payload = json.dumps(request, ensure_ascii=False)

    response_model = smr_client.invoke_endpoint_with_response_stream(
        EndpointName=endpointName,
        ContentType="application/json",
        Body=payload,
    )
    print(response_model['ResponseMetadata'])

    result = []
    # Each event in the stream carries one payload part with the raw bytes
    for index, event in enumerate(response_model['Body']):
        if index == 0:
            print("Received first chunk")
        chunk_dict = {
            'first_chunk': index == 0,
            'bytes': event['PayloadPart']['Bytes'],
            'last_chunk': False,
            'index': index,
        }
        print("chunk len:", len(chunk_dict['bytes']))
        result.append(chunk_dict)

    print("All chunks processed")
    if result:
        # The stream is exhausted, so the final entry is the last chunk
        result[-1]['last_chunk'] = True
    print("result", result)
    return result


In [27]:
import json
import boto3
# endpointName="gpt-sovits-inference-2024-05-17-13-49-58-483"
runtime_sm_client = boto3.client(service_name="sagemaker-runtime")
#endpointName="gpt-sovits-sagemaker-endpoint2024-04-03-23-49-44"



request = {"refer_wav_path":"s3://tts-xq/test-data/音质好.wav",
    "prompt_text": "脚下当心！这位客官，想照顾我们往生堂的生意，也不必这么心急嘛？你没什么事吧？嗯？麻烦的家伙。",
    "prompt_language":"ja",
    "text":"『白夜行』はとても美しい小説で、私はとても夢中になって読んで、時には何時間も休まないで、私は中の主人公が大好きです",
    "text_language" :"ja",
    "output_s3uri":"s3://tts-xq/gpt_sovits_output/wav/",
    "cut_punc":"、"}


In [28]:
response=invoke_streams_endpoint(runtime_sm_client,endpoint_name,request)

{'RequestId': 'bcf07c35-f3f1-40f9-b14f-0ff6bd1d0e3f', 'HTTPStatusCode': 200, 'HTTPHeaders': {'x-amzn-requestid': 'bcf07c35-f3f1-40f9-b14f-0ff6bd1d0e3f', 'x-amzn-invoked-production-variant': 'AllTraffic', 'x-amzn-sagemaker-content-type': 'application/json', 'date': 'Fri, 23 Aug 2024 16:05:13 GMT', 'content-type': 'application/vnd.amazon.eventstream', 'transfer-encoding': 'chunked', 'connection': 'keep-alive'}, 'RetryAttempts': 0}
Received first chunk
chunk len: 76
chunk len: 76
chunk len: 76
chunk len: 76
All chunks processed
result [{'first_chunk': True, 'bytes': b'{"result": "s3://tts-xq/gpt_sovits_output/wav/gpt_sovits_1724429115196.mp3"}', 'last_chunk': False, 'index': 0}, {'first_chunk': False, 'bytes': b'{"result": "s3://tts-xq/gpt_sovits_output/wav/gpt_sovits_1724429116106.mp3"}', 'last_chunk': False, 'index': 1}, {'first_chunk': False, 'bytes': b'{"result": "s3://tts-xq/gpt_sovits_output/wav/gpt_sovits_1724429117000.mp3"}', 'last_chunk': False, 'index': 2}, {'first_chunk': False
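In the output above, each chunk holds a complete JSON document with a `result` key. Under that assumption, the audio locations can be collected in playback order as sketched below; `chunk_results` is a hypothetical helper, and it would need buffering if a JSON document were ever split across chunks.

```python
import json


def chunk_results(chunks):
    """Decode each chunk's bytes as JSON and return the "result" values in index order."""
    ordered = sorted(chunks, key=lambda chunk: chunk["index"])
    return [json.loads(chunk["bytes"].decode("utf-8"))["result"] for chunk in ordered]


# Usage with the list returned by invoke_streams_endpoint:
# for uri in chunk_results(response):
#     print(uri)
```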

In [29]:
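# endpointName="gpt-sovits-sagemaker-endpoint-v2-2024-08-22-04-37-40"
# Delete the endpoint and its endpoint configuration (both are named endpoint_name)
sess.delete_endpoint(endpoint_name)
sess.delete_endpoint_config(endpoint_name)
# Delete the model created through the SageMaker SDK Model object
model.delete_model()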