# Yomitoku Pro Document Analyzer in SageMaker or CloudFormation

## Before use
Deploy the service in advance using **CloudFormation** or the **SageMaker console**.

**Required permissions**: Ensure your AWS user or role has SageMaker full-access permissions (for example, the `AmazonSageMakerFullAccess` managed policy).

---

### Option 1 — Using SageMaker Console

#### Step 1: Model creation
1. Go to **AWS Marketplace** → Find your subscribed product → Click **Configure** (設定 in Japanese).  
2. Select **SageMaker Console** as the launch method.  
3. In SageMaker console, set the following:  
   - **Model name**: Any name (e.g., `my-ml-model`)  
   - **IAM role**: A role that allows SageMaker to access S3 and other resources  
   - **Container settings**: For Marketplace models, this is usually pre-filled automatically  
4. Click **Create** to register the model.  

---

#### Step 2: Endpoint creation
1. Open the page of the model you just created, then click **Create endpoint**.  
2. Configure the endpoint:  
   - **Endpoint name**: Any name (e.g., `my-ml-endpoint`)  
   - **Instance type**: Choose a supported type (recommended: `ml.g4dn.xlarge` or `ml.g5.xlarge`)  
   - **Number of instances**: Start with 1 (you can enable Auto Scaling later if needed)  
3. Click **Create**.  
4. Wait until the endpoint status becomes **InService**.  

---

✅ Once the endpoint is **InService**, you can start sending real-time inference requests.  
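
If you prefer to wait for **InService** from a script rather than watching the console, a polling loop along these lines may help. This is a minimal sketch: `wait_until_in_service` and its `delay`/`max_attempts` defaults are illustrative names and values, and `sm_client` is assumed to be a boto3 SageMaker client such as `boto3.client("sagemaker")`.

```python
import time


def wait_until_in_service(sm_client, endpoint_name, delay=30, max_attempts=60,
                          sleep=time.sleep):
    """Poll describe_endpoint until the endpoint reports InService.

    sm_client is assumed to be a boto3 SageMaker client; delay (seconds) and
    max_attempts are illustrative defaults, not recommended values.
    """
    for _ in range(max_attempts):
        response = sm_client.describe_endpoint(EndpointName=endpoint_name)
        status = response["EndpointStatus"]
        if status == "InService":
            return status
        if status == "Failed":
            raise RuntimeError(f"Endpoint {endpoint_name} failed to deploy")
        # Any other status (for example Creating): wait, then poll again
        sleep(delay)
    raise TimeoutError(
        f"Endpoint {endpoint_name} was not InService after {max_attempts} checks"
    )
```

boto3 also ships a built-in waiter, `sm_client.get_waiter("endpoint_in_service")`, which covers the same case with less code; the explicit loop is shown only to make the status check visible.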

**To delete (stop billing):**  
1. Go to **SageMaker Console → Endpoints**  
2. Select your endpoint → Click **Delete**  
3. (Optional) Also delete the **Endpoint configuration** and **Model** if no longer needed  
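
The same cleanup can be scripted. The sketch below assumes `sm_client` is a boto3 SageMaker client; `delete_endpoint_resources` is a hypothetical helper name. It looks up the endpoint configuration and model names before deleting the endpoint, because they are harder to find once the endpoint is gone.

```python
def delete_endpoint_resources(sm_client, endpoint_name, delete_config_and_models=True):
    """Delete an endpoint and, optionally, its endpoint configuration and models.

    sm_client is assumed to be a boto3 SageMaker client.
    Returns the names of the resources that were deleted.
    """
    # Resolve the linked resources while the endpoint still exists
    endpoint = sm_client.describe_endpoint(EndpointName=endpoint_name)
    config_name = endpoint["EndpointConfigName"]
    config = sm_client.describe_endpoint_config(EndpointConfigName=config_name)
    model_names = [
        variant["ModelName"]
        for variant in config["ProductionVariants"]
        if "ModelName" in variant
    ]

    # Step 2 in the list above: delete the endpoint itself
    sm_client.delete_endpoint(EndpointName=endpoint_name)
    deleted = {"endpoint": endpoint_name, "endpoint_config": None, "models": []}

    # Step 3 (optional): remove the configuration and the model(s)
    if delete_config_and_models:
        sm_client.delete_endpoint_config(EndpointConfigName=config_name)
        deleted["endpoint_config"] = config_name
        for name in model_names:
            sm_client.delete_model(ModelName=name)
            deleted["models"].append(name)
    return deleted
```

Pass `delete_config_and_models=False` if you plan to recreate the endpoint later from the same configuration.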

**Note**: The instance type is selected during the **endpoint configuration step** after the model is created.

---

### Option 2 — Using CloudFormation
1. Go to **AWS Marketplace** product page → **Configure** → Select **CloudFormation**  
2. Click the provided **Launch Stack** link  
3. In the CloudFormation page:  
   - Set **Stack name** (any name)  
   - Confirm parameters (instance type: recommended `ml.g4dn.xlarge` or `ml.g5.xlarge`)  
4. Click **Create stack** → Wait until status is **CREATE_COMPLETE**  
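
To check the stack from code instead of the console, something like the following could be used. It is a sketch under assumptions: `cfn_client` is a boto3 CloudFormation client (`boto3.client("cloudformation")`), `wait_for_stack` is a hypothetical helper, and whether this product's template defines any stack outputs is not stated here, so the returned dictionary may be empty.

```python
import time

# Statuses that mean stack creation will not reach CREATE_COMPLETE
FAILED_STATES = {
    "CREATE_FAILED",
    "ROLLBACK_IN_PROGRESS",
    "ROLLBACK_COMPLETE",
    "ROLLBACK_FAILED",
}


def wait_for_stack(cfn_client, stack_name, delay=15, max_attempts=80,
                   sleep=time.sleep):
    """Poll describe_stacks until CREATE_COMPLETE and return the stack outputs.

    cfn_client is assumed to be a boto3 CloudFormation client; delay and
    max_attempts are illustrative defaults.
    """
    for _ in range(max_attempts):
        stack = cfn_client.describe_stacks(StackName=stack_name)["Stacks"][0]
        status = stack["StackStatus"]
        if status == "CREATE_COMPLETE":
            # Outputs may be absent; return them as a plain key/value dict
            return {o["OutputKey"]: o["OutputValue"] for o in stack.get("Outputs", [])}
        if status in FAILED_STATES:
            reason = stack.get("StackStatusReason", "no reason given")
            raise RuntimeError(f"Stack {stack_name} is {status}: {reason}")
        sleep(delay)  # still CREATE_IN_PROGRESS
    raise TimeoutError(f"Stack {stack_name} did not finish after {max_attempts} checks")
```

boto3's built-in `cfn_client.get_waiter("stack_create_complete")` is a shorter alternative when you do not need the failure reason.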

**To delete (stop billing):**  
1. Go to **CloudFormation Console**  
2. Select your stack → Click **Delete**  
3. Wait until status is **DELETE_COMPLETE**  

---

## Before use
Deploy this service in advance using **CloudFormation** or the **SageMaker** console.

**Required permissions**: Make sure your AWS user or role has SageMaker FullAccess permissions.

---

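### Option 1: Deploying with the SageMaker console
#### Step 1: Model creation
1. Open **AWS Marketplace** → **Manage subscriptions**.  
2. From the target model's "Actions" menu, click **Configure**.  
3. Select **SageMaker console** as the launch method.  
4. Once you are redirected to the SageMaker console, set the following:  
   - **Model name**: Any name (e.g., `my-ml-model`)  
   - **IAM role**: Select a role that allows SageMaker to access S3 and other resources, as needed  
   - **Container settings**: Usually filled in automatically for Marketplace models  
5. Click **Create** to register the model.  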

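#### Step 2: Endpoint creation
1. Open the page of the model you created, then click **Create endpoint**.  
2. Set the following:  
   - **Endpoint name**: Any name (e.g., `my-ml-endpoint`)  
   - **Instance type**: Choose one that the model supports (recommended: `ml.g4dn.xlarge` or `ml.g5.xlarge`)  
   - **Number of instances**: Usually start with 1 (Auto Scaling can be configured if you need to scale)  
3. Click **Create**.  
4. Wait a few minutes until the status becomes **InService**.  

✅ The endpoint is now available, and you can send real-time inference requests.  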


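**To shut down (stop billing):**  
1. Open **SageMaker console → Endpoints**  
2. Select the target endpoint → click **Delete**  
3. (Optional) If you no longer need them, also delete the **Endpoint configuration** and **Model**  

🔎 **Key points**
- **Model creation** = the step that registers which container/artifact to use  
- **Endpoint creation** = the step that decides which instance the model runs on as a persistent service  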


### Option 2: Deploying with CloudFormation
1. **AWS Marketplace** → product page → **Configure** → select **CloudFormation**  
2. Click the **Launch Stack** link that is displayed  
3. On the CloudFormation page, set the following:  
   - **Stack name** (any name)  
   - Confirm the parameters (instance type: recommended `ml.g4dn.xlarge` or `ml.g5.xlarge`)  
4. Click **Create stack** → wait until the status becomes **CREATE_COMPLETE**  

**To shut down (stop billing):**  
1. Open the **CloudFormation console**  
2. Select the target stack → click **Delete**  
3. Wait until the status becomes **DELETE_COMPLETE**  

## Data Preparation

Sample Image: https://mlism-marketplace-documents.s3.ap-northeast-1.amazonaws.com/samples/gallery1.jpg

Sample PDF: https://www.soumu.go.jp/main_content/000975178.pdf

In [1]:
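import requests

def download_file(url, filename):
    """Stream the file at url to filename and report the result."""
    # A timeout keeps the cell from hanging if the server does not respond
    response = requests.get(url, stream=True, timeout=30)
    if response.status_code == 200:
        with open(filename, 'wb') as f:
            # Write in 1 KiB chunks so large files are not held in memory
            for chunk in response.iter_content(1024):
                f.write(chunk)
        # Anything that is not a .pdf is reported as an image
        file_type = "PDF" if filename.endswith(".pdf") else "Image"
        print(f"{file_type} saved as {filename}")
    else:
        print(f"Failed to download file. Status code: {response.status_code}")

# Note: the sample image URL ends in .jpg, but the file is saved as image.png
# because the later cells read it from that path.
download_file(
    "https://mlism-marketplace-documents.s3.ap-northeast-1.amazonaws.com/samples/gallery1.jpg",
    "image.png"
)
download_file(
    "https://www.soumu.go.jp/main_content/000975178.pdf",
    "image.pdf"
)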


Image saved as image.png
PDF saved as image.pdf


In [2]:
import json
import boto3
from botocore.exceptions import ClientError
from pprint import pprint

## Config Setup

**Note**: This example assumes you have AWS credentials configured (via AWS CLI, environment variables, or IAM roles). For production use, consider using IAM roles instead of hardcoding credentials.

In [4]:
ENDPOINT_NAME = "yomitoku-example-endpoint"  # Replace with your actual endpoint name
AWS_REGION = "us-east-1"  # Replace with your AWS region (e.g., us-west-2, ap-northeast-1)
# Replace "default" with your own AWS CLI profile name if needed
session = boto3.Session(profile_name="default", region_name=AWS_REGION)
sts = session.client("sts")
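
Before invoking the endpoint, it can be useful to confirm which identity the configured credentials resolve to. The helper below is a small sketch; `describe_caller` is a hypothetical name, and `sts_client` is assumed to be a boto3 STS client such as the `sts` object created above.

```python
def describe_caller(sts_client):
    """Return the account ID and ARN behind the given STS client's credentials.

    sts_client is assumed to be a boto3 STS client, e.g. session.client("sts").
    """
    identity = sts_client.get_caller_identity()
    return {"account": identity["Account"], "arn": identity["Arn"]}
```

For example, `print(describe_caller(sts))` shows the account and role or user in use, which helps when an unexpected profile is picked up.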

In [None]:

# Build the AWS session and initialize the SageMaker clients.
# By default this uses your standard AWS credentials (AWS CLI, environment variables, or IAM role).
# If your account requires MFA, set MFA_SERIAL to your MFA device ARN; otherwise leave both values as None.
MFA_SERIAL     = None  # MFA device ARN, or None if MFA is not required
MFA_TOKEN_ENV  = None  # 6-digit MFA code, or None to be prompted for it

# -------- Helper: MFA session --------
def with_mfa_session(base_sess: boto3.Session, mfa_serial: str, token: str) -> boto3.Session:
    """
    Create a session with MFA authentication.

    Args:
        base_sess: Base boto3 session (default creds / profile)
        mfa_serial: MFA device ARN (e.g. arn:aws:iam::...:mfa/xxxx)
        token: 6-digit MFA code
    """
    sts = base_sess.client("sts")
    resp = sts.get_session_token(
        SerialNumber=mfa_serial,
        TokenCode=token,
        DurationSeconds=43200  # 12 hours (subject to the account's maximum)
    )
    c = resp["Credentials"]
    return boto3.Session(
        region_name=base_sess.region_name,
        aws_access_key_id=c["AccessKeyId"],
        aws_secret_access_key=c["SecretAccessKey"],
        aws_session_token=c["SessionToken"],
    )

# -------- Build session (default or MFA) --------
base = boto3.Session(region_name=AWS_REGION)

if MFA_SERIAL:
    token = MFA_TOKEN_ENV or input("Enter MFA code: ").strip()
    try:
        session = with_mfa_session(base, MFA_SERIAL, token)
        print("✅ Using MFA session credentials")
    except ClientError as e:
        raise SystemExit(f"❌ MFA failed: {e}")
else:
    session = base
    print("ℹ️ Using default AWS credentials/profile")

# -------- Clients --------
sagemaker_runtime = session.client('sagemaker-runtime')
sagemaker         = session.client('sagemaker')

# -------- Optional: verify endpoint status --------
try:
    status = sagemaker.describe_endpoint(EndpointName=ENDPOINT_NAME)["EndpointStatus"]
    print(f"Endpoint status: {status}")
except ClientError as e:
    print(f"Error checking endpoint status: {e}")
    print("Make sure your endpoint name is correct and you have proper permissions.")

✅ Using MFA session credentials
Endpoint status: InService


## Supported File Types

Yomitoku Pro Document Analyzer supports the following file types:
- **image/jpeg** - JPEG images
- **image/png** - PNG images  
- **image/tiff** - TIFF images
- **application/pdf** - PDF files

**Important**: Make sure to set the correct `ContentType` when invoking the endpoint.
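
One way to avoid a mismatched `ContentType` is to derive it from the file's leading bytes rather than from its extension. The sketch below covers only the four types listed above; `detect_content_type` is a hypothetical helper, not part of the product's API.

```python
def detect_content_type(data: bytes) -> str:
    """Guess the ContentType from a file's leading bytes (magic numbers)."""
    if data.startswith(b"\xff\xd8\xff"):
        return "image/jpeg"
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "image/png"
    # TIFF has two byte orders: little-endian "II*\0" and big-endian "MM\0*"
    if data.startswith((b"II*\x00", b"MM\x00*")):
        return "image/tiff"
    if data.startswith(b"%PDF-"):
        return "application/pdf"
    raise ValueError("Unsupported file type: expected JPEG, PNG, TIFF, or PDF")
```

With the file contents read into a variable such as `payload`, the call would then pass `ContentType=detect_content_type(payload)`, so the header reflects the actual format whatever extension the local file carries.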

## Inference Examples

### Image Analysis (PNG)

In [7]:
with open("./image.png", "rb") as f:
    response = sagemaker_runtime.invoke_endpoint(
        EndpointName=ENDPOINT_NAME,
        ContentType="image/png",
        Body=f.read(),
    )

body_bytes = response["Body"].read()
result = json.loads(body_bytes)

# Display basic information
print("=== Image Analysis Results ===")

pprint(f"{result['result']}"[:500])

# Uncomment below to see full result
# pprint(result)

=== Image Analysis Results ===
("[{'preprocess': {'angle': 0.0, 'angle_score': 1.0}, 'paragraphs': [{'box': "
 "[1, 75, 747, 104], 'contents': 'TELEWORK\\nLEWORK', 'direction': "
 "'horizontal', 'order': 0, 'role': 'page_header', 'indent_level': None}, "
 "{'box': [907, 76, 2282, 105], 'contents': 'ORK\\nLEWORK TELEWORK', "
 "'direction': 'horizontal', 'order': 1, 'role': 'page_header', "
 "'indent_level': None}, {'box': [2367, 78, 3298, 106], 'contents': "
 "'テレワークのさらなる普及・定着に向け「テレワーク月間」を実施します!\\nTELEWORK', 'direction': 'horizontal', "
 "'order': 2, 'role': ")


### PDF Analysis

In [9]:
import json
from pprint import pprint

with open("./image.pdf", "rb") as f:
    response = sagemaker_runtime.invoke_endpoint(
        EndpointName=ENDPOINT_NAME,
        ContentType="application/pdf",
        Body=f.read(),
    )

body_bytes = response["Body"].read()
result = json.loads(body_bytes)

# Display basic information
print("=== PDF Analysis Results ===")

pages = len(result["result"])
print(f"Number of pages analyzed: {pages}")

# Preview the first three pages
for page_num, item in enumerate(result["result"][:3], start=1):
    print(f"Page {page_num}:")
    # First 500 characters of the page result
    pprint(f"{item}"[:500])

# Uncomment below to see full result
# pprint(result)


=== PDF Analysis Results ===
Number of pages analyzed: 9
Page 1:
("{'preprocess': {'angle': 0.0, 'angle_score': 0.9999990463256836}, "
 "'paragraphs': [{'box': [234, 74, 1424, 128], 'contents': "
 "'もっと知りたい!現在·未来のくらしと生活の情報誌', 'direction': 'horizontal', 'order': 0, 'role': "
 "None, 'indent_level': None}, {'box': [248, 150, 1377, 452], 'contents': "
 "'総務省', 'direction': 'horizontal', 'order': 1, 'role': 'section_headings', "
 "'indent_level': None}, {'box': [81, 486, 313, 521], 'contents': '2024 "
 "年11月号', 'direction': 'horizontal', 'order': 2, 'role': None, 'indent_level': "
 "None}, {'")
Page 2:
("{'preprocess': {'angle': 0.0, 'angle_score': 1.0}, 'paragraphs': [{'box': "
 "[0, 76, 745, 106], 'contents': 'TELEWORK\\nLEWORK', 'direction': "
 "'horizontal', 'order': 0, 'role': 'page_header', 'indent_level': None}, "
 "{'box': [910, 77, 2289, 106], 'contents': 'RK\\nEWORK TELEWORK', "
 "'direction': 'horizontal', 'order': 1, 'role': 'page_header', "
 "'indent_level': None}, {'box': [

## Usage Tips

1. **Large Files**: For large PDF files, consider processing them in smaller batches if possible.
2. **Error Handling**: Always include proper error handling in production code.
3. **Result Structure**: The API returns structured data with bounding boxes, confidence scores, and hierarchical text blocks.
4. **Performance**: GPU instances (`ml.g4dn.xlarge`, `ml.g5.xlarge`) are the recommended instance types for document analysis.
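
As one possible shape for the error handling mentioned in tip 2, the sketch below retries a failing call with exponential backoff. `invoke_with_retry` is a hypothetical helper. Which exceptions are worth retrying is an assumption you should adjust: the default retries on any `Exception`, which is deliberately broad for illustration.

```python
import time


def invoke_with_retry(invoke, max_attempts=3, base_delay=1.0,
                      retryable=(Exception,), sleep=time.sleep):
    """Call invoke() and retry with exponential backoff on retryable errors.

    invoke is any zero-argument callable; max_attempts, base_delay (seconds),
    and retryable are illustrative defaults, not tuned values.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return invoke()
        except retryable:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error to the caller
            # Delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))


# Example use with the clients defined earlier (not executed here):
# response = invoke_with_retry(
#     lambda: sagemaker_runtime.invoke_endpoint(
#         EndpointName=ENDPOINT_NAME,
#         ContentType="application/pdf",
#         Body=payload,
#     ),
#     retryable=(ClientError,),
# )
```

In the commented example, `payload` stands for the file bytes and `ClientError` is the `botocore.exceptions.ClientError` class imported earlier. Not every `ClientError` is transient, so production code would likely inspect the error code before retrying.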

For detailed API documentation and result structure, please refer to the product documentation in AWS Marketplace.
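
As a starting point for working with the response, the sketch below pulls plain text out of a parsed result. It relies only on the fields visible in the example outputs above (`result`, `paragraphs`, `contents`, `order`, `role`); the full schema may contain more, and `page_texts` is a hypothetical helper name.

```python
def page_texts(result, skip_roles=()):
    """Return one string per page, with paragraphs joined in reading order.

    result is the parsed JSON response; skip_roles optionally drops paragraphs
    by role, e.g. ("page_header",). Only fields seen in the sample output are used.
    """
    texts = []
    for page in result["result"]:
        paragraphs = [
            p for p in page.get("paragraphs", [])
            if p.get("role") not in skip_roles
        ]
        # "order" holds each paragraph's position in the reading sequence
        paragraphs.sort(key=lambda p: p["order"])
        texts.append("\n".join(p["contents"] for p in paragraphs))
    return texts
```

For example, `page_texts(result, skip_roles=("page_header",))` returns the body text of each page without the repeated header strings.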