# SAM Batch Product Extraction

Process multiple ad images through the SAM inference service.

**What this does:**
1. Lists ad images from `AD_INPUT_STAGE/ads/`
2. Processes up to `LIMIT` images, most recently modified first
3. Calls the SAM service function for each image
4. Writes product crops to `AD_OUTPUT_STAGE/{run_id}/`

**Prerequisites:**
- Upload this notebook to Snowflake (Snowsight → Projects → Notebooks)
- Select warehouse: `SAM_DEMO_WH`
- Database/Schema: `SHALION_HF_DEMO.PRODUCT_EXTRACTION`


## Setup: Import Libraries and Get Snowpark Session


In [None]:
import json
from datetime import datetime
from snowflake.snowpark.context import get_active_session

# Get the active Snowpark session
session = get_active_session()


## Configuration: Batch Size and Run ID

Set how many images to process and generate a unique run ID for organizing output files.
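
As a small illustration of the run ID format, the sketch below applies the same `strftime` pattern to a fixed timestamp. The helper name `make_run_id` is hypothetical and not part of the notebook. Because the pattern has one-second resolution, two runs started within the same second would share an output folder.

```python
from datetime import datetime

def make_run_id(now: datetime) -> str:
    """Format a timestamp with the same pattern the notebook uses for run IDs."""
    return now.strftime("run_%Y%m%d_%H%M%S")

# Fixed timestamp for illustration; the notebook itself uses datetime.now().
example_id = make_run_id(datetime(2025, 11, 18, 2, 53, 50))
print(example_id)  # run_20251118_025350
```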


In [None]:
# Control how many images to process
LIMIT = 5

# Unique run ID for output organization
run_id = datetime.now().strftime("run_%Y%m%d_%H%M%S")

print(f"Run ID: {run_id}")
print(f"Processing limit: {LIMIT} images")


## Step 1: List Available Ad Images

Query the input stage's directory table to find JPEG/PNG images in the `ads/` folder, most recently modified first.
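
To make the filter rule concrete, here is a pure-Python sketch of the same prefix and extension check, useful for sanity-checking a list of paths outside Snowflake. The names `SUPPORTED_EXTENSIONS` and `is_ad_image` and the sample paths are assumptions made for illustration. The sketch matches extensions case-insensitively; in Snowflake SQL that behavior corresponds to `ILIKE` rather than `LIKE`.

```python
SUPPORTED_EXTENSIONS = (".jpg", ".jpeg", ".png")

def is_ad_image(relative_path: str) -> bool:
    """Return True for paths under ads/ that end in a supported image extension."""
    return (
        relative_path.startswith("ads/")
        and relative_path.lower().endswith(SUPPORTED_EXTENSIONS)
    )

# Made-up sample paths, not files from the real stage.
paths = ["ads/banner.jpg", "ads/promo.PNG", "ads/notes.txt", "other/banner.jpg"]
selected = [p for p in paths if is_ad_image(p)]
print(selected)  # ['ads/banner.jpg', 'ads/promo.PNG']
```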


In [None]:
images_df = session.sql("""
    SELECT 
        RELATIVE_PATH AS image_path,
        SIZE AS file_size_bytes,
        LAST_MODIFIED
    FROM DIRECTORY(@SHALION_HF_DEMO.PRODUCT_EXTRACTION.AD_INPUT_STAGE)
    WHERE RELATIVE_PATH LIKE 'ads/%'
      AND (RELATIVE_PATH ILIKE '%.jpg' OR RELATIVE_PATH ILIKE '%.jpeg' OR RELATIVE_PATH ILIKE '%.png')
    ORDER BY LAST_MODIFIED DESC
""").limit(LIMIT)

images_df.show()


## Step 2: Process Images Through SAM Inference

Call the `EXTRACT_PRODUCTS()` function, which runs SAM on a GPU to extract product regions from each image. If an individual image fails, the error is recorded and processing continues with the next image.
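
The second argument passed to `EXTRACT_PRODUCTS()` in the cell below is an output prefix of the form `{run_id}/{image name}/`. The sketch below shows one way to derive it; `output_prefix` is a hypothetical helper, not part of the notebook. It uses `PurePosixPath.stem`, which strips only the final extension, so file names containing extra dots keep distinct output folders.

```python
from pathlib import PurePosixPath

def output_prefix(run_id: str, image_path: str) -> str:
    """Build '<run_id>/<file name without its final extension>/' for one image."""
    return f"{run_id}/{PurePosixPath(image_path).stem}/"

# Made-up image paths for illustration.
print(output_prefix("run_20251118_025350", "ads/banner.jpg"))     # run_20251118_025350/banner/
print(output_prefix("run_20251118_025350", "ads/banner.v2.png"))  # run_20251118_025350/banner.v2/
```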


In [None]:
results = []
errors = []

for row in images_df.collect():
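    image_path = row['IMAGE_PATH']
    # Output subfolder: file name without its final extension (keeps inner dots, e.g. "ad.v2")
    image_stem = image_path.rsplit("/", 1)[-1].rsplit(".", 1)[0]
    print(f"\nProcessing: {image_path}")
    
    try:
        # Call SAM inference service: (input image on the stage, output prefix for the crops)
        result_str = session.sql(f"""
            SELECT SHALION_HF_DEMO.PRODUCT_EXTRACTION.EXTRACT_PRODUCTS(
                '@SHALION_HF_DEMO.PRODUCT_EXTRACTION.AD_INPUT_STAGE/{image_path}',
                '{run_id}/{image_stem}/'
            )
        """).collect()[0][0]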
        
        result = json.loads(result_str)
        results.append({
            'image': image_path,
            'num_products': result['num_products'],
            'product_likely': result['product_likely'],
            'crops': result['crops']
        })
        
        print(f"  ✓ Found {result['num_products']} products")
        print(f"  ✓ Product likely: {result['product_likely']}")
        
    except Exception as e:
        error_msg = str(e)
        errors.append({'image': image_path, 'error': error_msg})
        print(f"  ✗ Failed: {error_msg[:100]}")

print(f"\n{'='*60}")
print(f"Batch complete: {len(results)} images processed successfully")
if errors:
    print(f"Failed: {len(errors)} images (see errors list)")
print(f"{'='*60}")


## Step 3: Display Results Summary

Show detailed results including successful extractions with crop URLs and any errors encountered.
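
The same summary can also be computed as a plain dictionary, which is easier to log or assert on than printed text. This is a sketch under assumptions: `summarize` is a hypothetical helper, and the sample entries are made up, using only the keys that the processing cell stores (`image`, `num_products`, `product_likely`, `crops`).

```python
def summarize(results: list, errors: list) -> dict:
    """Aggregate per-image results and errors into overall counts."""
    return {
        "succeeded": len(results),
        "failed": len(errors),
        "total_products": sum(r["num_products"] for r in results),
        "total_crops": sum(len(r["crops"]) for r in results),
    }

# Made-up sample data shaped like the notebook's `results` and `errors` lists.
sample_results = [
    {"image": "ads/a.jpg", "num_products": 2, "product_likely": True,
     "crops": ["crop_0.png", "crop_1.png"]},
    {"image": "ads/b.png", "num_products": 0, "product_likely": False, "crops": []},
]
sample_errors = [{"image": "ads/c.jpg", "error": "illustrative error message"}]

summary = summarize(sample_results, sample_errors)
print(summary)
```

Comparing `total_products` with `total_crops` is a quick consistency check, assuming the service returns one crop per detected product.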


In [None]:
print("\nResults Summary:")
print("=" * 60)
print(f"✓ Successful: {len(results)} images")
print(f"✗ Failed: {len(errors)} images")
print("=" * 60)

if results:
    print("\nSuccessful Extractions:")
    for r in results:
        print(f"{r['image']:40s} → {r['num_products']} products")
        for crop in r['crops']:
            print(f"  - {crop}")
    print(f"\nTotal crops generated: {sum(r['num_products'] for r in results)}")

if errors:
    print("\n" + "=" * 60)
    print("Failed Images:")
    for e in errors:
        print(f"✗ {e['image']}")
        print(f"  Error: {e['error'][:150]}")


## Download Output Images

Replace `{run_id}` with the run ID printed by the configuration cell above.

```bash
snow stage copy @SHALION_HF_DEMO.PRODUCT_EXTRACTION.AD_OUTPUT_STAGE/{run_id}/ ./output/{run_id}/ \
  --recursive \
  --database SHALION_HF_DEMO \
  --schema PRODUCT_EXTRACTION
```

**Example:**
```bash
# If run_id = run_20251118_025350
snow stage copy @SHALION_HF_DEMO.PRODUCT_EXTRACTION.AD_OUTPUT_STAGE/run_20251118_025350/ ./output/run_20251118_025350/ \
  --recursive \
  --database SHALION_HF_DEMO \
  --schema PRODUCT_EXTRACTION
```

This command downloads all PNG product crops to your local `./output/{run_id}/` directory.
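
To avoid editing the command by hand, the run ID can be substituted programmatically. The sketch below only builds the command string and does not execute it; `build_download_command` and the `STAGE` constant are hypothetical names, and the flags are exactly those shown in the command above.

```python
import shlex

STAGE = "@SHALION_HF_DEMO.PRODUCT_EXTRACTION.AD_OUTPUT_STAGE"

def build_download_command(run_id: str) -> str:
    """Return the `snow stage copy` command for one run as a shell-safe string."""
    args = [
        "snow", "stage", "copy",
        f"{STAGE}/{run_id}/",
        f"./output/{run_id}/",
        "--recursive",
        "--database", "SHALION_HF_DEMO",
        "--schema", "PRODUCT_EXTRACTION",
    ]
    return shlex.join(args)

command = build_download_command("run_20251118_025350")
print(command)
```

Printing the command from the notebook after a run gives a line that can be pasted directly into a local terminal.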
