## Operation 4!
### - Pushes the Stage 3 table, prepped for the S3 vector bucket, using the `PutVectors` API

````
Regular S3 Bucket: sentence-data-ingestion
└── ML_EMBED_ASSETS/S3_VECTORS_STAGING/cohere_1024d/
    └── finrag_embeddings_s3vectors_cohere_1024d.parquet (365 MB)

Purpose: Staging/preparation area
Type: Standard S3 storage

S3 Vectors Bucket: <new-bucket-name>-vectors
└── Managed by S3 Vectors service
    └── Indexes (not raw parquet files)
        └── finrag-cohere-1024d-index
            ├── Optimized vector storage
            ├── HNSW index structure
            └── Metadata tables

Purpose: Production vector search
Type: S3 Vectors managed storage
Cost: $0.30/million vectors/month + query costs
````

---
#### Code Author: Joel Markapudi




### Flow the service expects:
- The S3 console screens only create the container for vectors (a vector bucket plus a vector index) and fix the immutable index settings: dimension, distance metric, and any “non-filterable” metadata keys.
- S3 Vectors is a vector store interface layered into S3, not a Parquet-aware service. It does not parse tabular files to create vectors automatically; you must supply already-embedded float32 arrays with keys and metadata through the S3 Vectors APIs.
- S3 Vectors stores items of the form `{ key, data: {float32: [..dim..]}, metadata: {k: v, ..} }`. You insert them with the `PutVectors` API.
- You then search with `QueryVectors(topK, queryVector, filter=…)`, passing a query embedding and, optionally, a metadata filter.
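
The item shape and the query call described above can be sketched as follows. This is a minimal illustration, not the notebook's insertion code: `make_vector_item` and `build_query_request` are hypothetical helpers, the sample key `"12345"` is invented, and the request is only assembled as a dictionary so nothing is sent to AWS. The `{"float32": [...]}` wrapper around the query vector and the `returnMetadata` / `returnDistance` flags are assumptions about the boto3 `query_vectors` parameters and are worth checking against the current API reference.

```python
from array import array


def make_vector_item(key, embedding, metadata, dim=1024):
    """Build one PutVectors item: string key, float32 data, flat metadata."""
    # array("f", ...) rounds every value to float32 precision
    values = array("f", embedding).tolist()
    if len(values) != dim:
        raise ValueError(f"Dimension mismatch: got {len(values)}, expected {dim}")
    return {"key": str(key), "data": {"float32": values}, "metadata": dict(metadata)}


def build_query_request(bucket, index, query_embedding, top_k=10, metadata_filter=None):
    """Assemble keyword arguments for a query_vectors call (hypothetical helper)."""
    request = {
        "vectorBucketName": bucket,
        "indexName": index,
        "queryVector": {"float32": array("f", query_embedding).tolist()},
        "topK": top_k,
        "returnMetadata": True,   # assumption: ask for metadata in the results
        "returnDistance": True,   # assumption: ask for distances in the results
    }
    if metadata_filter:
        request["filter"] = metadata_filter
    return request


item = make_vector_item("12345", [0.0] * 1024, {"cik_int": 1318605, "report_year": 2021})
request = build_query_request(
    "finrag-embeddings-s3vectors",
    "finrag-sentence-fact-embed-1024d",
    [0.0] * 1024,
    metadata_filter={"cik_int": 1318605},
)
# With credentials configured, the call would look like:
#   boto3.client("s3vectors").query_vectors(**request)
```

Building the request as plain data first makes it easy to log or unit-test the filter before paying for a real query.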

### Pricing Model
- $0.30 per million vectors per month
- 203,076 vectors = 0.203 million vectors
- Monthly cost: 0.203 × $0.30 = $0.061/month (~$0.73/year)
- Vector data: float32[1024] arrays (~4 KB each)
- Filterable metadata: cik_int, report_year, section_name, sic, sentence_pos
- Non-filterable metadata: sentenceID, embedding_id, section_sentence_count
- Index structures: HNSW graph, metadata indexes
- Redundancy: S3 standard durability (11 9's)

```
AWS charges:
- PutVectors API calls: $0.00 (FREE)
- Data ingress to S3: $0.00 (FREE - standard AWS policy)
- Index building: $0.00 (FREE - handled automatically)
Insertion:
- 203,076 vectors ÷ 500 per batch = 407 API calls
- Cost: $0.00
Even at scale:
- 71.8 million vectors ÷ 500 = 143,600 API calls, Free.
```
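
The arithmetic above is easy to re-run for other corpus sizes. The sketch below simply encodes the figures quoted in this notebook ($0.30 per million vectors per month, 500 vectors per `PutVectors` call); the rate is this notebook's working assumption rather than a value read from an AWS API, and the function names are illustrative.

```python
STORAGE_RATE_PER_MILLION = 0.30   # USD per million vectors per month (figure quoted above)
MAX_VECTORS_PER_CALL = 500        # PutVectors batch size used in this notebook


def monthly_storage_cost(num_vectors, rate=STORAGE_RATE_PER_MILLION):
    """Storage cost in USD per month under the per-million-vectors rate."""
    return num_vectors / 1_000_000 * rate


def put_calls_needed(num_vectors, batch_size=MAX_VECTORS_PER_CALL):
    """Number of PutVectors requests, rounding the last partial batch up."""
    return -(-num_vectors // batch_size)  # integer ceiling division


for n in (203_076, 71_800_000):
    print(f"{n:>12,} vectors: ${monthly_storage_cost(n):.3f}/month, "
          f"{put_calls_needed(n):,} PutVectors calls")
```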

### Pricing: QueryVectors API (Search)

$1.00 per million vector comparisons

What's a "vector comparison"?
- One distance calculation between query vector and stored vector
``` 
Example query:
topK = 10 (return 10 results)
Actual comparisons: ~100-500 vectors (HNSW efficiency)
Cost: 0.0001 to 0.0005 × $1.00 = $0.0001 to $0.0005 per query
```
- With metadata filters:
  - topK = 10, filter = {cik_int: 1318605}
  - Candidates: ~5,000 Tesla vectors (filtered first)
  - Comparisons: ~100-500 (HNSW on filtered set)
  - Cost: Same ~$0.0001 to $0.0005 per query
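
To turn the per-query estimate into a monthly figure, the same rate can be applied to an expected query volume. This is a rough sketch under this section's assumptions ($1.00 per million comparisons, 100 to 500 comparisons per query); `queries_per_day` and the 30-day month are made-up inputs for illustration.

```python
COMPARISON_RATE_PER_MILLION = 1.00  # USD per million vector comparisons (figure quoted above)


def query_cost(comparisons, rate=COMPARISON_RATE_PER_MILLION):
    """Cost in USD of one query that performs `comparisons` distance calculations."""
    return comparisons / 1_000_000 * rate


def monthly_query_cost(queries_per_day, comparisons_per_query, days=30):
    """Projected monthly cost for a steady query volume."""
    return queries_per_day * days * query_cost(comparisons_per_query)


low = monthly_query_cost(queries_per_day=1_000, comparisons_per_query=100)
high = monthly_query_cost(queries_per_day=1_000, comparisons_per_query=500)
print(f"1,000 queries/day: ${low:.2f} to ${high:.2f} per month")
```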

### Safety shrinking exists but never triggers
- Typical payload per vector:
  - Embedding: 1024 × 4 bytes (float32) = 4,096 bytes = 4 KB
  - Metadata: ~200 bytes (cik_int, report_year, short strings)
  - Total: ~4.3 KB per vector, counted as raw float32 bytes
- 500 vectors × 4.3 KB = 2.15 MB ✓ (well under 20 MiB)
- **arn**: arn:aws:s3vectors:us-east-1:729472661729:bucket/finrag-embeddings-s3vectors/index/finrag-sentence-fact-embed-1024d
- **console**: The console view is index-centric: it shows the vector bucket and the vector indexes created (name, ARN, creation time).
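
One caveat on the payload estimate above: it counts raw float32 bytes, while the notebook's `estimate_payload_size` guard measures the JSON-serialized request, where each float is written out as text. The sketch below compares the two for a randomly generated 1024-d vector. The vector values, the sample metadata, and the key are invented for illustration, and the exact JSON size depends on the actual embedding values.

```python
import json
import random
from array import array

DIM = 1024
BATCH_SIZE = 500
MAX_PAYLOAD_SIZE = 20 * 1024 * 1024  # 20 MiB, the limit used by the notebook's guard

random.seed(0)
# array("f", ...) rounds to float32, mirroring what the notebook sends
embedding = array("f", (random.uniform(-1, 1) for _ in range(DIM))).tolist()

item = {
    "key": "1",
    "data": {"float32": embedding},
    "metadata": {"cik_int": 1318605, "report_year": 2021, "section_name": "ITEM_1A"},
}

binary_bytes = DIM * 4                       # raw float32 size
json_bytes = len(json.dumps(item))           # size as JSON text
batch_json_bytes = BATCH_SIZE * json_bytes   # approximate size of a full batch

print(f"raw float32: {binary_bytes:,} bytes")
print(f"as JSON:     {json_bytes:,} bytes")
print(f"500 vectors: {batch_json_bytes / 1024 / 1024:.1f} MiB "
      f"(limit {MAX_PAYLOAD_SIZE / 1024 / 1024:.0f} MiB)")
```

The JSON form is several times larger than the raw bytes, so the margin is smaller than the 2.15 MB figure suggests, but a 500-vector batch of this shape still fits under the limit, which is consistent with `shrunk_batches` staying at 0 in the runs below.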

In [1]:
import sys
from pathlib import Path

# Add project root to path
project_root = Path.cwd().parent
if str(project_root) not in sys.path:
    sys.path.insert(0, str(project_root))

# Now import works
from loaders.ml_config_loader import MLConfig
print("✓ Config loader ready")

✓ Config loader ready


## Check available methods on the S3 Vectors API

In [2]:
import boto3

s3vectors = boto3.client('s3vectors', region_name='us-east-1')

# List all available methods
methods = [m for m in dir(s3vectors) if not m.startswith('_')]
print("Available S3 Vectors methods:")
for m in sorted(methods):
    print(f"  - {m}")

Available S3 Vectors methods:
  - can_paginate
  - close
  - create_index
  - create_vector_bucket
  - delete_index
  - delete_vector_bucket
  - delete_vector_bucket_policy
  - delete_vectors
  - exceptions
  - generate_presigned_url
  - get_index
  - get_paginator
  - get_vector_bucket
  - get_vector_bucket_policy
  - get_vectors
  - get_waiter
  - list_indexes
  - list_vector_buckets
  - list_vectors
  - meta
  - put_vector_bucket_policy
  - put_vectors
  - query_vectors
  - waiter_names


## S3 Vectors get_index Response Structure

In [3]:
# ============================================================================
# Inspect S3 Vectors Index Configuration
# ============================================================================

from s3vectors_index_inspector import inspect_s3vectors_index, get_index_status

# Option 1: Full verbose inspection (debugging)
response = inspect_s3vectors_index(
    vector_bucket="finrag-embeddings-s3vectors",
    index_name="finrag-sentence-fact-embed-1024d",
    verbose=True
)


[DEBUG] ✓ Found ModelPipeline via file path: D:\JoelDesktop folds_24\NEU FALL2025\MLops IE7374 Project\FinSights\ModelPipeline
[DEBUG] ✓ AWS credentials loaded from aws_credentials.env
S3 VECTORS INDEX INSPECTOR

Vector Bucket: finrag-embeddings-s3vectors
Index Name:    finrag-sentence-fact-embed-1024d
Region:        us-east-1

[Full Response]
{
  "ResponseMetadata": {
    "RequestId": "afc8cd9a-6b24-4282-b45f-3002f7429b58",
    "HostId": "",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "date": "Tue, 09 Dec 2025 16:52:38 GMT",
      "content-type": "application/json",
      "content-length": "481",
      "connection": "keep-alive",
      "x-amz-request-id": "afc8cd9a-6b24-4282-b45f-3002f7429b58",
      "access-control-allow-origin": "*",
      "vary": "origin, access-control-request-method, access-control-request-headers",
      "access-control-expose-headers": "*"
    },
    "RetryAttempts": 0
  },
  "index": {
    "vectorBucketName": "finrag-embeddings-s3vectors",
    "index

In [4]:
# Option 2: Quick status check (production)
status = get_index_status()
print(f"Index Status: {status}")

[DEBUG] ✓ Found ModelPipeline via file path: D:\JoelDesktop folds_24\NEU FALL2025\MLops IE7374 Project\FinSights\ModelPipeline
[DEBUG] ✓ AWS credentials loaded from aws_credentials.env
Index Status: {'exists': True, 'dimension': 1024, 'metric': 'cosine', 'created': datetime.datetime(2025, 11, 10, 14, 52, 15, tzinfo=tzlocal()), 'arn': 'arn:aws:s3vectors:us-east-1:729472661729:bucket/finrag-embeddings-s3vectors/index/finrag-sentence-fact-embed-1024d', 'non_filterable_keys': ['sentenceID', 'embedding_id', 'section_sentence_count']}
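
The source of `get_index_status` is not shown in this notebook. As a rough idea of what such a helper might do, the sketch below flattens a `get_index` response into the same kind of dictionary. `summarize_index` is a hypothetical function and `sample_response` is a hand-written stand-in that copies only the field names used elsewhere in this notebook (`dimension`, `distanceMetric`, `metadataConfiguration`, and so on); no AWS call is made.

```python
def summarize_index(response):
    """Flatten a get_index response into a small status dict (hypothetical helper)."""
    index = response.get("index")
    if not index:
        return {"exists": False}
    meta_cfg = index.get("metadataConfiguration", {})
    return {
        "exists": True,
        "dimension": index["dimension"],
        "metric": index["distanceMetric"],
        "created": index.get("creationTime"),
        "arn": index.get("indexArn"),
        "non_filterable_keys": list(meta_cfg.get("nonFilterableMetadataKeys", [])),
    }


# Hand-written stand-in for a real response
sample_response = {
    "index": {
        "vectorBucketName": "finrag-embeddings-s3vectors",
        "indexName": "finrag-sentence-fact-embed-1024d",
        "dataType": "float32",
        "dimension": 1024,
        "distanceMetric": "cosine",
        "metadataConfiguration": {
            "nonFilterableMetadataKeys": ["sentenceID", "embedding_id", "section_sentence_count"]
        },
    }
}

status = summarize_index(sample_response)
print(status)
```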


---
## S3 VECTORS DATA INSERTION
### Features: Preflight validation, retry logic, backoff, payload guards
---
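
To make the backoff feature concrete, the sketch below lists the sleep times such a retry loop would use. It assumes the defaults that appear in the insertion cell later in this notebook (7 attempts, 0.5 s initial delay that doubles, up to 0.25 s of added jitter, each sleep capped at 4 s). `backoff_delays` is an illustrative helper, not part of the insertion module.

```python
import random


def backoff_delays(max_attempts=7, base=0.5, cap=4.0, max_jitter=0.25, rng=random.random):
    """Return the sleep before each retry; max_attempts tries allow max_attempts - 1 sleeps."""
    delays = []
    delay = base
    for _ in range(max_attempts - 1):
        jitter = rng() * max_jitter           # added on top of the base delay
        delays.append(min(delay + jitter, cap))
        delay *= 2                            # exponential growth of the base delay
    return delays


no_jitter = backoff_delays(rng=lambda: 0.0)
print(no_jitter)               # [0.5, 1.0, 2.0, 4.0, 4.0, 4.0]
print(sum(backoff_delays()))   # total wait before giving up, with random jitter
```

With these settings a batch that keeps failing gives up after roughly 16 seconds of waiting, so a long throttling episode would surface as a failed batch rather than a stalled run.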

In [6]:
# ============================================================================
# NOTEBOOK 04: S3 VECTORS BULK INSERTION
# ============================================================================

# Execution Parameters
INSERT_VECTORS = True
PROVIDER = "cohere_1024d"

# Test parameters: 1 company × 1 year
# CIK_FILTER = [1318605]  # Tesla
# YEAR_FILTER = [2016]     # Early year, smaller dataset

# Incremental insertion filters
CIK_FILTER = None  # None = all companies
YEAR_FILTER = [2021, 2022, 2023, 2024, 2025, 2012, 2013, 2014]  # Or None for all

BATCH_SIZE = 500
MAX_RETRIES = 7

# ============================================================================
# EXECUTION
# ============================================================================

if INSERT_VECTORS:
    from s3vectors_bulk_insertion import S3VectorsBulkInserter
    
    inserter = S3VectorsBulkInserter(
        provider=PROVIDER,
        cik_filter=CIK_FILTER,
        year_filter=YEAR_FILTER,
        batch_size=BATCH_SIZE,
        max_retries=MAX_RETRIES
    )
    
    summary = inserter.run()
    
    # Summary already printed by class, but dict available for logging
    print(f"\n[Programmatic Summary]")
    print(f"  {summary}")

[DEBUG] ✓ Found ModelPipeline via file path: D:\JoelDesktop folds_24\NEU FALL2025\MLops IE7374 Project\FinSights\ModelPipeline
[DEBUG] ✓ AWS credentials loaded from aws_credentials.env
S3 VECTORS BULK INSERTION PIPELINE
Provider: cohere_1024d
Vector Bucket: finrag-embeddings-s3vectors
Index Name: finrag-sentence-fact-embed-1024d
Batch Size: 500 vectors/request

[Preflight Check - Index Validation]
  Index: finrag-sentence-fact-embed-1024d
  ARN: arn:aws:s3vectors:us-east-1:729472661729:bucket/finrag-embeddings-s3vectors/index/finrag-sentence-fact-embed-1024d
  Created: 2025-11-10 14:52:15-05:00
  Dimension: 1024d
  Data Type: float32
  Distance Metric: cosine
  Non-filterable Keys: {'section_sentence_count', 'sentenceID', 'embedding_id'}
  ✓ Non-filterable keys match expected configuration
  ✓ Preflight validation passed

[Loading Stage 3 Data]
  Source: finrag_embeddings_s3vectors_cohere_1024d.parquet
  Filter: ALL companies
  Filter: Years in [2021, 2022, 2023, 2024, 2025, 2012, 2013

Inserting: 100%|██████████| 203972/203972 [49:49<00:00, 68.23vectors/s]


✓ INSERTION COMPLETE
  Vectors inserted: 203,972 / 203,972
  Success rate: 100.0%
  Duration: 2993.8s

[Index Ready for Queries]
  Query example:
  response = s3vectors.query_vectors(
      vectorBucketName='finrag-embeddings-s3vectors',
      indexName='finrag-sentence-fact-embed-1024d',
      queryVector=[...1024d embedding...],
      topK=10,
      filter={'cik_int': 1318605, 'report_year': 2021})

[Programmatic Summary]
  {'provider': 'cohere_1024d', 'cik_filter': None, 'year_filter': [2021, 2022, 2023, 2024, 2025, 2012, 2013, 2014], 'total_in_stage3': 203972, 'filtered_count': 203972, 'total_inserted': 203972, 'failed_batches': 0, 'shrunk_batches': 0, 'success_rate': 100.0, 'duration_seconds': 2993.8055713176727, 'batches_processed': 408}
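
Since the summary comes back as a dictionary, it can be checked programmatically instead of by eye. The sketch below copies a subset of the values from the output above into a literal and runs a few sanity checks; `check_summary` is a hypothetical helper, and the batch size of 500 matches the run parameters in this cell.

```python
summary = {
    "filtered_count": 203972,
    "total_inserted": 203972,
    "failed_batches": 0,
    "shrunk_batches": 0,
    "duration_seconds": 2993.8055713176727,
    "batches_processed": 408,
}


def check_summary(s, batch_size=500):
    """Derive simple pass/fail checks and throughput from an insertion summary."""
    expected_batches = -(-s["filtered_count"] // batch_size)  # ceiling division
    return {
        "all_inserted": s["total_inserted"] == s["filtered_count"],
        "no_failed_batches": s["failed_batches"] == 0,
        "batch_count_ok": s["batches_processed"] == expected_batches,
        "vectors_per_second": s["total_inserted"] / s["duration_seconds"],
    }


report = check_summary(summary)
print(report)
```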





In [None]:
# ============================================================================
# S3 VECTORS DATA INSERTION
# Features: Preflight validation, retry logic, backoff, payload guards
# ============================================================================

# Execution Parameters
INSERT_VECTORS_TO_S3VECTORS = True
BATCH_SIZE = 500                       # Optimal (AWS max per request)
S3VECTORS_PROVIDER = "cohere_1024d"

# ============================================================================
# IMPORTS
# ============================================================================

import sys
from pathlib import Path
sys.path.append(str(Path.cwd().parent))  # project root, so the 'loaders' package resolves

from loaders.ml_config_loader import MLConfig
import polars as pl
import numpy as np
import boto3
from botocore.exceptions import ClientError
from tqdm import tqdm
import time
import random
import json

# ============================================================================
# CONFIGURATION
# ============================================================================

config = MLConfig()

VECTOR_BUCKET = "finrag-embeddings-s3vectors"
INDEX_NAME = "finrag-sentence-fact-embed-1024d"

DIM = config.s3vectors_dimensions(S3VECTORS_PROVIDER)
REGION = config.region

s3vectors = boto3.client("s3vectors", region_name=REGION,
                         aws_access_key_id=config.aws_access_key,
                         aws_secret_access_key=config.aws_secret_key)

# ============================================================================
#  FEATURE 1: PREFLIGHT INDEX VALIDATION
# ============================================================================

def describe_and_validate_index(s3v_client, bucket, index, expected_dim):
    """
    Validate index configuration matches our data before insertion
    
    Checks:
    1. Dimension matches (1024)
    2. Distance metric appropriate for embeddings
    3. Non-filterable metadata keys declared correctly
    
    Prevents: Failed inserts due to config mismatch
    """
    print(f"\n[Preflight Check - Index Validation]")
    
    try:
        response = s3v_client.get_index(
            vectorBucketName=bucket,
            indexName=index
        )
    except ClientError as e:
        raise RuntimeError(f"Failed to get index: {e}")
    
    # Extract configuration (flat structure, lowercase keys)
    index_data = response['index']
    actual_dim = index_data['dimension']
    distance_metric = index_data['distanceMetric']
    data_type = index_data['dataType']
    
    # Get non-filterable metadata keys
    meta_cfg = index_data.get('metadataConfiguration', {})
    nonfilterable_keys = set(meta_cfg.get('nonFilterableMetadataKeys', []))
    
    print(f"  Index: {index}")
    print(f"  ARN: {index_data['indexArn']}")
    print(f"  Created: {index_data['creationTime']}")
    print(f"  Dimension: {actual_dim}")
    print(f"  Data Type: {data_type}")
    print(f"  Distance Metric: {distance_metric}")
    print(f"  Non-filterable Keys: {nonfilterable_keys or 'None'}")
    
    # Validate dimension
    if actual_dim != expected_dim:
        raise ValueError(
            f"Dimension mismatch!\n"
            f"  Index configured: {actual_dim}d\n"
            f"  Embeddings: {expected_dim}d\n"
            f"  → Cannot insert mismatched dimensions"
        )
    
    # Validate data type
    if data_type != 'float32':
        print(f"  ⚠️  Warning: Index expects '{data_type}', we're sending float32")
    
    # Check distance metric (informational)
    if distance_metric not in ['cosine', 'euclidean', 'dotProduct']:
        print(f"  ⚠️  Warning: Unusual distance metric '{distance_metric}'")
    
    # Validate non-filterable metadata alignment
    expected_nonfilterable = {'sentenceID', 'embedding_id', 'section_sentence_count'}
    
    if nonfilterable_keys:
        if nonfilterable_keys == expected_nonfilterable:
            print(f"  ✓ Non-filterable keys match expected configuration")
        else:
            missing = expected_nonfilterable - nonfilterable_keys
            extra = nonfilterable_keys - expected_nonfilterable
            if missing:
                print(f"  Note: Expected non-filterable keys not configured: {missing}")
            if extra:
                print(f"  Note: Additional non-filterable keys: {extra}")
    else:
        print(f"  Note: No non-filterable keys configured")
        print(f"     All metadata will be filterable (may increase query costs)")
    
    print(f"  ✓ Preflight validation passed")
    
    return distance_metric, nonfilterable_keys


# ============================================================================
#  FEATURE 2: PAYLOAD SIZE GUARD
# ============================================================================

MAX_PAYLOAD_SIZE = 20 * 1024 * 1024  # 20 MiB (AWS hard limit)

def estimate_payload_size(vectors_batch):
    """
    Estimate JSON payload size to stay under 20 MiB limit
    
    AWS limit: PutVectors request ≤ 20 MiB
    Our typical: 500 vectors × 4KB each = ~2 MiB (safe)
    
    Risk: Large non-filterable metadata (text fields)
    """
    # Quick estimation via JSON serialization
    sample_payload = {"vectors": vectors_batch}
    estimated_bytes = len(json.dumps(sample_payload, default=str))
    return estimated_bytes

def shrink_batch_if_needed(vectors_batch, max_size=MAX_PAYLOAD_SIZE):
    """
    Dynamically reduce batch size if payload too large
    
    Returns: (potentially_smaller_batch, was_shrunk)
    """
    size = estimate_payload_size(vectors_batch)
    
    if size <= max_size:
        return vectors_batch, False
    
    # Halve the batch until the payload fits (only the leading vectors are kept)
    print(f"  Payload too large ({size/1024/1024:.1f} MB), shrinking batch...")
    
    while size > max_size and len(vectors_batch) > 1:
        vectors_batch = vectors_batch[:len(vectors_batch) // 2]
        size = estimate_payload_size(vectors_batch)
    
    print(f"     → Reduced to {len(vectors_batch)} vectors ({size/1024/1024:.1f} MB)")
    return vectors_batch, True

# ============================================================================
#  FEATURE 3: RESILIENT RETRY WITH EXPONENTIAL BACKOFF
# ============================================================================

def put_vectors_with_retry(s3v_client, bucket, index, vectors_batch, max_attempts=7):
    """
    Insert vectors with intelligent retry logic
    
    Handles:
    - 429 TooManyRequestsException (throttling) → Retry with backoff
    - 503 ServiceUnavailableException (capacity) → Retry with backoff
    - 4xx client errors → Fail fast (bad request, don't retry)
    - 5xx server errors → Retry with backoff
    
    Backoff strategy:
    - Initial: 0.5s + jitter
    - Exponential: base delay doubles each retry; each sleep is capped at 4s
    - Jitter: adds 0 to 0.25s of randomization (prevents thundering herd)
    """
    
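    # Guard against oversized payloads
    full_batch = vectors_batch
    vectors_batch, was_shrunk = shrink_batch_if_needed(full_batch)
    
    # The guard keeps only the leading vectors. Send the trimmed-off tail as
    # its own request(s) first, so that no vectors are silently dropped.
    remainder = full_batch[len(vectors_batch):]
    extra_inserted = 0
    if remainder:
        extra_inserted, _ = put_vectors_with_retry(
            s3v_client, bucket, index, remainder, max_attempts
        )
    
    attempt = 0
    delay = 0.5  # Start with 500ms
    
    while attempt < max_attempts:
        try:
            # Attempt insertion
            response = s3v_client.put_vectors(
                vectorBucketName=bucket,
                indexName=index,
                vectors=vectors_batch
            )
            
            # Success! Count this request plus any remainder sent above
            return len(vectors_batch) + extra_inserted, was_shrunk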
        
        except ClientError as e:
            error_code = e.response.get('Error', {}).get('Code', 'Unknown')
            http_status = e.response.get('ResponseMetadata', {}).get('HTTPStatusCode', 500)
            
            # Classify error type
            is_throttle = error_code == 'TooManyRequestsException'
            is_capacity = error_code == 'ServiceUnavailableException'
            is_client_error = 400 <= http_status < 500
            is_server_error = http_status >= 500
            
            # Client errors (4xx) except throttle → Fail fast
            if is_client_error and not is_throttle:
                print(f"\n  ❌ Client error (non-retryable): {error_code}")
                print(f"     Message: {e.response.get('Error', {}).get('Message', 'No details')}")
                raise  # Don't retry bad requests
            
            # Retryable errors: throttle, capacity, server errors
            if is_throttle or is_capacity or is_server_error:
                attempt += 1
                
                if attempt >= max_attempts:
                    print(f"\n  ❌ Max retries ({max_attempts}) exceeded")
                    raise
                
                # Calculate backoff with jitter
                jitter = random.random() * 0.25  # adds 0 to 0.25s of randomization
                sleep_time = min(delay + jitter, 4.0)  # Cap at 4s
                
                retry_reason = "throttled" if is_throttle else ("capacity" if is_capacity else "server error")
                print(f"  ⏳ Retry {attempt}/{max_attempts} ({retry_reason}), waiting {sleep_time:.2f}s...")
                
                time.sleep(sleep_time)
                delay *= 2  # Exponential backoff
                continue
            
            # Unknown error type
            print(f"\n  Unknown error: {error_code}")
            raise
        
        except Exception as e:
            # Unexpected error
            print(f"\n  ❌ Unexpected error: {type(e).__name__}: {e}")
            raise
    
    # Should never reach here
    raise RuntimeError("Retry logic error")

# ============================================================================
#  FEATURE 4: IDEMPOTENCY LOGGING
# ============================================================================

def log_batch_boundaries(batch_num, vectors_batch):
    """
    Log first/last keys for idempotency tracking
    
    Use case: If insertion fails mid-run, you can resume from last successful batch
    without re-inserting earlier vectors (S3 Vectors upserts on duplicate keys)
    """
    if not vectors_batch:
        return
    
    first_key = vectors_batch[0]['key']
    last_key = vectors_batch[-1]['key']
    
    print(f"  Batch {batch_num}: [{first_key}...{last_key}] ({len(vectors_batch)} vectors)")

# ============================================================================
# CORE HELPER FUNCTIONS 
# ============================================================================

def validate_embedding(embedding, expected_dim):
    """Convert embedding to float32 and validate dimensions"""
    arr = np.asarray(embedding, dtype=np.float32)
    if arr.shape[0] != expected_dim:
        raise ValueError(f"Dimension mismatch: got {arr.shape[0]}, expected {expected_dim}")
    return arr.tolist()

def batch_to_s3vectors_format(batch_df):
    """Convert Polars DataFrame batch to S3 Vectors PutVectors format"""
    vectors = []
    
    for row in batch_df.iter_rows(named=True):
        embedding_float32 = validate_embedding(row['embedding'], DIM)
        
        vector_item = {
            "key": str(row['sentenceID_numsurrogate']),
            "data": {"float32": embedding_float32},
            "metadata": {
                # Filterable (5 fields)
                "cik_int": int(row['cik_int']),
                "report_year": int(row['report_year']),
                "section_name": str(row['section_name']),
                "sic": str(row['sic']),
                "sentence_pos": int(row['sentence_pos']),
                
                # Non-filterable (3 fields)
                "sentenceID": str(row['sentenceID']),
                "embedding_id": str(row['embedding_id']),
                "section_sentence_count": int(row['section_sentence_count'])
            }
        }
        
        vectors.append(vector_item)
    
    return vectors

# ============================================================================
# MAIN INSERTION LOGIC
# ============================================================================

if INSERT_VECTORS_TO_S3VECTORS:
    
    print("="*70)
    print("S3 VECTORS DATA INSERTION -  VERSION")
    print("="*70)
    print(f"Vector Bucket: {VECTOR_BUCKET}")
    print(f"Index Name: {INDEX_NAME}")
    print(f"Provider: {S3VECTORS_PROVIDER}")
    print(f"Batch Size: {BATCH_SIZE} (AWS max)")
    
    # ========================================================================
    # STEP 1: PREFLIGHT VALIDATION
    # ========================================================================
    
    distance_metric, nonfilterable_keys = describe_and_validate_index(
        s3vectors, VECTOR_BUCKET, INDEX_NAME, DIM
    )
    
    # ========================================================================
    # STEP 2: LOAD STAGE 3 DATA
    # ========================================================================
    
    cache_path = config.get_s3vectors_cache_path(S3VECTORS_PROVIDER)
    
    if not cache_path.exists():
        raise FileNotFoundError(
            f"Stage 3 data not found: {cache_path}\n"
            f"Run BUILD_S3VECTORS_TABLE=True first"
        )
    
    print(f"\n[Loading Stage 3 Data]")
    print(f"  Source: {cache_path.name}")
    df_stage3 = pl.read_parquet(cache_path)
    total_rows = len(df_stage3)
    print(f"  Loaded: {total_rows:,} vectors")
    
    # Validate schema
    required_cols = ['sentenceID_numsurrogate', 'embedding', 'cik_int', 'report_year',
                     'section_name', 'sic', 'sentence_pos', 'sentenceID',
                     'embedding_id', 'section_sentence_count']
    missing = [c for c in required_cols if c not in df_stage3.columns]
    if missing:
        raise ValueError(f"Missing columns: {missing}")
    
    # ========================================================================
    # STEP 3: INSERT WITH RESILIENT RETRY
    # ========================================================================
    
    print(f"\n[Inserting Vectors with Retry Logic]")
    num_batches = (total_rows + BATCH_SIZE - 1) // BATCH_SIZE
    print(f"  Total batches: {num_batches}")
    print(f"  Retry strategy: Exponential backoff (max 7 attempts)")
    
    total_inserted = 0
    failed_batches = 0
    shrunk_batches = 0
    batch_num = 0
    
    with tqdm(total=total_rows, desc="Inserting", unit="vectors") as pbar:
        for i in range(0, total_rows, BATCH_SIZE):
            batch_num += 1
            
            # Get batch
            batch_df = df_stage3[i:i+BATCH_SIZE]
            vectors_batch = batch_to_s3vectors_format(batch_df)
            
            # Log batch boundaries (idempotency tracking)
            # log_batch_boundaries(batch_num, vectors_batch)  # Uncomment for debugging
            
            try:
                # Insert with retry
                inserted, was_shrunk = put_vectors_with_retry(
                    s3vectors, VECTOR_BUCKET, INDEX_NAME, vectors_batch
                )
                
                total_inserted += inserted
                if was_shrunk:
                    shrunk_batches += 1
                
                pbar.update(inserted)
                
            except Exception as e:
                print(f"\n  ❌ Batch {batch_num} failed permanently: {e}")
                failed_batches += 1
                continue
    
    # ========================================================================
    # STEP 4: SUMMARY
    # ========================================================================
    
    print(f"\n{'='*70}")
    print(f"✓ INSERTION COMPLETE")
    print(f"{'='*70}")
    print(f"  Vectors inserted: {total_inserted:,}")
    print(f"  Success rate: {total_inserted/max(total_rows, 1)*100:.1f}%")  # max() avoids division by zero on empty input
    
    if failed_batches > 0:
        print(f"  Failed batches: {failed_batches}/{num_batches}")
    
    if shrunk_batches > 0:
        print(f"  Batches auto-shrunk (payload > 20 MiB): {shrunk_batches}")
    
    print(f"\n[Index Ready for Queries]")
    print(f"  Query example:")
    print(f"  response = s3vectors.query_vectors(")
    print(f"      vectorBucketName='{VECTOR_BUCKET}',")
    print(f"      indexName='{INDEX_NAME}',")
    print(f"      queryVector=[...1024d embedding...],")
    print(f"      topK=10,")
    print(f"      filter={{'cik_int': 1318605, 'report_year': 2020}})")
    print("="*70)

else:
    print("INSERT_VECTORS_TO_S3VECTORS = False. Skipping.")

[DEBUG] ✓ AWS credentials loaded from aws_credentials.env
S3 VECTORS DATA INSERTION -  VERSION
Vector Bucket: finrag-embeddings-s3vectors
Index Name: finrag-sentence-fact-embed-1024d
Provider: cohere_1024d
Batch Size: 500 (AWS max)

[Preflight Check - Index Validation]
  Index: finrag-sentence-fact-embed-1024d
  ARN: arn:aws:s3vectors:us-east-1:729472661729:bucket/finrag-embeddings-s3vectors/index/finrag-sentence-fact-embed-1024d
  Created: 2025-11-10 14:52:15-05:00
  Dimension: 1024
  Data Type: float32
  Distance Metric: cosine
  Non-filterable Keys: {'embedding_id', 'section_sentence_count', 'sentenceID'}
  ✓ Non-filterable keys match expected configuration
  ✓ Preflight validation passed

[Loading Stage 3 Data]
  Source: finrag_embeddings_s3vectors_cohere_1024d.parquet
  Loaded: 203,076 vectors

[Inserting Vectors with Retry Logic]
  Total batches: 407
  Retry strategy: Exponential backoff (max 7 attempts)


Inserting: 100%|██████████| 203076/203076 [17:38<00:00, 191.93vectors/s]


✓ INSERTION COMPLETE
  Vectors inserted: 203,076
  Success rate: 100.0%

[Index Ready for Queries]
  Query example:
  response = s3vectors.query_vectors(
      vectorBucketName='finrag-embeddings-s3vectors',
      indexName='finrag-sentence-fact-embed-1024d',
      queryVector=[...1024d embedding...],
      topK=10,
      filter={'cik_int': 1318605, 'report_year': 2020})
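
One design note on the payload guard: `shrink_batch_if_needed` returns only the leading part of an oversized batch. An alternative, sketched below under stated assumptions, is to split the batch into consecutive chunks that each fit the limit, so every vector ends up in some request. `split_by_payload_size` is a hypothetical helper, the size accounting is an approximation based on `json.dumps`, and the tiny 8-value vectors and 250-byte limit exist only to make the example easy to follow.

```python
import json

ENVELOPE_BYTES = len('{"vectors": []}')  # fixed overhead of the wrapper object


def split_by_payload_size(vectors, max_bytes):
    """Yield consecutive chunks whose estimated JSON size stays within max_bytes."""
    chunk, chunk_bytes = [], ENVELOPE_BYTES
    for item in vectors:
        item_bytes = len(json.dumps(item)) + 2  # +2 for the ", " separator
        # Start a new chunk if this item would push the current one over the limit
        if chunk and chunk_bytes + item_bytes > max_bytes:
            yield chunk
            chunk, chunk_bytes = [], ENVELOPE_BYTES
        chunk.append(item)
        chunk_bytes += item_bytes
    if chunk:
        yield chunk  # a single oversized item is still yielded on its own


items = [{"key": str(i), "data": {"float32": [0.5] * 8}} for i in range(10)]
chunks = list(split_by_payload_size(items, max_bytes=250))
print([len(c) for c in chunks])
```

Each chunk could then be passed to `put_vectors_with_retry` in turn; since `PutVectors` overwrites on duplicate keys, re-sending a chunk after a failure does not create duplicates.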



