### 1. Install Roboflow CLI
Install the `roboflow` package, which provides the Roboflow CLI. The CLI handles PNG semantic segmentation masks correctly.

**Note:** The Python SDK does NOT reliably support PNG mask uploads. Use the CLI only.

In [6]:
# Install stable version that handles PNG masks correctly
# Note: Some newer versions (1.1.48+) have UTF-8 decoding bugs with PNG files
!pip install roboflow==1.1.47 requests
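
Because the pin matters here, it can help to confirm which version is actually installed before going further. The sketch below is one way to do that with the standard library only; `installed_matches` is an illustrative helper name, not part of the Roboflow package.

```python
from importlib.metadata import PackageNotFoundError, version

PINNED_VERSION = "1.1.47"  # the version pinned in the install cell above


def installed_matches(package: str, expected: str) -> bool:
    """Return True only if `package` is installed at exactly `expected`."""
    try:
        return version(package) == expected
    except PackageNotFoundError:
        # Package is not installed in this environment at all
        return False


if installed_matches("roboflow", PINNED_VERSION):
    print(f"roboflow {PINNED_VERSION} is installed")
else:
    print(f"roboflow {PINNED_VERSION} not found; re-run the install cell")
```

If the check fails right after installing, restarting the runtime and running the check again is a reasonable first step.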



### 2. Authenticate Roboflow CLI
To interact with Roboflow, you'll need your Roboflow API key. I recommend storing it securely in Colab's Secrets manager. Click the 🔑 icon in the left panel, add a new secret named `ROBOFLOW_API_KEY_PRIVATE` (the name the cell below reads), and paste your API key as its value. Then run the following cell to export it as the `ROBOFLOW_API_KEY` environment variable.

In [7]:
# Import Colab's userdata module to access secrets
from google.colab import userdata
import os

# Get the API key from Colab Secrets
ROBOFLOW_API_KEY = userdata.get('ROBOFLOW_API_KEY_PRIVATE')

# Set the API key as an environment variable for the Roboflow CLI
os.environ["ROBOFLOW_API_KEY"] = ROBOFLOW_API_KEY

print("Roboflow API key loaded and set as environment variable.")

Roboflow API key loaded and set as environment variable.
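
The cell above only works inside Colab. As a minimal sketch, assuming you sometimes run the same notebook elsewhere, the lookup can fall back to an existing environment variable. `load_api_key` is a hypothetical helper; its default names mirror the secret and variable used in the cell above.

```python
import os


def load_api_key(secret_name: str = "ROBOFLOW_API_KEY_PRIVATE",
                 env_name: str = "ROBOFLOW_API_KEY") -> str:
    """Read the key from Colab Secrets if possible, else from the environment."""
    key = None
    try:
        # google.colab is only importable inside a Colab runtime
        from google.colab import userdata
        key = userdata.get(secret_name)
    except Exception:
        # Not in Colab, or the secret is missing or not shared with this notebook
        key = None

    key = key or os.environ.get(env_name)
    if not key:
        raise RuntimeError(
            f"No API key found in Colab secret '{secret_name}' or ${env_name}"
        )
    return key
```

Calling `os.environ["ROBOFLOW_API_KEY"] = load_api_key()` would then replace the two lookup lines in the cell.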


### 3. Mount Google Drive
To access a Drive folder such as `G:\My Drive\soil_microCT_images\ROI\png_for_roboflow` (its path on a Windows machine), mount your Google Drive in Colab. Your Drive files then become accessible under `/content/drive/My Drive/`.

In [8]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
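
To make the path mapping concrete, here is a small sketch that converts a Windows-style Drive path into its location under the Colab mount point. `to_colab_path` is an illustrative helper, and it assumes the Windows drive letter maps directly onto the mount point passed to `drive.mount`.

```python
from pathlib import PurePosixPath, PureWindowsPath


def to_colab_path(windows_path: str, mount_point: str = "/content/drive") -> str:
    """Map a path like G:\\My Drive\\... to <mount_point>/My Drive/..."""
    path = PureWindowsPath(windows_path)
    # Drop the drive anchor (e.g. 'G:\\') if there is one
    relative = path.relative_to(path.anchor) if path.anchor else path
    return str(PurePosixPath(mount_point, *relative.parts))


print(to_colab_path(r"G:\My Drive\soil_microCT_images\ROI\png_for_roboflow"))
# /content/drive/My Drive/soil_microCT_images/ROI/png_for_roboflow
```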


### 4. Validate Dataset and Upload via Roboflow CLI

**⚠️ Important: Use Roboflow CLI for Semantic Segmentation PNG Masks**

The Python SDK and REST API do **NOT** reliably support uploading PNG semantic segmentation masks. They cause UTF-8 decoding errors and may corrupt mask values.

**✅ Correct Method: Roboflow CLI**

This script will:
1. Validate your dataset (images + masks)
2. Verify mask format (PNG uint8 with ignore label 255)
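3. Upload each image with its mask (the cell below tries the SDK's `single_upload` first and falls back to a direct REST request if a UTF-8 decoding error occurs)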
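
The mask format check in step 2 can also be done without decoding any pixels, by reading the PNG header. The sketch below is a stricter variant than the NumPy check used later in this notebook: it accepts only 8-bit grayscale files (PNG color type 0). The function names are illustrative.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_header(data: bytes) -> dict:
    """Parse width, height, bit depth and color type from a PNG IHDR chunk."""
    # Layout: 8-byte signature, 4-byte chunk length, 4-byte type "IHDR",
    # then width (4 bytes), height (4), bit depth (1), color type (1), ...
    if len(data) < 26 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    width, height, bit_depth, color_type = struct.unpack(">IIBB", data[16:26])
    return {"width": width, "height": height,
            "bit_depth": bit_depth, "color_type": color_type}


def is_uint8_grayscale(data: bytes) -> bool:
    """True for 8-bit, single-channel grayscale PNG data (color type 0)."""
    header = read_png_header(data)
    return header["bit_depth"] == 8 and header["color_type"] == 0


def is_uint8_grayscale_file(path: str) -> bool:
    """Same check, reading only the first 26 bytes of the file."""
    with open(path, "rb") as f:
        return is_uint8_grayscale(f.read(26))
```

This only inspects the header, so finding out whether the ignore label 255 actually occurs still requires loading the pixel values.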

**Folder Structure Required:**
```
rehovot_ROI_8bit/
  ├── images/
  │   ├── image001.png
  │   └── image002.png
  └── masks/
      ├── image001.png  (uint8 grayscale mask)
      └── image002.png  (uint8 grayscale mask)
```
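
Under this layout, each image is matched to a mask with the same file stem and a `.png` extension. A minimal sketch of that pairing, useful as a dry run before uploading anything, could look like this; `pair_images_with_masks` is a hypothetical helper.

```python
from pathlib import Path

IMAGE_SUFFIXES = (".png", ".jpg", ".jpeg", ".tif", ".tiff")


def pair_images_with_masks(images_dir, masks_dir, suffixes=IMAGE_SUFFIXES):
    """Return (pairs, missing): matched (image, mask) paths and unmatched images."""
    images_dir, masks_dir = Path(images_dir), Path(masks_dir)
    pairs, missing = [], []
    for image_path in sorted(images_dir.iterdir()):
        if image_path.suffix.lower() not in suffixes:
            continue  # ignore non-image files
        # Masks share the image's stem and are always stored as PNG
        mask_path = masks_dir / f"{image_path.stem}.png"
        if mask_path.is_file():
            pairs.append((image_path, mask_path))
        else:
            missing.append(image_path)
    return pairs, missing
```

Printing `len(pairs)` and the names in `missing` gives a quick view of how many uploads to expect and which images would be skipped.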

In [9]:
import os
import glob
import base64
import requests
from pathlib import Path
from roboflow import Roboflow
import numpy as np
from PIL import Image

# ===== CONFIGURATION =====
WORKSPACE_ID = "rony-mr8nv"
PROJECT_ID = "rehovot_seg"

# Base folder containing 'images/' and 'masks/' subfolders
BASE_FOLDER = "/content/drive/My Drive/soil_microCT_images/ROI/weak_supervision_masks/rehovot_ROI_8bit"
IMAGES_FOLDER = os.path.join(BASE_FOLDER, "images")
MASKS_FOLDER = os.path.join(BASE_FOLDER, "masks")

# Optional parameters
BATCH_NAME = "rehovot_semantic_seg"  # Group uploads under this batch name
SPLIT = "train"  # Options: 'train', 'valid', 'test'

# ===== FALLBACK UPLOAD FUNCTION =====
# Used when SDK has UTF-8 decoding bugs with PNG files
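def upload_via_rest_api(image_path, mask_path, api_key, project_id, image_name, split="train", batch_name=None):
    """
    Direct REST API upload: sends the image, then the PNG mask, as base64 text.

    Intended to keep uint8 mask values, including the ignore label (255), intact.
    In the run logged below, the annotate endpoint rejected the PNG mask with
    "Unrecognized annotation format", so treat this path as unverified.
    """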
    # Read image as binary and base64 encode
    with open(image_path, 'rb') as f:
        image_data = base64.b64encode(f.read()).decode('ascii')

    # Read mask as binary and base64 encode
    with open(mask_path, 'rb') as f:
        mask_data = base64.b64encode(f.read()).decode('ascii')

    # Build URL with parameters
    url = f"https://api.roboflow.com/dataset/{project_id}/upload"
    params = {
        "api_key": api_key,
        "name": image_name,
        "split": split
    }
    if batch_name:
        params["batch"] = batch_name

    # Upload image first
    response = requests.post(
        url,
        params=params,
        data=image_data,
        headers={"Content-Type": "application/x-www-form-urlencoded"}
    )

    if response.status_code != 200:
        raise Exception(f"Image upload failed: {response.text}")

    result = response.json()
    image_id = result.get('id')

    if not image_id:
        raise Exception(f"No image ID returned: {result}")

    # Upload mask annotation using the image ID
    mask_url = f"https://api.roboflow.com/dataset/{project_id}/annotate/{image_id}"
    mask_response = requests.post(
        mask_url,
        params={"api_key": api_key, "name": f"{Path(image_name).stem}_mask.png"},
        data=mask_data,
        headers={"Content-Type": "application/x-www-form-urlencoded"}
    )

    if mask_response.status_code != 200:
        raise Exception(f"Mask upload failed: {mask_response.text}")

    return result

# ===== Upload Process =====
# Check folders exist
if not os.path.isdir(IMAGES_FOLDER):
    print(f"❌ Error: Images folder not found: '{IMAGES_FOLDER}'")
elif not os.path.isdir(MASKS_FOLDER):
    print(f"❌ Error: Masks folder not found: '{MASKS_FOLDER}'")
else:
    # Initialize Roboflow
    rf = Roboflow(api_key=ROBOFLOW_API_KEY)
    project = rf.workspace(WORKSPACE_ID).project(PROJECT_ID)

    # Get all image files
    image_extensions = ['*.png', '*.jpg', '*.jpeg', '*.tif', '*.tiff']
    image_files = []
    for ext in image_extensions:
        image_files.extend(glob.glob(os.path.join(IMAGES_FOLDER, ext)))

    print(f"📤 Found {len(image_files)} images to upload")
    print(f"📍 To project: {WORKSPACE_ID}/{PROJECT_ID}")
    print(f"📦 Batch: {BATCH_NAME}")
    print("🎯 Project type: Semantic Segmentation (uint8 masks with ignore label 255)\n")

    # Upload each image with its corresponding mask
    uploaded_count = 0
    skipped_count = 0

    for image_path in image_files:
        image_name = Path(image_path).stem
        image_filename = Path(image_path).name

        # Try to find corresponding mask with .png extension
        mask_path = os.path.join(MASKS_FOLDER, f"{image_name}.png")

        if not os.path.exists(mask_path):
            print(f"⚠️  Skipping {image_filename}: No mask found")
            skipped_count += 1
            continue

        # Verify mask is uint8 format
        try:
            mask_img = Image.open(mask_path)
            mask_array = np.array(mask_img)

            if mask_array.dtype != np.uint8:
                print(f"⚠️  Warning: {image_filename} - Mask is not uint8 (found {mask_array.dtype})")

            unique_values = np.unique(mask_array)
            has_ignore = 255 in unique_values
            print(f"🔍 {image_filename}: Mask values {unique_values.min()}-{unique_values.max()}" +
                  (", includes ignore label (255)" if has_ignore else ""))

        except Exception as e:
            print(f"⚠️  Warning: Could not validate mask for {image_filename}: {e}")

        try:
            # Try SDK upload first (works on roboflow==1.1.47)
            result = project.single_upload(
                image_path=image_path,
                annotation_path=mask_path,
                split=SPLIT,
                batch_name=BATCH_NAME,
                num_retry_uploads=3
            )
            print(f"✅ Uploaded: {image_filename}")
            uploaded_count += 1

        except Exception as e:
            # Known Roboflow SDK bug: tries to decode binary PNG as UTF-8.
            # It surfaces either as UnicodeDecodeError or as another exception
            # whose message mentions the codec failure.
            error_str = str(e).lower()
            is_decode_error = isinstance(e, UnicodeDecodeError) or any(
                token in error_str for token in ('utf-8', 'codec', 'decode')
            )
            if not is_decode_error:
                print(f"❌ Error uploading {image_filename}: {e}")
                skipped_count += 1
                continue

            print(f"⚠️  SDK UTF-8 bug detected for {image_filename}")
            print("   Falling back to REST API upload (binary-safe)...")

            try:
                # Use REST API fallback - preserves binary PNG data
                upload_via_rest_api(
                    image_path=image_path,
                    mask_path=mask_path,
                    api_key=ROBOFLOW_API_KEY,
                    project_id=PROJECT_ID,
                    image_name=image_filename,
                    split=SPLIT,
                    batch_name=BATCH_NAME
                )
                print(f"✅ Uploaded via REST API: {image_filename}")
                uploaded_count += 1
            except Exception as fallback_error:
                print(f"❌ REST API fallback failed: {fallback_error}")
                skipped_count += 1

    print("\n🎉 Upload complete!")
    print(f"   Uploaded: {uploaded_count}")
    print(f"   Skipped: {skipped_count}")
    print("\n💡 Note: Verify in Roboflow UI that masks display correctly and class 255 is preserved.")

loading Roboflow workspace...
loading Roboflow project...
📤 Found 661 images to upload
📍 To project: rony-mr8nv/rehovot_seg
📦 Batch: rehovot_semantic_seg
🎯 Project type: Semantic Segmentation (uint8 masks with ignore label 255)

🔍 roi_0000_slice0120_norm8.png: Mask values 0-255, includes ignore label (255)
⚠️  SDK UTF-8 bug detected for roi_0000_slice0120_norm8.png
   Falling back to REST API upload (binary-safe)...
❌ REST API fallback failed: Mask upload failed: {
    "error": {
        "message": "Unrecognized annotation format.",
        "type": "InvalidAnnotationFormat",
        "hint": "We were unable to parse your annotation format or it is not supported via the upload API at this time. Ensure that the annotations file provided in the request contains annotations for the provided image.",
        "annotation": {
            "info": {
                "type": "unknown",
                "format": "png"
            }
        }
    }
}
🔍 roi_0001_slice0121_norm

KeyboardInterrupt: 
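
The upload loop treats a `UnicodeDecodeError`, or any exception whose message mentions `utf-8`, `codec`, or `decode`, as the signal to fall back to the REST path. As a rough sketch, that test can be isolated into a small helper so it can be checked without touching the network. `is_decode_error` is an illustrative name, and the keyword list simply mirrors the one used in the loop.

```python
DECODE_KEYWORDS = ("utf-8", "codec", "decode")


def is_decode_error(exc: BaseException) -> bool:
    """Heuristic: does this exception look like a text-decoding failure?"""
    if isinstance(exc, UnicodeDecodeError):
        return True
    message = str(exc).lower()
    return any(keyword in message for keyword in DECODE_KEYWORDS)


# Example: a genuine decode failure on the first byte of a PNG file
try:
    b"\x89PNG".decode("utf-8")
except UnicodeDecodeError as err:
    print(is_decode_error(err))  # True
```

Matching on message text is a heuristic and can misclassify unrelated errors that happen to contain one of these words, so it is worth logging the original exception as well.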