# Upstream SDK CKAN Integration Demo

This notebook demonstrates the CKAN integration capabilities of the Upstream SDK for publishing environmental monitoring data to CKAN data portals.

## Overview

The Upstream SDK provides seamless integration with CKAN (Comprehensive Knowledge Archive Network) data portals for:
- 📊 **Dataset Publishing**: Automatically create CKAN datasets from campaign data
- 📁 **Resource Management**: Upload sensor configurations and measurement data as resources
- 🏢 **Organization Support**: Publish data under specific CKAN organizations
- 🔄 **Update Management**: Update existing datasets with new data
- 🏷️ **Metadata Integration**: Rich metadata tagging and categorization

## Features Demonstrated

- CKAN client setup and configuration
- Campaign data export and preparation
- Dataset creation with comprehensive metadata
- Resource management (sensors and measurements)
- Organization and permission handling
- Error handling and validation

## Prerequisites

- Valid Upstream account credentials
- Access to a CKAN portal with API credentials
- Existing campaign data (or run UpstreamSDK_Core_Demo.ipynb first)
- Python 3.9+ environment with required packages

## Related Notebooks

- **UpstreamSDK_Core_Demo.ipynb**: Core SDK functionality and campaign creation

## Installation and Setup

In [1]:
# Install required packages
#!pip install upstream-sdk
!pip install -e .
# Import required libraries
import os
import json
import getpass
from pathlib import Path
from datetime import datetime
from typing import Dict, Any, Optional, List
from io import BytesIO

# Import Upstream SDK modules
from upstream.client import UpstreamClient
from upstream.ckan import CKANIntegration

Obtaining file:///Users/mosorio/repos/tacc/upstream/sdk
  Installing build dependencies ... done
  Checking if build backend supports build_editable ... done
  Getting requirements to build editable ... done
  Preparing editable metadata (pyproject.toml) ... done
Building wheels for collected packages: upstream-sdk
  Building editable for upstream-sdk (pyproject.toml) ... done
  Created wheel for upstream-sdk: filename=upstream_sdk-1.0.1-0.editable-py3-none-any.whl size=9407 sha256=79cab2fc9f667c297064e8565955bc70e89a61c4ddd08c1d767f0d20e5895c30
  Stored in directory: /private/var/folders/qn/xpsy3ssx5hbbb_ndr2sbt5w80000gn/T/pip-ephem-wheel-cache-p0_uqrqj/wheels/47/dc/ae/1a3abd774032839edac85dcd8bb9739031dd6ccef29fca9667
Successfully built upstream-sdk
Installing collected packages: upstream-sdk
  Attempting uninstall: upstream-sdk
    Found existing installation: upstream-sdk 1.0.1
    Uninstalling upstream-sdk-1.0.1:
      Successf

## 1. Configuration and Authentication

The Upstream SDK supports multiple configuration methods for flexible deployment:

### 📁 Option 1: Configuration File (Recommended)
Create a `config.yaml` file in the notebook directory with the following structure:

```yaml
upstream:
  username: "your_tacc_username"
  password: "your_password" 
  base_url: "https://upstream-dso.tacc.utexas.edu/dev"

ckan:
  url: "https://ckan.tacc.utexas.edu"
  organization: "setx-uifl"
  api_key: "your_ckan_api_key"  # Optional for read-only operations

upload:
  chunk_size: 10000
  max_file_size_mb: 50
  timeout_seconds: 30
  retry_attempts: 3
```
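
Before passing `config.yaml` to the client, it can help to confirm that the expected keys are present. The sketch below checks an already-parsed configuration dictionary (for example, the result of `yaml.safe_load`, assuming PyYAML is installed). The `REQUIRED_KEYS` table and the `find_missing_keys` helper are hypothetical: they mirror the layout shown above and are not part of the Upstream SDK.

```python
from typing import Any, Dict, List

# Hypothetical list of required settings, mirroring the config.yaml layout above.
# `ckan.api_key` is left out because it is optional for read-only operations.
REQUIRED_KEYS = {
    "upstream": ["username", "password", "base_url"],
    "ckan": ["url", "organization"],
}


def find_missing_keys(config: Dict[str, Any]) -> List[str]:
    """Return dotted names of required settings that are absent or empty."""
    missing = []
    for section, keys in REQUIRED_KEYS.items():
        values = config.get(section) or {}
        for key in keys:
            if not values.get(key):
                missing.append(f"{section}.{key}")
    return missing


# Example: a parsed config that forgot `upstream.base_url`.
example_config = {
    "upstream": {"username": "your_tacc_username", "password": "your_password"},
    "ckan": {"url": "https://ckan.tacc.utexas.edu", "organization": "setx-uifl"},
}
print(find_missing_keys(example_config))
```

An empty list means every required setting was found; anything else names the settings to add before initializing the client.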

### 🌍 Option 2: Environment Variables
Set these environment variables:
- `UPSTREAM_USERNAME`: Your TACC username
- `UPSTREAM_PASSWORD`: Your password
- `UPSTREAM_BASE_URL`: API base URL
- `CKAN_URL`: CKAN portal URL  
- `CKAN_ORGANIZATION`: Target organization
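
As a rough illustration of Option 2, the sketch below reads the five variables listed above and reports which ones are unset. It uses only the standard library; `read_env_settings` and `unset_variables` are hypothetical helper names, and how the SDK itself reads these variables is not shown here.

```python
import os
from typing import Dict, List, Mapping, Optional

# Variable names taken from the list above.
ENV_VARS = [
    "UPSTREAM_USERNAME",
    "UPSTREAM_PASSWORD",
    "UPSTREAM_BASE_URL",
    "CKAN_URL",
    "CKAN_ORGANIZATION",
]


def read_env_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, Optional[str]]:
    """Collect the known variables; missing or empty values become None."""
    env = os.environ if env is None else env
    return {name: (env.get(name) or None) for name in ENV_VARS}


def unset_variables(settings: Mapping[str, Optional[str]]) -> List[str]:
    """Return the names of variables that still need a value."""
    return [name for name, value in settings.items() if value is None]


# Example with an explicit mapping instead of the real environment.
settings = read_env_settings(
    {"UPSTREAM_USERNAME": "demo", "CKAN_URL": "https://ckan.tacc.utexas.edu"}
)
print(unset_variables(settings))
```

Passing a mapping keeps the example reproducible; calling `read_env_settings()` with no argument reads `os.environ`.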

### ⌨️ Option 3: Interactive Input (Fallback)
If no config file or environment variables are found, the notebook will prompt for credentials.

**Configuration Options:**
- **Upstream API**: Username/password authentication
- **CKAN Portal**: API key or access token authentication  
- **Organization**: CKAN organization for dataset publishing

In [2]:
# Configuration
UPSTREAM_BASE_URL = "https://upstream-dso.tacc.utexas.edu/dev"
# CKAN Configuration - Update these for your CKAN portal
CKAN_URL = "https://ckan.tacc.utexas.edu"   # Replace with your CKAN portal URL
CKAN_ORGANIZATION = "setx-uifl"      # Replace with your organization name

# For local development, uncomment the lines below:
#UPSTREAM_BASE_URL = 'http://localhost:8000'
#CKAN_URL = 'http://ckan.tacc.cloud:5000'
#CKAN_ORGANIZATION = 'org'

print("🔧 Configuration Settings:")
print(f"   Upstream API: {UPSTREAM_BASE_URL}")
print(f"   CKAN Portal: {CKAN_URL}")
print(f"   CKAN Organization: {CKAN_ORGANIZATION}")

🔧 Configuration Settings:
   Upstream API: https://upstream-dso.tacc.utexas.edu/dev
   CKAN Portal: https://ckan.tacc.utexas.edu
   CKAN Organization: setx-uifl


In [3]:
# Configuration and credential handling using UpstreamClient's built-in methods
import getpass
from pathlib import Path

def get_client_credentials():
    """
    Get credentials and initialize UpstreamClient using config file or manual input.

    The UpstreamClient supports multiple initialization methods:
    1. config_file parameter - loads from YAML/JSON config file
    2. from_config() class method - dedicated config file loader
    3. from_environment() class method - loads from environment variables
    4. Direct parameter passing - username, password, etc.
    """
    config_path = Path("config.yaml")

    if config_path.exists():
        print("📄 Loading configuration from config.yaml...")
        try:
            # Use UpstreamClient's built-in config file support
            print(f"✅ Configuration file found: {config_path}")
            print("Will initialize client using config_file parameter")
            return None, None, config_path  # Return config_path for client initialization

        except Exception as e:
            print(f"❌ Error accessing config.yaml: {e}")
            print("Falling back to manual credential input...")
    else:
        print("📄 config.yaml not found, using manual credential input...")

    # Fallback to manual credential input
    print("🔐 Please enter your TACC credentials:")
    upstream_username = input("TACC Username: ")
    upstream_password = getpass.getpass("Upstream Password: ")

    return upstream_username, upstream_password, None

# Get credentials and config path
upstream_username, upstream_password, config_file_path = get_client_credentials()

📄 Loading configuration from config.yaml...
✅ Configuration file found: config.yaml
Will initialize client using config_file parameter


In [4]:
# Initialize Upstream client with CKAN integration using optimal method
try:
    # Choose initialization method based on available configuration
    if config_file_path:
        # Method 1: Use config file with UpstreamClient's built-in support
        print("🔧 Initializing client from config file...")
        client = UpstreamClient(config_file=config_file_path)
        print(f"✅ Client initialized from config file: {config_file_path}")
        UPSTREAM_BASE_URL = client.get_config().get("base_url", UPSTREAM_BASE_URL)
        CKAN_URL = client.get_config().get("ckan_url", CKAN_URL)
        CKAN_ORGANIZATION = client.get_config().get("ckan_organization", CKAN_ORGANIZATION)
    else:
        # Method 2: Use manual credentials with predefined settings
        print("🔧 Initializing client with manual credentials...")

        # Get CKAN API credentials (optional for manual mode)
        print("\n🔑 CKAN API credentials (optional for demo):")
        ckan_api_key = getpass.getpass("CKAN API Key (press Enter to skip): ")

        # Prepare additional configuration for manual mode
        additional_config = {"timeout": 30}
        if ckan_api_key:
            additional_config["api_key"] = ckan_api_key
            print("✅ CKAN API key configured")
        else:
            print("ℹ️  Running in read-only CKAN mode")

        client = UpstreamClient(
            username=upstream_username,
            password=upstream_password,
            base_url=UPSTREAM_BASE_URL,
            ckan_url=CKAN_URL,
            ckan_organization=CKAN_ORGANIZATION,
            **additional_config
        )
        print("✅ Client initialized with manual credentials")

    # Test Upstream authentication
    if client.authenticate():
        print("✅ Upstream authentication successful!")
        print(f"🔗 Connected to: {client.get_config()['base_url']}")

        # Check CKAN integration
        if client.ckan:
            print("✅ CKAN integration enabled!")
            print(f"🔗 CKAN Portal: {client.get_config()['ckan_url']}")
        else:
            print("⚠️  CKAN integration not configured")

        # Display client configuration summary
        config_summary = client.get_config()
        print(f"\n📋 Client Configuration Summary:")
        print(f"   • Base URL: {config_summary.get('base_url', 'Not set')}")
        print(f"   • CKAN URL: {config_summary.get('ckan_url', 'Not set')}")
        print(f"   • CKAN Organization: {config_summary.get('ckan_organization', 'Not set')}")
        print(f"   • Username: {config_summary.get('username', 'Not set')}")

    else:
        print("❌ Upstream authentication failed!")
        raise Exception("Upstream authentication failed")

except Exception as e:
    print(f"❌ Setup error: {e}")
    print("\nTroubleshooting tips:")
    if config_file_path:
        print("   • Check config.yaml format and credentials")
        print("   • Ensure all required fields are present in config file")
    else:
        print("   • Verify username and password are correct")
    print("   • Check network connectivity to Upstream API")
    print("   • Verify CKAN portal accessibility")
    raise

🔧 Initializing client from config file...
✅ Client initialized from config file: config.yaml
✅ Upstream authentication successful!
🔗 Connected to: https://upstream-dso.tacc.utexas.edu/dev
✅ CKAN integration enabled!
🔗 CKAN Portal: https://ckan.tacc.utexas.edu

📋 Client Configuration Summary:
   • Base URL: https://upstream-dso.tacc.utexas.edu/dev
   • CKAN URL: https://ckan.tacc.utexas.edu
   • CKAN Organization: dso-internal
   • Username: mosorio


## 2. Campaign Selection and Data Preparation

Let's select an existing campaign with data to publish to CKAN. If you don't have existing data, run the core demo notebook first.

In [5]:
# List available campaigns
print("📋 Available campaigns for CKAN publishing:")
try:
    campaigns = client.list_campaigns(limit=10)

    if campaigns.total == 0:
        print("❌ No campaigns found. Please run UpstreamSDK_Core_Demo.ipynb first to create sample data.")
        raise Exception("No campaigns available")

    print(f"Found {campaigns.total} campaigns:")
    for i, campaign in enumerate(campaigns.items[:5]):
        print(f"  {i+1}. ID: {campaign.id} - {campaign.name}")
        print(f"     Description: {(campaign.description or 'No description')[:80]}...")
        print(f"     Contact: {campaign.contact_name} ({campaign.contact_email})")
        print()

    # Select campaign (use the first one or let user choose)
    selected_campaign = campaigns.items[0]
    campaign_id = selected_campaign.id

    print(f"📊 Selected campaign for CKAN publishing:")
    print(f"   ID: {campaign_id}")
    print(f"   Name: {selected_campaign.name}")

except Exception as e:
    print(f"❌ Error listing campaigns: {e}")
    raise

📋 Available campaigns for CKAN publishing:
Found 7 campaigns:
  1. ID: 3 - UploadFile Test
     Description: A test campaign for UploadFile functionality....
     Contact: Vlad (vlad@example.com)

  2. ID: 7 - Beaumont Sniffer Run
     Description: Beaumont Run of the SNIFFER air quality sensor for VOCUS data...
     Contact: Pawell Misztal (Pawell@utexas.edu)

  3. ID: 8 - Beaumont Sniffer Run Test 2
     Description: Beaumont Run of the SNIFFER air quality sensor for VOCUS data...
     Contact: Pawell Misztal (Pawell@utexas.edu)

  4. ID: 9 - Beaumont Stream Gauge
     Description: Beaumont Stream Gauge Campaign...
     Contact: Nick Brake ()

  5. ID: 10 - Beaumont Stream Gauge
     Description: Beaumont Stream Gauge Campaign...
     Contact: Nick Brake ()

📊 Selected campaign for CKAN publishing:
   ID: 3
   Name: UploadFile Test
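
The cell above always takes the first campaign. A possible alternative is to pick a campaign by name, as in the sketch below. `find_campaign_by_name` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins that only assume the `id` and `name` attributes printed above.

```python
from types import SimpleNamespace
from typing import Any, Iterable, Optional


def find_campaign_by_name(campaigns: Iterable[Any], name_fragment: str) -> Optional[Any]:
    """Return the first campaign whose name contains name_fragment (case-insensitive)."""
    needle = name_fragment.lower()
    for campaign in campaigns:
        if needle in (campaign.name or "").lower():
            return campaign
    return None


# Stand-in objects carrying the same `id` and `name` attributes as the listing above.
sample_campaigns = [
    SimpleNamespace(id=3, name="UploadFile Test"),
    SimpleNamespace(id=9, name="Beaumont Stream Gauge"),
]
match = find_campaign_by_name(sample_campaigns, "stream gauge")
print(match.id if match else "no match")
```

In the notebook, the same helper could be called with `campaigns.items` in place of `sample_campaigns`.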


In [6]:
# Get stations for the selected campaign
print(f"📍 Finding stations in campaign {campaign_id}...")
try:
    stations = client.list_stations(campaign_id=campaign_id)

    if stations.total == 0:
        print("❌ No stations found in this campaign. Please create stations and upload data first.")
        raise Exception("No stations available")

    print(f"Found {stations.total} stations:")
    for station in stations.items:
        print(f"  • ID: {station.id} - {station.name}")
        print(f"    Description: {(station.description or 'No description')[:80]}...")
        print()

    # Select the first station
    selected_station = stations.items[0]
    station_id = selected_station.id

    print(f"📡 Selected station for CKAN publishing:")
    print(f"   ID: {station_id}")
    print(f"   Name: {selected_station.name}")

except Exception as e:
    print(f"❌ Error listing stations: {e}")
    raise

📍 Finding stations in campaign 3...
Found 1 stations:
  • ID: 8 - UploadFile Test
    Description: A test station for UploadFile functionality....

📡 Selected station for CKAN publishing:
   ID: 8
   Name: UploadFile Test


In [7]:
# Check for existing data in the station
print(f"🔍 Checking data availability for station {station_id}...")
try:
    # List sensors to verify data exists
    sensors = client.sensors.list(campaign_id=campaign_id, station_id=station_id)

    if not sensors.items:
        print("❌ No sensors found in this station. Please upload sensor data first.")
        raise Exception("No sensor data available")
    print(sensors.items)
    total_measurements = 0
    for sensor in sensors.items:
        if sensor.statistics:
            total_measurements += sensor.statistics.count
    print(total_measurements)

    print(f"✅ Data validation successful:")
    print(f"   • Sensors: {len(sensors.items)}")
    print(f"   • Total measurements: {total_measurements}")
    print(f"   • Sensor types: {', '.join([s.variablename for s in sensors.items[:3]])}{'...' if len(sensors.items) > 3 else ''}")

    if total_measurements == 0:
        print("⚠️  Warning: No measurement data found. CKAN publishing will include sensor configuration only.")
    else:
        print("✅ Ready for CKAN publishing with full dataset!")

except Exception as e:
    print(f"❌ Error checking data availability: {e}")
    raise

🔍 Checking data availability for station 8...
[SensorItem(id=1589, alias='12.3623', description=None, postprocess=True, postprocessscript='', units='Counts Per Second', variablename='No BestGuess Formula', statistics=SensorStatistics(max_value=0.5498839736, min_value=-0.0003183083283, avg_value=0.00080247399729703, stddev_value=0.0196335000474048, percentile_90=-0.0003183083283, percentile_95=-0.0003183083283, percentile_99=-0.0003183083283, count=1800, first_measurement_value=-0.0003183083283, first_measurement_collectiontime=datetime.datetime(2023, 2, 23, 17, 36, 1, 8006, tzinfo=TzInfo(UTC)), last_measurement_time=datetime.datetime(2023, 2, 23, 18, 6, 0, 288007, tzinfo=TzInfo(UTC)), last_measurement_value=-0.0003183083283, stats_last_updated=datetime.datetime(2025, 5, 24, 20, 13, 1, 275069, tzinfo=TzInfo(UTC)))), SensorItem(id=1590, alias='12.3783', description=None, postprocess=True, postprocessscript='', units='Counts Per Second', variablename='No BestGuess Formula', statistics=Sen

## 3. CKAN Portal Exploration

Before publishing, let's explore the CKAN portal to understand its structure and existing datasets.

In [8]:
# Initialize CKAN client for exploration using client configuration
if client.ckan:
    ckan = client.ckan
    ckan_url = client.get_config().get('ckan_url', 'Not configured')
    print(f"🌐 Using integrated CKAN client: {ckan_url}")
else:
    # Get configuration from client for standalone CKAN client
    client_config = client.get_config()
    ckan_url = client_config.get('ckan_url')

    if ckan_url:
        # Create standalone CKAN client using client configuration
        ckan = CKANIntegration(ckan_url=ckan_url, config=client_config)
        print(f"🌐 Created standalone CKAN client: {ckan_url}")
    else:
        # Fallback to predefined CKAN URL if available
        try:
            ckan_url = CKAN_URL  # Use predefined URL if available
            ckan = CKANIntegration(ckan_url=ckan_url, config={})
            print(f"🌐 Using fallback CKAN client: {ckan_url}")
        except NameError:
            print("❌ No CKAN configuration available")
            print("Please ensure CKAN URL is configured in config.yaml or predefined variables")
            ckan = None
            ckan_url = None

🌐 Using integrated CKAN client: https://ckan.tacc.utexas.edu


In [9]:
# List existing organizations
print("🏢 Available CKAN organizations:")
try:
    organizations = ckan.list_organizations()

    if organizations:
        print(f"Found {len(organizations)} organizations:")
        for org in organizations[:5]:  # Show first 5
            print(f"  • {org['name']}: {org['title']}")
            print(f"    Description: {(org.get('description') or 'No description')[:60]}...")
            print(f"    Packages: {org.get('package_count', 0)}")
            print()

        # Check if our target organization exists
        org_names = [org['name'] for org in organizations]
        if CKAN_ORGANIZATION in org_names:
            print(f"✅ Target organization '{CKAN_ORGANIZATION}' found!")
        else:
            print(f"⚠️  Target organization '{CKAN_ORGANIZATION}' not found.")
            print("   Publishing will use test dataset mode.")
    else:
        print("No organizations found or access restricted.")

except Exception as e:
    print(f"⚠️  Could not list organizations: {e}")
    print("Continuing with dataset publishing...")

🏢 Available CKAN organizations:
Found 5 organizations:
  • dso-internal: DSO Internal
    Description: No description...
    Packages: 8

  • dynamo: DYNAMO
    Description: Analysis and Model Integration MINT and Cookbooks...
    Packages: 11

  • planet-texas-2050: Planet Texas 2050
    Description: Planet Texas 2050's interdisciplinary research teams work on...
    Packages: 21

  • setx-uifl: Southeast Texas Urban Integrated Field Lab
    Description: Southeast Texas Urban Integrated Field Lab (SETx-UIFL) is a ...
    Packages: 12

  • twdb-subside: TWDB: Subside
    Description: Subsidence data from the Texas Water Development Board...
    Packages: 6

✅ Target organization 'dso-internal' found!
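
For reference, CKAN portals expose organization listings through the Action API at `/api/3/action/organization_list`. The sketch below builds such a URL and unwraps the standard `success`/`result` envelope from a hand-written sample response. Whether `ckan.list_organizations()` uses this exact call is an assumption, and `action_url` and `unwrap_result` are hypothetical helper names.

```python
import json
from typing import Any, Dict
from urllib.parse import urlencode


def action_url(ckan_url: str, action: str, **params: Any) -> str:
    """Build a CKAN Action API URL such as .../api/3/action/organization_list."""
    base = f"{ckan_url.rstrip('/')}/api/3/action/{action}"
    return f"{base}?{urlencode(params)}" if params else base


def unwrap_result(payload: Dict[str, Any]) -> Any:
    """Return the `result` field of an Action API response, or raise on failure."""
    if not payload.get("success"):
        raise RuntimeError(f"CKAN action failed: {payload.get('error')}")
    return payload["result"]


url = action_url("https://ckan.tacc.utexas.edu", "organization_list", all_fields="true")
print(url)

# Hand-written sample response (not fetched from the portal).
sample_response = json.loads(
    '{"success": true, "result": [{"name": "setx-uifl", "package_count": 12}]}'
)
print([org["name"] for org in unwrap_result(sample_response)])
```

The URL can be fetched with any HTTP client; the example avoids network access so that it stays reproducible.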


In [10]:
# Search for existing Upstream datasets
print("🔍 Searching for existing Upstream datasets in CKAN:")
try:
    upstream_datasets = ckan.list_datasets(
        tags=["upstream", "environmental"],
        limit=10
    )

    if upstream_datasets:
        print(f"Found {len(upstream_datasets)} Upstream-related datasets:")
        for dataset in upstream_datasets[:3]:  # Show first 3
            print(f"  • {dataset['name']}: {dataset['title']}")
            print(f"    Notes: {(dataset.get('notes') or 'No description')[:80]}...")
            print(f"    Resources: {len(dataset.get('resources', []))}")
            print(f"    Tags: {', '.join([tag['name'] for tag in dataset.get('tags', [])])}")
            print()
    else:
        print("No existing Upstream datasets found.")
        print("This will be the first Upstream dataset in this portal!")

except Exception as e:
    print(f"⚠️  Could not search datasets: {e}")
    print("Proceeding with dataset creation...")

🔍 Searching for existing Upstream datasets in CKAN:
Found 1 Upstream-related datasets:
  • cow-bayou-near-mauriceville: Cow Bayou near Mauriceville
    Notes: Beaumont Run stream gauge at Cow Bayou

Project: SETx-UIFL Beaumont

Contact: Ni...
    Resources: 5
    Tags: campaign-SETx-UIFL Beaumont, environmental-data, sensor-station, stream-gauge, upstream, water-level
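
Note that the dataset returned above carries the tag `environmental-data` rather than `environmental`, so the search does not appear to require every requested tag verbatim. If strict matching matters, results can be filtered on the client side. The sketch below assumes only the dataset shape printed above (a `tags` list of dictionaries with a `name` key); `has_all_tags` is a hypothetical helper.

```python
from typing import Any, Dict, Iterable, Set


def dataset_tag_names(dataset: Dict[str, Any]) -> Set[str]:
    """Collect the tag names of a CKAN dataset dictionary."""
    return {tag["name"] for tag in dataset.get("tags", [])}


def has_all_tags(dataset: Dict[str, Any], required: Iterable[str]) -> bool:
    """True only if every required tag is present on the dataset."""
    return set(required) <= dataset_tag_names(dataset)


# Tags copied from the search output above.
sample_dataset = {
    "name": "cow-bayou-near-mauriceville",
    "tags": [
        {"name": "campaign-SETx-UIFL Beaumont"},
        {"name": "environmental-data"},
        {"name": "sensor-station"},
        {"name": "stream-gauge"},
        {"name": "upstream"},
        {"name": "water-level"},
    ],
}
print(has_all_tags(sample_dataset, ["upstream", "stream-gauge"]))
print(has_all_tags(sample_dataset, ["upstream", "environmental"]))
```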



## 4. Data Export and Preparation

Before publishing to CKAN, let's export the campaign data and examine its structure.

In [11]:
# Get detailed campaign information
print(f"📊 Retrieving detailed campaign information...")
try:
    campaign_details = client.get_campaign(campaign_id)

    print(f"✅ Campaign Details Retrieved:")
    print(f"   Name: {campaign_details.name}")
    print(f"   Description: {campaign_details.description}")
    print(f"   Contact: {campaign_details.contact_name} ({campaign_details.contact_email})")
    print(f"   Allocation: {campaign_details.allocation}")
    print(f"   Start Date: {campaign_details.start_date}")
    print(f"   End Date: {campaign_details.end_date}")

    # Check campaign summary if available
    if hasattr(campaign_details, 'summary') and campaign_details.summary:
        summary = campaign_details.summary
        print(f"\n📈 Campaign Summary:")
        if hasattr(summary, 'total_stations'):
            print(f"   • Total Stations: {summary.total_stations}")
        if hasattr(summary, 'total_sensors'):
            print(f"   • Total Sensors: {summary.total_sensors}")
        if hasattr(summary, 'total_measurements'):
            print(f"   • Total Measurements: {summary.total_measurements}")
        if hasattr(summary, 'sensor_types'):
            print(f"   • Sensor Types: {', '.join(summary.sensor_types)}")

except Exception as e:
    print(f"❌ Error retrieving campaign details: {e}")
    raise

📊 Retrieving detailed campaign information...
✅ Campaign Details Retrieved:
   Name: UploadFile Test
   Description: A test campaign for UploadFile functionality.
   Contact: Vlad (vlad@example.com)
   Allocation: PT2050-DataX
   Start Date: 2025-04-04 20:49:58.516000
   End Date: 2026-04-04 20:49:58.516000

📈 Campaign Summary:
   • Sensor Types: 409.032, 13.495, 301.108, 143.143, 405.303, 16.793, 155.179, 14.6626, 55.0383, 404.434, 291.027, 186.041, 54.0544, 479.023, 218.177, 231.044, 532.013, 215.092, 16.082, 167.106, 350.309, 311.33, 267.194, 323.291, 89.0598, 16.2362, 216.028, 350.39, 385.032, 258.78, 188.026, 294.326, 310.045, 183.074, 16.7826, 308.025, 233.214, 240.24, 347.259, 385.327, 223.096, 12.7656, 280.308, 178.152, 15.5539, 131.106, 336.227, 396.384, 316.171, 16.5858, 129.913, 193.195, 375.399, 176.007, 71.0853, 14.819, 175.17, 342.328, 185.119, 223.168, 390.42, 14.9483, 15.198, 232.154, 89.034, 74.0555, 403.011, 315.123, 336.037, 258.207, 270.211, 14.6214, 113.025, 349.38

In [12]:
# Export station data for CKAN publishing
print(f"📤 Exporting station data for CKAN publishing...")
try:
    # Export sensor configuration
    print("   Exporting sensor configuration...")
    station_sensors_data = client.stations.export_station_sensors(
        station_id=station_id,
        campaign_id=campaign_id
    )

    # Export measurement data
    print("   Exporting measurement data...")
    station_measurements_data = client.stations.export_station_measurements(
        station_id=station_id,
        campaign_id=campaign_id
    )

    # Check exported data sizes
    sensors_size = len(station_sensors_data.getvalue()) if hasattr(station_sensors_data, 'getvalue') else 0
    measurements_size = len(station_measurements_data.getvalue()) if hasattr(station_measurements_data, 'getvalue') else 0

    print(f"✅ Data export completed:")
    print(f"   • Sensors data: {sensors_size:,} bytes")
    print(f"   • Measurements data: {measurements_size:,} bytes")
    print(f"   • Total data size: {(sensors_size + measurements_size):,} bytes")

    if sensors_size == 0:
        print("⚠️  Warning: Sensors data is empty")
    if measurements_size == 0:
        print("⚠️  Warning: Measurements data is empty")

    print("✅ Ready for CKAN publication!")

except Exception as e:
    print(f"❌ Error exporting station data: {e}")
    raise

📤 Exporting station data for CKAN publishing...
   Exporting sensor configuration...
Exporting sensors for station 8 in campaign 3
   Exporting measurement data...
✅ Data export completed:
   • Sensors data: 108,943 bytes
   • Measurements data: 60,986,174 bytes
   • Total data size: 61,095,117 bytes
✅ Ready for CKAN publication!
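
To examine the structure of an export before publishing, the header and row count can be read directly from the in-memory buffer. This is a minimal sketch that assumes the export is a UTF-8 CSV held in an `io.BytesIO`-like object (the cell above only checks for `getvalue`); `summarize_csv` and the sample column names are hypothetical.

```python
import csv
import io
from typing import List, Tuple


def summarize_csv(buffer: io.BytesIO) -> Tuple[List[str], int]:
    """Return (column names, number of data rows) for a CSV held in memory."""
    buffer.seek(0)
    text = io.TextIOWrapper(buffer, encoding="utf-8", newline="")
    reader = csv.reader(text)
    header = next(reader, [])           # first row, or [] for an empty file
    row_count = sum(1 for _ in reader)  # stream the remaining rows
    text.detach()                       # release the wrapper without closing the buffer
    buffer.seek(0)                      # rewind so the buffer can be reused
    return header, row_count


# Small stand-in for an exported file; the column names are made up.
sample_export = io.BytesIO(
    b"alias,units\n12.3623,Counts Per Second\n12.3783,Counts Per Second\n"
)
print(summarize_csv(sample_export))
```

In the notebook, `summarize_csv(station_sensors_data)` would be the analogous call if the export object behaves like `BytesIO`.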


## 5. CKAN Dataset Creation and Publishing

Now let's publish the campaign data to CKAN using the integrated publishing functionality.

In [13]:
# Prepare dataset metadata
dataset_name = f"upstream-campaign-{campaign_id}"
print(f"🏷️  Preparing dataset metadata for: {dataset_name}")

# Create comprehensive metadata
dataset_metadata = {
    "name": dataset_name,
    "title": campaign_details.name,
    "notes": f"""{campaign_details.description}

This dataset contains environmental sensor data collected through the Upstream platform.

**Campaign Information:**
- Campaign ID: {campaign_id}
- Contact: {campaign_details.contact_name} ({campaign_details.contact_email})
- Allocation: {campaign_details.allocation}
- Duration: {campaign_details.start_date} to {campaign_details.end_date}

**Data Structure:**
- Sensors Configuration: Contains sensor metadata, units, and processing information
- Measurement Data: Time-series environmental measurements with geographic coordinates

**Access and Usage:**
Data is provided in CSV format for easy analysis and integration with various tools.""",
    "tags": ["environmental", "sensors", "upstream", "monitoring", "time-series"],
    "extras": [
        {"key": "campaign_id", "value": str(campaign_id)},
        {"key": "station_id", "value": str(station_id)},
        {"key": "source", "value": "Upstream Platform"},
        {"key": "data_type", "value": "environmental_sensor_data"},
        {"key": "contact_email", "value": campaign_details.contact_email},
        {"key": "allocation", "value": campaign_details.allocation},
        {"key": "export_date", "value": datetime.now().isoformat()}
    ],
    "license_id": "cc-by",  # Creative Commons Attribution
}

print(f"📋 Dataset Metadata Prepared:")
print(f"   • Name: {dataset_metadata['name']}")
print(f"   • Title: {dataset_metadata['title']}")
print(f"   • Tags: {', '.join(dataset_metadata['tags'])}")
print(f"   • License: {dataset_metadata['license_id']}")
print(f"   • Extra fields: {len(dataset_metadata['extras'])}")
print(f"   • Notes: {dataset_metadata['notes']}")

🏷️  Preparing dataset metadata for: upstream-campaign-3
📋 Dataset Metadata Prepared:
   • Name: upstream-campaign-3
   • Title: UploadFile Test
   • Tags: environmental, sensors, upstream, monitoring, time-series
   • License: cc-by
   • Extra fields: 7
   • Notes: A test campaign for UploadFile functionality.

This dataset contains environmental sensor data collected through the Upstream platform.

**Campaign Information:**
- Campaign ID: 3
- Contact: Vlad (vlad@example.com)
- Allocation: PT2050-DataX
- Duration: 2025-04-04 20:49:58.516000 to 2026-04-04 20:49:58.516000

**Data Structure:**
- Sensors Configuration: Contains sensor metadata, units, and processing information
- Measurement Data: Time-series environmental measurements with geographic coordinates

**Access and Usage:**
Data is provided in CSV format for easy analysis and integration with various tools.
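
CKAN dataset names must be 2 to 100 characters long and may contain only lowercase alphanumeric characters, `-`, and `_`. The name used above already satisfies this rule, but names derived from free-text titles may not. The sketch below shows one way to normalize them; `to_ckan_name` is a hypothetical helper, not an SDK function.

```python
import re


def to_ckan_name(text: str, max_length: int = 100) -> str:
    """Convert free text into a CKAN-compatible dataset name (slug)."""
    # Replace every run of disallowed characters with a single hyphen.
    slug = re.sub(r"[^a-z0-9_-]+", "-", text.lower()).strip("-")
    # Collapse repeated hyphens that were already present in the input.
    slug = re.sub(r"-{2,}", "-", slug)
    slug = slug[:max_length].rstrip("-")
    if len(slug) < 2:
        raise ValueError(f"cannot derive a valid CKAN name from {text!r}")
    return slug


print(to_ckan_name("Upstream Campaign 3: UploadFile Test"))
```

Normalizing the name does not guarantee uniqueness on the portal; a name collision still has to be handled at publication time.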


# Publish campaign data to CKAN using integrated method
print(f"📤 Publishing campaign data to CKAN...")
station_name = client.stations.get(station_id=station_id, campaign_id=campaign_id).name

try:
    # Use the integrated CKAN publishing method
    publication_result = client.publish_to_ckan(
        campaign_id=campaign_id,
        station_id=station_id,
    )

    print(f"✅ CKAN Publication Successful!")
    print(f"\n📊 Publication Summary:")
    print(f"   • Success: {publication_result['success']}")
    print(f"   • Dataset Name: {publication_result['dataset']['name']}")
    print(f"   • Dataset ID: {publication_result['dataset']['id']}")
    print(f"   • Resources Created: {len(publication_result['resources'])}")
    print(f"   • CKAN URL: {publication_result['ckan_url']}")
    print(f"   • Message: {publication_result['message']}")

    # Store results for further operations
    published_dataset = publication_result['dataset']
    published_resources = publication_result['resources']
    ckan_dataset_url = publication_result['ckan_url']

    print(f"\n🎉 Your data is now publicly available at:")
    print(f"   {ckan_dataset_url}")

except Exception as e:
    print(f"❌ CKAN publication failed: {e}")
    print("\nTroubleshooting tips:")
    print("   • Check CKAN API credentials")
    print("   • Verify organization permissions")
    print("   • Ensure CKAN portal is accessible")
    print("   • Check dataset name uniqueness")
    raise
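
When a publication attempt fails because of a transient network problem, retrying with a growing delay is a common remedy. The sketch below is a generic retry wrapper, not an SDK feature: `call_with_retries` is a hypothetical helper, and it assumes the wrapped operation is safe to repeat. Whether `publish_to_ckan` can be repeated without side effects after a partial failure is not confirmed here.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    operation: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation, retrying with exponential backoff; re-raise the last error."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except Exception as exc:
            if attempt == attempts:
                raise
            delay = base_delay * 2 ** (attempt - 1)
            print(f"Attempt {attempt} failed ({exc}); retrying in {delay:.1f}s")
            sleep(delay)
    raise ValueError("attempts must be at least 1")


# Demonstration with a stand-in operation that fails twice and then succeeds.
state = {"calls": 0}


def flaky_operation() -> str:
    state["calls"] += 1
    if state["calls"] < 3:
        raise ConnectionError("temporary failure")
    return "published"


print(call_with_retries(flaky_operation, attempts=3, base_delay=0.01))
```

In the notebook, the wrapped operation could be `lambda: client.publish_to_ckan(campaign_id=campaign_id, station_id=station_id)`, subject to the caveat above.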

In [None]:
# Demonstrate Custom Metadata Publishing
print("🎨 Demonstrating Custom Metadata Publishing...")

# Example 1: Basic custom metadata
print("\n📝 Example 1: Adding custom dataset metadata")
custom_dataset_metadata = {
    "project_name": "Water Quality Monitoring Study",
    "funding_agency": "Environmental Protection Agency",
    "grant_number": "EPA-2024-WQ-001",
    "study_period": "2024-2025",
    "principal_investigator": "Dr. Jane Smith",
    "institution": "University of Environmental Sciences",
    "data_quality_level": "Level 2 - Quality Controlled"
}

print("Custom dataset metadata to be added:")
for key, value in custom_dataset_metadata.items():
    print(f"   • {key}: {value}")

# Example 2: Custom resource metadata
print("\n📄 Example 2: Adding custom resource metadata")
custom_resource_metadata = {
    "calibration_date": "2024-01-15",
    "calibration_method": "NIST-traceable standards",
    "processing_version": "v2.1",
    "quality_control": "Automated + Manual Review",
    "uncertainty_bounds": "±2% of reading",
    "data_completeness": "98.5%"
}

print("Custom resource metadata to be added to both sensors.csv and measurements.csv:")
for key, value in custom_resource_metadata.items():
    print(f"   • {key}: {value}")

# Example 3: Custom tags
print("\n🏷️ Example 3: Adding custom tags")
custom_tags = [
    "water-quality",
    "epa-funded",
    "university-research",
    "quality-controlled",
    "long-term-monitoring"
]

print(f"Custom tags (added to base tags): {', '.join(custom_tags)}")
print(f"Final tags will be: {', '.join(['environmental', 'sensors', 'upstream'] + custom_tags)}")

# Example 4: Additional CKAN dataset parameters
print("\n⚙️ Example 4: Additional CKAN dataset parameters")
additional_params = {
    "license_id": "cc-by-4.0",  # Creative Commons Attribution 4.0
    "version": "2.1",
    "author": "Environmental Research Team",
    "author_email": "research@university.edu",
    "maintainer": "Dr. Jane Smith",
    "maintainer_email": "jane.smith@university.edu"
}

print("Additional CKAN dataset parameters:")
for key, value in additional_params.items():
    print(f"   • {key}: {value}")

print("\n💡 These examples show how to enrich your CKAN datasets with project-specific metadata!")

🎨 Demonstrating Custom Metadata Publishing...

📝 Example 1: Adding custom dataset metadata
Custom dataset metadata to be added:
   • project_name: Water Quality Monitoring Study
   • funding_agency: Environmental Protection Agency
   • grant_number: EPA-2024-WQ-001
   • study_period: 2024-2025
   • principal_investigator: Dr. Jane Smith
   • institution: University of Environmental Sciences
   • data_quality_level: Level 2 - Quality Controlled

📄 Example 2: Adding custom resource metadata
Custom resource metadata to be added to both sensors.csv and measurements.csv:
   • calibration_date: 2024-01-15
   • calibration_method: NIST-traceable standards
   • processing_version: v2.1
   • quality_control: Automated + Manual Review
   • uncertainty_bounds: ±2% of reading
   • data_completeness: 98.5%

🏷️ Example 3: Adding custom tags
Custom tags (added to base tags): water-quality, epa-funded, university-research, quality-controlled, long-term-monitoring
Final tags will be: environmental, sen
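
The earlier metadata cell stores extras as a list of `{"key": ..., "value": ...}` pairs, while the custom metadata above is a flat dictionary. The sketch below shows one way to convert between the two shapes and to combine base and custom tags without duplicates. Both helpers are hypothetical; how the SDK performs these steps internally is not shown in this notebook.

```python
from typing import Any, Dict, Iterable, List


def to_ckan_extras(metadata: Dict[str, Any]) -> List[Dict[str, str]]:
    """Convert a flat dictionary into CKAN's list-of-key/value extras format."""
    return [{"key": str(key), "value": str(value)} for key, value in metadata.items()]


def merge_tags(base_tags: Iterable[str], custom_tags: Iterable[str]) -> List[str]:
    """Combine tag lists, keeping first-seen order and dropping duplicates."""
    merged: List[str] = []
    for tag in list(base_tags) + list(custom_tags):
        if tag not in merged:
            merged.append(tag)
    return merged


print(to_ckan_extras({"grant_number": "EPA-2024-WQ-001", "study_period": "2024-2025"}))
print(merge_tags(["environmental", "sensors", "upstream"], ["water-quality", "upstream"]))
```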

In [15]:
# Publish with Custom Metadata - Practical Example
print("🚀 Publishing with Custom Metadata - Practical Example")
print("=" * 60)

# Suffix that could distinguish a separate dataset for this example
# (illustrative only: it is not passed to publish_to_ckan below)
custom_dataset_campaign_id = f"{campaign_id}-custom-meta"

try:
    # Publish campaign data with ALL custom metadata options
    print("📤 Publishing campaign with comprehensive custom metadata...")

    custom_publication_result = client.publish_to_ckan(
        campaign_id=campaign_id,
        station_id=station_id,

        # Custom dataset metadata (added to CKAN extras)
        dataset_metadata=custom_dataset_metadata,

        # Custom resource metadata (added to both CSV files)
        resource_metadata=custom_resource_metadata,

        # Custom tags (combined with base tags)
        custom_tags=custom_tags,

        # Control auto-publishing
        auto_publish=True,

        # Additional CKAN dataset parameters
        **additional_params
    )

    print("✅ Custom Metadata Publication Successful!")
    print(f"\n📊 Publication Results:")
    print(f"   • Dataset Name: {custom_publication_result['dataset']['name']}")
    print(f"   • Dataset ID: {custom_publication_result['dataset']['id']}")
    print(f"   • Resources: {len(custom_publication_result['resources'])}")
    print(f"   • CKAN URL: {custom_publication_result['ckan_url']}")

    # Store for verification
    custom_dataset = custom_publication_result['dataset']
    custom_ckan_url = custom_publication_result['ckan_url']

    print(f"\n🌟 Enhanced dataset available at:")
    print(f"   {custom_ckan_url}")

    print(f"\n🔍 What's different with custom metadata:")
    print(f"   ✓ Extended dataset metadata with project details")
    print(f"   ✓ Enhanced resource metadata with quality information")
    print(f"   ✓ Improved discoverability through custom tags")
    print(f"   ✓ Professional licensing and authorship information")
    print(f"   ✓ Version tracking and maintenance contacts")

except Exception as e:
    print(f"❌ Custom metadata publication failed: {e}")
    print("This might be due to CKAN permissions or network issues.")

🚀 Publishing with Custom Metadata - Practical Example
📤 Publishing campaign with comprehensive custom metadata...
Exporting sensors for station 8 in campaign 3


Failed to publish campaign to CKAN: Failed to create CKAN dataset: 403 Client Error: FORBIDDEN for url: https://ckan.tacc.utexas.edu/api/3/action/package_create


❌ Custom metadata publication failed: CKAN publication failed: Failed to create CKAN dataset: 403 Client Error: FORBIDDEN for url: https://ckan.tacc.utexas.edu/api/3/action/package_create
This might be due to CKAN permissions or network issues.
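
The cell above derives an identifier of the form `{campaign_id}-custom-meta`. If a string like this is ever used as a CKAN dataset name, it has to follow CKAN's naming rule: 2 to 100 characters, using only lowercase alphanumerics, `-`, and `_`. The sketch below checks and normalizes a candidate name before publishing. Both helper names are hypothetical and are not part of the Upstream SDK.

```python
import re

# CKAN dataset names: 2-100 chars of lowercase letters, digits, "-" and "_"
NAME_PATTERN = re.compile(r"[a-z0-9_-]{2,100}")


def is_valid_ckan_name(name: str) -> bool:
    """Return True if the whole string satisfies the naming rule."""
    return NAME_PATTERN.fullmatch(name) is not None


def slugify_name(raw: str) -> str:
    """Lowercase, replace runs of disallowed characters with "-", cap at 100 chars."""
    slug = re.sub(r"[^a-z0-9_-]+", "-", raw.lower()).strip("-")
    return slug[:100]


for candidate in ["3-custom-meta", "Upstream Campaign 3", "a"]:
    print(f"{candidate!r}: valid={is_valid_ckan_name(candidate)}, slug={slugify_name(candidate)!r}")
```

A name check like this only rules out formatting problems. It does not address the 403 response shown above, which points to permissions.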


In [16]:
# Compare Standard vs Custom Metadata Results
print("🔍 Comparing Standard vs Custom Metadata Results")
print("=" * 55)

try:
    # Retrieve the custom metadata dataset for comparison
    custom_dataset_details = ckan.get_dataset(custom_dataset['name'])

    print("📋 Metadata Comparison:")
    print("\n1️⃣ DATASET-LEVEL METADATA (CKAN Extras)")
    print("   Standard publish_to_ckan() includes:")
    standard_extras = ["source", "data_type", "campaign_id", "campaign_name",
                      "campaign_description", "campaign_contact_name", "campaign_contact_email"]
    for extra in standard_extras:
        print(f"      • {extra}")

    print("\n   Custom metadata adds:")
    custom_extras = list(custom_dataset_metadata.keys())
    for extra in custom_extras:
        print(f"      • {extra}")

    print(f"\n   📊 Total extras in custom dataset: {len(custom_dataset_details.get('extras', []))}")

    # Show some actual custom extras from the dataset
    print("\n   🔍 Sample custom extras retrieved from CKAN:")
    for extra in custom_dataset_details.get('extras', [])[:8]:  # Show first 8
        if extra['key'] in custom_dataset_metadata:
            print(f"      ✓ {extra['key']}: {extra['value']}")

    print("\n2️⃣ TAGS COMPARISON")
    dataset_tags = [tag['name'] for tag in custom_dataset_details.get('tags', [])]
    base_tags = ["environmental", "sensors", "upstream"]
    added_tags = [tag for tag in dataset_tags if tag not in base_tags]

    print(f"   Base tags: {', '.join(base_tags)}")
    print(f"   Custom tags added: {', '.join(added_tags)}")
    print(f"   📊 Total tags: {len(dataset_tags)}")

    print("\n3️⃣ DATASET PARAMETERS")
    print(f"   License: {custom_dataset_details.get('license_title', 'Not set')}")
    print(f"   Version: {custom_dataset_details.get('version', 'Not set')}")
    print(f"   Author: {custom_dataset_details.get('author', 'Not set')}")
    print(f"   Maintainer: {custom_dataset_details.get('maintainer', 'Not set')}")

    print("\n4️⃣ RESOURCE METADATA")
    resources = custom_dataset_details.get('resources', [])
    if resources:
        print(f"   Found {len(resources)} resources with enhanced metadata")
        sample_resource = resources[0]  # Check first resource

        # Count how many custom metadata fields are present
        custom_fields_found = 0
        for field_name in custom_resource_metadata.keys():
            if field_name in sample_resource:
                custom_fields_found += 1
                print(f"      ✓ {field_name}: {sample_resource[field_name]}")

        print(f"   📊 Custom resource fields added: {custom_fields_found}/{len(custom_resource_metadata)}")

    print("\n💡 Benefits of Custom Metadata:")
    print("   🎯 Improved searchability and discoverability")
    print("   📚 Better documentation and context")
    print("   🔍 Enhanced filtering and categorization")
    print("   📊 Professional presentation and credibility")
    print("   🤝 Clear contact and attribution information")
    print("   ⚖️ Proper licensing and usage terms")

except Exception as e:
    print(f"⚠️ Could not retrieve custom dataset details: {e}")
    print("The comparison will use the information we provided during publishing.")

print(f"\n📚 Usage Guidelines:")
print("• Use dataset_metadata for project-level information")
print("• Use resource_metadata for data quality and processing details")
print("• Use custom_tags for improved discoverability")
print("• Use additional parameters for CKAN-specific fields")
print("• All custom metadata is preserved and searchable in CKAN")

🔍 Comparing Standard vs Custom Metadata Results
⚠️ Could not retrieve custom dataset details: name 'custom_dataset' is not defined
The comparison will use the information we provided during publishing.

📚 Usage Guidelines:
• Use dataset_metadata for project-level information
• Use resource_metadata for data quality and processing details
• Use custom_tags for improved discoverability
• Use additional parameters for CKAN-specific fields
• All custom metadata is preserved and searchable in CKAN
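
The `name 'custom_dataset' is not defined` message above appears because the comparison cell depends on a variable that is only assigned when the previous publishing cell succeeds. A minimal sketch of an explicit prerequisite check follows. The helper `missing_names` is hypothetical, and the list of required names is taken from the variables the comparison cell reads.

```python
from typing import Iterable, List, Mapping


def missing_names(namespace: Mapping[str, object], names: Iterable[str]) -> List[str]:
    """Return the names that are not defined in the given namespace."""
    return [name for name in names if name not in namespace]


# Variables the comparison cell reads
required = ["custom_dataset", "custom_dataset_metadata", "custom_resource_metadata"]

missing = missing_names(globals(), required)
if missing:
    print(f"Skipping comparison; run the publishing cell first. Missing: {', '.join(missing)}")
else:
    print("All prerequisites are defined; the comparison can run.")
```

Checking up front gives a clearer message than a `NameError` caught by a broad `except Exception`.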


In [17]:
# Publish campaign data to CKAN using integrated method
print(f"📤 Publishing campaign data to CKAN...")
station_name = client.stations.get(station_id=station_id, campaign_id=campaign_id).name

try:
    # Use the integrated CKAN publishing method
    publication_result = client.publish_to_ckan(
        campaign_id=campaign_id,
        station_id=station_id,
    )

    print(f"✅ CKAN Publication Successful!")
    print(f"\n📊 Publication Summary:")
    print(f"   • Success: {publication_result['success']}")
    print(f"   • Dataset Name: {publication_result['dataset']['name']}")
    print(f"   • Dataset ID: {publication_result['dataset']['id']}")
    print(f"   • Resources Created: {len(publication_result['resources'])}")
    print(f"   • CKAN URL: {publication_result['ckan_url']}")
    print(f"   • Message: {publication_result['message']}")

    # Store results for further operations
    published_dataset = publication_result['dataset']
    published_resources = publication_result['resources']
    ckan_dataset_url = publication_result['ckan_url']

    print(f"\n🎉 Your data is now publicly available at:")
    print(f"   {ckan_dataset_url}")

except Exception as e:
    print(f"❌ CKAN publication failed: {e}")
    print("\nTroubleshooting tips:")
    print("   • Check CKAN API credentials")
    print("   • Verify organization permissions")
    print("   • Ensure CKAN portal is accessible")
    print("   • Check dataset name uniqueness")
    raise

📤 Publishing campaign data to CKAN...
Exporting sensors for station 8 in campaign 3


Failed to publish campaign to CKAN: Failed to create CKAN dataset: 403 Client Error: FORBIDDEN for url: https://ckan.tacc.utexas.edu/api/3/action/package_create


❌ CKAN publication failed: CKAN publication failed: Failed to create CKAN dataset: 403 Client Error: FORBIDDEN for url: https://ckan.tacc.utexas.edu/api/3/action/package_create

Troubleshooting tips:
   • Check CKAN API credentials
   • Verify organization permissions
   • Ensure CKAN portal is accessible
   • Check dataset name uniqueness


APIError: CKAN publication failed: Failed to create CKAN dataset: 403 Client Error: FORBIDDEN for url: https://ckan.tacc.utexas.edu/api/3/action/package_create
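
A `403 FORBIDDEN` on `package_create` usually means the API token's user is not allowed to create datasets in the target organization. CKAN exposes the `organization_list_for_user` action, which accepts a `permission` parameter such as `create_dataset`. The sketch below queries it with the standard library only. The function names are hypothetical, the token is assumed to be sent in the `Authorization` header, and the sample response is hand-written for an offline illustration.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Dict, List


def fetch_writable_organizations(ckan_url: str, api_token: str) -> List[Dict[str, Any]]:
    """Ask CKAN which organizations the token's user may create datasets in."""
    query = urllib.parse.urlencode({"permission": "create_dataset"})
    url = f"{ckan_url.rstrip('/')}/api/3/action/organization_list_for_user?{query}"
    request = urllib.request.Request(url, headers={"Authorization": api_token})
    with urllib.request.urlopen(request, timeout=30) as response:
        payload = json.load(response)
    return payload["result"]


def can_publish_to(organization: str, writable: List[Dict[str, Any]]) -> bool:
    """Return True if the organization name appears in the writable list."""
    return any(org.get("name") == organization for org in writable)


# Offline illustration: a hand-written stand-in for the "result" list
sample_result = [{"name": "org"}]
print(can_publish_to("org", sample_result))
print(can_publish_to("other-org", sample_result))
```

Against a live portal, `fetch_writable_organizations(CKAN_URL, token)` would supply the list. If the target organization is missing from it, the fix lies in CKAN's organization membership settings rather than in the SDK call.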

## 6. Dataset Verification and Exploration

Let's verify the published dataset and explore its contents in CKAN.

In [None]:
# Verify the published dataset
print(f"🔍 Verifying published dataset in CKAN...")

try:
    # Retrieve the dataset from CKAN to verify it was created correctly
    verified_dataset = ckan.get_dataset(published_dataset['name'])

    print(f"✅ Dataset verification successful!")
    print(f"\n📋 Dataset Information:")
    print(f"   • Name: {verified_dataset['name']}")
    print(f"   • Title: {verified_dataset['title']}")
    print(f"   • State: {verified_dataset['state']}")
    print(f"   • Private: {verified_dataset.get('private', 'Unknown')}")
    print(f"   • License: {verified_dataset.get('license_title', 'Not specified')}")
    print(f"   • Created: {verified_dataset.get('metadata_created', 'Unknown')}")
    print(f"   • Modified: {verified_dataset.get('metadata_modified', 'Unknown')}")

    # Show organization info if available
    if verified_dataset.get('organization'):
        org = verified_dataset['organization']
        print(f"   • Organization: {org.get('title', org.get('name', 'Unknown'))}")

    # Show tags
    if verified_dataset.get('tags'):
        tags = [tag['name'] for tag in verified_dataset['tags']]
        print(f"   • Tags: {', '.join(tags)}")

    # Show extras
    if verified_dataset.get('extras'):
        print(f"   • Extra metadata fields: {len(verified_dataset['extras'])}")
        for extra in verified_dataset['extras'][:3]:  # Show first 3
            print(f"     - {extra['key']}: {extra['value']}")

except Exception as e:
    print(f"❌ Dataset verification failed: {e}")

🔍 Verifying published dataset in CKAN...
✅ Dataset verification successful!

📋 Dataset Information:
   • Name: upstream-campaign-3
   • Title: UploadFile Test
   • State: active
   • Private: False
   • License: cc-by-4.0
   • Created: 2025-07-25T16:58:18.064946
   • Modified: 2025-07-25T17:00:34.651091
   • Organization: org
   • Tags: environmental, sensors, upstream
   • Extra metadata fields: 8
     - campaign_allocation: PT2050-DataX
     - campaign_contact_email: vlad@example.com
     - campaign_contact_name: Vlad
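
The verification cell reads extras as a list of `{'key': ..., 'value': ...}` entries, which is awkward for lookups. A small sketch that flattens them into a dictionary is shown below. The helper name `extras_to_dict` is hypothetical, and the sample dataset reuses values from the output above.

```python
from typing import Any, Dict


def extras_to_dict(dataset: Dict[str, Any]) -> Dict[str, str]:
    """Flatten a dataset's list of key/value extras into a plain dict."""
    return {extra["key"]: extra["value"] for extra in dataset.get("extras", [])}


# Stand-in for the dataset returned by ckan.get_dataset(...)
sample_dataset = {
    "name": "upstream-campaign-3",
    "extras": [
        {"key": "campaign_allocation", "value": "PT2050-DataX"},
        {"key": "campaign_contact_name", "value": "Vlad"},
    ],
}

extras = extras_to_dict(sample_dataset)
print(extras.get("campaign_contact_name", "Not set"))
print(extras.get("grant_number", "Not set"))
```

With a dictionary, checking whether a particular field survived publishing becomes a single `in` test instead of a loop over the first few entries.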


In [None]:
# Examine the published resources
print(f"📁 Examining published resources...")

try:
    resources = verified_dataset.get('resources', [])

    if resources:
        print(f"Found {len(resources)} resources:")

        for i, resource in enumerate(resources, 1):
            print(f"\n   📄 Resource {i}: {resource['name']}")
            print(f"      • ID: {resource['id']}")
            print(f"      • Format: {resource.get('format', 'Unknown')}")
            print(f"      • Size: {resource.get('size', 'Unknown')} bytes")
            print(f"      • Description: {resource.get('description', 'No description')}")
            print(f"      • Created: {resource.get('created', 'Unknown')}")
            print(f"      • URL: {resource.get('url', 'Not available')}")

            # Show download information
            if resource.get('url'):
                download_url = resource['url']
                if not download_url.startswith('http'):
                    download_url = f"{CKAN_URL}{download_url}"
                print(f"      • Download: {download_url}")

        print(f"\n✅ All resources published successfully!")

    else:
        print("⚠️  No resources found in the dataset")

except Exception as e:
    print(f"❌ Error examining resources: {e}")

📁 Examining published resources...
Found 4 resources:

   📄 Resource 1: UploadFile Test - Sensors Configuration - 2025-07-25T12:58:18Z
      • ID: d200a54d-dd06-4830-9c19-63982973271d
      • Format: CSV
      • Size: 108943 bytes
      • Description: Sensor configuration and metadata
      • Created: 2025-07-25T16:58:18.374289
      • URL: http://ckan.tacc.cloud:5000/dataset/c998fbd1-303f-42c6-843b-b391e321250d/resource/d200a54d-dd06-4830-9c19-63982973271d/download/uploaded_file
      • Download: http://ckan.tacc.cloud:5000/dataset/c998fbd1-303f-42c6-843b-b391e321250d/resource/d200a54d-dd06-4830-9c19-63982973271d/download/uploaded_file

   📄 Resource 2: UploadFile Test - Measurement Data - 2025-07-25T12:58:18Z
      • ID: 01092c33-fb14-472e-b2f5-b167aca8df8d
      • Format: CSV
      • Size: 60986174 bytes
      • Description: Environmental sensor measurements
      • Created: 2025-07-25T16:58:20.516707
      • URL: http://ckan.tacc.cloud:5000/dataset/c998fbd1-303f-42c6-843b-b391e3212
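
The resource cell builds download links by concatenating `CKAN_URL` with the resource URL and prints sizes as raw byte counts. The sketch below shows an alternative using `urllib.parse.urljoin`, which leaves absolute URLs untouched, plus a readable size format. Both helper names are hypothetical, and the example path is made up for illustration.

```python
from urllib.parse import urljoin


def absolute_url(base_url: str, resource_url: str) -> str:
    """Resolve a possibly relative resource URL against the portal base URL."""
    return urljoin(base_url.rstrip("/") + "/", resource_url)


def human_size(num_bytes: int) -> str:
    """Format a byte count using binary units (KiB, MiB, GiB)."""
    units = ["bytes", "KiB", "MiB", "GiB"]
    size = float(num_bytes)
    index = 0
    while size >= 1024 and index < len(units) - 1:
        size /= 1024
        index += 1
    return f"{size:.0f} bytes" if index == 0 else f"{size:.1f} {units[index]}"


base = "http://ckan.tacc.cloud:5000"
print(absolute_url(base, "/dataset/example/download/uploaded_file"))
print(human_size(108943))    # size of the sensors resource above
print(human_size(60986174))  # size of the measurements resource above
```

One caveat: with a portal mounted under a sub-path, a resource URL starting with `/` resolves against the host root, so the base URL handling may need adjusting.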

## 7. Dataset Management Operations

Let's demonstrate additional CKAN management operations like updating datasets and managing resources.

In [None]:
# Update dataset with additional metadata
from datetime import timezone

print(f"🔄 Demonstrating dataset update operations...")

try:
    # Add update timestamp and additional tags
    current_tags = [tag['name'] for tag in verified_dataset.get('tags', [])]
    updated_tags = current_tags + ["demo", "notebook-generated"]

    # Use an explicit UTC timestamp so the "UTC" label is accurate
    # (datetime.now() without a timezone returns local time)
    updated_at = datetime.now(timezone.utc).strftime('%Y-%m-%d %H:%M:%S UTC')

    # Update the dataset
    updated_dataset = ckan.update_dataset(
        dataset_id=published_dataset['name'],
        tags=updated_tags,
        notes=f"{verified_dataset.get('notes', '')}\n\n**Last Updated:** {updated_at} (via Upstream SDK Demo)"
    )

    print(f"✅ Dataset updated successfully!")
    print(f"   • New tags added: demo, notebook-generated")
    print(f"   • Description updated with timestamp")
    print(f"   • Total tags: {len(updated_dataset.get('tags', []))}")

except Exception as e:
    print(f"⚠️  Dataset update failed: {e}")
    print("This may be due to insufficient permissions or CKAN configuration.")

🔄 Demonstrating dataset update operations...
✅ Dataset updated successfully!
   • New tags added: demo, notebook-generated
   • Description updated with timestamp
   • Total tags: 5
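
The update cell extends the tag list by simple concatenation, so running it a second time would append `demo` and `notebook-generated` again. A minimal sketch of an order-preserving merge that skips duplicates follows. The helper name `merge_tags` is hypothetical, and it assumes tags arrive in CKAN's `{'name': ...}` form, as the cell above reads them.

```python
from typing import Dict, Iterable, List


def merge_tags(existing: Iterable[Dict[str, str]], additions: Iterable[str]) -> List[str]:
    """Combine existing tag dicts with new tag names, dropping duplicates in order."""
    names = [tag["name"] for tag in existing] + list(additions)
    return list(dict.fromkeys(names))


existing_tags = [{"name": "environmental"}, {"name": "sensors"}, {"name": "upstream"}]
new_tags = ["demo", "notebook-generated"]

first_run = merge_tags(existing_tags, new_tags)
# Simulate re-running the cell against the already updated dataset
second_run = merge_tags([{"name": name} for name in first_run], new_tags)

print(first_run)
print(second_run == first_run)
```

`dict.fromkeys` keeps the first occurrence of each name, so the base tags stay in their original order.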


## 8. Cleanup and Resource Management

Let's demonstrate proper cleanup and resource management.

In [None]:
# Dataset management options
print(f"🧹 Dataset Management and Cleanup Options:")

print(f"\n📊 Current Dataset Status:")
print(f"   • Dataset Name: {published_dataset['name']}")
print(f"   • Dataset ID: {published_dataset['id']}")
print(f"   • CKAN URL: {ckan_dataset_url}")
print(f"   • Resources: {len(published_resources)}")

print(f"\n🔧 Management Options:")
print(f"   1. Keep dataset active (recommended for production)")
print(f"   2. Make dataset private (hide from public)")
print(f"   3. Archive dataset (mark as deprecated)")
print(f"   4. Delete dataset (only for test data)")

# For demo purposes, we'll show how to manage the dataset
print(f"\n💡 For this demo, we'll keep the dataset active.")
print(f"   Your published data will remain available at:")
print(f"   {ckan_dataset_url}")

# To delete the demo dataset, remove the triple quotes around the following block
"""
# CAUTION: Run this only to clean up test datasets
print(f"\n⚠️  Demo dataset cleanup:")
try:
    # Delete the demo dataset (only for demo purposes)
    deletion_result = ckan.delete_dataset(published_dataset['name'])
    if deletion_result:
        print(f"   ✅ Demo dataset deleted successfully")
    else:
        print(f"   ❌ Dataset deletion failed")
except Exception as e:
    print(f"   ⚠️  Could not delete dataset: {e}")
    print(f"   This may be due to insufficient permissions or CKAN configuration.")
"""

print(f"\n🔄 Resource Cleanup:")
try:
    # Close any open file handles
    if 'station_sensors_data' in locals():
        station_sensors_data.close()
    if 'station_measurements_data' in locals():
        station_measurements_data.close()


    print(f"   ✅ File handles closed")
except Exception as e:
    print(f"   ⚠️  Error closing file handles: {e}")

🧹 Dataset Management and Cleanup Options:

📊 Current Dataset Status:
   • Dataset Name: upstream-campaign-3
   • Dataset ID: c998fbd1-303f-42c6-843b-b391e321250d
   • CKAN URL: http://ckan.tacc.cloud:5000/dataset/upstream-campaign-3
   • Resources: 2

🔧 Management Options:
   1. Keep dataset active (recommended for production)
   2. Make dataset private (hide from public)
   3. Archive dataset (mark as deprecated)
   4. Delete dataset (only for test data)

💡 For this demo, we'll keep the dataset active.
   Your published data will remain available at:
   http://ckan.tacc.cloud:5000/dataset/upstream-campaign-3

🔄 Resource Cleanup:
   ✅ File handles closed
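
The cleanup cell closes file handles by checking `locals()` for specific variable names. An alternative is to scope the handles with a context manager so they are closed even if an error occurs. The sketch below uses `contextlib.ExitStack` with in-memory `BytesIO` objects as stand-ins. The actual types of `station_sensors_data` and `station_measurements_data` are not shown in this notebook, and the CSV content is invented for illustration.

```python
from contextlib import ExitStack
from io import BytesIO

with ExitStack() as stack:
    # Stand-ins for the exported sensor and measurement streams
    sensors = stack.enter_context(BytesIO(b"id,name\n1,example\n"))
    measurements = stack.enter_context(BytesIO(b"id,value\n1,0.5\n"))

    # Work with the streams while they are open
    header = sensors.readline()
    print(header)

# Leaving the block closes every registered stream
print(sensors.closed, measurements.closed)
```

This removes the need to remember each handle by name at the end of the notebook.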


In [None]:
# Logout and final cleanup
print(f"👋 Session cleanup and logout...")

try:
    # Logout from Upstream
    client.logout()
    print(f"   ✅ Logged out from Upstream successfully")
except Exception as e:
    print(f"   ❌ Logout error: {e}")

print(f"\n🎉 CKAN Integration Demo Completed Successfully!")

print(f"\n📚 Summary of What We Accomplished:")
print(f"   ✅ Connected to both Upstream and CKAN platforms")
print(f"   ✅ Selected and validated campaign data")
print(f"   ✅ Exported sensor and measurement data")
print(f"   ✅ Created comprehensive CKAN dataset with metadata")
print(f"   ✅ Published resources (sensors, measurements, metadata)")
print(f"   ✅ Demonstrated dataset management operations")
print(f"   ✅ Explored data discovery and search capabilities")
print(f"   ✅ Showed automated publishing workflows")

print(f"\n🌐 Your Data is Now Publicly Available:")
print(f"   📊 Dataset: {published_dataset['name']}")
print(f"   🔗 URL: {ckan_dataset_url}")
print(f"   📁 Resources: {len(published_resources)} files available for download")

print(f"\n📖 Next Steps:")
print(f"   • Explore your published data in the CKAN web interface")
print(f"   • Set up automated publishing workflows for production")
print(f"   • Configure organization permissions and access controls")
print(f"   • Integrate CKAN APIs with other data analysis tools")
print(f"   • Monitor dataset usage and access patterns")

👋 Session cleanup and logout...
   ✅ Logged out from Upstream successfully

🎉 CKAN Integration Demo Completed Successfully!

📚 Summary of What We Accomplished:
   ✅ Connected to both Upstream and CKAN platforms
   ✅ Selected and validated campaign data
   ✅ Exported sensor and measurement data
   ✅ Created comprehensive CKAN dataset with metadata
   ✅ Published resources (sensors, measurements, metadata)
   ✅ Demonstrated dataset management operations
   ✅ Explored data discovery and search capabilities
   ✅ Showed automated publishing workflows

🌐 Your Data is Now Publicly Available:
   📊 Dataset: upstream-campaign-3
   🔗 URL: http://ckan.tacc.cloud:5000/dataset/upstream-campaign-3
   📁 Resources: 2 files available for download

📖 Next Steps:
   • Explore your published data in the CKAN web interface
   • Set up automated publishing workflows for production
   • Configure organization permissions and access controls
   • Integrate CKAN APIs with other data analysis tools
   • Monitor dataset usage and access patterns

## Summary

This notebook demonstrated the comprehensive CKAN integration capabilities of the Upstream SDK:

✅ **Authentication & Setup** - Configured both Upstream and CKAN credentials  
✅ **Data Export** - Retrieved campaign data and prepared for publishing  
✅ **Dataset Creation** - Created CKAN datasets with rich metadata  
✅ **Resource Management** - Published multiple data resources (sensors, measurements, metadata)  
✅ **Portal Exploration** - Discovered existing datasets and organizations  
✅ **Update Operations** - Demonstrated dataset and resource updates  
✅ **Search & Discovery** - Showed data findability through tags and organization  
✅ **Automation Workflows** - Built reusable publishing processes  
✅ **Best Practices** - Covered naming, metadata, and performance considerations  

## Key Features

- **Seamless Integration**: Direct connection between Upstream campaigns and CKAN datasets
- **Rich Metadata**: Automatic generation of comprehensive dataset descriptions and tags
- **Multi-Resource Support**: Separate resources for sensors, measurements, and metadata
- **Update Management**: Smart handling of dataset updates and versioning
- **Error Handling**: Robust error handling and validation throughout the process
- **Automation Ready**: Workflow patterns suitable for production automation

## Production Considerations

- **Authentication**: Use environment variables or configuration files for credentials
- **Monitoring**: Implement logging and monitoring for automated publishing workflows
- **Permissions**: Configure appropriate CKAN organization permissions and access controls
- **Validation**: Add comprehensive data validation before publishing
- **Backup**: Maintain backup copies of datasets before updates
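
As a sketch of the authentication recommendation above, credentials can be read from environment variables and validated before any client is created. The variable names below are hypothetical and not defined by the Upstream SDK, and the values in `fake_env` are placeholders.

```python
import os
from typing import Dict, Mapping, Sequence

# Hypothetical variable names; adapt them to your deployment
REQUIRED_VARS = ("UPSTREAM_USERNAME", "UPSTREAM_PASSWORD", "CKAN_URL", "CKAN_API_KEY")


def load_settings(
    env: Mapping[str, str] = os.environ,
    required: Sequence[str] = REQUIRED_VARS,
) -> Dict[str, str]:
    """Collect required settings, failing early with the full list of missing names."""
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing required environment variables: {', '.join(missing)}")
    return {name: env[name] for name in required}


# Offline illustration with a stand-in mapping instead of the real environment
fake_env = {
    "UPSTREAM_USERNAME": "demo-user",
    "UPSTREAM_PASSWORD": "not-a-real-password",
    "CKAN_URL": "https://ckan.example.org",
    "CKAN_API_KEY": "not-a-real-key",
}
settings = load_settings(fake_env)
print(sorted(settings))
```

Failing early with all missing names at once avoids discovering them one at a time in the middle of a publishing run.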

## Related Documentation

- [CKAN API Documentation](https://docs.ckan.org/en/latest/api/)

---

*This notebook demonstrates CKAN integration for the Upstream SDK. For core platform functionality, see UpstreamSDK_Core_Demo.ipynb.*