# Upstream SDK CKAN Integration Demo

This notebook demonstrates the CKAN integration capabilities of the Upstream SDK for publishing environmental monitoring data to CKAN data portals.

## Overview

The Upstream SDK provides seamless integration with CKAN (Comprehensive Knowledge Archive Network) data portals for:
- 📊 **Dataset Publishing**: Automatically create CKAN datasets from campaign data
- 📁 **Resource Management**: Upload sensor configurations and measurement data as resources
- 🏢 **Organization Support**: Publish data under specific CKAN organizations
- 🔄 **Update Management**: Update existing datasets with new data
- 🏷️ **Metadata Integration**: Rich metadata tagging and categorization

## Features Demonstrated

- CKAN client setup and configuration
- Campaign data export and preparation
- Dataset creation with comprehensive metadata
- Resource management (sensors and measurements)
- Organization and permission handling
- Error handling and validation

## Prerequisites

- Valid Upstream account credentials
- Access to a CKAN portal with API credentials
- Existing campaign data (or run UpstreamSDK_Core_Demo.ipynb first)
- Python 3.7+ environment with required packages

## Related Notebooks

- **UpstreamSDK_Core_Demo.ipynb**: Core SDK functionality and campaign creation

## Installation and Setup

In [43]:
# Install required packages
#!pip install upstream-sdk
!pip install -e .
# Import required libraries
import os
import json
import getpass
from pathlib import Path
from datetime import datetime
from typing import Dict, Any, Optional, List
from io import BytesIO

# Import Upstream SDK modules
from upstream.client import UpstreamClient
from upstream.ckan import CKANIntegration
from upstream.exceptions import APIError, ValidationError, ConfigurationError


## 1. Configuration and Authentication

First, let's set up authentication for both Upstream and CKAN platforms.

**Configuration Options:**
- **Upstream API**: Username/password authentication
- **CKAN Portal**: API key or access token authentication
- **Organization**: CKAN organization for dataset publishing

In [36]:
# Configuration
UPSTREAM_BASE_URL = "https://upstream-dso.tacc.utexas.edu/dev"
# Local development override. Comment out the line below to use the hosted API above.
UPSTREAM_BASE_URL = 'http://localhost:8000'

# CKAN configuration: update these for your CKAN portal
CKAN_URL = "https://ckan.tacc.utexas.edu"   # Replace with your CKAN portal URL
CKAN_ORGANIZATION = "setx-uifl"      # Replace with your organization name

# Local development override. Comment out the two lines below to use the portal above.
CKAN_URL = 'http://ckan.tacc.cloud:5000'
CKAN_ORGANIZATION = 'org'

print("🔧 Configuration Settings:")
print(f"   Upstream API: {UPSTREAM_BASE_URL}")
print(f"   CKAN Portal: {CKAN_URL}")
print(f"   CKAN Organization: {CKAN_ORGANIZATION}")

🔧 Configuration Settings:
   Upstream API: http://localhost:8000
   CKAN Portal: http://ckan.tacc.cloud:5000
   CKAN Organization: org

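Hard-coded overrides are easy to forget when switching between hosted and local services. As a minimal sketch, the same settings could be read from environment variables, with the hosted values as defaults. The environment variable names and the `load_setting` helper below are illustrative assumptions, not something the Upstream SDK defines.

```python
import os


def load_setting(name, default, env=None):
    """Return the override for `name` from `env`, or `default` if unset or blank."""
    env = os.environ if env is None else env
    value = env.get(name, "").strip()
    return value or default


# Hypothetical variable names; the defaults are the hosted values used in this notebook.
settings = {
    "upstream_base_url": load_setting("UPSTREAM_BASE_URL", "https://upstream-dso.tacc.utexas.edu/dev"),
    "ckan_url": load_setting("CKAN_URL", "https://ckan.tacc.utexas.edu"),
    "ckan_organization": load_setting("CKAN_ORGANIZATION", "setx-uifl"),
}
print(settings)
```

With this approach, a local run only needs `UPSTREAM_BASE_URL=http://localhost:8000` set in the shell, and the notebook source stays unchanged.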

In [44]:
# Get Upstream credentials
print("🔐 Please enter your TACC credentials:")
upstream_username = input("TACC Username: ")
upstream_password = getpass.getpass("TACC Password: ")

# Get CKAN credentials (optional: without an API key, CKAN access is read-only)
print("\n🔑 CKAN API credentials (optional for demo):")
ckan_api_key = getpass.getpass("CKAN API Key (press Enter to skip): ")

# Prepare CKAN configuration
ckan_config = {
    "timeout": 30
}

if ckan_api_key:
    ckan_config["api_key"] = ckan_api_key
    print("✅ CKAN API key configured")
else:
    print("ℹ️  Running in read-only CKAN mode")

🔐 Please enter your TACC credentials:

🔑 CKAN API credentials (optional for demo):
✅ CKAN API key configured


In [45]:
# Initialize Upstream client with CKAN integration
try:
    client = UpstreamClient(
        username=upstream_username,
        password=upstream_password,
        base_url=UPSTREAM_BASE_URL,
        ckan_url=CKAN_URL,
        ckan_organization=CKAN_ORGANIZATION,
        **ckan_config
    )
    print('✅ Upstream client initialized')

    # Test Upstream authentication
    if client.authenticate():
        print("✅ Upstream authentication successful!")
        print(f"🔗 Connected to: {UPSTREAM_BASE_URL}")

        # Check CKAN integration
        if client.ckan:
            print("✅ CKAN integration enabled!")
            print(f"🔗 CKAN Portal: {CKAN_URL}")
        else:
            print("⚠️  CKAN integration not configured")
    else:
        print("❌ Upstream authentication failed!")
        raise Exception("Upstream authentication failed")

except Exception as e:
    print(f"❌ Setup error: {e}")
    raise

{'username': '<redacted>', 'password': '<redacted>', 'base_url': 'http://localhost:8000', 'ckan_url': 'http://ckan.tacc.cloud:5000', 'ckan_organization': 'org', 'timeout': 30, 'max_retries': 3, 'chunk_size': 10000, 'max_chunk_size_mb': 50, 'api_key': '<redacted>'}
✅ Upstream client initialized
✅ Upstream authentication successful!
🔗 Connected to: http://localhost:8000
✅ CKAN integration enabled!
🔗 CKAN Portal: http://ckan.tacc.cloud:5000

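Because the client echoes its configuration during setup, credentials can end up in saved notebook output. The following is a minimal sketch of masking sensitive values before printing any configuration or header mapping. The `redact` helper and the `SENSITIVE_KEYS` set are hypothetical and not part of the Upstream SDK.

```python
# Keys whose values should never be printed; compared case-insensitively.
SENSITIVE_KEYS = {"username", "password", "api_key", "authorization", "token"}


def redact(mapping):
    """Return a copy of `mapping` with the values of sensitive keys masked."""
    return {
        key: "<redacted>" if str(key).lower() in SENSITIVE_KEYS else value
        for key, value in mapping.items()
    }


print(redact({"username": "demo", "password": "secret", "timeout": 30}))
```

Clearing cell outputs before committing a notebook is a useful second safeguard, since masking only protects values that pass through the helper.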

## 2. Campaign Selection and Data Preparation

Let's select an existing campaign with data to publish to CKAN. If you don't have existing data, run the core demo notebook first.

In [46]:
# List available campaigns
print("📋 Available campaigns for CKAN publishing:")
try:
    campaigns = client.list_campaigns(limit=10)

    if campaigns.total == 0:
        print("❌ No campaigns found. Please run UpstreamSDK_Core_Demo.ipynb first to create sample data.")
        raise Exception("No campaigns available")

    print(f"Found {campaigns.total} campaigns:")
    for i, campaign in enumerate(campaigns.items[:5]):
        print(f"  {i+1}. ID: {campaign.id} - {campaign.name}")
        print(f"     Description: {(campaign.description or '')[:80]}...")
        print(f"     Contact: {campaign.contact_name} ({campaign.contact_email})")
        print()

    # Select campaign (use the first one or let user choose)
    selected_campaign = campaigns.items[0]
    campaign_id = selected_campaign.id

    print(f"📊 Selected campaign for CKAN publishing:")
    print(f"   ID: {campaign_id}")
    print(f"   Name: {selected_campaign.name}")

except Exception as e:
    print(f"❌ Error listing campaigns: {e}")
    raise

📋 Available campaigns for CKAN publishing:
Found 2 campaigns:
  1. ID: 1 - Test Campaign 2024
     Description: A test campaign for development purposes...
     Contact: John Doe (john.doe@example.com)

  2. ID: 2 - Weather Station Network
     Description: Network of weather stations across Texas...
     Contact: Jane Smith (jane.smith@example.com)

📊 Selected campaign for CKAN publishing:
   ID: 1
   Name: Test Campaign 2024

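The listing above slices each description to 80 characters and always appends `...`, which fails if a description is `None` and adds an ellipsis even to short text. A small helper can handle both cases. This is only a sketch; `preview` is a hypothetical name, not an SDK function.

```python
def preview(text, width=80):
    """Return a one-line preview of `text`, adding '...' only when it was truncated."""
    # Treat None as empty and collapse newlines and repeated spaces.
    flat = " ".join((text or "").split())
    if len(flat) <= width:
        return flat
    return flat[: width - 3].rstrip() + "..."


print(preview("A test campaign for development purposes"))
print(preview(None))
```

In the loops above it would be used as `preview(campaign.description)` in place of the slice expression.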

In [47]:
# Get stations for the selected campaign
print(f"📍 Finding stations in campaign {campaign_id}...")
try:
    stations = client.list_stations(campaign_id=str(campaign_id))

    if stations.total == 0:
        print("❌ No stations found in this campaign. Please create stations and upload data first.")
        raise Exception("No stations available")

    print(f"Found {stations.total} stations:")
    for station in stations.items:
        print(f"  • ID: {station.id} - {station.name}")
        print(f"    Description: {(station.description or '')[:80]}...")
        print()

    # Select the first station
    selected_station = stations.items[0]
    station_id = selected_station.id

    print(f"📡 Selected station for CKAN publishing:")
    print(f"   ID: {station_id}")
    print(f"   Name: {selected_station.name}")

except Exception as e:
    print(f"❌ Error listing stations: {e}")
    raise

📍 Finding stations in campaign 1...
Found 2 stations:
  • ID: 6 - Test Station Alpha
    Description: Test station for development and testing purposes...

  • ID: 7 - Mobile CO2 Station
    Description: Mobile station measuring CO2 levels around Austin...

📡 Selected station for CKAN publishing:
   ID: 6
   Name: Test Station Alpha


In [48]:
# Check for existing data in the station
print(f"🔍 Checking data availability for station {station_id}...")
try:
    # List sensors to verify data exists
    sensors = client.sensors.list(campaign_id=campaign_id, station_id=station_id)

    if not sensors.items:
        print("❌ No sensors found in this station. Please upload sensor data first.")
        raise Exception("No sensor data available")
    print(sensors.items)
    total_measurements = 0
    for sensor in sensors.items:
        if sensor.statistics:
            total_measurements += sensor.statistics.count
    print(total_measurements)

    print(f"✅ Data validation successful:")
    print(f"   • Sensors: {len(sensors.items)}")
    print(f"   • Total measurements: {total_measurements}")
    print(f"   • Sensor types: {', '.join([s.variablename for s in sensors.items[:3]])}{'...' if len(sensors.items) > 3 else ''}")

    if total_measurements == 0:
        print("⚠️  Warning: No measurement data found. CKAN publishing will include sensor configuration only.")
    else:
        print("✅ Ready for CKAN publishing with full dataset!")

except Exception as e:
    print(f"❌ Error checking data availability: {e}")
    raise

🔍 Checking data availability for station 6...
[SensorItem(id=4759, alias='12.9236', description=None, postprocess=True, postprocessscript='', units='Counts Per Second', variablename='No BestGuess Formula', statistics=SensorStatistics(max_value=1.576119412, min_value=-0.0004216404381, avg_value=0.000661913111494773, stddev_value=0.0374270791210834, percentile_90=-0.0004216404381, percentile_95=-0.0004216404381, percentile_99=-0.0004216404381, count=1800, first_measurement_value=-0.0004216404381, first_measurement_collectiontime=datetime.datetime(2023, 2, 23, 17, 36, 1, 8006, tzinfo=TzInfo(UTC)), last_measurement_time=datetime.datetime(2023, 2, 23, 18, 6, 0, 288008, tzinfo=TzInfo(UTC)), last_measurement_value=-0.0004216404381, stats_last_updated=datetime.datetime(2025, 5, 26, 14, 11, 29, 303319, tzinfo=TzInfo(UTC)))), SensorItem(id=4764, alias='13.0106', description=None, postprocess=True, postprocessscript='', units='Counts Per Second', variablename='No BestGuess Formula', statistics

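The measurement total in the cell above depends on each sensor carrying a populated `statistics` object with a numeric `count`. A more defensive version of that loop might look like the sketch below. It uses `SimpleNamespace` stand-ins rather than real SDK objects, and `count_measurements` is a hypothetical helper.

```python
from types import SimpleNamespace


def count_measurements(sensors):
    """Sum statistics.count over sensors, treating missing statistics or counts as zero."""
    total = 0
    for sensor in sensors:
        stats = getattr(sensor, "statistics", None)
        total += getattr(stats, "count", 0) or 0
    return total


# Stand-in objects that mimic the attributes shown in the output above.
demo_sensors = [
    SimpleNamespace(alias="12.9236", statistics=SimpleNamespace(count=1800)),
    SimpleNamespace(alias="13.0106", statistics=None),
]
print(count_measurements(demo_sensors))
```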
## 3. CKAN Portal Exploration

Before publishing, let's explore the CKAN portal to understand its structure and existing datasets.

In [78]:
# Initialize standalone CKAN client for exploration
if client.ckan:
    ckan = client.ckan
else:
    # Create standalone CKAN client for exploration
    ckan = CKANIntegration(ckan_url=CKAN_URL, config=ckan_config)

print(f"🌐 Exploring CKAN portal: {CKAN_URL}")

🌐 Exploring CKAN portal: http://ckan.tacc.cloud:5000


In [79]:
# List existing organizations
print("🏢 Available CKAN organizations:")
try:
    organizations = ckan.list_organizations()

    if organizations:
        print(f"Found {len(organizations)} organizations:")
        for org in organizations[:5]:  # Show first 5
            print(f"  • {org['name']}: {org['title']}")
            print(f"    Description: {(org.get('description') or 'No description')[:60]}...")
            print(f"    Packages: {org.get('package_count', 0)}")
            print()

        # Check if our target organization exists
        org_names = [org['name'] for org in organizations]
        if CKAN_ORGANIZATION in org_names:
            print(f"✅ Target organization '{CKAN_ORGANIZATION}' found!")
        else:
            print(f"⚠️  Target organization '{CKAN_ORGANIZATION}' not found.")
            print("   Publishing will use test dataset mode.")
    else:
        print("No organizations found or access restricted.")

except Exception as e:
    print(f"⚠️  Could not list organizations: {e}")
    print("Continuing with dataset publishing...")

🏢 Available CKAN organizations:
Found 1 organizations:
  • org: org
    Description: No description...
    Packages: 3

✅ Target organization 'org' found!

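For readers who want to see what sits underneath a call like `list_organizations()`: CKAN's Action API (for example `/api/3/action/organization_list`) wraps every response in an envelope with `success`, `result`, and, on failure, `error` keys. The sketch below unwraps such an envelope from an already-decoded JSON payload. How the Upstream SDK handles this internally is not shown in this notebook, so `unwrap_ckan_response` is a hypothetical helper.

```python
def unwrap_ckan_response(payload):
    """Return payload['result'] on success; raise with the CKAN error detail otherwise."""
    if payload.get("success"):
        return payload.get("result")
    error = payload.get("error") or {}
    raise RuntimeError(f"CKAN action failed: {error.get('message', error)}")


# Sample envelope shaped like an organization_list response with all_fields enabled.
sample = {
    "success": True,
    "result": [{"name": "org", "title": "org", "package_count": 3}],
}
for org in unwrap_ckan_response(sample):
    print(org["name"], org["package_count"])
```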

In [80]:
# Search for existing Upstream datasets
print("🔍 Searching for existing Upstream datasets in CKAN:")
try:
    upstream_datasets = ckan.list_datasets(
        tags=["upstream", "environmental"],
        limit=10
    )

    if upstream_datasets:
        print(f"Found {len(upstream_datasets)} Upstream-related datasets:")
        for dataset in upstream_datasets[:3]:  # Show first 3
            print(f"  • {dataset['name']}: {dataset['title']}")
            print(f"    Notes: {(dataset.get('notes') or 'No description')[:80]}...")
            print(f"    Resources: {len(dataset.get('resources', []))}")
            print(f"    Tags: {', '.join([tag['name'] for tag in dataset.get('tags', [])])}")
            print()
    else:
        print("No existing Upstream datasets found.")
        print("This will be the first Upstream dataset in this portal!")

except Exception as e:
    print(f"⚠️  Could not search datasets: {e}")
    print("Proceeding with dataset creation...")

🔍 Searching for existing Upstream datasets in CKAN:
Found 1 Upstream-related datasets:
  • upstream-campaign-1: Test Campaign 2024
    Notes: A test campaign for development purposes

**Last Updated:** 2025-07-22 09:27:19 ...
    Resources: 3
    Tags: demo, environmental, notebook-generated, sensors, upstream

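A tag search like the one above can also be expressed directly against CKAN's `package_search` action, which accepts a Solr filter query in its `fq` parameter. The sketch below builds a filter that requires every listed tag. Whether `list_datasets` combines tags with AND or OR is not documented in this notebook, so the AND semantics here are an assumption, and `build_tag_filter` is a hypothetical helper.

```python
def build_tag_filter(tags):
    """Build a Solr filter query that requires every tag in `tags`."""
    return " AND ".join(f'tags:"{tag}"' for tag in tags)


# Query parameters for GET {CKAN_URL}/api/3/action/package_search
params = {"fq": build_tag_filter(["upstream", "environmental"]), "rows": 10}
print(params["fq"])
```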


## 4. Data Export and Preparation

Before publishing to CKAN, let's export the campaign data and examine its structure.

In [81]:
# Get detailed campaign information
print(f"📊 Retrieving detailed campaign information...")
try:
    campaign_details = client.get_campaign(str(campaign_id))

    print(f"✅ Campaign Details Retrieved:")
    print(f"   Name: {campaign_details.name}")
    print(f"   Description: {campaign_details.description}")
    print(f"   Contact: {campaign_details.contact_name} ({campaign_details.contact_email})")
    print(f"   Allocation: {campaign_details.allocation}")
    print(f"   Start Date: {campaign_details.start_date}")
    print(f"   End Date: {campaign_details.end_date}")

    # Check campaign summary if available
    if hasattr(campaign_details, 'summary') and campaign_details.summary:
        summary = campaign_details.summary
        print(f"\n📈 Campaign Summary:")
        if hasattr(summary, 'total_stations'):
            print(f"   • Total Stations: {summary.total_stations}")
        if hasattr(summary, 'total_sensors'):
            print(f"   • Total Sensors: {summary.total_sensors}")
        if hasattr(summary, 'total_measurements'):
            print(f"   • Total Measurements: {summary.total_measurements}")
        if hasattr(summary, 'sensor_types'):
            print(f"   • Sensor Types: {', '.join(summary.sensor_types)}")

except Exception as e:
    print(f"❌ Error retrieving campaign details: {e}")
    raise

📊 Retrieving detailed campaign information...
✅ Campaign Details Retrieved:
   Name: Test Campaign 2024
   Description: A test campaign for development purposes
   Contact: John Doe (john.doe@example.com)
   Allocation: TEST-123
   Start Date: 2024-01-01 00:00:00
   End Date: 2024-12-31 00:00:00

📈 Campaign Summary:
   ‚Ä¢ Sensor Types: 13.1166, 13.179, 13.2128, 13.9727, 12.6297, 12.7066, 12.406, 13.2734, 12.9024, 13.6867, 12.545, 13.9101, 13.772, 13.2514, 12.912, 13.949, 14.1434, 12.7656, 12.5357, 14.1713, 13.401, 13.9604, 12.8275, 12.3783, 12.965, 12.6082, 12.9808, 12.7304, 12.7819, 12.8789, 13.3175, 12.9236, 12.5759, 13.495, 12.4756, 13.9896, 13.0106, 13.9288, 13.7623, 13.3276, 13.836, 12.6956, 13.7045, 12.4996, 13.2393, 12.3623, 13.0845, 13.305, 12.7966, 13.7982, 12.861, 12.511, 12.6785, 13.9978, 13.0306, 12.5194, 13.0589, 12.9535, 12.891, 12.8073, 13.1392, 14.1328, 13.6109, 13.2639, 14.0814, 12.6519, 13.4724, 14.0136, 12.7213, 13.2285, 13.5151, 12.4156, 13.2931, 12.9425, 1

In [89]:
# Export station data for CKAN publishing
print(f"📤 Exporting station data for CKAN publishing...")
try:
    # Export sensor configuration
    print("   Exporting sensor configuration...")
    station_sensors_data = client.stations.export_station_sensors(
        station_id=str(station_id),
        campaign_id=str(campaign_id)
    )

    # Export measurement data
    print("   Exporting measurement data...")
    station_measurements_data = client.stations.export_station_measurements(
        station_id=str(station_id),
        campaign_id=str(campaign_id)
    )

    # Check exported data sizes
    sensors_size = len(station_sensors_data.getvalue()) if hasattr(station_sensors_data, 'getvalue') else 0
    measurements_size = len(station_measurements_data.getvalue()) if hasattr(station_measurements_data, 'getvalue') else 0

    print(f"✅ Data export completed:")
    print(f"   • Sensors data: {sensors_size:,} bytes")
    print(f"   • Measurements data: {measurements_size:,} bytes")
    print(f"   • Total data size: {(sensors_size + measurements_size):,} bytes")

    if sensors_size == 0:
        print("⚠️  Warning: Sensors data is empty")
    if measurements_size == 0:
        print("⚠️  Warning: Measurements data is empty")

    print("✅ Ready for CKAN publication!")

except Exception as e:
    print(f"❌ Error exporting station data: {e}")
    raise

📤 Exporting station data for CKAN publishing...
   Exporting sensor configuration...
   Exporting measurement data...
✅ Data export completed:
   • Sensors data: 0 bytes
   • Measurements data: 3,386,767 bytes
   • Total data size: 3,386,767 bytes
✅ Ready for CKAN publication!

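The size check above reports 0 bytes for any export object that lacks a `getvalue()` method, which may understate the real size. Assuming the export functions return seekable binary streams (an assumption, since their return type is not shown here), the size can be measured with `seek` and `tell` instead. `stream_size` is a hypothetical helper.

```python
import io


def stream_size(stream):
    """Return the total byte length of a seekable binary stream, preserving its position."""
    position = stream.tell()
    try:
        # seek() returns the new absolute position, which is the stream length here.
        return stream.seek(0, io.SEEK_END)
    finally:
        stream.seek(position)


buffer = io.BytesIO(b"sensor_id,value\n1,0.5\n")
print(stream_size(buffer))
```

Restoring the original position matters because a later upload would otherwise start reading from the end of the stream and send an empty body.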

## 5. CKAN Dataset Creation and Publishing

Now let's publish the campaign data to CKAN using the integrated publishing functionality.

In [90]:
# Prepare dataset metadata
dataset_name = f"upstream-campaign-{campaign_id}"
print(f"🏷️  Preparing dataset metadata for: {dataset_name}")

# Create comprehensive metadata
dataset_metadata = {
    "name": dataset_name,
    "title": campaign_details.name,
    "notes": f"""{campaign_details.description}

This dataset contains environmental sensor data collected through the Upstream platform.

**Campaign Information:**
- Campaign ID: {campaign_id}
- Contact: {campaign_details.contact_name} ({campaign_details.contact_email})
- Allocation: {campaign_details.allocation}
- Duration: {campaign_details.start_date} to {campaign_details.end_date}

**Data Structure:**
- Sensors Configuration: Contains sensor metadata, units, and processing information
- Measurement Data: Time-series environmental measurements with geographic coordinates

**Access and Usage:**
Data is provided in CSV format for easy analysis and integration with various tools.""",
    "tags": ["environmental", "sensors", "upstream", "monitoring", "time-series"],
    "extras": [
        {"key": "campaign_id", "value": str(campaign_id)},
        {"key": "station_id", "value": str(station_id)},
        {"key": "source", "value": "Upstream Platform"},
        {"key": "data_type", "value": "environmental_sensor_data"},
        {"key": "contact_email", "value": campaign_details.contact_email},
        {"key": "allocation", "value": campaign_details.allocation},
        {"key": "export_date", "value": datetime.now().isoformat()}
    ],
    "license_id": "cc-by",  # Creative Commons Attribution
}

print(f"📋 Dataset Metadata Prepared:")
print(f"   • Name: {dataset_metadata['name']}")
print(f"   • Title: {dataset_metadata['title']}")
print(f"   • Tags: {', '.join(dataset_metadata['tags'])}")
print(f"   • License: {dataset_metadata['license_id']}")
print(f"   • Extra fields: {len(dataset_metadata['extras'])}")

🏷️  Preparing dataset metadata for: upstream-campaign-1
📋 Dataset Metadata Prepared:
   • Name: upstream-campaign-1
   • Title: Test Campaign 2024
   • Tags: environmental, sensors, upstream, monitoring, time-series
   • License: cc-by
   • Extra fields: 7

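CKAN restricts dataset names to 2 to 100 characters drawn from lowercase alphanumerics, `-`, and `_`. The name `upstream-campaign-1` already satisfies this, but names derived from free text such as a campaign title need normalizing first. The sketch below shows one way to do that; `ckan_dataset_name` is a hypothetical helper, not part of the SDK.

```python
import re


def ckan_dataset_name(raw):
    """Normalize `raw` into a CKAN-compatible dataset name, or raise ValueError."""
    lowered = raw.strip().lower()
    # Replace every run of disallowed characters with a single hyphen.
    slug = re.sub(r"[^a-z0-9_-]+", "-", lowered).strip("-")
    slug = slug[:100].rstrip("-")
    if len(slug) < 2:
        raise ValueError(f"Cannot derive a valid CKAN dataset name from {raw!r}")
    return slug


print(ckan_dataset_name("Test Campaign 2024!"))
```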


In [91]:
# Publish campaign data to CKAN using integrated method
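print(f"📤 Publishing campaign data to CKAN...")

try:
    # Debug aid: show the session headers with the Authorization token masked
    # so the credential never lands in saved notebook output.
    print({k: ("<redacted>" if k == "Authorization" else v) for k, v in client.ckan.session.headers.items()})
    # Use the integrated CKAN publishing method.
    # Note: the dataset_metadata dict prepared above is not passed to this call.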
    publication_result = client.publish_to_ckan(
        campaign_id=str(campaign_id),
        station_id=str(station_id)
    )

    print(f"✅ CKAN Publication Successful!")
    print(f"\n📊 Publication Summary:")
    print(f"   • Success: {publication_result['success']}")
    print(f"   • Dataset Name: {publication_result['dataset']['name']}")
    print(f"   • Dataset ID: {publication_result['dataset']['id']}")
    print(f"   • Resources Created: {len(publication_result['resources'])}")
    print(f"   • CKAN URL: {publication_result['ckan_url']}")
    print(f"   • Message: {publication_result['message']}")

    # Store results for further operations
    published_dataset = publication_result['dataset']
    published_resources = publication_result['resources']
    ckan_dataset_url = publication_result['ckan_url']

    print(f"\n🎉 Your data is now publicly available at:")
    print(f"   {ckan_dataset_url}")

except Exception as e:
    print(f"❌ CKAN publication failed: {e}")
    print("\nTroubleshooting tips:")
    print("   • Check CKAN API credentials")
    print("   • Verify organization permissions")
    print("   • Ensure CKAN portal is accessible")
    print("   • Check dataset name uniqueness")
    raise

📤 Publishing campaign data to CKAN...
{'User-Agent': 'python-requests/2.32.4', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'Connection': 'keep-alive', 'Authorization': '<redacted>'}
✅ CKAN Publication Successful!

📊 Publication Summary:
   • Success: True
   • Dataset Name: upstream-campaign-1
   • Dataset ID: 496cae48-2dce-44b8-a4b9-5ecdce78dd95
   • Resources Created: 2
   • CKAN URL: http://ckan.tacc.cloud:5000/dataset/upstream-campaign-1
   • Message: Campaign data published to CKAN: upstream-campaign-1

🎉 Your data is now publicly available at:
   http://ckan.tacc.cloud:5000/dataset/upstream-campaign-1


## 6. Dataset Verification and Exploration

Let's verify the published dataset and explore its contents in CKAN.

In [69]:
# Verify the published dataset
print(f"🔍 Verifying published dataset in CKAN...")

try:
    # Retrieve the dataset from CKAN to verify it was created correctly
    verified_dataset = ckan.get_dataset(published_dataset['name'])

    print(f"✅ Dataset verification successful!")
    print(f"\n📋 Dataset Information:")
    print(f"   • Name: {verified_dataset['name']}")
    print(f"   • Title: {verified_dataset['title']}")
    print(f"   • State: {verified_dataset['state']}")
    print(f"   • Private: {verified_dataset.get('private', 'Unknown')}")
    print(f"   • License: {verified_dataset.get('license_title', 'Not specified')}")
    print(f"   • Created: {verified_dataset.get('metadata_created', 'Unknown')}")
    print(f"   • Modified: {verified_dataset.get('metadata_modified', 'Unknown')}")

    # Show organization info if available
    if verified_dataset.get('organization'):
        org = verified_dataset['organization']
        print(f"   • Organization: {org.get('title', org.get('name', 'Unknown'))}")

    # Show tags
    if verified_dataset.get('tags'):
        tags = [tag['name'] for tag in verified_dataset['tags']]
        print(f"   • Tags: {', '.join(tags)}")

    # Show extras
    if verified_dataset.get('extras'):
        print(f"   • Extra metadata fields: {len(verified_dataset['extras'])}")
        for extra in verified_dataset['extras'][:3]:  # Show first 3
            print(f"     - {extra['key']}: {extra['value']}")

except Exception as e:
    print(f"❌ Dataset verification failed: {e}")

🔍 Verifying published dataset in CKAN...
✅ Dataset verification successful!

📋 Dataset Information:
   • Name: upstream-campaign-1
   • Title: Test Campaign 2024
   • State: active
   • Private: False
   • License: None
   • Created: 2025-07-22T13:26:30.140218
   • Modified: 2025-07-22T13:26:31.159425
   • Organization: org
   • Tags: environmental, sensors, upstream
   • Extra metadata fields: 3
     - campaign_id: 1
     - data_type: environmental_sensor_data
     - source: Upstream Platform

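CKAN returns `extras` as a list of `{"key": ..., "value": ...}` pairs, which is awkward for lookups. A small conversion to a plain dictionary makes checks such as "was this dataset published for campaign 1?" straightforward. The sketch below uses the three extras shown in the output above; `extras_to_dict` is a hypothetical helper.

```python
def extras_to_dict(dataset):
    """Flatten a CKAN dataset's extras list into a key/value dictionary."""
    return {item["key"]: item["value"] for item in dataset.get("extras", [])}


sample_dataset = {
    "name": "upstream-campaign-1",
    "extras": [
        {"key": "campaign_id", "value": "1"},
        {"key": "data_type", "value": "environmental_sensor_data"},
        {"key": "source", "value": "Upstream Platform"},
    ],
}
extras = extras_to_dict(sample_dataset)
print(extras.get("campaign_id"))
```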

In [70]:
# Examine the published resources
print(f"📁 Examining published resources...")

try:
    resources = verified_dataset.get('resources', [])

    if resources:
        print(f"Found {len(resources)} resources:")

        for i, resource in enumerate(resources, 1):
            print(f"\n   📄 Resource {i}: {resource['name']}")
            print(f"      • ID: {resource['id']}")
            print(f"      • Format: {resource.get('format', 'Unknown')}")
            print(f"      • Size: {resource.get('size', 'Unknown')} bytes")
            print(f"      • Description: {resource.get('description', 'No description')}")
            print(f"      • Created: {resource.get('created', 'Unknown')}")
            print(f"      • URL: {resource.get('url', 'Not available')}")

            # Show download information
            if resource.get('url'):
                download_url = resource['url']
                if not download_url.startswith('http'):
                    download_url = f"{CKAN_URL}{download_url}"
                print(f"      • Download: {download_url}")

        print(f"\n✅ All resources published successfully!")

    else:
        print("⚠️  No resources found in the dataset")

except Exception as e:
    print(f"❌ Error examining resources: {e}")

📁 Examining published resources...
Found 2 resources:

   📄 Resource 1: Sensors Configuration
      • ID: 06fc0c44-bd8e-408e-b8a3-50b84338e5ba
      • Format: CSV
      • Size: 5502 bytes
      • Description: Sensor configuration and metadata
      • Created: 2025-07-22T13:26:30.333154
      • URL: http://ckan.tacc.cloud:5000/dataset/496cae48-2dce-44b8-a4b9-5ecdce78dd95/resource/06fc0c44-bd8e-408e-b8a3-50b84338e5ba/download/uploaded_file
      • Download: http://ckan.tacc.cloud:5000/dataset/496cae48-2dce-44b8-a4b9-5ecdce78dd95/resource/06fc0c44-bd8e-408e-b8a3-50b84338e5ba/download/uploaded_file

   📄 Resource 2: Measurement Data
      • ID: 8fd5f872-6fa9-4b5a-809b-325ecc761cbd
      • Format: CSV
      • Size: 3386767 bytes
      • Description: Environmental sensor measurements
      • Created: 2025-07-22T13:26:30.817944
      • URL: http://ckan.tacc.cloud:5000/dataset/496cae48-2dce-44b8-a4b9-5ecdce78dd95/resource/8fd5f872-6fa9-4b5a-809b-325ecc761cbd/

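The resource cell builds download links by concatenating `CKAN_URL` with any URL that does not start with `http`. An alternative sketch uses the standard library's `urljoin`, which copes with a missing or doubled slash and leaves absolute URLs untouched. `absolute_url` is a hypothetical helper.

```python
from urllib.parse import urljoin


def absolute_url(base_url, resource_url):
    """Resolve `resource_url` against `base_url`; absolute URLs are returned unchanged."""
    return urljoin(base_url.rstrip("/") + "/", resource_url)


print(absolute_url("http://ckan.tacc.cloud:5000", "/dataset/abc/download/file.csv"))
```

One difference from plain concatenation: if the portal is served under a path prefix, a resource URL starting with `/` resolves against the host root rather than the prefix, so check which behavior your portal expects.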
## 7. Dataset Management Operations

Let's demonstrate additional CKAN management operations like updating datasets and managing resources.

In [71]:
# Update dataset with additional metadata
from datetime import timezone  # for a genuine UTC timestamp in the notes below

print(f"🔄 Demonstrating dataset update operations...")

try:
    # Add the demo tags, skipping any already present so that re-runs do not duplicate them
    current_tags = [tag['name'] for tag in verified_dataset.get('tags', [])]
    updated_tags = list(dict.fromkeys(current_tags + ["demo", "notebook-generated"]))

    # Update the dataset
    updated_dataset = ckan.update_dataset(
        dataset_id=published_dataset['name'],
        tags=updated_tags,
        notes=f"{verified_dataset.get('notes', '')}\n\n**Last Updated:** {datetime.now(timezone.utc).strftime('%Y-%m-%d %H:%M:%S UTC')} (via Upstream SDK Demo)"
    )

    print(f"✅ Dataset updated successfully!")
    print(f"   • New tags added: demo, notebook-generated")
    print(f"   • Description updated with timestamp")
    print(f"   • Total tags: {len(updated_dataset.get('tags', []))}")

except Exception as e:
    print(f"⚠️  Dataset update failed: {e}")
    print("This may be due to insufficient permissions or CKAN configuration.")

🔄 Demonstrating dataset update operations...
✅ Dataset updated successfully!
   • New tags added: demo, notebook-generated
   • Description updated with timestamp
   • Total tags: 5

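Each run of the update cell appends another "Last Updated" line to the dataset notes, so the description grows over time. One possible approach is to strip any earlier stamp before adding a fresh one, as in this sketch. `stamp_notes` is a hypothetical helper, and the stamp format simply mirrors the one used above.

```python
import re
from datetime import datetime, timezone

# Matches a previous stamp together with the blank lines that precede it.
STAMP_PATTERN = re.compile(r"\n*\*\*Last Updated:\*\*[^\n]*")


def stamp_notes(notes, now=None):
    """Return `notes` with exactly one trailing 'Last Updated' line in UTC."""
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    base = STAMP_PATTERN.sub("", notes or "").rstrip()
    stamp = f"**Last Updated:** {now.strftime('%Y-%m-%d %H:%M:%S UTC')}"
    return f"{base}\n\n{stamp}" if base else stamp


print(stamp_notes("A test campaign for development purposes"))
```

Passing the result as `notes=` keeps the update idempotent: applying the helper twice yields a single stamp.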

In [72]:
# Demonstrate resource management
print(f"📎 Demonstrating resource management...")

try:
    # Create a metadata resource with campaign summary
    metadata_content = {
        "campaign_info": {
            "id": str(campaign_id),
            "name": campaign_details.name,
            "description": campaign_details.description,
            "contact": {
                "name": campaign_details.contact_name,
                "email": campaign_details.contact_email
            },
            "allocation": campaign_details.allocation,
            "dates": {
                "start": str(campaign_details.start_date),
                "end": str(campaign_details.end_date)
            }
        },
        "station_info": {
            "id": str(station_id),
            "name": selected_station.name,
            "description": selected_station.description
        },
        "export_info": {
            "timestamp": datetime.now().isoformat(),
            "sdk_version": "1.0.0",
            "format_version": "1.0"
        }
    }

    # Create a JSON metadata file
    metadata_json = json.dumps(metadata_content, indent=2)
    metadata_file = BytesIO(metadata_json.encode('utf-8'))
    metadata_file.name = "campaign_metadata.json"

    # Add as a resource
    metadata_resource = ckan.create_resource(
        dataset_id=published_dataset['id'],
        name="Campaign Metadata",
        file_obj=metadata_file,
        format="JSON",
        description="Comprehensive metadata about the campaign, station, and export process",
        resource_type="metadata"
    )

    print(f"✅ Metadata resource created successfully!")
    print(f"   • Resource ID: {metadata_resource['id']}")
    print(f"   • Name: {metadata_resource['name']}")
    print(f"   • Format: {metadata_resource['format']}")
    # Measure the encoded payload: len() of the str counts characters, not bytes
    print(f"   • Size: {len(metadata_json.encode('utf-8'))} bytes")

except Exception as e:
    print(f"⚠️  Resource creation failed: {e}")
    print("This may be due to insufficient permissions or CKAN configuration.")

📎 Demonstrating resource management...
✅ Metadata resource created successfully!
   • Resource ID: f1522ba6-2086-4743-a209-faf616e9c1d6
   • Name: Campaign Metadata
   • Format: JSON
   • Size: 624 bytes


## 8. Data Discovery and Search

Let's demonstrate how published data can be discovered and searched in CKAN.

In [73]:
# Search for datasets using various criteria
print(f"🔍 Demonstrating CKAN data discovery capabilities...")

# Search by tags
print(f"\n1. 📌 Search by tags ('environmental', 'upstream'):")
try:
    tag_results = ckan.list_datasets(
        tags=["environmental", "upstream"],
        limit=5
    )

    if tag_results:
        print(f"   Found {len(tag_results)} datasets with environmental/upstream tags:")
        for dataset in tag_results:
            print(f"   • {dataset['name']}: {dataset['title']}")
            tags = [tag['name'] for tag in dataset.get('tags', [])]
            print(f"     Tags: {', '.join(tags)}")
    else:
        print("   No datasets found with these tags")

except Exception as e:
    print(f"   ❌ Tag search failed: {e}")

🔍 Demonstrating CKAN data discovery capabilities...

1. 📌 Search by tags ('environmental', 'upstream'):
   Found 1 datasets with environmental/upstream tags:
   • upstream-campaign-1: Test Campaign 2024
     Tags: demo, environmental, notebook-generated, sensors, upstream


In [77]:
# Search by organization (if configured)
if CKAN_ORGANIZATION:
    print(f"\n2. 🏢 Search by organization ('{CKAN_ORGANIZATION}'):")
    try:
        org_results = ckan.list_datasets(
            organization=CKAN_ORGANIZATION,
            limit=5
        )

        if org_results:
            print(f"   Found {len(org_results)} datasets in organization:")
            for dataset in org_results:
                print(f"   • {dataset['name']}: {dataset['title']}")
                if dataset.get('organization'):
                    org = dataset['organization']
                    print(f"     Organization: {org.get('title', org.get('name'))}")
        else:
            print(f"   No datasets found in organization '{CKAN_ORGANIZATION}'")

    except Exception as e:
        print(f"   ❌ Organization search failed: {e}")
else:
    print(f"\n2. 🏢 Organization search skipped (no organization configured)")


2. 🏢 Search by organization ('org'):
   No datasets found in organization 'org'


In [75]:
# General dataset search
print(f"\n3. 📊 General dataset search:")
try:
    general_results = ckan.list_datasets(limit=10)

    if general_results:
        print(f"   Found {len(general_results)} total datasets (showing first 10):")
        for i, dataset in enumerate(general_results[:5], 1):
            print(f"   {i}. {dataset['name']}")
            print(f"      Title: {dataset['title']}")
            print(f"      Resources: {len(dataset.get('resources', []))}")
            if dataset.get('organization'):
                org = dataset['organization']
                print(f"      Organization: {org.get('title', org.get('name'))}")
            print()

        if len(general_results) > 5:
            print(f"   ... and {len(general_results) - 5} more datasets")
    else:
        print("   No datasets found")

except Exception as e:
    print(f"   ❌ General search failed: {e}")


3. 📊 General dataset search:
   Found 3 total datasets (showing first 10):
   1. upstream-campaign-1
      Title: Test Campaign 2024
      Resources: 3
      Organization: org

   2. test-dataset-integration3
      Title: test-dataset-integration3
      Resources: 0
      Organization: org

   3. test-dataset-integration2
      Title: test-dataset-integration2
      Resources: 0
      Organization: org
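
The tag and organization filters used above can also be expressed directly against CKAN's `package_search` action, which accepts `q`, `fq`, and `rows` parameters. The sketch below only builds the parameter dictionary; it does not call any portal. The function name `build_search_params` is hypothetical, and treating multiple tags as an AND filter is an assumption, since the SDK's own `list_datasets` behaviour is not shown in this notebook.

```python
from typing import Dict, List, Optional


def build_search_params(
    tags: Optional[List[str]] = None,
    organization: Optional[str] = None,
    limit: int = 10,
) -> Dict[str, object]:
    """Build query parameters for CKAN's package_search action (illustrative only)."""
    filters = []
    # Assumption: every requested tag must be present (AND semantics).
    for tag in tags or []:
        filters.append(f'tags:"{tag}"')
    if organization:
        filters.append(f'organization:"{organization}"')

    # "*:*" matches all datasets; "rows" caps the number of results returned.
    params: Dict[str, object] = {"q": "*:*", "rows": limit}
    if filters:
        params["fq"] = " AND ".join(filters)
    return params


params = build_search_params(tags=["environmental", "upstream"], limit=5)
print(params)
```

A dictionary like this could be passed as query parameters to `/api/3/action/package_search` on a portal, which is useful for checking what a search should return independently of the SDK.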



## 9. Best Practices and Advanced Features

Let's explore best practices for CKAN integration and advanced features.

In [None]:
# Demonstrate data validation and quality checks
print(f"💡 CKAN Integration Best Practices:")

print(f"\n1. 📋 Dataset Naming Conventions:")
print(f"   • Use consistent prefixes (e.g., 'upstream-campaign-{campaign_id}')")
print(f"   • Include version information for updated datasets")
print(f"   • Use lowercase and hyphens for URL-friendly names")
print(f"   • Example: upstream-campaign-{campaign_id}-v2")

print(f"\n2. 🏷️  Metadata Best Practices:")
print(f"   • Use comprehensive descriptions with context")
print(f"   • Include contact information and data lineage")
print(f"   • Add standardized tags for discoverability")
print(f"   • Use extras for machine-readable metadata")
print(f"   • Specify appropriate licenses")

print(f"\n3. 📁 Resource Organization:")
print(f"   • Separate data files by type (sensors, measurements, metadata)")
print(f"   • Use descriptive resource names and descriptions")
print(f"   • Include format specifications (CSV headers, units)")
print(f"   • Provide data dictionaries for complex datasets")

print(f"\n4. 🔄 Update Management:")
print(f"   • Version datasets when structure changes")
print(f"   • Update modification timestamps")
print(f"   • Maintain backward compatibility when possible")
print(f"   • Document changes in dataset descriptions")
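
The naming conventions above can be enforced with a small helper. This is a minimal sketch, not part of the Upstream SDK: `make_dataset_name` is a hypothetical function, and the 2 to 100 character limit reflects CKAN's default dataset name validation, which individual portals may customize.

```python
import re


def make_dataset_name(prefix: str, campaign_id, version=None) -> str:
    """Build a lowercase, hyphenated name such as 'upstream-campaign-1-v2'."""
    parts = [prefix, str(campaign_id)]
    if version is not None:
        parts.append(f"v{version}")
    raw = "-".join(parts).lower()

    # Replace anything outside a-z, 0-9, '-' and '_' with a hyphen.
    slug = re.sub(r"[^a-z0-9_-]+", "-", raw)
    # Collapse runs of hyphens and trim them from both ends.
    slug = re.sub(r"-{2,}", "-", slug).strip("-")

    if not 2 <= len(slug) <= 100:
        raise ValueError(f"Dataset name must be 2-100 characters, got {len(slug)}")
    return slug


print(make_dataset_name("upstream-campaign", 1))
print(make_dataset_name("Upstream Campaign", 1, version=2))
```

Generating names in one place keeps the prefix and version suffix consistent between the first publication and later updates.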

In [None]:
# Performance and monitoring considerations
print(f"\n⚡ Performance and Monitoring:")

# Check dataset and resource sizes
total_resources = len(verified_dataset.get('resources', []))
total_size = sum(int(r.get('size', 0)) for r in verified_dataset.get('resources', []) if r.get('size'))

print(f"\n📊 Current Dataset Metrics:")
print(f"   • Total Resources: {total_resources}")
print(f"   • Total Size: {total_size:,} bytes ({total_size/1024/1024:.2f} MB)")
print(f"   • Average Resource Size: {(total_size/total_resources)/1024:.1f} KB" if total_resources > 0 else "   • No resources with size information")

print(f"\n💡 Optimization Recommendations:")
if total_size > 50 * 1024 * 1024:  # 50 MB
    print(f"   ⚠️  Large dataset detected ({total_size/1024/1024:.1f} MB)")
    print(f"   • Consider data compression")
    print(f"   • Split into smaller time-based chunks")
    print(f"   • Use streaming for large file processing")
else:
    print(f"   ✅ Dataset size is reasonable ({total_size/1024/1024:.1f} MB)")

if total_resources > 10:
    print(f"   ⚠️  Many resources ({total_resources})")
    print(f"   • Consider consolidating related resources")
    print(f"   • Use clear naming conventions")
else:
    print(f"   ✅ Resource count is manageable ({total_resources})")

print(f"\n🔍 Monitoring Recommendations:")
print(f"   • Monitor dataset access patterns")
print(f"   • Track resource download statistics")
print(f"   • Set up automated data freshness checks")
print(f"   • Implement data quality validation pipelines")
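
The metrics above divide the total size by the number of all resources, including those for which CKAN reports no size. A possible refinement, sketched here with a hypothetical `summarize_resources` helper, averages only over resources that carry a usable `size` value. It assumes each resource is a dictionary shaped like the entries in `verified_dataset['resources']`.

```python
from typing import Any, Dict, List


def summarize_resources(resources: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Summarize resource sizes, skipping entries without a usable 'size'."""
    sizes = []
    for resource in resources:
        size = resource.get("size")
        if size in (None, ""):
            continue
        try:
            sizes.append(int(size))
        except (TypeError, ValueError):
            # Ignore malformed size values rather than failing the whole summary.
            continue

    total = sum(sizes)
    return {
        "resource_count": len(resources),
        "sized_count": len(sizes),
        "total_bytes": total,
        "average_bytes": total / len(sizes) if sizes else None,
    }


sample = [{"size": "1024"}, {"size": None}, {"name": "no-size"}, {"size": 2048}]
summary = summarize_resources(sample)
print(summary)
```

Returning `None` for the average when no sizes are known makes the missing-data case explicit instead of reporting 0 KB.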

## 10. Integration Workflows

Let's demonstrate automated workflows for continuous data publishing.

In [None]:
# Demonstrate automated publishing workflow
print(f"🔄 Automated CKAN Publishing Workflow:")

def automated_campaign_publisher(client, campaign_id, station_id=None, update_existing=True):
    """
    Automated workflow for publishing campaign data to CKAN.

    This function demonstrates a complete workflow that could be
    automated for regular data publishing.
    """
    workflow_steps = []

    try:
        # Step 1: Validate campaign
        workflow_steps.append("Validating campaign data...")
        print(f"   1️⃣  Validating campaign {campaign_id}...")
        campaign = client.get_campaign(str(campaign_id))

        # Step 2: Get stations
        workflow_steps.append("Retrieving station information...")
        print(f"   2️⃣  Retrieving stations...")
        stations = client.list_stations(campaign_id=str(campaign_id))

        if not stations.items:
            raise Exception("No stations found in campaign")

        # Compare IDs as strings so an int/str mismatch does not hide a valid station
        target_station = stations.items[0] if not station_id else next(
            (s for s in stations.items if str(s.id) == str(station_id)), None
        )

        if not target_station:
            raise Exception(f"Station {station_id} not found")

        # Step 3: Check for existing dataset
        workflow_steps.append("Checking for existing CKAN dataset...")
        print(f"   3️⃣  Checking existing datasets...")
        dataset_name = f"upstream-campaign-{campaign_id}"

        dataset_exists = False
        try:
            existing_dataset = client.ckan.get_dataset(dataset_name)
            dataset_exists = True
            print(f"       Found existing dataset: {dataset_name}")
        except Exception:
            # A failed lookup is treated as "dataset does not exist yet"
            print(f"       No existing dataset found")

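        # Step 4: Publish or update
        if dataset_exists and not update_existing:
            # Honor update_existing=False instead of silently republishing
            workflow_steps.append("Refusing to overwrite existing dataset...")
            raise RuntimeError(
                f"Dataset {dataset_name} exists and update_existing is False"
            )
        if dataset_exists:
            workflow_steps.append("Updating existing dataset...")
            print(f"   4️⃣  Updating existing dataset...")
        else:
            workflow_steps.append("Creating new dataset...")
            print(f"   4️⃣  Creating new dataset...")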

        # Step 5: Publish data
        workflow_steps.append("Publishing data to CKAN...")
        print(f"   5️⃣  Publishing campaign data...")
        result = client.publish_to_ckan(
            campaign_id=str(campaign_id),
            station_id=str(target_station.id)
        )

        # Step 6: Validation
        workflow_steps.append("Validating published dataset...")
        print(f"   6️⃣  Validating publication...")

        return {
            "success": True,
            "dataset_name": dataset_name,
            "ckan_url": result['ckan_url'],
            "steps_completed": len(workflow_steps),
            "workflow_steps": workflow_steps
        }

    except Exception as e:
        # Each step is appended before it runs, so the last entry is the step that failed
        return {
            "success": False,
            "error": str(e),
            "steps_completed": max(len(workflow_steps) - 1, 0),
            "workflow_steps": workflow_steps,
            "failed_at_step": len(workflow_steps)
        }

# Run the workflow demonstration
print(f"\n🚀 Running automated workflow for campaign {campaign_id}...")
workflow_result = automated_campaign_publisher(
    client=client,
    campaign_id=campaign_id,
    station_id=station_id,
    update_existing=True
)

print(f"\n📋 Workflow Results:")
print(f"   • Success: {workflow_result['success']}")
print(f"   • Steps Completed: {workflow_result['steps_completed']}")

if workflow_result['success']:
    print(f"   • Dataset: {workflow_result['dataset_name']}")
    print(f"   • URL: {workflow_result['ckan_url']}")
    print(f"   ✅ Automated publishing workflow completed successfully!")
else:
    print(f"   • Error: {workflow_result['error']}")
    print(f"   • Failed at step: {workflow_result['failed_at_step']}")
    print(f"   ❌ Workflow failed - see error details above")

## 11. Cleanup and Resource Management

Let's demonstrate proper cleanup and resource management.

In [None]:
# Dataset management options
print(f"🧹 Dataset Management and Cleanup Options:")

print(f"\n📊 Current Dataset Status:")
print(f"   • Dataset Name: {published_dataset['name']}")
print(f"   • Dataset ID: {published_dataset['id']}")
print(f"   • CKAN URL: {ckan_dataset_url}")
print(f"   • Resources: {len(published_resources)}")

print(f"\n🔧 Management Options:")
print(f"   1. Keep dataset active (recommended for production)")
print(f"   2. Make dataset private (hide from public)")
print(f"   3. Archive dataset (mark as deprecated)")
print(f"   4. Delete dataset (only for test data)")

# For demo purposes, we'll show how to manage the dataset
print(f"\n💡 For this demo, we'll keep the dataset active.")
print(f"   Your published data will remain available at:")
print(f"   {ckan_dataset_url}")

# Uncomment the following section if you want to delete the demo dataset
"""
# CAUTION: Uncomment only for cleanup of test datasets
print(f"\n⚠️  Demo dataset cleanup:")
try:
    # Delete the demo dataset (only for demo purposes)
    deletion_result = ckan.delete_dataset(published_dataset['name'])
    if deletion_result:
        print(f"   ✅ Demo dataset deleted successfully")
    else:
        print(f"   ❌ Dataset deletion failed")
except Exception as e:
    print(f"   ⚠️  Could not delete dataset: {e}")
    print(f"   This may be due to insufficient permissions or CKAN configuration.")
"""

print(f"\n🔄 Resource Cleanup:")
try:
    # Close any open file handles
    if 'station_sensors_data' in locals():
        station_sensors_data.close()
    if 'station_measurements_data' in locals():
        station_measurements_data.close()
    if 'metadata_file' in locals():
        metadata_file.close()

    print(f"   ✅ File handles closed")
except Exception as e:
    print(f"   ⚠️  Error closing file handles: {e}")

In [None]:
# Logout and final cleanup
print(f"👋 Session cleanup and logout...")

try:
    # Logout from Upstream
    client.logout()
    print(f"   ✅ Logged out from Upstream successfully")
except Exception as e:
    print(f"   ❌ Logout error: {e}")

print(f"\n🎉 CKAN Integration Demo Completed Successfully!")

print(f"\n📚 Summary of What We Accomplished:")
print(f"   ✅ Connected to both Upstream and CKAN platforms")
print(f"   ✅ Selected and validated campaign data")
print(f"   ✅ Exported sensor and measurement data")
print(f"   ✅ Created comprehensive CKAN dataset with metadata")
print(f"   ✅ Published resources (sensors, measurements, metadata)")
print(f"   ✅ Demonstrated dataset management operations")
print(f"   ✅ Explored data discovery and search capabilities")
print(f"   ✅ Showed automated publishing workflows")

print(f"\n🌐 Your Data is Now Publicly Available:")
print(f"   📊 Dataset: {published_dataset['name']}")
print(f"   🔗 URL: {ckan_dataset_url}")
print(f"   📁 Resources: {len(published_resources)} files available for download")

print(f"\n📖 Next Steps:")
print(f"   • Explore your published data in the CKAN web interface")
print(f"   • Set up automated publishing workflows for production")
print(f"   • Configure organization permissions and access controls")
print(f"   • Integrate CKAN APIs with other data analysis tools")
print(f"   • Monitor dataset usage and access patterns")

## Summary

This notebook demonstrated the comprehensive CKAN integration capabilities of the Upstream SDK:

✅ **Authentication & Setup** - Configured both Upstream and CKAN credentials  
✅ **Data Export** - Retrieved campaign data and prepared for publishing  
✅ **Dataset Creation** - Created CKAN datasets with rich metadata  
✅ **Resource Management** - Published multiple data resources (sensors, measurements, metadata)  
✅ **Portal Exploration** - Discovered existing datasets and organizations  
✅ **Update Operations** - Demonstrated dataset and resource updates  
✅ **Search & Discovery** - Showed data findability through tags and organization  
✅ **Automation Workflows** - Built reusable publishing processes  
✅ **Best Practices** - Covered naming, metadata, and performance considerations  

## Key Features

- **Seamless Integration**: Direct connection between Upstream campaigns and CKAN datasets
- **Rich Metadata**: Automatic generation of comprehensive dataset descriptions and tags
- **Multi-Resource Support**: Separate resources for sensors, measurements, and metadata
- **Update Management**: Smart handling of dataset updates and versioning
- **Error Handling**: Robust error handling and validation throughout the process
- **Automation Ready**: Workflow patterns suitable for production automation

## Production Considerations

- **Authentication**: Use environment variables or configuration files for credentials
- **Monitoring**: Implement logging and monitoring for automated publishing workflows
- **Permissions**: Configure appropriate CKAN organization permissions and access controls
- **Validation**: Add comprehensive data validation before publishing
- **Backup**: Maintain backup copies of datasets before updates
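
As an illustration of the authentication point above, credentials can be read from environment variables and checked up front, so a scheduled job fails early with a clear message. This is a minimal sketch: the variable names (`UPSTREAM_USERNAME`, `UPSTREAM_PASSWORD`, `CKAN_URL`, `CKAN_API_KEY`) and the `load_credentials` helper are hypothetical and not defined by the Upstream SDK.

```python
import os
from typing import Dict, Mapping

# Hypothetical variable names, chosen for this sketch only.
REQUIRED_VARS = {
    "upstream_username": "UPSTREAM_USERNAME",
    "upstream_password": "UPSTREAM_PASSWORD",
    "ckan_url": "CKAN_URL",
    "ckan_api_key": "CKAN_API_KEY",
}


def load_credentials(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return credentials from the environment, failing if any are missing."""
    missing = [var for var in REQUIRED_VARS.values() if not env.get(var)]
    if missing:
        raise RuntimeError(
            "Missing environment variables: " + ", ".join(sorted(missing))
        )
    return {key: env[var] for key, var in REQUIRED_VARS.items()}


# Demonstration with an explicit mapping instead of the real environment.
example_env = {
    "UPSTREAM_USERNAME": "demo-user",
    "UPSTREAM_PASSWORD": "not-a-real-password",
    "CKAN_URL": "https://ckan.example.org",
    "CKAN_API_KEY": "not-a-real-key",
}
creds = load_credentials(example_env)
print(sorted(creds))
```

In a production job, calling `load_credentials()` with no argument would read the real process environment, keeping secrets out of the notebook and out of version control.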

## Related Documentation

- [Upstream SDK Documentation](https://upstream-sdk.readthedocs.io/)
- [CKAN API Documentation](https://docs.ckan.org/en/latest/api/)
- [Environmental Data Publishing Best Practices](https://www.example.com/best-practices)

---

*This notebook demonstrates CKAN integration for the Upstream SDK. For core platform functionality, see UpstreamSDK_Core_Demo.ipynb*