# Using the TimeDB API

This notebook demonstrates how to use the TimeDB REST API to read and write time series data.

**Note**: This example assumes no user authentication (users_table is not created). In production, you would typically use authentication with API keys.

## What we'll cover:
1. Setting up the database schema (using SDK - admin task)
2. Starting the API server
3. Inserting time series data using the REST API
4. Reading time series data using the REST API
5. Updating records using the REST API

In [1]:
from timedb import TimeDataClient
import pandas as pd
import requests
import json
from datetime import datetime, timezone, timedelta
from typing import Dict, Any

# Create TimeDB client
td = TimeDataClient()

# API base URL (adjust if your API is running on a different host/port)
API_BASE_URL = "http://127.0.0.1:8000"
print("✓ Imports successful")

✓ Imports successful


## Part 1: Setup Database Schema

First, we'll use the SDK to create the database schema. This is typically done once by an administrator. The API cannot create or delete the database schema - this must be done through the SDK or CLI.

In [2]:
# Delete any existing schema so we start from a clean database.
# WARNING: this drops the existing schema; comment out the line below to keep it.
td.delete()

# Create database schema
td.create()

Creating database schema...
✓ Schema created successfully


## Part 2: Start the API Server

Before we can use the API, we need to start the API server.

**Note**: The API server blocks the process it runs in, so in a notebook we start it as a background process and keep working in the same session.

In [3]:
# Start the API server in a separate terminal:
# timedb api --host 127.0.0.1 --port 8000

# Or using subprocess (for notebook use):
import subprocess
import time

# Kill any existing API server
subprocess.run(["pkill", "-f", "uvicorn.*timedb"], capture_output=True)
time.sleep(1)

# Start API server in background
process = subprocess.Popen(
    ["timedb", "api", "--host", "127.0.0.1", "--port", "8000"],
    stdout=subprocess.DEVNULL,
    stderr=subprocess.DEVNULL
)
time.sleep(3)  # Wait for server to start

# Check if API is running (fail fast on connection errors and non-2xx responses)
try:
    response = requests.get(f"{API_BASE_URL}/", timeout=5)
    response.raise_for_status()
    info = response.json()
    print("✓ API is running")
    print(f"  Name: {info.get('name', 'unknown')}")
    print(f"  Version: {info.get('version', 'unknown')}")
except requests.RequestException as e:
    print(f"❌ API not running: {e}")

✓ API is running
  Name: TimeDB API
  Version: 0.1.1
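
A fixed `time.sleep(3)` can be too short on a slow machine and needlessly long on a fast one. The sketch below polls the root endpoint until it answers or a deadline passes. The helper names `http_probe` and `wait_for_api` are hypothetical (not part of TimeDB), and the code uses only the standard library so it can run anywhere.

```python
import http.client
import time
import urllib.request
from typing import Callable


def http_probe(url: str) -> bool:
    """Return True if a GET request to `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=2) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error status, malformed reply, ...
        return False


def wait_for_api(
    url: str,
    timeout: float = 15.0,
    interval: float = 0.25,
    probe: Callable[[str], bool] = http_probe,
) -> bool:
    """Poll `url` until `probe` succeeds or `timeout` seconds have elapsed."""
    deadline = time.monotonic() + timeout
    while True:
        if probe(url):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

In the cell above, `wait_for_api(f"{API_BASE_URL}/")` could stand in for `time.sleep(3)`; a `False` return value means the server did not come up in time.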


## Part 3: Insert Data Using the API

Now let's create some sample time series data and insert it using the REST API.

In [4]:
# Headers for API requests (no authentication in this example)
headers = {"Content-Type": "application/json"}

In [None]:
# First, create the time series using the /series endpoint
# Use data_class='overlapping' so we can demonstrate updates later
# (updates only work on overlapping, not flat)
series_to_create = [
    {
        "name": "temperature",
        "description": "Temperature measurements in Celsius",
        "unit": "celsius",
        "labels": {"location": "office"},
        "data_class": "overlapping"
    },
    {
        "name": "humidity",
        "description": "Relative humidity percentage",
        "unit": "percent",
        "labels": {"location": "office"},
        "data_class": "overlapping"
    }
]

created_series = {}
for series_info in series_to_create:
    response = requests.post(
        f"{API_BASE_URL}/series",
        json=series_info,
        headers=headers
    )
    response.raise_for_status()
    result = response.json()
    series_name = series_info["name"]
    created_series[series_name] = result["series_id"]
    print(f"✓ Created series '{series_name}': {result['series_id']}")
    print(f"  Message: {result['message']}")

print(f"\n✓ Created {len(created_series)} time series")

In [None]:
# Create sample time series data
base_time = datetime(2025, 1, 1, 0, 0, tzinfo=timezone.utc)
dates = [base_time + timedelta(hours=i) for i in range(24)]

# Prepare request payload for API
# Include series_id to reference the pre-created series (avoids duplicate creation)
value_rows = []
for i, date in enumerate(dates):
    # Add temperature value
    value_rows.append({
        "valid_time": date.isoformat(),
        "value_key": "temperature",
        "series_id": created_series["temperature"],
        "value": 20.0 + i * 0.3  # Temperature rising
    })
    # Add humidity value
    value_rows.append({
        "valid_time": date.isoformat(),
        "value_key": "humidity",
        "series_id": created_series["humidity"],
        "value": 60.0 - i * 0.5  # Humidity decreasing
    })

# Create batch request with batch_start_time
create_batch_request = {
    "batch_start_time": datetime.now(timezone.utc).isoformat(),
    "value_rows": value_rows
}

print(f"Prepared {len(value_rows)} value rows to insert")
print(f"Time range: {dates[0]} to {dates[-1]}")
print(f"Series: {', '.join(created_series.keys())}")
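
The row-building loop above can be factored into a small reusable helper. This is a minimal sketch: `make_value_rows` is a hypothetical name, the row fields (`valid_time`, `value_key`, `series_id`, `value`) are copied from the cell above, and rejecting naive datetimes is a precaution taken in this sketch rather than a documented API requirement.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Sequence


def make_value_rows(
    series_ids: Dict[str, str],
    start: datetime,
    values: Dict[str, Sequence[float]],
    step: timedelta = timedelta(hours=1),
) -> List[dict]:
    """Build upload rows for evenly spaced values, one list of values per series name."""
    if start.tzinfo is None:
        # Assumption of this sketch: only timezone-aware timestamps are sent.
        raise ValueError("start must be timezone-aware")
    rows = []
    for name, series_values in values.items():
        for i, value in enumerate(series_values):
            rows.append({
                "valid_time": (start + i * step).isoformat(),
                "value_key": name,
                "series_id": series_ids[name],
                "value": float(value),
            })
    return rows


# Example with placeholder series IDs (real IDs come from POST /series)
example_rows = make_value_rows(
    {"temperature": "id-temperature"},
    datetime(2025, 1, 1, tzinfo=timezone.utc),
    {"temperature": [20.0, 20.3]},
)
```

The resulting list can be used as the `value_rows` entry of the batch request.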

### 3.1: Upload the Data

Now let's upload the time series data using the `/upload` endpoint.

In [7]:
# Upload data via API
response = requests.post(
    f"{API_BASE_URL}/upload",
    json=create_batch_request,
    headers=headers
)
response.raise_for_status()

result = response.json()
print(f"✓ Created batch with ID: {result['batch_id']}")
print(f"  Message: {result['message']}")
print(f"\nSeries IDs returned:")
for series_name, series_id in result['series_ids'].items():
    print(f"  {series_name}: {series_id}")

# Store batch_id and series_ids for later use
batch_id = result['batch_id']
series_ids = result['series_ids']  # Maps series name -> series_id

✓ Created batch with ID: cc941cc2-c68c-44a0-a366-61c4a154ee01
  Message: Batch created successfully

Series IDs returned:
  temperature: 8aefded6-b343-4000-9adf-baf9c961017c
  humidity: 09f74032-7ed8-4e6a-8dd9-8647425a7b36


### 3.2: List All Time Series

After uploading data, you can list all available time series to see their metadata.

In [8]:
# List all time series
response = requests.get(f"{API_BASE_URL}/list_timeseries", headers=headers)
response.raise_for_status()

timeseries_list = response.json()
print(f"✓ Found {len(timeseries_list)} time series")
print("\nSeries information:")
for series_id, series_info in timeseries_list.items():
    print(f"  {series_id}:")
    print(f"    Name: {series_info['name']}")
    print(f"    Description: {series_info.get('description', 'N/A')}")
    print(f"    Unit: {series_info['unit']}")
    print(f"    Labels: {series_info.get('labels', {})}")

✓ Found 4 time series

Series information:
  09f74032-7ed8-4e6a-8dd9-8647425a7b36:
    Name: humidity
    Description: None
    Unit: dimensionless
    Labels: {}
  3e605643-b991-405b-a08f-f10f353cc53d:
    Name: humidity
    Description: Relative humidity percentage
    Unit: percent
    Labels: {'location': 'office'}
  8aefded6-b343-4000-9adf-baf9c961017c:
    Name: temperature
    Description: None
    Unit: dimensionless
    Labels: {}
  9daee92f-982a-4697-96a0-39a8f7cb3c94:
    Name: temperature
    Description: Temperature measurements in Celsius
    Unit: celsius
    Labels: {'location': 'office'}


## Part 4: Read Data Using the API

Let's read the time series data we just inserted using the API.

In [9]:
# Read data via API
params = {
    "start_valid": base_time.isoformat(),
    "end_valid": (base_time + timedelta(hours=24)).isoformat(),
    "mode": "flat",  # "flat" returns latest known_time per valid_time
}

response = requests.get(f"{API_BASE_URL}/values", params=params, headers=headers)
response.raise_for_status()

data = response.json()
print(f"✓ Retrieved {data['count']} records via API")

# Convert to DataFrame for easier viewing
if data['count'] > 0:
    df_api = pd.DataFrame(data['data'])
    # Convert ISO strings back to datetime
    df_api['valid_time'] = pd.to_datetime(df_api['valid_time'])
    print("\nFirst few rows:")
    print(df_api.head(10))
    print(f"\nDataFrame shape: {df_api.shape}")
    print(f"Columns: {list(df_api.columns)}")
else:
    print("No data found")

✓ Retrieved 48 records via API

First few rows:
                 valid_time                             series_id  value  \
0 2025-01-01 00:00:00+00:00  09f74032-7ed8-4e6a-8dd9-8647425a7b36   60.0   
1 2025-01-01 00:00:00+00:00  8aefded6-b343-4000-9adf-baf9c961017c   20.0   
2 2025-01-01 01:00:00+00:00  09f74032-7ed8-4e6a-8dd9-8647425a7b36   59.5   
3 2025-01-01 01:00:00+00:00  8aefded6-b343-4000-9adf-baf9c961017c   20.3   
4 2025-01-01 02:00:00+00:00  09f74032-7ed8-4e6a-8dd9-8647425a7b36   59.0   
5 2025-01-01 02:00:00+00:00  8aefded6-b343-4000-9adf-baf9c961017c   20.6   
6 2025-01-01 03:00:00+00:00  09f74032-7ed8-4e6a-8dd9-8647425a7b36   58.5   
7 2025-01-01 03:00:00+00:00  8aefded6-b343-4000-9adf-baf9c961017c   20.9   
8 2025-01-01 04:00:00+00:00  09f74032-7ed8-4e6a-8dd9-8647425a7b36   58.0   
9 2025-01-01 04:00:00+00:00  8aefded6-b343-4000-9adf-baf9c961017c   21.2   

          name           unit labels  
0     humidity  dimensionless     {}  
1  temperature  dimensionless     {} 

In [10]:
df_api

| | valid_time | series_id | value | name | unit | labels |
|---|---|---|---|---|---|---|
| 0 | 2025-01-01 00:00:00+00:00 | 09f74032-7ed8-4e6a-8dd9-8647425a7b36 | 60.0 | humidity | dimensionless | {} |
| 1 | 2025-01-01 00:00:00+00:00 | 8aefded6-b343-4000-9adf-baf9c961017c | 20.0 | temperature | dimensionless | {} |
| 2 | 2025-01-01 01:00:00+00:00 | 09f74032-7ed8-4e6a-8dd9-8647425a7b36 | 59.5 | humidity | dimensionless | {} |
| 3 | 2025-01-01 01:00:00+00:00 | 8aefded6-b343-4000-9adf-baf9c961017c | 20.3 | temperature | dimensionless | {} |
| 4 | 2025-01-01 02:00:00+00:00 | 09f74032-7ed8-4e6a-8dd9-8647425a7b36 | 59.0 | humidity | dimensionless | {} |
| 5 | 2025-01-01 02:00:00+00:00 | 8aefded6-b343-4000-9adf-baf9c961017c | 20.6 | temperature | dimensionless | {} |
| 6 | 2025-01-01 03:00:00+00:00 | 09f74032-7ed8-4e6a-8dd9-8647425a7b36 | 58.5 | humidity | dimensionless | {} |
| 7 | 2025-01-01 03:00:00+00:00 | 8aefded6-b343-4000-9adf-baf9c961017c | 20.9 | temperature | dimensionless | {} |
| 8 | 2025-01-01 04:00:00+00:00 | 09f74032-7ed8-4e6a-8dd9-8647425a7b36 | 58.0 | humidity | dimensionless | {} |
| 9 | 2025-01-01 04:00:00+00:00 | 8aefded6-b343-4000-9adf-baf9c961017c | 21.2 | temperature | dimensionless | {} |


### 4.1: Read with Different Modes

The API supports two query modes:
- **"flat"**: Returns the latest value per (valid_time, series_id), determined by most recent known_time
- **"overlapping"**: Returns all forecast revisions with their known_time, useful for backtesting
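
To make the relationship between the two modes concrete, the sketch below derives a flat view from a list of overlapping records by keeping, for each `(valid_time, series_id)` pair, the record with the most recent `known_time`. It is a plain-Python illustration of the selection rule described above, not the server's implementation; the function name `latest_per_valid_time` and the sample records are made up for the example.

```python
from datetime import datetime
from typing import Dict, List, Tuple


def latest_per_valid_time(records: List[dict]) -> List[dict]:
    """Keep only the newest revision (by known_time) per (valid_time, series_id)."""
    latest: Dict[Tuple[str, str], dict] = {}
    for rec in records:
        key = (rec["valid_time"], rec["series_id"])
        known = datetime.fromisoformat(rec["known_time"])
        current = latest.get(key)
        if current is None or known > datetime.fromisoformat(current["known_time"]):
            latest[key] = rec
    # Sort for a stable, readable result (ISO strings with one offset sort chronologically)
    return sorted(latest.values(), key=lambda r: (r["valid_time"], r["series_id"]))


# Two revisions for 00:00 and a single revision for 01:00 (illustrative values)
sample = [
    {"valid_time": "2025-01-01T00:00:00+00:00", "series_id": "s1",
     "known_time": "2026-02-02T16:00:00+00:00", "value": 20.0},
    {"valid_time": "2025-01-01T00:00:00+00:00", "series_id": "s1",
     "known_time": "2026-02-02T17:00:00+00:00", "value": 22.5},
    {"valid_time": "2025-01-01T01:00:00+00:00", "series_id": "s1",
     "known_time": "2026-02-02T16:00:00+00:00", "value": 20.3},
]
flat_view = latest_per_valid_time(sample)
print([r["value"] for r in flat_view])  # [22.5, 20.3]
```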

Let's try the overlapping mode:

In [11]:
# Read in overlapping mode to see all forecast revisions
params_overlapping = {
    "start_valid": base_time.isoformat(),
    "end_valid": (base_time + timedelta(hours=6)).isoformat(),  # Smaller range for clarity
    "mode": "overlapping",  # This mode shows all known_time revisions
}

response = requests.get(f"{API_BASE_URL}/values", params=params_overlapping, headers=headers)
response.raise_for_status()

data_overlapping = response.json()
print(f"✓ Retrieved {data_overlapping['count']} records in overlapping mode")

if data_overlapping['count'] > 0:
    df_overlapping = pd.DataFrame(data_overlapping['data'])
    df_overlapping['valid_time'] = pd.to_datetime(df_overlapping['valid_time'])
    if 'known_time' in df_overlapping.columns:
        df_overlapping['known_time'] = pd.to_datetime(df_overlapping['known_time'])
    print("\nFirst few rows (showing forecast revisions):")
    print(df_overlapping.head(10))

✓ Retrieved 12 records in overlapping mode

First few rows (showing forecast revisions):
                        known_time                valid_time  \
0 2026-02-02 16:25:47.556612+00:00 2025-01-01 00:00:00+00:00   
1 2026-02-02 16:25:47.556612+00:00 2025-01-01 00:00:00+00:00   
2 2026-02-02 16:25:47.556612+00:00 2025-01-01 01:00:00+00:00   
3 2026-02-02 16:25:47.556612+00:00 2025-01-01 01:00:00+00:00   
4 2026-02-02 16:25:47.556612+00:00 2025-01-01 02:00:00+00:00   
5 2026-02-02 16:25:47.556612+00:00 2025-01-01 02:00:00+00:00   
6 2026-02-02 16:25:47.556612+00:00 2025-01-01 03:00:00+00:00   
7 2026-02-02 16:25:47.556612+00:00 2025-01-01 03:00:00+00:00   
8 2026-02-02 16:25:47.556612+00:00 2025-01-01 04:00:00+00:00   
9 2026-02-02 16:25:47.556612+00:00 2025-01-01 04:00:00+00:00   

                              series_id  value         name           unit  \
0  09f74032-7ed8-4e6a-8dd9-8647425a7b36   60.0     humidity  dimensionless   
1  8aefded6-b343-4000-9adf-baf9c961017c   20.0  te

## Part 5: Insert More Data

Let's insert a second batch, covering a new time range (2025-01-02), to demonstrate how the API handles multiple batches.

In [None]:
# Create new time series data for a second batch
new_base_time = datetime(2025, 1, 2, 0, 0, tzinfo=timezone.utc)
new_dates = [new_base_time + timedelta(hours=i) for i in range(12)]

# Prepare request payload for a new batch
# Include series_id to reference the existing series (avoids duplicate creation)
value_rows_new = []
for i, date in enumerate(new_dates):
    # Add temperature value for the new time range
    value_rows_new.append({
        "valid_time": date.isoformat(),
        "value_key": "temperature",
        "series_id": created_series["temperature"],
        "value": 25.0 + i * 0.2  # Different values than first batch
    })
    # Add humidity value for the new time range
    value_rows_new.append({
        "valid_time": date.isoformat(),
        "value_key": "humidity",
        "series_id": created_series["humidity"],
        "value": 50.0 - i * 0.3  # Different values than first batch
    })

create_batch_request_new = {
    "batch_start_time": datetime.now(timezone.utc).isoformat(),
    "value_rows": value_rows_new
}

print(f"Prepared {len(value_rows_new)} value rows for second batch")
print(f"Time range: {new_dates[0]} to {new_dates[-1]}")

# Insert the new batch
response = requests.post(
    f"{API_BASE_URL}/upload",
    json=create_batch_request_new,
    headers=headers
)
response.raise_for_status()

result_new = response.json()
print(f"\n✓ Created second batch with ID: {result_new['batch_id']}")
print(f"  Message: {result_new['message']}")

In [None]:
# Read the newly inserted data
params_new = {
    "start_valid": new_base_time.isoformat(),
    "end_valid": (new_base_time + timedelta(hours=12)).isoformat(),
    "mode": "flat"
}

response = requests.get(f"{API_BASE_URL}/values", params=params_new, headers=headers)
response.raise_for_status()

data_new = response.json()
print(f"✓ Retrieved {data_new['count']} records for the new time range")

if data_new['count'] > 0:
    df_new = pd.DataFrame(data_new['data'])
    df_new['valid_time'] = pd.to_datetime(df_new['valid_time'])
    print("\nData from second batch:")
    print(df_new.head(10))

## Part 6: Update Records Using the API

The API supports updating existing **overlapping** records (flat records are immutable). To update a record, you need:
- `batch_id`: The batch that created the record
- `tenant_id`: The tenant ID (defaults to zeros UUID if not authenticated)
- `valid_time`: The time the value is valid for
- `series_id`: The series identifier

Updates create a new version with a new `known_time` while preserving the original for audit trail.
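
The fields listed above can be assembled with a small helper so that each update entry has the same shape. This is a minimal sketch: `make_update` and `make_update_request` are hypothetical names, and the payload layout (an `updates` list with `batch_id`, `tenant_id`, `valid_time`, `series_id`, `value`, and an optional `annotation`) mirrors the request used later in this notebook.

```python
from datetime import datetime, timezone
from typing import Iterable, Optional

# Tenant ID used for non-authenticated requests in this notebook
DEFAULT_TENANT_ID = "00000000-0000-0000-0000-000000000000"


def make_update(
    batch_id: str,
    series_id: str,
    valid_time: datetime,
    value: float,
    annotation: Optional[str] = None,
    tenant_id: str = DEFAULT_TENANT_ID,
) -> dict:
    """Build one entry for the `updates` list of a PUT /values request."""
    update = {
        "batch_id": batch_id,
        "tenant_id": tenant_id,
        "valid_time": valid_time.isoformat(),
        "series_id": series_id,
        "value": value,
    }
    if annotation is not None:
        update["annotation"] = annotation
    return update


def make_update_request(updates: Iterable[dict]) -> dict:
    """Wrap update entries in the request body shape used in this notebook."""
    return {"updates": list(updates)}


# Example with placeholder IDs (real IDs come from earlier API responses)
example_request = make_update_request([
    make_update("batch-id", "series-id",
                datetime(2025, 1, 1, tzinfo=timezone.utc), 22.5,
                annotation="Updated via API"),
])
```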

Let's demonstrate updating a record.

In [None]:
# Fetch the current records for the first hour so we can inspect one before updating
params_for_update = {
    "start_valid": base_time.isoformat(),
    "end_valid": (base_time + timedelta(hours=1)).isoformat(),
    "mode": "flat"
}

response = requests.get(f"{API_BASE_URL}/values", params=params_for_update, headers=headers)
response.raise_for_status()
data_for_update = response.json()

if data_for_update['count'] > 0:
    # Get the first record
    first_record = data_for_update['data'][0]
    print("Sample record structure:")
    print(f"  valid_time: {first_record.get('valid_time', 'N/A')}")
    print(f"  name: {first_record.get('name', 'N/A')}")
    print(f"  series_id: {first_record.get('series_id', 'N/A')}")
    print(f"  value: {first_record.get('value', 'N/A')}")
    
    # Default tenant_id for non-authenticated requests
    default_tenant_id = "00000000-0000-0000-0000-000000000000"
    
    # Use the series_id from our pre-created series
    update_request = {
        "updates": [
            {
                "batch_id": batch_id,  # From our first insert
                "tenant_id": default_tenant_id,
                "valid_time": base_time.isoformat(),
                "series_id": created_series["temperature"],
                "value": 22.5,  # Update the temperature value
                "annotation": "Updated via API"  # Add an annotation
            }
        ]
    }
    
    print(f"\nUpdating record:")
    print(f"  batch_id: {batch_id}")
    print(f"  valid_time: {base_time.isoformat()}")
    print(f"  series: temperature")
    print(f"  new value: 22.5")
    
    # Send update request
    response = requests.put(
        f"{API_BASE_URL}/values",
        json=update_request,
        headers=headers
    )
    response.raise_for_status()
    
    update_result = response.json()
    print(f"\n✓ Update result:")
    print(f"  Updated: {len(update_result['updated'])} records")
    print(f"  Skipped (no-op): {len(update_result['skipped_no_ops'])} records")
    
    if update_result['updated']:
        print(f"\nUpdated record:")
        for updated in update_result['updated']:
            print(f"  overlapping_id: {updated.get('overlapping_id', 'N/A')}")
else:
    print("No records found to update")

## Part 7: Verify the Update

Let's read the data again to verify the update was applied.

In [None]:
# Read the updated record
params_verify = {
    "start_valid": base_time.isoformat(),
    "end_valid": (base_time + timedelta(hours=1)).isoformat(),
    "mode": "flat"
}

response = requests.get(f"{API_BASE_URL}/values", params=params_verify, headers=headers)
response.raise_for_status()
data_verify = response.json()

if data_verify['count'] > 0:
    df_verify = pd.DataFrame(data_verify['data'])
    df_verify['valid_time'] = pd.to_datetime(df_verify['valid_time'])
    
    # Filter for temperature at the updated time
    temp_records = df_verify[
        (df_verify['name'] == 'temperature') & 
        (df_verify['valid_time'] == base_time)
    ]
    
    print(f"✓ Found {len(temp_records)} version(s) of the temperature record")
    if len(temp_records) > 0:
        current = temp_records.iloc[0]
        print(f"\nCurrent value: {current['value']}")
else:
    print("No records found")

## Summary

This notebook demonstrated how to use the TimeDB REST API to:
1. **Start the API server** - Required before making API calls
2. **Create series** - Using `POST /series` endpoint with name, unit, labels, and `data_class`
3. **Insert time series data** - Using `POST /upload` endpoint with `series_id` to reference pre-created series
4. **Read time series data** - Using `GET /values` endpoint with different modes
5. **Update records** - Using `PUT /values` endpoint (only works on overlapping)

### Key API Endpoints:

- **`GET /`** - API information and available endpoints
- **`POST /series`** - Create a new time series with name, unit, labels, and `data_class` (`'flat'` or `'overlapping'`)
- **`POST /upload`** - Create a new batch with time series values (pass `series_id` to use existing series)
- **`GET /values`** - Read time series values (supports `flat` and `overlapping` modes)
- **`PUT /values`** - Update existing overlapping records (flat records are immutable)
- **`GET /list_timeseries`** - List all time series

### Query Modes:

- **`flat`**: Returns the latest value per (valid_time, series_id), determined by most recent known_time
- **`overlapping`**: Returns all forecast revisions with their known_time, useful for backtesting

### Data Classes:

- **`flat`** (default): Immutable facts. Cannot be updated.
- **`overlapping`**: Versioned forecasts. Can be updated via `PUT /values`.
- Updates create a new version with a new `known_time` while preserving the original for audit trail.
- Update responses include `overlapping_id` for the newly created version.

### Authentication:

- This example assumes **no authentication** (users_table not created)
- In production, you would:
  1. Create users_table using CLI: `timedb create tables --with-users`
  2. Create users via CLI: `timedb users create --tenant-id <uuid> --email <email>`
  3. Use API keys in requests: `headers={"X-API-Key": "your-api-key"}`
  4. Users can only access data for their own tenant_id
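
When authentication is enabled, the request headers could be built in one place. The sketch below is an illustration only: the `X-API-Key` header name comes from the list above, while the `build_headers` helper and the `TIMEDB_API_KEY` environment variable are hypothetical names chosen for this example.

```python
import os
from typing import Dict, Optional


def build_headers(api_key: Optional[str] = None) -> Dict[str, str]:
    """Return JSON request headers, adding X-API-Key when a key is available."""
    headers = {"Content-Type": "application/json"}
    # Fall back to an environment variable (hypothetical name) if no key is passed
    key = api_key if api_key is not None else os.environ.get("TIMEDB_API_KEY")
    if key:
        headers["X-API-Key"] = key
    return headers
```

Reading the key from the environment keeps it out of the notebook itself; with no key available, the result matches the unauthenticated `headers` used throughout this example.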

### Starting the API Server:

1. **Using the CLI** (recommended):
   ```bash
   timedb api --host 127.0.0.1 --port 8000
   ```

2. **Using uvicorn directly**:
   ```bash
   uvicorn timedb.api:app --host 127.0.0.1 --port 8000
   ```

3. **In a notebook** (using subprocess):
   ```python
   import subprocess
   process = subprocess.Popen(
       ["timedb", "api", "--host", "127.0.0.1", "--port", "8000"],
       stdout=subprocess.DEVNULL,
       stderr=subprocess.DEVNULL
   )
   ```

**Note**: To stop the server in a notebook, you can use `process.terminate()` or restart the kernel.
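
A slightly more careful shutdown waits for the process to exit and falls back to a hard kill if it does not. The sketch below uses only standard `subprocess` calls; `stop_process` is a hypothetical helper name, and the demo stops a sleeping Python interpreter as a stand-in for the API server.

```python
import subprocess
import sys


def stop_process(process: subprocess.Popen, timeout: float = 5.0) -> int:
    """Terminate `process`, escalating to kill() if it ignores the request."""
    if process.poll() is None:  # still running
        process.terminate()
        try:
            process.wait(timeout=timeout)
        except subprocess.TimeoutExpired:
            process.kill()
            process.wait()
    return process.returncode


# Demo with a stand-in child process (not the TimeDB server)
demo = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
exit_code = stop_process(demo)
```

In the notebook, calling `stop_process(process)` on the `Popen` object created in Part 2 would follow the same pattern.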