Stats Activity Dashboard

Stats & Activity Dashboard

The CWA Stats & Activity Dashboard provides comprehensive insights into library usage, user behavior, and system performance through three specialized tabs: User Activity, Library Statistics, and API Usage.

Overview

Access: Navigate to Admin → CWA Stats & Activity (requires admin privileges)

Features:

📊 Real-time activity tracking and visualization
📅 Flexible time range filtering (7 days to All Time)
🎯 Per-user filtering for detailed analysis
📥 CSV export for external analysis
🎨 Demo mode with realistic sample data
📱 Responsive charts powered by Apache ECharts

User Activity Tab

Tracks all user interactions with your library including logins, downloads, searches, and reading activity.

Metrics Overview Cards

Total Events: Sum of all tracked user activities

Measured: COUNT(*) from cwa_user_activity table
Includes: LOGIN, DOWNLOAD, READ, SEARCH, EMAIL, SHELF operations

Active Users: Number of unique users with activity

Measured: COUNT(DISTINCT user_id)
When filtering by user: Shows total login count instead

Downloads: Total ebook downloads/sends

Measured: COUNT where event_type IN ('DOWNLOAD', 'EMAIL')
Includes: Direct downloads, send-to-kindle, send-to-device

Books Read: Unique books accessed via reader

Measured: COUNT(DISTINCT item_id) where event_type = 'READ'

Searches: Total search queries

Measured: COUNT where event_type = 'SEARCH'

Activity Timeline

Line chart showing daily activity broken down by event type.

Data Source:

SELECT date(timestamp) as day, event_type, COUNT(*) 
FROM cwa_user_activity 
GROUP BY day, event_type

Event Types Tracked:

🔑 LOGIN: User authentication
📥 DOWNLOAD: Direct ebook downloads
📖 READ: In-browser reader access
🔍 SEARCH: Catalog searches
📧 EMAIL: Send-to-kindle/device
📚 SHELF: Shelf add/remove operations

Peak Usage Hours

Heatmap showing activity patterns by day of week and hour of day.

Measured: Hour extracted from timestamp, day-of-week calculation

Scale: 0-23 hours (24-hour format)
Days: Sunday (0) through Saturday (6)
Use Case: Identify optimal maintenance windows or peak server load times

Download Format Distribution

Pie chart showing breakdown of downloaded formats.

Data Source:

SELECT UPPER(COALESCE(
    CASE WHEN json_valid(extra_data) 
        THEN json_extract(extra_data, '$.format')
        ELSE extra_data 
    END,
    'UNKNOWN'
)) as format
WHERE event_type IN ('DOWNLOAD', 'EMAIL')

Tracked Formats: EPUB, MOBI, AZW3, PDF, KEPUB, etc.

Top Users (or Most Active Days)

When viewing all users: Shows top 10 users by activity count When filtering by user: Shows most active days for that user

Data Source:

-- All users mode
SELECT user_id, user_name, COUNT(*) as activity_count
GROUP BY user_id
ORDER BY activity_count DESC LIMIT 10

-- Single user mode
SELECT date(timestamp) as day, COUNT(*) as activity_count
WHERE user_id = ?
GROUP BY day
ORDER BY activity_count DESC LIMIT 10

Popular Books

Top 10 books by combined reads/downloads/emails.

Measured: COUNT(*) where event_type IN ('READ', 'DOWNLOAD', 'EMAIL') Use Case: Identify trending titles, popular genres

Recent Searches

Last 15 search queries with timestamps and usernames.

Data Source: extra_data field for SEARCH events Privacy Note: Admin-only view, helps identify catalog gaps

Reading Velocity

Line chart with 4-week moving average showing books read over time.

Measured: COUNT(DISTINCT item_id) per week where event_type = 'READ' Moving Average: Smooths out weekly variations for trend identification

Format Preferences by User

Stacked bar chart showing which formats each user prefers.

Data Source:

SELECT user_name, 
       UPPER(json_extract(extra_data, '$.format')) as format,
       COUNT(*)
WHERE event_type IN ('DOWNLOAD', 'READ', 'EMAIL')
GROUP BY user_name, format

Use Case: Optimize conversion settings per user preference

Discovery Sources

Pie chart showing how users find books.

Tracked Sources:

🔍 search: Via search results
📖 book_detail: From related books on detail page
📚 series: Through series browsing
👤 author: From author pages
🏷️ category: Category/genre browsing
📂 shelf: From custom shelves
➡️ direct: Direct URL access

Data Source: json_extract(extra_data, '$.source') on READ/DOWNLOAD events

Device Breakdown

Pie chart showing desktop vs mobile vs tablet usage.

Detection Method: User-Agent string parsing during log_activity()

if 'mobile' in user_agent or 'android' in user_agent:
    device_type = 'mobile'
elif 'tablet' in user_agent or 'ipad' in user_agent:
    device_type = 'tablet'
else:
    device_type = 'desktop'

Stored: json_extract(extra_data, '$.device_type')

Average Session Duration

Gauge chart showing average time between login and last activity.

Calculation:

AVG(
    (julianday(last_activity) - julianday(first_activity)) * 24 * 60
) as avg_minutes
WHERE user has multiple events in session

Search Success Rate

Gauge showing percentage of searches yielding results.

Measured: Heuristic based on search followed by download/read within timeframe Use Case: Identify catalog coverage issues

Shelf Activity

List showing adds/removes per shelf with net change indicators.

Data Source: SHELF_ADD and SHELF_REMOVE events Display: Top 10 most active shelves

Library Statistics Tab

Tracks library growth, content composition, and system automation effectiveness.

Summary Cards

Total Books: Current library size from Calibre metadata.db Books Added: Count imported during selected time range Conversions: Successful format conversions Format Count: Number of distinct formats in library

Library Growth

Area chart showing cumulative book count over time.

Data Source:

SELECT date(timestamp), COUNT(*) as books_added
FROM cwa_import
GROUP BY date(timestamp)

Calculation: Cumulative sum to show total library size progression

Format Distribution

Bar chart showing book count per format.

Data Source: Calibre metadata.db data table Note: Shows all formats available in library, not just added formats

Series Statistics

Table showing series completion status.

Columns:

Series Name
Book Count (owned)
Highest Index (series length indicator)

Calculation:

SELECT series.name, COUNT(books.id), MAX(books.series_index)
FROM books_series_link
JOIN series ON series.id = books_series_link.series
GROUP BY series.id

Use Case: Identify incomplete series for acquisition

Publication Year Distribution

Bar chart showing books by publication year.

Data Source: pubdate field from metadata.db Filters: Typically limited to years 1900-present

Rating Statistics

Comprehensive rating breakdown including:

Average Rating: Mean of all rated books
Unrated Percentage: Books without ratings
Distribution: Count per star rating (1-5 stars)

Data Source: ratings table in metadata.db Calculation:

SELECT rating/2.0 as stars, COUNT(*) 
FROM books_ratings_link
JOIN ratings ON ratings.id = books_ratings_link.rating
GROUP BY rating

Note: Calibre stores ratings as 0-10, displayed as 0-5 stars

Top Enforced Books

Books with most metadata enforcement operations.

Data Source:

SELECT book_id, book_title, COUNT(*) as enforcement_count,
       MAX(timestamp) as last_enforced
FROM cwa_enforcement
GROUP BY book_id
ORDER BY enforcement_count DESC
LIMIT 10

Trigger Types: EPUB_EDIT, MERGE, CONVERT Use Case: Identify frequently updated books

Import Source Flows

Sankey diagram showing book import pipeline from source to destination.

Tracked Flows:

Import Source → Import Method → Final Format
Example: "Web Upload" → "Calibredb Add" → "EPUB"

Data Source: cwa_import table with source/method tracking

API Usage Tab

Tracks OPDS, Kobo Sync, and REST API endpoint usage for monitoring integrations and third-party app connections.

API Usage Breakdown

Pie chart categorizing API calls by type.

Categories:

📱 OPDS: OPDS feed access (general ebook readers)
📖 Kobo Sync: Kobo device/app synchronization
🔧 REST API: Direct API endpoint calls
🌐 Web UI: Standard browser interface (baseline)

Data Source:

SELECT 
    CASE 
        WHEN endpoint LIKE '/opds%' THEN 'OPDS'
        WHEN endpoint LIKE '/kobo%' THEN 'Kobo Sync'
        WHEN endpoint LIKE '/api/%' THEN 'REST API'
        ELSE 'Web UI'
    END as category,
    COUNT(*)
FROM cwa_user_activity
WHERE event_type = 'OPDS_ACCESS' OR 'API_CALL'

Endpoint Frequency

Table showing most-accessed endpoints with call counts.

Columns:

Endpoint path (grouped by pattern)
Call count
Percentage of total

Grouping Logic: Similar endpoints consolidated (e.g., /kobo/{uuid}/state → /kobo/*/state)

API Timing Heatmap

Heatmap showing API usage patterns by hour and day.

Use Case:

Identify automated sync schedules
Detect unusual access patterns
Plan API maintenance windows

Filtering & Time Ranges

Time Range Options

Preset Ranges:

Last 7 Days
Last 30 Days (default)
Last 90 Days
Last 6 Months
Last Year
All Time
Custom Range

Custom Range: Select specific start and end dates via date pickers

User Filtering

User Activity Tab Only: Filter by specific user to analyze individual behavior

Dropdown: Shows all users with logged activity Impact:

Changes "Top Users" to "Most Active Days"
Changes "Active Users" card to "Total Logins"
All charts filtered to selected user's data

Reset Button

Clears all filters and returns to default view (Last 30 Days, all users).

CSV Export

Usage

Click 📥 Export CSV button on any tab to download current view as CSV.

Filename Format: cwa_stats_{tab_name}_{timestamp}.csv

Example: cwa_stats_activity_20251231_143022.csv

CSV Structure

Multi-Section Format: Each CSV contains multiple datasets separated by headers

User Activity CSV Sections:

=== USER ACTIVITY STATISTICS ===

Metric,Value
total_events,1234
active_users,15
...

=== TOP USERS ===
User ID,Username,Event Count
1,admin,456
...

=== FORMAT DISTRIBUTION ===
Format,Download Count
EPUB,789
...

Library Stats CSV Sections:

Library summary statistics
Growth data (date, count)
Format distribution
Series completion list
Rating breakdown
Top enforced books

API Stats CSV Sections:

Usage breakdown by category
Endpoint frequency table

Filter Preservation

CSV exports respect current filters:

Time range (days, custom dates)
User selection (Activity tab)

Implementation: Appends current URL query parameters to export endpoint

Demo Mode

Purpose

Test dashboard visualization with realistic sample data when:

Library is newly set up with no activity
Presenting features to stakeholders
Testing chart layouts and performance

Activation

Click 🎨 Show Demo Data button on any tab.

Indicators:

Button changes to 📊 Show Real Data (green gradient)
DEMO MODE badge appears in filter row
Pulsing animation indicates temporary state

Demo Data Generation

User Activity:

30 days of varied activity (100-300 events/day)
8 synthetic users with realistic names
Weighted format distribution (EPUB 40%, MOBI 25%, PDF 20%, etc.)
Peak usage during evenings (6 PM - 10 PM)
Heavier weekend activity

Library Stats:

Growth curve with 500-2000 books
Realistic format distribution matching common library composition
Series with 3-12 books each
Publication years weighted toward 2000-present
Rating distribution following normal curve (mostly 3-4 stars)

API Stats:

OPDS: 60% of API calls (reader apps)
Kobo Sync: 25% (e-reader devices)
REST API: 10% (integrations)
Hourly patterns matching typical device sync schedules

Technical Details

Demo data generated client-side via JavaScript functions:

generateDemoData() - User Activity
generateLibraryDemoData() - Library Stats
generateApiDemoData() - API Stats

Respects: Current time range filters (generates data for selected period)

Adding New Tracked Endpoints

Step 1: Log Activity in Route Handler

In your Flask route, import and call the logging function:

from scripts.cwa_db import CWA_DB
from flask import current_user
import json

@blueprint.route('/your-new-endpoint', methods=['GET'])
def your_endpoint():
    # Your endpoint logic here
    result = do_something()
    
    # Log the activity
    try:
        cwa_db = CWA_DB()
        cwa_db.log_activity(
            user_id=current_user.id,
            user_name=current_user.name,
            event_type='YOUR_EVENT_TYPE',  # e.g., 'API_CALL', 'DOWNLOAD', 'SEARCH'
            item_id=book_id if applicable else None,
            item_title=book_title if applicable else None,
            extra_data=json.dumps({
                'endpoint': '/your-new-endpoint',
                'method': 'GET',
                'status': 'success',
                # Add any relevant metadata
            })
        )
    except Exception as e:
        log.debug(f"Failed to log activity: {e}")
    
    return result

Step 2: Choose Event Type

Standard Event Types:

LOGIN - User authentication
DOWNLOAD - File downloads
READ - Reader access
SEARCH - Catalog searches
EMAIL - Send-to-device operations
OPDS_ACCESS - OPDS feed access
API_CALL - REST API calls
SHELF_ADD / SHELF_REMOVE - Shelf operations

Custom Event Types: Can create new types, will automatically appear in dashboards

Step 3: Structure extra_data

Use JSON format for rich metadata:

extra_data = json.dumps({
    'format': 'EPUB',           # For downloads/reads
    'source': 'search',         # How user found this (search/series/author/etc)
    'device_type': 'mobile',    # Auto-added by log_activity()
    'endpoint': '/api/books',   # For API calls
    'method': 'GET',            # HTTP method
    'status': 'success',        # Operation result
    'query': 'fantasy books',   # For searches
    # Add custom fields as needed
})

Step 4: Update API Stats Queries (Optional)

If adding new API endpoint patterns, update categorization in get_api_usage_breakdown():

# In scripts/cwa_db.py
CASE 
    WHEN endpoint LIKE '/opds%' THEN 'OPDS'
    WHEN endpoint LIKE '/kobo%' THEN 'Kobo Sync'
    WHEN endpoint LIKE '/api/%' THEN 'REST API'
    WHEN endpoint LIKE '/your-new-api/%' THEN 'Your New API'  # Add this
    ELSE 'Web UI'
END as category

Step 5: Test Activity Logging

Access your new endpoint
Navigate to CWA Stats & Activity
Check appropriate tab for new data
Use /cwa-stats-debug endpoint to verify JSON structure
Verify CSV export includes new data

Example: Adding OPDS Annotation Tracking

# In cps/opds.py
@opds.route('/opds/annotation', methods=['POST'])
def save_annotation():
    annotation_data = request.get_json()
    
    # Save annotation logic
    save_to_db(annotation_data)
    
    # Log activity
    try:
        cwa_db = CWA_DB()
        cwa_db.log_activity(
            user_id=current_user.id,
            user_name=current_user.name,
            event_type='OPDS_ACCESS',
            item_id=annotation_data.get('book_id'),
            item_title=get_book_title(annotation_data.get('book_id')),
            extra_data=json.dumps({
                'endpoint': '/opds/annotation',
                'method': 'POST',
                'action': 'save_annotation',
                'annotation_type': annotation_data.get('type')  # highlight, note, etc.
            })
        )
    except Exception as e:
        log.debug(f"Failed to log OPDS annotation: {e}")
    
    return jsonify({'status': 'success'})

Best Practices

Always use try/except: Don't let logging failures break your endpoint
Log after success: Only log completed operations, not failed attempts (except LOGIN_FAILED)
Include context: Add relevant metadata to extra_data for richer analytics
Consistent naming: Use existing event types when possible
Privacy aware: Don't log sensitive data (passwords, email content, etc.)
Device detection: Automatic via User-Agent, no manual tracking needed
Test thoroughly: Verify data appears correctly in dashboard and CSV exports

Database Schema

cwa_user_activity Table

Primary table for all activity tracking.

CREATE TABLE IF NOT EXISTS cwa_user_activity (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    timestamp TEXT DEFAULT CURRENT_TIMESTAMP,
    user_id INTEGER,
    user_name TEXT,
    event_type TEXT NOT NULL,
    item_id INTEGER,
    item_title TEXT,
    extra_data TEXT  -- JSON string
);

CREATE INDEX IF NOT EXISTS idx_activity_timestamp ON cwa_user_activity(timestamp);
CREATE INDEX IF NOT EXISTS idx_activity_user ON cwa_user_activity(user_id);
CREATE INDEX IF NOT EXISTS idx_activity_event ON cwa_user_activity(event_type);

Columns:

id: Auto-increment primary key
timestamp: ISO-8601 format (YYYY-MM-DD HH:MM:SS)
user_id: Foreign key to Calibre-Web user table
user_name: Denormalized for easier querying
event_type: Activity classification (LOGIN, DOWNLOAD, etc.)
item_id: Book ID when applicable
item_title: Book title (denormalized)
extra_data: JSON object with event-specific metadata

JSON Fields in extra_data:

{
    "format": "EPUB",
    "device_type": "mobile",
    "source": "search",
    "endpoint": "/api/books",
    "method": "GET",
    "query": "search terms"
}

Related Tables

cwa_import: Book import history cwa_conversions: Format conversion history
cwa_enforcement: Metadata enforcement operations epub_fixes: EPUB fixer service logs

These tables power the Library Stats tab but are separate from user activity tracking.

Troubleshooting

"No data available" in charts

Cause: No activity logged in selected time range

Solutions:

Expand time range (try "All Time")
Generate activity (download books, perform searches)
Enable Demo Mode to test visualizations
Check /cwa-stats-debug to verify database has records

Malformed JSON errors

Cause: Legacy activity records with plain string extra_data

Solution: Already handled automatically via json_valid() checks in queries. Legacy data displays using fallback logic.

Missing device breakdown data

Cause: Activity logged before device detection feature

Solution: New activity automatically includes device type. Old records default to "unknown".

CSV export shows "error.csv"

Cause: Database query failure or malformed data

Solutions:

Check error message in downloaded CSV file
Visit /cwa-stats-debug to identify data issues
Check Docker logs: docker logs cwa-dev | grep ERROR
Verify database file permissions: /config/cwa.db

Charts not updating after activity

Cause: Browser cache or time range filter

Solutions:

Click "Reset" button to refresh
Hard refresh browser (Ctrl+Shift+R / Cmd+Shift+R)
Verify activity is within selected time range
Check that event types match tab (e.g., DOWNLOADS on User Activity)

Performance Considerations

Large Databases

For installations with >100,000 activity records:

Use narrower time ranges: Default to 30 days instead of All Time
Regular VACUUM: sqlite3 /config/cwa.db "VACUUM;" to optimize database
Archive old data: Optionally export and truncate records >1 year old

Database Indexes

All critical queries are indexed:

idx_activity_timestamp: Time range filtering
idx_activity_user: Per-user queries
idx_activity_event: Event type filtering

Query Optimization

Queries use:

Date-based partitioning
Aggregation at database level
Limited result sets (TOP 10, LIMIT clauses)
JSON validation checks to prevent errors

Privacy & Security

Admin-Only Access

Stats dashboard requires admin role. Non-admin users cannot:

View any statistics
Access activity logs
Export CSV data
See other users' behavior

Data Retention

Activity data persists indefinitely by default. Consider:

Periodic exports for archival
Scheduled cleanup of old records
GDPR compliance if serving EU users

Anonymization

To anonymize historical data:

UPDATE cwa_user_activity 
SET user_name = 'User' || user_id,
    extra_data = json_remove(extra_data, '$.username_attempted', '$.ip')
WHERE timestamp < date('now', '-1 year');

Future Enhancements

Potential additions (not yet implemented):

Reading time tracking: Actual time spent in reader
Download heatmaps: Geographic distribution via IP
Recommendation effectiveness: Track recommended book access
Conversion success rates: Track failures vs successes
Email delivery tracking: Kindle/device email status
Scheduled reports: Daily/weekly email digests
Alerting: Unusual activity patterns, failed logins
User-level stats: Non-admin users see own activity

For questions or feature requests, please visit the GitHub repository or Discord community.

Uh oh!

Stats Activity Dashboard

Stats & Activity Dashboard

Table of Contents

Overview

User Activity Tab

Metrics Overview Cards

Activity Timeline

Peak Usage Hours

Download Format Distribution

Top Users (or Most Active Days)

Popular Books

Recent Searches

Reading Velocity

Format Preferences by User

Discovery Sources

Device Breakdown

Average Session Duration

Search Success Rate

Shelf Activity

Library Statistics Tab

Summary Cards

Library Growth

Format Distribution

Series Statistics

Publication Year Distribution

Rating Statistics

Top Enforced Books

Import Source Flows

API Usage Tab

API Usage Breakdown

Endpoint Frequency

API Timing Heatmap

Filtering & Time Ranges

Time Range Options

User Filtering

Reset Button

CSV Export

Usage

CSV Structure

Filter Preservation

Demo Mode

Purpose

Activation

Demo Data Generation

Technical Details

Adding New Tracked Endpoints

Step 1: Log Activity in Route Handler

Step 2: Choose Event Type

Step 3: Structure extra_data

Step 4: Update API Stats Queries (Optional)

Step 5: Test Activity Logging

Example: Adding OPDS Annotation Tracking

Best Practices

Database Schema

cwa_user_activity Table

Related Tables

Troubleshooting

"No data available" in charts

Malformed JSON errors

Missing device breakdown data

CSV export shows "error.csv"

Charts not updating after activity

Performance Considerations

Large Databases

Database Indexes

Query Optimization

Privacy & Security

Admin-Only Access

Data Retention

Anonymization

Future Enhancements

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!