Skip to content

Stats Activity Dashboard

crocodilestick edited this page Jan 1, 2026 · 1 revision

Stats & Activity Dashboard

The CWA Stats & Activity Dashboard provides comprehensive insights into library usage, user behavior, and system performance through three specialized tabs: User Activity, Library Statistics, and API Usage.

Table of Contents

Overview

Access: Navigate to AdminCWA Stats & Activity (requires admin privileges)

Features:

  • 📊 Real-time activity tracking and visualization
  • 📅 Flexible time range filtering (7 days to All Time)
  • 🎯 Per-user filtering for detailed analysis
  • 📥 CSV export for external analysis
  • 🎨 Demo mode with realistic sample data
  • 📱 Responsive charts powered by Apache ECharts

User Activity Tab

Tracks all user interactions with your library including logins, downloads, searches, and reading activity.

Metrics Overview Cards

Total Events: Sum of all tracked user activities

  • Measured: COUNT(*) from cwa_user_activity table
  • Includes: LOGIN, DOWNLOAD, READ, SEARCH, EMAIL, SHELF operations

Active Users: Number of unique users with activity

  • Measured: COUNT(DISTINCT user_id)
  • When filtering by user: Shows total login count instead

Downloads: Total ebook downloads/sends

  • Measured: COUNT where event_type IN ('DOWNLOAD', 'EMAIL')
  • Includes: Direct downloads, send-to-kindle, send-to-device

Books Read: Unique books accessed via reader

  • Measured: COUNT(DISTINCT item_id) where event_type = 'READ'

Searches: Total search queries

  • Measured: COUNT where event_type = 'SEARCH'

Activity Timeline

Line chart showing daily activity broken down by event type.

Data Source:

SELECT date(timestamp) as day, event_type, COUNT(*) 
FROM cwa_user_activity 
GROUP BY day, event_type

Event Types Tracked:

  • 🔑 LOGIN: User authentication
  • 📥 DOWNLOAD: Direct ebook downloads
  • 📖 READ: In-browser reader access
  • 🔍 SEARCH: Catalog searches
  • 📧 EMAIL: Send-to-kindle/device
  • 📚 SHELF: Shelf add/remove operations

Peak Usage Hours

Heatmap showing activity patterns by day of week and hour of day.

Measured: Hour extracted from timestamp, day-of-week calculation

  • Scale: 0-23 hours (24-hour format)
  • Days: Sunday (0) through Saturday (6)
  • Use Case: Identify optimal maintenance windows or peak server load times

Download Format Distribution

Pie chart showing breakdown of downloaded formats.

Data Source:

SELECT UPPER(COALESCE(
    CASE WHEN json_valid(extra_data) 
        THEN json_extract(extra_data, '$.format')
        ELSE extra_data 
    END,
    'UNKNOWN'
)) as format
WHERE event_type IN ('DOWNLOAD', 'EMAIL')

Tracked Formats: EPUB, MOBI, AZW3, PDF, KEPUB, etc.

Top Users (or Most Active Days)

When viewing all users: Shows top 10 users by activity count When filtering by user: Shows most active days for that user

Data Source:

-- All users mode
SELECT user_id, user_name, COUNT(*) as activity_count
GROUP BY user_id
ORDER BY activity_count DESC LIMIT 10

-- Single user mode
SELECT date(timestamp) as day, COUNT(*) as activity_count
WHERE user_id = ?
GROUP BY day
ORDER BY activity_count DESC LIMIT 10

Popular Books

Top 10 books by combined reads/downloads/emails.

Measured: COUNT(*) where event_type IN ('READ', 'DOWNLOAD', 'EMAIL') Use Case: Identify trending titles, popular genres

Recent Searches

Last 15 search queries with timestamps and usernames.

Data Source: extra_data field for SEARCH events Privacy Note: Admin-only view, helps identify catalog gaps

Reading Velocity

Line chart with 4-week moving average showing books read over time.

Measured: COUNT(DISTINCT item_id) per week where event_type = 'READ' Moving Average: Smooths out weekly variations for trend identification

Format Preferences by User

Stacked bar chart showing which formats each user prefers.

Data Source:

SELECT user_name, 
       UPPER(json_extract(extra_data, '$.format')) as format,
       COUNT(*)
WHERE event_type IN ('DOWNLOAD', 'READ', 'EMAIL')
GROUP BY user_name, format

Use Case: Optimize conversion settings per user preference

Discovery Sources

Pie chart showing how users find books.

Tracked Sources:

  • 🔍 search: Via search results
  • 📖 book_detail: From related books on detail page
  • 📚 series: Through series browsing
  • 👤 author: From author pages
  • 🏷️ category: Category/genre browsing
  • 📂 shelf: From custom shelves
  • ➡️ direct: Direct URL access

Data Source: json_extract(extra_data, '$.source') on READ/DOWNLOAD events

Device Breakdown

Pie chart showing desktop vs mobile vs tablet usage.

Detection Method: User-Agent string parsing during log_activity()

if 'mobile' in user_agent or 'android' in user_agent:
    device_type = 'mobile'
elif 'tablet' in user_agent or 'ipad' in user_agent:
    device_type = 'tablet'
else:
    device_type = 'desktop'

Stored: json_extract(extra_data, '$.device_type')

Average Session Duration

Gauge chart showing average time between login and last activity.

Calculation:

AVG(
    (julianday(last_activity) - julianday(first_activity)) * 24 * 60
) as avg_minutes
WHERE user has multiple events in session

Search Success Rate

Gauge showing percentage of searches yielding results.

Measured: Heuristic based on search followed by download/read within timeframe Use Case: Identify catalog coverage issues

Shelf Activity

List showing adds/removes per shelf with net change indicators.

Data Source: SHELF_ADD and SHELF_REMOVE events Display: Top 10 most active shelves

Library Statistics Tab

Tracks library growth, content composition, and system automation effectiveness.

Summary Cards

Total Books: Current library size from Calibre metadata.db Books Added: Count imported during selected time range Conversions: Successful format conversions Format Count: Number of distinct formats in library

Library Growth

Area chart showing cumulative book count over time.

Data Source:

SELECT date(timestamp), COUNT(*) as books_added
FROM cwa_import
GROUP BY date(timestamp)

Calculation: Cumulative sum to show total library size progression

Format Distribution

Bar chart showing book count per format.

Data Source: Calibre metadata.db data table Note: Shows all formats available in library, not just added formats

Series Statistics

Table showing series completion status.

Columns:

  • Series Name
  • Book Count (owned)
  • Highest Index (series length indicator)

Calculation:

SELECT series.name, COUNT(books.id), MAX(books.series_index)
FROM books_series_link
JOIN series ON series.id = books_series_link.series
GROUP BY series.id

Use Case: Identify incomplete series for acquisition

Publication Year Distribution

Bar chart showing books by publication year.

Data Source: pubdate field from metadata.db Filters: Typically limited to years 1900-present

Rating Statistics

Comprehensive rating breakdown including:

  • Average Rating: Mean of all rated books
  • Unrated Percentage: Books without ratings
  • Distribution: Count per star rating (1-5 stars)

Data Source: ratings table in metadata.db Calculation:

SELECT rating/2.0 as stars, COUNT(*) 
FROM books_ratings_link
JOIN ratings ON ratings.id = books_ratings_link.rating
GROUP BY rating

Note: Calibre stores ratings as 0-10, displayed as 0-5 stars

Top Enforced Books

Books with most metadata enforcement operations.

Data Source:

SELECT book_id, book_title, COUNT(*) as enforcement_count,
       MAX(timestamp) as last_enforced
FROM cwa_enforcement
GROUP BY book_id
ORDER BY enforcement_count DESC
LIMIT 10

Trigger Types: EPUB_EDIT, MERGE, CONVERT Use Case: Identify frequently updated books

Import Source Flows

Sankey diagram showing book import pipeline from source to destination.

Tracked Flows:

  • Import Source → Import Method → Final Format
  • Example: "Web Upload" → "Calibredb Add" → "EPUB"

Data Source: cwa_import table with source/method tracking

API Usage Tab

Tracks OPDS, Kobo Sync, and REST API endpoint usage for monitoring integrations and third-party app connections.

API Usage Breakdown

Pie chart categorizing API calls by type.

Categories:

  • 📱 OPDS: OPDS feed access (general ebook readers)
  • 📖 Kobo Sync: Kobo device/app synchronization
  • 🔧 REST API: Direct API endpoint calls
  • 🌐 Web UI: Standard browser interface (baseline)

Data Source:

SELECT 
    CASE 
        WHEN endpoint LIKE '/opds%' THEN 'OPDS'
        WHEN endpoint LIKE '/kobo%' THEN 'Kobo Sync'
        WHEN endpoint LIKE '/api/%' THEN 'REST API'
        ELSE 'Web UI'
    END as category,
    COUNT(*)
FROM cwa_user_activity
WHERE event_type = 'OPDS_ACCESS' OR 'API_CALL'

Endpoint Frequency

Table showing most-accessed endpoints with call counts.

Columns:

  • Endpoint path (grouped by pattern)
  • Call count
  • Percentage of total

Grouping Logic: Similar endpoints consolidated (e.g., /kobo/{uuid}/state/kobo/*/state)

API Timing Heatmap

Heatmap showing API usage patterns by hour and day.

Use Case:

  • Identify automated sync schedules
  • Detect unusual access patterns
  • Plan API maintenance windows

Filtering & Time Ranges

Time Range Options

Preset Ranges:

  • Last 7 Days
  • Last 30 Days (default)
  • Last 90 Days
  • Last 6 Months
  • Last Year
  • All Time
  • Custom Range

Custom Range: Select specific start and end dates via date pickers

User Filtering

User Activity Tab Only: Filter by specific user to analyze individual behavior

Dropdown: Shows all users with logged activity Impact:

  • Changes "Top Users" to "Most Active Days"
  • Changes "Active Users" card to "Total Logins"
  • All charts filtered to selected user's data

Reset Button

Clears all filters and returns to default view (Last 30 Days, all users).

CSV Export

Usage

Click 📥 Export CSV button on any tab to download current view as CSV.

Filename Format: cwa_stats_{tab_name}_{timestamp}.csv

  • Example: cwa_stats_activity_20251231_143022.csv

CSV Structure

Multi-Section Format: Each CSV contains multiple datasets separated by headers

User Activity CSV Sections:

=== USER ACTIVITY STATISTICS ===

Metric,Value
total_events,1234
active_users,15
...

=== TOP USERS ===
User ID,Username,Event Count
1,admin,456
...

=== FORMAT DISTRIBUTION ===
Format,Download Count
EPUB,789
...

Library Stats CSV Sections:

  • Library summary statistics
  • Growth data (date, count)
  • Format distribution
  • Series completion list
  • Rating breakdown
  • Top enforced books

API Stats CSV Sections:

  • Usage breakdown by category
  • Endpoint frequency table

Filter Preservation

CSV exports respect current filters:

  • Time range (days, custom dates)
  • User selection (Activity tab)

Implementation: Appends current URL query parameters to export endpoint

Demo Mode

Purpose

Test dashboard visualization with realistic sample data when:

  • Library is newly set up with no activity
  • Presenting features to stakeholders
  • Testing chart layouts and performance

Activation

Click 🎨 Show Demo Data button on any tab.

Indicators:

  • Button changes to 📊 Show Real Data (green gradient)
  • DEMO MODE badge appears in filter row
  • Pulsing animation indicates temporary state

Demo Data Generation

User Activity:

  • 30 days of varied activity (100-300 events/day)
  • 8 synthetic users with realistic names
  • Weighted format distribution (EPUB 40%, MOBI 25%, PDF 20%, etc.)
  • Peak usage during evenings (6 PM - 10 PM)
  • Heavier weekend activity

Library Stats:

  • Growth curve with 500-2000 books
  • Realistic format distribution matching common library composition
  • Series with 3-12 books each
  • Publication years weighted toward 2000-present
  • Rating distribution following normal curve (mostly 3-4 stars)

API Stats:

  • OPDS: 60% of API calls (reader apps)
  • Kobo Sync: 25% (e-reader devices)
  • REST API: 10% (integrations)
  • Hourly patterns matching typical device sync schedules

Technical Details

Demo data generated client-side via JavaScript functions:

  • generateDemoData() - User Activity
  • generateLibraryDemoData() - Library Stats
  • generateApiDemoData() - API Stats

Respects: Current time range filters (generates data for selected period)

Adding New Tracked Endpoints

Step 1: Log Activity in Route Handler

In your Flask route, import and call the logging function:

from scripts.cwa_db import CWA_DB
from flask import current_user
import json

@blueprint.route('/your-new-endpoint', methods=['GET'])
def your_endpoint():
    # Your endpoint logic here
    result = do_something()
    
    # Log the activity
    try:
        cwa_db = CWA_DB()
        cwa_db.log_activity(
            user_id=current_user.id,
            user_name=current_user.name,
            event_type='YOUR_EVENT_TYPE',  # e.g., 'API_CALL', 'DOWNLOAD', 'SEARCH'
            item_id=book_id if applicable else None,
            item_title=book_title if applicable else None,
            extra_data=json.dumps({
                'endpoint': '/your-new-endpoint',
                'method': 'GET',
                'status': 'success',
                # Add any relevant metadata
            })
        )
    except Exception as e:
        log.debug(f"Failed to log activity: {e}")
    
    return result

Step 2: Choose Event Type

Standard Event Types:

  • LOGIN - User authentication
  • DOWNLOAD - File downloads
  • READ - Reader access
  • SEARCH - Catalog searches
  • EMAIL - Send-to-device operations
  • OPDS_ACCESS - OPDS feed access
  • API_CALL - REST API calls
  • SHELF_ADD / SHELF_REMOVE - Shelf operations

Custom Event Types: Can create new types, will automatically appear in dashboards

Step 3: Structure extra_data

Use JSON format for rich metadata:

extra_data = json.dumps({
    'format': 'EPUB',           # For downloads/reads
    'source': 'search',         # How user found this (search/series/author/etc)
    'device_type': 'mobile',    # Auto-added by log_activity()
    'endpoint': '/api/books',   # For API calls
    'method': 'GET',            # HTTP method
    'status': 'success',        # Operation result
    'query': 'fantasy books',   # For searches
    # Add custom fields as needed
})

Step 4: Update API Stats Queries (Optional)

If adding new API endpoint patterns, update categorization in get_api_usage_breakdown():

# In scripts/cwa_db.py
CASE 
    WHEN endpoint LIKE '/opds%' THEN 'OPDS'
    WHEN endpoint LIKE '/kobo%' THEN 'Kobo Sync'
    WHEN endpoint LIKE '/api/%' THEN 'REST API'
    WHEN endpoint LIKE '/your-new-api/%' THEN 'Your New API'  # Add this
    ELSE 'Web UI'
END as category

Step 5: Test Activity Logging

  1. Access your new endpoint
  2. Navigate to CWA Stats & Activity
  3. Check appropriate tab for new data
  4. Use /cwa-stats-debug endpoint to verify JSON structure
  5. Verify CSV export includes new data

Example: Adding OPDS Annotation Tracking

# In cps/opds.py
@opds.route('/opds/annotation', methods=['POST'])
def save_annotation():
    annotation_data = request.get_json()
    
    # Save annotation logic
    save_to_db(annotation_data)
    
    # Log activity
    try:
        cwa_db = CWA_DB()
        cwa_db.log_activity(
            user_id=current_user.id,
            user_name=current_user.name,
            event_type='OPDS_ACCESS',
            item_id=annotation_data.get('book_id'),
            item_title=get_book_title(annotation_data.get('book_id')),
            extra_data=json.dumps({
                'endpoint': '/opds/annotation',
                'method': 'POST',
                'action': 'save_annotation',
                'annotation_type': annotation_data.get('type')  # highlight, note, etc.
            })
        )
    except Exception as e:
        log.debug(f"Failed to log OPDS annotation: {e}")
    
    return jsonify({'status': 'success'})

Best Practices

  1. Always use try/except: Don't let logging failures break your endpoint
  2. Log after success: Only log completed operations, not failed attempts (except LOGIN_FAILED)
  3. Include context: Add relevant metadata to extra_data for richer analytics
  4. Consistent naming: Use existing event types when possible
  5. Privacy aware: Don't log sensitive data (passwords, email content, etc.)
  6. Device detection: Automatic via User-Agent, no manual tracking needed
  7. Test thoroughly: Verify data appears correctly in dashboard and CSV exports

Database Schema

cwa_user_activity Table

Primary table for all activity tracking.

CREATE TABLE IF NOT EXISTS cwa_user_activity (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    timestamp TEXT DEFAULT CURRENT_TIMESTAMP,
    user_id INTEGER,
    user_name TEXT,
    event_type TEXT NOT NULL,
    item_id INTEGER,
    item_title TEXT,
    extra_data TEXT  -- JSON string
);

CREATE INDEX IF NOT EXISTS idx_activity_timestamp ON cwa_user_activity(timestamp);
CREATE INDEX IF NOT EXISTS idx_activity_user ON cwa_user_activity(user_id);
CREATE INDEX IF NOT EXISTS idx_activity_event ON cwa_user_activity(event_type);

Columns:

  • id: Auto-increment primary key
  • timestamp: ISO-8601 format (YYYY-MM-DD HH:MM:SS)
  • user_id: Foreign key to Calibre-Web user table
  • user_name: Denormalized for easier querying
  • event_type: Activity classification (LOGIN, DOWNLOAD, etc.)
  • item_id: Book ID when applicable
  • item_title: Book title (denormalized)
  • extra_data: JSON object with event-specific metadata

JSON Fields in extra_data:

{
    "format": "EPUB",
    "device_type": "mobile",
    "source": "search",
    "endpoint": "/api/books",
    "method": "GET",
    "query": "search terms"
}

Related Tables

cwa_import: Book import history cwa_conversions: Format conversion history
cwa_enforcement: Metadata enforcement operations epub_fixes: EPUB fixer service logs

These tables power the Library Stats tab but are separate from user activity tracking.


Troubleshooting

"No data available" in charts

Cause: No activity logged in selected time range

Solutions:

  1. Expand time range (try "All Time")
  2. Generate activity (download books, perform searches)
  3. Enable Demo Mode to test visualizations
  4. Check /cwa-stats-debug to verify database has records

Malformed JSON errors

Cause: Legacy activity records with plain string extra_data

Solution: Already handled automatically via json_valid() checks in queries. Legacy data displays using fallback logic.

Missing device breakdown data

Cause: Activity logged before device detection feature

Solution: New activity automatically includes device type. Old records default to "unknown".

CSV export shows "error.csv"

Cause: Database query failure or malformed data

Solutions:

  1. Check error message in downloaded CSV file
  2. Visit /cwa-stats-debug to identify data issues
  3. Check Docker logs: docker logs cwa-dev | grep ERROR
  4. Verify database file permissions: /config/cwa.db

Charts not updating after activity

Cause: Browser cache or time range filter

Solutions:

  1. Click "Reset" button to refresh
  2. Hard refresh browser (Ctrl+Shift+R / Cmd+Shift+R)
  3. Verify activity is within selected time range
  4. Check that event types match tab (e.g., DOWNLOADS on User Activity)

Performance Considerations

Large Databases

For installations with >100,000 activity records:

  1. Use narrower time ranges: Default to 30 days instead of All Time
  2. Regular VACUUM: sqlite3 /config/cwa.db "VACUUM;" to optimize database
  3. Archive old data: Optionally export and truncate records >1 year old

Database Indexes

All critical queries are indexed:

  • idx_activity_timestamp: Time range filtering
  • idx_activity_user: Per-user queries
  • idx_activity_event: Event type filtering

Query Optimization

Queries use:

  • Date-based partitioning
  • Aggregation at database level
  • Limited result sets (TOP 10, LIMIT clauses)
  • JSON validation checks to prevent errors

Privacy & Security

Admin-Only Access

Stats dashboard requires admin role. Non-admin users cannot:

  • View any statistics
  • Access activity logs
  • Export CSV data
  • See other users' behavior

Data Retention

Activity data persists indefinitely by default. Consider:

  • Periodic exports for archival
  • Scheduled cleanup of old records
  • GDPR compliance if serving EU users

Anonymization

To anonymize historical data:

UPDATE cwa_user_activity 
SET user_name = 'User' || user_id,
    extra_data = json_remove(extra_data, '$.username_attempted', '$.ip')
WHERE timestamp < date('now', '-1 year');

Future Enhancements

Potential additions (not yet implemented):

  • Reading time tracking: Actual time spent in reader
  • Download heatmaps: Geographic distribution via IP
  • Recommendation effectiveness: Track recommended book access
  • Conversion success rates: Track failures vs successes
  • Email delivery tracking: Kindle/device email status
  • Scheduled reports: Daily/weekly email digests
  • Alerting: Unusual activity patterns, failed logins
  • User-level stats: Non-admin users see own activity

For questions or feature requests, please visit the GitHub repository or Discord community.

Clone this wiki locally