-
-
Notifications
You must be signed in to change notification settings - Fork 450
Stats Activity Dashboard
The CWA Stats & Activity Dashboard provides comprehensive insights into library usage, user behavior, and system performance through three specialized tabs: User Activity, Library Statistics, and API Usage.
- Overview
- User Activity Tab
- Library Statistics Tab
- API Usage Tab
- Filtering & Time Ranges
- CSV Export
- Demo Mode
- Adding New Tracked Endpoints
- Database Schema
Access: Navigate to Admin → CWA Stats & Activity (requires admin privileges)
Features:
- 📊 Real-time activity tracking and visualization
- 📅 Flexible time range filtering (7 days to All Time)
- 🎯 Per-user filtering for detailed analysis
- 📥 CSV export for external analysis
- 🎨 Demo mode with realistic sample data
- 📱 Responsive charts powered by Apache ECharts
Tracks all user interactions with your library including logins, downloads, searches, and reading activity.
Total Events: Sum of all tracked user activities
- Measured: COUNT(*) from cwa_user_activity table
- Includes: LOGIN, DOWNLOAD, READ, SEARCH, EMAIL, SHELF operations
Active Users: Number of unique users with activity
- Measured: COUNT(DISTINCT user_id)
- When filtering by user: Shows total login count instead
Downloads: Total ebook downloads/sends
- Measured: COUNT where event_type IN ('DOWNLOAD', 'EMAIL')
- Includes: Direct downloads, send-to-kindle, send-to-device
Books Read: Unique books accessed via reader
- Measured: COUNT(DISTINCT item_id) where event_type = 'READ'
Searches: Total search queries
- Measured: COUNT where event_type = 'SEARCH'
Line chart showing daily activity broken down by event type.
Data Source:
SELECT date(timestamp) as day, event_type, COUNT(*)
FROM cwa_user_activity
GROUP BY day, event_typeEvent Types Tracked:
- 🔑 LOGIN: User authentication
- 📥 DOWNLOAD: Direct ebook downloads
- 📖 READ: In-browser reader access
- 🔍 SEARCH: Catalog searches
- 📧 EMAIL: Send-to-kindle/device
- 📚 SHELF: Shelf add/remove operations
Heatmap showing activity patterns by day of week and hour of day.
Measured: Hour extracted from timestamp, day-of-week calculation
- Scale: 0-23 hours (24-hour format)
- Days: Sunday (0) through Saturday (6)
- Use Case: Identify optimal maintenance windows or peak server load times
Pie chart showing breakdown of downloaded formats.
Data Source:
SELECT UPPER(COALESCE(
CASE WHEN json_valid(extra_data)
THEN json_extract(extra_data, '$.format')
ELSE extra_data
END,
'UNKNOWN'
)) as format
WHERE event_type IN ('DOWNLOAD', 'EMAIL')Tracked Formats: EPUB, MOBI, AZW3, PDF, KEPUB, etc.
When viewing all users: Shows top 10 users by activity count When filtering by user: Shows most active days for that user
Data Source:
-- All users mode
SELECT user_id, user_name, COUNT(*) as activity_count
GROUP BY user_id
ORDER BY activity_count DESC LIMIT 10
-- Single user mode
SELECT date(timestamp) as day, COUNT(*) as activity_count
WHERE user_id = ?
GROUP BY day
ORDER BY activity_count DESC LIMIT 10Top 10 books by combined reads/downloads/emails.
Measured: COUNT(*) where event_type IN ('READ', 'DOWNLOAD', 'EMAIL') Use Case: Identify trending titles, popular genres
Last 15 search queries with timestamps and usernames.
Data Source: extra_data field for SEARCH events
Privacy Note: Admin-only view, helps identify catalog gaps
Line chart with 4-week moving average showing books read over time.
Measured: COUNT(DISTINCT item_id) per week where event_type = 'READ' Moving Average: Smooths out weekly variations for trend identification
Stacked bar chart showing which formats each user prefers.
Data Source:
SELECT user_name,
UPPER(json_extract(extra_data, '$.format')) as format,
COUNT(*)
WHERE event_type IN ('DOWNLOAD', 'READ', 'EMAIL')
GROUP BY user_name, formatUse Case: Optimize conversion settings per user preference
Pie chart showing how users find books.
Tracked Sources:
- 🔍 search: Via search results
- 📖 book_detail: From related books on detail page
- 📚 series: Through series browsing
- 👤 author: From author pages
- 🏷️ category: Category/genre browsing
- 📂 shelf: From custom shelves
- ➡️ direct: Direct URL access
Data Source: json_extract(extra_data, '$.source') on READ/DOWNLOAD events
Pie chart showing desktop vs mobile vs tablet usage.
Detection Method: User-Agent string parsing during log_activity()
if 'mobile' in user_agent or 'android' in user_agent:
device_type = 'mobile'
elif 'tablet' in user_agent or 'ipad' in user_agent:
device_type = 'tablet'
else:
device_type = 'desktop'Stored: json_extract(extra_data, '$.device_type')
Gauge chart showing average time between login and last activity.
Calculation:
AVG(
(julianday(last_activity) - julianday(first_activity)) * 24 * 60
) as avg_minutes
WHERE user has multiple events in sessionGauge showing percentage of searches yielding results.
Measured: Heuristic based on search followed by download/read within timeframe Use Case: Identify catalog coverage issues
List showing adds/removes per shelf with net change indicators.
Data Source: SHELF_ADD and SHELF_REMOVE events Display: Top 10 most active shelves
Tracks library growth, content composition, and system automation effectiveness.
Total Books: Current library size from Calibre metadata.db Books Added: Count imported during selected time range Conversions: Successful format conversions Format Count: Number of distinct formats in library
Area chart showing cumulative book count over time.
Data Source:
SELECT date(timestamp), COUNT(*) as books_added
FROM cwa_import
GROUP BY date(timestamp)Calculation: Cumulative sum to show total library size progression
Bar chart showing book count per format.
Data Source: Calibre metadata.db data table Note: Shows all formats available in library, not just added formats
Table showing series completion status.
Columns:
- Series Name
- Book Count (owned)
- Highest Index (series length indicator)
Calculation:
SELECT series.name, COUNT(books.id), MAX(books.series_index)
FROM books_series_link
JOIN series ON series.id = books_series_link.series
GROUP BY series.idUse Case: Identify incomplete series for acquisition
Bar chart showing books by publication year.
Data Source: pubdate field from metadata.db
Filters: Typically limited to years 1900-present
Comprehensive rating breakdown including:
- Average Rating: Mean of all rated books
- Unrated Percentage: Books without ratings
- Distribution: Count per star rating (1-5 stars)
Data Source: ratings table in metadata.db Calculation:
SELECT rating/2.0 as stars, COUNT(*)
FROM books_ratings_link
JOIN ratings ON ratings.id = books_ratings_link.rating
GROUP BY ratingNote: Calibre stores ratings as 0-10, displayed as 0-5 stars
Books with most metadata enforcement operations.
Data Source:
SELECT book_id, book_title, COUNT(*) as enforcement_count,
MAX(timestamp) as last_enforced
FROM cwa_enforcement
GROUP BY book_id
ORDER BY enforcement_count DESC
LIMIT 10Trigger Types: EPUB_EDIT, MERGE, CONVERT Use Case: Identify frequently updated books
Sankey diagram showing book import pipeline from source to destination.
Tracked Flows:
- Import Source → Import Method → Final Format
- Example: "Web Upload" → "Calibredb Add" → "EPUB"
Data Source: cwa_import table with source/method tracking
Tracks OPDS, Kobo Sync, and REST API endpoint usage for monitoring integrations and third-party app connections.
Pie chart categorizing API calls by type.
Categories:
- 📱 OPDS: OPDS feed access (general ebook readers)
- 📖 Kobo Sync: Kobo device/app synchronization
- 🔧 REST API: Direct API endpoint calls
- 🌐 Web UI: Standard browser interface (baseline)
Data Source:
SELECT
CASE
WHEN endpoint LIKE '/opds%' THEN 'OPDS'
WHEN endpoint LIKE '/kobo%' THEN 'Kobo Sync'
WHEN endpoint LIKE '/api/%' THEN 'REST API'
ELSE 'Web UI'
END as category,
COUNT(*)
FROM cwa_user_activity
WHERE event_type = 'OPDS_ACCESS' OR 'API_CALL'Table showing most-accessed endpoints with call counts.
Columns:
- Endpoint path (grouped by pattern)
- Call count
- Percentage of total
Grouping Logic: Similar endpoints consolidated (e.g., /kobo/{uuid}/state → /kobo/*/state)
Heatmap showing API usage patterns by hour and day.
Use Case:
- Identify automated sync schedules
- Detect unusual access patterns
- Plan API maintenance windows
Preset Ranges:
- Last 7 Days
- Last 30 Days (default)
- Last 90 Days
- Last 6 Months
- Last Year
- All Time
- Custom Range
Custom Range: Select specific start and end dates via date pickers
User Activity Tab Only: Filter by specific user to analyze individual behavior
Dropdown: Shows all users with logged activity Impact:
- Changes "Top Users" to "Most Active Days"
- Changes "Active Users" card to "Total Logins"
- All charts filtered to selected user's data
Clears all filters and returns to default view (Last 30 Days, all users).
Click 📥 Export CSV button on any tab to download current view as CSV.
Filename Format: cwa_stats_{tab_name}_{timestamp}.csv
- Example:
cwa_stats_activity_20251231_143022.csv
Multi-Section Format: Each CSV contains multiple datasets separated by headers
User Activity CSV Sections:
=== USER ACTIVITY STATISTICS ===
Metric,Value
total_events,1234
active_users,15
...
=== TOP USERS ===
User ID,Username,Event Count
1,admin,456
...
=== FORMAT DISTRIBUTION ===
Format,Download Count
EPUB,789
...Library Stats CSV Sections:
- Library summary statistics
- Growth data (date, count)
- Format distribution
- Series completion list
- Rating breakdown
- Top enforced books
API Stats CSV Sections:
- Usage breakdown by category
- Endpoint frequency table
CSV exports respect current filters:
- Time range (days, custom dates)
- User selection (Activity tab)
Implementation: Appends current URL query parameters to export endpoint
Test dashboard visualization with realistic sample data when:
- Library is newly set up with no activity
- Presenting features to stakeholders
- Testing chart layouts and performance
Click 🎨 Show Demo Data button on any tab.
Indicators:
- Button changes to 📊 Show Real Data (green gradient)
- DEMO MODE badge appears in filter row
- Pulsing animation indicates temporary state
User Activity:
- 30 days of varied activity (100-300 events/day)
- 8 synthetic users with realistic names
- Weighted format distribution (EPUB 40%, MOBI 25%, PDF 20%, etc.)
- Peak usage during evenings (6 PM - 10 PM)
- Heavier weekend activity
Library Stats:
- Growth curve with 500-2000 books
- Realistic format distribution matching common library composition
- Series with 3-12 books each
- Publication years weighted toward 2000-present
- Rating distribution following normal curve (mostly 3-4 stars)
API Stats:
- OPDS: 60% of API calls (reader apps)
- Kobo Sync: 25% (e-reader devices)
- REST API: 10% (integrations)
- Hourly patterns matching typical device sync schedules
Demo data generated client-side via JavaScript functions:
-
generateDemoData()- User Activity -
generateLibraryDemoData()- Library Stats -
generateApiDemoData()- API Stats
Respects: Current time range filters (generates data for selected period)
In your Flask route, import and call the logging function:
from scripts.cwa_db import CWA_DB
from flask import current_user
import json
@blueprint.route('/your-new-endpoint', methods=['GET'])
def your_endpoint():
# Your endpoint logic here
result = do_something()
# Log the activity
try:
cwa_db = CWA_DB()
cwa_db.log_activity(
user_id=current_user.id,
user_name=current_user.name,
event_type='YOUR_EVENT_TYPE', # e.g., 'API_CALL', 'DOWNLOAD', 'SEARCH'
item_id=book_id if applicable else None,
item_title=book_title if applicable else None,
extra_data=json.dumps({
'endpoint': '/your-new-endpoint',
'method': 'GET',
'status': 'success',
# Add any relevant metadata
})
)
except Exception as e:
log.debug(f"Failed to log activity: {e}")
return resultStandard Event Types:
-
LOGIN- User authentication -
DOWNLOAD- File downloads -
READ- Reader access -
SEARCH- Catalog searches -
EMAIL- Send-to-device operations -
OPDS_ACCESS- OPDS feed access -
API_CALL- REST API calls -
SHELF_ADD/SHELF_REMOVE- Shelf operations
Custom Event Types: Can create new types, will automatically appear in dashboards
Use JSON format for rich metadata:
extra_data = json.dumps({
'format': 'EPUB', # For downloads/reads
'source': 'search', # How user found this (search/series/author/etc)
'device_type': 'mobile', # Auto-added by log_activity()
'endpoint': '/api/books', # For API calls
'method': 'GET', # HTTP method
'status': 'success', # Operation result
'query': 'fantasy books', # For searches
# Add custom fields as needed
})If adding new API endpoint patterns, update categorization in get_api_usage_breakdown():
# In scripts/cwa_db.py
CASE
WHEN endpoint LIKE '/opds%' THEN 'OPDS'
WHEN endpoint LIKE '/kobo%' THEN 'Kobo Sync'
WHEN endpoint LIKE '/api/%' THEN 'REST API'
WHEN endpoint LIKE '/your-new-api/%' THEN 'Your New API' # Add this
ELSE 'Web UI'
END as category- Access your new endpoint
- Navigate to CWA Stats & Activity
- Check appropriate tab for new data
- Use
/cwa-stats-debugendpoint to verify JSON structure - Verify CSV export includes new data
# In cps/opds.py
@opds.route('/opds/annotation', methods=['POST'])
def save_annotation():
annotation_data = request.get_json()
# Save annotation logic
save_to_db(annotation_data)
# Log activity
try:
cwa_db = CWA_DB()
cwa_db.log_activity(
user_id=current_user.id,
user_name=current_user.name,
event_type='OPDS_ACCESS',
item_id=annotation_data.get('book_id'),
item_title=get_book_title(annotation_data.get('book_id')),
extra_data=json.dumps({
'endpoint': '/opds/annotation',
'method': 'POST',
'action': 'save_annotation',
'annotation_type': annotation_data.get('type') # highlight, note, etc.
})
)
except Exception as e:
log.debug(f"Failed to log OPDS annotation: {e}")
return jsonify({'status': 'success'})- Always use try/except: Don't let logging failures break your endpoint
- Log after success: Only log completed operations, not failed attempts (except LOGIN_FAILED)
-
Include context: Add relevant metadata to
extra_datafor richer analytics - Consistent naming: Use existing event types when possible
- Privacy aware: Don't log sensitive data (passwords, email content, etc.)
- Device detection: Automatic via User-Agent, no manual tracking needed
- Test thoroughly: Verify data appears correctly in dashboard and CSV exports
Primary table for all activity tracking.
CREATE TABLE IF NOT EXISTS cwa_user_activity (
id INTEGER PRIMARY KEY AUTOINCREMENT,
timestamp TEXT DEFAULT CURRENT_TIMESTAMP,
user_id INTEGER,
user_name TEXT,
event_type TEXT NOT NULL,
item_id INTEGER,
item_title TEXT,
extra_data TEXT -- JSON string
);
CREATE INDEX IF NOT EXISTS idx_activity_timestamp ON cwa_user_activity(timestamp);
CREATE INDEX IF NOT EXISTS idx_activity_user ON cwa_user_activity(user_id);
CREATE INDEX IF NOT EXISTS idx_activity_event ON cwa_user_activity(event_type);Columns:
-
id: Auto-increment primary key -
timestamp: ISO-8601 format (YYYY-MM-DD HH:MM:SS) -
user_id: Foreign key to Calibre-Web user table -
user_name: Denormalized for easier querying -
event_type: Activity classification (LOGIN, DOWNLOAD, etc.) -
item_id: Book ID when applicable -
item_title: Book title (denormalized) -
extra_data: JSON object with event-specific metadata
JSON Fields in extra_data:
{
"format": "EPUB",
"device_type": "mobile",
"source": "search",
"endpoint": "/api/books",
"method": "GET",
"query": "search terms"
}cwa_import: Book import history
cwa_conversions: Format conversion history
cwa_enforcement: Metadata enforcement operations
epub_fixes: EPUB fixer service logs
These tables power the Library Stats tab but are separate from user activity tracking.
Cause: No activity logged in selected time range
Solutions:
- Expand time range (try "All Time")
- Generate activity (download books, perform searches)
- Enable Demo Mode to test visualizations
- Check
/cwa-stats-debugto verify database has records
Cause: Legacy activity records with plain string extra_data
Solution: Already handled automatically via json_valid() checks in queries. Legacy data displays using fallback logic.
Cause: Activity logged before device detection feature
Solution: New activity automatically includes device type. Old records default to "unknown".
Cause: Database query failure or malformed data
Solutions:
- Check error message in downloaded CSV file
- Visit
/cwa-stats-debugto identify data issues - Check Docker logs:
docker logs cwa-dev | grep ERROR - Verify database file permissions:
/config/cwa.db
Cause: Browser cache or time range filter
Solutions:
- Click "Reset" button to refresh
- Hard refresh browser (Ctrl+Shift+R / Cmd+Shift+R)
- Verify activity is within selected time range
- Check that event types match tab (e.g., DOWNLOADS on User Activity)
For installations with >100,000 activity records:
- Use narrower time ranges: Default to 30 days instead of All Time
-
Regular VACUUM:
sqlite3 /config/cwa.db "VACUUM;"to optimize database - Archive old data: Optionally export and truncate records >1 year old
All critical queries are indexed:
-
idx_activity_timestamp: Time range filtering -
idx_activity_user: Per-user queries -
idx_activity_event: Event type filtering
Queries use:
- Date-based partitioning
- Aggregation at database level
- Limited result sets (TOP 10, LIMIT clauses)
- JSON validation checks to prevent errors
Stats dashboard requires admin role. Non-admin users cannot:
- View any statistics
- Access activity logs
- Export CSV data
- See other users' behavior
Activity data persists indefinitely by default. Consider:
- Periodic exports for archival
- Scheduled cleanup of old records
- GDPR compliance if serving EU users
To anonymize historical data:
UPDATE cwa_user_activity
SET user_name = 'User' || user_id,
extra_data = json_remove(extra_data, '$.username_attempted', '$.ip')
WHERE timestamp < date('now', '-1 year');Potential additions (not yet implemented):
- Reading time tracking: Actual time spent in reader
- Download heatmaps: Geographic distribution via IP
- Recommendation effectiveness: Track recommended book access
- Conversion success rates: Track failures vs successes
- Email delivery tracking: Kindle/device email status
- Scheduled reports: Daily/weekly email digests
- Alerting: Unusual activity patterns, failed logins
- User-level stats: Non-admin users see own activity
For questions or feature requests, please visit the GitHub repository or Discord community.