Automated signal decomposition and diagnostic engine for high-fidelity time series preprocessing.
tseda is a high-fidelity Python framework designed to automate the preprocessing,
signal decomposition, and structural validation of regularly sampled time-series data.
By automating Singular Spectrum Analysis (SSA) parameter selection, tseda eliminates
the manual guesswork that traditionally introduces bias into production analytical pipelines.
It bridges the gap between raw data ingestion and model-ready signals by providing
deterministic heuristics for window selection and component grouping.
For a short guided tour of tseda, watch the overview video on YouTube: https://www.youtube.com/watch?v=baoJrIpSTE8.
- Automated SSA Window Calibration: Selects the SSA window from the sampling frequency and the spread of the eigen spectrum, so that a poorly chosen window does not distort latent signals.
- Deterministic Component Grouping: Implements a robust variance-and-correlation heuristic (leveraging the Kneedle algorithm) to mathematically isolate trend and seasonality from ambient noise.
- Bi-Directional Interface Flex: Seamlessly transition between a headless Notebook SDK for automated CI/CD batch pipelines and an interactive Plotly Observability Dashboard for expert human-in-the-loop validation.
- KMDS Lineage Persistence: Serializes pipeline decisions, reconstruction metadata, and analytical summaries directly into an ontology-backed KMDS knowledge graph for compliance auditing and data lineage tracking.
- Structural Suitability Validation: A built-in gate that quantifies whether a dataset possesses enough internal structure for meaningful decomposition, preventing the "garbage-in, garbage-out" failure mode common in automated SSA.
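To make the knee idea behind component grouping concrete, here is a minimal NumPy sketch. It is not the Kneedle algorithm itself and not tseda's implementation: it uses a simplified stand-in that picks the point of a decreasing curve lying farthest below the chord joining its endpoints. The function name `knee_index` and the sample variance shares are hypothetical.

```python
import numpy as np

def knee_index(values):
    """Index of the point farthest below the chord of a decreasing curve."""
    y = np.asarray(values, dtype=float)
    if y.max() == y.min():
        return 0  # a flat curve has no knee
    x = np.arange(len(y), dtype=float)
    # Normalise both axes to [0, 1] so the comparison is scale-free.
    x_n = x / x[-1]
    y_n = (y - y.min()) / (y.max() - y.min())
    # After normalisation the chord of a decreasing curve is y = 1 - x.
    gap = (1.0 - x_n) - y_n
    return int(np.argmax(gap))

# Hypothetical variance shares of SSA components, largest first.
shares = np.array([0.55, 0.25, 0.08, 0.04, 0.03, 0.03, 0.02])
k = knee_index(shares)
print("knee at component index", k)
```

Under this sketch, components up to the knee would be candidates for trend and seasonality, and the rest would be treated as the noise floor.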
tseda is designed to fit into standard enterprise version-control and deployment lifecycles:
- Pull: Check out the project branch within your enterprise version-control environment.
- Execute: Run `tseda` headlessly via the Notebook SDK to compute baseline signal profiles and export components.
- Verify: Launch the interactive dashboard to audit edge cases or manually override heuristics for complex signals.
- Commit: Persist the final analytical state and observations directly to the metadata repository via the KMDS integration.
tseda organizes time series analysis into a structured three-phase process to ensure methodological consistency.
Evaluate the raw signal's statistical properties before decomposition.
- Distribution Analysis: Utilize Kernel Density Estimates (KDE) and box plots to identify multi-modality or outliers.
- Autocorrelation Profiling: ACF and PACF plots provide immediate indicators of seasonal structure and autoregressive components.
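As an illustration of what the ACF reveals, the following sketch computes a sample autocorrelation function with plain NumPy on a synthetic series with a 12-step cycle. The helper `sample_acf` and the toy series are assumptions for this example; tseda's own plots may be computed differently.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    acf = [1.0]
    for k in range(1, max_lag + 1):
        # Correlate the series with a copy of itself shifted by k steps.
        acf.append(np.dot(x[:-k], x[k:]) / denom)
    return np.array(acf)

t = np.arange(120)
series = np.sin(2 * np.pi * t / 12)  # synthetic cycle of period 12
acf = sample_acf(series, max_lag=24)
print("lag 6:", round(acf[6], 2), "lag 12:", round(acf[12], 2))
```

A strong positive value at the seasonal lag and a strong negative value at half that lag are the kind of pattern that points to seasonal structure.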
Window selection and component grouping are the two hardest engineering bottlenecks when applying SSA. Choosing the wrong window distorts the eigen spectrum; grouping the wrong components conflates trend with noise. tseda automates both.
The app first computes an initial SSA window from the detected cadence, then validates whether the eigen spectrum has enough spread. If the smallest eigenvalue still explains too much variance, the window is doubled and SSA is recomputed until the criterion is satisfied.
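The doubling loop described above can be sketched in NumPy as follows. This is an illustration under stated assumptions, not tseda's code: the function names, the 1% tail-share threshold, and the cap of half the series length are all hypothetical.

```python
import numpy as np

def ssa_eigen_shares(x, window):
    """Variance share of each SSA component for a given window length."""
    x = np.asarray(x, dtype=float)
    cols = len(x) - window + 1
    # Trajectory (Hankel) matrix: column j holds x[j : j + window].
    traj = np.column_stack([x[j:j + window] for j in range(cols)])
    eig = np.linalg.svd(traj, compute_uv=False) ** 2
    return eig / eig.sum()

def calibrate_window(x, initial_window, max_tail_share=0.01):
    """Double the window until the smallest eigenvalue's share is small."""
    window = initial_window
    limit = len(x) // 2  # assumed upper bound on the window
    while True:
        shares = ssa_eigen_shares(x, window)
        if shares[-1] <= max_tail_share or window * 2 > limit:
            return window, shares
        window *= 2

t = np.arange(120)
x = np.sin(2 * np.pi * t / 12)  # noise-free toy signal
window, shares = calibrate_window(x, initial_window=2)
print("chosen window:", window)
```

With a window of 2 the smaller eigenvalue still carries a noticeable share of the variance, so the loop doubles once; at a window of 4 the trailing eigenvalues of this noise-free sine are numerically zero and the loop stops.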
Before grouping, tseda checks if variance is concentrated in a small number of leading eigenvectors. A flat eigenspectrum is the signature of white noise; applying SSA to such a series is mathematically valid but practically meaningless.
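A rough way to see the difference between a structured series and white noise is to compare how much variance the leading eigenvalues capture. The sketch below is an assumption-laden illustration: `top_k_ratio`, the choice of `top_k=3`, the mean-centering step, and the window of 24 are hypothetical and may not match tseda's gate.

```python
import numpy as np

def top_k_ratio(x, window, top_k=3):
    """Share of total variance captured by the top_k SSA eigenvalues."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()  # assumption: centre the series first
    cols = len(x) - window + 1
    traj = np.column_stack([x[j:j + window] for j in range(cols)])
    eig = np.linalg.svd(traj, compute_uv=False) ** 2
    return eig[:top_k].sum() / eig.sum()

rng = np.random.default_rng(0)
t = np.arange(240)
structured = np.sin(2 * np.pi * t / 12) + 0.1 * rng.standard_normal(240)
noise = rng.standard_normal(240)

ratio_structured = top_k_ratio(structured, window=24)
ratio_noise = top_k_ratio(noise, window=24)
print("structured:", round(ratio_structured, 3), "noise:", round(ratio_noise, 3))
```

The structured series concentrates nearly all of its variance in the leading components, while the noise series spreads it thinly across all of them; a threshold between the two values would separate them.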
Change point detection is executed automatically post-grouping, covering two structural shifts:
- Trend shifts — detects permanent changes in the long-run mean level (PELT on the normalised Trend component).
- Seasonal amplitude shifts — detects points where the seasonal pattern becomes noticeably stronger or weaker (PELT on the rolling-RMS envelope of the Seasonality component).
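The rolling-RMS envelope mentioned for seasonal amplitude shifts can be illustrated with a few lines of NumPy. This sketch covers only the envelope step; the change point search itself (PELT) is not reproduced here. The function name, the window width of one seasonal period, and the synthetic series are assumptions.

```python
import numpy as np

def rolling_rms(x, width):
    """Root-mean-square of x over a sliding window of the given width."""
    x = np.asarray(x, dtype=float)
    kernel = np.ones(width) / width
    # mode="valid" keeps only windows that lie fully inside the series.
    return np.sqrt(np.convolve(x ** 2, kernel, mode="valid"))

t = np.arange(240)
amplitude = np.where(t < 120, 1.0, 3.0)  # amplitude triples at t = 120
seasonal = amplitude * np.sin(2 * np.pi * t / 12)
envelope = rolling_rms(seasonal, width=12)
print("before:", round(envelope[0], 3), "after:", round(envelope[-1], 3))
```

The oscillation is removed and what remains is a nearly piecewise-constant level that steps up where the amplitude changes, which is a much easier target for a mean-shift detector than the raw seasonal component.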
Use the Akaike Information Criterion (AIC) as a function of model rank to provide a principled guide for model order selection. Finalize the analysis by saving structured findings to the KMDS knowledge base.
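One way to picture AIC as a function of rank is to reconstruct the series from its first r SSA components and score the residuals. The sketch below uses the Gaussian form n·ln(RSS/n) + 2r and counts the rank itself as the number of parameters; both choices are assumptions for illustration, and tseda's diagnostic may use a different penalty.

```python
import numpy as np

def ssa_reconstruct(x, window, rank):
    """Rank-limited SSA reconstruction via SVD and diagonal averaging."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    cols = n - window + 1
    traj = np.column_stack([x[j:j + window] for j in range(cols)])
    u, s, vt = np.linalg.svd(traj, full_matrices=False)
    approx = (u[:, :rank] * s[:rank]) @ vt[:rank]
    # Diagonal averaging: entry (i, j) contributes to time index i + j.
    recon = np.zeros(n)
    counts = np.zeros(n)
    for j in range(cols):
        recon[j:j + window] += approx[:, j]
        counts[j:j + window] += 1
    return recon / counts

def aic_by_rank(x, window, max_rank):
    """AIC of the residuals for each rank from 1 to max_rank."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    scores = {}
    for r in range(1, max_rank + 1):
        rss = np.sum((x - ssa_reconstruct(x, window, r)) ** 2)
        scores[r] = n * np.log(rss / n) + 2 * r  # assumed parameter count: r
    return scores

rng = np.random.default_rng(0)
t = np.arange(240)
x = np.sin(2 * np.pi * t / 12) + 0.1 * rng.standard_normal(240)
aic = aic_by_rank(x, window=24, max_rank=6)
for r, score in aic.items():
    print(r, round(score, 1))
```

For a single sinusoid the score drops sharply between rank 1 and rank 2, because a sine occupies a pair of SSA components; later ranks mostly absorb noise.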
The package also provides a notebook interface to these features. If you have a new dataset that you want to analyze, look at the data loader directory for examples. Download your dataset, clean it, produce your time series, and analyze it with tseda.
- UI and notebook parity: Anything you can do in the UI should be scriptable in notebooks.
- Configuration-first behavior: Runtime thresholds and heuristics are externalized in `src/tseda/config/tseda_config.yaml`.
- Explicit controls for decomposition: Window size and component grouping are treated as first-class controls in both UI and Python API.
- Composable feature calls: Plotting, decomposition, diagnostics, and reporting are exposed as separate methods so users can build custom analysis flows.
Use NotebookThreeStepAPI for the same three-step workflow directly in Python:
from tseda.notebook_api import NotebookThreeStepAPI, load_series_from_csv
series = load_series_from_csv("data/coffee_prices.csv")
api = NotebookThreeStepAPI(series)
# Step 1: initial assessment
fig_kde = api.get_kde_plot(show_kde=True, bin_algorithm="scott")
fig_acf = api.get_acf_plot(lags=40)
# Step 2: decomposition with explicit window and grouping control
current_window = api.get_window()
api.set_window(current_window, apply_window_refinement=True)
grouping, dw_ok = api.suggest_grouping()
api.set_grouping(grouping=grouping)
fig_recon = api.get_reconstruction_plot()
# Step 3: observation logging outputs
fig_var = api.get_variance_explained_plot()
report_text = api.generate_observation_text()

Key notebook API capabilities:
- `get_kde_plot(..., bin_algorithm="scott")` with configurable histogram bin algorithms (`auto`, `fd`, `doane`, `scott`, `stone`, `rice`, `sturges`, `sqrt`).
- `get_window()` / `set_window(...)` for explicit SSA window control.
- `suggest_grouping(grouping_config=...)` / `set_grouping(...)` / `get_grouping()` for explicit component assignment control, including kneedle/noise-floor overrides.
- `suggest_grouping_with_window_autotune(...)` to retry grouping with automatic window reassignment until DW is in range or the window limit is reached.
- `get_grouping_heuristic_configuration()` to inspect active grouping-heuristic config values.
- `get_suitability_result(...)` for the same top-k eigenvalue suitability gate used by the UI.

from pathlib import Path
from tseda.notebook_api import NotebookThreeStepAPI, load_example_series
# Assumes this script runs from the repository root.
workspace_root = Path.cwd()
# Load an example dataset.
series = load_example_series("coffee_prices", workspace_root=workspace_root)
api = NotebookThreeStepAPI(series)
# -------------------------
# Step 1: Initial Assessment
# -------------------------
sampling_df = api.get_sampling_properties()
stats_df = api.get_summary_statistics()
kde_fig = api.get_kde_plot(show_kde=True, bin_algorithm="scott")
acf_fig = api.get_acf_plot(lags=30)
pacf_fig = api.get_pacf_plot(lags=30, method="yw")
# -------------------------
# Step 2: Decomposition
# -------------------------
window_before = api.get_window()
window_after = api.set_window(window_before, apply_window_refinement=True)
suitability = api.get_suitability_result()
if not suitability.is_suitable:
    raise RuntimeError(
        f"Dataset not suitable for SSA: top-{suitability.top_k} ratio "
        f"{suitability.top_k_ratio:.3f} < threshold {suitability.threshold:.3f}"
    )
grouping, dw_ok = api.suggest_grouping()
api.set_grouping(grouping=grouping)
eigen_fig = api.get_eigen_plot()
reconstruction_fig = api.get_reconstruction_plot()
change_point_fig = api.get_change_point_plot()
loess_fig = api.get_loess_plot(fraction=0.10)
noise_kde_fig = api.get_noise_kde_plot(bandwidth="silverman")
recon_meta = api.get_reconstruction_metadata()
# -------------------------
# Step 3: Observation Logging
# -------------------------
variance_fig = api.get_variance_explained_plot()
noise_variance_fig = api.get_noise_variance_plot()
observation_text = api.generate_observation_text()
components_df = api.export_components_dataframe()
print("Window:", window_before, "->", window_after)
print("DW in suggested grouping:", dw_ok)
print("Reconstruction metadata:", recon_meta)
print("Observation preview:\n", observation_text[:500])
print("Components head:\n", components_df.head())

from pathlib import Path
from tseda.notebook_api import NotebookThreeStepAPI, load_series_from_csv
# Replace with your own CSV path.
csv_path = Path("/absolute/path/to/your_time_series.csv")
# Default expects: column 0 = timestamp, column 1 = numeric value.
# If your schema is different, pass timestamp_col/value_col as names or indices.
series = load_series_from_csv(csv_path, timestamp_col=0, value_col=1)
api = NotebookThreeStepAPI(series)
# Optional: inspect and adjust decomposition control points.
print("Initial window:", api.get_window())
api.set_window(api.get_window(), apply_window_refinement=True)
grouping, dw_ok = api.suggest_grouping()
api.set_grouping(grouping=grouping)
# Generate core artifacts.
kde_fig = api.get_kde_plot(show_kde=True, bin_algorithm="scott")
reconstruction_fig = api.get_reconstruction_plot()
variance_fig = api.get_variance_explained_plot()
observation_text = api.generate_observation_text()
components_df = api.export_components_dataframe()
print("DW status:", dw_ok)
print(observation_text[:400])
print(components_df.head())

Python 3.12 or higher is required to run this package.
Before starting the installation, verify your Python version:
python --version

Ensure the output shows Python 3.12 or higher. If not, please upgrade Python before proceeding.
Conda is the recommended package manager for development and installation (development was done with conda):
conda create -n tseda python=3.12
conda activate tseda
pip install tseda

Then run the app:
tseda

If you just want to run the app with minimal setup:
- Install with `pipx`: `pipx install tseda`
- Launch the app: `tseda`
- Open your browser at `http://127.0.0.1:8050`.
If pipx is not available, use the standard Python install instructions below.
Verify you have Python 3.12 or higher installed:
python --version

Create and activate a virtual environment, then install from PyPI:
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install tseda

Then run the app:
tseda

You can also launch with Python module execution:
python -m tseda

Note: `python tseda` is not a valid way to run an installed package because Python treats `tseda` as a local script path.
By default, the app starts at http://127.0.0.1:8050.
Optional runtime overrides:
TSEDA_HOST=0.0.0.0 TSEDA_PORT=8050 TSEDA_DEBUG=false tseda

- Click "Drag and Drop or Select Files" in the Initial Assessment panel.
- Your file must be a CSV or Excel file with at least two columns: a timestamp column (first) and a numeric value column (second).
- The data must be regularly sampled at hourly or lower frequency (e.g., hourly, daily, monthly).
- The dataset must contain no missing values (NA / NaN). Clean your data before uploading.
- Files are limited to 2,000 rows (configurable via `file_upload.max_file_lines` in `src/tseda/config/tseda_config.yaml`).
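Before uploading, it can help to check a file against these requirements yourself. The following pandas sketch is a hypothetical pre-flight check, not part of tseda: the function `check_series` and its messages are invented for illustration, regularity is tested with `pandas.infer_freq`, and the "hourly or lower frequency" rule is not checked.

```python
import pandas as pd

def check_series(df, max_rows=2000):
    """Return a list of problems; an empty list means these checks passed."""
    if df.shape[1] < 2:
        return ["need a timestamp column and a numeric value column"]
    problems = []
    # Unparseable timestamps and non-numeric values become NaT / NaN.
    stamps = pd.to_datetime(df.iloc[:, 0], errors="coerce")
    values = pd.to_numeric(df.iloc[:, 1], errors="coerce")
    if len(df) > max_rows:
        problems.append(f"more than {max_rows} rows")
    if stamps.isna().any() or values.isna().any():
        problems.append("missing or non-numeric entries")
    elif len(stamps) >= 3 and pd.infer_freq(pd.DatetimeIndex(stamps)) is None:
        # infer_freq returns None when no regular frequency fits.
        problems.append("timestamps are not regularly sampled")
    return problems

good = pd.DataFrame({
    "ts": pd.date_range("2024-01-01", periods=10, freq="D"),
    "value": range(10),
})
gappy = good.drop(index=4)  # one day removed, so spacing is irregular
print(check_series(good))
print(check_series(gappy))
```

The row limit defaults to 2,000 here only to mirror the documented default; if you change `file_upload.max_file_lines`, pass the same value as `max_rows`.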
Example datasets are available directly in the repository under data. They are intentionally not bundled inside wheel/sdist package builds to keep distribution artifacts lean.
Hyndman-based example files:
- data/hyndman_arrivals_quarterly_japan.csv
- data/hyndman_goog_daily_close.csv
- data/hyndman_hyndsight_daily_pageviews.csv
- data/hyndman_sunspot_monthly_area.csv
- data/hyndman_usconsumption_quarterly_consumption.csv
Additional example files:
- data/coffee_prices.csv
- data/monthly-car-sales.csv
- data/trimmed_biomass - generated_biomass_MW_series.csv
- data/uci_air_quality_hourly_co.csv
- data/ticket_resolution_hourly_nyc311.csv
- data/white_noise_data.csv — negative example; expected to fail the dataset suitability check
If you install from source (clone the repo), these files are available immediately. If you install from PyPI/package artifacts, download the examples from the repository paths above.
| Step | Panel | What to do |
|---|---|---|
| 1 | Initial Assessment of Time Series | Review distribution plots (KDE, box plot) and the ACF / PACF for autocorrelation patterns. |
| 2 | Time Series Decomposition | Review the suggested grouping table, adjust the prepopulated Trend, Seasonality, and Noise inputs if needed, then click Apply Grouping. When Durbin-Watson is in range [1.5, 2.5], the Export Components button is enabled to download Trend/Seasonality/Noise as CSV. |
| 3 | Observation Logging | Review the AIC rank diagnostics, read the auto-generated summary, and add your own observations before saving the report. |
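For readers unfamiliar with the Durbin-Watson statistic used in step 2, the sketch below computes it with NumPy on two synthetic residual series. The [1.5, 2.5] range comes from the table above; the function name and the test data are assumptions for illustration.

```python
import numpy as np

def durbin_watson(residuals):
    """Sum of squared successive differences over the sum of squares."""
    e = np.asarray(residuals, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

rng = np.random.default_rng(1)
white = rng.standard_normal(500)  # uncorrelated residuals
drifting = np.cumsum(white)       # strongly autocorrelated residuals

dw_white = durbin_watson(white)
dw_drifting = durbin_watson(drifting)
print("white:", round(dw_white, 2), "drifting:", round(dw_drifting, 3))
```

Values near 2 indicate little first-order autocorrelation in the noise component, which suggests the grouping has left no obvious structure behind. Values well below 1.5 suggest trend or seasonality has leaked into the noise.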
If you are developing locally from source:
pip install -e .
tseda

- Build source and wheel distributions: `uv build`
- Validate distributions before upload: `uvx twine check dist/*`

To build the documentation locally:
pip install -r docs/requirements.txt
sphinx-build -b html docs/source docs/_build/html

You can also use the Makefile:
make -C docs html

The generated site will be available in docs/_build/html.
This repository includes .readthedocs.yaml configured to build docs from docs/source/conf.py.
- Push the repository to GitHub (or another supported provider).
- Sign in to Read the Docs and import the project.
- In Read the Docs project settings:
  - Set the default branch.
  - Confirm the config file path is `.readthedocs.yaml`.
- Trigger a build from the Read the Docs dashboard.
- Optionally enable a custom domain and versioned docs.
If the build fails, inspect the Read the Docs build logs and replicate locally using:
make -C docs html

A detailed user guide is available at docs/user_guide.md. A video version of the user guide is also available on YouTube. The written guide covers:
- Data requirements and input format
- Step-by-step walkthrough of all three workflow phases
- Interpreting SSA decomposition outputs (eigenvalue profile, component groupings, Durbin-Watson test)
- Change point detection (trend shifts and seasonal amplitude shifts — see Change Point Detection)
- AIC-based model order selection
- Exporting reports and knowledge base entries
- Configuration guide
If you'd like to request a feature or report an issue, please open an issue on GitHub. You're also welcome to reach out to me directly.