allumios/os_digital_twin
An Open-Source Digital Twin for Structural Dynamics

A Python-based digital twin pipeline that identifies the per-storey stiffness of a three-storey aluminium shear frame from shaking table measurements, using Transitional Markov Chain Monte Carlo (TMCMC) Bayesian model updating.

Author: Osman Mukuk
Supervisor: Dr Marco De Angelis
Institution: University of Strathclyde, BEng Civil Engineering, 2025-26

This repository contains the code and data for an undergraduate dissertation that develops, calibrates and validates a digital twin for a laboratory shear frame tested at the University of Strathclyde. The pipeline is open-source, reproducible and grounded in real experimental data. Every stage, from the raw acceleration records to the final posterior stiffness distributions, is documented and can be regenerated by running the scripts in this repository.

What the pipeline does

The pipeline processes acceleration records from two independent test sessions on a Centrotecnica shaking table. Each session contains six types of dynamic excitation: free vibration, impact, three harmonic tests (one per mode), and earthquake-type broadband excitation. The natural frequencies of the structure are identified from the free vibration tails at the end of each test. These frequencies, together with their empirical measurement uncertainty, are used to update the three storey stiffness parameters of a shear-building forward model through Bayesian inference.

The pipeline is organised into five modules. Module 1 performs system identification, extracting natural frequencies, mode shapes and damping ratios from the acceleration records. Module 2 builds the forward model and computes an initial stiffness estimate from the measured geometry. Module 3 runs the TMCMC sampler to obtain posterior distributions for the three per-storey stiffnesses. Module 4 validates the calibrated model against independent measurements from Session 2. Module 5 propagates the geometric measurement tolerances through the forward model separately from the Bayesian posterior, providing a combined uncertainty budget.
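The heart of Module 2 can be sketched as a 3-DOF shear-building eigenvalue problem. The following is a minimal re-implementation assuming the standard tridiagonal shear-frame stiffness matrix; the function name and the example stiffness and mass values are illustrative, not the project's measured geometry (see forward_model.py for the actual implementation):

```python
import numpy as np

def shear_frame_frequencies(k, m):
    """Natural frequencies (Hz) of a shear building, bottom to top.

    k: per-storey stiffnesses [k1, ..., kn] (N/m), m: floor masses (kg).
    Assembles the standard tridiagonal shear-frame stiffness matrix and
    solves the generalized eigenproblem K phi = w^2 M phi via a symmetric
    mass-normalised transformation (M is diagonal).
    """
    k = np.asarray(k, float)
    m = np.asarray(m, float)
    n = len(k)
    K = np.zeros((n, n))
    for i in range(n):
        K[i, i] = k[i] + (k[i + 1] if i + 1 < n else 0.0)
        if i + 1 < n:
            K[i, i + 1] = K[i + 1, i] = -k[i + 1]
    A = K / np.sqrt(np.outer(m, m))      # M^{-1/2} K M^{-1/2}, symmetric
    w2 = np.linalg.eigvalsh(A)           # squared circular frequencies
    return np.sqrt(w2) / (2.0 * np.pi)   # Hz, ascending order
```

For example, `shear_frame_frequencies([5.2e4, 5.7e4, 6.7e4], [3.0, 3.0, 3.0])` returns the three modal frequencies in ascending order; updating the three entries of `k` is exactly what the TMCMC sampler in Module 3 does at each proposal.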

Repository structure

dt/
├── README.md                   # This file
├── requirements.txt            # Python dependencies
├── .gitignore                  # Files excluded from version control
│
├── config.py                   # All structural, experimental and BMU parameters
├── forward_model.py            # 3-DOF eigenvalue solver and stiffness matrix
├── signal_processing.py        # Data loading, FFT, PSD, damping, mode shapes
├── bayesian_updating.py        # TMCMC sampler and posterior diagnostics
├── uncertainty_analysis.py     # Geometric tolerance propagation, sensitivity
├── run_digital_twin.py         # Main pipeline (runs all five modules in order)
├── generate_plots.py           # Detailed amplitude and PSD plots per test
├── convert_data.py             # Converts Centrotecnica xlsx to compressed npz
│
├── data/                       # Experimental data
│   ├── session_1.npz           # Session 1 (calibration), 6 tests, 13 MB
│   └── session_2.npz           # Session 2 (validation), 6 tests, 12 MB
│
├── figures/                    # Generated by run_digital_twin.py (16 plots)
└── output/                     # Generated by run_digital_twin.py and generate_plots.py
    ├── accelerograms/          # Raw acceleration time histories (12 plots)
    ├── free_vib_windows/       # Free vibration tail windows (8 plots)
    ├── amplitude/              # Individual amplitude spectra (36 plots)
    └── psd/                    # Individual PSD plots (36 plots)

The figures/ and output/ folders are not tracked by git. They are regenerated every time the pipeline is run.

Installation

The pipeline requires Python 3.9 or newer. To install the dependencies:

pip install -r requirements.txt

The dependencies are numpy, scipy, matplotlib and openpyxl. openpyxl is needed only if you want to regenerate the npz files from the original xlsx data using convert_data.py.

Running the pipeline

The main pipeline runs in one command:

python run_digital_twin.py

This takes approximately one minute on a typical laptop. It produces console output reporting the identified frequencies, the TMCMC stages, the posterior means and standard deviations, the validation results, and the uncertainty budget. It also writes 16 figures to the figures/ folder and 20 plots to output/free_vib_windows/ and output/accelerograms/.

After the main pipeline finishes, you can optionally generate the full set of detailed per-test amplitude and PSD plots:

python generate_plots.py

This adds 72 plots to output/amplitude/ and output/psd/ (one plot per floor per test per spectrum type). It takes approximately three minutes because it processes each test individually at high resolution.

Data format

The experimental data is stored in two compressed NumPy archive files:

data/session_1.npz    Session 1 (calibration), 6 dynamic tests
data/session_2.npz    Session 2 (validation), 6 dynamic tests

Each archive contains six arrays, one per test. Each array has shape (N, 4) where column 0 is the time axis (seconds) and columns 1 to 3 are the accelerations at Floor 1, Floor 2 and Floor 3 (arbitrary units, as output by the DAQ). The sampling rate is 2048 Hz. Record lengths vary between tests: impact tests are 15-20 seconds, harmonic and earthquake tests are 40-50 seconds.
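A loader for this format might look like the following sketch (the pipeline's own loading code lives in signal_processing.py; the function name here is hypothetical):

```python
import numpy as np

def load_session(path):
    """Load a session archive into {test_name: (t, accels)} pairs.

    Each stored array has shape (N, 4): column 0 is the time axis (s),
    columns 1-3 are the Floor 1-3 accelerations. Works with a file path
    or any file-like object accepted by np.load.
    """
    with np.load(path) as archive:
        return {name: (archive[name][:, 0], archive[name][:, 1:4])
                for name in archive.files}
```

The array names inside each archive are the per-test sheet names listed in config.py, so iterating over `archive.files` recovers all six tests without hard-coding them.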

The original raw data was supplied as Centrotecnica xlsx files (82 MB total). These were converted to npz format (25 MB total) for efficient loading and storage. If you have the original xlsx files, you can regenerate the npz archives by placing the xlsx files in data/ and running:

python convert_data.py

The conversion preserves full float64 precision.

Configuration

All parameters are defined in config.py. This is the only file you should need to edit if you want to adapt the pipeline to a different structure or dataset. The main sections are:

Structural properties — floor masses, Young's modulus, column dimensions, number of columns per storey, storey heights. The default values correspond to the three-storey EN AW-6082-T6 aluminium frame used in this project.

Measurement tolerances — the resolutions of the instruments used to measure the column depth, column width and storey heights. These propagate through Module 5 to give the geometric contribution to the total uncertainty. Young's modulus and floor mass are treated as fixed values, following the approach used by Bonney et al. (2022) for a comparable laboratory structure.

Data files — paths to the two session archives and the sheet names for each of the six tests per session.

TMCMC settings — prior bounds on the stiffness parameters (uniform distribution between K_LO and K_HI), the number of samples per stage (NSAMPLES), the proposal scaling factor (TMCMC_BETA) and the random seed (SEED). The seed ensures that every run produces identical results.
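Put together, the TMCMC section of config.py has roughly this shape. The parameter names come from the description above, but the values here are placeholders, not the project's actual defaults:

```python
# Hypothetical config.py fragment mirroring the TMCMC settings described
# above; the names match the text, the values are illustrative only.
K_LO, K_HI = 2.0e4, 1.0e5   # uniform prior bounds on each stiffness (N/m)
NSAMPLES = 1000             # TMCMC samples per stage
TMCMC_BETA = 0.2            # proposal scaling factor
SEED = 42                   # fixed random seed -> identical results per run
```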

Methodology summary

Frequency identification. Natural frequencies are extracted from the free vibration tail at the end of each dynamic test. The excitation end point is detected by computing the sliding RMS of the Floor 3 acceleration in 0.5-second windows and identifying the last window where the RMS exceeds 25% of its peak value. A 0.5-second buffer is added after this point, and everything that follows is taken as the free vibration tail. The FFT of this tail is computed using the full available length with a Hanning window, and the three strongest peaks above 5 Hz are identified as the natural frequencies.
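The tail-detection rule can be sketched as follows. This is a hypothetical re-implementation of the steps just described (sliding RMS, 25% threshold, 0.5-second buffer); the project's own version in signal_processing.py may differ in detail:

```python
import numpy as np

FS = 2048          # sampling rate (Hz), per the data format section
WIN = 0.5          # sliding-RMS window length (s)
THRESH = 0.25      # fraction of peak RMS marking "excitation active"
BUFFER = 0.5       # settling buffer after the detected end of excitation (s)

def free_vibration_tail(accel, fs=FS):
    """Return the free-vibration tail of a Floor 3 acceleration record.

    Splits the record into non-overlapping 0.5 s windows, finds the last
    window whose RMS exceeds 25% of the peak window RMS, skips a 0.5 s
    buffer, and returns everything after that point.
    """
    n_win = int(WIN * fs)
    n_blocks = len(accel) // n_win
    blocks = np.asarray(accel)[: n_blocks * n_win].reshape(n_blocks, n_win)
    rms = np.sqrt(np.mean(blocks ** 2, axis=1))
    last_active = np.flatnonzero(rms > THRESH * rms.max())[-1]
    end = (last_active + 1) * n_win + int(BUFFER * fs)
    return np.asarray(accel)[end:]
```

The returned tail is what gets Hanning-windowed and FFT'd for peak picking.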

Tail quality filter. A tail is considered clean if it is at least 5 seconds long (to provide FFT frequency resolution of 0.2 Hz or better) and its RMS amplitude is less than 20% of the mid-record RMS (confirming the excitation has effectively stopped). In the default dataset, 5 out of 6 Session 1 tests and 3 out of 6 Session 2 tests satisfy both conditions. The excluded tests are retained in the raw data but not used for calibration or validation.
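The two conditions translate directly into a small predicate. A sketch, with the thresholds taken from the text and the function name assumed:

```python
import numpy as np

def is_clean_tail(tail, mid_record, fs=2048, min_len_s=5.0, rms_frac=0.20):
    """Tail quality filter: long enough for 0.2 Hz FFT resolution, and
    quiet relative to the mid-record response (excitation has stopped)."""
    rms = lambda x: float(np.sqrt(np.mean(np.square(x))))
    long_enough = len(tail) >= min_len_s * fs
    quiet = rms(tail) < rms_frac * rms(mid_record)
    return long_enough and quiet
```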

Calibration target. The mean and standard deviation of the frequencies identified from the 5 clean Session 1 tails serve as the calibration target and the empirical measurement uncertainty entering the Bayesian likelihood.

Likelihood function. The likelihood is a frequency-only Gaussian on the residuals between predicted and measured frequencies, using the empirical standard deviations as the noise scale. Mode shapes are computed as a post-hoc diagnostic but are not included in the likelihood, because the three-sensor setup does not produce mode shapes of sufficient quality for two of the three modes.
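As an equation, the log-likelihood is (up to an additive constant) minus half the sum of squared standardised residuals between model and measured frequencies. A self-contained sketch, with a compact 3-DOF eigen-solver standing in for the project's forward model (names and example values assumed):

```python
import numpy as np

def model_frequencies(k, m):
    """3-DOF shear-frame natural frequencies (Hz) for trial stiffnesses k."""
    k1, k2, k3 = k
    K = np.array([[k1 + k2, -k2,     0.0],
                  [-k2,     k2 + k3, -k3],
                  [0.0,     -k3,     k3]])
    A = K / np.sqrt(np.outer(m, m))              # symmetric mass normalisation
    return np.sqrt(np.linalg.eigvalsh(A)) / (2.0 * np.pi)

def log_likelihood(k, f_meas, f_std, m):
    """Frequency-only Gaussian log-likelihood on the modal residuals,
    up to an additive constant. f_meas and f_std are the means and
    empirical standard deviations from the clean Session 1 tails."""
    resid = (model_frequencies(k, m) - np.asarray(f_meas)) / np.asarray(f_std)
    return -0.5 * float(np.sum(resid ** 2))
```

Note that mode shapes never enter this function, matching the frequency-only formulation described above.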

Validation. The calibrated posterior is validated against the 3 clean Session 2 tests, which were not used during calibration. Each test is classified as pass or fail for each mode based on whether the identified frequency falls within the 95% posterior credible interval.
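The per-mode pass/fail check reduces to an interval test on posterior-predicted frequencies. A sketch under the assumption that `f_posterior` holds the frequencies predicted from the posterior stiffness samples (function name hypothetical):

```python
import numpy as np

def passes_validation(f_identified, f_posterior, level=0.95):
    """True if the identified Session 2 frequency lies inside the central
    credible interval of the posterior-predicted frequencies."""
    alpha = 100.0 * (1.0 - level) / 2.0
    lo, hi = np.percentile(f_posterior, [alpha, 100.0 - alpha])
    return bool(lo <= f_identified <= hi)
```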

Uncertainty budget. The Bayesian posterior standard deviations capture the measurement variability. The geometric measurement tolerances are propagated separately through the forward model using Monte Carlo simulation, giving a second uncertainty component. The two components are combined via the root sum of squares to give the total stiffness uncertainty.
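The two-component combination can be sketched as follows. The storey-stiffness formula (n_cols fixed-fixed columns, k = n_cols · 12EI/h³ with I = bd³/12) and the uniform-within-tolerance noise model are illustrative assumptions; the actual tolerances and geometry live in config.py:

```python
import numpy as np

def geometric_sigma(E, b0, d0, h0, tol_b, tol_d, tol_h, n_cols,
                    n=20000, seed=0):
    """Monte Carlo propagation of geometric measurement tolerances through
    a simple storey-stiffness model k = n_cols * 12 E I / h^3, I = b d^3/12.
    Dimensions are perturbed uniformly within +/- their tolerance."""
    rng = np.random.default_rng(seed)
    b = b0 + rng.uniform(-tol_b, tol_b, n)   # column width (m)
    d = d0 + rng.uniform(-tol_d, tol_d, n)   # column depth (m)
    h = h0 + rng.uniform(-tol_h, tol_h, n)   # storey height (m)
    k = n_cols * 12.0 * E * (b * d ** 3 / 12.0) / h ** 3
    return float(k.std())

def total_uncertainty(sigma_posterior, sigma_geometric):
    """Root-sum-of-squares combination of the two uncertainty components."""
    return float(np.hypot(sigma_posterior, sigma_geometric))
```

The RSS combination is justified because the Bayesian posterior spread (measurement variability) and the geometric tolerance spread are treated as independent.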

Results

Running run_digital_twin.py with the default configuration produces the following posterior stiffness distributions:

| Parameter          | Mean (N/m) | Standard deviation (N/m) | Relative uncertainty |
|--------------------|------------|--------------------------|----------------------|
| k1 (bottom storey) | 52,508     | 1,428                    | 2.7%                 |
| k2 (middle storey) | 56,956     | 2,074                    | 3.6%                 |
| k3 (top storey)    | 66,778     | 2,501                    | 3.7%                 |

The posterior mean reproduces the Session 1 calibration frequencies (7.203, 20.961, 30.435 Hz) to within 0.25% for all three modes. The Session 2 validation reveals a systematic upward shift of 0.1-0.2 Hz in the measured frequencies, attributable to reassembly of the frame between sessions.

Because TMCMC uses a fixed random seed, these values are fully reproducible. Every run of the pipeline on the same data produces identical output.

Adapting the pipeline

The pipeline is designed to be adapted to other three-storey shear frames or, with more effort, to structures with different numbers of degrees of freedom. To use a different structure:

  1. Update the structural properties and measurement tolerances in config.py.
  2. Replace the data files in data/ with your own, using the same format (npz archive with one array per test, each of shape (N, 4) containing time and three floor accelerations).
  3. Update SESSION_1_FILE, SESSION_2_FILE, S1_SHEETS and S2_SHEETS in config.py to point to your new files and match your test names.

For structures with a different number of storeys, the forward model and BMU modules would need to be generalised. This is not supported by the current code but is a natural extension.

References

Ching, J., & Chen, Y.-C. (2007). Transitional Markov Chain Monte Carlo method for Bayesian model updating, model class selection, and model averaging. Journal of Engineering Mechanics, 133(7), 816-832.

Bonney, M. S., de Angelis, M., Dal Borgo, M., Andrade, L., Beregi, S., Jamia, N., & Wagg, D. J. (2022). Development of a digital twin operational platform using Python Flask. Data-Centric Engineering, 3, e1.

Chopra, A. K. (2012). Dynamics of Structures: Theory and Applications to Earthquake Engineering (4th ed.). Pearson.

Lye, A., Cicirello, A., & Patelli, E. (2021). Sampling methods for solving Bayesian model updating problems: A tutorial. Mechanical Systems and Signal Processing, 159, 107760.

License and use

This code is released as open source to accompany the dissertation "An Open-Source Digital Twin for Structural Dynamics" (University of Strathclyde, 2026). It is free to use, modify and redistribute for academic and educational purposes. If you use the code or data in your own work, please cite the dissertation and contact the author or supervisor for any commercial or derivative applications.
