# Updated Realistic Synthetic Waveguide Dataset (Physics-Only)

This synthetic dataset is designed for training physics-inspired neural networks in the domain of optical waveguide characterization. It consists of 50,000 samples, each with 15 input features capturing physical, material, and geometrical attributes, and 14 output targets describing optical performance.

Use this dataset to train data-driven models that predict waveguide performance from basic design parameters in integrated photonics applications.

## 📂 Input Parameter List (15 Features)

| Name                      | Description                                                                  | Range / Format                                                                                          |
|---------------------------|------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------|
| `core_index`              | Complex refractive index of the waveguide core.                              | Real: 1.48–1.52 (single-mode) or 1.50–1.52 (multimode); Imaginary: -1e-8 to -1e-7; Stored as `"x+yj"`.  |
| `clad_index`              | Complex refractive index of the cladding.                                    | Real: ~1.44 up to just below core index; Imaginary: -1e-8 to -1e-7; Stored as `"x+yj"`.                 |
| `core_radius_m`           | Core radius \(a\).                                                           | Single-mode: 0.5–2 µm; Multimode: 2–10 µm.                                                              |
| `clad_radius_m`           | Cladding radius \(b\).                                                       | 20–50 µm.                                                                                               |
| `length_m`                | Waveguide length \(L\).                                                      | 0.001–0.5 m.                                                                                            |
| `wavelength_m`            | Operating wavelength \(\lambda\).                                            | 500×10⁻⁹–1.6×10⁻⁶ m.                                                                                    |
| `polarization`            | Input polarization (0 = pure TE, 1 = pure TM).                               | 0–1.                                                                                                    |
| `alpha_core`              | Core intrinsic loss coefficient \(\alpha_{\rm core}\).                       | 1e-4–1e-3 m⁻¹.                                                                                          |
| `alpha_clad`              | Cladding intrinsic loss coefficient \(\alpha_{\rm clad}\).                   | 1e-4–1e-3 m⁻¹.                                                                                          |
| `photoelastic_coeff`      | Photoelastic coefficient \(p\) of the core material.                         | 0.20–0.25.                                                                                              |
| `delta_rho_over_rho`      | Density variation ratio \(\Delta\rho/\rho\).                                 | 1e-12–1e-11.                                                                                            |
| `sigma_rms_m`             | RMS surface roughness \(\sigma_{\rm rms}\) at the core–cladding interface.   | 1–10 nm.                                                                                                |
| `roughness_corr_length_m` | Correlation length \(L_{\rm corr}\) of interface roughness.                  | 100 nm–1 µm.                                                                                            |
| `w_in_m`                  | Input beam waist \(w_{\rm in}\).                                             | 1–5 µm.                                                                                                 |
| `input_power`             | Input optical power \(P_{\rm in}\).                                          | 1e-3–1e-2 W.                                                                                            |


## 🌟 Output Parameter List (14 Targets)

| Name                       | Description |
|----------------------------|-------------|
| `propagation_loss_dB`      | Propagation loss (dB): \(P_{\rm out} = P_{\rm in}\,e^{-\alpha_{\rm total} L}\), \(\mathcal{L}_{\rm prop} = 10\log_{10}(P_{\rm in}/P_{\rm out})\) |
| `insertion_loss_dB`        | Insertion (coupling) loss (dB) via Gaussian overlap. |
| `coupling_loss_dB`         | Same as insertion loss. |
| `mode_field_diameter_m`    | Mode field diameter (MFD): \(w = a\,(0.65 + 1.619\,V^{-1.5} + 2.879\,V^{-6})\), \(\mathrm{MFD} = 2w\) |
| `mode_confinement_factor`  | Fraction of power in core: \(\Gamma = u^2/V^2\) |
| `single_mode`              | Flag 'Y' if single-mode (\(V < 2.405\)), else 'N'. |
| `multi_mode`               | Flag 'Y' if multimode (\(V \ge 2.405\)), else 'N'. |
| `scattering_loss_dB`       | Scattering loss (dB): \(\alpha_{\rm scatt,bulk} = \frac{8\pi^3}{3\lambda^4}\,p^2\,(\Delta\rho/\rho)^2\,\Gamma\), \(\alpha_{\rm scatt,surf} = \frac{4\pi^3}{\lambda^2}\,\sigma_{\rm rms}^2\,L_{\rm corr}\), loss \(= 4.343\,\alpha_{\rm scatt,total}\,L\) |
| `effective_index`          | Effective index: \(n_{\rm eff} = \sqrt{n_{\rm clad}^2 + (u^2/V^2)\,(n_{\rm core}^2 - n_{\rm clad}^2)}\) |
| `cross_coupling`           | Cross-coupling: \(0\) if \(V < 2.405\); \(0.5\,(V - 2.405)/V\) if \(V \ge 2.405\) |
| `TE_percent`, `TM_percent` | Mode polarization fractions (%). |
| `V_parameter`              | Normalized frequency: \(V = (2\pi a/\lambda)\,\sqrt{n_{\mathrm{core,real}}^2 - n_{\mathrm{clad,real}}^2}\) |
| `output_power`             | Output power: \(P_{\rm out} = P_{\rm in}\exp(-\alpha_{\rm total} L)\) |
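
The regime flags and the cross-coupling target in the table depend only on \(V\), so they are easy to reproduce. The following is a minimal sketch; the function names `mode_flags` and `cross_coupling` are hypothetical and are not taken from the dataset's generation code.

```python
V_CUTOFF = 2.405  # single-mode cutoff used throughout this document


def mode_flags(V):
    """Return (single_mode, multi_mode) as 'Y'/'N' flags."""
    return ("Y", "N") if V < V_CUTOFF else ("N", "Y")


def cross_coupling(V):
    """Piecewise cross-coupling: 0 below cutoff, 0.5 * (V - 2.405) / V at or above it."""
    if V < V_CUTOFF:
        return 0.0
    return 0.5 * (V - V_CUTOFF) / V


for V in (1.7, 2.405, 4.81):
    print(V, mode_flags(V), cross_coupling(V))
```

Under this definition the cross-coupling value rises from 0 at the cutoff toward 0.5 as \(V\) grows.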


## 🧮 Key Physics Equations

**Normalized Frequency**  
$$V = \frac{2\pi\,a}{\lambda}\sqrt{n_{\mathrm{core,real}}^2 - n_{\mathrm{clad,real}}^2}$$
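
As an illustration of the normalized-frequency formula, the sketch below computes \(V\) from the stored columns. It assumes the index strings parse with Python's built-in `complex()`; the function name `v_parameter` and the example values are hypothetical, chosen from inside the documented input ranges.

```python
import math


def v_parameter(core_radius_m, wavelength_m, core_index, clad_index):
    """Normalized frequency V, using only the real parts of the indices."""
    n_core = complex(core_index).real  # accepts "x+yj" strings or numbers
    n_clad = complex(clad_index).real
    if n_core <= n_clad:
        raise ValueError("core index must exceed cladding index")
    numerical_aperture = math.sqrt(n_core**2 - n_clad**2)
    return (2 * math.pi * core_radius_m / wavelength_m) * numerical_aperture


# Illustrative values only, not a row from the dataset.
V = v_parameter(1.0e-6, 1.55e-6, "1.50-5e-08j", "1.44-5e-08j")
print(f"V = {V:.4f}, single-mode: {V < 2.405}")
```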

**Mode Field Radius (w)**  
$$w = a\Bigl(0.65 + 1.619\,V^{-1.5} + 2.879\,V^{-6}\Bigr)$$

**Mode Field Diameter (MFD)**  
$$\mathrm{MFD} = 2\,w$$
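
The mode field radius and diameter follow directly from \(a\) and \(V\). A minimal sketch, with the hypothetical helper name `mode_field_diameter`:

```python
def mode_field_diameter(core_radius_m, V):
    """MFD = 2w, with w = a * (0.65 + 1.619 V^-1.5 + 2.879 V^-6)."""
    if V <= 0:
        raise ValueError("V must be positive")
    w = core_radius_m * (0.65 + 1.619 * V**-1.5 + 2.879 * V**-6)
    return 2.0 * w


# Illustrative values only: a = 1 µm, V = 2.0.
mfd = mode_field_diameter(1.0e-6, 2.0)
print(f"MFD = {mfd * 1e6:.3f} µm")
```

Because of the \(V^{-6}\) term, the computed radius grows very quickly as \(V\) approaches zero, so small-\(V\) samples can yield mode fields far larger than the core.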

**Eigenvalue Parameter (u)**  
$$u = \begin{cases}0.9\,V,&V<2.405,\\V-0.5,&V\ge2.405,\end{cases}$$

**Mode Confinement Factor (Γ)**  
$$\Gamma = \frac{u^2}{V^2}$$
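
The piecewise eigenvalue parameter and the confinement factor derived from it can be sketched as follows. The function names are hypothetical.

```python
V_CUTOFF = 2.405


def eigenvalue_u(V):
    """Piecewise approximation: u = 0.9 V below cutoff, u = V - 0.5 otherwise."""
    return 0.9 * V if V < V_CUTOFF else V - 0.5


def confinement_factor(V):
    """Gamma = u^2 / V^2."""
    u = eigenvalue_u(V)
    return (u / V) ** 2


for V in (1.0, 2.0, 2.405, 5.0, 10.0):
    print(f"V = {V:6.3f}  u = {eigenvalue_u(V):6.3f}  Gamma = {confinement_factor(V):.4f}")
```

One consequence of this definition is that \(\Gamma\) is constant at \(0.9^2 = 0.81\) for every single-mode sample, and it drops to about 0.627 at \(V = 2.405\) before rising again toward 1.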

**Effective Attenuation (α_eff)**  
$$\alpha_{\rm eff} = \alpha_{\rm core}\,\Gamma + \alpha_{\rm clad}\,(1-\Gamma)$$

**Bulk Scattering (α_scatt,bulk)**  
$$\alpha_{\rm scatt,bulk} = \frac{8\pi^3}{3\lambda^4}p^2\Bigl(\tfrac{\Delta\rho}{\rho}\Bigr)^2\Gamma$$

**Surface Scattering (α_scatt,surf)**  
$$\alpha_{\rm scatt,surf} = \frac{4\pi^3}{\lambda^2}\sigma_{\rm rms}^2\,L_{\rm corr}$$
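
The two scattering coefficients can be evaluated exactly as written above. The sketch below also converts them to `scattering_loss_dB`, under the assumption that \(\alpha_{\rm scatt,total}\) in the output table is the sum of the bulk and surface terms; the document does not state this explicitly. Function names are hypothetical.

```python
import math


def alpha_scatt_bulk(wavelength_m, photoelastic_coeff, delta_rho_over_rho, gamma):
    """Bulk scattering coefficient as defined in this document."""
    prefactor = 8 * math.pi**3 / (3 * wavelength_m**4)
    return prefactor * photoelastic_coeff**2 * delta_rho_over_rho**2 * gamma


def alpha_scatt_surf(wavelength_m, sigma_rms_m, corr_length_m):
    """Surface scattering coefficient as defined in this document."""
    return (4 * math.pi**3 / wavelength_m**2) * sigma_rms_m**2 * corr_length_m


def scattering_loss_dB(alpha_bulk, alpha_surf, length_m):
    # ASSUMPTION: alpha_scatt,total = alpha_scatt,bulk + alpha_scatt,surf.
    return 4.343 * (alpha_bulk + alpha_surf) * length_m


# Illustrative values only, taken from inside the documented input ranges.
bulk = alpha_scatt_bulk(1.55e-6, 0.22, 5e-12, 0.81)
surf = alpha_scatt_surf(1.55e-6, 5e-9, 5e-7)
print(bulk, surf, scattering_loss_dB(bulk, surf, 0.1))
```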

**Total Attenuation (α_total)**  
$$\alpha_{\rm total} = \alpha_{\rm eff} + \alpha_{\rm scatt,bulk} + \alpha_{\rm scatt,surf}$$

**Output Power**  
$$P_{\rm out} = P_{\rm in} \exp\bigl(-\alpha_{\rm total} L\bigr)$$

**Propagation Loss**  
$$\mathcal{L}_{\rm prop} = 10 \log_{10}\frac{P_{\rm in}}{P_{\rm out}}$$
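
The last four equations chain together: effective attenuation, total attenuation, output power, and propagation loss. The sketch below takes the scattering coefficients as precomputed inputs. The function name `attenuation_chain` and the example values are hypothetical.

```python
import math


def attenuation_chain(alpha_core, alpha_clad, gamma,
                      alpha_scatt_bulk, alpha_scatt_surf,
                      length_m, input_power):
    """Return (alpha_total, output_power, propagation_loss_dB)."""
    # Power-weighted mix of core and cladding intrinsic loss.
    alpha_eff = alpha_core * gamma + alpha_clad * (1.0 - gamma)
    alpha_total = alpha_eff + alpha_scatt_bulk + alpha_scatt_surf
    # Exponential decay of power along the waveguide.
    output_power = input_power * math.exp(-alpha_total * length_m)
    propagation_loss_dB = 10.0 * math.log10(input_power / output_power)
    return alpha_total, output_power, propagation_loss_dB


# Illustrative values only; scattering is set to zero to isolate alpha_eff.
alpha_total, p_out, loss_dB = attenuation_chain(
    alpha_core=5e-4, alpha_clad=8e-4, gamma=0.81,
    alpha_scatt_bulk=0.0, alpha_scatt_surf=0.0,
    length_m=0.1, input_power=5e-3,
)
print(alpha_total, p_out, loss_dB)
```

Substituting the output-power expression into the loss definition gives \(\mathcal{L}_{\rm prop} = (10/\ln 10)\,\alpha_{\rm total} L \approx 4.343\,\alpha_{\rm total} L\), the same conversion factor used for `scattering_loss_dB`.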

## ⚙️ Methodology (Physics-Only)

1. **Random Sampling**: Uniformly sample each input parameter within its specified range.
2. **Mode Balancing**: Constrain the sampling so that approximately 50% of samples fall in the single-mode regime ($V<2.405$) and 50% in the multimode regime ($V\ge2.405$).
3. **Physics-Based Computation**: Use the equations above to compute all optical performance metrics (losses, MFD, $n_{\rm eff}$, polarization fractions, etc.).
4. **Complex Index Formatting**: Store core and cladding refractive indices as strings `"x+yj"` without extra brackets.
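
Steps 1, 2, and 4 could be sketched as below. This is not the dataset's actual generator: the balancing is done here by rejection sampling (redrawing until \(V\) lands in the requested regime), the `1e-4` margin that keeps the cladding index "just below" the core index is an assumed value, and only the features needed to compute \(V\) are sampled. All names are hypothetical.

```python
import math
import random

V_CUTOFF = 2.405


def format_index(n):
    """Format a complex index as "x+yj" text, dropping the parentheses str() adds."""
    return str(n).strip("()")


def sample_design(rng, single_mode, max_tries=10_000):
    """Draw one design uniformly, retrying until V falls in the requested regime."""
    for _ in range(max_tries):
        core_re = rng.uniform(1.48, 1.52) if single_mode else rng.uniform(1.50, 1.52)
        clad_re = rng.uniform(1.44, core_re - 1e-4)  # ASSUMPTION: 1e-4 margin
        radius = rng.uniform(0.5e-6, 2e-6) if single_mode else rng.uniform(2e-6, 10e-6)
        wavelength = rng.uniform(500e-9, 1.6e-6)
        V = (2 * math.pi * radius / wavelength) * math.sqrt(core_re**2 - clad_re**2)
        if (V < V_CUTOFF) == single_mode:
            core = complex(core_re, -rng.uniform(1e-8, 1e-7))
            clad = complex(clad_re, -rng.uniform(1e-8, 1e-7))
            return {
                "core_index": format_index(core),
                "clad_index": format_index(clad),
                "core_radius_m": radius,
                "wavelength_m": wavelength,
                "V_parameter": V,
            }
    raise RuntimeError("no sample found in the requested regime")


rng = random.Random(0)  # fixed seed for reproducibility
samples = [sample_design(rng, single_mode=(i % 2 == 0)) for i in range(1000)]
n_single = sum(s["V_parameter"] < V_CUTOFF for s in samples)
print(n_single, len(samples) - n_single)
print(samples[0]["core_index"], complex(samples[0]["core_index"]))
```

Strings written this way can be read back with Python's built-in `complex()`, which accepts text such as `1.5-5e-08j` as long as it contains no surrounding whitespace.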