# 📘 Updated Realistic Synthetic Waveguide Dataset (Physics-Only)
---
**Dataset Overview**

This synthetic dataset is designed for training **physics‐inspired neural networks** in the domain of **optical waveguide characterization**. The dataset consists of **50,000 samples**. Each sample comprises **15 input features** that capture the physical, material, and geometrical attributes of a waveguide and **14 output targets** that describe its optical performance (losses, mode characteristics, effective index, and polarization components).

The dataset is especially useful for developing data‐driven models that can predict waveguide performance from basic design parameters, enabling applications in integrated photonics for both glass and silicon-based devices.

---
## 📂 Input Parameter List (15 Features)
| **Feature**               | **Description**                                                                                                                          |
|---------------------------|-------------------------------------------------------------------------------------------------------------------------------------------|
| `core_index`              | Complex refractive index of the waveguide core. Range: real part 1.48–1.52 (single‑mode) or 1.50–1.52 (multimode); imaginary part between −1×10⁻⁸ and −1×10⁻⁷. Format: “x+yj”. |
| `clad_index`              | Complex refractive index of the cladding. Range: real part from ~1.44 up to just below the core index; imaginary part between −1×10⁻⁸ and −1×10⁻⁷. Format: same as `core_index`. |
| `core_radius_m`           | Core radius (a) in meters. Range: 0.5–2 µm (single‑mode) or 2–10 µm (multimode).                                                     |
| `clad_radius_m`           | Cladding radius (b) in meters. Range: 20–50 µm.                                                                                         |
| `length_m`                | Waveguide length (L) in meters. Range: 0.001–0.5 m.                                                                                     |
| `wavelength_m`            | Operating wavelength (λ) in meters. Range: 500 nm–1600 nm.                                                                             |
| `polarization`            | Input polarization (0 = TE, 1 = TM). Range: 0–1.                                                                                       |
| `alpha_core`              | Core intrinsic loss coefficient (α₁) in m⁻¹. Range: 1×10⁻⁴–1×10⁻³.                                                                     |
| `alpha_clad`              | Cladding intrinsic loss coefficient (α₂) in m⁻¹. Range: 1×10⁻⁴–1×10⁻³.                                                                  |
| `photoelastic_coeff`      | Photoelastic coefficient (p) of the core material. Range: 0.20–0.25.                                                                     |
| `delta_rho_over_rho`      | Density variation ratio (Δρ/ρ). Range: 1×10⁻¹²–1×10⁻¹¹.                                                                                 |
| `sigma_rms_m`             | RMS surface roughness (σ) at core–clad interface (m). Range: 1–10 nm.                                                                   |
| `roughness_corr_length_m` | Correlation length (L_corr) of interface roughness (m). Range: 100 nm–1 µm.                                                             |
| `w_in_m`                  | Input beam waist (w_in) in meters. Range: 1–5 µm.                                                                                      |
| `input_power`             | Input optical power (P_in) in W. Range: 1×10⁻³–1×10⁻².                                                                                 |
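
Because `core_index` and `clad_index` are stored as strings, they have to be converted back to complex numbers before use. The following is a minimal sketch using Python's built-in `complex()`. The helper name `parse_index` is hypothetical, and the handling of a possible `+-` sign sequence is an assumption about how negative imaginary parts might be written, not something the dataset description confirms.

```python
def parse_index(text: str) -> complex:
    """Parse a refractive index stored as a string such as '1.5-5e-08j'."""
    # complex() rejects embedded whitespace, so remove it first.
    cleaned = text.replace(" ", "")
    # Assumption: a negative imaginary part may appear as 'x+-yj'; normalize it.
    cleaned = cleaned.replace("+-", "-")
    return complex(cleaned)


n_core = parse_index("1.50-5e-08j")
# The real part feeds the V-parameter; the imaginary part carries absorption.
print(n_core.real, n_core.imag)
```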


---
## 🌟 Output Parameter List (14 Targets)
| **Target**                   | **Description**                                                                                                    |
|------------------------------|--------------------------------------------------------------------------------------------------------------------|
| `propagation_loss_dB`        | Propagation loss (dB): $$P_{out} = P_{in}\exp(-\alpha_{total}L),\quad L_{prop}=10\log_{10}(P_{in}/P_{out})$$     |
| `insertion_loss_dB`          | Insertion (coupling) loss (dB) via Gaussian overlap.                                                               |
| `coupling_loss_dB`           | Coupling loss (dB), identical to insertion_loss_dB.                                                                |
| `mode_field_diameter_m`      | Mode field diameter (MFD): $$w = a(0.65+1.619V^{-1.5}+2.879V^{-6}),\; MFD=2w$$                                      |
| `mode_confinement_factor`    | Fraction of power in the core: $$\Gamma = u^2/V^2$$                                                               |
| `single_mode`                | ‘Y’ if single-mode (V<2.405), else ‘N’.                                                                            |
| `multi_mode`                 | ‘Y’ if multimode (V≥2.405), else ‘N’.                                                                              |
| `scattering_loss_dB`         | Scattering loss (dB): $$4.343(\alpha_{scatt,bulk}+\alpha_{scatt,surf})L$$                                         |
| `effective_index`            | Effective refractive index: $$n_{eff}=\sqrt{n_{clad}^2+(u^2/V^2)(n_{core}^2-n_{clad}^2)}$$                       |
| `cross_coupling`             | Cross-coupling: $$0\;\text{if }V<2.405,\;0.5\,(V-2.405)/V\;\text{if }V\ge2.405$$                                   |
| `TE_percent`, `TM_percent`   | Mode polarization percentages (%)                                                                                  |
| `V_parameter`                | Normalized frequency: $$V=\frac{2\pi a}{\lambda}\sqrt{n_{core,real}^2-n_{clad,real}^2}$$                          |
| `output_power`               | Output power: $$P_{out}=P_{in}\exp(-\alpha_{total}L)$$                                                            |
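
Several targets in this table depend only on `V_parameter` and the indices. The sketch below shows one way to derive the mode flags, `cross_coupling`, and `effective_index` from the formulas in the table, with `u` taken from the piecewise rule in the Key Physics Equations section. The function name `derived_targets` is hypothetical, and using only the real parts of the indices for `effective_index` is an assumption.

```python
import math

V_CUTOFF = 2.405  # single-mode cutoff used throughout the dataset


def derived_targets(V: float, n_core: float, n_clad: float) -> dict:
    """Compute mode flags, cross-coupling, and effective index from V."""
    single = V < V_CUTOFF
    # Piecewise eigenvalue parameter u, then confinement factor u^2 / V^2.
    u = 0.9 * V if single else V - 0.5
    gamma = (u / V) ** 2
    # Cross-coupling is zero below cutoff and grows with V above it.
    cross = 0.0 if single else 0.5 * (V - V_CUTOFF) / V
    # Assumption: real parts of the indices are used here.
    n_eff = math.sqrt(n_clad**2 + gamma * (n_core**2 - n_clad**2))
    return {
        "single_mode": "Y" if single else "N",
        "multi_mode": "N" if single else "Y",
        "cross_coupling": cross,
        "effective_index": n_eff,
    }


print(derived_targets(2.0, 1.50, 1.45))
```

Since the confinement factor lies between 0 and 1, the resulting `effective_index` falls between the cladding and core indices.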


---
## 🧮 Key Physics Equations
---
1. **Normalized Frequency**  
   $$V = \frac{2\pi\,a}{\lambda}\sqrt{n_{core,real}^2 - n_{clad,real}^2}$$

2. **Mode Field Radius**  
   $$w = a\Bigl(0.65 + 1.619V^{-1.5} + 2.879V^{-6}\Bigr)$$

3. **Mode Field Diameter**  
   $$MFD = 2w$$

4. **Eigenvalue Parameter**  
   $$u = \begin{cases}0.9V,&V<2.405\\V-0.5,&V\ge2.405\end{cases}$$

5. **Mode Confinement Factor**  
   $$\Gamma = \frac{u^2}{V^2}$$

6. **Effective Attenuation**  
   $$\alpha_{eff} = \alpha_{core}\Gamma + \alpha_{clad}(1-\Gamma)$$

7. **Bulk Scattering**  
   $$\alpha_{scatt,bulk} = \frac{8\pi^3}{3\lambda^4}p^2(\tfrac{\Delta\rho}{\rho})^2\Gamma$$

8. **Surface Scattering**  
   $$\alpha_{scatt,surf} = \frac{4\pi^3}{\lambda^2}\sigma_{rms}^2L_{corr}$$

9. **Total Attenuation**  
   $$\alpha_{total} = \alpha_{eff} + \alpha_{scatt,bulk} + \alpha_{scatt,surf}$$

10. **Output Power**  
   $$P_{out} = P_{in}\exp(-\alpha_{total}L)$$

11. **Propagation Loss**  
   $$L_{prop} = 10\log_{10}(P_{in}/P_{out})$$
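
The eleven equations above chain together into a single forward computation. The following sketch evaluates them in order for one sample, using only the standard library. The function and argument names are hypothetical, and the inputs are assumed to be in SI units with real-valued indices; this is an illustration of the equations as written, not the generator used to build the dataset.

```python
import math


def waveguide_metrics(n_core, n_clad, a, wavelength, length,
                      alpha_core, alpha_clad, p, drho_over_rho,
                      sigma_rms, l_corr, p_in):
    """Evaluate equations 1-11 for a single sample (SI units assumed)."""
    # 1. Normalized frequency
    V = 2 * math.pi * a / wavelength * math.sqrt(n_core**2 - n_clad**2)
    # 2-3. Mode field radius and diameter
    w = a * (0.65 + 1.619 * V**-1.5 + 2.879 * V**-6)
    mfd = 2 * w
    # 4-5. Eigenvalue parameter and confinement factor
    u = 0.9 * V if V < 2.405 else V - 0.5
    gamma = (u / V) ** 2
    # 6. Confinement-weighted intrinsic attenuation
    alpha_eff = alpha_core * gamma + alpha_clad * (1 - gamma)
    # 7-8. Bulk and surface scattering
    alpha_bulk = (8 * math.pi**3 / (3 * wavelength**4)
                  * p**2 * drho_over_rho**2 * gamma)
    alpha_surf = 4 * math.pi**3 / wavelength**2 * sigma_rms**2 * l_corr
    # 9-11. Total attenuation, output power, propagation loss in dB
    alpha_total = alpha_eff + alpha_bulk + alpha_surf
    p_out = p_in * math.exp(-alpha_total * length)
    loss_db = 10 * math.log10(p_in / p_out)
    return {"V": V, "w": w, "MFD": mfd, "u": u, "gamma": gamma,
            "alpha_total": alpha_total, "P_out": p_out,
            "propagation_loss_dB": loss_db}


# Illustrative values chosen from within the documented input ranges.
result = waveguide_metrics(
    n_core=1.50, n_clad=1.49, a=1e-6, wavelength=1e-6, length=0.01,
    alpha_core=5e-4, alpha_clad=5e-4, p=0.22, drho_over_rho=5e-12,
    sigma_rms=5e-9, l_corr=5e-7, p_in=5e-3)
print(result)
```

As a sanity check, the propagation loss in dB should equal roughly 4.343 times the product of total attenuation and length, the same conversion factor that appears in the `scattering_loss_dB` formula.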


---
## ⚙️ Methodology (Physics-Only)
---
1. **Random Sampling** – Uniformly sample each input parameter within its specified range.
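2. **Mode Balancing** – Constrain sampling so that roughly half of the samples are single-mode (V<2.405) and half are multimode (V≥2.405), using the regime-specific ranges for `core_index` and `core_radius_m`.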
3. **Physics-Based Computation** – Compute optical performance metrics using the equations above.
4. **Index Formatting** – Store complex indices as strings “x+yj” for easy parsing.
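
The sampling and mode-balancing steps can be sketched as follows. The description above does not say how the 50/50 split is enforced, so this example assumes simple rejection sampling: draw from the regime-specific ranges and redraw until V falls on the requested side of 2.405. The function names are hypothetical, and only the four parameters that determine V are sampled here.

```python
import math
import random

V_CUTOFF = 2.405


def sample_geometry(rng: random.Random, single_mode: bool) -> dict:
    """Draw index, radius, and wavelength until V matches the regime."""
    while True:
        # Regime-specific ranges from the input parameter table.
        n_core = rng.uniform(1.48, 1.52) if single_mode else rng.uniform(1.50, 1.52)
        a = rng.uniform(0.5e-6, 2e-6) if single_mode else rng.uniform(2e-6, 10e-6)
        n_clad = rng.uniform(1.44, n_core)
        wavelength = rng.uniform(500e-9, 1600e-9)
        if n_clad >= n_core:
            continue  # cladding index must stay strictly below the core index
        V = 2 * math.pi * a / wavelength * math.sqrt(n_core**2 - n_clad**2)
        # Assumption: reject and redraw when V lands in the wrong regime.
        if (V < V_CUTOFF) == single_mode:
            return {"n_core": n_core, "n_clad": n_clad, "a": a,
                    "wavelength": wavelength, "V": V,
                    "single_mode": single_mode}


def balanced_samples(n: int, rng: random.Random) -> list:
    """Alternate regimes so the two classes differ by at most one sample."""
    return [sample_geometry(rng, single_mode=(i % 2 == 0)) for i in range(n)]


samples = balanced_samples(1000, random.Random(0))
print(sum(s["single_mode"] for s in samples), "single-mode samples")
```

Rejection sampling keeps each accepted parameter inside its documented range, at the cost of the accepted samples no longer being exactly uniform within that range.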
