# 0. Gauge Study R&R PyQt GUI Tool

![GaugeStudy_GUI_completeSketch.png](attachment:8e090af5-3e46-43c1-b0f3-ae2d0f6715fc.png)

Our **Gauge R&R PyQt GUI** concept is well-structured, integrating key functionalities needed for **Repeatability & Reproducibility (R&R) studies**. Here's a detailed breakdown:

---

### **🔹 1. CSV Handling & Data Preprocessing**
✔ **Load CSV:** Allows users to import measurement datasets.  
✔ **Validate CSV:** Checks file structure, ensuring required columns (e.g., Part ID, Operator ID, Trial Number, Measured Value) are present.  
✔ **Clean Data:** Enables outlier removal, missing value handling, and duplicate filtering before analysis begins.  
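
The validation step can be sketched as a simple column check. This is a minimal sketch, assuming the column names written by the data generator later in this notebook (`Operator`, `Part`, `Trial`, `Measured Value`); the helper name `find_missing_columns` is hypothetical and not part of the GUI code.

```python
import pandas as pd

# Assumed column names, taken from the simulated dataset in this notebook
REQUIRED_COLUMNS = ["Operator", "Part", "Trial", "Measured Value"]

def find_missing_columns(df, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from df."""
    present = {str(column).strip() for column in df.columns}
    return [name for name in required if name not in present]

# Example: a frame that lacks the Trial column
sample = pd.DataFrame({"Operator": [1], "Part": [1], "Measured Value": [99.7]})
missing_columns = find_missing_columns(sample)
print(missing_columns)
```

An empty list means the structure check passed; anything else can be written to the error log before analysis starts.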

---

### **🔹 2. Study Configuration & Analysis Execution**
✔ **One-Factor vs. Two-Factor Study Selection:** Users choose between **single-operator studies** or **multi-operator reproducibility assessments**.  
✔ **Remove Duplicates & Range Filtering:** Protects data integrity by dropping duplicate rows and readings outside the accepted measurement range.  
✔ **Abort & Run Analysis:** Provides control over execution with a **progress bar** displaying the computation status.  
✔ **Summary Statistics Panel:** Shows **total Gauge R&R, repeatability, reproducibility**, and **the number of distinct categories (ndc)** that the measurement system can resolve.  
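
The summary figures can be derived from variance components. The sketch below assumes a balanced crossed study (every operator measures every part the same number of times) and uses the usual two-way random-effects ANOVA estimators. The function names, the percent-of-total-variance convention, and the `1.41` factor for the number of distinct categories are assumptions based on common practice, not necessarily what the tool will implement.

```python
import math
import numpy as np

def estimate_variance_components(y):
    """Estimate variance components from a balanced array y[part, operator, trial]."""
    p, o, r = y.shape
    grand_mean = y.mean()
    part_means = y.mean(axis=(1, 2))
    operator_means = y.mean(axis=(0, 2))
    cell_means = y.mean(axis=2)

    # Mean squares of the two-way crossed ANOVA with interaction
    ms_part = o * r * np.sum((part_means - grand_mean) ** 2) / (p - 1)
    ms_oper = p * r * np.sum((operator_means - grand_mean) ** 2) / (o - 1)
    interaction = cell_means - part_means[:, None] - operator_means[None, :] + grand_mean
    ms_inter = r * np.sum(interaction ** 2) / ((p - 1) * (o - 1))
    ms_error = np.sum((y - cell_means[:, :, None]) ** 2) / (p * o * (r - 1))

    # Method-of-moments estimates; negative estimates are truncated to zero
    var_inter = max(0.0, float(ms_inter - ms_error) / r)
    var_oper = max(0.0, float(ms_oper - ms_inter) / (p * r))
    var_part = max(0.0, float(ms_part - ms_inter) / (o * r))
    return {
        "part": var_part,
        "repeatability": float(ms_error),
        "reproducibility": var_oper + var_inter,
    }

def summarize(components):
    """Express the components as percentages of total variance and compute ndc."""
    grr = components["repeatability"] + components["reproducibility"]
    total = grr + components["part"]
    ndc = int(1.41 * math.sqrt(components["part"] / grr)) if grr > 0 else None
    return {
        "pct_grr": 100 * grr / total,
        "pct_repeatability": 100 * components["repeatability"] / total,
        "pct_reproducibility": 100 * components["reproducibility"] / total,
        "pct_part": 100 * components["part"] / total,
        "ndc": ndc,
    }

# Example: 10 parts, 3 operators, 3 trials, noise standard deviation 0.5
rng = np.random.default_rng(0)
true_parts = rng.uniform(50, 150, size=10)
y = true_parts[:, None, None] + rng.normal(0, 0.5, size=(10, 3, 3))
summary = summarize(estimate_variance_components(y))
print(summary)
```

Because the example adds only per-reading noise, almost all variance is attributed to the parts and the Gauge R&R share is small.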

---

### **🔹 3. Detailed Gauge Study Results**
✔ **Tabulated Overview:** Displays essential statistical results, including:  
   - **μY (Mean Measurement)**  
   - **γP, γM, γR (Part Variance, Measurement-System Variance, and their Ratio γP/γM)**  
   - **PTR (Precision-to-Tolerance Ratio)**  
   - **SNR (Signal-to-Noise Ratio)**  
   - **Cp (Process Capability Index)**  
   - **β and δ indices** → Misclassification rates (missed faults and false failures), important for **measurement system reliability**.  
   - **Cg & Cgk (Gauge Capability Indices)**  
   - **Tolerance Ratio** → Assesses measurement consistency within design specifications.  
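
Several of the tabulated quantities are simple functions of two variances and the specification limits. The following is a minimal sketch under stated assumptions: `gamma_p` is taken to be the part (process) variance, `gamma_m` the measurement-system variance, and the formulas (including the multiplier `k = 6`) follow definitions commonly used in the gauge-study literature. The tool's final definitions may differ, and the function name is hypothetical.

```python
import math

def capability_metrics(gamma_p, gamma_m, lsl, usl, k=6.0):
    """Compute ratio-type gauge metrics from variances and spec limits.

    Assumed definitions (not confirmed by the GUI design):
      gamma_R = gamma_p / gamma_m
      PTR     = k * sqrt(gamma_m) / (usl - lsl)
      SNR     = sqrt(2 * gamma_R)
      Cp      = (usl - lsl) / (6 * sqrt(gamma_p))
    """
    tolerance = usl - lsl
    gamma_r = gamma_p / gamma_m
    return {
        "gamma_R": gamma_r,
        "PTR": k * math.sqrt(gamma_m) / tolerance,
        "SNR": math.sqrt(2 * gamma_r),
        "Cp": tolerance / (6 * math.sqrt(gamma_p)),
    }

# Example with illustrative numbers only
metrics = capability_metrics(gamma_p=100.0, gamma_m=1.0, lsl=40.0, usl=160.0)
print(metrics)
```

With these illustrative inputs the measurement system consumes 5% of the tolerance (PTR = 0.05) and the process capability index is 2.0.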

---

### **🔹 4. Interactive Visualization**
✔ **Box Plots:** Display measurement spread across different parts and repeatability conditions.  
✔ **Pie Chart:** Illustrates variance contribution (repeatability vs. reproducibility).  
✔ **Histograms:** Show overall measurement distribution patterns.  
✔ **SNR vs. PTR Scatter Plot:** Highlights the balance between **signal strength and precision**.  

---

### **🔹 5. Export & Reporting**
✔ **Export Results:** Saves processed data in CSV or Excel format.  
✔ **Generate Report:** Creates a **structured PDF summary** for documentation.  
✔ **Explain Results:** Provides **interpretations of key metrics**, guiding users in decision-making.  

---

### **🔹 6. XAI (Explainable AI) Support**
✔ **Placeholder for AI-driven explanations**, assisting users in understanding statistical findings.  

---

Our concept covers **data handling, validation, analysis, visualization, and reporting** in a single workflow for **Gauge R&R studies in manufacturing and quality control**.


Here's a **Pythonic GUI layout** using **PyQt** with well-structured placeholders, allowing flexibility for future refinements based on your design concept.

---

### **🔹 Pythonic GUI Layout (PyQt)**
This script provides a **basic window structure**, including **placeholders** for key functionalities (CSV handling, analysis, results visualization, and reporting).

```python
import sys
from PyQt6.QtWidgets import (
    QApplication, QMainWindow, QPushButton, QVBoxLayout, QHBoxLayout, QWidget,
    QLabel, QComboBox, QTableWidget, QTableWidgetItem, QFileDialog
)
from matplotlib.figure import Figure
from matplotlib.backends.backend_qtagg import FigureCanvasQTAgg as FigureCanvas  # Qt6-aware backend (Matplotlib >= 3.5)

class GaugeRRGUI(QMainWindow):
    def __init__(self):
        super().__init__()

        # Window Configuration
        self.setWindowTitle("Gauge R&R Study Tool")
        self.setGeometry(100, 100, 1200, 800)

        # Main Layout
        self.central_widget = QWidget()
        self.setCentralWidget(self.central_widget)
        self.layout = QVBoxLayout()

        # Header Section
        self.header_label = QLabel("Gauge R&R Study")
        self.layout.addWidget(self.header_label)

        # CSV Handling Section
        csv_layout = QHBoxLayout()
        self.load_csv_btn = QPushButton("Load CSV")
        self.validate_csv_btn = QPushButton("Validate CSV")
        self.clean_csv_btn = QPushButton("Clean Data")
        csv_layout.addWidget(self.load_csv_btn)
        csv_layout.addWidget(self.validate_csv_btn)
        csv_layout.addWidget(self.clean_csv_btn)
        self.layout.addLayout(csv_layout)

        # Study Configuration Section
        config_layout = QHBoxLayout()
        self.study_type_dropdown = QComboBox()
        self.study_type_dropdown.addItems(["One-Factor Study", "Two-Factor Study"])
        self.run_analysis_btn = QPushButton("Run Analysis")
        config_layout.addWidget(QLabel("Select Study Type:"))
        config_layout.addWidget(self.study_type_dropdown)
        config_layout.addWidget(self.run_analysis_btn)
        self.layout.addLayout(config_layout)

        # Data Preview Table
        self.data_table = QTableWidget()
        self.data_table.setColumnCount(4)  # Placeholder for CSV structure
        self.layout.addWidget(self.data_table)

        # Results Section (Tabulated Overview)
        self.results_label = QLabel("Analysis Results")
        self.layout.addWidget(self.results_label)
        self.results_table = QTableWidget()
        self.results_table.setColumnCount(6)  # Placeholder for Cp, PTR, SNR, γR, β, Δ
        self.layout.addWidget(self.results_table)

        # Interactive Visualization (Embedded Matplotlib)
        self.figure = Figure()
        self.canvas = FigureCanvas(self.figure)
        self.layout.addWidget(self.canvas)

        # Export & Reporting
        report_layout = QHBoxLayout()
        self.export_results_btn = QPushButton("Export Results")
        self.generate_report_btn = QPushButton("Generate Report")
        report_layout.addWidget(self.export_results_btn)
        report_layout.addWidget(self.generate_report_btn)
        self.layout.addLayout(report_layout)

        # Set Main Layout
        self.central_widget.setLayout(self.layout)

        # Placeholder functions
        self.load_csv_btn.clicked.connect(self.load_csv)
        self.run_analysis_btn.clicked.connect(self.run_analysis)

    def load_csv(self):
        """Placeholder function for loading a CSV file"""
        file_path, _ = QFileDialog.getOpenFileName(self, "Open CSV", "", "CSV Files (*.csv)")
        if file_path:
            print(f"Loaded CSV: {file_path}")  # Placeholder: Actual parsing should be implemented

    def run_analysis(self):
        """Placeholder function for running Gauge R&R analysis"""
        print("Running Gauge R&R analysis...")  # Placeholder for actual statistical computations

if __name__ == "__main__":
    app = QApplication(sys.argv)
    window = GaugeRRGUI()
    window.show()
    sys.exit(app.exec())
```

---

### **🔹 Key Features & Placeholder Overview**
✔ **CSV Handling:** Load, validate, and clean datasets.  
✔ **Study Configuration:** Select **One-factor vs. Two-factor study** before analysis.  
✔ **Data Preview Table:** Displays imported CSV structure.  
✔ **Results Panel:** Tabulated overview (**Cp, PTR, SNR, γR, β, Δ**).  
✔ **Interactive Graphs:** Embedded **Matplotlib canvas** for visual analysis.  
✔ **Export & Reporting:** Save **processed data** and generate reports.  
✔ **Future Expansion:** Functions can be **enhanced for statistical analysis**, **filtering**, and **dynamic visualization**.

---
  

# 1. Installation section

In [3]:
!pip install --upgrade pip
!pip install PyQt6

Collecting PyQt6
  Downloading pyqt6-6.9.1-cp39-abi3-win_amd64.whl.metadata (2.2 kB)
Collecting PyQt6-sip<14,>=13.8 (from PyQt6)
  Downloading pyqt6_sip-13.10.2-cp311-cp311-win_amd64.whl.metadata (515 bytes)
Collecting PyQt6-Qt6<6.10.0,>=6.9.0 (from PyQt6)
  Downloading pyqt6_qt6-6.9.1-py3-none-win_amd64.whl.metadata (551 bytes)
Downloading pyqt6-6.9.1-cp39-abi3-win_amd64.whl (25.7 MB)


In [3]:
!pip install --upgrade bottleneck

Collecting bottleneck
  Downloading bottleneck-1.5.0-cp311-cp311-win_amd64.whl.metadata (8.3 kB)
Downloading bottleneck-1.5.0-cp311-cp311-win_amd64.whl (112 kB)
Installing collected packages: bottleneck
  Attempting uninstall: bottleneck
    Found existing installation: Bottleneck 1.3.5
    Uninstalling Bottleneck-1.3.5:
      Successfully uninstalled Bottleneck-1.3.5
Successfully installed bottleneck-1.5.0




# 2. Generate Data

## 2.1 Correct data

We can create a function that **simulates Gauge R&R measurement data**, ensuring it aligns with our **GUI structure**.  

### **🔹 Function Design**
✔ **Generate multiple measurement trials** → Simulating data for app analysis.  
✔ **Include multiple operators & parts** → Properly structured Gauge R&R study.  
✔ **Introduce realistic variation** → Mimic real-world measurement conditions.  
✔ **Format data as structured output** → Ensuring seamless integration into the GUI.  

---

### **🔹 Python Code for Simulated Gauge R&R Measurements**
```python
import numpy as np
import pandas as pd

def generate_gauge_rr_data(num_operators=3, num_parts=10, num_trials=3, true_values_range=(50, 150), measurement_noise=0.5):
    """
    Generates a simulated Gauge R&R measurement dataset.
    
    Parameters:
        num_operators (int): Number of operators conducting measurements.
        num_parts (int): Number of unique parts measured.
        num_trials (int): Number of repeated trials per operator per part.
        true_values_range (tuple): Range of true part values (min, max).
        measurement_noise (float): Standard deviation for measurement variation.
    
    Returns:
        DataFrame: Simulated Gauge R&R measurements.
    """
    np.random.seed(42)  # Ensures reproducibility
    parts = np.arange(1, num_parts + 1)
    true_values = np.random.uniform(*true_values_range, num_parts)  # Simulating true part values

    data_records = []
    for operator in range(1, num_operators + 1):
        for part_idx, part in enumerate(parts):
            for trial in range(1, num_trials + 1):
                measured_value = true_values[part_idx] + np.random.normal(0, measurement_noise)
                data_records.append([operator, part, trial, measured_value])

    columns = ["Operator", "Part", "Trial", "Measured Value"]
    df = pd.DataFrame(data_records, columns=columns)

    return df

# Example Usage:
df_rr_data = generate_gauge_rr_data()
print(df_rr_data.head())  # Display first few rows
```

---

### **🔹 How This Works**
✔ **Randomized Part Values** → Each part gets a true value within a defined range.  
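✔ **Multiple Operators & Trials** → Provides the crossed layout (every operator measures every part several times) that **repeatability** and **reproducibility** estimates require.  
✔ **Simulated Measurement Noise** → Adds one Gaussian noise term per reading. No operator-specific offset is simulated, so the true reproducibility component of this dataset is zero.  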
✔ **Structured Output in Pandas DataFrame** → Ready for **export or GUI integration**.  
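
If a non-zero reproducibility component is wanted for testing, one option is to give each operator a fixed offset. The variant below is a sketch, not the notebook's generator: the function name, the `operator_sd` and `seed` parameters, and the use of `numpy.random.default_rng` are assumptions introduced for illustration.

```python
import numpy as np
import pandas as pd

def generate_gauge_rr_data_with_operator_bias(
    num_operators=3, num_parts=10, num_trials=3,
    true_values_range=(50, 150), measurement_noise=0.5,
    operator_sd=0.3, seed=42,
):
    """Simulate a crossed study in which each operator has a constant offset."""
    rng = np.random.default_rng(seed)
    true_values = rng.uniform(*true_values_range, num_parts)
    operator_bias = rng.normal(0, operator_sd, num_operators)  # hypothetical operator effect

    records = []
    for op_idx in range(num_operators):
        for part_idx in range(num_parts):
            for trial in range(1, num_trials + 1):
                value = (
                    true_values[part_idx]
                    + operator_bias[op_idx]
                    + rng.normal(0, measurement_noise)
                )
                records.append([op_idx + 1, part_idx + 1, trial, value])

    return pd.DataFrame(records, columns=["Operator", "Part", "Trial", "Measured Value"])

df_biased = generate_gauge_rr_data_with_operator_bias()
print(df_biased.groupby("Operator")["Measured Value"].mean())
```

The per-operator means now differ systematically, which gives the reproducibility estimate something to detect.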

---


In [11]:
import numpy as np
import pandas as pd

def generate_gauge_rr_data(
    num_operators=3, num_parts=10, num_trials=3, true_values_range=(50, 150), measurement_noise=0.5, filename="gauge_rr_data.csv"
):
    """
    Generates a simulated Gauge R&R measurement dataset and stores it as a CSV file.

    Parameters:
        num_operators (int): Number of operators conducting measurements.
        num_parts (int): Number of unique parts measured.
        num_trials (int): Number of repeated trials per operator per part.
        true_values_range (tuple): Range of true part values (min, max).
        measurement_noise (float): Standard deviation for measurement variation.
        filename (str): Name of the CSV file to save data.

    Returns:
        str: Filepath of the saved CSV.
    """
    np.random.seed(42)  # Ensures reproducibility
    parts = np.arange(1, num_parts + 1)
    true_values = np.random.uniform(*true_values_range, num_parts)  # Simulating true part values

    data_records = []
    for operator in range(1, num_operators + 1):
        for part_idx, part in enumerate(parts):
            for trial in range(1, num_trials + 1):
                measured_value = true_values[part_idx] + np.random.normal(0, measurement_noise)
                data_records.append([operator, part, trial, measured_value])

    columns = ["Operator", "Part", "Trial", "Measured Value"]
    df = pd.DataFrame(data_records, columns=columns)

    # Save DataFrame as CSV
    df.to_csv(filename, index=False)
    
    return filename  # Return filename for reference

# Example Usage:
csv_file = generate_gauge_rr_data()
print(f"Gauge R&R data stored in: {csv_file}")  # Output CSV filepath

Gauge R&R data stored in: gauge_rr_data.csv


## 2.2 Corrupt Data

Creating **corrupt Gauge R&R data** will help rigorously test our **GUI's cleaning and validation functions**.  

### **🔹 Strategy for Generating Corrupt Data**
✔ **Introduce intentional errors** → Simulating real-world data corruption issues.  
✔ **Include missing values, duplicates, and type mismatches** → Ensuring robust validation.  
✔ **Save data in a CSV format** → Ready for testing with the GUI.  

---

### **🔹 Python Function for Generating Corrupt Gauge R&R Data**
```python
import numpy as np
import pandas as pd

def generate_corrupt_gauge_rr_data(
    num_operators=3, num_parts=10, num_trials=3, true_values_range=(50, 150), measurement_noise=0.5, filename="corrupt_gauge_rr_data.csv"
):
    """
    Generates a simulated corrupt Gauge R&R dataset with intentional errors.

    Parameters:
        num_operators (int): Number of operators conducting measurements.
        num_parts (int): Number of unique parts measured.
        num_trials (int): Number of repeated trials per operator per part.
        true_values_range (tuple): Range of true part values (min, max).
        measurement_noise (float): Standard deviation for measurement variation.
        filename (str): Name of the CSV file to save data.

    Returns:
        str: Filepath of the saved corrupt CSV.
    """
    np.random.seed(42)  # Ensures reproducibility
    parts = np.arange(1, num_parts + 1)
    true_values = np.random.uniform(*true_values_range, num_parts)  # Simulating true part values

    data_records = []
    for operator in range(1, num_operators + 1):
        for part_idx, part in enumerate(parts):
            for trial in range(1, num_trials + 1):
                measured_value = true_values[part_idx] + np.random.normal(0, measurement_noise)

                # Introduce corruption randomly
                if np.random.rand() < 0.15:  
                    measured_value = None  # **Missing Value**
                if np.random.rand() < 0.10:  
                    measured_value = "ERROR"  # **Non-numeric Data**
                recorded_part = part  # Keep the loop variable intact so a mislabel affects only this row
                if np.random.rand() < 0.05:
                    recorded_part = np.random.choice(parts)  # **Wrong Part Label**

                data_records.append([operator, recorded_part, trial, measured_value])

    columns = ["Operator", "Part", "Trial", "Measured Value"]
    df = pd.DataFrame(data_records, columns=columns)

    # Save DataFrame as corrupt CSV
    df.to_csv(filename, index=False)
    
    return filename  # Return filename for reference

# Example Usage:
csv_file = generate_corrupt_gauge_rr_data()
print(f"Corrupt Gauge R&R data stored in: {csv_file}")  # Output CSV filepath
```

---

### **🔹 Key Corruption Types**
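✔ **Missing Values** → Each reading is replaced with `None` with probability 0.15; pandas writes it to the CSV as an empty cell.  
✔ **Non-Numeric Entries** → Each reading is overwritten with the string `"ERROR"` with probability 0.10 (this also overrides a missing value).  
✔ **Incorrect Part Labeling** → With probability 0.05 a row receives a randomly chosen part ID, which can produce repeated Operator/Part/Trial keys rather than exact duplicate rows.  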
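
To see how each corruption type shows up after the file is read back, the counts can be computed as follows. This is a small sketch on an inline CSV sample (the rows are made up for illustration); it assumes the column names used by the generator.

```python
import io
import pandas as pd

# Hand-written sample imitating the corrupt CSV; values are illustrative only
csv_text = """Operator,Part,Trial,Measured Value
1,1,1,87.70
1,1,2,
1,1,3,ERROR
1,2,1,145.31
1,2,1,145.90
"""

df = pd.read_csv(io.StringIO(csv_text))
raw = df["Measured Value"]
numeric = pd.to_numeric(raw, errors="coerce")

is_missing = raw.isna()                       # empty cells
is_non_numeric = numeric.isna() & ~is_missing  # text such as "ERROR"
has_repeated_key = df.duplicated(subset=["Operator", "Part", "Trial"], keep=False)

corruption_summary = {
    "missing": int(is_missing.sum()),
    "non_numeric": int(is_non_numeric.sum()),
    "repeated_keys": int(has_repeated_key.sum()),
}
print(corruption_summary)
```

Checking for missing cells before coercing to numbers is what keeps the two categories apart; after coercion both appear as `NaN`.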

---



In [14]:
import numpy as np
import pandas as pd

def generate_corrupt_gauge_rr_data(
    num_operators=3, num_parts=10, num_trials=3, true_values_range=(50, 150), measurement_noise=0.5, filename="corrupt_gauge_rr_data.csv"
):
    """
    Generates a simulated corrupt Gauge R&R dataset with intentional errors.

    Parameters:
        num_operators (int): Number of operators conducting measurements.
        num_parts (int): Number of unique parts measured.
        num_trials (int): Number of repeated trials per operator per part.
        true_values_range (tuple): Range of true part values (min, max).
        measurement_noise (float): Standard deviation for measurement variation.
        filename (str): Name of the CSV file to save data.

    Returns:
        str: Filepath of the saved corrupt CSV.
    """
    np.random.seed(42)  # Ensures reproducibility
    parts = np.arange(1, num_parts + 1)
    true_values = np.random.uniform(*true_values_range, num_parts)  # Simulating true part values

    data_records = []
    for operator in range(1, num_operators + 1):
        for part_idx, part in enumerate(parts):
            for trial in range(1, num_trials + 1):
                measured_value = true_values[part_idx] + np.random.normal(0, measurement_noise)

                # Introduce corruption randomly
                if np.random.rand() < 0.15:  
                    measured_value = None  # **Missing Value**
                if np.random.rand() < 0.10:  
                    measured_value = "ERROR"  # **Non-numeric Data**
                recorded_part = part  # Keep the loop variable intact so a mislabel affects only this row
                if np.random.rand() < 0.05:
                    recorded_part = np.random.choice(parts)  # **Wrong Part Label**

                data_records.append([operator, recorded_part, trial, measured_value])

    columns = ["Operator", "Part", "Trial", "Measured Value"]
    df = pd.DataFrame(data_records, columns=columns)

    # Save DataFrame as corrupt CSV
    df.to_csv(filename, index=False)
    
    return filename  # Return filename for reference

# Example Usage:
csv_file = generate_corrupt_gauge_rr_data()
print(f"Corrupt Gauge R&R data stored in: {csv_file}")  # Output CSV filepath

Corrupt Gauge R&R data stored in: corrupt_gauge_rr_data.csv


# 3. Implementation

## 3.1 GUI structure

I'll restructure the **GUI layout** into **three columns**, ensuring each section aligns **perfectly** with your design.  

### **🔹 Updated Layout Structure**
✅ **Column 1 (Data Input & Export)**  
   - **Load CSV, Validate CSV, Clean Data Buttons**  
   - **Tabulated Parameter Results (Cp, γR, PTR, SNR, β, Δ, etc.)**  
   - **Export Results, Generate Report & Explain Results Buttons**  

✅ **Column 2 (Analysis Controls & R&R Results)**  
   - **Abort & Run Buttons**  
   - **Gauge R&R Study Metrics & Summary Panel**  
   - **Progress Bar for Computation Status**  

✅ **Column 3 (Logging & Graphical Analysis)**  
   - **Log Window (Error-free process logs)**  
   - **Error Log Window (Display issues & warnings)**  
   - **Graphical Results Section** with:  
     - **Statistics Visualization (Box Plots, Histograms, Variance Contribution Pie Chart)**  
     - **Measurement Distribution Graph**  
     - **PTR-SNR Plot**  
     - **Delta & Beta Index vs. GPQ Iteration Number**   

In [1]:
import sys
from PyQt6.QtWidgets import (
    QApplication, QMainWindow, QPushButton, QVBoxLayout, QWidget,
    QLabel, QTableWidget, QFileDialog, QGridLayout, QTextEdit, QProgressBar,
    QCheckBox, QRadioButton, QGroupBox, QListWidget
)
from matplotlib.figure import Figure
from matplotlib.backends.backend_qtagg import FigureCanvasQTAgg as FigureCanvas  # Qt6-aware backend (Matplotlib >= 3.5)

class GaugeRRGUI(QMainWindow):
    def __init__(self):
        super().__init__()

        # Window Configuration
        self.setWindowTitle("Gauge R&R Study Tool")
        self.setGeometry(100, 100, 1400, 900)

        # Main Layout using Grid for Three Columns
        self.central_widget = QWidget()
        self.setCentralWidget(self.central_widget)
        self.layout = QGridLayout()
        self.layout.setSpacing(15)  # 🔹 Global spacing for better widget separation

        ## COLUMN 1: Data Input & Export ##
        self.load_csv_btn = QPushButton("Load CSV")
        self.validate_csv_btn = QPushButton("Validate CSV")
        self.clean_csv_btn = QPushButton("Clean Data")

        self.data_preview_table = QTableWidget()
        self.data_preview_table.setColumnCount(4)

        self.export_results_btn = QPushButton("Export Results")
        self.generate_report_btn = QPushButton("Generate Report")
        self.explain_results_btn = QPushButton("Explain Results")

        self.layout.addWidget(QLabel("Data Handling"), 0, 0)
        self.layout.addWidget(self.load_csv_btn, 1, 0)
        self.layout.addWidget(self.validate_csv_btn, 2, 0)
        self.layout.addWidget(self.clean_csv_btn, 3, 0)
        self.layout.addWidget(QLabel("Data Preview"), 4, 0)
        self.layout.addWidget(self.data_preview_table, 5, 0, 2, 1)
        self.layout.addWidget(self.export_results_btn, 8, 0)
        self.layout.addWidget(self.generate_report_btn, 9, 0)
        self.layout.addWidget(self.explain_results_btn, 10, 0)

        ## COLUMN 2: Study Configuration & Results ##
        self.study_config_group = QGroupBox("Study Configuration")
        config_layout = QVBoxLayout()
        config_layout.setSpacing(12)  # 🔹 Improved widget spacing within config group

        self.remove_duplicates_cb = QCheckBox("Remove Duplicates")
        self.filter_range_cb = QCheckBox("Filter Range")
        self.exclude_missing_cb = QCheckBox("Exclude Missing Data")
        self.one_factor_rb = QRadioButton("One-Factor Study")  
        self.two_factor_rb = QRadioButton("Two-Factor Study")  

        for widget in [self.remove_duplicates_cb, self.filter_range_cb, self.exclude_missing_cb, self.one_factor_rb, self.two_factor_rb]:
            config_layout.addWidget(widget)

        self.study_config_group.setLayout(config_layout)

        self.abort_analysis_btn = QPushButton("Abort Analysis")
        self.run_analysis_btn = QPushButton("Run Analysis")
        self.progress_bar = QProgressBar()
        self.progress_bar.setValue(0)  # 🔹 A value must be set before any text is shown
        self.progress_bar.setFormat("%p%")  # 🔹 %p is replaced by the current percentage

        self.overall_results_list = QListWidget()
        self.overall_results_list.addItems([
            "Total GR: 35.69%", "Repeatability: 22.06%", "Reproducibility: 27.47%", 
            "% Tolerance: 12.04%", "Number of Distinct Categories: 4"
        ])

        self.results_table = QTableWidget()
        self.results_table.setColumnCount(4)
        self.results_table.setHorizontalHeaderLabels(["Parameter", "Value", "L", "U"])

        self.layout.addWidget(self.study_config_group, 1, 1, 2, 1)
        self.layout.addWidget(self.abort_analysis_btn, 3, 1)
        self.layout.addWidget(self.run_analysis_btn, 4, 1)
        self.layout.addWidget(self.progress_bar, 5, 1)
        self.layout.addWidget(QLabel("Overall Results"), 6, 1)
        self.layout.addWidget(self.overall_results_list, 7, 1, 2, 1)
        self.layout.addWidget(QLabel("Parameter Results"), 9, 1)
        self.layout.addWidget(self.results_table, 10, 1, 2, 1)

        ## COLUMN 3: Logging & Graphical Results ##
        self.log_window = QTextEdit()
        self.log_window.setPlaceholderText("Process Log (Error-free execution)")

        self.error_log_window = QTextEdit()
        self.error_log_window.setPlaceholderText("Error Log (Warnings & issues)")

        self.figure = Figure()
        self.canvas = FigureCanvas(self.figure)

        self.layout.addWidget(QLabel("Logging & Visualization"), 0, 2)
        self.layout.addWidget(self.log_window, 1, 2, 2, 1)
        self.layout.addWidget(self.error_log_window, 3, 2, 2, 1)
        self.layout.addWidget(QLabel("Statistical Visualizations"), 5, 2)
        self.layout.addWidget(self.canvas, 6, 2, 4, 1)

        # Set Final Layout
        self.central_widget.setLayout(self.layout)

        # Ensure Button Functionality
        self.run_analysis_btn.clicked.connect(self.run_analysis)

    def run_analysis(self):
        """Placeholder function for running Gauge R&R analysis"""
        self.log_window.append("Running Gauge R&R analysis...")
        progress_value = 50  
        self.progress_bar.setValue(progress_value)
        self.progress_bar.setFormat("%p%")  # 🔹 Qt fills in the current percentage

if __name__ == "__main__":
    app = QApplication(sys.argv)
    window = GaugeRRGUI()
    window.show()
    sys.exit(app.exec())




## 3.2 Functionality implementation 1: Loading, Cleaning, Validating and Aborting

Here's what should happen when the three **Study Configuration** checkboxes are selected **before validating and cleaning**:

✔ **Remove Duplicates** → The GUI should **identify and eliminate duplicate rows** from the dataset before validation. If the dataset has no duplicates, no changes should be made.

✔ **Filter Range** → The GUI should **check for outliers** in the "Measured Value" column and remove values that fall outside a predefined acceptable range (e.g., physical limitations of the measurement device).

✔ **Exclude Missing Data** → The GUI should **ignore rows containing missing values** before validation, ensuring that only complete, clean data is processed.

---

### **🔹 Expected Behavior**
1️⃣ If **duplicates exist**, they should be removed **before validation**.  
2️⃣ If **out-of-range values exist**, they should be filtered **before validation**.  
3️⃣ If **missing values exist**, they should be dropped **before validation**.  
4️⃣ If the dataset is already clean, **no changes occur**, and validation proceeds as expected.  

Since our synthetic dataset from section 2.1 is already clean, selecting these options should result in **no changes** for it. In a real-world scenario, these preprocessing steps ensure cleaner input before deeper validation. 🚀   
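
The expected behavior above can be sketched as one preprocessing function whose arguments mirror the three checkboxes. This is a minimal sketch, not the GUI's implementation: the function name `preprocess`, the `value_range` tuple, and the decision to always coerce `Measured Value` to numbers first are assumptions.

```python
import pandas as pd

def preprocess(df, remove_duplicates=False, value_range=None, exclude_missing=False):
    """Apply the optional cleaning steps and return a new DataFrame."""
    out = df.copy()
    # Coerce first so that text such as "ERROR" is treated like a missing reading
    out["Measured Value"] = pd.to_numeric(out["Measured Value"], errors="coerce")

    if remove_duplicates:
        out = out.drop_duplicates()
    if value_range is not None:
        low, high = value_range
        in_range = out["Measured Value"].between(low, high)
        # Missing readings are left for the exclude_missing step to handle
        out = out[in_range | out["Measured Value"].isna()]
    if exclude_missing:
        out = out.dropna(subset=["Measured Value"])
    return out.reset_index(drop=True)

# Illustrative rows: one duplicate, one missing, one text entry, one out of range
sample = pd.DataFrame(
    [[1, 1, 1, 100.2], [1, 1, 1, 100.2], [1, 1, 2, None],
     [1, 1, 3, "ERROR"], [1, 2, 1, 9999.0], [1, 2, 2, 120.4]],
    columns=["Operator", "Part", "Trial", "Measured Value"],
)
cleaned = preprocess(sample, remove_duplicates=True, value_range=(50, 150), exclude_missing=True)
print(cleaned)
```

With all options off, the function only converts the value column and leaves the row count unchanged, which matches the "no changes" case for an already clean dataset.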

In [1]:
import sys
import pandas as pd
from PyQt6.QtWidgets import (
    QApplication, QMainWindow, QPushButton, QVBoxLayout, QWidget,
    QLabel, QTableWidget, QTableWidgetItem, QFileDialog, QGridLayout, 
    QTextEdit, QProgressBar, QCheckBox, QRadioButton, QGroupBox, QListWidget
)
from PyQt6.QtGui import QFont, QPalette, QColor
from PyQt6.QtCore import Qt
from matplotlib.figure import Figure
from matplotlib.backends.backend_qtagg import FigureCanvasQTAgg as FigureCanvas  # Qt6-aware backend (Matplotlib >= 3.5)

class GaugeRRGUI(QMainWindow):
    def __init__(self):
        super().__init__()

        # Window Configuration
        self.setWindowTitle("Gauge R&R Study Tool")
        self.setGeometry(100, 100, 1400, 900)

        # Apply Styling
        self.setStyleSheet("""
            QWidget {
                background-color: #F0F0F0;
            }
            QLabel {
                font-size: 14px;
                font-weight: bold;
                color: #333;
            }
            QPushButton {
                background-color: #0078D7;
                color: white;
                font-size: 13px;
                border-radius: 5px;
                padding: 5px;
            }
            QPushButton:hover {
                background-color: #005a9e;
            }
            QTableWidget {
                gridline-color: #CCCCCC;
                border: 1px solid #CCCCCC;
            }
            QTextEdit {
                background-color: #FFFFFF;
                border: 1px solid #CCCCCC;
            }
            QProgressBar {
                height: 20px;
            }
        """)

        # Main Layout using Grid for Your New GUI Design
        self.central_widget = QWidget()
        self.setCentralWidget(self.central_widget)
        self.layout = QGridLayout()
        self.layout.setSpacing(15)

        ## COLUMN 1: Data Input & Export ##
        self.load_csv_btn = QPushButton("Load CSV")
        self.validate_csv_btn = QPushButton("Validate CSV")
        self.clean_csv_btn = QPushButton("Clean Data")

        self.data_preview_table = QTableWidget()
        self.data_preview_table.setColumnCount(4)
        self.data_preview_table.setHorizontalHeaderLabels(["Operator", "Part", "Trial", "Measured Value"])

        self.export_results_btn = QPushButton("Export Results")
        self.generate_report_btn = QPushButton("Generate Report")
        self.explain_results_btn = QPushButton("Explain Results")

        self.layout.addWidget(QLabel("Data Handling"), 0, 0)
        self.layout.addWidget(self.load_csv_btn, 1, 0)
        self.layout.addWidget(self.validate_csv_btn, 2, 0)
        self.layout.addWidget(self.clean_csv_btn, 3, 0)
        self.layout.addWidget(QLabel("Data Preview"), 4, 0)
        self.layout.addWidget(self.data_preview_table, 5, 0, 2, 1)
        self.layout.addWidget(self.export_results_btn, 8, 0)
        self.layout.addWidget(self.generate_report_btn, 9, 0)
        self.layout.addWidget(self.explain_results_btn, 10, 0)

        ## COLUMN 2: Study Configuration & Results ##
        self.study_config_group = QGroupBox("Study Configuration")
        config_layout = QVBoxLayout()
        config_layout.setSpacing(12)

        self.remove_duplicates_cb = QCheckBox("Remove Duplicates")
        self.filter_range_cb = QCheckBox("Filter Range")
        self.exclude_missing_cb = QCheckBox("Exclude Missing Data")
        self.one_factor_rb = QRadioButton("One-Factor Study")  
        self.two_factor_rb = QRadioButton("Two-Factor Study")  

        for widget in [self.remove_duplicates_cb, self.filter_range_cb, self.exclude_missing_cb, self.one_factor_rb, self.two_factor_rb]:
            config_layout.addWidget(widget)

        self.study_config_group.setLayout(config_layout)

        self.abort_analysis_btn = QPushButton("Abort Analysis")
        self.run_analysis_btn = QPushButton("Run Analysis")
        self.progress_bar = QProgressBar()
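        self.progress_bar.setValue(0)  # A value must be set before any text is shown
        self.progress_bar.setFormat("%p%")  # %p is replaced by the current percentage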

        self.overall_results_list = QListWidget()
        self.results_table = QTableWidget()
        self.results_table.setColumnCount(4)
        self.results_table.setHorizontalHeaderLabels(["Parameter", "Value", "L", "U"])

        self.layout.addWidget(self.study_config_group, 1, 1, 2, 1)
        self.layout.addWidget(self.abort_analysis_btn, 3, 1)
        self.layout.addWidget(self.run_analysis_btn, 4, 1)
        self.layout.addWidget(self.progress_bar, 5, 1)
        self.layout.addWidget(QLabel("Overall Results"), 6, 1)
        self.layout.addWidget(self.overall_results_list, 7, 1, 2, 1)
        self.layout.addWidget(QLabel("Parameter Results"), 9, 1)
        self.layout.addWidget(self.results_table, 10, 1, 2, 1)

        ## COLUMN 3: Logging & Graphical Results ##
        self.log_window = QTextEdit()
        self.log_window.setPlaceholderText("Process Log (Execution Details)")

        self.error_log_window = QTextEdit()
        self.error_log_window.setPlaceholderText("Error Log (Warnings & Issues)")

        self.figure = Figure()
        self.canvas = FigureCanvas(self.figure)

        self.layout.addWidget(QLabel("Logging & Visualization"), 0, 2)
        self.layout.addWidget(self.log_window, 1, 2, 2, 1)
        self.layout.addWidget(self.error_log_window, 3, 2, 2, 1)
        self.layout.addWidget(QLabel("Statistical Visualizations"), 5, 2)
        self.layout.addWidget(self.canvas, 6, 2, 4, 1)

        # Set Final Layout
        self.central_widget.setLayout(self.layout)

        # Connect Buttons to Functions
        self.load_csv_btn.clicked.connect(self.load_csv)
        self.validate_csv_btn.clicked.connect(self.validate_csv)
        self.clean_csv_btn.clicked.connect(self.clean_data)
        self.run_analysis_btn.clicked.connect(self.run_analysis)
        self.abort_analysis_btn.clicked.connect(self.abort_analysis)

    def load_csv(self):
        """Loads CSV file into the data preview table."""
        file_path, _ = QFileDialog.getOpenFileName(self, "Open CSV File", "", "CSV Files (*.csv)")
        if file_path:
            try:
                df = pd.read_csv(file_path)
                df["Measured Value"] = pd.to_numeric(df["Measured Value"], errors="coerce")  # Fix numeric conversion
                self.populate_table(self.data_preview_table, df)
                self.log_window.append(f"Loaded CSV: {file_path}")
            except Exception as e:
                self.error_log_window.append(f"Error loading CSV: {str(e)}")

    def validate_csv(self):
        """Validates CSV data for missing values, errors, and duplicates."""
        try:
            df = self.get_table_data(self.data_preview_table)
            df["Measured Value"] = pd.to_numeric(df["Measured Value"], errors="coerce")  
            issues = []
            if df.isnull().values.any():
                issues.append("Missing values detected.")
            if df["Measured Value"].isnull().any():
                issues.append("Non-numeric data detected in Measured Value column.")
            if df.duplicated().sum() > 0:
                issues.append("Duplicate entries found.")

            if issues:
                for issue in issues:
                    self.error_log_window.append(issue)
                self.log_window.append("Validation failed: Issues found.")
            else:
                self.log_window.append("Validation successful: No issues detected.")

        except Exception as e:
            self.error_log_window.append(f"Validation error: {str(e)}")

    def clean_data(self):
        """Removes invalid data and updates table."""
        try:
            df = self.get_table_data(self.data_preview_table)
            df = df.dropna()
            df["Measured Value"] = pd.to_numeric(df["Measured Value"], errors="coerce")  
            df = df.drop_duplicates()
            self.populate_table(self.data_preview_table, df)
            self.log_window.append("Data cleaned successfully.")
        except Exception as e:
            self.error_log_window.append(f"Cleaning error: {str(e)}")

    def populate_table(self, table, df):
        """Populates a QTableWidget from a DataFrame."""
        table.setRowCount(len(df))
        table.setColumnCount(len(df.columns))
        table.setHorizontalHeaderLabels(df.columns)
    
        for row in range(len(df)):
            for col in range(len(df.columns)):
                table.setItem(row, col, QTableWidgetItem(str(df.iloc[row, col])))

    
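    def get_table_data(self, table):
        """Extracts table data into a DataFrame, using the table's own header labels."""
        rows = table.rowCount()
        cols = table.columnCount()

        # Read column names from the header instead of hard-coding four names,
        # so a CSV with a different column count does not break the DataFrame constructor.
        headers = []
        for col in range(cols):
            header_item = table.horizontalHeaderItem(col)
            headers.append(header_item.text() if header_item else f"Column {col}")

        data = []
        for row in range(rows):
            row_data = []
            for col in range(cols):
                item = table.item(row, col)
                row_data.append(item.text() if item else None)  # Handle empty cells
            data.append(row_data)

        df = pd.DataFrame(data, columns=headers)
        if "Measured Value" in df.columns:
            df["Measured Value"] = pd.to_numeric(df["Measured Value"], errors="coerce")  # Ensure numeric format
        return df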

    
    def abort_analysis(self):
        """Stops ongoing analysis and resets relevant UI elements."""
        self.log_window.append("Analysis aborted.")
        self.progress_bar.setValue(0)
        self.progress_bar.setFormat("0%")

    
    def run_analysis(self):
        """Placeholder function for running Gauge R&R analysis."""
        self.log_window.append("Running Gauge R&R analysis...")
        self.progress_bar.setValue(50)
        self.progress_bar.setFormat(f"{self.progress_bar.value()}%")

if __name__ == "__main__":
    app = QApplication(sys.argv)
    window = GaugeRRGUI()
    window.show()
    sys.exit(app.exec())




## 3.3 Functionality implementation 2: Run Analysis

Now, let's implement the **Run Analysis** button so that it computes the **6-sigma Gauge R&R study** metrics and generates the visualizations.

### **🔹 Parameters & Visualizations Overview**
When running the Gauge R&R Study, the analysis should populate the **Overall Results** and **Parameter Tables** with the following metrics:
- **μY (Mean of Y):** Central measurement value.
- **γP (Gamma Part):** Variation between parts.
- **γM (Gamma Measurement System):** Measurement system variability.
- **γR (Gamma Repeatability):** Within-system repeatability.
- **PTR (Precision-to-Tolerance Ratio):** Indicates measurement precision.
- **SNR (Signal-to-Noise Ratio):** System signal strength.
- **Cp (Process Capability Index):** Process efficiency.
- **δ_index & β_index:** Indexed system parameters.
- **Cgk & Tolerance Ratio:** Additional capability metrics.
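
As a point of reference, one convention found in the gauge R&R literature derives PTR, SNR, and Cp from two variance components (the part variance γP and the measurement-system variance γM) together with the specification limits. The sketch below follows that convention; the function name `derived_metrics`, the multiplier `k = 6`, and the spec-limit parameters `lsl`/`usl` are assumptions made for illustration and may differ from what the final tool uses.

```python
import math


def derived_metrics(gamma_p, gamma_m, lsl, usl, k=6.0):
    """Derive PTR, SNR and Cp from variance components (illustrative convention).

    gamma_p: part (process) variance
    gamma_m: measurement-system variance
    lsl/usl: lower and upper specification limits (hypothetical inputs)
    k:       sigma multiplier for the measurement spread (assumed to be 6)
    """
    if gamma_p <= 0 or gamma_m <= 0:
        raise ValueError("variance components must be positive")
    if usl <= lsl:
        raise ValueError("usl must exceed lsl")

    tolerance = usl - lsl
    variance_ratio = gamma_p / gamma_m  # part variance relative to gauge variance
    return {
        "PTR": k * math.sqrt(gamma_m) / tolerance,       # precision-to-tolerance ratio
        "SNR": math.sqrt(2.0 * variance_ratio),          # signal-to-noise ratio
        "Cp": tolerance / (6.0 * math.sqrt(gamma_p)),    # process capability
        "variance_ratio": variance_ratio,
    }
```

Under this convention a smaller PTR and a larger SNR both indicate a more capable measurement system, which is the direction the Green/Yellow/Red zones are meant to capture.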

### **🔹 Plots to Include**
To visualize the data effectively, these graphical representations should be inserted:
1. **Box Plots** → Breakdown of parts and repeatability.
2. **Histogram** → Distribution of measured values.
3. **Pie Chart** → Variance contribution (Repeatability vs. Reproducibility).
4. **Scatter Plot** → Correlation between SNR and PTR.

---

### **🔹 How to Proceed**
1️⃣ **Extract Data from the Table** → Convert raw CSV data into structured arrays for calculation.  
2️⃣ **Perform Gauge R&R Analysis** → Compute the required statistical metrics.  
3️⃣ **Populate the Results Table** → Insert computed parameter values alongside lower & upper bounds.  
4️⃣ **Generate Visualizations** → Insert box plots, histogram, pie chart, and scatter plot in the visualization section.  
5️⃣ **Update the Log Window** → Provide details on calculation results and any detected errors.  
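
For step 2, a common way to obtain the variance components is the ANOVA method for a balanced crossed study (every operator measures every part the same number of times). The sketch below is a minimal pure-Python version under that assumption; the function name `crossed_anova_components` and the row layout `(part, operator, value)` are illustrative, not part of the existing GUI code.

```python
from collections import defaultdict


def crossed_anova_components(rows):
    """Estimate variance components from (part, operator, value) rows.

    Assumes a balanced crossed design with at least two parts,
    two operators and two trials per part/operator cell.
    """
    cells = defaultdict(list)
    for part, operator, value in rows:
        cells[(part, operator)].append(float(value))

    parts = list({key[0] for key in cells})
    operators = list({key[1] for key in cells})
    p, o = len(parts), len(operators)
    r = len(next(iter(cells.values())))
    balanced = len(cells) == p * o and all(len(v) == r for v in cells.values())
    if not balanced or p < 2 or o < 2 or r < 2:
        raise ValueError("need a balanced design with >=2 parts, operators, trials")

    values = [v for cell in cells.values() for v in cell]
    grand = sum(values) / len(values)
    cell_mean = {key: sum(v) / r for key, v in cells.items()}
    part_mean = {a: sum(cell_mean[(a, b)] for b in operators) / o for a in parts}
    oper_mean = {b: sum(cell_mean[(a, b)] for a in parts) / p for b in operators}

    # Sums of squares for the two-way layout
    ss_part = o * r * sum((m - grand) ** 2 for m in part_mean.values())
    ss_oper = p * r * sum((m - grand) ** 2 for m in oper_mean.values())
    ss_cells = r * sum((m - grand) ** 2 for m in cell_mean.values())
    ss_inter = ss_cells - ss_part - ss_oper
    ss_error = sum((v - cell_mean[key]) ** 2 for key, cell in cells.items() for v in cell)

    # Mean squares
    ms_part = ss_part / (p - 1)
    ms_oper = ss_oper / (o - 1)
    ms_inter = ss_inter / ((p - 1) * (o - 1))
    ms_error = ss_error / (p * o * (r - 1))

    # Variance component estimates; negative estimates are truncated to zero
    repeatability = ms_error
    interaction = max((ms_inter - ms_error) / r, 0.0)
    operator_var = max((ms_oper - ms_inter) / (p * r), 0.0)
    part_var = max((ms_part - ms_inter) / (o * r), 0.0)
    return {
        "part": part_var,
        "operator": operator_var,
        "interaction": interaction,
        "repeatability": repeatability,
        "gauge_rr": repeatability + operator_var + interaction,
    }
```

With the preview table loaded into a DataFrame, the rows could be supplied as `df[["Part", "Operator", "Measured Value"]].itertuples(index=False)`.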

The **Run Analysis** implementation below computes the metrics listed above, populates the **Overall Results** list and the **Parameter Results** table (values with lower & upper bounds), and writes computation summaries and warnings to the log window. Missing values and duplicates should be handled before the metrics are computed.

### **🔹 Additional Visualization**
In addition to the four plots listed above, the figure includes:

📌 **Delta & Beta Indices vs. GPQ-Iterations Plot** → Tracking performance shifts.

---

I'll now **develop this functionality** while ensuring seamless integration with your **existing GUI layout!**
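
The lower and upper bounds in the results table need an interval estimate for each parameter. One generic option is a percentile bootstrap, sketched below with only the standard library. The helper name `bootstrap_ci` and its defaults are assumptions. Resampling individual measurements is also a simplification: it ignores the part/operator structure of the study, so a design-aware scheme may be more appropriate for variance components.

```python
import random


def bootstrap_ci(values, statistic, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for statistic(values).

    values:    sequence of measurements
    statistic: callable mapping a list of values to a number
    """
    values = list(values)
    if len(values) < 2:
        raise ValueError("need at least two observations")

    rng = random.Random(seed)  # fixed seed keeps the interval reproducible
    estimates = sorted(
        statistic(rng.choices(values, k=len(values)))  # resample with replacement
        for _ in range(n_resamples)
    )

    alpha = (1.0 - level) / 2.0
    lower = estimates[int(alpha * (n_resamples - 1))]
    upper = estimates[int((1.0 - alpha) * (n_resamples - 1))]
    return lower, upper
```

For example, `bootstrap_ci(df["Measured Value"].dropna(), lambda v: sum(v) / len(v))` would give an interval for the mean of the measured values.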

In [1]:
import sys
import pandas as pd
import numpy as np
from matplotlib.lines import Line2D
import scipy.stats as st  # Import statistical library for confidence intervals
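from PyQt6.QtWidgets import (
    QApplication, QMainWindow, QPushButton, QVBoxLayout, QWidget,
    QLabel, QTableWidget, QTableWidgetItem, QFileDialog, QGridLayout,
    QTextEdit, QProgressBar, QCheckBox, QRadioButton, QGroupBox, QListWidget,
    QMessageBox,  # used by explain_results()
)
from PyQt6.QtGui import QFont, QPalette, QColor
from PyQt6.QtCore import Qt
from matplotlib.figure import Figure
# backend_qtagg supports PyQt6; backend_qt5agg targets the Qt5 bindings
from matplotlib.backends.backend_qtagg import FigureCanvasQTAgg as FigureCanvas
# reportlab is required by generate_pdf_report()
from reportlab.pdfgen import canvas
from reportlab.lib.pagesizes import letter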

class GaugeRRGUI(QMainWindow):
    def __init__(self):
        super().__init__()

        # Window Configuration
        self.setWindowTitle("Gauge R&R Study Tool")
        self.setGeometry(100, 100, 1400, 900)

        # Apply Styling
        self.setStyleSheet("""
            QWidget {
                background-color: #F0F0F0;
            }
            QLabel {
                font-size: 14px;
                font-weight: bold;
                color: #333;
            }
            QPushButton {
                background-color: #0078D7;
                color: white;
                font-size: 13px;
                border-radius: 5px;
                padding: 5px;
            }
            QPushButton:hover {
                background-color: #005a9e;
            }
            QTableWidget {
                gridline-color: #CCCCCC;
                border: 1px solid #CCCCCC;
            }
            QTextEdit {
                background-color: #FFFFFF;
                border: 1px solid #CCCCCC;
            }
            QProgressBar {
                height: 20px;
            }
        """)

        # Main Layout using Grid for Your New GUI Design
        self.central_widget = QWidget()
        self.setCentralWidget(self.central_widget)
        self.layout = QGridLayout()
        self.layout.setSpacing(16)

        ## COLUMN 1: Data Input & Export ##
        self.load_csv_btn = QPushButton("Load CSV")
        self.validate_csv_btn = QPushButton("Validate CSV")
        self.clean_csv_btn = QPushButton("Clean Data")

        self.data_preview_table = QTableWidget()
        self.data_preview_table.setColumnCount(4)
        self.data_preview_table.setHorizontalHeaderLabels(["Operator", "Part", "Trial", "Measured Value"])

        self.export_results_btn = QPushButton("Export Results")
        self.generate_report_btn = QPushButton("Generate Report")
        self.explain_results_btn = QPushButton("Explain Results")

        self.layout.addWidget(QLabel("Data Handling"), 0, 0)
        self.layout.addWidget(self.load_csv_btn, 1, 0)
        self.layout.addWidget(self.validate_csv_btn, 2, 0)
        self.layout.addWidget(self.clean_csv_btn, 3, 0)
        self.layout.addWidget(QLabel("Data Preview"), 4, 0)
        self.layout.addWidget(self.data_preview_table, 5, 0, 2, 1)
        self.layout.addWidget(self.export_results_btn, 8, 0)
        self.layout.addWidget(self.generate_report_btn, 9, 0)
        self.layout.addWidget(self.explain_results_btn, 10, 0)

        ## COLUMN 2: Study Configuration & Results ##
        self.study_config_group = QGroupBox("Study Configuration")
        config_layout = QVBoxLayout()
        config_layout.setSpacing(12)

        self.remove_duplicates_cb = QCheckBox("Remove Duplicates")
        self.filter_range_cb = QCheckBox("Filter Range")
        self.exclude_missing_cb = QCheckBox("Exclude Missing Data")
        self.one_factor_rb = QRadioButton("One-Factor Study")  
        self.two_factor_rb = QRadioButton("Two-Factor Study")  

        for widget in [self.remove_duplicates_cb, self.filter_range_cb, self.exclude_missing_cb, self.one_factor_rb, self.two_factor_rb]:
            config_layout.addWidget(widget)

        self.study_config_group.setLayout(config_layout)

        self.abort_analysis_btn = QPushButton("Abort Analysis")
        self.run_analysis_btn = QPushButton("Run Analysis")
        self.progress_bar = QProgressBar()
        # self.progress_bar.setContentsMargins(0, 2, 0, 2)  # Shrinks top/bottom margins
        # self.progress_bar.move(self.progress_bar.x(), self.progress_bar.y() + 10)  # Shift it upwards
        self.progress_bar.setFormat("0%")

        self.overall_results_list = QListWidget()
        self.results_table = QTableWidget()
        self.results_table.setColumnCount(4)
        self.results_table.setHorizontalHeaderLabels(["Parameter", "Value", "L", "U"])

        self.layout.addWidget(self.study_config_group, 1, 1, 2, 1)
        self.layout.addWidget(self.abort_analysis_btn, 3, 1)
        self.layout.addWidget(self.run_analysis_btn, 4, 1)
        self.layout.addWidget(self.progress_bar, 5, 1)
        self.layout.addWidget(QLabel("Overall Results"), 6, 1)
        self.layout.addWidget(self.overall_results_list, 7, 1, 2, 1)
        self.layout.addWidget(QLabel("Parameter Results"), 9, 1)
        self.layout.addWidget(self.results_table, 10, 1, 2, 1)

        ## COLUMN 3: Logging & Graphical Results ##
        self.log_window = QTextEdit()
        self.log_window.setPlaceholderText("Process Log (Execution Details)")

        self.error_log_window = QTextEdit()
        self.error_log_window.setPlaceholderText("Error Log (Warnings & Issues)")

        self.figure = Figure()
        self.canvas = FigureCanvas(self.figure)

        self.layout.addWidget(QLabel("Logging & Visualization"), 0, 2)
        self.layout.addWidget(self.log_window, 1, 2, 2, 1)
        self.layout.addWidget(self.error_log_window, 3, 2, 2, 1)
        self.layout.addWidget(QLabel("Statistical Visualizations"), 5, 2)
        self.layout.addWidget(self.canvas, 6, 2, 4, 1)

        # Set Final Layout
        self.central_widget.setLayout(self.layout)

        # Connect Buttons to Functions
        self.load_csv_btn.clicked.connect(self.load_csv)
        self.validate_csv_btn.clicked.connect(self.validate_csv)
        self.clean_csv_btn.clicked.connect(self.clean_data)
        self.run_analysis_btn.clicked.connect(self.run_analysis)
        self.abort_analysis_btn.clicked.connect(self.abort_analysis)
        self.export_results_btn.clicked.connect(self.export_results)
        self.generate_report_btn.clicked.connect(self.generate_pdf_report)
        self.explain_results_btn.clicked.connect(self.explain_results)

    def load_csv(self):
        """Loads CSV file into the data preview table."""
        file_path, _ = QFileDialog.getOpenFileName(self, "Open CSV File", "", "CSV Files (*.csv)")
        if file_path:
            try:
                df = pd.read_csv(file_path)
                df["Measured Value"] = pd.to_numeric(df["Measured Value"], errors="coerce")  # Fix numeric conversion
                self.populate_table(self.data_preview_table, df)
                self.log_window.append(f"Loaded CSV: {file_path}")
            except Exception as e:
                self.error_log_window.append(f"Error loading CSV: {str(e)}")

    def validate_csv(self):
        """Validates CSV data for missing values, errors, and duplicates."""
        try:
            df = self.get_table_data(self.data_preview_table)
            df["Measured Value"] = pd.to_numeric(df["Measured Value"], errors="coerce")  
            issues = []
            if df.isnull().values.any():
                issues.append("Missing values detected.")
            if df["Measured Value"].isnull().any():
                issues.append("Non-numeric data detected in Measured Value column.")
            if df.duplicated().sum() > 0:
                issues.append("Duplicate entries found.")

            if issues:
                for issue in issues:
                    self.error_log_window.append(issue)
                self.log_window.append("Validation failed: Issues found.")
            else:
                self.log_window.append("Validation successful: No issues detected.")

        except Exception as e:
            self.error_log_window.append(f"Validation error: {str(e)}")

    def clean_data(self):
        """Removes invalid data and updates table."""
        try:
            df = self.get_table_data(self.data_preview_table)
            df = df.dropna()
            df["Measured Value"] = pd.to_numeric(df["Measured Value"], errors="coerce")  
            df = df.drop_duplicates()
            self.populate_table(self.data_preview_table, df)
            self.log_window.append("Data cleaned successfully.")
        except Exception as e:
            self.error_log_window.append(f"Cleaning error: {str(e)}")

    def populate_table(self, table, df):
        """Populates a QTableWidget from a DataFrame."""
        table.setRowCount(len(df))
        table.setColumnCount(len(df.columns))
        table.setHorizontalHeaderLabels(df.columns)
    
        for row in range(len(df)):
            for col in range(len(df.columns)):
                table.setItem(row, col, QTableWidgetItem(str(df.iloc[row, col])))

    
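    def get_table_data(self, table):
        """Extracts table data into a DataFrame, using the table's own header labels."""
        rows = table.rowCount()
        cols = table.columnCount()

        # Read column names from the header so this works for any table
        # (data preview or results), not only the four-column input layout.
        headers = []
        for col in range(cols):
            header_item = table.horizontalHeaderItem(col)
            headers.append(header_item.text() if header_item else f"Column {col}")

        data = []
        for row in range(rows):
            row_data = []
            for col in range(cols):
                item = table.item(row, col)
                row_data.append(item.text() if item else None)  # Handle empty cells
            data.append(row_data)

        df = pd.DataFrame(data, columns=headers)
        if "Measured Value" in df.columns:
            df["Measured Value"] = pd.to_numeric(df["Measured Value"], errors="coerce")  # Ensure numeric format
        return df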

    
    def abort_analysis(self):
        """Stops ongoing analysis and resets relevant UI elements."""
        self.log_window.append("Analysis aborted.")
        self.progress_bar.setValue(0)
        self.progress_bar.setFormat("0%")

    def export_results(self):
        """Exports Gauge R&R results as a CSV file."""
        file_path, _ = QFileDialog.getSaveFileName(self, "Save Results", "", "CSV Files (*.csv)")
        if file_path:
            data = self.get_table_data(self.results_table)
            pd.DataFrame(data).to_csv(file_path, index=False)
            self.log_window.append(f"Results exported to: {file_path}")

    def generate_pdf_report(self):
        """Creates a PDF report of Gauge R&R study results."""
        file_path, _ = QFileDialog.getSaveFileName(self, "Save PDF Report", "", "PDF Files (*.pdf)")
        if file_path:
            pdf = canvas.Canvas(file_path, pagesize=letter)
            pdf.setTitle("Gauge R&R Study Report")
            pdf.drawString(100, 750, "Gauge R&R Study Report")
            pdf.drawString(100, 730, "--------------------------")
            pdf.drawString(100, 710, "Statistical Results:")
            pdf.drawString(100, 690, self.get_report_summary())
            pdf.save()
            self.log_window.append(f"PDF report saved to: {file_path}")

    def explain_results(self):
        """Opens an XAI-window explaining Gauge R&R results."""
        QMessageBox.information(self, "Explain Results", "Displaying interpretive insights about all parameters and plots.")

    def get_report_summary(self):
        """Generates a text summary of results for the PDF."""
        return "This report summarizes Gauge R&R findings, including measurement system variability."

    def run_analysis(self):
        """Performs Gauge R&R calculations, updates UI, and generates visualizations."""
        
        # Initial Progress
        self.progress_bar.setValue(10)
        self.progress_bar.setFormat("Loading Data: 10%")
        self.log_window.append("Starting analysis...")
    
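        df = self.get_table_data(self.data_preview_table)
        # Drop rows without a numeric measurement; NaN values would make the histogram fail
        df = df.dropna(subset=["Measured Value"])
        if df.empty:
            self.error_log_window.append("No data available for analysis.")
            self.progress_bar.setValue(0)
            self.progress_bar.setFormat("Error: No Data")
            return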
    
        # Compute statistical parameters
        # NOTE: these are simplified placeholder estimates, not ANOVA-based variance
        # components. Each "gamma" is the mean of the within-group variances for that
        # grouping column, and Cp is a heuristic derived from PTR.
        self.progress_bar.setValue(30)
        self.progress_bar.setFormat("Computing Statistics: 30%")
        mu_y = round(df["Measured Value"].mean(), 3)
        gamma_p = round(df.groupby("Part")["Measured Value"].var().mean(), 3)
        gamma_m = round(df.groupby("Operator")["Measured Value"].var().mean(), 3)
        gamma_r = round(df.groupby("Trial")["Measured Value"].var().mean(), 3)
        ptr = round(gamma_p / (gamma_p + gamma_m + gamma_r), 3)
        snr = round(gamma_p / gamma_m, 3)
        cp = round(1.33 * (ptr ** 0.5), 3)
        delta_index = round(gamma_m / gamma_r, 3)
        beta_index = round(gamma_p / gamma_m, 3)
        tolerance_ratio = round(gamma_m / (gamma_p + gamma_m + gamma_r), 3)
    
        # Calculate 95% confidence intervals (upper & lower bounds)
        # NOTE: every interval reuses the standard error of the mean of the measured
        # values as its scale, so only the interval for mu_y is an approximate CI of
        # that parameter; the other bounds are placeholders.
        ci_95 = lambda x: st.norm.interval(0.95, loc=x, scale=round(np.std(df["Measured Value"]) / np.sqrt(len(df)), 3))
        mu_y_l, mu_y_u = ci_95(mu_y)
        gamma_p_l, gamma_p_u = ci_95(gamma_p)
        gamma_m_l, gamma_m_u = ci_95(gamma_m)
        gamma_r_l, gamma_r_u = ci_95(gamma_r)
        ptr_l, ptr_u = ci_95(ptr)
        snr_l, snr_u = ci_95(snr)
        cp_l, cp_u = ci_95(cp)
        tolerance_ratio_l, tolerance_ratio_u = ci_95(tolerance_ratio)
    
        self.log_window.append("Computed statistical parameters and confidence intervals.")
    
        # Populate results table with bounds
        self.progress_bar.setValue(60)
        self.progress_bar.setFormat("Updating Results: 60%")
        results = [
            ("Mean (μY)", mu_y, mu_y_l, mu_y_u),
            ("Part Variance (γP)", gamma_p, gamma_p_l, gamma_p_u),
            ("Measurement Variance (γM)", gamma_m, gamma_m_l, gamma_m_u),
            ("Repeatability Variance (γR)", gamma_r, gamma_r_l, gamma_r_u),
            ("PTR", ptr, ptr_l, ptr_u),
            ("SNR", snr, snr_l, snr_u),
            ("Cp", cp, cp_l, cp_u),
            ("δ Index", delta_index, "-", "-"),
            ("β Index", beta_index, "-", "-"),
            ("Tolerance Ratio", tolerance_ratio, tolerance_ratio_l, tolerance_ratio_u),
        ]
        
        self.results_table.setRowCount(len(results))
        for i, (param, value, l_bound, u_bound) in enumerate(results):
            self.results_table.setItem(i, 0, QTableWidgetItem(param))
            self.results_table.setItem(i, 1, QTableWidgetItem(f"{value:.3f}"))
            self.results_table.setItem(i, 2, QTableWidgetItem(f"{l_bound:.3f}" if l_bound != "-" else "-"))
            self.results_table.setItem(i, 3, QTableWidgetItem(f"{u_bound:.3f}" if u_bound != "-" else "-"))
    
        self.log_window.append("Updated parameter results table with bounds.")
    
        # Populate Overall Results
        self.overall_results_list.clear()
        overall_summary = [
            f"Mean Value: {mu_y:.3f}",
            f"Total Variance: {(gamma_p + gamma_m + gamma_r):.3f}",
            f"SNR: {snr:.3f} (95% CI: {snr_l:.3f} - {snr_u:.3f})",
            f"Capability Index Cp: {cp:.3f} (95% CI: {cp_l:.3f} - {cp_u:.3f})",
            f"Tolerance Ratio: {tolerance_ratio:.3f} (95% CI: {tolerance_ratio_l:.3f} - {tolerance_ratio_u:.3f})"
        ]
        for item in overall_summary:
            self.overall_results_list.addItem(item)
    
        self.log_window.append("Updated overall results.")
    
        # Generate visualizations
        self.progress_bar.setValue(80)
        self.progress_bar.setFormat("Generating Plots: 80%")
        self.figure.clear()
        self.figure.subplots_adjust(hspace=2.0, wspace=1.2)
        
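        # Box Plots (one box per part, matching the plot title)
        ax1 = self.figure.add_subplot(321)
        part_groups = list(df.groupby("Part")["Measured Value"])
        ax1.boxplot([values.dropna().to_numpy() for _, values in part_groups])
        ax1.set_xticklabels([str(name) for name, _ in part_groups])
        ax1.set_title("Repeatability Across Parts")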
        
        # Histogram
        ax2 = self.figure.add_subplot(322)
        ax2.hist(df["Measured Value"], bins=10, color="skyblue", edgecolor="black")
        ax2.set_title("Distribution of Measured Values")
        
        # Improved Variance Contribution Chart
        ax3 = self.figure.add_subplot(323)
        categories = ["Part", "Measurement", "Repeatability"]
        values = [gamma_p, gamma_m, gamma_r]
        ax3.bar(categories, values, color=["blue", "orange", "green"])
        ax3.set_title("Variance Contribution", fontsize=12)
        ax3.set_ylabel("Variance Value")
        ax3.set_xticks(range(len(categories)))
        ax3.set_xticklabels(categories, fontsize=10, rotation=25)
        
        # Improved PTR vs SNR Plot with Explicit Region Ranges
        ax4 = self.figure.add_subplot(324)
        ptr_values = np.linspace(0, ptr * 1.1, 100)
        snr_values = np.linspace(0, snr * 1.1, 100)
        
        # Define sector thresholds (Green, Yellow, Red zones)
        green_mask = (ptr_values > 0.000108) & (snr_values > 0.000220)
        yellow_mask = (ptr_values >= 0.000106) & (ptr_values <= 0.000108) & (snr_values >= 0.000200) & (snr_values <= 0.000220)
        red_mask = (ptr_values < 0.000106) & (snr_values < 0.000200)
        
        ax4.scatter(ptr_values[green_mask], snr_values[green_mask], color="green", alpha=0.3, label="PTR > 0.000108, SNR > 0.000220")
        ax4.scatter(ptr_values[yellow_mask], snr_values[yellow_mask], color="yellow", alpha=0.3, label="0.000106 ≤ PTR ≤ 0.000108, 0.000200 ≤ SNR ≤ 0.000220")
        ax4.scatter(ptr_values[red_mask], snr_values[red_mask], color="red", alpha=0.3, label="PTR < 0.000106, SNR < 0.000200")
        
        # Highlight actual PTR/SNR position
        region = "Green" if ptr > 0.000108 and snr > 0.000220 else "Yellow" if 0.000106 <= ptr <= 0.000108 and 0.000200 <= snr <= 0.000220 else "Red"
        ax4.scatter(ptr, snr, color="black", edgecolor="white", s=30, label=f"Actual PTR/SNR ({region} Zone)")
        ax4.set_xlabel("PTR")
        ax4.set_ylabel("SNR")
        ax4.set_title("PTR vs SNR Sectors")
        #ax4.legend(loc="upper center", bbox_to_anchor=(0.5, -0.7), frameon=False, fontsize=10)
        
        # Add Beta & Delta Index plots directly to main figure
        ax5 = self.figure.add_subplot(325)
        ax5.plot(range(len(df)), [beta_index] * len(df), label="Beta Index", color="blue")
        ax5.set_title("Beta Index vs GPQ-Iteration")
        #ax5.legend()
        
        ax6 = self.figure.add_subplot(326)
        ax6.plot(range(len(df)), [delta_index] * len(df), label="Delta Index", color="green")
        ax6.set_title("Delta Index vs GPQ-Iteration")
        #ax6.legend()

        # Insert the overall legend
        ax_legend = self.figure.add_subplot(111, frameon=False)  # Create a dedicated subplot for the legend
        ax_legend.axis('off')  # Hide axes to avoid clutter

        # Create a global legend for all subplot sections
        handles = [
            Line2D([0], [0], marker='o', color='white', markerfacecolor='green', markersize=8, label="PTR > 0.000108, SNR > 0.000220"),
            Line2D([0], [0], marker='o', color='white', markerfacecolor='yellow', markersize=8, label="0.000106 ≤ PTR ≤ 0.000108, 0.000200 ≤ SNR ≤ 0.000220"),
            Line2D([0], [0], marker='o', color='white', markerfacecolor='red', markersize=8, label="PTR < 0.000106, SNR < 0.000200"),
        ]
        
        ax_legend.legend(handles=handles, loc="upper center", bbox_to_anchor=(0.5, 0.84), ncol=1, fontsize=10, frameon=True)

        # Refresh the main visualization
        self.figure.tight_layout()
        self.canvas.draw()
        
        self.log_window.append("Generated visualizations successfully.")


        
        self.progress_bar.setValue(100)
        self.progress_bar.setFormat("Analysis Complete: 100%")





if __name__ == "__main__":
    app = QApplication(sys.argv)
    window = GaugeRRGUI()
    window.show()
    sys.exit(app.exec())




## 3.4 Functionality implementation 3: Export and Reporting Buttons

Here's how each of the three buttons would function:

✅ **Generate Report:**  
   - Saves the full **Generated Report** along with its **XAI explanation** into a **PDF format**.  
   - This allows users to store both raw results and **interpretable insights** in a well-structured document.  

✅ **Export Results:**  
   - Saves the **raw numerical results** to a **CSV file** for further analysis (other formats such as Excel are not implemented yet).  

✅ **Explain Results:**  
   - Opens a **dedicated XAI-window** that interprets:  
     - **All statistical parameters** (e.g., variance components, PTR, SNR).  
     - **Overall results** from the Gauge R&R study.  
     - **Plots**, explaining key trends in the measurement system reliability.  
   - Provides an **interactive, user-friendly** breakdown instead of just raw numbers.  

Let's outline how the **XAI-window** and **PDF report** should be structured to provide clear, meaningful insights for users.

---

### ✅ **Structure for the Explain Results (XAI-Window)**
The **XAI-window** will serve as an interactive space where users can explore **interpretable explanations** of their Gauge R&R study results.

#### 🏛 **XAI-Window Sections**
1. **Overview of Gauge R&R Study**  
   - Brief introduction explaining the study's purpose.  
   - A simple summary of **repeatability, reproducibility, and variance components**.  

2. **Parameter Breakdown**  
   - **Mean Value (μY):** Explains the central measurement tendency.  
   - **Variance Components (γP, γM, γR):** Interprets their influence on the system.  
   - **PTR, SNR, Cp:** Describes the capability and reliability of the measurement system.  
   - **Beta & Delta Index:** Explains their significance in **measurement stability**.  

3. **Plot Analysis**  
   - The XAI system will **analyze trends** visible in the graphs (e.g., **distribution patterns, anomalies in variance**).  
   - Highlights **PTR-SNR sector classifications** (Green/Yellow/Red).  

4. **Conclusions & Recommendations**  
   - Summarizes **key takeaways** from the Gauge R&R study.  
   - Provides **actionable recommendations** (e.g., whether measurement adjustments are needed).  

---

### ✅ **Structure for the Generate Report (PDF Output)**
The **PDF report** will store the Gauge R&R study results alongside the **XAI-generated explanations**.

#### 📄 **PDF Report Sections**
1. **Title Page**  
   - Study title, date, user details.  
   - Summary of the report contents.  

2. **Gauge R&R Results**  
   - Table with all **computed parameters** and confidence intervals.  
   - Separate **Beta & Delta Index results** section.  

3. **Visualization Section**  
   - **Box plots, histograms, variance contribution charts, and PTR-SNR plots** included as images.  
   - Beta & Delta Index plots also embedded.  

4. **XAI-Generated Explanation**  
   - **Automatically written summary** interpreting the results.  
   - **Plot-based insights** explaining visible trends.  
   - **Recommendations for measurement system adjustments**.  
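
A report with this many sections will not fit on one page, and reportlab's `drawString` places a single line at fixed coordinates without wrapping or paginating. The helper below is a minimal sketch of how line positions could be planned before drawing; the name `paginate_lines` and the default geometry (in points) are assumptions chosen for illustration.

```python
def paginate_lines(lines, top=750, bottom=72, line_height=20):
    """Assign each text line a (page, y, text) position, top to bottom.

    top/bottom:  y coordinates of the first and last allowed baselines
    line_height: vertical distance between consecutive lines
    """
    if line_height <= 0 or top <= bottom:
        raise ValueError("invalid page geometry")

    positions = []
    page, y = 0, top
    for line in lines:
        if y < bottom:      # no room left on this page
            page += 1
            y = top
        positions.append((page, y, line))
        y -= line_height
    return positions
```

When drawing, the report code could loop over the returned positions, call `pdf.showPage()` whenever the page index increases, and then call `pdf.drawString(100, y, text)` for each entry.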

---

### 🚀 **Benefits of This Approach**
✅ **Interactive exploration in the XAI-window** for deeper understanding.  
✅ **Comprehensive PDF storage**, combining raw results with intelligent explanations.  
✅ **Easier decision-making** using **interpretable summaries** instead of just numerical output.  

I'll implement the **Generate Report**, **Export Results**, and **Explain Results** button functionalities into our existing GUI code.

---

### ✅ **Updates to Implement**
1. **Generate Report:**  
   - Saves the **Gauge R&R analysis** and **XAI results explanation** into a **nicely formatted PDF report**.  
   - Includes **section headers, statistical results, and visualizations**.  
   - Uses `matplotlib` figures and `reportlab` for **professional layout**.

2. **Export Results:**  
   - **Exports numerical results** from the Gauge R&R study into **CSV format**, making it easy for further analysis.  
   - Saves key computed parameters for data retention.

3. **Explain Results:**  
   - Opens a **separate XAI-window** that provides **detailed insights** into statistical findings.  
   - Interprets the **SNR, PTR, Cp, β Index, δ Index**, along with explanations for **plots and trends**.  
   - Uses **PyQt6 dialogs** for interactive presentation.

---  

Let us enhance the **XAI function** so it provides **meaningful insights** based on **parameter values, overall results, and plots**.

### ✅ **Implementation Strategy for Explain Results**
The **XAI-window** should:
1. **Display a structured interpretation of statistical parameters**  
   - Explain **Mean (μY), Variance Components (γP, γM, γR), PTR, SNR, Cp, β Index, δ Index**  
   - Use **color coding** for values that indicate **good** or **poor measurement system reliability**  
   - Highlight **actionable insights** (e.g., whether adjustments to the system are needed)  

2. **Interpret overall study results**  
   - Break down the conclusions from the **Gauge R&R study**  
   - Guide users on **improving repeatability and reproducibility**  

3. **Analyze visual trends in plots**  
   - Provide insights on **distribution patterns, variance contributors, PTR-SNR classifications**  
   - Detect anomalies that suggest **high measurement system uncertainty**  
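
The color-coding idea above can be sketched as a small lookup. This is a minimal sketch, assuming the cut-offs quoted in the explanation strings of `explain_results()` (PTR > 0.000108, SNR > 0.000220, Tolerance Ratio > 0.5, δ Index > 1.0); the `RULES` table and the `classify` function are hypothetical names, and the numbers are illustrative placeholders rather than validated acceptance limits.

```python
# (threshold, higher_is_better) per parameter; values are taken from the
# explanation strings in this notebook and are assumptions, not standards.
RULES = {
    "PTR": (0.000108, True),
    "SNR": (0.000220, True),
    "Tolerance Ratio": (0.5, False),
    "δ Index": (1.0, False),
}

def classify(param, value):
    """Return 'good', 'poor', or 'unknown' for one parameter value."""
    if value is None or param not in RULES:
        return "unknown"
    threshold, higher_is_better = RULES[param]
    if higher_is_better:
        return "good" if value > threshold else "poor"
    return "poor" if value > threshold else "good"

print(classify("PTR", 0.5))              # above the PTR cut-off
print(classify("Tolerance Ratio", 0.7))  # above the Tolerance Ratio cut-off
print(classify("Cp", 1.3))               # no rule defined for Cp
```

Keeping the rules in one table means the marker shown in the window can follow the computed value instead of being fixed per parameter.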

---

### 🏗 **Implementation Approach**
We can use **QDialog** for a separate interactive window, containing:
- A **summary section** with parameter interpretations  
- A **color-coded insights panel** (e.g., green = stable, red = problematic)  
- **Graphical explanations** embedded using `matplotlib`
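
One way to get a real color-coded panel is to build an HTML string and hand it to `QTextEdit.setHtml()`. The sketch below only builds the string, so it runs without Qt; `insights_to_html`, the `COLORS` map, and the status labels are hypothetical names chosen for illustration.

```python
from html import escape

# Assumed color choices: green for good, red for poor, gray otherwise.
COLORS = {"good": "#2e7d32", "poor": "#c62828", "unknown": "#555555"}

def insights_to_html(entries):
    """Render (parameter, value, status, note) tuples as colored HTML lines."""
    lines = []
    for param, value, status, note in entries:
        color = COLORS.get(status, COLORS["unknown"])
        lines.append(
            f'<p><span style="color:{color}; font-weight:bold;">'
            f"{escape(param)}: {escape(str(value))}</span> {escape(note)}</p>"
        )
    return "\n".join(lines)

html_text = insights_to_html([
    ("SNR", 0.31, "good", "Signal dominates measurement noise."),
    ("Tolerance Ratio", 0.7, "poor", "Measurement variation is high."),
])
print(html_text)
```

In the dialog, the result would be passed to `setHtml()` on the `QTextEdit`. With `setPlainText()`, markup such as `**bold**` is shown literally rather than rendered.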

### Basic ideas:  

Below is a first implementation of the `explain_results()` function. It opens a modal **XAI window** that lists each **Gauge R&R statistical parameter** with a short interpretation and closes with overall takeaways. The plot interpretation is added in the class version of the method further down.

---

### ✅ **Features of the XAI-Window**
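- **Parameter Interpretation**: Explains **Mean (μY), Variance Components (γP, γM, γR), PTR, SNR, Cp, β Index, δ Index** with **colored emoji markers** (Green = good, Red = needs improvement). In this version each marker is fixed per parameter and does not yet change with the computed value.  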
- **Study Summary**: Generates a **textual breakdown** of the measurement system’s repeatability & reproducibility.  
- **Plot Analysis**: Describes **distribution trends, anomalies, and sector classifications** in **PTR-SNR analysis**.  
- **User-Friendly UI**: Uses a **QDialog with a read-only QTextEdit** to display insights **clearly**.  

---

### 🚀 **Updated `explain_results()` Function**
```python
from PyQt6.QtWidgets import QDialog, QVBoxLayout, QTextEdit

def explain_results(self):
    """Opens an XAI-window explaining Gauge R&R results."""
    
    # Create an interactive explanation window
    dialog = QDialog(self)
    dialog.setWindowTitle("Gauge R&R Explanation")
    dialog.setGeometry(300, 300, 800, 500)
    
    layout = QVBoxLayout()

    # Generate meaningful insights based on parameter values
    insights = ""
    if self.results_table.rowCount() > 0:
        insights += "**Gauge R&R Measurement System Analysis**\n\n"
        
        for row in range(self.results_table.rowCount()):
            param = self.results_table.item(row, 0).text()
            text = self.results_table.item(row, 1).text()
            try:
                value = float(text)  # also accepts negatives and scientific notation
            except ValueError:
                value = None  # non-numeric cell such as "-"
            
            # Interpret Parameter Values
            if param == "PTR":
                insights += f"🟢 PTR: {value} → **Higher values (>0.000108) indicate better process capability**.\n"
            elif param == "SNR":
                insights += f"🔵 SNR: {value} → **High SNR (>0.000220) means good measurement reliability**.\n"
            elif param == "Cp":
                insights += f"🟢 Cp: {value} → **Higher Cp suggests better process control.**\n"
            elif param == "Tolerance Ratio":
                insights += f"🔴 Tolerance Ratio: {value} → **High values (>0.5) suggest excessive measurement variation.**\n"
            elif param == "β Index":
                insights += f"🟡 Beta Index: {value} → **Low beta index indicates possible bias or instability.**\n"
            elif param == "δ Index":
                insights += f"🟠 Delta Index: {value} → **Higher values (>1.0) mean greater inconsistency across trials.**\n"
        
        insights += "\n**Key Takeaways:**\n"
        insights += "✅ A high PTR and SNR suggest a stable measurement system.\n"
        insights += "⚠ A high Tolerance Ratio means adjustments may be necessary to improve repeatability.\n"
        insights += "🔎 Review Beta & Delta Index to assess bias and inconsistency across trials.\n"

    else:
        insights = "⚠ No computed results found. Please run an analysis first."

    # Add explanation content
    explanation_box = QTextEdit()
    explanation_box.setPlainText(insights)
    explanation_box.setReadOnly(True)
    layout.addWidget(explanation_box)

    dialog.setLayout(layout)
    dialog.exec()
```

---

### ✅ **What This Implementation Does**
✅ **Displays meaningful interpretations of each Gauge R&R parameter**  
✅ **Uses emojis (🟢🟡🔴) to highlight important measurement system insights**  
✅ **Provides key takeaways for the user based on computed results**  
✅ **Ensures readability with an interactive `QDialog` window**  
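
One detail worth isolating is how table text is turned into numbers. A check based on `str.isdigit()` rejects values such as `-0.5` or `1e-4`, so a `try`/`except` around `float()` is usually more robust. The helper name `parse_number` below is hypothetical; this is a minimal sketch, not part of the GUI class.

```python
def parse_number(text):
    """Convert table cell text to float, or return None if it is not numeric."""
    try:
        return float(text.strip())
    except (ValueError, AttributeError):
        # ValueError: text like "-" or "n/a"; AttributeError: cell was None.
        return None

# The isdigit-based check rejects a valid negative number:
print("-0.5".replace(".", "", 1).isdigit())  # False
print(parse_number("-0.5"))
print(parse_number("-"))
```

Returning `None` for unparsable cells lets the caller decide whether to skip the parameter or show a placeholder.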


In [1]:
import sys
import pandas as pd
import numpy as np
from matplotlib.lines import Line2D
import scipy.stats as st  # Import statistical library for confidence intervals
from PyQt6.QtWidgets import (
    QApplication, QMainWindow, QPushButton, QVBoxLayout, QWidget,
    QLabel, QTableWidget, QTableWidgetItem, QFileDialog, QGridLayout,
    QTextEdit, QProgressBar, QCheckBox, QRadioButton, QGroupBox, QListWidget,
    QDialog,
)
import matplotlib.pyplot as plt
import io
from PyQt6.QtGui import QPixmap
from PyQt6.QtCore import QByteArray, QBuffer
from PyQt6.QtGui import QFont, QPalette, QColor
from PyQt6.QtCore import Qt
from matplotlib.figure import Figure
from matplotlib.backends.backend_qtagg import FigureCanvasQTAgg as FigureCanvas  # Qt6-compatible backend (Matplotlib 3.5+)
from reportlab.lib.pagesizes import letter
from reportlab.pdfgen import canvas
from reportlab.lib.utils import ImageReader
import tempfile
import os

class GaugeRRGUI(QMainWindow):
    def __init__(self):
        super().__init__()

        # Window Configuration
        self.setWindowTitle("Gauge R&R Study Tool")
        self.setGeometry(100, 100, 1400, 900)

        # Apply Styling
        self.setStyleSheet("""
            QWidget {
                background-color: #F0F0F0;
            }
            QLabel {
                font-size: 14px;
                font-weight: bold;
                color: #333;
            }
            QPushButton {
                background-color: #0078D7;
                color: white;
                font-size: 13px;
                border-radius: 5px;
                padding: 5px;
            }
            QPushButton:hover {
                background-color: #005a9e;
            }
            QTableWidget {
                gridline-color: #CCCCCC;
                border: 1px solid #CCCCCC;
            }
            QTextEdit {
                background-color: #FFFFFF;
                border: 1px solid #CCCCCC;
            }
            QProgressBar {
                height: 20px;
            }
        """)

        # Main Layout using Grid for Your New GUI Design
        self.central_widget = QWidget()
        self.setCentralWidget(self.central_widget)
        self.layout = QGridLayout()
        self.layout.setSpacing(16)

        ## COLUMN 1: Data Input & Export ##
        self.load_csv_btn = QPushButton("Load CSV")
        self.validate_csv_btn = QPushButton("Validate CSV")
        self.clean_csv_btn = QPushButton("Clean Data")

        self.data_preview_table = QTableWidget()
        self.data_preview_table.setColumnCount(4)
        self.data_preview_table.setHorizontalHeaderLabels(["Operator", "Part", "Trial", "Measured Value"])

        self.export_results_btn = QPushButton("Export Results")
        self.generate_report_btn = QPushButton("Generate Report")
        self.explain_results_btn = QPushButton("Explain Results")

        self.layout.addWidget(QLabel("Data Handling"), 0, 0)
        self.layout.addWidget(self.load_csv_btn, 1, 0)
        self.layout.addWidget(self.validate_csv_btn, 2, 0)
        self.layout.addWidget(self.clean_csv_btn, 3, 0)
        self.layout.addWidget(QLabel("Data Preview"), 4, 0)
        self.layout.addWidget(self.data_preview_table, 5, 0, 2, 1)
        self.layout.addWidget(self.export_results_btn, 8, 0)
        self.layout.addWidget(self.generate_report_btn, 9, 0)
        self.layout.addWidget(self.explain_results_btn, 10, 0)

        ## COLUMN 2: Study Configuration & Results ##
        self.study_config_group = QGroupBox("Study Configuration")
        config_layout = QVBoxLayout()
        config_layout.setSpacing(12)

        self.remove_duplicates_cb = QCheckBox("Remove Duplicates")
        self.filter_range_cb = QCheckBox("Filter Range")
        self.exclude_missing_cb = QCheckBox("Exclude Missing Data")
        self.one_factor_rb = QRadioButton("One-Factor Study")  
        self.two_factor_rb = QRadioButton("Two-Factor Study")  

        for widget in [self.remove_duplicates_cb, self.filter_range_cb, self.exclude_missing_cb, self.one_factor_rb, self.two_factor_rb]:
            config_layout.addWidget(widget)

        self.study_config_group.setLayout(config_layout)

        self.abort_analysis_btn = QPushButton("Abort Analysis")
        self.run_analysis_btn = QPushButton("Run Analysis")
        self.progress_bar = QProgressBar()
        # self.progress_bar.setContentsMargins(0, 2, 0, 2)  # Shrinks top/bottom margins
        # self.progress_bar.move(self.progress_bar.x(), self.progress_bar.y() + 10)  # Shift it upwards
        self.progress_bar.setFormat("0%")

        self.overall_results_list = QListWidget()
        self.results_table = QTableWidget()
        self.results_table.setColumnCount(4)
        self.results_table.setHorizontalHeaderLabels(["Parameter", "Value", "L", "U"])

        self.layout.addWidget(self.study_config_group, 1, 1, 2, 1)
        self.layout.addWidget(self.abort_analysis_btn, 3, 1)
        self.layout.addWidget(self.run_analysis_btn, 4, 1)
        self.layout.addWidget(self.progress_bar, 5, 1)
        self.layout.addWidget(QLabel("Overall Results"), 6, 1)
        self.layout.addWidget(self.overall_results_list, 7, 1, 2, 1)
        self.layout.addWidget(QLabel("Parameter Results"), 9, 1)
        self.layout.addWidget(self.results_table, 10, 1, 2, 1)

        ## COLUMN 3: Logging & Graphical Results ##
        self.log_window = QTextEdit()
        self.log_window.setPlaceholderText("Process Log (Execution Details)")

        self.error_log_window = QTextEdit()
        self.error_log_window.setPlaceholderText("Error Log (Warnings & Issues)")

        self.figure = Figure()
        self.canvas = FigureCanvas(self.figure)

        self.layout.addWidget(QLabel("Logging & Visualization"), 0, 2)
        self.layout.addWidget(self.log_window, 1, 2, 2, 1)
        self.layout.addWidget(self.error_log_window, 3, 2, 2, 1)
        self.layout.addWidget(QLabel("Statistical Visualizations"), 5, 2)
        self.layout.addWidget(self.canvas, 6, 2, 4, 1)

        # Set Final Layout
        self.central_widget.setLayout(self.layout)

        # Connect Buttons to Functions
        self.load_csv_btn.clicked.connect(self.load_csv)
        self.validate_csv_btn.clicked.connect(self.validate_csv)
        self.clean_csv_btn.clicked.connect(self.clean_data)
        self.run_analysis_btn.clicked.connect(self.run_analysis)
        self.abort_analysis_btn.clicked.connect(self.abort_analysis)
        self.export_results_btn.clicked.connect(self.export_results)
        self.generate_report_btn.clicked.connect(self.generate_pdf_report)
        self.explain_results_btn.clicked.connect(self.explain_results)

    def load_csv(self):
        """Loads CSV file into the data preview table."""
        file_path, _ = QFileDialog.getOpenFileName(self, "Open CSV File", "", "CSV Files (*.csv)")
        if file_path:
            try:
                df = pd.read_csv(file_path)
                df["Measured Value"] = pd.to_numeric(df["Measured Value"], errors="coerce")  # Fix numeric conversion
                self.populate_table(self.data_preview_table, df)
                self.log_window.append(f"Loaded CSV: {file_path}")
            except Exception as e:
                self.error_log_window.append(f"Error loading CSV: {str(e)}")

    def validate_csv(self):
        """Validates CSV data for missing values, errors, and duplicates."""
        try:
            df = self.get_table_data(self.data_preview_table)
    
            # Ensure correct study type is selected based on the data structure
            if self.one_factor_rb.isChecked() and df["Part"].nunique() > 1:
                self.error_log_window.append("Error: One-factor study selected, but CSV contains multiple part values!")
                self.log_window.append("Validation failed: Incorrect study type selection.")
                return
            
            self.log_window.append("CSV format validation successful.")
            
            # Convert Measured Value column to numeric and check for missing values
            df["Measured Value"] = pd.to_numeric(df["Measured Value"], errors="coerce")  
            issues = []
            if df["Measured Value"].isnull().sum() > 0:
                issues.append("Non-numeric data detected in Measured Value column.")
            if df.isnull().values.any():
                issues.append("Missing values detected.")
            if df.duplicated().sum() > 0:
                issues.append("Duplicate entries found.")
    
            if issues:
                for issue in issues:
                    self.error_log_window.append(issue)
                self.log_window.append("Validation failed: Issues found.")
            else:
                self.log_window.append("Validation successful: No issues detected.")
    
        except Exception as e:
            self.error_log_window.append(f"Validation error: {str(e)}")


    def clean_data(self):
        """Removes invalid data and updates table."""
        try:
            df = self.get_table_data(self.data_preview_table)
            df["Measured Value"] = pd.to_numeric(df["Measured Value"], errors="coerce")
            df = df.dropna()  # Drop rows after coercion so non-numeric values are removed too
            df = df.drop_duplicates()
            self.populate_table(self.data_preview_table, df)
            self.log_window.append("Data cleaned successfully.")
        except Exception as e:
            self.error_log_window.append(f"Cleaning error: {str(e)}")

    def populate_table(self, table, df):
        """Populates a QTableWidget from a DataFrame."""
        table.setRowCount(len(df))
        table.setColumnCount(len(df.columns))
        table.setHorizontalHeaderLabels(df.columns)
    
        for row in range(len(df)):
            for col in range(len(df.columns)):
                table.setItem(row, col, QTableWidgetItem(str(df.iloc[row, col])))

    
    def get_table_data(self, table):
        """Extracts table data into a DataFrame."""
        rows = table.rowCount()
        cols = table.columnCount()
        data = []
        
        for row in range(rows):
            row_data = []
            for col in range(cols):
                item = table.item(row, col)
                row_data.append(item.text() if item else None)  # Handle empty cells
            data.append(row_data)
    
        df = pd.DataFrame(data, columns=["Operator", "Part", "Trial", "Measured Value"])
        df["Measured Value"] = pd.to_numeric(df["Measured Value"], errors="coerce")  # Ensure numeric format
        return df

    
    def abort_analysis(self):
        """Stops ongoing analysis and resets relevant UI elements."""
        self.log_window.append("Analysis aborted.")
        self.progress_bar.setValue(0)
        self.progress_bar.setFormat("0%")

    def export_results(self):
        """Exports Gauge R&R results as a CSV file."""
        file_path, _ = QFileDialog.getSaveFileName(self, "Save Results", "", "CSV Files (*.csv)")
        if file_path:
            rows = self.results_table.rowCount()
            cols = self.results_table.columnCount()
            data = []
    
            # Extract table data properly
            for row in range(rows):
                row_data = []
                for col in range(cols):
                    item = self.results_table.item(row, col)
                    row_data.append(item.text() if item else "")  # Ensure empty values are handled properly
                data.append(row_data)
    
            # Convert to DataFrame and set correct headers
            df = pd.DataFrame(data, columns=["Parameter", "Value", "Lower Bound", "Upper Bound"])
            df.to_csv(file_path, index=False)
    
            self.log_window.append(f"Results exported successfully to: {file_path}")


    def get_report_summary(self):
        """Generates a structured text summary of Gauge R&R results for the PDF."""
        
        summary = "**Gauge R&R Study Summary**\n\n"
        
        # Extract results from the table
        if self.results_table.rowCount() > 0:
            for row in range(self.results_table.rowCount()):
                param = self.results_table.item(row, 0).text()
                value = self.results_table.item(row, 1).text()
                lower = self.results_table.item(row, 2).text()
                upper = self.results_table.item(row, 3).text()
                
                summary += f"{param}: {value} (95% CI: {lower} - {upper})\n"
    
            summary += "\n**Key Takeaways:**\n"
            summary += "✅ A high PTR and SNR suggest a stable measurement system.\n"
            summary += "⚠ A high Tolerance Ratio means adjustments may be necessary.\n"
            summary += "🔎 Review Beta & Delta Index to assess bias and inconsistency.\n"
    
        else:
            summary += "⚠ No computed results found. Please run an analysis first.\n"
        
        return summary

    def get_xai_summary(self):
        """Generates interpretive insights for Gauge R&R results."""
        
        xai_summary = "**Gauge R&R XAI Interpretation**\n\n"
        
        if self.results_table.rowCount() > 0:
            for row in range(self.results_table.rowCount()):
                param = self.results_table.item(row, 0).text()
                value = self.results_table.item(row, 1).text()
                
                # Provide explanations for each parameter
                if param == "PTR":
                    xai_summary += f"🟢 PTR: {value} → **Higher values (>0.000108) indicate better process capability**.\n"
                elif param == "SNR":
                    xai_summary += f"🔵 SNR: {value} → **High SNR (>0.000220) means good measurement reliability**.\n"
                elif param == "Cp":
                    xai_summary += f"🟢 Cp: {value} → **Higher Cp suggests better process control**.\n"
                elif param == "Tolerance Ratio":
                    xai_summary += f"🔴 Tolerance Ratio: {value} → **High values (>0.5) suggest excessive measurement variation**.\n"
                elif param == "β Index":
                    xai_summary += f"🟡 Beta Index: {value} → **Low beta index indicates possible bias or instability**.\n"
                elif param == "δ Index":
                    xai_summary += f"🟠 Delta Index: {value} → **Higher values (>1.0) mean greater inconsistency across trials**.\n"
            
            xai_summary += "\n**Key Takeaways:**\n"
            xai_summary += "✅ A high PTR and SNR suggest a stable measurement system.\n"
            xai_summary += "⚠ A high Tolerance Ratio means adjustments may be necessary.\n"
            xai_summary += "🔎 Review Beta & Delta Index to assess bias and inconsistency.\n"
    
        else:
            xai_summary += "⚠ No computed results found. Please run an analysis first.\n"
        
        return xai_summary

    
    def generate_pdf_report(self):
        """Creates a structured multi-page PDF report of Gauge R&R study results, including parameters, XAI insights, and plots."""
        
        file_path, _ = QFileDialog.getSaveFileName(self, "Save PDF Report", "", "PDF Files (*.pdf)")
        if not file_path:
            return
        
        pdf = canvas.Canvas(file_path, pagesize=letter)
        pdf.setTitle("Gauge R&R Study Report")
        
        # Function to handle page breaks dynamically
        def check_page_space():
            """Creates a new page when space runs out."""
            nonlocal y_position
            if y_position < 100:
                pdf.showPage()
                pdf.setFont("Helvetica", 12)
                y_position = 750  # Reset position for the new page
        
        # Header
        pdf.setFont("Helvetica-Bold", 16)
        pdf.drawString(100, 750, "Gauge R&R Study Report")
        pdf.setFont("Helvetica", 12)
        pdf.drawString(100, 730, "------------------------------------")
    
        # Section: Summary of Results
        pdf.setFont("Helvetica-Bold", 14)
        pdf.drawString(100, 710, "Summary of Results")
        pdf.setFont("Helvetica", 12)
        
        summary_text = self.get_report_summary()  # Ensure the full summary is used
        y_position = 690
        
        for line in summary_text.split("\n"):
            pdf.drawString(100, y_position, line)
            y_position -= 20  # Spacing for readability
            check_page_space()  # Ensure multi-page handling
        
        # Section: Statistical Results
        pdf.setFont("Helvetica-Bold", 14)
        pdf.drawString(100, y_position - 20, "Statistical Results")
        pdf.setFont("Helvetica", 12)
        
        y_position -= 40
        
        for row in range(self.results_table.rowCount()):
            param = self.results_table.item(row, 0).text()
            value = self.results_table.item(row, 1).text()
            lower = self.results_table.item(row, 2).text()
            upper = self.results_table.item(row, 3).text()
            
            pdf.drawString(100, y_position, f"{param}: {value} (95% CI: {lower} - {upper})")
            y_position -= 20
            check_page_space()  # Ensure multi-page handling
        
        # Section: XAI Interpretations
        pdf.setFont("Helvetica-Bold", 14)
        pdf.drawString(100, y_position - 20, "XAI Interpretation")
        pdf.setFont("Helvetica", 12)
    
        xai_summary = self.get_xai_summary() if hasattr(self, 'get_xai_summary') else "No XAI interpretation available."
        y_position -= 40
        
        for explanation in xai_summary.split("\n"):
            pdf.drawString(100, y_position, explanation)
            y_position -= 20
            check_page_space()  # Ensure multi-page handling
        
        # Ensure enough spacing before inserting plots
        y_position -= 100
        check_page_space()  # Prevent overlap before visualization section
    
        # Generate a SINGLE temporary file for the full figure
        temp_img = tempfile.NamedTemporaryFile(suffix=".png", delete=False)
        temp_img.close()  # Release the handle so savefig can reopen the path (needed on Windows)
        self.figure.savefig(temp_img.name, format="png")  # Save the full figure once
        
        # Section: Visualizations
        pdf.setFont("Helvetica-Bold", 14)
        pdf.drawString(100, y_position - 40, "Visualizations")
        
        y_position -= 60
        check_page_space()  # Ensure proper spacing before adding image
    
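        # Insert the single saved plot into the PDF. drawImage anchors the image at its
        # lower-left corner, so the image is placed one image height below the heading.
        image_height = 250
        if y_position - image_height < 50:
            pdf.showPage()  # Not enough room left: move the plot to a fresh page
            y_position = 750
        pdf.drawImage(ImageReader(temp_img.name), 100, y_position - image_height, width=400, height=image_height)
        y_position -= image_height + 30  # Leave a gap after the plot to prevent text overlap
        check_page_space()  # Ensure multi-page handling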
    
        # Save and cleanup
        pdf.save()
        os.remove(temp_img.name)  # Delete the single temporary plot file
    
        self.log_window.append(f"PDF report successfully saved to: {file_path}")

    def explain_results(self):
        """Opens an XAI-window explaining Gauge R&R results with parameter and plot analysis."""
    
        # Create an interactive explanation window
        dialog = QDialog(self)
        dialog.setWindowTitle("Gauge R&R Explanation")
        dialog.setGeometry(300, 300, 850, 600)
    
        layout = QVBoxLayout()
    
        # Generate meaningful insights based on parameter values
        insights = "**Gauge R&R Measurement System Analysis**\n\n"
    
        if self.results_table.rowCount() > 0:
            for row in range(self.results_table.rowCount()):
                param = self.results_table.item(row, 0).text()
                text = self.results_table.item(row, 1).text()
                try:
                    value = float(text)  # also accepts negatives and scientific notation
                except ValueError:
                    value = None  # non-numeric cell such as "-"
                
                # Interpret Parameter Values
                if param == "PTR":
                    insights += f"🟢 PTR: {value} → **Higher values (>0.000108) indicate better process capability**.\n"
                elif param == "SNR":
                    insights += f"🔵 SNR: {value} → **High SNR (>0.000220) means good measurement reliability**.\n"
                elif param == "Cp":
                    insights += f"🟢 Cp: {value} → **Higher Cp suggests better process control.**\n"
                elif param == "Tolerance Ratio":
                    insights += f"🔴 Tolerance Ratio: {value} → **High values (>0.5) suggest excessive measurement variation.**\n"
                elif param == "β Index":
                    insights += f"🟡 Beta Index: {value} → **Low beta index indicates possible bias or instability.**\n"
                elif param == "δ Index":
                    insights += f"🟠 Delta Index: {value} → **Higher values (>1.0) mean greater inconsistency across trials.**\n"
    
            insights += "\n**Key Takeaways:**\n"
            insights += "✅ A high PTR and SNR suggest a stable measurement system.\n"
            insights += "⚠ A high Tolerance Ratio means adjustments may be necessary to improve repeatability.\n"
            insights += "🔎 Review Beta & Delta Index to assess bias and inconsistency across trials.\n"
    
        else:
            insights += "⚠ No computed results found. Please run an analysis first.\n"
    
        # Analyze Plots & Provide Explanations
        insights += "\n**Plot Analysis:**\n"
    
        # Box Plot Interpretation
        insights += "📊 **Box Plot**: Shows measurement consistency across parts. Large variations indicate inconsistency in repeatability.\n"
        
        # Histogram Interpretation
        insights += "📈 **Histogram**: Displays measurement distribution. A heavily skewed histogram suggests bias in measurements.\n"
    
        # Variance Contribution Interpretation
        insights += "🔍 **Variance Contribution Chart**: Highlights whether part variation, measurement variation, or repeatability contributes most to total variability.\n"
    
        # PTR-SNR Sector Classification Interpretation
        insights += "🚦 **PTR-SNR Classification**: Classifies measurement reliability into Green (Good), Yellow (Moderate), or Red (Poor).\n"
        
        explanation_box = QTextEdit()
        explanation_box.setPlainText(insights)
        explanation_box.setReadOnly(True)
        layout.addWidget(explanation_box)
    
        dialog.setLayout(layout)
        dialog.exec()

    def run_analysis(self):
        """Performs Gauge R&R calculations, updates UI, and generates visualizations."""

        # Clear log windows before starting analysis
        self.log_window.clear()
        self.error_log_window.clear()
        
        # Trigger CSV validation first
        self.validate_csv()
        
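        # If validation logged anything, return immediately to prevent incorrect analysis.
        # The error log was cleared above, so any text in it comes from this validation run
        # (not every validation message contains the word "Error").
        if self.error_log_window.toPlainText().strip():
            self.log_window.append("Analysis aborted due to validation errors.")
            return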
        
        # Proceed with analysis only if validation passes
        self.log_window.append("Validation passed. Starting analysis...")
        
        # Initial Progress
        self.progress_bar.setValue(10)
        self.progress_bar.setFormat("Loading Data: 10%")
        self.log_window.append("Starting analysis...")
    
        df = self.get_table_data(self.data_preview_table)
        if df.empty:
            self.error_log_window.append("No data available for analysis.")
            self.progress_bar.setValue(0)
            self.progress_bar.setFormat("Error: No Data")
            return
    
        # Compute statistical parameters
        self.progress_bar.setValue(30)
        self.progress_bar.setFormat("Computing Statistics: 30%")
        mu_y = round(df["Measured Value"].mean(), 3)
        # Determine study type from GUI
        is_one_factor = self.one_factor_rb.isChecked()
        # Adjust thresholds dynamically based on study type
        if is_one_factor:
            green_threshold_ptr = 0.000080  # Lower threshold for one-factor studies
            green_threshold_snr = 0.000180
            yellow_threshold_ptr = 0.000078
            yellow_threshold_snr = 0.000160
            red_threshold_ptr = 0.000076
            red_threshold_snr = 0.000140
        else:
            green_threshold_ptr = 0.000108  # Default for two-factor
            green_threshold_snr = 0.000220
            yellow_threshold_ptr = 0.000106
            yellow_threshold_snr = 0.000200
            red_threshold_ptr = 0.000105
            red_threshold_snr = 0.000195
        
        # Compute statistical parameters based on study type
        if is_one_factor:
            gamma_p = 0  # No part variance in one-factor study
            gamma_m = max(df["Measured Value"].var(), 1e-6)  # Measurement variance across all trials
        else:
            gamma_p = max(df.groupby("Part")["Measured Value"].var().mean(), 1e-6)
            gamma_m = max(df.groupby("Operator")["Measured Value"].var().mean(), 1e-6)
        
        gamma_r = max(df.groupby("Trial")["Measured Value"].var().mean(), 1e-6)
        # Debugging print statements to verify values before computing SNR-PTR
        print(f"Gamma P (Part Variance): {gamma_p}")
        print(f"Gamma M (Measurement Variance): {gamma_m}")
        print(f"Gamma R (Repeatability Variance): {gamma_r}")

        # Normalize variance components to prevent Beta collapse
        normalized_gamma_p = gamma_p / (gamma_p + gamma_m)
        normalized_gamma_m = gamma_m / (gamma_p + gamma_m)
        ptr = round(gamma_m / (gamma_m + gamma_r), 8) if is_one_factor else round(gamma_p / (gamma_p + gamma_m + gamma_r), 8)
        snr = round(gamma_m / gamma_r, 8) if is_one_factor else round(gamma_p / gamma_m, 8)
        cp = round(1.33 * (ptr ** 0.5), 3)
        tolerance_ratio = round(gamma_m / (gamma_p + gamma_m + gamma_r), 3)
        # Set GPQ parameters
        max_iterations = 100
        tolerance = 0.0001  # Convergence threshold
        beta_prev, delta_prev = 0, 0
        
        for iteration in range(max_iterations):
            # Sample from Bivariate Normal Distribution
            gpq_samples = np.random.multivariate_normal([max(gamma_p, 1e-6), max(gamma_m, 1e-6)], 
                                                        [[max(gamma_p, 1e-6), 0], [0, max(gamma_m, 1e-6)]], size=500)
            sampled_gamma_p, sampled_gamma_m = np.abs(gpq_samples[:, 0]), np.abs(gpq_samples[:, 1])  # Ensure positive values
            print(f"Sampled Gamma P (Part Variance Samples): {sampled_gamma_p[:10]}")
            print(f"Sampled Gamma M (Measurement Variance Samples): {sampled_gamma_m[:10]}")

        
            # Compute Beta & Delta Indices iteratively
            beta_index = round(np.mean(np.abs(sampled_gamma_p) / np.maximum(sampled_gamma_m, 1e-6)) * normalized_gamma_p, 3)
            delta_index = round(np.mean(np.abs(sampled_gamma_m) / np.maximum(gamma_r, 1e-6)), 3)
        
            # Check convergence
            if abs(beta_index - beta_prev) < tolerance and abs(delta_index - delta_prev) < tolerance:
                break  # Stop iterations if values stabilize
        
            # Update previous values
            beta_prev, delta_prev = beta_index, delta_index
        
        # The GPQ-computed Beta and Delta indices are written to the results table
        # together with the other parameters when the table is populated below.
        
        self.log_window.append(f"GPQ Completed: Beta Index = {beta_index}, Delta Index = {delta_index}")
    
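        # Calculate approximate 95% intervals (lower & upper bounds).
        # The standard error of the mean measurement is reused as the scale for every
        # parameter, so bounds for anything other than μY are rough placeholders only.
        # The standard error is left unrounded: rounding a small value down to 0
        # would make the scale invalid.
        std_err = np.std(df["Measured Value"]) / np.sqrt(len(df))
        ci_95 = lambda x: st.norm.interval(0.95, loc=x, scale=std_err)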
        mu_y_l, mu_y_u = ci_95(mu_y)
        gamma_p_l, gamma_p_u = ci_95(gamma_p)
        gamma_m_l, gamma_m_u = ci_95(gamma_m)
        gamma_r_l, gamma_r_u = ci_95(gamma_r)
        ptr_l, ptr_u = ci_95(ptr)
        snr_l, snr_u = ci_95(snr)
        cp_l, cp_u = ci_95(cp)
        tolerance_ratio_l, tolerance_ratio_u = ci_95(tolerance_ratio)
    
        self.log_window.append("Computed statistical parameters and confidence intervals.")
    
        # Populate results table with bounds
        self.progress_bar.setValue(60)
        self.progress_bar.setFormat("Updating Results: 60%")
        results = [
            ("Mean (μY)", mu_y, mu_y_l, mu_y_u),
            ("Part Variance (γP)", gamma_p, gamma_p_l, gamma_p_u),
            ("Measurement Variance (γM)", gamma_m, gamma_m_l, gamma_m_u),
            ("Repeatability Variance (γR)", gamma_r, gamma_r_l, gamma_r_u),
            ("PTR", ptr, ptr_l, ptr_u),
            ("SNR", snr, snr_l, snr_u),
            ("Cp", cp, cp_l, cp_u),
            ("δ Index", delta_index, "-", "-"),
            ("β Index", beta_index, "-", "-"),
            ("Tolerance Ratio", tolerance_ratio, tolerance_ratio_l, tolerance_ratio_u),
        ]
        
        self.results_table.setRowCount(len(results))
        for i, (param, value, l_bound, u_bound) in enumerate(results):
            self.results_table.setItem(i, 0, QTableWidgetItem(param))
            self.results_table.setItem(i, 1, QTableWidgetItem(f"{value:.3f}"))
            self.results_table.setItem(i, 2, QTableWidgetItem(f"{l_bound:.3f}" if l_bound != "-" else "-"))
            self.results_table.setItem(i, 3, QTableWidgetItem(f"{u_bound:.3f}" if u_bound != "-" else "-"))
    
        self.log_window.append("Updated parameter results table with bounds.")
    
        # Populate Overall Results
        self.overall_results_list.clear()
        overall_summary = [
            f"Mean Value: {mu_y:.3f}",
            f"Total Variance: {(gamma_p + gamma_m + gamma_r):.3f}",
            f"SNR: {snr:.3f} (95% CI: {snr_l:.3f} - {snr_u:.3f})",
            f"Capability Index Cp: {cp:.3f} (95% CI: {cp_l:.3f} - {cp_u:.3f})",
            f"Tolerance Ratio: {tolerance_ratio:.3f} (95% CI: {tolerance_ratio_l:.3f} - {tolerance_ratio_u:.3f})"
        ]
        for item in overall_summary:
            self.overall_results_list.addItem(item)
    
        self.log_window.append("Updated overall results.")
    
        # Generate visualizations
        self.progress_bar.setValue(80)
        self.progress_bar.setFormat("Generating Plots: 80%")
        self.figure.clear()
        self.figure.subplots_adjust(hspace=2.0, wspace=1.2)
        
        # Box Plots
        ax1 = self.figure.add_subplot(321)
        ax1.boxplot(df["Measured Value"])
        ax1.set_title("Repeatability Across Parts")
        
        # Histogram
        ax2 = self.figure.add_subplot(322)
        ax2.hist(df["Measured Value"], bins=10, color="skyblue", edgecolor="black")
        ax2.set_title("Distribution of Measured Values")
        
        # Improved Variance Contribution Chart
        ax3 = self.figure.add_subplot(323)
        categories = ["Part", "Measurement", "Repeatability"]
        values = [gamma_p, gamma_m, gamma_r]
        ax3.bar(categories, values, color=["blue", "orange", "green"])
        ax3.set_title("Variance Contribution", fontsize=12)
        ax3.set_ylabel("Variance Value")
        ax3.set_xticks(range(len(categories)))
        ax3.set_xticklabels(categories, fontsize=10, rotation=25)
        
        # Improved PTR vs SNR Plot with Explicit Region Ranges
        ax4 = self.figure.add_subplot(324)
        ptr_values = np.linspace(0, ptr * 1.1, 100)
        snr_values = np.linspace(0, snr * 1.1, 100)
        
        # Define sector thresholds dynamically based on study type
        if is_one_factor:
            green_threshold_ptr = 0.000080
            green_threshold_snr = 0.000180
            yellow_threshold_ptr = 0.000078
            yellow_threshold_snr = 0.000160
            red_threshold_ptr = 0.000076
            red_threshold_snr = 0.000140
        else:
            green_threshold_ptr = 0.000108
            green_threshold_snr = 0.000220
            yellow_threshold_ptr = 0.000106
            yellow_threshold_snr = 0.000200
            red_threshold_ptr = 0.000105
            red_threshold_snr = 0.000195
        
        # Apply adjusted thresholds to sector classification
        green_mask = (ptr_values > green_threshold_ptr) & (snr_values > green_threshold_snr)
        yellow_mask = (ptr_values >= yellow_threshold_ptr) & (ptr_values <= green_threshold_ptr) & (snr_values >= yellow_threshold_snr) & (snr_values <= green_threshold_snr)
        red_mask = (ptr_values < red_threshold_ptr) & (snr_values < red_threshold_snr)
        
        ax4.scatter(ptr_values[green_mask], snr_values[green_mask], color="green", alpha=0.3, label="Green Zone")
        ax4.scatter(ptr_values[yellow_mask], snr_values[yellow_mask], color="yellow", alpha=0.3, label="Yellow Zone")
        ax4.scatter(ptr_values[red_mask], snr_values[red_mask], color="red", alpha=0.3, label="Red Zone")
        
        # Highlight actual PTR/SNR position dynamically
        region = "Green" if ptr > green_threshold_ptr and snr > green_threshold_snr else \
                 "Yellow" if yellow_threshold_ptr <= ptr <= green_threshold_ptr and yellow_threshold_snr <= snr <= green_threshold_snr else \
                 "Red"
        
        ax4.scatter(ptr, snr, color="black", edgecolor="white", s=30, label=f"Actual PTR/SNR ({region} Zone)")
        ax4.set_xlabel("PTR")
        ax4.set_ylabel("SNR")
        ax4.set_title("PTR vs SNR Sectors")
        #ax4.legend(loc="upper center", bbox_to_anchor=(0.5, -0.7), frameon=False, fontsize=10)
        
        # Store Beta & Delta Index changes during GPQ iterations
        beta_iterations = []
        delta_iterations = []
        
        max_iterations = 100
        tolerance = 0.0001
        beta_prev, delta_prev = 0, 0
        
        for iteration in range(max_iterations):
            # Sample from Bivariate Normal Distribution
            gpq_samples = np.random.multivariate_normal([gamma_p, gamma_m], [[gamma_p, 0], [0, gamma_m]], size=500)
            sampled_gamma_p, sampled_gamma_m = np.abs(gpq_samples[:, 0]), np.abs(gpq_samples[:, 1])  # Ensure positive values, as in the first GPQ loop
        
            # Compute Beta & Delta Indices iteratively (denominators floored to avoid division by zero)
            beta_index = round(np.mean(sampled_gamma_p / np.maximum(sampled_gamma_m, 1e-6)), 3)
            delta_index = round(np.mean(sampled_gamma_m / np.maximum(gamma_r, 1e-6)), 3)
        
            beta_iterations.append(beta_index)
            delta_iterations.append(delta_index)
        
            # Check convergence
            if abs(beta_index - beta_prev) < tolerance and abs(delta_index - delta_prev) < tolerance:
                break  # Stop iterations if values stabilize
        
            beta_prev, delta_prev = beta_index, delta_index
        
        # Update Beta & Delta Index Plots with Iterative Data
        ax5 = self.figure.add_subplot(325)
        ax5.plot(range(len(beta_iterations)), beta_iterations, label="Beta Index", color="blue")
        ax5.set_title("Beta Index vs GPQ-Iteration")
        # ax5.legend()
        
        ax6 = self.figure.add_subplot(326)
        ax6.plot(range(len(delta_iterations)), delta_iterations, label="Delta Index", color="green")
        ax6.set_title("Delta Index vs GPQ-Iteration")
        # ax6.legend()


        # Insert the overall legend
        ax_legend = self.figure.add_subplot(111, frameon=False)  # Create a dedicated subplot for the legend
        ax_legend.axis('off')  # Hide axes to avoid clutter

        # Create a global legend for all subplot sections
        handles = [
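            # Labels are built from the active thresholds so they match the selected study type
            Line2D([0], [0], marker='o', color='white', markerfacecolor='green', markersize=8,
                   label=f"PTR > {green_threshold_ptr:.6f}, SNR > {green_threshold_snr:.6f}"),
            Line2D([0], [0], marker='o', color='white', markerfacecolor='yellow', markersize=8,
                   label=f"{yellow_threshold_ptr:.6f} ≤ PTR ≤ {green_threshold_ptr:.6f}, {yellow_threshold_snr:.6f} ≤ SNR ≤ {green_threshold_snr:.6f}"),
            Line2D([0], [0], marker='o', color='white', markerfacecolor='red', markersize=8,
                   label=f"PTR < {red_threshold_ptr:.6f}, SNR < {red_threshold_snr:.6f}"),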
        ]
        
        ax_legend.legend(handles=handles, loc="upper center", bbox_to_anchor=(0.5, 0.84), ncol=1, fontsize=10, frameon=True)

        # Refresh the main visualization
        self.figure.tight_layout()
        self.canvas.draw()
        self.log_window.append("Generated visualizations successfully.")
        self.progress_bar.setValue(100)
        self.progress_bar.setFormat("Analysis Complete: 100%")


if __name__ == "__main__":
    app = QApplication(sys.argv)
    window = GaugeRRGUI()
    window.show()
    sys.exit(app.exec())


Gamma P (Part Variance): 24052.354959222557
Gamma M (Measurement Variance): 23965.946470341816
Gamma R (Repeatability Variance): 24141.108208295485
Sampled Gamma P (Part Variance Samples): [23966.35001075 24136.55006465 24123.78574453 24235.95841153
 24077.9283372  23929.73890542 24088.76863892 24357.63974184
 24209.14912005 23819.26816895]
Sampled Gamma M (Measurement Variance Samples): [24023.4469248  23808.61491893 24012.26098727 23858.47176142
 23940.46318708 24055.9649121  24086.25674905 23702.46035154
 23934.179721   23982.72503437]
Sampled Gamma P (Part Variance Samples): [23813.99752528 23899.82737957 24034.19218906 24129.38669942
 23852.72678459 24171.53925281 24407.30290702 24250.53301125
 24116.14363753 24060.83998174]
Sampled Gamma M (Measurement Variance Samples): [23758.8725473  24002.71964867 24169.36877985 23913.84498407
 24155.35968556 23800.88183523 23962.85409453 23999.4204851
 23946.35373799 24120.0693342 ]
Sampled Gamma P (Part Variance Samples): [24081.29475654 23

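The zone classification in the plotting code above is written as a nested conditional expression. As a rough sketch, the same rule can be pulled into a standalone function that is easy to unit-test. The function name `classify_zone` and the dictionary layout are assumptions made for illustration; the threshold values are copied from the code above. Note that, as in the original expression, any point that is neither Green nor Yellow (including mixed cases such as a high PTR with a low SNR) is reported as Red.

```python
def classify_zone(ptr, snr, thresholds):
    """Return 'Green', 'Yellow', or 'Red' for a (PTR, SNR) pair.

    `thresholds` maps "green" and "yellow" to (ptr_limit, snr_limit) tuples.
    """
    green_ptr, green_snr = thresholds["green"]
    yellow_ptr, yellow_snr = thresholds["yellow"]

    # Green: both values strictly above the green limits
    if ptr > green_ptr and snr > green_snr:
        return "Green"
    # Yellow: both values inside the band between the yellow and green limits
    if yellow_ptr <= ptr <= green_ptr and yellow_snr <= snr <= green_snr:
        return "Yellow"
    # Everything else falls through to Red
    return "Red"


# Threshold values taken from the one-factor and two-factor branches above
ONE_FACTOR = {"green": (0.000080, 0.000180), "yellow": (0.000078, 0.000160)}
TWO_FACTOR = {"green": (0.000108, 0.000220), "yellow": (0.000106, 0.000200)}

print(classify_zone(0.00012, 0.00025, TWO_FACTOR))
print(classify_zone(0.000107, 0.00021, TWO_FACTOR))
print(classify_zone(0.0001, 0.0001, TWO_FACTOR))
```

Keeping the rule in one place would also let the scatter masks and the legend labels share the same thresholds.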


# 4. Use cases

We now write a Python function that creates **12 datasets** covering both **one-factor** and **two-factor Gauge R&R studies**, split into:  

✅ **Correct datasets** (clean, properly structured)  
✅ **Corrupt datasets** (with missing values, duplicate rows, and outliers that require cleaning)  

Each dataset will have:
- **1000 measurements** for statistical inference (corrupt datasets gain 10 extra duplicate rows)  
- **PTR, SNR, and Cp values** fixed per condition, so that each dataset is labeled as **green, yellow, or red**  
- **Fluctuating Beta & Delta indices**, drawn at random for every row  
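
The count of 12 follows from 2 study types × 3 conditions × 2 variants (clean and corrupt). A minimal sketch of that enumeration, assuming the filename pattern used by the generator function in this section; the variable names are illustrative only:

```python
from itertools import product

study_types = ["one_factor", "two_factor"]
conditions = ["green", "yellow", "red"]
corrupt_flags = [False, True]

# One filename per (study type, condition, clean/corrupt) combination
filenames = [
    f"{study}_gauge_{cond}{'_corrupt' if corrupt else ''}.csv"
    for study, cond, corrupt in product(study_types, conditions, corrupt_flags)
]

print(len(filenames))
for name in filenames:
    print(name)
```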

---

### 🚀 **Generating Synthetic Gauge R&R Data**
The following function:  

1️⃣ Creates **clean datasets** where statistical parameters behave correctly.  
2️⃣ Creates **corrupt datasets** with missing values, duplicates, and extreme outliers.  
3️⃣ Saves all **datasets** with appropriate filenames.

```python
import pandas as pd
import numpy as np

def generate_gauge_rr_datasets():
    """Creates 12 datasets for one-factor and two-factor Gauge R&R studies, including correct and corrupt versions."""
    
    np.random.seed(42)  # Ensuring reproducibility
    
    def generate_data(study_type, condition, corrupt=False):
        """Generates synthetic Gauge R&R data with optional corruption."""
        
        num_samples = 1000
        operators = np.random.choice(["A", "B", "C", "D"], num_samples)
        parts = np.random.choice(["P1", "P2", "P3", "P4", "P5"], num_samples)
        trials = np.random.randint(1, 6, num_samples)  # Random trial numbers
        
        # Simulated measurement values with controlled variability
        if condition == "green":
            measured_values = np.random.normal(100, 2, num_samples)  # Low variance
            ptr, snr, cp = 0.00012, 0.00025, 1.5
        elif condition == "yellow":
            measured_values = np.random.normal(100, 5, num_samples)  # Moderate variance
            ptr, snr, cp = 0.000107, 0.000215, 1.2
        elif condition == "red":
            measured_values = np.random.normal(100, 10, num_samples)  # High variance
            ptr, snr, cp = 0.000105, 0.000195, 0.8

        # Generate fluctuating Beta & Delta indices dynamically
        beta_values = np.random.uniform(0.2, 1.5, num_samples)
        delta_values = np.random.uniform(0.5, 2.0, num_samples)

        # Introduce corruption (missing values, outliers, and duplicates)
        if corrupt:
            missing_indices = np.random.choice(num_samples, size=50, replace=False)
            measured_values[missing_indices] = np.nan  # Introduce missing values
            
            outlier_indices = np.random.choice(num_samples, size=30, replace=False)
            measured_values[outlier_indices] *= 10  # Extreme outliers
            
            duplicate_rows = pd.DataFrame({
                "Operator": ["A"] * 10,
                "Part": ["P1"] * 10,
                "Trial": [1] * 10,
                "Measured Value": [200] * 10
            })
        else:
            duplicate_rows = None
        
        # Create DataFrame
        df = pd.DataFrame({
            "Operator": operators,
            "Part": parts if study_type == "two_factor" else ["N/A"] * num_samples,
            "Trial": trials,
            "Measured Value": measured_values,
            "PTR": [ptr] * num_samples,
            "SNR": [snr] * num_samples,
            "Cp": [cp] * num_samples,
            "Beta Index": beta_values,
            "Delta Index": delta_values
        })
        
        # Append duplicate rows for corruption scenario
        if corrupt and duplicate_rows is not None:
            df = pd.concat([df, duplicate_rows], ignore_index=True)

        # Save to CSV
        filename = f"{study_type}_gauge_{condition}{'_corrupt' if corrupt else ''}.csv"
        df.to_csv(filename, index=False)
        print(f"Saved: {filename}")

    # Generate datasets for each scenario
    for study_type in ["one_factor", "two_factor"]:
        for condition in ["green", "yellow", "red"]:
            generate_data(study_type, condition, corrupt=False)  # Correct datasets
            generate_data(study_type, condition, corrupt=True)   # Corrupt datasets

generate_gauge_rr_datasets()
```

---

### ✅ **Features of This Function**
- **Creates 12 datasets** covering the correct & corrupt scenarios.  
- **Assigns fixed PTR, SNR, and Cp values per condition** and varies the measurement spread (standard deviation 2, 5, and 10 for green, yellow, and red).  
- **Draws fluctuating Beta & Delta values** uniformly at random (Beta in [0.2, 1.5], Delta in [0.5, 2.0]).  
- **Injects corruption cases** (50 missing values, 30 outliers scaled by a factor of 10, and 10 duplicate rows).  
- **Saves datasets with appropriate filenames** based on study type & condition.

This will generate **datasets ready for testing our Gauge R&R framework**!
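
To check that a corrupt dataset really contains the injected defects, a small summary helper can be useful. The sketch below is an assumption, not part of the GUI: the function name `summarize_corruption`, the IQR-based outlier rule, and the factor `k=3.0` are illustrative choices. It builds a tiny in-memory frame instead of reading one of the generated CSV files, so it runs on its own.

```python
import numpy as np
import pandas as pd


def summarize_corruption(df, value_col="Measured Value", k=3.0):
    """Count missing values, fully duplicated rows, and IQR-based outliers."""
    values = df[value_col]
    q1, q3 = values.quantile([0.25, 0.75])  # NaN values are skipped by default
    iqr = q3 - q1
    outliers = (values < q1 - k * iqr) | (values > q3 + k * iqr)
    return {
        "missing": int(values.isna().sum()),
        "duplicate_rows": int(df.duplicated().sum()),
        "outliers": int(outliers.sum()),
    }


# Small demonstration frame with deterministic operators and trials
rng = np.random.default_rng(0)
values = rng.normal(100, 2, 200)
values[:5] = np.nan      # 5 missing values
values[5:8] *= 10        # 3 extreme outliers
df = pd.DataFrame({
    "Operator": np.tile(list("ABCD"), 50),
    "Trial": np.arange(200) % 5 + 1,
    "Measured Value": values,
})
dupes = pd.DataFrame({
    "Operator": ["A"] * 4,
    "Trial": [1] * 4,
    "Measured Value": [200.0] * 4,
})
df = pd.concat([df, dupes], ignore_index=True)

summary = summarize_corruption(df)
print(summary)
```

`DataFrame.duplicated()` flags every repeat after the first occurrence, so four identical rows count as three duplicates. The duplicate rows also register as outliers here because their value of 200 lies far outside the bulk of the data.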


In [1]:
import pandas as pd
import numpy as np

def generate_gauge_rr_test_datasets():
    """Creates 12 datasets for one-factor and two-factor Gauge R&R studies with minimal columns."""
    
    np.random.seed(42)  # Ensuring reproducibility
    
    def generate_data(study_type, condition, corrupt=False):
        """Generates synthetic Gauge R&R data with optional corruption."""
        
        num_samples = 1000
        operators = np.random.choice(["A", "B", "C", "D"], num_samples)
        parts = np.random.choice(["P1", "P2", "P3", "P4", "P5"], num_samples) if study_type == "two_factor" else ["N/A"] * num_samples
        trials = np.random.randint(1, 6, num_samples)  # Random trial numbers
        
        # Simulated measurement values with controlled variability
        if condition == "green":
            measured_values = np.random.normal(100, 2, num_samples)  # Low variance
        elif condition == "yellow":
            measured_values = np.random.normal(100, 5, num_samples)  # Moderate variance
        elif condition == "red":
            measured_values = np.random.normal(100, 10, num_samples)  # High variance

        # Introduce corruption (missing values, outliers, and duplicates)
        if corrupt:
            missing_indices = np.random.choice(num_samples, size=50, replace=False)
            measured_values[missing_indices] = np.nan  # Introduce missing values
            
            outlier_indices = np.random.choice(num_samples, size=30, replace=False)
            measured_values[outlier_indices] *= 10  # Extreme outliers
            
            duplicate_rows = pd.DataFrame({
                "Operator": ["A"] * 10,
                "Part": ["P1"] * 10,
                "Trial": [1] * 10,
                "Measured Value": [200] * 10
            })
        else:
            duplicate_rows = None
        
        # Create DataFrame with only necessary columns
        df = pd.DataFrame({
            "Operator": operators,
            "Part": parts,
            "Trial": trials,
            "Measured Value": measured_values
        })
        
        # Append duplicate rows for corruption scenario
        if corrupt and duplicate_rows is not None:
            df = pd.concat([df, duplicate_rows], ignore_index=True)

        # Save to CSV
        filename = f"{study_type}_gauge_{condition}{'_corrupt' if corrupt else ''}.csv"
        df.to_csv(filename, index=False)
        print(f"Saved: {filename}")

    # Generate datasets for each scenario
    for study_type in ["one_factor", "two_factor"]:
        for condition in ["green", "yellow", "red"]:
            generate_data(study_type, condition, corrupt=False)  # Correct datasets
            generate_data(study_type, condition, corrupt=True)   # Corrupt datasets

generate_gauge_rr_test_datasets()

Saved: one_factor_gauge_green.csv
Saved: one_factor_gauge_green_corrupt.csv
Saved: one_factor_gauge_yellow.csv
Saved: one_factor_gauge_yellow_corrupt.csv
Saved: one_factor_gauge_red.csv
Saved: one_factor_gauge_red_corrupt.csv
Saved: two_factor_gauge_green.csv
Saved: two_factor_gauge_green_corrupt.csv
Saved: two_factor_gauge_yellow.csv
Saved: two_factor_gauge_yellow_corrupt.csv
Saved: two_factor_gauge_red.csv
Saved: two_factor_gauge_red_corrupt.csv


In [3]:
import pandas as pd
import numpy as np

def generate_final_red_snr_ptr_dataset():
    """Creates a one-factor Gauge R&R study dataset that forces SNR and PTR below 0.0001 (Red Zone)."""

    np.random.seed(42)  # Ensuring reproducibility

    num_samples = 1000
    operators = np.random.choice(["A", "B", "C", "D"], num_samples)
    trials = np.random.randint(1, 6, num_samples)  # Random trial numbers

    # Artificially shrink Gamma M (Measurement Variance)
    measured_values = np.full(num_samples, 100.0000001)  # Near-zero measurement variation

    # Introduce extreme repeatability failure (Gamma R)
    extreme_noise = np.random.choice([-500000, 500000], num_samples)  # Massive trial-to-trial deviation
    measured_values += extreme_noise

    # Create DataFrame (one-factor study: "Part" is kept as a constant "N/A" placeholder)
    df = pd.DataFrame({
        "Operator": operators,
        "Part": ["N/A"] * num_samples,
        "Trial": trials,
        "Measured Value": measured_values
    })

    # Save to CSV
    filename = "one_factor_gauge_final_red_snr_ptr.csv"
    df.to_csv(filename, index=False)
    print(f"Saved: {filename}")

# Generate the dataset
generate_final_red_snr_ptr_dataset()

Saved: one_factor_gauge_final_red_snr_ptr.csv
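
The ±500000 noise in the cell above dominates the constant baseline. For a symmetric two-point distribution taking the values ±a with equal probability, the variance is a², so the measured values should have a variance near 2.5 × 10¹¹. A quick sketch of that check, assuming NumPy's `default_rng` generator rather than the legacy `np.random.seed` call used above, so the draws differ from the saved CSV:

```python
import numpy as np

rng = np.random.default_rng(42)
num_samples = 1000

# Same construction as the dataset: constant baseline plus +/-500000 noise
noise = rng.choice([-500000, 500000], size=num_samples)
values = np.full(num_samples, 100.0000001) + noise

sample_var = values.var(ddof=1)      # unbiased sample variance
theoretical_var = 500000.0 ** 2      # a^2 for a symmetric two-point distribution

print(f"sample variance:      {sample_var:.3e}")
print(f"theoretical variance: {theoretical_var:.3e}")
```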


## 4.1 Test Case: one_factor_gauge_green

![one_factor_gauge_green.PNG](attachment:828364e4-c683-4abe-88e9-9f0cc2c3d217.PNG)

## 4.2 Test Case: two_factor_gauge_green_corrupt

![two_factor_gauge_green_corrupt.PNG](attachment:7ded6bf2-ec5f-429b-bb49-36b377621e1e.PNG)