<a href="https://colab.research.google.com/github/fengfrankgthb/Demonstrations/blob/main/2025_SAT_embeddings.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# SAT HHH tested w/ all-mpnet-base-v2 Model 2025.05.28

In this project, a well-trained **transformer model** stands in for a knowledgeable super-student facing SAT and GRE reading questions. The experiments expose **biases** and **confusions** that are likely attributable to **over-fitting** during the model's training. Over-fitting is the ML term for **test cramming**: a phenomenon in which fitting the specifics of the **training dataset** leaves the model unable to fit the specifics of the **testing dataset**.
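The cramming analogy can be illustrated with a small curve-fitting sketch. It is not part of the notebook's pipeline: the sine ground truth, the noise level, and the polynomial degrees are all assumptions chosen for illustration. A polynomial with as many coefficients as training points memorizes the training set, yet it typically does much worse on fresh samples.

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_samples(n):
    """Draw n points from a hypothetical ground truth y = sin(pi * x) plus noise."""
    x = rng.uniform(-1.0, 1.0, n)
    y = np.sin(np.pi * x) + rng.normal(0.0, 0.2, n)
    return x, y

def mse(poly, x, y):
    """Mean squared error of a fitted polynomial on the points (x, y)."""
    return float(np.mean((poly(x) - y) ** 2))

x_train, y_train = noisy_samples(10)    # the "practice questions"
x_test, y_test = noisy_samples(200)     # the "real exam"

errors = {}
for degree in (3, 9):
    # Degree 9 has 10 coefficients, enough to pass through all 10 training points.
    poly = np.polynomial.Polynomial.fit(x_train, y_train, degree)
    errors[degree] = (mse(poly, x_train, y_train), mse(poly, x_test, y_test))
    print(f"degree {degree}: train MSE {errors[degree][0]:.4g}, test MSE {errors[degree][1]:.4g}")
```

The gap between the training error and the testing error of the degree-9 fit is the numerical counterpart of a student who has memorized the practice set.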

## 1. Install Necessary Libraries
* **sentence-transformers**: the text embedding library
* **scikit-learn**: the machine learning library
* **matplotlib**: the MATLAB-style plotting library

In [None]:
!pip install sentence-transformers scikit-learn matplotlib
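After installation, it can help to confirm which versions are actually present in the runtime. The sketch below uses only the standard library; the helper name `installed_version` is hypothetical and not part of this notebook.

```python
from importlib import metadata

def installed_version(package_name):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None

for name in ("sentence-transformers", "scikit-learn", "matplotlib"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```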

## 2. Import Necessary Modules from the Libraries



* **mpl_toolkits.mplot3d**: Matplotlib's 3D plotting toolkit
* **numpy**: Numerical Python, the fundamental Python library for numerical arrays
* **Axes3D**: the 3D plotting class
* **PCA**: Principal Component Analysis, for linear dimensionality reduction
* **TSNE**: t-SNE, a non-linear dimensionality reduction that creates a more scattered effect

In [2]:
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
from sentence_transformers import SentenceTransformer
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
import numpy as np

# Set matplotlib to 'inline' (static) mode; 'notebook' gives interactive mode.
# Inline is the default, but being explicit avoids any confusion.
# The interactive notebook mode is often unstable in the Colab environment;
# an alternative is to stay with 'inline' and use Plotly for interactivity.
%matplotlib inline

## 3. Input Text Data

Choose one subsection below:

### 3.1 Example 1: Aristotle Politics, All-mighty's age_intel vs 5Vs

**248HHH Q170** on **Aristotle's Politics** is used for illustration, broken into 9 components:

* Pa = All Sentences combined in Passage
* P1 = 1st sentence in Passage
* P2 = 2nd sentence in Passage
* P3 = 3rd sentence in Passage
* Q? = the Question sentence
* Av = correct answer A
* Bx = wrong choice B
* Cx = wrong choice C
* Dx = wrong choice D

In this example of an old text, we discover an advantage held by a well-read person, dubbed **age_intel** here. Age_intel comes from training on a large amount of text data, like the 10 billion words comprehended by the All-mighty Model. This level of intelligence is reachable by few individuals and thus should **NOT** be your objective in test preparation.

Surprisingly, with the Unified Models, which rely on the bold 5Vs, it takes you under a minute to tell the correct answer from the wrong choices, each of which has a redundant V irrelevant to the prompt itself.
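Each string in the example cells carries its two-character component label as a prefix, and the plotting code later recovers that label with `s[:2]`. The sketch below shows one way to separate the label from the text. The helper `split_label` and its regular expression are hypothetical additions; the notebook itself encodes the strings with the prefix included.

```python
import re

# Matches the prefixes used in this notebook: "Pa:", "P1:", "Q?:", "Av)", "Bx)", ...
LABEL_PATTERN = re.compile(r"^(Pa|P\d|Q\?|[A-D][vx])[:)]\s*")

def split_label(sentence):
    """Split 'Av) some text' into ('Av', 'some text'); return ('', sentence) if unlabeled."""
    match = LABEL_PATTERN.match(sentence)
    if match is None:
        return "", sentence
    return match.group(1), sentence[match.end():]

samples = [
    "Pa: A student in a political science course is writing a paper.",
    "Q?: Which quotation would best support this assertion?",
    "Av) 'Transgression creeps and ruins the state.'",
]
for sample in samples:
    print(split_label(sample))
```

Encoding only the text part would keep the label characters out of the embedding; whether that changes the picture for these examples would have to be checked by rerunning the cells.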

In [35]:
# Example 1: Aristotle Politics Question 347Words
# This is a 2024-25 **** question used to discover the age_intel of All-mighty on old text.
# All-mighty, or a well-read person fluent in English, may have an advantage unreachable by school smartness.
# This is called age_intel.
# Why is All-mighty good at this question?
# Because the "confusing" text is from Aristotle, who did not purposely set up traps.
# Its confusion comes from the distance between modern understanding and old language.
sentences = ["Pa: A student in a political science course is writing a paper on Aristotle’s The Politics, in which Aristotle offers his opinion on political instability and gives advice on how constitutions can be preserved. Aristotle observes that different forms of government can fall in diferent ways--for example, oligarchies might grant power to military leaders during wartime who refuse to relinquish that power during peacetime--but some methods of preserving order apply across all forms of government. The student claims that in particular Aristotle asserts that in a healthy state obedience to law must be as close to absolute as possible and that even minor infractions should not be ignored.",
    "P1: A student in a political science course is writing a paper on Aristotle’s The Politics, in which Aristotle offers his opinion on political instability and gives advice on how constitutions can be preserved.",
    "P2: Aristotle observes that different forms of government can fall in diferent ways--for example, oligarchies might grant power to military leaders during wartime who refuse to relinquish that power during peacetime--but some methods of preserving order apply across all forms of government.",
    "P3: The student claims that in particular Aristotle asserts that in a healthy state obedience to law must be as close to absolute as possible and that even minor infractions should not be ignored. ",
    "Q?: Which quotation from a philosopher's analysis of The Politics would best support the student's claim?",
    "Av) When constructing his argument regarding the characteristics of a well-functioning government, Aristotle asserts that ‘Transgression creeps in unperceived and at last ruins the state,’ illustrating this idea with a comparison to frequent small expenditures slowly and almost imperceptibly chipping away at a fortune until it is ultimately depleted.",
    "Bx) When Aristotle considers the health of constitutions, he states that ‘Constitutions are preserved when their destroyers are at a distance, and sometimes also because they are near, for the fear of them makes the government keep in hand the constitution.' He holds that rulers who wish to see constitutions preserved must continually remind the populace of the dangers that would result from a constitutional collapse.",
    "Cx) When contrasting different forms of government, Aristotle holds that ‘oligarchies may last, not from any inherent stability in such forms of government, but because the rulers are on good terms both with the unenfranchised and with the governing classes,’ That is, oligarchic leaders who wish to hold on to power will introduce members of disenfranchised classes into government in a participatory role.",
    "Dx) When Aristotle writes on the necessity of avoiding corruption in government, he proposes that 'every state should be so administered and so regulated by law that its magistrates cannot possibly make money.' In particular he thinks oligarchies are particularly susceptible to corruption through bribery."
]

In [None]:
# Example 1 (Alt-1): Aristotle Politics Question 347Words
# This is 1st alteration to discover the inability of long_comp_conf in old text.
sentences = ["Pa: A student in a political science course is writing a paper on Aristotle’s The Politics, in which Aristotle offers his opinion on political instability and gives advice on how constitutions can be preserved. Aristotle observes that different forms of government can fall in diferent ways--for example, oligarchies might grant power to military leaders during wartime who refuse to relinquish that power during peacetime--but some methods of preserving order apply across all forms of government. The student claims that in particular Aristotle asserts that in a healthy state obedience to law must be as close to absolute as possible and that even minor infractions should not be ignored.",
    "P1: In The Politics, Aristotle offers his opinion on political instability and gives advice on how constitutions can be preserved.",
    "P2: Aristotle observes that government can fall in different forms, but some methods of preserving order apply across all forms of government.",
    "P3: Aristotle asserts that obedience to law must be absolute and that even minor infractions should not be ignored.",
    "Q?: Which quotation from a philosopher's analysis of The Politics would best support this assertion?",
    "Av) When constructing his argument regarding the characteristics of a well-functioning government, ‘Transgression creeps in unperceived and at last ruins the state,’ illustrating this idea with a comparison to frequent small expenditures slowly and almost imperceptibly chipping away at a fortune until it is ultimately depleted.",
    "Bx) When Aristotle considers the health of constitutions, he states that ‘Constitutions are preserved when their destroyers are at a distance, and sometimes also because they are near, for the fear of them makes the government keep in hand the constitution.' He holds that rulers who wish to see constitutions preserved must continually remind the populace of the dangers that would result from a constitutional collapse.",
    "Cx) When contrasting different forms of government, Aristotle holds that ‘oligarchies may last, not from any inherent stability in such forms of government, but because the rulers are on good terms both with the unenfranchised and with the governing classes,’ That is, oligarchic leaders who wish to hold on to power will introduce members of disenfranchised classes into government in a participatory role.",
    "Dx) When Aristotle writes on the necessity of avoiding corruption in government, he proposes that 'every state should be so administered and so regulated by law that its magistrates cannot possibly make money.' In particular he thinks oligarchies are particularly susceptible to corruption through bribery."
]

In [None]:
# Example 1 (Alt-2): Aristotle Politics Question 347Words
# This is 2nd alteration to discover the inability of long_comp_unconf in old text.
sentences = ["Pa: A student in a political science course is writing a paper on Aristotle’s The Politics, in which Aristotle offers his opinion on political instability and gives advice on how constitutions can be preserved. Aristotle observes that different forms of government can fall in diferent ways--for example, oligarchies might grant power to military leaders during wartime who refuse to relinquish that power during peacetime--but some methods of preserving order apply across all forms of government. The student claims that in particular Aristotle asserts that in a healthy state obedience to law must be as close to absolute as possible and that even minor infractions should not be ignored.",
    "P1: In The Politics, Aristotle offers his opinion on political instability and gives advice on how constitutions can be preserved.",
    "P2: Aristotle observes that government can fall in different forms, but some methods of preserving order apply across all forms of government.",
    "P3: Aristotle asserts that obedience to law must be absolute and that even minor infractions should not be ignored.",
    "Q?: Which quotation from a philosopher's analysis of The Politics would best support this assertion?",
    "Av) 'Transgression creeps in unperceived and at last ruins the state': this idea with a comparison to frequent small expenditures slowly chipping away at a fortune until it is ultimately depleted.",
    "Bx) 'Constitutions are preserved when their destroyers are at a distance, and sometimes also because they are near, for the fear of them makes the government keep in hand the constitution': rulers must continually remind the populace of the dangers from a constitutional collapse.",
    "Cx) 'Oligarchies may last, not from any inherent stability in such forms of government, but because the rulers are on good terms both with the unenfranchised and with the governing classes': oligarchic leaders will introduce disenfranchised classes into participatory roles.",
    "Dx) 'Every state should be so administered and so regulated by law that its magistrates cannot possibly make money': oligarchies are particularly susceptible to corruption through bribery."
]

In [None]:
# Example 1 (Alt-3): Aristotle Politics Question 347Words
# This is 3rd alteration to discover the inability of long_comp_conf in old text.
sentences = ["Pa: A student in a political science course is writing a paper on Aristotle’s The Politics, in which Aristotle offers his opinion on political instability and gives advice on how constitutions can be preserved. Aristotle observes that different forms of government can fall in diferent ways--for example, oligarchies might grant power to military leaders during wartime who refuse to relinquish that power during peacetime--but some methods of preserving order apply across all forms of government. The student claims that in particular Aristotle asserts that in a healthy state obedience to law must be as close to absolute as possible and that even minor infractions should not be ignored.",
    "P1: In The Politics, Aristotle offers his opinion on political instability and gives advice on how constitutions can be preserved.",
    "P2: Aristotle observes that government can fall in different forms, but some methods of preserving order apply across all forms of government.",
    "P3: Aristotle asserts that obedience to law must be absolute and that even minor infractions should not be ignored.",
    "Q?: Which quotation from a philosopher's analysis of The Politics would best support this assertion?",
    "Av) frequent small expenditures slowly chipping away at a fortune until it is ultimately depleted.",
    "Bx) rulers must continually remind the populace of the dangers from a constitutional collapse.",
    "Cx) oligarchic leaders will introduce disenfranchised classes into participatory roles.",
    "Dx) oligarchies are particularly susceptible to corruption through bribery."
]

In [None]:
# Example 1 (Alt-4): Aristotle Politics Question 347Words
# This is 4th alteration to discover the long_comp_conf in old text.
sentences = ["Pa: A student in a political science course is writing a paper on Aristotle’s The Politics, in which Aristotle offers his opinion on political instability and gives advice on how constitutions can be preserved. Aristotle observes that different forms of government can fall in diferent ways--for example, oligarchies might grant power to military leaders during wartime who refuse to relinquish that power during peacetime--but some methods of preserving order apply across all forms of government. The student claims that in particular Aristotle asserts that in a healthy state obedience to law must be as close to absolute as possible and that even minor infractions should not be ignored.",
    "P1: In The Politics, Aristotle offers his opinion on political instability and gives advice on how constitutions can be preserved.",
    "P2: Aristotle observes that government can fall in different forms, but some methods of preserving order apply across all forms of government.",
    "P3: Aristotle asserts that obedience to law must be absolute and that even minor infractions should not be ignored.",
    "Q?: Which quotation from a philosopher's analysis of The Politics would best support this assertion?",
    "Av) 'Transgression creeps in unperceived and at last ruins the state.'",
    "Bx) 'Constitutions are preserved when their destroyers are at a distance, and sometimes also because they are near, for the fear of them makes the government keep in hand the constitution.'",
    "Cx) 'Oligarchies may last, not from any inherent stability in such forms of government, but because the rulers are on good terms both with the unenfranchised and with the governing classes.'",
    "Dx) 'Every state should be so administered and so regulated by law that its magistrates cannot possibly make money.'"
]

In [38]:
# Example 1 (Alt-5): Aristotle Politics Question 347Words
# This is 5th alteration to finally solve the long_comp_conf in old text by trimming sentence off all peripherals/specifics.
sentences = ["Pa: A student in a political science course is writing a paper on Aristotle’s The Politics, in which Aristotle offers his opinion on political instability and gives advice on how constitutions can be preserved. Aristotle observes that different forms of government can fall in diferent ways--for example, oligarchies might grant power to military leaders during wartime who refuse to relinquish that power during peacetime--but some methods of preserving order apply across all forms of government. The student claims that in particular Aristotle asserts that in a healthy state obedience to law must be as close to absolute as possible and that even minor infractions should not be ignored.",
    "P1: political instability and how constitutions can be preserved.",
    "P2: government can fall, but methods of preserving apply.",
    "P3: obedience to law must be absolute and minor infractions should not be ignored.",
    "Q?: Which quotation would best support this assertion?",
    "Av) 'Transgression creeps and ruins the state.'",
    "Bx) 'Constitutions are preserved.'",
    "Cx) 'Oligarchies may last.'",
    "Dx) 'Every state should be so administered and regulated.'"
]

In [41]:
# Example 1 (Alt-6): Aristotle Politics Question 347Words
# This is 6th alteration to discover the inertia of age_intel in working with 5Vs.
sentences = ["Pa: subjective; positive.",
    "P1: subjective; positive.",
    "P2: subjective; positive.",
    "P3: subjective; positive.",
    "Q?: subjective; positive?",
    "Av) subjective; positive.",
    "Bx) subjective; positive; cause effect.",
    "Cx) subjective; positive; cause effect.",
    "Dx) subjective; negative."
]

## 4. Generate All-mighty Embeddings for Aristotle Sentences.


**So, let's run the embedding code below** ↓ ↓ ↓

In [None]:
# Load the pre-trained model
model = SentenceTransformer('all-mpnet-base-v2')

# Generate the embeddings
embeddings = model.encode(sentences)

print("Shape of embeddings:", embeddings.shape)
print("Example embedding (last sentence):\n", embeddings[8][:100]) # Print the first 100 dimensions only
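Before reducing dimensions, the full-size embeddings can be compared directly. The sketch below ranks the answer choices by cosine similarity to a reference component. Cosine similarity is an assumed choice of measure here, and the toy 3-dimensional vectors are invented stand-ins so the block runs without the model; with the real data one would pass `embeddings` and the two-character labels instead.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def rank_choices(vectors, labels, reference_label, choice_labels):
    """Return (choice, similarity) pairs sorted from most to least similar to the reference."""
    index = {label: i for i, label in enumerate(labels)}
    reference = vectors[index[reference_label]]
    scores = {c: cosine_similarity(reference, vectors[index[c]]) for c in choice_labels}
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)

# Invented toy vectors, NOT real model output.
toy_labels = ["P3", "Av", "Bx", "Cx", "Dx"]
toy_vectors = np.array([
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.5, 0.5, 0.5],
    [-1.0, 0.0, 0.0],
])

ranking = rank_choices(toy_vectors, toy_labels, "P3", ["Av", "Bx", "Cx", "Dx"])
for choice, score in ranking:
    print(f"{choice}: {score:.3f}")
```

A ranking in the original embedding space is a useful cross-check on the 3D plots, since any projection to three dimensions discards some of the distance information.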

## 5. Reduce Dimensionality: 3D with PCA, Shown with P-sphere, P-tet, and Scatterplot

This is the **primary demonstration** of the analytical result, showing the question elements (**prompt**, **question**, and **choices**) in all three renderings (**sphere**, **tetrahedron**, and **scatterplot**).
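How much of the original spread survives a 3D projection can be estimated from the explained variance ratio. The sketch below implements PCA with a plain SVD in NumPy as an illustration; it is a stand-in for, not a copy of, scikit-learn's `PCA`, and the random 9 × 768 matrix is a placeholder for the real embeddings. Component signs may differ from scikit-learn's output.

```python
import numpy as np

def pca_project(X, n_components=3):
    """Centre X and project it onto its top principal directions via SVD.

    Returns (projected, explained_variance_ratio).
    """
    X = np.asarray(X, dtype=float)
    centred = X - X.mean(axis=0)
    # Rows of Vt are principal directions; S holds singular values in descending order.
    _, S, Vt = np.linalg.svd(centred, full_matrices=False)
    projected = centred @ Vt[:n_components].T
    variance = S ** 2
    ratio = variance[:n_components] / variance.sum()
    return projected, ratio

rng = np.random.default_rng(0)
stand_in = rng.normal(size=(9, 768))   # placeholder for the 9 sentence embeddings

projected, ratio = pca_project(stand_in, n_components=3)
print("projected shape:", projected.shape)
print("explained variance ratio:", ratio, "total:", ratio.sum())
```

With only 9 points, at most 8 directions can carry variance after centring, so three components may still leave a large share unexplained; distances in the 3D plot should be read with that in mind.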

In [None]:
# Assumes `embeddings` from section 4 (Generate All-mighty Embeddings) is already defined.
# Renders the sphere, tetrahedron, and scatterplot.
import plotly.graph_objects as go
import numpy as np
from sklearn.decomposition import PCA

labels = [s[:2] for s in sentences]  # Get first two letters.

# Perform PCA to reduce to 3 dimensions
pca_3d = PCA(n_components=3)
reduced_embeddings_3d = pca_3d.fit_transform(embeddings)

# Directly define the indices, assuming P1, P2, P3, Q? are at indices 1, 2, 3, and 4
p1_index = 1
p2_index = 2
p3_index = 3
q_index = 4

# Extract the 3D coordinates of P1, P2, P3, and Q?
p1_coords = reduced_embeddings_3d[p1_index]
p2_coords = reduced_embeddings_3d[p2_index]
p3_coords = reduced_embeddings_3d[p3_index]
q_coords = reduced_embeddings_3d[q_index]

# Define the vertices of the tetrahedron
tetrahedron_vertices = np.array([p1_coords, p2_coords, p3_coords, q_coords])

# Define the edges of the tetrahedron (indices of vertices)
tetrahedron_edges = [
    (0, 1), (0, 2), (0, 3),  # Edges from P1 to P2, P3, Q?
    (1, 2), (1, 3),          # Edges from P2 to P3, Q?
    (2, 3)                   # Edge from P3 to Q?
]

# Create the lines for the tetrahedron edges
lines = []
for i, (start_index, end_index) in enumerate(tetrahedron_edges):
    start_point = tetrahedron_vertices[start_index]
    end_point = tetrahedron_vertices[end_index]
    lines.append(
        go.Scatter3d(
            x=[start_point[0], end_point[0]],
            y=[start_point[1], end_point[1]],
            z=[start_point[2], end_point[2]],
            mode='lines',
            line=dict(color='red', width=4),  # Style the lines
            name=f'Edge {i+1}'
        )
    )

# --- Start of new code for sphere calculation ---
# Points for sphere calculation
P = [p1_coords, p2_coords, p3_coords, q_coords]
x = [p[0] for p in P]
y = [p[1] for p in P]
z = [p[2] for p in P]

# System of equations to find sphere center (cx, cy, cz)
# (xi-cx)^2 + (yi-cy)^2 + (zi-cz)^2 = R^2
# Subtracting equation for P1 from P2, P3, P4:
# 2(x2-x1)cx + 2(y2-y1)cy + 2(z2-z1)cz = (x2^2+y2^2+z2^2) - (x1^2+y1^2+z1^2)
# ... and so on for P3 and P4

A = np.array([
    [2*(x[1]-x[0]), 2*(y[1]-y[0]), 2*(z[1]-z[0])],
    [2*(x[2]-x[0]), 2*(y[2]-y[0]), 2*(z[2]-z[0])],
    [2*(x[3]-x[0]), 2*(y[3]-y[0]), 2*(z[3]-z[0])]
])

B = np.array([
    (x[1]**2 + y[1]**2 + z[1]**2) - (x[0]**2 + y[0]**2 + z[0]**2),
    (x[2]**2 + y[2]**2 + z[2]**2) - (x[0]**2 + y[0]**2 + z[0]**2),
    (x[3]**2 + y[3]**2 + z[3]**2) - (x[0]**2 + y[0]**2 + z[0]**2)
])

try:
    sphere_center = np.linalg.solve(A, B)
    cx, cy, cz = sphere_center[0], sphere_center[1], sphere_center[2]
    sphere_radius = np.sqrt((x[0]-cx)**2 + (y[0]-cy)**2 + (z[0]-cz)**2)

    # Generate sphere surface points
    u = np.linspace(0, 2 * np.pi, 50) # Azimuthal angle
    v = np.linspace(0, np.pi, 25)    # Polar angle

    sphere_x = cx + sphere_radius * np.outer(np.cos(u), np.sin(v))
    sphere_y = cy + sphere_radius * np.outer(np.sin(u), np.sin(v))
    sphere_z = cz + sphere_radius * np.outer(np.ones(np.size(u)), np.cos(v))

    sphere_surface = go.Surface(
        x=sphere_x, y=sphere_y, z=sphere_z,
        opacity=0.3,
        colorscale='Blues', # You can choose other colorscales e.g. 'Viridis', 'RdBu'
        showscale=False, # Hide the color scale bar for the sphere
        name='Sphere'
    )
    sphere_added = True
except np.linalg.LinAlgError:
    print("Could not determine the sphere: Points might be coplanar or collinear.")
    sphere_surface = None # No sphere to add if calculation fails
    sphere_added = False
# --- End of new code for sphere calculation ---

# Create the 3D scatter plot and add the lines and sphere
data_elements = [go.Scatter3d(
    x=reduced_embeddings_3d[:, 0],
    y=reduced_embeddings_3d[:, 1],
    z=reduced_embeddings_3d[:, 2],
    mode='markers+text',
    text=labels,  # Use first 2 characters of labels
    textposition="middle right",
    marker=dict(size=4),
    name='Data Points'
)] + lines

if sphere_added and sphere_surface is not None:
    data_elements.append(sphere_surface)

fig = go.Figure(data=data_elements)


# Set the title and axis labels
fig.update_layout(
    title="3D PCA in Scatterplot, P-Tet, and Sphere",
    scene=dict(
        xaxis_title="PC 1",
        yaxis_title="PC 2",
        zaxis_title="PC 3",
        # Aspect ratio to make the sphere look more like a sphere
        aspectmode='data' # 'auto', 'cube', 'data', 'manual'
    ),
    margin=dict(l=0, r=0, b=0, t=50),
    showlegend=True
)

# Show the plot
fig.show()
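The circumsphere step in the cell above can be checked in isolation. The sketch below restates the same linear system in vectorized form (subtracting the equation for the first point from the other three) and verifies it on four hand-picked points that lie on the unit sphere. The function name `circumsphere` and the test points are illustrative assumptions.

```python
import numpy as np

def circumsphere(points):
    """Centre and radius of the sphere through four non-coplanar 3D points.

    Subtracting |p0 - c|^2 = R^2 from |pi - c|^2 = R^2 gives the linear system
    2 (pi - p0) . c = |pi|^2 - |p0|^2 for i = 1, 2, 3.
    """
    P = np.asarray(points, dtype=float)
    A = 2.0 * (P[1:] - P[0])
    b = np.sum(P[1:] ** 2, axis=1) - np.sum(P[0] ** 2)
    center = np.linalg.solve(A, b)   # raises LinAlgError if A is exactly singular
    radius = float(np.linalg.norm(P[0] - center))
    return center, radius

# Four points on the unit sphere centred at the origin.
points = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, 0, 1)]
center, radius = circumsphere(points)
distances = np.linalg.norm(np.asarray(points, dtype=float) - center, axis=1)

print("center:", center)
print("radius:", radius)
print("distances from center:", distances)
```

All four distances matching the radius confirms that the solved centre is equidistant from the tetrahedron's vertices.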

## 6. How about 5-Vs?

In this section, values along the 5Vs are provided.

Thus, **there is no need to call the All-mighty embedding**.

Below is a typical perception after initial training. While errors exist across V2/V3/V4/V5, the student can vaguely tell the correct answer Av from the wrong choices Bx, Cx, and Dx.

| 5Vs | Pa | P1 | P2 | P3 | Q? | Av | Bx | Cx | Dx |
|-----|----|----|----|----|----|----|----|----|----|
| V1  | 1  | 1  | 1  | 1  | 1  | 1  | 1  | 1  | 0  |
| V2  | 1  | 1  | 1  | 1  | 1  | 1  | 1  | 1  | 1  |
| V3  | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  |
| V4  | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  | 0  |
| V5  | 0  | 0  | 0  | 0  | 0  | 0  | 1  | 1  | 0  |
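With values this small, the comparison can also be done by counting mismatches. The sketch below uses the values listed above and picks the choice whose 5V vector differs least from the passage's. Using the Hamming distance for this comparison is an assumption made for illustration, not a method stated in this notebook.

```python
# 5V values per component, copied from the values listed above (V1..V5).
five_v = {
    "Pa": (1, 1, 0, 0, 0),
    "P1": (1, 1, 0, 0, 0),
    "P2": (1, 1, 0, 0, 0),
    "P3": (1, 1, 0, 0, 0),
    "Q?": (1, 1, 0, 0, 0),
    "Av": (1, 1, 0, 0, 0),
    "Bx": (1, 1, 0, 0, 1),
    "Cx": (1, 1, 0, 0, 1),
    "Dx": (0, 1, 0, 0, 0),
}

def hamming(a, b):
    """Number of positions at which two equal-length vectors differ."""
    return sum(x != y for x, y in zip(a, b))

choices = ["Av", "Bx", "Cx", "Dx"]
distances = {c: hamming(five_v["Pa"], five_v[c]) for c in choices}
best = min(distances, key=distances.get)

print(distances)
print("closest choice:", best)
```

Each wrong choice differs from the passage in exactly one V, which matches the earlier remark that every wrong choice carries one V that does not fit the prompt.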

In [None]:
# Scatterplot, P-Tet, and P-sphere: 5V-fin
import numpy as np
import pandas as pd
import plotly.graph_objects as go
from sklearn.decomposition import PCA
from io import StringIO

# Your data as a string (including header row)
data_string = """datapoint,V1,V2,V3,V4,V5
Pa,1,1,0,0,0
P1,1,1,0,0,0
P2,1,1,0,0,0
P3,1,1,0,0,0
Q?,1,1,0,0,0
Av,1,1,0,0,0
Bx,1,1,0,0,1
Cx,1,1,0,0,1
Dx,0,1,0,0,0
"""

# Load the data from the string into a Pandas DataFrame
df = pd.read_csv(StringIO(data_string))

# Extract the data for PCA (exclude the 'datapoint' column)
X = df.iloc[:, 1:].values  # Get values from columns V1 to V5

# Extract the datapoint labels
labels = df['datapoint'].tolist()

# Perform PCA
pca = PCA(n_components=3)
reduced_data = pca.fit_transform(X)

# Get the indices of P1, P2, P3, and Q?
p1_index = labels.index('P1')
p2_index = labels.index('P2')
p3_index = labels.index('P3')
q_index = labels.index('Q?')

# Extract the 3D coordinates of P1, P2, P3, and Q?
p1_coords = reduced_data[p1_index]
p2_coords = reduced_data[p2_index]
p3_coords = reduced_data[p3_index]
q_coords = reduced_data[q_index]

# Define the vertices of the tetrahedron (also the points defining the sphere)
sphere_defining_points = [p1_coords, p2_coords, p3_coords, q_coords]
tetrahedron_vertices = np.array(sphere_defining_points)

# --- Functions to calculate sphere center and radius ---
def get_sphere_coeffs(p1, p2, p3, p4):
    """
    Calculates the coefficients A, B, C, D of the sphere equation:
    x^2 + y^2 + z^2 + Ax + By + Cz + D = 0
    passing through four points p1, p2, p3, p4.
    """
    points = np.array([p1, p2, p3, p4])
    # Form matrix M for the system M * [A, B, C, D]' = -[x^2+y^2+z^2]'
    M = np.ones((4, 4))
    M[:, 0] = points[:, 0]  # x coordinates
    M[:, 1] = points[:, 1]  # y coordinates
    M[:, 2] = points[:, 2]  # z coordinates

    # Right hand side vector
    rhs = -(points[:, 0]**2 + points[:, 1]**2 + points[:, 2]**2)

    try:
        # Solve for A, B, C, D
        coeffs = np.linalg.solve(M, rhs)
        return coeffs
    except np.linalg.LinAlgError:
        print("Error: The four points might be coplanar. Cannot determine a unique sphere.")
        return None

def calculate_sphere_center_radius(p1, p2, p3, p4):
    """
    Calculates the center and radius of a sphere passing through four points.
    p1, p2, p3, p4 are 3D points as NumPy arrays or lists [x, y, z].
    Returns (center, radius) or (None, None) if points are coplanar.
    """
    coeffs = get_sphere_coeffs(np.array(p1), np.array(p2), np.array(p3), np.array(p4))

    if coeffs is None:
        return None, None

    A, B, C, D_coeff = coeffs

    # Center (xc, yc, zc)
    xc = -A / 2
    yc = -B / 2
    zc = -C / 2
    center = np.array([xc, yc, zc])

    # Radius R
    # R^2 = xc^2 + yc^2 + zc^2 - D_coeff
    radius_sq = xc**2 + yc**2 + zc**2 - D_coeff
    if radius_sq < 0:
        print("Error: Calculated radius squared is negative (points might be collinear or an issue with PCA reduction).")
        return None, None
    radius = np.sqrt(radius_sq)

    return center, radius

# --- Function to generate Plotly sphere surface ---
def get_plotly_sphere_surface(center, radius, color='rgba(0,180,255,0.3)', resolution=50, name='Circumsphere'):
    """
    Generates Plotly go.Surface data for a sphere.
    center: NumPy array or list for the sphere's center [xc, yc, zc].
    radius: Radius of the sphere.
    color: Color of the sphere as an rgba string (e.g., 'rgba(R,G,B,A)').
    resolution: Number of points for theta and phi.
    name: Legend name for the sphere.
    """
    theta = np.linspace(0, 2 * np.pi, resolution)
    phi = np.linspace(0, np.pi, resolution)
    theta, phi = np.meshgrid(theta, phi)

    x = center[0] + radius * np.cos(theta) * np.sin(phi)
    y = center[1] + radius * np.sin(theta) * np.sin(phi)
    z = center[2] + radius * np.cos(phi)

    # To achieve a single color with opacity, we set the color directly in the colorscale
    # and ensure opacity is handled by the color string itself or the opacity property of go.Surface

    # Extract RGB from the rgba string if provided, default to a blue if format is unexpected
    try:
        rgb_color_part = color.split('(')[1].split(')')[0].split(',')
        r, g, b = rgb_color_part[0], rgb_color_part[1], rgb_color_part[2]
        plotly_color = f'rgb({r},{g},{b})'
        opacity_val = float(rgb_color_part[3]) if len(rgb_color_part) > 3 else 0.3
    except (AttributeError, IndexError, ValueError):
        plotly_color = 'rgb(0,180,255)' # Default blue
        opacity_val = 0.3

    return go.Surface(
        x=x, y=y, z=z,
        colorscale=[[0, plotly_color], [1, plotly_color]], # Solid color
        showscale=False,
        opacity=opacity_val,
        name=name,
        hoverinfo='skip' # Optional: disable hover for the sphere surface
    )

# --- Plotting ---

# Create the 3D scatter plot for all points
scatter = go.Scatter3d(
    x=reduced_data[:, 0],
    y=reduced_data[:, 1],
    z=reduced_data[:, 2],
    mode='markers+text',
    text=labels,
    textposition="middle right",
    marker=dict(size=4, color='black'), # Changed marker color for better visibility against sphere
    name='Data Points'
)

# Define the edges of the tetrahedron
tetrahedron_edges_indices = [
    (0, 1), (0, 2), (0, 3),  # Edges from P1
    (1, 2), (1, 3),          # Edges from P2
    (2, 3)                   # Edge from P3
]

# Create the lines for the tetrahedron edges
lines = []
for i, (start_idx, end_idx) in enumerate(tetrahedron_edges_indices):
    start_point = tetrahedron_vertices[start_idx]
    end_point = tetrahedron_vertices[end_idx]
    lines.append(
        go.Scatter3d(
            x=[start_point[0], end_point[0]],
            y=[start_point[1], end_point[1]],
            z=[start_point[2], end_point[2]],
            mode='lines',
            line=dict(color='red', width=3),
            name='Tetrahedron Edge'  # Same name for every edge
        )
    )
# To avoid multiple "Tetrahedron Edge" legends, only the first one will show by default if names are identical.
# Or, make them unique if needed, or group them. For simplicity, keep as is or assign name to only one.
if lines:
    lines[0].showlegend = True # Show legend for the first edge only as representative
    for line_trace in lines[1:]:
        line_trace.showlegend = False


# Calculate sphere center and radius using the PCA-reduced coordinates
sphere_center, sphere_radius = calculate_sphere_center_radius(p1_coords, p2_coords, p3_coords, q_coords)

# Initialize data list for the figure
data_traces = [scatter] + lines

# Add sphere to the plot if calculation was successful
sphere_trace = None
if sphere_center is not None and sphere_radius is not None:
    print(f"Sphere Center (PCA coords): {sphere_center}")
    print(f"Sphere Radius (PCA coords): {sphere_radius}")
    sphere_trace = get_plotly_sphere_surface(
        sphere_center,
        sphere_radius,
        color='rgba(100, 180, 255, 0.3)', # Light blue, semi-transparent
        resolution=40, # Lower resolution for faster rendering, increase for smoother sphere
        name='Circumsphere P1-P2-P3-Q?'
    )
    data_traces.append(sphere_trace)
else:
    print("Could not calculate sphere parameters. Sphere will not be plotted.")


# Set the title and axis labels
layout = go.Layout(
    title="Scatterplot, P-Tet, and P-sphere: 5V-fin",
    scene=dict(
        xaxis_title="PC 1",
        yaxis_title="PC 2",
        zaxis_title="PC 3",
        aspectmode='data' # 'data', 'cube', 'auto', 'manual'
                         # 'data' ensures that the scaling of axes matches the data range
                         # 'cube' makes the plot region a cube
    ),
    margin=dict(l=0, r=0, b=0, t=50),
    showlegend=True
)

# Create the figure
fig = go.Figure(data=data_traces, layout=layout)

# Show the plot
fig.show()
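A circumsphere exists only when the four defining points span three dimensions. The sketch below checks this by computing the rank of the edge vectors from the first point. The helper `tetrahedron_rank` and its tolerance are hypothetical. In the 5V data above, P1, P2, P3, and Q? share identical values, so they map to a single point under PCA and no unique sphere is defined for them.

```python
import numpy as np

def tetrahedron_rank(points, tol=1e-9):
    """Rank of the edge vectors from the first point: 3 means a proper tetrahedron,
    2 coplanar, 1 collinear, 0 all points identical."""
    P = np.asarray(points, dtype=float)
    edges = P[1:] - P[0]
    return int(np.linalg.matrix_rank(edges, tol=tol))

# The 5V rows for P1, P2, P3, Q? are identical, so their edge vectors are all zero.
identical_5v = [(1, 1, 0, 0, 0)] * 4
proper = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, 0, 1)]
coplanar = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0)]

for name, pts in [("identical 5V rows", identical_5v), ("proper", proper), ("coplanar", coplanar)]:
    rank = tetrahedron_rank(pts)
    status = "sphere defined" if rank == 3 else "no unique sphere"
    print(f"{name}: rank {rank}, {status}")
```

Running a check like this before the solve makes it clear whether a missing sphere comes from the data itself rather than from a numerical problem.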