In [None]:
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Load the input features and the real and imaginary parts of the output
input_data = pd.read_csv("hw1_input.csv")
output_real = pd.read_csv("hw1_real.csv")
output_imag = pd.read_csv("hw1_img.csv")

# Place inputs and outputs side by side so PCA sees all columns together
combined_data = pd.concat([input_data, output_real, output_imag], axis=1)

# Standardize every column to zero mean and unit variance before PCA
scaler = StandardScaler()
combined_data_scaled = scaler.fit_transform(combined_data)

# Fit PCA with all components retained
pca = PCA()
pca.fit(combined_data_scaled)

# Explained variance ratio of the first three components and its running total
explained_variance = pca.explained_variance_ratio_[:3]
cumulative_variance = explained_variance.cumsum()

explained_variance_df = pd.DataFrame({
    'Principal Component': [f'PC{i+1}' for i in range(3)],
    'Explained Variance Ratio': explained_variance,
    'Cumulative Variance Ratio': cumulative_variance
})

# Loadings of the first 11 columns of the combined data on PC1 to PC3
loadings = pca.components_[:3, :11]
loadings_df = pd.DataFrame(loadings, columns=combined_data.columns[:11])
loadings_df.index = [f'PC{i+1}' for i in range(3)]

print("Explained Variance and Cumulative Variance for PC1 to PC3:")
print(explained_variance_df)

print("\nLoadings (Contribution of Each of the First 11 Features to PC1 to PC3):")
print(loadings_df)


Explained Variance and Cumulative Variance for PC1 to PC3:
  Principal Component  Explained Variance Ratio  Cumulative Variance Ratio
0                 PC1                  0.520379                   0.520379
1                 PC2                  0.142717                   0.663096
2                 PC3                  0.110752                   0.773848

Loadings (Contribution of Each of the First 11 Features to PC1 to PC3):
     length of patch  width of patch  height of patch  height of substrate  \
PC1        -0.004894        0.059770         0.004230             0.064432   
PC2        -0.012305       -0.013879        -0.008578            -0.002984   
PC3         0.005707       -0.036499        -0.010962            -0.030202   

     height of solder resist layer  radius of the probe     c_pad  c_antipad  \
PC1                      -0.001097             0.003096 -0.003419  -0.004743   
PC2                      -0.003461             0.028281  0.000656  -0.015860   
PC3            

1-Yes. The first three components hold 77.4% of the cumulative variance, which is close to 80%, so most of the information in this data can be described with three components. PC1, PC2, and PC3 capture the dominant patterns among the design factors. This dimensionality reduction lowers complexity significantly while keeping most of the variability. We also suggest that these components contain the primary patterns influencing the S11 magnitudes.

2-PC1, PC2, and PC3 together explain 77.4% of the total variance (52.0%, 14.3%, and 11.1% respectively), which is close to the 80% threshold we aim to capture.
This means the underlying structure represented by these first three components accounts for nearly 80% of the variability in the combined, standardized data, that is, the design parameters together with the outputs (and likely much of the impact on the S11 magnitude response).
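
The comparison against a variance threshold can be sketched in a few lines. This is a minimal illustration, not part of the original notebook: the three ratios are copied from the printed output, and the helper name `components_for_threshold` is hypothetical.

```python
import numpy as np

# Explained variance ratios for PC1 to PC3, copied from the printed output.
ratios = np.array([0.520379, 0.142717, 0.110752])


def components_for_threshold(ratios, threshold):
    """Return the smallest k whose cumulative ratio reaches threshold, or None."""
    cumulative = np.cumsum(ratios)
    reached = np.nonzero(cumulative >= threshold)[0]
    return int(reached[0]) + 1 if reached.size else None


k_75 = components_for_threshold(ratios, 0.75)  # reached with three components
k_80 = components_for_threshold(ratios, 0.80)  # not reached by PC1 to PC3 alone
print(k_75, k_80)
```

In the notebook itself, the full `pca.explained_variance_ratio_` array could be passed instead of the three copied values, which would show how many components are needed to actually cross 80%. scikit-learn's `PCA` also accepts a float between 0 and 1 as `n_components` to pick that count automatically.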


3.1 Key Takeaways for PC1

Width of patch (0.0598) and height of substrate (0.0644) have the largest positive loadings among the design parameters in PC1, which makes them the strongest design-parameter contributors to the variance captured by this component.
Dielectric constant of substrate (0.0273) also contributes positively, suggesting that the substrate material plays a role in this main pattern.
The length of patch (-0.0049) and height of patch (0.0042) have near-zero loadings, indicating minimal contribution to PC1.
In summary, among the design parameters, PC1 is primarily influenced by the width of the patch, the height of the substrate, and the dielectric constant of the substrate. These parameters are likely crucial in determining the main geometric and material pattern in the data, which explains the majority of the variance (52.0%) and is likely related to S11.

3.2 Key Takeaways for PC2

Radius of the probe (0.0283) and c_probe (0.0262) are the most significant design-parameter contributors to PC2, indicating that probe positioning and capacitance factors play a prominent role in this secondary pattern.
The width of patch (-0.0139) and height of substrate (-0.0030) have much smaller loadings in PC2 than in PC1, suggesting that these geometric parameters matter less for the variation captured by PC2.
In summary, PC2 is primarily influenced by probe-related parameters such as radius of the probe and c_probe. This component likely represents variation related to the electrical characteristics and positioning of the probe rather than the overall geometry.

3.3 Key Takeaways for PC3

c_probe (0.0373) and radius of the probe (0.0214) are the main positive contributors to PC3, indicating that probe-related properties are again significant in this component.
Width of patch (-0.0365) and dielectric constant of substrate (-0.0368) have negative loadings of about the same magnitude as c_probe, so these parameters also shape the design variability represented in PC3, but in the opposite direction.
Other parameters such as height of patch (-0.0110) and c_antipad have smaller loadings, indicating they play a smaller role in this component.
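
The rankings discussed in sections 3.1 to 3.3 come from comparing loadings by magnitude, regardless of sign. A minimal sketch of that step follows, assuming only the four patch and substrate columns whose values are fully visible in the printed output; the helper name `rank_by_abs_loading` is hypothetical and not part of the original notebook.

```python
import pandas as pd

# Loadings of four design parameters on PC1 to PC3, copied from the printed output.
loadings_df = pd.DataFrame(
    {
        "length of patch": [-0.004894, -0.012305, 0.005707],
        "width of patch": [0.059770, -0.013879, -0.036499],
        "height of patch": [0.004230, -0.008578, -0.010962],
        "height of substrate": [0.064432, -0.002984, -0.030202],
    },
    index=["PC1", "PC2", "PC3"],
)


def rank_by_abs_loading(loadings, component):
    """Return one component's loadings sorted by absolute value, largest first."""
    row = loadings.loc[component]
    order = row.abs().sort_values(ascending=False).index
    return row.reindex(order)  # keeps the original signs


pc1_ranked = rank_by_abs_loading(loadings_df, "PC1")
print(pc1_ranked)
```

Applied to the full `loadings_df` from the notebook, the same helper would rank all 11 features for each component. Sorting on absolute value keeps large negative loadings, such as width of patch in PC3, near the top of the list.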