# Scikit-learn Engineering Analysis Examples
Scikit-learn (imported as `sklearn`) is a Python machine learning library used in many fields, including engineering analysis. The example below walks through a basic engineering analysis task: predicting a material property from composition data.
## Example: Predicting Material Hardness
This example demonstrates predicting a material's hardness from its composition using a linear regression model.
#### Explanation:
* Data Preparation:<br>
A sample dataset of material compositions and hardness values is created. In a real-world scenario, this data would be loaded from a file (e.g., CSV, Excel).
* Feature and Target Definition:<br>
The independent variables (carbon and manganese content) are defined as `X`, and the dependent variable (hardness) as `y`.
* Data Splitting:<br>
The data is split into training (80%) and testing (20%) sets to evaluate the model's generalization performance on unseen data.
* Model Training:<br>
A `LinearRegression` model from `sklearn.linear_model` is initialized and trained on the training data using the `fit()` method.
* Prediction:<br>
The trained model predicts hardness values for the test set using the `predict()` method.
* Model Evaluation:<br>
The mean squared error (MSE), computed with `mean_squared_error`, and its square root (RMSE), computed with `np.sqrt`, quantify the difference between predicted and actual hardness values.
* New Predictions:<br>
The trained model can then be used to predict the hardness of new, unobserved material compositions.
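
The data preparation step above builds the dataset in memory. As a minimal sketch of loading the same columns from CSV instead, assuming the file uses the column names from this example (the inline CSV text and the file name `materials.csv` mentioned in the comment are hypothetical stand-ins, not part of the original example):

```python
import io

import pandas as pd

# Stand-in for a real file. In practice you would pass a path instead,
# e.g. pd.read_csv("materials.csv") (hypothetical file name).
csv_text = """Carbon_Content,Manganese_Content,Hardness_HV
0.1,0.5,150
0.2,0.6,180
0.3,0.7,210
"""

df = pd.read_csv(io.StringIO(csv_text))

# Fail early if an expected column is missing from the file
required = ['Carbon_Content', 'Manganese_Content', 'Hardness_HV']
missing = [col for col in required if col not in df.columns]
if missing:
    raise ValueError(f"Missing columns: {missing}")

# Drop rows with missing measurements before modelling
df = df.dropna(subset=required)

X = df[['Carbon_Content', 'Manganese_Content']]
y = df['Hardness_HV']
print(df.shape)
```

For Excel files, `pd.read_excel` plays the same role, though it needs an additional engine package such as `openpyxl` installed.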

This example demonstrates a fundamental application of sklearn in engineering analysis. More complex scenarios might involve feature engineering, different machine learning algorithms (e.g., Support Vector Machines, Random Forests), and more elaborate evaluation metrics.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# 1. Create a sample dataset (replace with your actual engineering data)
data = {
    'Carbon_Content': [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0],
    'Manganese_Content': [0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.4],
    'Hardness_HV': [150, 180, 210, 240, 270, 300, 330, 360, 390, 420]
}
df = pd.DataFrame(data)

# 2. Define features (X) and target (y)
X = df[['Carbon_Content', 'Manganese_Content']]
y = df['Hardness_HV']

# 3. Split data into training (80%) and testing (20%) sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# 4. Initialize and train the Linear Regression model
model = LinearRegression()
model.fit(X_train, y_train)

# 5. Make predictions on the test set
y_pred = model.predict(X_test)

# 6. Evaluate the model's performance (RMSE is the square root of MSE)
mse = mean_squared_error(y_test, y_pred)
rmse = np.sqrt(mse)

print(f"Mean Squared Error: {mse:0.4e}")
print(f"Root Mean Squared Error: {rmse:0.4e}")

# 7. Use the trained model for new predictions
new_material_composition = pd.DataFrame(
    [[0.55, 0.95]], columns=['Carbon_Content', 'Manganese_Content']
)
predicted_hardness = model.predict(new_material_composition)
print(f"Predicted Hardness for new material: {predicted_hardness[0]:.2f} HV")
```

Output:

```
Mean Squared Error: 8.0779e-27
Root Mean Squared Error: 8.9877e-14
Predicted Hardness for new material: 285.00 HV
```
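
The errors reported above are at floating-point noise level because the sample data is exactly linear: hardness equals 120 + 300 × carbon content, and manganese content is always carbon content + 0.4. Real measurements will not behave this way. As a rough sketch of the more complex scenarios mentioned earlier, the block below compares three regressors using 5-fold cross-validation and mean absolute error. The choice of models, hyperparameters, and metric is an assumption made for illustration, not a recommendation from the original example.

```python
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Same synthetic dataset as in the example above
df = pd.DataFrame({
    'Carbon_Content': [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0],
    'Manganese_Content': [0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.4],
    'Hardness_HV': [150, 180, 210, 240, 270, 300, 330, 360, 390, 420],
})
X = df[['Carbon_Content', 'Manganese_Content']]
y = df['Hardness_HV']

# Candidate models (hyperparameters are illustrative assumptions).
# SVR is sensitive to feature scale, so it is wrapped in a scaling pipeline.
models = {
    'LinearRegression': LinearRegression(),
    'RandomForest': RandomForestRegressor(n_estimators=100, random_state=42),
    'SVR (scaled)': make_pipeline(StandardScaler(), SVR(kernel='rbf', C=100.0)),
}

# Shuffled 5-fold split: each fold holds out 2 of the 10 samples
cv = KFold(n_splits=5, shuffle=True, random_state=42)

cv_mae = {}
for name, model in models.items():
    # scikit-learn reports errors as negative scores, so flip the sign
    scores = cross_val_score(
        model, X, y, cv=cv, scoring='neg_mean_absolute_error'
    )
    cv_mae[name] = -scores.mean()
    print(f"{name}: cross-validated MAE = {cv_mae[name]:.2f} HV")
```

With only ten perfectly linear samples, the linear model is expected to win by construction, so the ranking says little about real materials data. The point of the sketch is the comparison workflow, which carries over unchanged once `df` holds measured data.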
