ML model is predicting the same output for different inputs #19427

Closed
na2209 opened this issue Apr 3, 2024 · 4 comments

na2209 commented Apr 3, 2024

My ML model is predicting the same output for different inputs. Can someone help me?
Here is the code:

# Import necessary libraries
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

# Step 1: Load the dataset
data = pd.read_excel('C:\\Users\\hp\\Desktop\\SugarcaneDataset.xlsx')  # Adjust the file path as needed

# Step 2: Split the dataset into features (X) and target variable (y)
X = data.drop(columns=['Brix'])  # Features
y = data['Brix']  # Target variable

# Step 3: Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Step 4: Initialize and train the Random Forest Regression model
rf_regressor = RandomForestRegressor(n_estimators=100, random_state=42)  # Adjust hyperparameters as needed
rf_regressor.fit(X_train, y_train)

# Step 5: Make predictions on the testing set
y_pred = rf_regressor.predict(X_test)

# Step 6: Evaluate the model's performance
mse = mean_squared_error(y_test, y_pred)
mae = mean_absolute_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

print("Mean Squared Error:", mse)
print("Mean Absolute Error:", mae)
print("R-squared:", r2)

# Step 7: Optionally, deploy the model and document the process

Now, when I predict the Brix value for different samples, the model returns the same value for both, even though the inputs are different:
1)

predicted_brix = rf_regressor.predict([[530.279, 162.9097, 74.48293, 41.56035, 29.57368, 17.8308]])
print("Predicted Brix value:", predicted_brix)
Predicted Brix value: [17.774]

2)

predicted_brix = rf_regressor.predict([[613.5688, 172.6195, 82.26472, 43.53942, 31.61325, 19.01953]])
print("Predicted Brix value:", predicted_brix)
Predicted Brix value: [17.774]

Why is this happening?

@mauriceoboya

Your model might be overfitting the training data, capturing noise rather than the underlying patterns, which can lead to poor generalization to new data. You could try checking for multicollinearity among the features, normalizing the features if they are on different scales, or experimenting with different machine learning algorithms.
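A minimal sketch of how these checks might look in practice. It uses synthetic data with hypothetical feature names (f1, f2, f3), since the SugarcaneDataset.xlsx file is not available here, so it illustrates the diagnostics rather than the actual cause in this issue. It prints the feature correlation matrix as a multicollinearity check, then uses `RandomForestRegressor.apply` to see whether two different samples end up in the same leaf of every tree. A random forest returns the same value for any inputs that follow identical paths through all of its trees, which is guaranteed when both samples lie beyond the training range of every feature.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in for the real data (hypothetical features and target).
rng = np.random.default_rng(42)
base = rng.uniform(0.0, 10.0, size=200)
X = pd.DataFrame({
    "f1": base,
    "f2": 2.0 * base + rng.normal(0.0, 0.1, size=200),  # nearly collinear with f1
    "f3": rng.uniform(0.0, 10.0, size=200),
})
y = 3.0 * X["f1"] + X["f3"] + rng.normal(0.0, 0.5, size=200)

# Multicollinearity check: pairwise Pearson correlations between features.
corr = X.corr()
print(corr.round(3))

model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(X, y)

# Two different samples, both above the training range of every feature.
samples = pd.DataFrame(
    [[50.0, 100.0, 50.0], [80.0, 160.0, 90.0]],
    columns=X.columns,
)

# apply() returns, for each sample, the index of the leaf reached in every tree.
leaves = model.apply(samples)
same_leaves = bool((leaves[0] == leaves[1]).all())
preds = model.predict(samples)

print("Training range per feature:")
print(X.agg(["min", "max"]))
print("Same leaf in every tree:", same_leaves)
print("Predictions:", preds)
```

If the same comparison on the real model shows identical leaves for both samples, comparing the inputs against `X_train.min()` and `X_train.max()` would be a reasonable next step.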

@na2209
Author

na2209 commented Apr 3, 2024

Still facing the same issue

@sachinprasadhs
Collaborator

This repo is for reporting bugs in the Keras framework only.
From your code, it does not look like you're using Keras.
For support-related questions, you can post in the Stack Overflow community.
