<a href="https://colab.research.google.com/github/innovativenexusbd/AgroAI/blob/main/waterresource.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

***Water Resource Analysis***

Importing Libraries

In [None]:
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, confusion_matrix
import matplotlib.pyplot as plt

Loading the Dataset, Setting the Thresholds, and Evaluating the Quality

In [None]:
sample_df = pd.read_csv('synthetic_water_data.csv')
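# Thresholds need to be adjusted for the target use case.
# pH is an allowed (min, max) range; Dissolved Oxygen is a minimum; the rest are maximums.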
quality_thresholds = {
    'pH': (6.5, 8.5),
    'Turbidity': 5,
    'Dissolved Oxygen': 8,
    'Nitrate': 10,
    'Phosphate': 1,
}
def evaluate_quality(row):
    if (quality_thresholds['pH'][0] <= row['pH'] <= quality_thresholds['pH'][1] and
        row['Turbidity'] <= quality_thresholds['Turbidity'] and
        row['Dissolved Oxygen'] >= quality_thresholds['Dissolved Oxygen'] and
        row['Nitrate'] <= quality_thresholds['Nitrate'] and
        row['Phosphate'] <= quality_thresholds['Phosphate']):
        return 1  # Good quality
    else:
        return 0  # Poor quality
sample_df['Quality'] = sample_df.apply(evaluate_quality, axis=1)
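The row-wise rule above can also be written with vectorized column comparisons, and it helps to count how many rows land in each class before training. The following is a minimal sketch, assuming the same threshold values; `demo_df` and its four rows are hypothetical values made up for illustration and are not taken from synthetic_water_data.csv.

```python
import pandas as pd

# Hypothetical rows for illustration only (not from synthetic_water_data.csv).
demo_df = pd.DataFrame({
    'pH': [7.2, 6.0, 7.8, 8.0],
    'Turbidity': [3.0, 2.0, 6.5, 4.0],
    'Dissolved Oxygen': [9.1, 8.5, 10.0, 7.0],
    'Nitrate': [4.0, 3.0, 2.0, 5.0],
    'Phosphate': [0.5, 0.4, 0.3, 0.2],
})

# Same thresholds as quality_thresholds, applied to whole columns at once.
good = (
    demo_df['pH'].between(6.5, 8.5)          # inclusive on both ends
    & (demo_df['Turbidity'] <= 5)
    & (demo_df['Dissolved Oxygen'] >= 8)
    & (demo_df['Nitrate'] <= 10)
    & (demo_df['Phosphate'] <= 1)
)
demo_df['Quality'] = good.astype(int)        # 1 = good, 0 = poor

# Count rows per class to spot imbalance before training.
class_counts = demo_df['Quality'].value_counts()
print(class_counts)
```

If one class has very few rows, as the classification report later in this notebook suggests, the thresholds or the dataset may need adjusting before the model can learn that class.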



Prepare and Split the Dataset

In [None]:
X = sample_df.drop(['id', 'Quality'], axis=1)
y = sample_df['Quality']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
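When one class is rare, a plain random split can leave the test set with almost no examples of it. One possible alternative is the `stratify` argument of `train_test_split`, which keeps the class proportions the same in both parts. This is a sketch on made-up data, not a change to the split used in this notebook; `demo_X` and `demo_y` are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical data: 100 rows, 10 labelled good (1) and 90 labelled poor (0).
rng = np.random.default_rng(0)
demo_X = pd.DataFrame({
    'pH': rng.uniform(5.0, 9.0, 100),
    'Turbidity': rng.uniform(0.0, 10.0, 100),
})
demo_y = pd.Series([1] * 10 + [0] * 90)

# stratify=demo_y preserves the 10% / 90% class ratio in train and test.
X_tr, X_te, y_tr, y_te = train_test_split(
    demo_X, demo_y, test_size=0.2, random_state=42, stratify=demo_y
)
print(y_te.value_counts())
```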

Initialize and Train Random Forest Model

In [None]:
rf = RandomForestClassifier(n_estimators=100, random_state=42)
rf.fit(X_train, y_train)
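After fitting, the `feature_importances_` attribute of a random forest gives a rough view of which measurements drive its decisions. The sketch below uses hypothetical data (`demo_X`, `demo_y`) in which the label depends only on pH, so pH should rank first; the same two closing lines could be applied to `rf` and `X_train.columns`.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Hypothetical data: the label depends only on pH; Turbidity is pure noise.
rng = np.random.default_rng(0)
demo_X = pd.DataFrame({
    'pH': rng.uniform(5.0, 9.0, 200),
    'Turbidity': rng.uniform(0.0, 10.0, 200),
})
demo_y = (demo_X['pH'] > 7.0).astype(int)

demo_rf = RandomForestClassifier(n_estimators=50, random_state=42)
demo_rf.fit(demo_X, demo_y)

# Pair each importance with its column name and sort, largest first.
importances = pd.Series(demo_rf.feature_importances_, index=demo_X.columns)
print(importances.sort_values(ascending=False))
```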


Prediction, Classification Report, and Identifying the Best Water Sources

In [None]:
y_pred = rf.predict(X_test)
print("Classification Report:")
print(classification_report(y_test, y_pred))
print("Confusion Matrix:")
print(confusion_matrix(y_test, y_pred))
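# Predict for every row, including rows the model was trained on,
# so these labels are not an independent estimate of accuracy.
sample_df['Prediction'] = rf.predict(X)
# Keep only the sources predicted as good quality (1).
best_sources = sample_df[sample_df['Prediction'] == 1]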

Classification Report:
              precision    recall  f1-score   support

           0       0.99      1.00      1.00       199
           1       0.00      0.00      0.00         1

    accuracy                           0.99       200
   macro avg       0.50      0.50      0.50       200
weighted avg       0.99      0.99      0.99       200

Confusion Matrix:
[[199   0]
 [  1   0]]


  _warn_prf(average, modifier, msg_start, len(result))
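The report above shows 0.99 accuracy, yet the single good-quality sample in the test set was missed and class 1 was never predicted, so its precision has no defined value. The sketch below rebuilds the same confusion matrix from placeholder label lists (`y_true_demo` and `y_pred_demo` are illustrative stand-ins, not the notebook's `y_test` and `y_pred`) and uses the `zero_division` argument of `classification_report` to set the undefined precision to 0 explicitly.

```python
from sklearn.metrics import classification_report, confusion_matrix

# Placeholder labels matching the matrix above: 199 poor, 1 good, all predicted poor.
y_true_demo = [0] * 199 + [1]
y_pred_demo = [0] * 200

cm = confusion_matrix(y_true_demo, y_pred_demo)
print(cm)

# zero_division=0 reports undefined precision as 0.0 instead of warning about it.
report = classification_report(
    y_true_demo, y_pred_demo, zero_division=0, output_dict=True
)
print(report['1'])          # metrics for the good-quality class
print(report['accuracy'])
```

High accuracy here reflects only the majority class; the per-class recall for label 1 is the number to watch when the goal is finding good water sources.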


Saving the Data

In [None]:
sample_df.to_csv('water_quality_predictions.csv', index=False)
best_sources.to_csv('best_water_sources.csv', index=False)
print("Water Quality Predictions Saved:")
print(sample_df.head())
print("\nBest Water Sources:")
print(best_sources.head())

Water Quality Predictions Saved:
   id        pH  Turbidity  Dissolved Oxygen    Nitrate  Phosphate  Quality  \
0   1  6.310890   1.851329          7.355351  33.635150   2.859979        0   
1   2  8.327500   5.419009          7.222809  39.834070   4.027162        0   
2   3  7.561979   8.729458         13.156291  12.523395   3.800805        0   
3   4  7.095305   7.322249          7.245916  31.243705   0.769500        0   
4   5  5.546065   8.065611          7.447548  28.587299   0.746247        0   

   Prediction  
0           0  
1           0  
2           0  
3           0  
4           0  

Best Water Sources:
      id        pH  Turbidity  Dissolved Oxygen   Nitrate  Phosphate  Quality  \
9     10  7.478254   4.894250          9.388141  2.591790   0.283402        1   
161  162  7.213070   1.288699         10.588357  1.095666   0.425282        1   
378  379  7.850640   3.592334          9.017776  4.419247   0.726633        1   
503  504  7.848283   3.412478         13.604424  1.