In [20]:
'''
This program generates a synthetic dataset of patient records (age, medication, condition, outcome), trains a machine learning model on it, 
reports the model's accuracy on the training data, and then uses the model to produce a recommendation from user input.

Let's break down the steps:

Data Generation: Synthetic patient data is created with four columns: Age, Medication, Condition, and Outcome. This mimics a scenario where 
patient age, prescribed medication, existing condition, and treatment outcome are collected. Every column is drawn at random and independently 
of the others, so Outcome has no real relationship to the features.

Data Preparation: The categorical variables ('Medication' and 'Condition') are converted to integer codes with LabelEncoder so that they are in a 
numerical format suitable for the classifier. The fitted encoders are stored so that user input can be encoded the same way later.

Model Training: A RandomForestClassifier is trained on the full dataset (X and y); this program does not hold out a test set. The classifier learns 
patterns from the provided features (Age, Medication, and Condition) to predict Outcome.

Prediction and Evaluation: The trained model (clf) makes predictions on the same data it was trained on (X), and accuracy_score computes the 
proportion of correctly predicted outcomes. This is a training accuracy: it shows how closely the model fits data it has already seen, not how 
well it generalizes to unseen patients.

Recommendation: recommend_medication encodes a user-supplied age, medication, and condition, passes them to the model, and converts the 
predicted outcome (1 or 0) into a recommendation string.

Overall, this program is a basic demonstration of a binary classification workflow using a RandomForestClassifier on a synthetic dataset 
representing patient data related to medications and conditions. The print statements display the training accuracy and the recommendation for 
the patient entered by the user.
'''

import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import accuracy_score

# Generate synthetic medication dataset
np.random.seed(42)
num_patients = 1000
medications = ['Medication_A', 'Medication_B', 'Medication_C', 'Medication_D']
conditions = ['Condition_X', 'Condition_Y', 'Condition_Z']
patient_data = {
    'Age': np.random.randint(20, 80, num_patients),
    'Medication': np.random.choice(medications, num_patients),
    'Condition': np.random.choice(conditions, num_patients),
    'Outcome': np.random.randint(0, 2, num_patients)  # Binary outcome: 0 or 1
}
medication_df = pd.DataFrame(patient_data)

# Encoding categorical variables
label_encoders = {}
for col in ['Medication', 'Condition']:
    label_encoders[col] = LabelEncoder()
    medication_df[col] = label_encoders[col].fit_transform(medication_df[col])

# Training a RandomForestClassifier
X = medication_df[['Age', 'Medication', 'Condition']]
y = medication_df['Outcome']

clf = RandomForestClassifier(random_state=42)
clf.fit(X, y)

# Calculate training accuracy
y_pred_train = clf.predict(X)
training_accuracy = accuracy_score(y, y_pred_train)
print(f"Training Accuracy: {training_accuracy}")

# Function to recommend medication based on user input
def recommend_medication(age, medication, condition):
    medication_idx = label_encoders['Medication'].transform([medication])[0]
    condition_idx = label_encoders['Condition'].transform([condition])[0]
    
    # Create a DataFrame with named features for prediction
    user_input = pd.DataFrame([[age, medication_idx, condition_idx]], columns=['Age', 'Medication', 'Condition'])

    # Make prediction based on user input DataFrame
    prediction = clf.predict(user_input)
    
    # predict() returns a one-element array, so compare its single value
    if prediction[0] == 1:
        return "Recommendation: Prescribe Medication"
    else:
        return "Recommendation: No Medication Needed"

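# Take user input (strip stray whitespace so the labels match those seen by the encoders;
# a label the encoders were not fitted on makes transform() raise ValueError)
age_input = int(input("Enter patient's age: "))
medication_input = input("Enter medication (Medication_A, Medication_B, Medication_C, Medication_D): ").strip()
condition_input = input("Enter condition (Condition_X, Condition_Y, Condition_Z): ").strip()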

recommendation = recommend_medication(age_input, medication_input, condition_input)
print(recommendation)


Training Accuracy: 0.794


Enter patient's age:  89
Enter medication (Medication_A, Medication_B, Medication_C, Medication_D):  Medication_A
Enter condition (Condition_X, Condition_Y, Condition_Z):  Condition_X


Recommendation: No Medication Needed
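
The accuracy printed above is computed on the same rows the classifier was fitted to, so it says little about unseen patients. The following is a minimal sketch of a held-out evaluation, not part of the original cell. It assumes the same kind of synthetic data, swaps LabelEncoder for one-hot encoding via pd.get_dummies, and uses train_test_split with an 80/20 split. The names rng, df, X_demo, y_demo, and clf_demo, the split ratio, and the use of np.random.default_rng are all illustrative choices, so the generated rows differ from those in the cell above.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Assumption: a local generator instead of np.random.seed, to avoid touching global state
rng = np.random.default_rng(42)
n = 1000
df = pd.DataFrame({
    'Age': rng.integers(20, 80, n),
    'Medication': rng.choice(['Medication_A', 'Medication_B', 'Medication_C', 'Medication_D'], n),
    'Condition': rng.choice(['Condition_X', 'Condition_Y', 'Condition_Z'], n),
    'Outcome': rng.integers(0, 2, n),  # random binary outcome, independent of the features
})

# One-hot encode the categorical columns; Age is passed through unchanged
X_demo = pd.get_dummies(df[['Age', 'Medication', 'Condition']],
                        columns=['Medication', 'Condition'])
y_demo = df['Outcome']

# Hold out 20% of the rows for testing (the ratio is an assumption)
X_train, X_test, y_train, y_test = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)

clf_demo = RandomForestClassifier(random_state=42)
clf_demo.fit(X_train, y_train)

# Compare accuracy on seen rows with accuracy on held-out rows
train_acc = accuracy_score(y_train, clf_demo.predict(X_train))
test_acc = accuracy_score(y_test, clf_demo.predict(X_test))
print(f"Training Accuracy: {train_acc:.3f}")
print(f"Test Accuracy: {test_acc:.3f}")
```

Because Outcome is generated independently of the features, the test accuracy should sit near chance level (about 0.5) even when the training accuracy is much higher. The gap between the two numbers is what the held-out split is meant to expose.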
