# **MindGen AI® Project**
---
MindGen AI® is an advanced artificial intelligence model developed by Phoenix Labs, aimed at exploring the
genetic foundations of mental health. This AI model seeks to develop personalized intervention strategies by
analyzing genetic data to identify biomarkers, genetic variants, and molecular pathways associated with
various mental health conditions. MindGen AI® integrates state-of-the-art AI algorithms to enhance early
detection, precise diagnosis, and tailored interventions, ultimately promoting proactive mental health
management and improved therapeutic outcomes.

The datasets, methodology, and resources used for the `MindGen Project` are given below.


#1. 🥶 **Depression Analysis**
⚛
**Dataset Link**
[`https://drive.google.com/file/d/1OJqKfwibxydSmus-bvPjhxJ0fKYWJDuI/view?usp=drive_link`](https://drive.google.com/file/d/1OJqKfwibxydSmus-bvPjhxJ0fKYWJDuI/view?usp=drive_link)

Information about the dataset is as follows.
<br>

✅ **Overview**

This dataset is structured for use in predictive modeling and analysis of depression based on biomarkers, genetics, nutrigenomics, and other relevant parameters. The dataset includes 1000 samples, with a balanced distribution across different types of depression and healthy individuals.

All values are carefully designed using logical mappings based on clinical research articles, public datasets, and medical literature.

<br>

🧾 **Depression Data Dictionary**

| Column Name                | Description                               | Possible Values                                                                                                                                                                                                                                                                                                                                                                             | What Each Value Means                                                                                                   |
| -------------------------- | ----------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------- |
| `Age`                      | Age in years                              | 18–80                                                                                                                                                                                                                                                                                                                                                                                       | Age range of individuals. Risk of depression varies across age groups.                                                  |
| `Sex`                      | Biological sex                            | **Male** – Assigned male at birth<br>**Female** – Assigned female at birth                                                                                                                                                                                                                                                                                                                  | Used for assessing gender-related differences in depression, hormones, and genetics.                                    |
| `BMI`                      | Body Mass Index                           | 15–40                                                                                                                                                                                                                                                                                                                                                                                       | BMI is a weight-to-height ratio. High BMI (>30) may correlate with inflammation and depression.                         |
| `SleepHours`               | Average sleep per night                   | 2–12                                                                                                                                                                                                                                                                                                                                                                                        | Fewer than 6 hours = sleep deprivation; more than 9 may indicate hypersomnia. Both are linked with depression types.    |
| `PhysicalActivity`         | Weekly physical activity level            | **Low** – Rarely exercises<br>**Moderate** – Exercises 2–4 times/week<br>**High** – Exercises 5+ times/week                                                                                                                                                                                                                                                                                 | Physical activity helps reduce depression risk through endorphin release.                                               |
| `CRP`                      | C-reactive protein level (mg/L)           | 0.1–10                                                                                                                                                                                                                                                                                                                                                                                      | Higher CRP levels indicate inflammation in the body. Depression is linked with chronic low-level inflammation.          |
| `CortisolLevel`            | Morning cortisol (µg/dL)                  | 5–30                                                                                                                                                                                                                                                                                                                                                                                        | Cortisol is the body’s stress hormone. Chronically high cortisol is common in melancholic and major depression.         |
| `BDNF_Level`               | Brain-derived neurotrophic factor (ng/mL) | 4–40                                                                                                                                                                                                                                                                                                                                                                                        | BDNF supports neuron growth. Lower levels are associated with poor brain plasticity and depressive symptoms.            |
| `VitaminD_Level`           | Vitamin D in blood (ng/mL)                | 5–50                                                                                                                                                                                                                                                                                                                                                                                        | Low vitamin D (<20 ng/mL) is associated with depressive symptoms.                                                       |
| `Folate_Level`             | Folate (Vitamin B9) in blood (ng/mL)      | 2–25                                                                                                                                                                                                                                                                                                                                                                                        | Deficiency in folate impairs serotonin and dopamine synthesis.                                                          |
| `Tryptophan_Level`         | Amino acid tryptophan level (µmol/L)      | 20–100                                                                                                                                                                                                                                                                                                                                                                                      | Tryptophan is the precursor to serotonin. Low levels reduce serotonin, a mood-stabilizing neurotransmitter.             |
| `Omega3_Index`             | % of omega-3 fatty acids in blood         | 2–12                                                                                                                                                                                                                                                                                                                                                                                        | Omega-3 fatty acids help brain function. A low index is linked with mood disorders.                                     |
| `5HTTLPR_Genotype`         | Serotonin transporter gene variation      | **LL** – Two long alleles → low risk<br>**LS** – One short, one long allele → moderate risk<br>**SS** – Two short alleles → high risk of depression under stress                                                                                                                                                                                                                            | This gene regulates serotonin reuptake. SS type has been linked to increased vulnerability to depression when stressed. |
| `BDNF_Genotype`            | BDNF gene variation (Val66Met)            | **Val/Val** – Normal BDNF levels<br>**Val/Met** – Moderately reduced BDNF<br>**Met/Met** – Reduced BDNF secretion, higher depression risk                                                                                                                                                                                                                                                   | The Met allele impairs BDNF release → affects memory, mood.                                                             |
| `MTHFR_Genotype`           | Folate metabolism gene                    | **CC** – Normal enzyme activity<br>**CT** – Moderately reduced activity<br>**TT** – Poor folate metabolism, high risk                                                                                                                                                                                                                                                                       | TT genotype leads to lower methylation, a process essential for neurotransmitter synthesis.                             |
| `TNF_Alpha_Level`          | Inflammatory cytokine TNF-α (pg/mL)       | 1–20                                                                                                                                                                                                                                                                                                                                                                                        | High values indicate inflammation in the brain, often linked to treatment-resistant depression.                         |
| `IL6_Level`                | Inflammatory marker interleukin-6 (pg/mL) | 0.5–10                                                                                                                                                                                                                                                                                                                                                                                      | Higher levels of IL-6 correlate with chronic and atypical depression.                                                   |
| `GutMicrobiota_Diversity`  | Microbiome richness (score out of 5)      | 1–5                                                                                                                                                                                                                                                                                                                                                                                         | A diverse gut microbiome promotes mental wellness. Lower values = poor gut health, linked with depression.              |
| `DietPattern`              | General eating pattern                    | **Western** – High in sugar/fat<br>**Mediterranean** – High in fruits, nuts, fish<br>**Balanced** – Moderate of all groups                                                                                                                                                                                                                                                                  | Mediterranean/Balanced diets are protective. Western diet is linked to inflammation and depression.                     |
| `AlcoholIntake`            | Weekly alcohol consumption                | **None** – 0 drinks/week<br>**Moderate** – 1–7 drinks/week<br>**High** – >7 drinks/week                                                                                                                                                                                                                                                                                                     | Excess alcohol can worsen mood and increase depressive symptoms.                                                        |
| `SmokingStatus`            | Smoking habit                             | **Never** – Has never smoked<br>**Former** – Smoked before but quit<br>**Current** – Currently smokes                                                                                                                                                                                                                                                                                       | Current smokers have higher rates of depression; nicotine affects dopamine.                                             |
| `FamilyHistory_Depression` | Any family history of depression          | **Yes** – Has biological relative with depression<br>**No** – No known history                                                                                                                                                                                                                                                                                                              | Family history increases genetic vulnerability.                                                                         |
| `CognitiveImpairment`      | Cognitive symptoms observed               | **None** – No symptoms<br>**Mild** – Some memory/attention issues<br>**Severe** – Daily life affected                                                                                                                                                                                                                                                                                       | Seen often in major and persistent depression.                                                                          |
| `NeuroinflammationMarker`  | Presence of brain inflammation markers    | **Yes** – Detected<br>**No** – Not detected                                                                                                                                                                                                                                                                                                                                                 | Elevated in treatment-resistant depression and MDD.                                                                     |
| `SleepDisruptionType`      | Type of sleep issues                      | **None** – Normal sleep<br>**Insomnia** – Difficulty sleeping<br>**Hypersomnia** – Excessive sleeping                                                                                                                                                                                                                                                                                       | Insomnia is more common in melancholic depression, hypersomnia in atypical depression.                                  |
| `DepressionDiagnosis`      | Final clinical diagnosis                  | **False** – No depression<br>**Major Depressive Disorder** – Severe low mood & symptoms for ≥2 weeks<br>**Persistent Depressive Disorder** – Long-term, less intense symptoms<br>**Atypical Depression** – Mood improves with positive events, hypersomnia<br>**Melancholic Depression** – Anhedonia, severe symptoms<br>**Seasonal Affective Disorder** – Symptoms during specific seasons | Assigned based on biomarker profile, symptoms, and genetic traits.                                                      |

<br>


🧾 **Column Name: DepressionScore_PHQ9**

| Attribute              | Description                                                                                                                                                                     |
| ---------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Type**               | Numeric (Integer)                                                                                                                                                               |
| **Range**              | 0 – 27                                                                                                                                                                          |
| **Source/Logic**       | Simulated based on the presence and severity of depressive symptoms. The score correlates with other features (e.g., sleep quality, appetite change, genetic risk, biomarkers). |
| **Purpose**            | Used as a **clinical proxy for depression severity**. Helps map depression type (e.g., Major Depression vs. Mild Depression) and is useful in ML models for prediction.         |
| **Used For Labeling?** | Yes. A key feature used to determine depression subtype when `DepressionDiagnosis` was updated to contain values like `Major Depression`, `Atypical Depression`, etc.           |


📉 **What the Values Signify**

| **PHQ-9 Score** | **Severity Level**    | **What It Means**                                                                                                                 |
| --------------- | --------------------- | --------------------------------------------------------------------------------------------------------------------------------- |
| 0–4             | **Minimal or None**   | No significant signs of depression.                                                                                               |
| 5–9             | **Mild**              | Early symptoms; may not need treatment but should be monitored.                                                                   |
| 10–14           | **Moderate**          | Common cutoff for **clinical diagnosis** of depression. Often treated with counseling or medication.                              |
| 15–19           | **Moderately Severe** | Functional impairment is likely; active treatment recommended.                                                                    |
| 20–27           | **Severe**            | High risk; requires **urgent** clinical intervention (medication, therapy, etc.). Often corresponds to Major Depressive Disorder. |

<br>
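
The severity bands in the table above can be expressed as a small lookup. This is a minimal sketch; the function name `phq9_severity` is hypothetical and is not part of the project notebook.

```python
def phq9_severity(score: int) -> str:
    """Map a PHQ-9 total score (0-27) to the severity band listed in the table."""
    if not 0 <= score <= 27:
        raise ValueError(f"PHQ-9 score must be between 0 and 27, got {score}")
    if score <= 4:
        return "Minimal or None"
    if score <= 9:
        return "Mild"
    if score <= 14:
        return "Moderate"
    if score <= 19:
        return "Moderately Severe"
    return "Severe"


# Print the band for a few boundary scores.
for example_score in (4, 5, 14, 20):
    print(example_score, phq9_severity(example_score))
```
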

**Value Significance Highlights**

Genetic Markers:

* 5HTTLPR: Short allele (S) linked with poor serotonin regulation → increased depression risk under stress.

* BDNF Val66Met: Met allele → reduced BDNF secretion → decreased neuroplasticity.

* MTHFR TT: Inefficient methylation → poor folate metabolism → neurotransmitter dysregulation.

Biomarkers:

* CRP, IL6, TNF-Alpha: Inflammation markers associated with chronic depression.

* Cortisol: Marker for HPA axis dysregulation (elevated in melancholic and chronic depression).

Nutritional Factors:

* Low Omega-3, Tryptophan, Folate, Vitamin D: Deficiency associated with reduced neurotransmitter synthesis and synaptic function.

<br>
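
One possible way to turn the three genotype columns into numeric features is to count copies of the risk allele named in the highlights above (S, Met, T). This is only an illustrative sketch: the additive 0/1/2 coding and the helper name `risk_allele_count` are assumptions, not the encoding used in the notebook, which one-hot encodes categorical columns.

```python
# Risk allele per marker, taken from the data dictionary in this document.
RISK_ALLELE = {
    "5HTTLPR_Genotype": "S",
    "BDNF_Genotype": "Met",
    "MTHFR_Genotype": "T",
}


def risk_allele_count(marker: str, genotype: str) -> int:
    """Return how many copies of the marker's risk allele the genotype carries (0, 1 or 2)."""
    risk = RISK_ALLELE[marker]
    # 'Val/Met' style genotypes are slash-separated; 'LS' or 'CT' use one letter per allele.
    alleles = genotype.split("/") if "/" in genotype else list(genotype)
    if len(alleles) != 2:
        raise ValueError(f"Expected two alleles, got {genotype!r}")
    return sum(allele == risk for allele in alleles)


print(risk_allele_count("5HTTLPR_Genotype", "LS"))
print(risk_allele_count("BDNF_Genotype", "Met/Met"))
```
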

🧬 **Methodology**

Step-by-Step Process:

**Literature Review:**

* Examined peer-reviewed papers (PubMed, Elsevier, Frontiers, Nature).

* Extracted clinical correlations from epidemiological studies.

**Value Mapping:**

* Created ranges based on clinical test values from Mayo Clinic, WebMD, and LabCorp.

* Correlated certain biomarker ranges with corresponding depression types.

**Depression Classification:**

* Used DSM-5 criteria and online diagnostic guidelines.

* Mapped inflammation + genetic + symptom traits to subtype.

**Balancing:**

* Ensured each depression subtype and false (non-depressed) cases were equally distributed (~166 entries each).

<br>
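
The balancing step can be verified with a quick count per class. The sketch below uses toy labels rather than the real dataset, and `class_balance` is a hypothetical helper name; a ratio close to 1.0 indicates evenly distributed classes.

```python
from collections import Counter


def class_balance(labels):
    """Return per-class counts and the ratio of the largest to the smallest class."""
    counts = Counter(labels)
    ratio = max(counts.values()) / min(counts.values())
    return counts, ratio


# Toy labels for illustration only; replace with the DepressionDiagnosis column.
toy_labels = (
    ["False"] * 4
    + ["Major Depressive Disorder"] * 4
    + ["Atypical Depression"] * 3
)
counts, ratio = class_balance(toy_labels)
print(dict(counts))
print(round(ratio, 2))
```
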

📚 **Reference Resources**

Articles & Journals:
* NIH – Genetics and Depression

* Nature – Depression and Inflammation

* PubMed – Nutritional Psychiatry

* BDNF and Depression – Research

* WebMD – Symptoms and Types of Depression

Medical Labs & Norms:
* LabCorp: Diagnostic test ranges (e.g., BDNF, Folate, CRP)

* Mayo Clinic: Cortisol, Omega-3, Vitamin D level interpretations

Depression Classifications:
* DSM-5 (American Psychiatric Association)

* ICD-10

* Psychology Today and NIMH classification comparisons

<br>

External Links:

* https://pmc.ncbi.nlm.nih.gov/articles/PMC3130626/
* https://pmc.ncbi.nlm.nih.gov/articles/PMC4471964/
* https://health.clevelandclinic.org/mthfr-gene-mutation
* https://www.psychiatry.org/psychiatrists/practice/dsm
* https://www.frontiersin.org/journals/psychiatry/articles/10.3389/fpsyt.2019.00249/full
* https://www.nature.com/articles/mp20116
* https://pmc.ncbi.nlm.nih.gov/articles/PMC6469455/

In [None]:
# Depression dataset preview
import pandas as pd

d = pd.read_csv('/content/Updated_Depression_Dataset.csv')
d.head().T

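|                  | 0           | 1            | 2             | 3            | 4            |
| ---------------- | ----------- | ------------ | ------------- | ------------ | ------------ |
| Age              | 23          | 39           | 28            | 65           | 33           |
| SleepDuration    | 6.9         | 6.0          | 6.3           | 8.7          | 4.8          |
| Genotype_5HTTLPR | Short/Short | Long/Long    | Short/Short   | Long/Long    | Short/Short  |
| Genotype_COMT    | Val/Val     | Met/Met      | Val/Val       | Val/Val      | Val/Val      |
| Genotype_MAOA    | Low Activity | Low Activity | High Activity | Low Activity | Low Activity |
| Cortisol         | 9.6         | 10.3         | 16.1          | 9.0          | 10.9         |
| BDNF_Level       | 13.04       | 14.13        | 10.84         | 14.45        | 9.53         |
| CRP              | 1.87        | 1.17         | 2.91          | 1.71         | 2.35         |
| Vitamin_D        | -0.3        | 19.9         | 30.9          | 29.8         | 22.6         |
| Tryptophan       | 56.8        | 32.9         | 59.9          | 32.1         | 51.7         |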
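
A simple range check can flag values in the preview that fall outside the ranges documented in the data dictionary. The sketch below is illustrative: pairing the dictionary names (for example `VitaminD_Level`) with the preview column names (for example `Vitamin_D`) is an assumption, and `out_of_range` is a hypothetical helper.

```python
# Ranges copied from the data dictionary; the mapping to the preview's column
# names is an assumption, since the two use different spellings.
DOCUMENTED_RANGES = {
    "Cortisol": (5, 30),
    "BDNF_Level": (4, 40),
    "CRP": (0.1, 10),
    "Vitamin_D": (5, 50),
    "Tryptophan": (20, 100),
}


def out_of_range(record, ranges=DOCUMENTED_RANGES):
    """Return {column: value} for every value outside its documented range."""
    return {
        col: record[col]
        for col, (low, high) in ranges.items()
        if col in record and not low <= record[col] <= high
    }


# First row of the preview above.
first_row = {
    "Cortisol": 9.6,
    "BDNF_Level": 13.04,
    "CRP": 1.87,
    "Vitamin_D": -0.3,
    "Tryptophan": 56.8,
}
print(out_of_range(first_row))  # {'Vitamin_D': -0.3}
```

For the first preview row, the check reports the negative `Vitamin_D` value, which is worth reviewing before training.
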


## **Model And Prediction**

For details, refer to the following Colab notebook:
https://colab.research.google.com/drive/11vladlhbeFTxCqMkT0-tJyStoNgjH9Se?usp=drive_link

In [None]:
import pandas as pd
from sklearn.model_selection import train_test_split, RandomizedSearchCV
from sklearn.preprocessing import OneHotEncoder, StandardScaler, LabelEncoder
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.metrics import classification_report, accuracy_score
from imblearn.pipeline import Pipeline as ImbPipeline
from imblearn.over_sampling import SMOTE
from xgboost import XGBClassifier
from sklearn.ensemble import RandomForestClassifier

In [None]:
# Step 1: Load dataset
d = pd.read_csv("/content/Balanced_Depression_Dataset.csv")

In [None]:
# Step 2: Encode target label
target_col = "DepressionDiagnosis"
label_encoder = LabelEncoder()
d[target_col] = label_encoder.fit_transform(d[target_col])

In [None]:
# Step 3: Separate features and target
X = d.drop(target_col, axis=1)
y = d[target_col]

In [None]:
# Step 4: Identify categorical columns
categorical_cols = X.select_dtypes(include=['object']).columns.tolist()

In [None]:
# Step 5: Preprocessing for categorical variables
preprocessor = ColumnTransformer([
    ('cat', OneHotEncoder(handle_unknown='ignore'), categorical_cols)
], remainder='passthrough')
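
With `handle_unknown='ignore'`, a category that was not seen during fitting is encoded as all zeros instead of raising an error. The minimal sketch below illustrates this on toy values taken from the dataset preview; it fits a standalone `OneHotEncoder` rather than the project's `preprocessor`.

```python
from sklearn.preprocessing import OneHotEncoder

# Fit on the two genotype spellings visible in the dataset preview.
encoder = OneHotEncoder(handle_unknown="ignore")
encoder.fit([["Long/Long"], ["Short/Short"]])

# A known category gets a single 1; an unseen spelling such as 'SS' gets all zeros.
known = encoder.transform([["Short/Short"]]).toarray()
unknown = encoder.transform([["SS"]]).toarray()

print(list(encoder.categories_[0]))
print(known.tolist())
print(unknown.tolist())
```

Because the unseen value is ignored silently, inputs at prediction time need to use exactly the spellings present in the training data.
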

In [None]:
# Step 6: Train-test split (stratified)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
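
The effect of `stratify=y` can be illustrated on toy labels. In this sketch the class names come from the classification report in this notebook, while the figure of 150 rows per class is an assumption chosen so that a 20% test split yields 30 rows per class.

```python
from collections import Counter

from sklearn.model_selection import train_test_split

classes = [
    "Atypical Depression",
    "False",
    "Major Depressive Disorder",
    "Persistent Depressive Disorder",
    "Psychotic Depression",
    "Seasonal Affective Disorder",
]
# Toy data: 150 labels per class (assumed), with row indices standing in for features.
toy_y = [label for label in classes for _ in range(150)]
toy_X = list(range(len(toy_y)))

_, toy_X_test, _, toy_y_test = train_test_split(
    toy_X, toy_y, test_size=0.2, random_state=42, stratify=toy_y
)

# Stratification keeps the class proportions identical in the test split.
test_counts = Counter(toy_y_test)
print(len(toy_y_test), dict(test_counts))
```
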

In [None]:
# Step 7: Create pipeline with SMOTE and model
pipeline = ImbPipeline(steps=[
    ('preprocessor', preprocessor),
    ('smote', SMOTE(random_state=42)),
    ('classifier', RandomForestClassifier(random_state=42))
])

In [None]:
# Step 8: Train the model
pipeline.fit(X_train, y_train)

The format of the columns of the 'remainder' transformer in ColumnTransformer.transformers_ will change in version 1.7 to match the format of the other transformers.
At the moment the remainder columns are stored as indices (of type int). With the same ColumnTransformer configuration, in the future they will be stored as column names (of type str).



In [None]:
# Save the fitted pipeline using joblib
# (joblib is installed as a dependency of scikit-learn, so no separate install is needed)
import joblib

joblib.dump(pipeline, "DepressionModel.joblib")



['DepressionModel.joblib']

In [None]:
#To load the model later:
depression_model = joblib.load("/content/DepressionModel.joblib")

### Accuracy of RandomForest + SMOTE

In [None]:
# Step 9: Evaluate
y_pred = pipeline.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)

In [None]:
print("Accuracy:", accuracy)

Accuracy: 0.7555555555555555


In [None]:
print(classification_report(y_test, y_pred, target_names=label_encoder.classes_))

                                precision    recall  f1-score   support

           Atypical Depression       0.76      0.87      0.81        30
                         False       0.46      0.43      0.45        30
     Major Depressive Disorder       0.74      0.83      0.78        30
Persistent Depressive Disorder       0.83      0.80      0.81        30
          Psychotic Depression       0.90      0.87      0.88        30
   Seasonal Affective Disorder       0.85      0.73      0.79        30

                      accuracy                           0.76       180
                     macro avg       0.76      0.76      0.75       180
                  weighted avg       0.76      0.76      0.75       180
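
The per-class columns in the report follow directly from counts of correct and incorrect predictions. The sketch below recomputes precision, recall, and F1 in plain Python on toy labels (not the notebook's predictions); `per_class_report` and `macro_f1` are hypothetical helper names.

```python
def per_class_report(y_true, y_pred):
    """Return {label: (precision, recall, f1)} computed from two label lists."""
    report = {}
    for label in sorted(set(y_true) | set(y_pred)):
        true_pos = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)  # column total
        actual = sum(t == label for t in y_true)     # support
        precision = true_pos / predicted if predicted else 0.0
        recall = true_pos / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = (precision, recall, f1)
    return report


def macro_f1(report):
    """Unweighted mean of the per-class F1 scores (the 'macro avg' row)."""
    return sum(f1 for _, _, f1 in report.values()) / len(report)


toy_true = ["A", "A", "A", "B", "B", "C"]
toy_pred = ["A", "A", "B", "B", "C", "C"]
toy_report = per_class_report(toy_true, toy_pred)
for label, (precision, recall, f1) in toy_report.items():
    print(label, round(precision, 2), round(recall, 2), round(f1, 2))
print("macro F1:", round(macro_f1(toy_report), 2))
```
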



In [None]:
# Save the fitted label encoder using joblib so predictions can be decoded later
import joblib

joblib.dump(label_encoder, "DepressionEncoder.joblib")



['DepressionEncoder.joblib']

In [None]:
# To load the label encoder later:
depression_encoder = joblib.load("/content/DepressionEncoder.joblib")

### Prediction

In [None]:
## Step 10: Prediction on user given data

def predict_depression():
    """This function predicts the depression type from values such as
    Age: Age of a person (int)
    SleepDuration: Duration of sleep in hours (float)
    Genotype_5HTTLPR: Genotype of a person ('LL', 'LS', 'SS')
    Genotype_COMT: Genotype of a person ('Val/Val', 'Met/Met', 'Val/Met')
    Genotype_MAOA: Genotype of a person ('Low', 'High')
    Cortisol: Cortisol level of a person (float)
    BDNF_Level: BDNF level of a person (float)
    CRP: CRP level of a person (float)
    Vitamin_D: Vitamin D level of a person (float)
    Tryptophan: Tryptophan level of a person (float)
    Omega3_Index: Omega-3 index of a person (float)
    MTHFR_Genotype: Genotype of a person ('CC', 'CT', 'TT')
    Neuroinflammation_Score: Neuroinflammation score of a person (float)
    Monoamine_Oxidase_Level: Monoamine Oxidase level of a person (float)
    Serotonin_Level: Serotonin level of a person (float)
    HPA_Axis_Dysregulation: HPA Axis Dysregulation of a person (float)
    DepressionScore_PHQ9: Depression Score (PHQ-9) of a person (int)

    Note: categorical inputs must be spelled exactly as in the training CSV
    (the dataset preview shows e.g. 'Short/Short' and 'Low Activity').
    OneHotEncoder(handle_unknown='ignore') encodes any unseen value as all
    zeros without raising an error.
    """

    input_data = {
        "Age": int(input("Enter  Age: (int) ")),
        "SleepDuration": float(input("Enter sleep duration (float) :")),
        "Genotype_5HTTLPR": input("Enter Genotype_5HTTLPR ('LL', 'LS', 'SS') : "),
        "Genotype_COMT": input("Enter Genotype_COMT ('Val/Val', 'Met/Met', 'Val/Met') : "),
        "Genotype_MAOA": input("Enter Genotype_MAOA ('Low', 'High') : "),
        "Cortisol": float(input("Enter Cortisol level (float) :")),
        "BDNF_Level": float(input("Enter BDNF_Level (float) :")),
        "CRP": float(input("Enter CRP (float) :")),
        "Vitamin_D": float(input("Enter Vitamin_D level (float) :")),
        "Tryptophan": float(input("Enter Tryptophan (float) :")),
        "Omega3_Index": float(input("Enter Omega3_Index (float) :")),
        "MTHFR_Genotype": input("Enter MTHFR_Genotype ('CC', 'CT', 'TT') : "),
        "Neuroinflammation_Score": float(input("Enter Neuroinflammation_Score (float) :")),
        "Monoamine_Oxidase_Level": float(input("Enter Monoamine_Oxidase_Level (float) :")),
        "Serotonin_Level": float(input("Enter Serotonin_Level (float) :")),
        "HPA_Axis_Dysregulation": float(input("Enter HPA_Axis_Dysregulation (float) :")),
        "DepressionScore_PHQ9": int(input("Enter DepressionScore_PHQ9 (int) :"))
    }

    # Convert to DataFrame
    user_df = pd.DataFrame([input_data])

    # Predict using the pipeline
    prediction = depression_model.predict(user_df)
    predicted_label = depression_encoder.inverse_transform(prediction)
    print("Predicted Depression Type:", predicted_label[0])

    return predicted_label[0]
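
Because unseen categorical values are ignored silently by the encoder, it can help to validate user input before calling the model. The following is a minimal sketch: the allowed sets are copied from the prompts in `predict_depression()` and would need to be replaced with the spellings actually present in the training CSV; `validate_categoricals` is a hypothetical helper.

```python
# Allowed values copied from the input prompts; treat them as placeholders
# until they are checked against the categories in the training data.
ALLOWED = {
    "Genotype_5HTTLPR": {"LL", "LS", "SS"},
    "Genotype_COMT": {"Val/Val", "Val/Met", "Met/Met"},
    "Genotype_MAOA": {"Low", "High"},
    "MTHFR_Genotype": {"CC", "CT", "TT"},
}


def validate_categoricals(record, allowed=ALLOWED):
    """Return a list of human-readable problems; an empty list means the record is valid."""
    problems = []
    for column, valid_values in allowed.items():
        value = record.get(column)
        if value not in valid_values:
            problems.append(
                f"{column}: got {value!r}, expected one of {sorted(valid_values)}"
            )
    return problems


sample = {
    "Genotype_5HTTLPR": "Short/Short",  # not in the allowed set above
    "Genotype_COMT": "Val/Val",
    "Genotype_MAOA": "Low",
    "MTHFR_Genotype": "TT",
}
for problem in validate_categoricals(sample):
    print(problem)
```
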

In [None]:
Depression_Type=predict_depression()

Enter  Age: (int) 23
Enter sleep duration (float) :3.4
Enter Genotype_5HTTLPR ('LL', 'LS', 'SS') : LL
Enter Genotype_COMT ('Val/Val', 'Met/Met', 'Val/Met') : Val/Val
Enter Genotype_MAOA ('Low', 'High') : Low
Enter Cortisol level (float) :45.3
Enter BDNF_Level (float) :56.2
Enter CRP (float) :33.2
Enter Vitamin_D level (float) :34.4
Enter Tryptophan (float) :12.2
Enter Omega3_Index (float) :67
Enter MTHFR_Genotype ('CC', 'CT', 'TT') : TT
Enter Neuroinflammation_Score (float) :3.2
Enter Monoamine_Oxidase_Level (float) :23
Enter Serotonin_Level (float) :21
Enter HPA_Axis_Dysregulation (float) :3
Enter DepressionScore_PHQ9 (int) :2
Predicted Depression Type: Seasonal Affective Disorder


#2. 🔆**Anxiety Analysis**


**Dataset Link**: [`https://drive.google.com/file/d/18TOHEtRQenraMzq63FtV0qXXbvJ0y7J9/view?usp=drive_link`](https://drive.google.com/file/d/18TOHEtRQenraMzq63FtV0qXXbvJ0y7J9/view?usp=drive_link)

<br>

🧠 **Anxiety Prediction Dataset Information is as Follows**
<br>


✅**Overview**

This dataset aims to predict and classify types of anxiety disorders (e.g., GAD, Social Anxiety, Panic Disorder) using behavioral, genetic, physiological, and lifestyle-related markers. It contains enriched information with detailed definitions suitable for clinical insights and machine learning predictions.

<br>

📋 **Column Descriptions/Data Dictionary**:

| Column Name               | Description                              | Possible Values                                                                                                                                                                                                                                                                                                                                                   | What Each Value Means                                                                          |
| ------------------------- | ---------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------- |
| `Age`                     | Age in years                             | 18–80                                                                                                                                                                                                                                                                                                                                                             | Age range of participants. Some anxiety types are more common in youth (e.g., Social Anxiety). |
| `Sex`                     | Biological sex                           | **Male** – Assigned male at birth<br>**Female** – Assigned female at birth                                                                                                                                                                                                                                                                                        | Anxiety disorders are generally more common in females.                                        |
| `HeartRate`               | Resting heart rate (BPM)                 | 50–120                                                                                                                                                                                                                                                                                                                                                            | Elevated heart rate (>90 BPM) is often seen in panic or generalized anxiety.                   |
| `BloodPressure_Systolic`  | Systolic BP (top number)                 | 90–180                                                                                                                                                                                                                                                                                                                                                            | Stress and anxiety can elevate BP. Very high values may indicate chronic anxiety.              |
| `BloodPressure_Diastolic` | Diastolic BP (bottom number)             | 60–110                                                                                                                                                                                                                                                                                                                                                            | Combined with systolic to measure stress response.                                             |
| `CortisolLevel`           | Cortisol (µg/dL)                         | 5–30                                                                                                                                                                                                                                                                                                                                                              | Cortisol is a stress hormone. Chronic anxiety may lead to high baseline levels.                |
| `CRP`                     | Inflammation marker (mg/L)               | 0.1–10                                                                                                                                                                                                                                                                                                                                                            | Chronic anxiety may be linked to low-level systemic inflammation.                              |
| `GAD1_Genotype`           | Genetic marker linked to GABA regulation | **CC** – Normal function<br>**CT** – Slightly reduced GABA activity<br>**TT** – Poor GABA regulation, higher anxiety risk                                                                                                                                                                                                                                         | GABA is the brain’s calming neurotransmitter. TT is associated with higher anxiety.            |
| `5HTTLPR_Genotype`        | Serotonin transporter gene               | **LL** – Low anxiety risk<br>**LS** – Medium risk<br>**SS** – High risk under stress                                                                                                                                                                                                                                                                              | SS type correlates with anxiety sensitivity and panic.                                         |
| `MAOA_Genotype`           | Monoamine oxidase A gene                 | **High-activity** – Breaks down neurotransmitters efficiently<br>**Low-activity** – Slower breakdown, linked to emotional reactivity                                                                                                                                                                                                                              | Low-activity variants are linked to increased emotional intensity and anxiety.                 |
| `SleepQuality`            | Overall sleep quality score (1–10)       | 1–10                                                                                                                                                                                                                                                                                                                                                              | Lower scores indicate poor sleep, linked to higher anxiety levels.                             |
| `SleepDuration`           | Average hours of sleep/night             | 2–12                                                                                                                                                                                                                                                                                                                                                              | Short (<6 hrs) or excessive (>10 hrs) sleep can worsen anxiety symptoms.                       |
| `DietType`                | General diet pattern                     | **Western** – Processed, high sugar/fat<br>**Mediterranean** – Whole foods, healthy fats<br>**Vegetarian** – No meat<br>**Balanced** – Mix of all                                                                                                                                                                                                                 | Mediterranean diets reduce anxiety symptoms. Western diets are pro-inflammatory.               |
| `CaffeineIntake`          | Caffeine consumption/day (mg)            | 0–800                                                                                                                                                                                                                                                                                                                                                             | High caffeine (>400 mg) can trigger anxiety attacks or restlessness.                           |
| `AlcoholUse`              | Weekly alcohol consumption               | **None** – 0 drinks<br>**Moderate** – 1–7<br>**High** – >7                                                                                                                                                                                                                                                                                                        | High alcohol use is linked to social and generalized anxiety.                                  |
| `SmokingStatus`           | Smoking habit                            | **Never** – Never smoked<br>**Former** – Quit smoking<br>**Current** – Currently smokes                                                                                                                                                                                                                                                                           | Nicotine can cause or worsen anxiety. Current smokers are at greater risk.                     |
| `PhysicalActivity`        | Exercise level                           | **Low** – Rarely exercises<br>**Moderate** – 2–4 times/week<br>**High** – 5+ times/week                                                                                                                                                                                                                                                                           | Regular physical activity helps reduce anxiety through stress hormone regulation.              |
| `PanicAttackFrequency`    | Times panic attacks occur per week       | 0–10                                                                                                                                                                                                                                                                                                                                                              | Panic Disorder is identified with recurrent, unexpected attacks.                               |
| `AvoidanceBehavior`       | Avoids social or triggering situations   | **Yes** – Avoids<br>**No** – Does not avoid                                                                                                                                                                                                                                                                                                                       | Common in Social Anxiety Disorder and Agoraphobia.                                             |
| `SocialInteractionLevel`  | Social activity level                    | **Low** – Rarely social<br>**Moderate** – Occasionally social<br>**High** – Very social                                                                                                                                                                                                                                                                           | Social avoidance often indicates Social Anxiety.                                               |
| `ConcentrationIssues`     | Attention/concentration difficulty       | **None**, **Mild**, **Severe**                                                                                                                                                                                                                                                                                                                                    | Trouble focusing is common in GAD and Panic Disorder.                                          |
| `MuscleTension`           | Physical symptom of anxiety              | **Yes**, **No**                                                                                                                                                                                                                                                                                                                                                   | Muscle tension is a physiological marker for GAD.                                              |
| `FamilyHistory_Anxiety`   | Family history of anxiety                | **Yes**, **No**                                                                                                                                                                                                                                                                                                                                                   | Anxiety may run in families.                                                                   |
| `IntrusiveThoughts`       | Recurring anxious thoughts               | **None** – No intrusive thoughts<br>**Occasional** – Sometimes happens<br>**Frequent** – Recurring thoughts impacting daily life                                                                                                                                                                                                                                  | Frequent thoughts suggest OCD or Generalized Anxiety.                                          |
| `Hypervigilance`          | Alertness or jumpiness                   | **Yes**, **No**                                                                                                                                                                                                                                                                                                                                                   | Hypervigilance is common in PTSD and Panic Disorders.                                          |
| `StartleResponse`         | Strong reaction to noise/stimuli         | **Low**, **Moderate**, **High**                                                                                                                                                                                                                                                                                                                                   | High startle response is a sign of PTSD or Panic Disorder.                                     |
| `AnxietyDiagnosis`        | Final clinical diagnosis                 | **False** – No anxiety<br>**Generalized Anxiety Disorder (GAD)** – Chronic worry, restlessness<br>**Social Anxiety Disorder** – Fear of being judged/social situations<br>**Panic Disorder** – Sudden panic attacks<br>**Agoraphobia** – Fear of crowded or open spaces<br>**OCD** – Obsessive thoughts & repetitive behaviors<br>**PTSD** – Anxiety after trauma | These diagnoses are based on behavior, symptoms, and genetic indicators.                       |
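
The thresholds mentioned in the table above can be turned into simple descriptive flags. The following is a minimal sketch only: the function name `anxiety_flags` and the example record are hypothetical, and the flags are illustrative markers taken from the table text, not a diagnostic rule used by the project.

```python
def anxiety_flags(record):
    """Return descriptive flags for values the data dictionary calls out."""
    flags = []
    # Resting heart rate above 90 BPM is described as elevated.
    if record["HeartRate"] > 90:
        flags.append("elevated resting heart rate")
    # More than 400 mg of caffeine per day is described as high.
    if record["CaffeineIntake"] > 400:
        flags.append("high caffeine intake")
    # Fewer than 6 or more than 10 hours of sleep is described as problematic.
    if record["SleepDuration"] < 6 or record["SleepDuration"] > 10:
        flags.append("short or excessive sleep")
    # Genotypes the table associates with higher anxiety risk.
    if record["GAD1_Genotype"] == "TT":
        flags.append("GAD1 TT genotype")
    if record["5HTTLPR_Genotype"] == "SS":
        flags.append("5HTTLPR SS genotype")
    return flags


# Hypothetical record, for illustration only.
example = {
    "HeartRate": 96,
    "CaffeineIntake": 250,
    "SleepDuration": 5.5,
    "GAD1_Genotype": "CT",
    "5HTTLPR_Genotype": "SS",
}
print(anxiety_flags(example))
```
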


<br>

🔬 **Step-by-Step Methodology**

1. Source Research
Studied classification criteria for anxiety types from:

* DSM-5 (Diagnostic and Statistical Manual of Mental Disorders)

* NIH/NIMH articles, PubMed journals, and Mayo Clinic descriptions

* Data points and risk factors from open datasets on anxiety and stress (e.g., Mental Health in Tech 2020)

2. Diagnosis Classification
Replaced AnxietyDiagnosis True/False with:

* GAD (Generalized Anxiety Disorder)

* Social Anxiety Disorder

* Panic Disorder

* OCD

* PTSD

* Agoraphobia

* False (no diagnosed anxiety)

Balanced dataset with 150–180 records per disorder class + ~300 "False" (no anxiety), ensuring realistic proportions.

3. Added Clinically Relevant Columns

Included:

* Genetics: GAD1_Genotype, MAOA_Genotype, 5HTTLPR_Genotype

* Symptoms & behaviors: MuscleTension, Hypervigilance, StartleResponse, AvoidanceBehavior

* Lifestyle: DietType, SleepQuality, SleepDuration, CaffeineIntake

* Psychological history: IntrusiveThoughts, ConcentrationIssues, FamilyHistory_Anxiety

4. Simulated Realistic Values
Used condition-specific patterns:

* Panic Disorder → high heart rate, high startle response

* OCD → frequent intrusive thoughts, moderate cortisol

* PTSD → high hypervigilance, high startle, poor sleep

* Social Anxiety → low social interaction, avoidance behavior

* GAD → chronic muscle tension, moderate cortisol, poor concentration

5. Validation

* Manually checked balance and class distributions.

* Confirmed logical consistency (e.g., no person with "High Social Interaction" + "AvoidanceBehavior = Yes").
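
The validation step above could be automated along these lines. This is a sketch in plain Python under stated assumptions: the three records are made up for illustration, and the helper names `class_distribution` and `inconsistent_rows` are hypothetical, not part of the project code.

```python
from collections import Counter

# Hypothetical records; the third one is inconsistent on purpose.
records = [
    {"AnxietyDiagnosis": "Social Anxiety Disorder",
     "SocialInteractionLevel": "Low", "AvoidanceBehavior": "Yes"},
    {"AnxietyDiagnosis": "False",
     "SocialInteractionLevel": "High", "AvoidanceBehavior": "No"},
    {"AnxietyDiagnosis": "Panic Disorder",
     "SocialInteractionLevel": "High", "AvoidanceBehavior": "Yes"},
]


def class_distribution(rows):
    """Count how many rows fall into each diagnosis class."""
    return Counter(row["AnxietyDiagnosis"] for row in rows)


def inconsistent_rows(rows):
    """Return indices of rows that combine high social interaction with avoidance."""
    return [
        i for i, row in enumerate(rows)
        if row["SocialInteractionLevel"] == "High"
        and row["AvoidanceBehavior"] == "Yes"
    ]


print(class_distribution(records))
print(inconsistent_rows(records))
```

Rows returned by `inconsistent_rows` would need to be corrected or regenerated before training.
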

<br>

✴️ **Value Significance Highlights (Key Predictors)**

| Feature                | Importance in Diagnosing Anxiety                                   |
| ---------------------- | ------------------------------------------------------------------ |
| `GAD1_Genotype`        | TT associated with poor GABA regulation → linked to GAD and panic. |
| `MAOA_Genotype`        | Low-activity variant = emotional reactivity → higher anxiety risk. |
| `5HTTLPR_Genotype`     | SS variant → high anxiety under stress, common in PTSD, OCD.       |
| `SleepQuality`         | Poor sleep is both a **symptom** and **worsener** of anxiety.      |
| `AvoidanceBehavior`    | Key indicator for **Social Anxiety**, **Agoraphobia**.             |
| `PanicAttackFrequency` | Direct indicator of **Panic Disorder**.                            |
| `IntrusiveThoughts`    | Sign of **OCD**, **GAD**, and sometimes **PTSD**.                  |
| `Hypervigilance`       | Flag for **PTSD**, **Panic Disorder**.                             |
| `DietType`             | Western diet linked with higher inflammation, worsening anxiety.   |
| `CortisolLevel`        | High cortisol → stress marker, used to differentiate GAD, PTSD.    |
| `ConcentrationIssues`  | Often linked with GAD and OCD, sometimes Panic.                    |

<br>


🧾 Column Name: PHQ_Anxiety_Score

📌 1. What is it?

The PHQ_Anxiety_Score refers to a score derived from a standardized anxiety screening questionnaire—typically based on GAD-7 (Generalized Anxiety Disorder 7-item scale) or anxiety-related items from the PHQ-4/PHQ-9.

It assesses how often the individual has been bothered by anxiety-related symptoms over the last two weeks.

This score is quantitative, designed to screen for and measure the severity of generalized anxiety disorder (GAD) symptoms.

📊 2. What values can it have?

It is a numerical score, usually ranging from 0 to 21, if modeled on the GAD-7 scale.

| **Score Range** | **Severity Level** |
| --------------- | ------------------ |
| 0–4             | Minimal anxiety    |
| 5–9             | Mild anxiety       |
| 10–14           | Moderate anxiety   |
| 15–21           | Severe anxiety     |


🧠 3. What does each score range mean?

| **Score** | **Interpretation**                                                                                   |
| --------- | ---------------------------------------------------------------------------------------------------- |
| **0–4**   | No to minimal anxiety. Most individuals are coping well or have normal stress levels.                |
| **5–9**   | Mild anxiety. Common in stressful academic/work/life situations. Monitoring advised.                 |
| **10–14** | Moderate anxiety. Indicates noticeable symptoms. Intervention or therapy may help.                   |
| **15–21** | Severe anxiety. Suggests high impairment. Often requires clinical attention, therapy, or medication. |


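🔍 4. How is it calculated?

Assuming the GAD-7 format, each of the 7 questionnaire items is answered on a four-point frequency scale:

| **Response**            | **Score** |
| ----------------------- | --------- |
| Not at all              | 0         |
| Several days            | 1         |
| More than half the days | 2         |
| Nearly every day        | 3         |

Total Score = sum of all 7 item scores, giving a range from 0 (7 × 0) to 21 (7 × 3).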
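
The scoring rule and severity bands above can be sketched in a few lines. This assumes the score is modeled on the GAD-7 scale as described; the names `gad7_score` and `severity` are illustrative and do not come from the project code.

```python
# Points per response, as listed in the scoring table.
RESPONSE_SCORES = {
    "Not at all": 0,
    "Several days": 1,
    "More than half the days": 2,
    "Nearly every day": 3,
}

# (upper bound of the band, label), as listed in the severity table.
SEVERITY_BANDS = [
    (4, "Minimal anxiety"),
    (9, "Mild anxiety"),
    (14, "Moderate anxiety"),
    (21, "Severe anxiety"),
]


def gad7_score(responses):
    """Sum the points for exactly 7 questionnaire responses."""
    if len(responses) != 7:
        raise ValueError("expected exactly 7 responses")
    return sum(RESPONSE_SCORES[r] for r in responses)


def severity(score):
    """Map a total score (0 to 21) to its severity band."""
    if not 0 <= score <= 21:
        raise ValueError("score must be between 0 and 21")
    for upper, label in SEVERITY_BANDS:
        if score <= upper:
            return label


answers = (
    ["Several days"] * 3
    + ["More than half the days"] * 2
    + ["Nearly every day", "Not at all"]
)
total = gad7_score(answers)  # 3*1 + 2*2 + 3 + 0 = 10
print(total, severity(total))
```
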


📚 **Resources & References**

* National Institute of Mental Health (NIMH) – Anxiety Disorders

* DSM-5 Criteria for Anxiety

* Genetics of Anxiety (Nature Review)

* GAD1 and GABA Research (PubMed)

* Sleep and Anxiety Connection

* Caffeine & Anxiety Study

* Social Anxiety Explained (PsychCentral)

<br>

🔗**External Links**

https://www.psychiatry.org/psychiatrists/practice/dsm

https://www.nimh.nih.gov/health/topics/anxiety-disorders

https://pubmed.ncbi.nlm.nih.gov/29727964/

https://www.sleepfoundation.org/mental-health/anxiety-and-sleep

https://pmc.ncbi.nlm.nih.gov/articles/PMC10131535/

https://pmc.ncbi.nlm.nih.gov/articles/PMC8109802/

https://www.frontiersin.org/journals/psychiatry/articles/10.3389/fpsyt.2023.1188427/full

https://clinicalepigeneticsjournal.biomedcentral.com/articles/10.1186/s13148-025-01819-x



### Model And Prediction

For more information, refer to:
https://colab.research.google.com/drive/17bsdwty3N7CMnN4XUajiaGVG5WLlRPMH?usp=drive_link


In [None]:
# Anxiety dataset preview
import pandas as pd  # needed here because the import cell appears later

a = pd.read_csv('/content/Updated_Anxiety_Dataset.csv')
a.head().T

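|                              | 0           | 1             | 2            | 3            | 4            |
| ---------------------------- | ----------- | ------------- | ------------ | ------------ | ------------ |
| Age                          | 50          | 30            | 42           | 59           | 57           |
| SleepDuration                | 8.3         | 6.7           | 8.4          | 4.8          | 7.8          |
| Genotype_5HTTLPR             | Short/Short | Short/Short   | Short/Long   | Short/Short  | Short/Long   |
| Genotype_COMT                | Met/Met     | Val/Val       | Val/Val      | Val/Met      | Val/Val      |
| Genotype_MAOA                | Low Activity | High Activity | Low Activity | Low Activity | Low Activity |
| Cortisol                     | 20.7        | 7.6           | 18.2         | 14.1         | 23.0         |
| Alpha_Amylase                | 98.6        | 114.2         | 151.2        | 81.5         | 113.1        |
| HRV (Heart Rate Variability) | 93.2        | 53.8          | 82.4         | 63.6         | 47.1         |
| GABA                         | 0.88        | 0.89          | 1.5          | 0.8          | 1.32         |
| IL6                          | 0.92        | 3.99          | 4.95         | 2.44         | 0.89         |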


In [None]:
#checking shape
a.shape

(1000, 19)

In [None]:
#value counts of target column
a["AnxietyDiagnosis"].value_counts()

| AnxietyDiagnosis             | count |
| ---------------------------- | ----- |
| False                        | 500   |
| Agoraphobia                  | 110   |
| Panic Disorder               | 103   |
| Social Anxiety Disorder      | 97    |
| Specific Phobia              | 96    |
| Generalized Anxiety Disorder | 94    |


In [None]:
import pandas as pd
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

In [None]:
# Convert string columns to categorical ordered format
for label, content in a.items():
    if pd.api.types.is_string_dtype(content) or pd.api.types.is_object_dtype(content):
        a[label] = content.astype("category").cat.as_ordered()

In [None]:
# Save mappings from category codes to original strings
category_mappings = {
    'Genotype_5HTTLPR': dict(enumerate(a['Genotype_5HTTLPR'].cat.categories)),
    'Genotype_COMT': dict(enumerate(a['Genotype_COMT'].cat.categories)),
    'Genotype_MAOA': dict(enumerate(a['Genotype_MAOA'].cat.categories)),
    'AnxietyDiagnosis': dict(enumerate(a['AnxietyDiagnosis'].cat.categories))
}
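
The code-to-label dictionaries saved above can be illustrated without pandas. The sketch below assumes that string categories are ordered lexically, which is how pandas orders inferred categories for sortable string values; the sample `labels` list and the names `code_to_label` and `label_to_code` are illustrative only.

```python
# A few sample genotype labels, as they might appear in the raw column.
labels = ["Short/Short", "Short/Short", "Short/Long", "Long/Long", "Short/Long"]

# Sorted unique labels play the role of `.cat.categories`.
categories = sorted(set(labels))

# Same shape as the entries in `category_mappings`: integer code -> label.
code_to_label = dict(enumerate(categories))

# The reverse lookup is convenient when encoding user input.
label_to_code = {label: code for code, label in code_to_label.items()}

# Equivalent of `.cat.codes` for the sample labels.
codes = [label_to_code[label] for label in labels]

print(code_to_label)
print(codes)
```

Keeping both directions of the mapping avoids index arithmetic when a label entered by a user has to be converted back to its integer code.
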


In [None]:
# Create new columns with codes for categorical data
a['Genotype_5HTTLPR_Codes'] = a['Genotype_5HTTLPR'].cat.codes
a['Genotype_COMT_Codes'] = a['Genotype_COMT'].cat.codes
a['Genotype_MAOA_Codes'] = a['Genotype_MAOA'].cat.codes
a['AnxietyDiagnosis_Codes'] = a['AnxietyDiagnosis'].cat.codes

In [None]:
# Drop original string columns
a = a.drop(['Genotype_5HTTLPR', 'Genotype_COMT', 'Genotype_MAOA', 'AnxietyDiagnosis'], axis=1)

In [None]:
# Define features and target
X = a.drop('AnxietyDiagnosis_Codes', axis=1)
y = a['AnxietyDiagnosis_Codes']

In [None]:
X_train, X_test, y_train, y_test = train_test_split(X,y,test_size=0.2,random_state=42)

In [None]:
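# Train a Random Forest model
# Note: this call fits on the full dataset (X, y), which includes the rows in X_test,
# so the test-set scores reported below are optimistic. Use model.fit(X_train, y_train)
# for an unbiased hold-out estimate.
model = RandomForestClassifier(random_state=42)
model.fit(X, y)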

In [None]:
y_preds=model.predict(X_test)

In [None]:
X_test

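|     | Age | SleepDuration | Cortisol | Alpha_Amylase | HRV (Heart Rate Variability) | GABA | IL6  | TNF_alpha | Tryptophan | Vitamin_B6 | Omega3_Index | HPA_Axis_Dysregulation | Sympathetic_Activation_Score | GABAergic_Function_Score | AnxietyScore_GAD7 | Genotype_5HTTLPR_Codes | Genotype_COMT_Codes | Genotype_MAOA_Codes |
| --- | --- | ------------- | -------- | ------------- | ---------------------------- | ---- | ---- | --------- | ---------- | ---------- | ------------ | ---------------------- | ---------------------------- | ------------------------ | ----------------- | ---------------------- | ------------------- | ------------------- |
| 521 | 39  | 6.6           | 14.6     | 80.5          | 91.3                         | 0.72 | 5.50 | 2.07      | 46.0       | 15.7       | 8.26         | 0.27                   | 0.08                         | 0.40                     | 6                 | 1                      | 1                   | 1                   |
| 737 | 33  | 4.8           | 6.2      | 83.6          | 75.6                         | 0.65 | 2.99 | 4.77      | 54.1       | 15.4       | 3.06         | 0.25                   | 0.85                         | 0.62                     | 3                 | 2                      | 2                   | 1                   |
| 740 | 69  | 6.8           | 21.4     | 93.3          | 96.0                         | 1.23 | 4.92 | 3.68      | 47.8       | 19.1       | 4.26         | 0.73                   | 0.58                         | 0.91                     | 11                | 1                      | 1                   | 1                   |
| 660 | 37  | 7.7           | 21.2     | 101.5         | 59.4                         | 0.66 | 4.18 | 2.32      | 43.8       | 15.9       | 4.47         | 0.32                   | 0.67                         | 0.31                     | 10                | 0                      | 2                   | 1                   |
| 411 | 27  | 5.3           | 22.7     | 96.9          | 75.5                         | 1.48 | 4.43 | 5.21      | 52.2       | 17.3       | 4.87         | 0.24                   | 0.81                         | 0.68                     | 16                | 1                      | 0                   | 1                   |
| ... | ... | ...           | ...      | ...           | ...                          | ...  | ...  | ...       | ...        | ...        | ...          | ...                    | ...                          | ...                      | ...               | ...                    | ...                 | ...                 |
| 408 | 60  | 7.6           | 7.7      | 99.1          | 50.4                         | 0.97 | 3.13 | 3.13      | 57.4       | 24.3       | 2.27         | 0.67                   | 0.71                         | 0.59                     | 2                 | 2                      | 0                   | 1                   |
| 332 | 21  | 5.2           | 20.8     | 159.1         | 46.0                         | 1.12 | 2.95 | 5.72      | 67.2       | 20.5       | 8.55         | 0.16                   | 0.65                         | 0.08                     | 0                 | 1                      | 1                   | 0                   |
| 208 | 45  | 4.5           | 6.7      | 128.1         | 61.7                         | 0.64 | 5.34 | 4.43      | 44.6       | 25.2       | 3.48         | 0.31                   | 0.14                         | 0.82                     | 10                | 2                      | 1                   | 1                   |
| 613 | 59  | 5.2           | 16.7     | 128.8         | 40.8                         | 0.67 | 3.20 | 4.78      | 50.0       | 18.5       | 9.15         | 0.49                   | 0.41                         | 0.89                     | 2                 | 1                      | 0                   | 1                   |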


In [None]:
y_test

|     | AnxietyDiagnosis_Codes |
| --- | ---------------------- |
| 521 | 0                      |
| 737 | 1                      |
| 740 | 1                      |
| 660 | 1                      |
| 411 | 2                      |
| ... | ...                    |
| 408 | 1                      |
| 332 | 1                      |
| 208 | 2                      |
| 613 | 1                      |


In [None]:
model.score(X_test,y_test)

1.0

In [None]:
from sklearn.metrics import classification_report

print(classification_report(y_test,y_preds))

              precision    recall  f1-score   support

           0       1.00      1.00      1.00        19
           1       1.00      1.00      1.00       101
           2       1.00      1.00      1.00        20
           3       1.00      1.00      1.00        22
           4       1.00      1.00      1.00        20
           5       1.00      1.00      1.00        18

    accuracy                           1.00       200
   macro avg       1.00      1.00      1.00       200
weighted avg       1.00      1.00      1.00       200



In [None]:
# Save the trained model using joblib
# (joblib is installed as a scikit-learn dependency; run `!pip install joblib` only if the import fails)
import joblib

joblib.dump(model, "AnxietyModel.joblib")



['AnxietyModel.joblib']

### Prediction

In [None]:
#To load the model later:
anxiety_model = joblib.load("/content/AnxietyModel.joblib")

In [None]:
for col in X.columns:
  print(col)

Age
SleepDuration
Cortisol
Alpha_Amylase
HRV (Heart Rate Variability)
GABA
IL6
TNF_alpha
Tryptophan
Vitamin_B6
Omega3_Index
HPA_Axis_Dysregulation
Sympathetic_Activation_Score
GABAergic_Function_Score
AnxietyScore_GAD7
Genotype_5HTTLPR_Codes
Genotype_COMT_Codes
Genotype_MAOA_Codes


In [None]:
# Load the saved label encoder:
anxiety_label_encoder = joblib.load("anxiety_label_encoder.joblib")

In [None]:
# Save the preprocessor using joblib
import joblib
joblib.dump(preprocessor, "anxiety_preprocessor.joblib")

['anxiety_preprocessor.joblib']

In [None]:
# User input function
def predict_anxiety():
    print("\nPlease enter the following details to predict the type of Anxiety Disorder:\n")
    user_input = {}

    for col in X.columns:
        if "Genotype" in col:
            original_col = col.replace("_Codes", "")
            options = list(category_mappings[original_col].values())
            print(f"{original_col} options: {options}")
            val = input(f"Enter {original_col}: ")
            while val not in options:
                print("Invalid input. Please choose from the given options.")
                val = input(f"Enter {original_col}: ")
            # Convert string to code
            code = list(category_mappings[original_col].keys())[options.index(val)]
            user_input[col] = code
        else:
            while True:
                try:
                    val = float(input(f"Enter {col}: "))
                    user_input[col] = val
                    break
                except ValueError:
                    print("Invalid number. Try again.")

    # Convert input to DataFrame
    input_df = pd.DataFrame([user_input])

    # Make prediction
    prediction_code = anxiety_model.predict(input_df)[0]
    prediction_label = category_mappings['AnxietyDiagnosis'][prediction_code]

    print("\n🎯 Predicted Anxiety Diagnosis:", prediction_label)
    return prediction_label

In [None]:
predict_anxiety()


Please enter the following details to predict the type of Anxiety Disorder:

Enter Age: 21
Enter SleepDuration: 2
Enter Cortisol: 22
Enter Alpha_Amylase: 21
Enter HRV (Heart Rate Variability): 32
Enter GABA: 44
Enter IL6: 12
Enter TNF_alpha: 43
Enter Tryptophan: 12
Enter Vitamin_B6: 54
Enter Omega3_Index: 21
Enter HPA_Axis_Dysregulation: 44
Enter Sympathetic_Activation_Score: 12
Enter GABAergic_Function_Score: 21
Enter AnxietyScore_GAD7: 2
Genotype_5HTTLPR options: ['Long/Long', 'Short/Long', 'Short/Short']
Enter Genotype_5HTTLPR: Long/Long
Genotype_COMT options: ['Met/Met', 'Val/Met', 'Val/Val']
Enter Genotype_COMT: Met/Met
Genotype_MAOA options: ['High Activity', 'Low Activity']
Enter Genotype_MAOA: High Activity

🎯 Predicted Anxiety Diagnosis: False


'False'
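
For batch or scripted use, the same encoding logic as `predict_anxiety()` can be applied to a dictionary instead of interactive prompts. The following is a minimal sketch, assuming the `_Codes` naming convention and the code-to-label mapping format shown earlier; `build_feature_row`, the shortened column list, and the sample values are hypothetical.

```python
def build_feature_row(raw, columns, category_mappings):
    """Turn a dict of raw values into a feature list in model column order."""
    row = []
    for col in columns:
        if col.endswith("_Codes"):
            # Categorical feature: look up the integer code for the label.
            original = col[: -len("_Codes")]
            label_to_code = {
                label: code
                for code, label in category_mappings[original].items()
            }
            value = raw[original]
            if value not in label_to_code:
                raise ValueError(f"{original}: unknown option {value!r}")
            row.append(label_to_code[value])
        else:
            # Numeric feature: accept numbers or numeric strings.
            row.append(float(raw[col]))
    return row


# Shortened, hypothetical column list and mapping for illustration.
columns = ["Age", "SleepDuration", "Genotype_MAOA_Codes"]
category_mappings = {"Genotype_MAOA": {0: "High Activity", 1: "Low Activity"}}
raw = {"Age": "21", "SleepDuration": 6.5, "Genotype_MAOA": "Low Activity"}

row = build_feature_row(raw, columns, category_mappings)
print(row)
```

With the real column list, the resulting row could be wrapped in a one-row DataFrame with the same column names and passed to the model's `predict` method, as the interactive function does.
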

In [None]:
import joblib

# Save model
joblib.dump(model, "AnxietyModel.joblib")

# Save mappings and column structure
joblib.dump({
    'category_mappings': category_mappings,
    'columns': X.columns.tolist()
}, "AnxietyMetadata.joblib")


#3. 🧬**Bipolar Disorder Analysis**


**Dataset Link**: [`https://drive.google.com/file/d/1hz-IdScbQzkEg8VPaLoWZXifzLEVLLWH/view?usp=sharing`](https://drive.google.com/file/d/1hz-IdScbQzkEg8VPaLoWZXifzLEVLLWH/view?usp=sharing)

<br>


🧠**Bipolar Disorder Prediction Dataset Information Is as Follows**
<br>

🧾 **Data Dictionary for Bipolar Disorder Dataset**
<br>

| **Column Name**                 | **Description**                                           | **Possible Values**                     | **Value Explanation**                                                                                                       | **Relationship to Bipolar Disorder**                                             |
| ------------------------------- | --------------------------------------------------------- | --------------------------------------- | --------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------- |
| **Age**                         | Age of the individual                                     | 18–65                                   | Numerical value (years)                                                                                                     | BD often begins in young adulthood; early onset may predict severity.            |
| **Sex**                         | Biological sex of the individual                          | `M`, `F`                                | M = Male, F = Female                                                                                                        | BD-I slightly more common in males; BD-II and cyclothymia more in females.       |
| **Family\_History**             | Whether individual has family members with mood disorders | `Yes`, `No`                             | Yes = Family history present; No = No known history                                                                         | Strong genetic component; family history increases BD risk significantly.        |
| **ANK3\_rs10994336**            | Genetic variant (SNP) in ANK3 gene                        | `AA`, `AG`, `GG`                        | A and G are alleles (gene versions); combinations indicate genotype                                                         | G allele is associated with increased risk of BD, especially BD-I.               |
| **CACNA1C\_rs1006737**          | SNP in CACNA1C gene, involved in brain calcium channels   | `AA`, `AG`, `GG`                        | A = risk allele, G = normal; AG and AA = increased BD risk                                                                  | A allele linked to mood dysregulation, depression, BD-II.                        |
| **ODZ4\_rs12576775**            | SNP in ODZ4 gene, linked to neuron development            | `CC`, `CT`, `TT`                        | C = common, T = risk; CT/TT may indicate susceptibility                                                                     | Variants in this gene have been implicated in BD-I and BD-II.                    |
| **Glutamate\_Level**            | Level of glutamate (a neurotransmitter) in blood/brain    | `High`, `Normal`, `Low`                 | High = Excessive glutamate activity; Low = reduced levels                                                                   | High glutamate is linked with manic episodes and BD-I.                           |
| **Tryptophan\_Metabolites**     | Metabolites from tryptophan, precursor to serotonin       | `Altered`, `Normal`                     | Altered = Disrupted serotonin/melatonin pathway                                                                             | Imbalances affect mood regulation; common in BD-II and cyclothymia.              |
| **Cortisol\_Level**             | Hormone related to stress (via adrenal glands)            | `Elevated`, `Normal`                    | Elevated = Higher stress hormone levels                                                                                     | BD-I and BD-II often show elevated cortisol due to HPA axis dysfunction.         |
| **Circadian\_Gene\_Disruption** | Disruption in genes controlling biological clock          | `Yes`, `No`                             | Yes = Disrupted sleep/wake cycle genes                                                                                      | Sleep/circadian rhythm disruption is a hallmark of bipolar episodes.             |
| **Mitochondrial\_Dysfunction**  | Issues in energy-producing parts of cells                 | `Yes`, `No`                             | Yes = Mitochondrial dysfunction markers detected                                                                            | BD patients often show cellular energy imbalance; especially BD-I.               |
| **Neuroinflammation**           | Immune system activity in the brain                       | `Yes`, `No`                             | Yes = Inflammatory markers like cytokines detected                                                                          | Brain inflammation contributes to mood instability, especially in BD.            |
| **Omega3\_Intake**              | Dietary intake of omega-3 fatty acids                     | `Low`, `Adequate`, `High`               | Low = Not enough omega-3s; Adequate = sufficient levels                                                                     | Low intake linked with more frequent mood swings and depression in BD.           |
| **Folate\_Level**               | Level of Vitamin B9 (Folate) in the body                  | `Deficient`, `Normal`                   | Deficient = Not enough folate; Normal = healthy level                                                                       | Low folate worsens depressive symptoms; common in BD patients.                   |
| **VitaminD\_Level**             | Level of Vitamin D in the body                            | `Deficient`, `Normal`                   | Deficient = Low levels of vitamin D                                                                                         | Vitamin D supports mood and brain health; deficiency seen in BD-I & Cyclothymia. |
| **Average\_Sleep\_Hours**       | Average sleep hours per day                               | 3–8 hours                               | Numeric (no labels)                                                                                                         | BD patients sleep less during manic episodes; sleep cycles are irregular.        |
| **Physical\_Activity\_Level**   | Daily physical activity level                             | `Low`, `Moderate`, `High`               | Low = sedentary; Moderate = regular movement; High = very active                                                            | Sedentary lifestyle is common in BD due to depressive or manic states.           |
| **BD\_Type**                    | Type of bipolar disorder (target variable)                | `BD-I`, `BD-II`, `Cyclothymia`, `False` | BD-I = Full manic episodes; BD-II = Hypomania + depression; Cyclothymia = Milder mood swings; False = Unaffected individual | Classification target. Helps model predict BD type or absence.                   |



<br>

🧠 **Key Genetics Terms**
* SNP (Single Nucleotide Polymorphism): A small genetic variation in a DNA sequence. Everyone has them, but some are linked to health conditions.

* Genotype Notation (e.g., AA, AG, GG):

  * AA: Both alleles are the "A" variant

  * AG: One "A" and one "G" allele (heterozygous)

  * GG: Both alleles are the "G" variant

  * Risk is often higher in heterozygous or homozygous mutant individuals (e.g., AA or AG for CACNA1C)
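
The genotype notation above can be made concrete with a short sketch. The helper `count_risk_alleles` is hypothetical (it is not part of this notebook); it simply counts how many copies of a given risk allele a two-letter genotype carries. The risk allele `A` for CACNA1C rs1006737 is taken from the data dictionary above.

```python
def count_risk_alleles(genotype: str, risk_allele: str) -> int:
    """Return how many copies (0, 1, or 2) of risk_allele a genotype carries."""
    genotype = genotype.strip().upper()
    if len(genotype) != 2:
        raise ValueError(f"Expected a two-letter genotype, got {genotype!r}")
    return genotype.count(risk_allele.upper())


# CACNA1C_rs1006737: A is the risk allele according to the data dictionary.
for g in ("AA", "AG", "GG"):
    print(g, count_risk_alleles(g, "A"))
```

A count like this is one possible numeric encoding of a SNP column; the pipeline in this notebook uses one-hot encoding instead.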

**Methodology**
1. **Class Balancing**

* 4 Classes: BD-I, BD-II, Cyclothymia, and False (unaffected)

* About 250 entries per class, to ensure balance and support robust model training.

2. **Attribute-Driven Generation Based on Evidence**

Each attribute was generated from published clinical associations with specific types of bipolar disorder. This keeps the data realistic while remaining fully synthetic and anonymous, so it is safe for experimentation.

<br>

🧬 **Feature Significance and Evidence-Based Justification**

| **Feature**                     | **Why It Matters**                              | **Value Meaning and Clinical Insight**                          |
| ------------------------------- | ----------------------------------------------- | --------------------------------------------------------------- |
| **Age**                         | Onset usually 20s–30s                           | Controlled between 18–65 to reflect realistic adult cohort      |
| **Sex**                         | BD-I = M>F; BD-II = F>M                         | Balanced randomly                                               |
| **Family\_History**             | Strong genetic link in BD                       | BD classes have more “Yes” values                               |
| **ANK3\_rs10994336**            | SNP linked to BD-I                              | Simulated genotype (AA/AG/GG), risk allele present more in BD-I |
| **CACNA1C\_rs1006737**          | SNP linked to mood disorders                    | Higher risk allele frequency in BD-II and Cyclothymia           |
| **ODZ4\_rs12576775**            | Involved in neuronal development                | Controlled based on BD subtype                                  |
| **Glutamate\_Level**            | Elevated in manic episodes                      | “High” more prevalent in BD-I and BD-II                         |
| **Tryptophan\_Metabolites**     | Tied to serotonin, mood                         | “Altered” more common in BD-II & Cyclothymia                    |
| **Cortisol\_Level**             | HPA axis dysregulation                          | “Elevated” in BD-I/II, “Normal” in controls                     |
| **Circadian\_Gene\_Disruption** | BD strongly affects sleep patterns              | “Yes” for BD, “No” for controls                                 |
| **Mitochondrial\_Dysfunction**  | Affects cellular energy and mood                | “Yes” more common in BD types                                   |
| **Neuroinflammation**           | Cytokine markers elevated in BD                 | “Yes” in BD-I/II/Cyclothymia                                    |
| **Omega3\_Intake**              | Anti-inflammatory, mood stabilizing             | “Low” associated with increased BD risk                         |
| **Folate\_Level**               | Cofactor in methylation, mood regulation        | “Deficient” more common in BD                                   |
| **VitaminD\_Level**             | Deficiency linked to depression                 | “Deficient” in BD-I and Cyclothymia                             |
| **Average\_Sleep\_Hours**       | Sleep dysregulation is diagnostic               | BD-I/II have fewer hours (<6); normal = 6–8                     |
| **Physical\_Activity\_Level**   | Sedentary behavior linked to poor mental health | “Low” activity in BD, “Moderate/High” in controls               |

<br>

🧠 **Key Design Principles**
* Medical Literature-Guided Simulation: Values are constrained based on observed distributions in real-world BD studies (PGC, dbBIP, BDgene).

* Controlled Correlation Between Features and Target Class: The generation logic ensures that features relevant to a class (e.g., BD-I) show up more frequently but not exclusively (to avoid overfitting).
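
The "more frequently but not exclusively" principle can be illustrated with class-conditional sampling. The sketch below is not the generator used to build this dataset: the probabilities in `P_HIGH_GLUTAMATE` and the helper `make_rows` are hypothetical, chosen only to show the idea with a single feature.

```python
import random
from collections import Counter

CLASSES = ["BD-I", "BD-II", "Cyclothymia", "False"]

# Hypothetical probability that Glutamate_Level is "High" for each class.
P_HIGH_GLUTAMATE = {"BD-I": 0.7, "BD-II": 0.6, "Cyclothymia": 0.4, "False": 0.2}


def make_rows(n_per_class: int, seed: int = 42) -> list[dict]:
    """Generate a balanced toy dataset with one class-dependent feature."""
    rng = random.Random(seed)
    rows = []
    for label in CLASSES:
        for _ in range(n_per_class):
            level = "High" if rng.random() < P_HIGH_GLUTAMATE[label] else "Normal"
            rows.append({"Glutamate_Level": level, "BD_Type": label})
    rng.shuffle(rows)
    return rows


rows = make_rows(250)
# Class balance: every class has exactly n_per_class rows.
print(Counter(r["BD_Type"] for r in rows))
# Number of "High" values per class.
high_by_class = Counter(r["BD_Type"] for r in rows if r["Glutamate_Level"] == "High")
print(high_by_class)
```

Because each value is drawn with a probability rather than assigned by rule, a feature value shifts the odds toward a class without determining it.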

<br>

🔗 **External Links**

https://genomeinterpretation.org/cagi4-bipolar.html

https://pubmed.ncbi.nlm.nih.gov/23764453/

https://www.nimhgenetics.org/resources/genomic-data/pgc-bp

https://pubmed.ncbi.nlm.nih.gov/35779245/

https://gwas.mrcieu.ac.uk/datasets/ieu-b-41/


In [None]:
#Bipolar Dataset Preview:
b=pd.read_csv('/content/Updated_BD_Dataset.csv')
b.head().T

Unnamed: 0,0,1,2,3,4
Age,44,46,62,56,48
Sex,F,M,M,M,F
Family_History,Yes,Yes,No,Yes,Yes
ANK3_rs10994336,AA,AG,GG,AA,AA
CACNA1C_rs1006737,AA,AG,AA,GG,AG
ODZ4_rs12576775,AA,AA,AG,GG,GG
Glutamate_Level,High,Normal,Low,Low,High
Tryptophan_Metabolites,Altered,Normal,Normal,Altered,Altered
Cortisol_Level,Elevated,Normal,Normal,Normal,Elevated
Circadian_Gene_Disruption,Yes,No,No,Yes,Yes


### Random Forest + SMOTE

For more information, refer to:

https://colab.research.google.com/drive/1_OrvWcdfK51go-bMPMRF7Pq4eNdZ_T1M?usp=drive_link

In [None]:
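# Step 1: Import the libraries used below and load the dataset
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, OneHotEncoder
from imblearn.over_sampling import SMOTE
from imblearn.pipeline import Pipeline as ImbPipeline

b = pd.read_csv("/content/Updated_BD_Dataset.csv")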

In [None]:
# Step 2: Encode target label
target_col = "BD_Type"
label_encoder = LabelEncoder()
b[target_col] = label_encoder.fit_transform(b[target_col])

In [None]:
# Step 3: Separate features and target
X = b.drop(target_col, axis=1)
y = b[target_col]

In [None]:
# Step 4: Identify categorical columns
categorical_cols = X.select_dtypes(include=['object']).columns.tolist()

In [None]:
# Step 5: Preprocessing for categorical variables
preprocessor = ColumnTransformer([
    ('cat', OneHotEncoder(handle_unknown='ignore'), categorical_cols)
], remainder='passthrough')

In [None]:
# Step 6: Train-test split (stratified)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

In [None]:
# Step 7: Create pipeline with SMOTE and model
pipeline1 = ImbPipeline(steps=[
    ('preprocessor', preprocessor),
    ('smote', SMOTE(random_state=42)),
    ('classifier', RandomForestClassifier(random_state=42))
])

In [None]:
# Step 8: Train the model
pipeline1.fit(X_train, y_train)

The format of the columns of the 'remainder' transformer in ColumnTransformer.transformers_ will change in version 1.7 to match the format of the other transformers.
At the moment the remainder columns are stored as indices (of type int). With the same ColumnTransformer configuration, in the future they will be stored as column names (of type str).



In [None]:
# Step 9: Evaluate
y_pred = pipeline1.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)
print(classification_report(y_test, y_pred, target_names=label_encoder.classes_))

Accuracy: 1.0
              precision    recall  f1-score   support

        BD-I       1.00      1.00      1.00        50
       BD-II       1.00      1.00      1.00        50
 Cyclothymia       1.00      1.00      1.00        50
       False       1.00      1.00      1.00        50

    accuracy                           1.00       200
   macro avg       1.00      1.00      1.00       200
weighted avg       1.00      1.00      1.00       200



### Prediction

In [None]:
for col in X.columns:
  print(col)

Age
Sex
Family_History
ANK3_rs10994336
CACNA1C_rs1006737
ODZ4_rs12576775
Glutamate_Level
Tryptophan_Metabolites
Cortisol_Level
Circadian_Gene_Disruption
Mitochondrial_Dysfunction
Neuroinflammation
Omega3_Intake
Folate_Level
VitaminD_Level
Average_Sleep_Hours
Physical_Activity_Level


In [None]:
# Save the trained pipeline using joblib
# (if joblib is missing, run `!pip install joblib` before importing it)
import joblib

joblib.dump(pipeline1, "BDModel.joblib")



['BDModel.joblib']

In [None]:
#To load the model later:
BD = joblib.load("BDModel.joblib")

In [None]:
# Save the fitted label encoder using joblib
# (if joblib is missing, run `!pip install joblib` before importing it)
import joblib

joblib.dump(label_encoder, "BD_label_encoder.joblib")



['BD_label_encoder.joblib']

In [None]:
# To load the label encoder later:
BD_label_encoder = joblib.load("BD_label_encoder.joblib")

In [None]:
def predict_BP():
    """This function predicts the bipolar disorder type from values such as
    Age
    Sex
    Family_History
    ANK3_rs10994336
    CACNA1C_rs1006737
    ODZ4_rs12576775
    Glutamate_Level
    Tryptophan_Metabolites
    Cortisol_Level
    Circadian_Gene_Disruption
    Mitochondrial_Dysfunction
    Neuroinflammation
    Omega3_Intake
    Folate_Level
    VitaminD_Level
    Average_Sleep_Hours
    Physical_Activity_Level
    """

    input_data = {
        "Age": int(input("Enter  Age: (int) ")),
        "Sex": input("Enter Gender: (F-female ,M-Male)"),
        "Family_History": input("Enter Family_History ('Yes','No'): "),
        "ANK3_rs10994336": input("Enter ANK3_rs10994336 ('AA', 'AG', 'GG'): "),
        "CACNA1C_rs1006737": input("Enter CACNA1C_rs1006737 ('AA', 'AG', 'GG'): "),
        "ODZ4_rs12576775": input("Enter ODZ4_rs12576775 ('AA', 'AG', 'GG'): "),
        "Glutamate_Level": input("Enter Glutamate_Level: (High,Normal,Low) "),
        "Tryptophan_Metabolites": input("Enter Tryptophan_Metabolites: (Altered,Normal) "),
        "Cortisol_Level": input("Enter Cortisol_Level: (Elevated,Normal) "),
        "Circadian_Gene_Disruption": input("Enter Circadian_Gene_Disruption: (Yes, No) "),
        "Mitochondrial_Dysfunction": input("Enter Mitochondrial_Dysfunction: (Yes, No) "),
        "Neuroinflammation": input("Enter Neuroinflammation: (Yes,No) "),
        "Omega3_Intake": input("Enter Omega3_Intake: (Low,Adequate,High) "),
        "Folate_Level": input("Enter Folate_Level: (Deficient,Normal) "),
        "VitaminD_Level": input("Enter VitaminD_Level: (Deficient,Normal) "),
        "Average_Sleep_Hours": int(input("Enter Average_Sleep_Hours: (int) ")),
        "Physical_Activity_Level": input("Enter Physical_Activity_Level: (Moderate,Low,High) ")
    }

    # Convert the answers to a single-row DataFrame
    user_df = pd.DataFrame([input_data])

    # Predict using the saved pipeline, then map the code back to its label
    prediction = BD.predict(user_df)
    predicted_label = BD_label_encoder.inverse_transform(prediction)

    print("Bipolar Disorder Type :", predicted_label[0])
    return predicted_label[0]

In [None]:
BP_Type=predict_BP()

Enter  Age: (int) 32
Enter Gender: (F-female ,M-Male)F
Enter Family_History ('Yes','No'): Yes
Enter ANK3_rs10994336 ('AA', 'AG', 'GG'): AG
Enter CACNA1C_rs1006737 ('AA', 'AG', 'GG'): AA
Enter ODZ4_rs12576775 ('AA', 'AG', 'GG'): AG
Enter Glutamate_Level: (High,Normal,Low) High
Enter Tryptophan_Metabolites: (Altered,Normal) Normal
Enter Cortisol_Level: (Elevated,Normal) Elevated
Enter Circadian_Gene_Disruption: (Yes, No) Yes
Enter Mitochondrial_Dysfunction: (Yes, No) Yes
Enter Neuroinflammation: (Yes,No) No
Enter Omega3_Intake: (Low,Adequate,High) Low
Enter Folate_Level: (Deficient,Normal) Deficient
Enter VitaminD_Level: (Deficient,Normal) Normal
Enter Average_Sleep_Hours: (int) 4
Enter Physical_Activity_Level: (Moderate,Low,High) Low
Bipolar Disorder Type : BD-II
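
Because the pipeline's `OneHotEncoder(handle_unknown='ignore')` encodes an unseen category as all zeros instead of raising an error, a mistyped answer (for example `normal` instead of `Normal`) would be accepted silently and could change the prediction. A minimal sketch of a validating prompt follows; `ask_choice` and its `read` parameter are hypothetical helpers, not part of the notebook.

```python
def ask_choice(prompt: str, options: list[str], read=input) -> str:
    """Prompt until the answer matches one of the options (case-insensitive)."""
    lookup = {opt.lower(): opt for opt in options}
    while True:
        answer = read(f"{prompt} {options}: ").strip().lower()
        if answer in lookup:
            return lookup[answer]  # return the canonical spelling
        print(f"Invalid value. Choose one of {options}.")


# Example with scripted answers instead of keyboard input.
answers = iter(["normal ", "Hgih", "High"])
scripted = lambda _prompt: next(answers)
print(ask_choice("Enter Cortisol_Level", ["Elevated", "Normal"], read=scripted))
print(ask_choice("Enter Glutamate_Level", ["High", "Normal", "Low"], read=scripted))
```

In `predict_BP`, each categorical `input(...)` call could be replaced by a call of this kind with the allowed values from the data dictionary.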


## **All Combined**

Run everything in a single cell.

In [4]:
import pandas as pd
import joblib

def predict_all_disorders():
    # Load models and encoders
    depression_model = joblib.load("DepressionModel.joblib")
    depression_encoder = joblib.load("DepressionEncoder.joblib")

    BD_model = joblib.load("BDModel.joblib")
    BD_label_encoder = joblib.load("BD_label_encoder.joblib")

    anxiety_model = joblib.load("AnxietyModel.joblib")
    metadata = joblib.load("AnxietyMetadata.joblib")

    anxiety_columns = metadata['columns']
    anxiety_mappings = metadata['category_mappings']

    print("\n🧠 Welcome to the Unified Mental Health Diagnostic Tool 🧠\n")

    # --- Step 1: Shared Inputs ---
    shared_inputs = {}
    print("🔹 Enter shared inputs (these will be used for all models):\n")

    shared_inputs["Age"] = int(input("Age (int): "))
    shared_inputs["SleepDuration"] = float(input("Sleep Duration in hours (float): "))
    shared_inputs["Cortisol"] = float(input("Cortisol Level (float): "))
    shared_inputs["Vitamin_D"] = float(input("Vitamin D Level (float): "))

    # --- Step 2: Depression-specific Inputs ---
    print("\n🔹 Enter additional inputs for Depression Prediction:\n")
    depression_input = {
        **shared_inputs,
        "Genotype_5HTTLPR": input("Genotype_5HTTLPR ('Long/Long', 'Long/Short', 'Short/Short'): "),
        "Genotype_COMT": input("Genotype_COMT ('Val/Val', 'Met/Met', 'Val/Met'): "),
        "Genotype_MAOA": input("Genotype_MAOA ('Low', 'High'): "),
        "BDNF_Level": float(input("BDNF Level (float): ")),
        "CRP": float(input("CRP (float): ")),
        "Tryptophan": float(input("Tryptophan (float): ")),
        "Omega3_Index": float(input("Omega3 Index (float): ")),
        "MTHFR_Genotype": input("MTHFR_Genotype ('CC', 'CT', 'TT'): "),
        "Neuroinflammation_Score": float(input("Neuroinflammation Score (float): ")),
        "Monoamine_Oxidase_Level": float(input("Monoamine Oxidase Level (float): ")),
        "Serotonin_Level": float(input("Serotonin Level (float): ")),
        "HPA_Axis_Dysregulation": float(input("HPA Axis Dysregulation (float): ")),
        "DepressionScore_PHQ9": int(input("Depression Score PHQ-9 (int): "))
    }

    # --- Step 3: Bipolar-specific Inputs ---
    print("\n🔹 Enter additional inputs for Bipolar Disorder Prediction:\n")
    bipolar_input = {
        "Age": shared_inputs["Age"],
        "Sex": input("Sex (M or F): "),
        "Family_History": input("Family History ('Yes', 'No'): "),
        "ANK3_rs10994336": input("ANK3_rs10994336 ('AA', 'AG', 'GG'): "),
        "CACNA1C_rs1006737": input("CACNA1C_rs1006737 ('AA', 'AG', 'GG'): "),
        "ODZ4_rs12576775": input("ODZ4_rs12576775 ('AA', 'AG', 'GG'): "),
        "Glutamate_Level": input("Glutamate_Level ('High', 'Normal', 'Low'): "),
        "Tryptophan_Metabolites": input("Tryptophan_Metabolites ('Altered', 'Normal'): "),
        "Cortisol_Level": input("Cortisol_Level ('Elevated', 'Normal'): "),
        "Circadian_Gene_Disruption": input("Circadian_Gene_Disruption ('Yes', 'No'): "),
        "Mitochondrial_Dysfunction": input("Mitochondrial_Dysfunction ('Yes', 'No'): "),
        "Neuroinflammation": input("Neuroinflammation ('Yes', 'No'): "),
        "Omega3_Intake": input("Omega3_Intake ('Low', 'Adequate', 'High'): "),
        "Folate_Level": input("Folate_Level ('Deficient', 'Normal'): "),
        "VitaminD_Level": input("VitaminD_Level ('Deficient', 'Normal'): "),
        "Average_Sleep_Hours": int(shared_inputs["SleepDuration"]),
        "Physical_Activity_Level": input("Physical_Activity_Level ('Moderate', 'Low', 'High'): ")
    }

    # --- Step 4: Anxiety-specific Inputs ---
    print("\n🔹 Enter additional inputs for Anxiety Disorder Prediction:\n")
    anxiety_input = {}

    for col in anxiety_columns:
        if "Genotype" in col:
            original = col.replace("_Codes", "")
            options = list(anxiety_mappings[original].values())
            print(f"{original} options: {options}")
            val = input(f"Enter {original}: ")
            while val not in options:
                val = input(f"Invalid. Enter {original} again from {options}: ")
            # Convert to code
            code = list(anxiety_mappings[original].keys())[options.index(val)]
            anxiety_input[col] = code
        else:
            if col == "Age":
                anxiety_input[col] = shared_inputs["Age"]
            elif col == "SleepDuration":
                anxiety_input[col] = shared_inputs["SleepDuration"]
            elif col == "Cortisol":
                anxiety_input[col] = shared_inputs["Cortisol"]
            elif col == "Vitamin_D":
                anxiety_input[col] = shared_inputs["Vitamin_D"]
            else:
                while True:
                    try:
                        anxiety_input[col] = float(input(f"Enter {col} (float): "))
                        break
                    except ValueError:
                        print("Invalid input. Try again.")

    # --- Step 5: Predictions ---
    print("\n🧪 Running Predictions...\n")

    # Depression Prediction
    depression_df = pd.DataFrame([depression_input])
    depression_pred = depression_encoder.inverse_transform(depression_model.predict(depression_df))[0]

    # Bipolar Prediction
    bipolar_df = pd.DataFrame([bipolar_input])
    bipolar_pred = BD_label_encoder.inverse_transform(BD_model.predict(bipolar_df))[0]

    # Anxiety Prediction
    anxiety_df = pd.DataFrame([anxiety_input])
    anxiety_pred_code = anxiety_model.predict(anxiety_df)[0]
    anxiety_pred = anxiety_mappings['AnxietyDiagnosis'][anxiety_pred_code]

    # --- Step 6: Results ---
    print("🎯 Prediction Results:")
    print(f"  • Depression Type: {depression_pred}")
    print(f"  • Bipolar Disorder Type: {bipolar_pred}")
    print(f"  • Anxiety Disorder Type: {anxiety_pred}")

    return {
        "Depression": depression_pred,
        "BipolarDisorder": bipolar_pred,
        "Anxiety": anxiety_pred
    }
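
The genotype lookup in Step 4 converts a label back to its code by indexing two parallel lists. Assuming each entry of `category_mappings` is a dict from integer code to label (as the `anxiety_mappings['AnxietyDiagnosis'][anxiety_pred_code]` lookup suggests), an inverted dict expresses the same step more directly. The mapping contents and the `encode` helper below are made up for illustration.

```python
# Hypothetical mapping in the shape assumed above: code -> label.
genotype_mapping = {0: "Long/Long", 1: "Long/Short", 2: "Short/Short"}

# Invert once so that labels can be looked up directly.
label_to_code = {label: code for code, label in genotype_mapping.items()}


def encode(label: str) -> int:
    """Return the integer code for a label, or raise a clear error."""
    try:
        return label_to_code[label]
    except KeyError:
        raise ValueError(f"{label!r} is not one of {sorted(label_to_code)}") from None


print(encode("Long/Short"))  # 1
```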




## Recommendation path

### **End-to-End Methodology for Personalized Mental Health Intervention Based on MindGen AI® Recommendations**

The recommended pathway begins with a comprehensive diagnostic confirmation process utilizing structured clinical interviews (e.g., SCID, MINI) and standardized assessment tools tailored to each predicted condition. For depression, this involves PHQ-9 evaluation and analysis of specific genetic markers like 5-HTTLPR and BDNF variants; for bipolar disorders, mood episode history and circadian gene analysis take priority; while anxiety disorders require careful assessment of avoidance behaviors alongside serotonin transporter gene variants. This multilayered diagnostic approach ensures accurate differentiation between conditions and their subtypes, which is particularly crucial when comorbidities are present.

<br>

Following confirmation, the intervention strategy adopts a phased approach that prioritizes the most impairing condition first: typically bipolar disorder when present, followed by major depression, then anxiety disorders. The pharmacological plan is constructed using pharmacogenomic insights to select medications with the highest likelihood of efficacy and tolerability given the individual's genetic profile. For instance, a patient with Major Depressive Disorder and the COMT Met/Met genotype might benefit from noradrenergic agents, while someone with BD-I and ANK3 variants would typically start with lithium. Concurrently, nutrigenomic recommendations are implemented, focusing on correcting biochemical imbalances through targeted supplementation (e.g., omega-3s for inflammation, methylfolate for MTHFR variants) and dietary modifications that support neurotransmitter production and stress response systems.
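
The prioritization rule described above (bipolar disorder first, then depression, then anxiety disorders) can be sketched as a simple sort. The `PRIORITY` table and the `order_conditions` helper are hypothetical and only encode the ordering stated in the text; as a simplification, every depression subtype is given the same priority.

```python
# Lower number = addressed earlier, following the order stated in the text.
PRIORITY = {"BipolarDisorder": 0, "Depression": 1, "Anxiety": 2}


def order_conditions(predictions: dict[str, str]) -> list[tuple[str, str]]:
    """Drop 'False' (unaffected) predictions and sort the rest by priority."""
    present = [(k, v) for k, v in predictions.items() if v != "False"]
    return sorted(present, key=lambda kv: PRIORITY[kv[0]])


# Keys match the dictionary returned by predict_all_disorders().
example = {
    "Depression": "Major Depressive Disorder",
    "BipolarDisorder": "BD-II",
    "Anxiety": "False",
}
print(order_conditions(example))
# [('BipolarDisorder', 'BD-II'), ('Depression', 'Major Depressive Disorder')]
```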

<br>

The psychotherapeutic component combines evidence-based modalities tailored to the specific condition cluster. A patient with Persistent Depressive Disorder and Social Anxiety might receive CBT with enhanced social skills training, while someone with BD-II and Panic Disorder would benefit from IPSRT integrated with interoceptive exposure. Lifestyle interventions are calibrated to the pathophysiology of each condition: strict sleep hygiene and social rhythm regulation for bipolar disorders, graded behavioral activation for depression, and systematic exposure practice for anxiety conditions. Regular monitoring incorporates both subjective symptom tracking and objective biomarkers (e.g., cortisol levels, inflammatory markers), with treatment adjustments made through a decision-support algorithm that weighs therapeutic response against genetic predispositions. Throughout the 12-month intervention period, the plan adapts based on continuous data integration from clinical outcomes, wearable device metrics, and periodic gene expression analyses, embodying the MindGen AI® vision of personalized, predictive, and preventive mental healthcare.

<br>

This methodology emphasizes early intervention through predictive modeling while maintaining robust ethical safeguards for genetic data usage. By simultaneously addressing biological, psychological, and social determinants through genetically informed modalities, the approach aims to improve upon traditional trial-and-error treatment paradigms. The pathway concludes with a maintenance phase that focuses on relapse prevention through ongoing genetic monitoring and booster sessions, supporting sustained mental well-being anchored in each individual's neurobiological profile.

In [5]:
def recommended_path(depression_pred, bipolar_pred, anxiety_pred):
    """
    Generates a comprehensive, formatted treatment plan report based on predicted mental health conditions.
    """
    # Generate the treatment plan dictionary (see generate_treatment_plan_dict, defined below)
    treatment_plan = generate_treatment_plan_dict(depression_pred, bipolar_pred, anxiety_pred)

    # Create the formatted report
    report = []

    # Header
    report.append("="*80)
    report.append("MINDGEN AI® PERSONALIZED TREATMENT PLAN REPORT")
    report.append("="*80)
    report.append("\n")

    # Overview section
    report.append("OVERVIEW")
    report.append("-"*80)
    report.append(treatment_plan["Overview"])
    report.append("\n")

    # Conditions Detected
    report.append("CONDITIONS IDENTIFIED")
    report.append("-"*80)
    if depression_pred != "False":
        report.append(f"- {depression_pred}")
    if bipolar_pred != "False":
        report.append(f"- {bipolar_pred}")
    if anxiety_pred != "False":
        report.append(f"- {anxiety_pred}")
    if depression_pred == "False" and bipolar_pred == "False" and anxiety_pred == "False":
        report.append("- No significant mental health conditions detected")
    report.append("\n")

    # Genetic Considerations
    if treatment_plan["Genetic_Considerations"]:
        report.append("GENETIC CONSIDERATIONS")
        report.append("-"*80)
        for item in treatment_plan["Genetic_Considerations"]:
            report.append(f"• {item}")
        report.append("\n")

    # Diagnostic Confirmation
    if treatment_plan["Diagnostic_Confirmation"]:
        report.append("DIAGNOSTIC CONFIRMATION STEPS")
        report.append("-"*80)
        for item in treatment_plan["Diagnostic_Confirmation"]:
            report.append(f"• {item}")
        report.append("\n")

    # Personalized Interventions
    if treatment_plan["Personalized_Interventions"]:
        report.append("PERSONALIZED INTERVENTIONS")
        report.append("-"*80)
        for i, item in enumerate(treatment_plan["Personalized_Interventions"], 1):
            report.append(f"{i}. {item}")
        report.append("\n")

    # Pharmacological Approach
    if treatment_plan["Pharmacological_Approach"]:
        report.append("PHARMACOLOGICAL APPROACH")
        report.append("-"*80)
        for i, item in enumerate(treatment_plan["Pharmacological_Approach"], 1):
            report.append(f"{i}. {item}")
        report.append("\n")

    # Nutrigenomic Recommendations
    if treatment_plan["Nutrigenomic_Recommendations"]:
        report.append("NUTRIGENOMIC RECOMMENDATIONS")
        report.append("-"*80)
        for item in treatment_plan["Nutrigenomic_Recommendations"]:
            report.append(f"• {item}")
        report.append("\n")

    # Lifestyle Modifications
    if treatment_plan["Lifestyle_Modifications"]:
        report.append("LIFESTYLE MODIFICATIONS")
        report.append("-"*80)
        for item in treatment_plan["Lifestyle_Modifications"]:
            report.append(f"• {item}")
        report.append("\n")

    # Therapeutic Approaches
    if treatment_plan["Therapeutic_Approaches"]:
        report.append("THERAPEUTIC APPROACHES")
        report.append("-"*80)
        for i, item in enumerate(treatment_plan["Therapeutic_Approaches"], 1):
            report.append(f"{i}. {item}")
        report.append("\n")

    # Monitoring and Follow-up
    if treatment_plan["Monitoring_and_Followup"]:
        report.append("MONITORING AND FOLLOW-UP PLAN")
        report.append("-"*80)
        for item in treatment_plan["Monitoring_and_Followup"]:
            report.append(f"• {item}")
        report.append("\n")

    # Special Considerations
    if treatment_plan["Special_Considerations"]:
        report.append("SPECIAL CONSIDERATIONS")
        report.append("-"*80)
        for item in treatment_plan["Special_Considerations"]:
            report.append(f"⚠️ {item}")
        report.append("\n")

    # Footer
    report.append("="*80)
    report.append("END OF REPORT")
    report.append("="*80)

    # Join all lines with newlines and return
    return "\n".join(report)


def generate_treatment_plan_dict(depression_pred, bipolar_pred, anxiety_pred):
    """
    Provides a customized treatment plan based on predicted mental health conditions.

    Parameters:
    - depression_pred: One of the depression types or 'False'
    - bipolar_pred: One of the bipolar disorder types or 'False'
    - anxiety_pred: One of the anxiety types or 'False'

    Returns:
    - A detailed treatment plan dictionary with sections for each condition and combined recommendations
    """

    # Initialize the treatment plan
    treatment_plan = {
        "Overview": "",
        "Genetic_Considerations": [],
        "Diagnostic_Confirmation": [],
        "Personalized_Interventions": [],
        "Pharmacological_Approach": [],
        "Nutrigenomic_Recommendations": [],
        "Lifestyle_Modifications": [],
        "Therapeutic_Approaches": [],
        "Monitoring_and_Followup": [],
        "Special_Considerations": []
    }

    # Helper function to add unique items to a section
    def add_unique(section, items):
        for item in items:
            if item not in treatment_plan[section]:
                treatment_plan[section].append(item)

    # Overview section
    conditions = []
    if depression_pred != "False":
        conditions.append(depression_pred)
    if bipolar_pred != "False":
        conditions.append(bipolar_pred)
    if anxiety_pred != "False":
        conditions.append(anxiety_pred)

    if not conditions:
        treatment_plan["Overview"] = "No significant mental health conditions detected. Maintain current wellness practices."
        return treatment_plan
    else:
        treatment_plan["Overview"] = f"Comprehensive treatment plan for: {', '.join(conditions)}"

    # ========================
    # DEPRESSION RECOMMENDATIONS
    # ========================
    if depression_pred != "False":
        # Genetic considerations for depression
        dep_genetic = [
            "Review 5-HTTLPR, COMT, and MAOA genotypes for monoamine (serotonin, dopamine, norepinephrine) metabolism insights",
            "Assess BDNF levels and genetic variants for neuroplasticity impact",
            "Evaluate MTHFR status for folate metabolism implications"
        ]
        add_unique("Genetic_Considerations", dep_genetic)

        # Diagnostic confirmation for depression
        dep_diagnostic = [
            "Confirm diagnosis with structured clinical interview (e.g., SCID)",
            "Assess severity using PHQ-9 and clinician-rated scales",
            "Evaluate for comorbid medical conditions affecting mood"
        ]
        add_unique("Diagnostic_Confirmation", dep_diagnostic)

        # Depression-specific interventions (left empty if the subtype is not listed below)
        dep_interventions = []
        if depression_pred == "Major Depressive Disorder":
            dep_interventions = [
                "Initiate evidence-based psychotherapy (CBT or IPT)",
                "Consider pharmacogenomic testing for antidepressant selection",
                "Implement mood monitoring system",
                "Assess suicide risk and develop safety plan"
            ]
        elif depression_pred == "Persistent Depressive Disorder":
            dep_interventions = [
                "Long-term psychotherapy approach (CBT or psychodynamic)",
                "Consider combination treatment with medication and therapy",
                "Focus on building resilience and coping strategies",
                "Address chronic stressors and interpersonal factors"
            ]
        elif depression_pred == "Atypical Depression":
            dep_interventions = [
                "Prioritize MAOIs or SSRIs with noradrenergic effects",
                "Focus on regulating sleep and appetite patterns",
                "Behavioral activation to counteract lethargy",
                "Address rejection sensitivity in therapy"
            ]
        elif depression_pred == "Psychotic Depression":
            dep_interventions = [
                "Requires combination of antidepressant and antipsychotic",
                "Close monitoring for safety concerns",
                "Consider inpatient care if severe",
                "Family education and support"
            ]
        elif depression_pred == "Seasonal Affective Disorder":
            dep_interventions = [
                "Light therapy (10,000 lux for 30-45 min daily)",
                "Consider vitamin D supplementation",
                "Timed melatonin administration",
                "Cognitive-behavioral therapy adapted for SAD"
            ]
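        else:
            # Unlisted subtype label: add no subtype-specific interventions
            # instead of raising NameError on the call below.
            dep_interventions = []
        add_unique("Personalized_Interventions", dep_interventions)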

        # Pharmacological approach for depression
        dep_pharma = [
            "Select antidepressant based on genetic profile and subtype",
            "Consider SSRI first-line unless contraindicated",
            "Monitor for 4-6 weeks before assessing efficacy",
            "Adjust dose based on therapeutic drug monitoring if available"
        ]
        add_unique("Pharmacological_Approach", dep_pharma)

        # Nutrigenomic recommendations for depression
        dep_nutri = [
            "Ensure adequate tryptophan intake (precursor to serotonin)",
            "Optimize omega-3 fatty acids (EPA/DHA 1-2g daily)",
            "Consider methylfolate if MTHFR variants present",
            "Address potential micronutrient deficiencies (B12, zinc, magnesium)"
        ]
        add_unique("Nutrigenomic_Recommendations", dep_nutri)

        # Lifestyle modifications for depression
        dep_lifestyle = [
            "Regular aerobic exercise (3-5x/week)",
            "Sleep hygiene education and regulation",
            "Structured daily routine",
            "Social connection and support system building"
        ]
        add_unique("Lifestyle_Modifications", dep_lifestyle)

    # ========================
    # BIPOLAR DISORDER RECOMMENDATIONS
    # ========================
    if bipolar_pred != "False":
        # Genetic considerations for bipolar
        bp_genetic = [
            "Review ANK3, CACNA1C, and ODZ4 variants (CACNA1C for calcium channel insights)",
            "Assess circadian gene polymorphisms",
            "Evaluate mitochondrial DNA variants if dysfunction suspected"
        ]
        add_unique("Genetic_Considerations", bp_genetic)

        # Diagnostic confirmation for bipolar
        bp_diagnostic = [
            "Confirm diagnosis with MINI or SCID",
            "Detailed mood episode history and family history",
            "Rule out substance-induced mood episodes",
            "Assess for mixed features"
        ]
        add_unique("Diagnostic_Confirmation", bp_diagnostic)

        # Bipolar-specific interventions
        if bipolar_pred == "BD-I":
            bp_interventions = [
                "Mood stabilizer as foundation (lithium, valproate, or lamotrigine)",
                "Monitor for manic/hypomanic symptoms closely",
                "Psychoeducation about illness course",
                "Develop relapse prevention plan"
            ]
        elif bipolar_pred == "BD-II":
            bp_interventions = [
                "Lamotrigine or quetiapine as first-line",
                "Focus on depression prevention",
                "Careful monitoring for hypomania with antidepressants",
                "Address interpersonal and social rhythm disruptions"
            ]
        elif bipolar_pred == "Cyclothymia":
            bp_interventions = [
                "Consider low-dose mood stabilizer if impairing",
                "Focus on lifestyle regularity",
                "Cognitive therapy for mood swings",
                "Monitor for progression to BD-I or II"
            ]
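        else:
            # Unlisted subtype label: add no subtype-specific interventions
            # instead of raising NameError on the call below.
            bp_interventions = []
        add_unique("Personalized_Interventions", bp_interventions)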

        # Pharmacological approach for bipolar
        bp_pharma = [
            "Avoid antidepressants without mood stabilizer in BD-I",
            "Consider lithium for suicide prevention in BD",
            "Monitor valproate levels in women of childbearing age",
            "Adjust treatment based on phase (acute vs maintenance)"
        ]
        add_unique("Pharmacological_Approach", bp_pharma)

        # Nutrigenomic recommendations for bipolar
        bp_nutri = [
            "Ensure adequate omega-3 intake (may have mood stabilizing effects)",
            "Consider N-acetylcysteine as adjunctive",
            "Monitor homocysteine levels (may relate to folate metabolism)",
            "Address circadian-related nutrition (timed meals, caffeine management)"
        ]
        add_unique("Nutrigenomic_Recommendations", bp_nutri)

        # Lifestyle modifications for bipolar
        bp_lifestyle = [
            "Strict sleep-wake cycle maintenance",
            "Social rhythm therapy to stabilize daily patterns",
            "Stress reduction techniques",
            "Avoidance of substances and sleep deprivation"
        ]
        add_unique("Lifestyle_Modifications", bp_lifestyle)

    # ========================
    # ANXIETY DISORDER RECOMMENDATIONS
    # ========================
    if anxiety_pred != "False":
        # Genetic considerations for anxiety
        anx_genetic = [
            "Review SLC6A4 and other serotonin transporter variants",
            "Assess COMT Val158Met for stress response impact",
            "Evaluate GABA receptor polymorphisms if panic features present"
        ]
        add_unique("Genetic_Considerations", anx_genetic)

        # Diagnostic confirmation for anxiety
        anx_diagnostic = [
            "Confirm diagnosis with ADIS or similar structured interview",
            "Assess avoidance behaviors and functional impact",
            "Rule out medical causes (hyperthyroidism, etc.)",
            "Evaluate for trauma history if relevant"
        ]
        add_unique("Diagnostic_Confirmation", anx_diagnostic)

        # Anxiety-specific interventions
        if anxiety_pred == "Generalized Anxiety Disorder":
            anx_interventions = [
                "CBT with worry exposure and cognitive restructuring",
                "Mindfulness-based stress reduction",
                "Address intolerance of uncertainty",
                "Problem-solving skills training"
            ]
        elif anxiety_pred == "Panic Disorder":
            anx_interventions = [
                "Interoceptive exposure therapy",
                "Cognitive restructuring of catastrophic interpretations",
                "Breathing retraining",
                "Gradual exposure to avoided situations"
            ]
        elif anxiety_pred == "Social Anxiety Disorder":
            anx_interventions = [
                "Social skills training if deficits present",
                "Cognitive restructuring of negative beliefs",
                "Exposure to social situations",
                "Attention retraining for self-focused attention"
            ]
        elif anxiety_pred == "Agoraphobia":
            anx_interventions = [
                "In vivo exposure hierarchy development",
                "Cognitive challenging of safety behaviors",
                "Gradual expansion of safe zone",
                "Partner/family involvement if helpful"
            ]
        elif anxiety_pred == "Specific Phobia":
            anx_interventions = [
                "Exposure therapy tailored to phobic stimulus",
                "Systematic desensitization",
                "Cognitive restructuring of threat appraisal",
                "Modeling and reinforcement techniques"
            ]
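        else:
            # Unlisted subtype label: add no subtype-specific interventions
            # instead of raising NameError on the call below.
            anx_interventions = []
        add_unique("Personalized_Interventions", anx_interventions)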

        # Pharmacological approach for anxiety
        anx_pharma = [
            "Consider SSRI/SNRI as first-line pharmacotherapy",
            "Short-term benzodiazepine only if severe impairment",
            "Monitor for initial anxiety exacerbation with SSRIs",
            "Consider buspirone for GAD if SSRI not tolerated"
        ]
        add_unique("Pharmacological_Approach", anx_pharma)

        # Nutrigenomic recommendations for anxiety
        anx_nutri = [
            "Ensure balanced blood sugar (avoid hypoglycemia triggers)",
            "Consider L-theanine and magnesium for relaxation",
            "Monitor caffeine and alcohol intake",
            "Adequate protein intake for amino acid precursors"
        ]
        add_unique("Nutrigenomic_Recommendations", anx_nutri)

        # Lifestyle modifications for anxiety
        anx_lifestyle = [
            "Regular exercise (yoga can be particularly helpful)",
            "Breathing and relaxation practice",
            "Stimulant reduction (caffeine, nicotine)",
            "Sleep hygiene optimization"
        ]
        add_unique("Lifestyle_Modifications", anx_lifestyle)

    # ========================
    # COMBINATION CONSIDERATIONS
    # ========================

    # Special considerations for combinations
    combo_special = []

    # Depression + Anxiety
    if depression_pred != "False" and anxiety_pred != "False":
        combo_special.extend([
            "Address depression first if severe, as it may limit anxiety treatment engagement",
            "Consider SNRIs that treat both conditions",
            "Modify CBT to address both disorders simultaneously",
            "Monitor for increased suicide risk with mixed depression/anxiety"
        ])

    # Bipolar + Anxiety
    if bipolar_pred != "False" and anxiety_pred != "False":
        combo_special.extend([
            "Stabilize mood first before aggressively treating anxiety",
            "Avoid benzodiazepines if possible (risk of misuse, worsening depression)",
            "Consider quetiapine or lurasidone, which may help both",
            "Address anxiety in context of mood stability"
        ])

    # Bipolar + Depression
    if bipolar_pred != "False" and depression_pred != "False":
        combo_special.extend([
            "Differentiate between unipolar and bipolar depression in treatment approach",
            "Caution with antidepressants - use only with mood stabilizer",
            "Consider lamotrigine for bipolar depression",
            "Monitor closely for switching to hypomania/mania"
        ])

    # All three conditions
    if (depression_pred != "False" and bipolar_pred != "False"
            and anxiety_pred != "False"):
        combo_special.extend([
            "Prioritize mood stabilization as foundation",
            "Sequential treatment approach - bipolar stability first, then depression, then anxiety",
            "Consider comprehensive DBT approach for emotion regulation",
            "Multidisciplinary team management essential"
        ])

    add_unique("Special_Considerations", combo_special)

    # ========================
    # THERAPEUTIC APPROACHES
    # ========================
    therapies = []

    # Common evidence-based therapies
    therapies.extend([
        "Cognitive Behavioral Therapy (tailored to primary diagnosis)",
        "Psychoeducation about condition(s) and treatment",
        "Mindfulness-based interventions",
        "Behavioral activation (especially for depression)"
    ])

    # Condition-specific therapies
    if bipolar_pred != "False":
        therapies.extend([
            "Interpersonal and Social Rhythm Therapy (IPSRT)",
            "Family-focused therapy for bipolar disorder"
        ])

    if anxiety_pred != "False":
        therapies.extend([
            "Exposure-based therapies",
            "Acceptance and Commitment Therapy (ACT)"
        ])

    if depression_pred != "False":
        therapies.extend([
            # Behavioral activation is already covered by the common list above
            "Problem-Solving Therapy"
        ])

    add_unique("Therapeutic_Approaches", therapies)

    # ========================
    # MONITORING AND FOLLOWUP
    # ========================
    monitoring = [
        "Regular clinical follow-up (frequency depends on severity)",
        "Standardized symptom tracking (e.g., mood charts, anxiety diaries)",
        "Routine labs as needed (lithium levels, metabolic monitoring)",
        "Periodic re-assessment of treatment plan efficacy",
        "Functional outcome assessment (work, relationships, quality of life)"
    ]

    if bipolar_pred != "False":
        monitoring.extend([
            "Mood episode symptom monitoring",
            "Early warning sign identification plan"
        ])

    if depression_pred != "False":
        monitoring.extend([
            "Suicide risk reassessment at each contact",
            "PHQ-9 tracking over time"
        ])

    if anxiety_pred != "False":
        monitoring.extend([
            "Exposure hierarchy progress tracking",
            "Anxiety diary review"
        ])

    add_unique("Monitoring_and_Followup", monitoring)

    return treatment_plan
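
The subtype branches above could also be written as a lookup table, which gives unrecognized labels a safe default. The sketch below is illustrative only: `make_plan`, `interventions_for`, and `DEPRESSION_INTERVENTIONS` are hypothetical names, and the `add_unique` shown here is an assumed stand-in, since the project's own helper is defined outside this section and may behave differently. The table is abbreviated to two subtypes.

```python
from collections import defaultdict


def make_plan():
    """Create an empty plan plus an add_unique helper bound to it.

    Assumption: the real add_unique appends items to a per-category list
    and skips items that are already present.
    """
    plan = defaultdict(list)

    def add_unique(key, items):
        for item in items:
            if item not in plan[key]:
                plan[key].append(item)

    return plan, add_unique


# Subtype label -> interventions (abbreviated; strings taken from the code above).
DEPRESSION_INTERVENTIONS = {
    "Major Depressive Disorder": [
        "Initiate evidence-based psychotherapy (CBT or IPT)",
        "Assess suicide risk and develop safety plan",
    ],
    "Seasonal Affective Disorder": [
        "Light therapy (10,000 lux for 30-45 min daily)",
        "Consider vitamin D supplementation",
    ],
}


def interventions_for(pred, table):
    """Return a copy of the interventions for a label, or [] if it is unknown."""
    return list(table.get(pred, []))


plan, add_unique = make_plan()
add_unique("Personalized_Interventions",
           interventions_for("Seasonal Affective Disorder", DEPRESSION_INTERVENTIONS))
# An unrecognized label contributes nothing rather than raising an error.
add_unique("Personalized_Interventions",
           interventions_for("Unlisted Subtype", DEPRESSION_INTERVENTIONS))
# An item that is already present is skipped.
add_unique("Personalized_Interventions", ["Consider vitamin D supplementation"])
print(dict(plan))
```

Keeping the recommendation text in a dictionary separates the clinical content from the control flow, so adding a subtype means adding one entry rather than another `elif` branch.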


In [7]:
# Example usage:
result = predict_all_disorders()
report = recommended_path(result["Depression"], result["BipolarDisorder"], result["Anxiety"])
print(report)


🧠 Welcome to the Unified Mental Health Diagnostic Tool 🧠

🔹 Enter shared inputs (these will be used for all models):

Age (int): 23
Sleep Duration in hours (float): 4
Cortisol Level (float): 12
Vitamin D Level (float): 43

🔹 Enter additional inputs for Depression Prediction:

Genotype_5HTTLPR ('Long/Long', 'Long/Short', 'Short/Short'): Long/Long
Genotype_COMT ('Val/Val', 'Met/Met', 'Val/Met'): Val/Met
Genotype_MAOA ('Low', 'High'): Low
BDNF Level (float): 43.1
CRP (float): 43.9
Tryptophan (float): 4
Omega3 Index (float): 0.76
MTHFR_Genotype ('CC', 'CT', 'TT'): TT
Neuroinflammation Score (float): 42
Monoamine Oxidase Level (float): 76
Serotonin Level (float): 32.89
HPA Axis Dysregulation (float): 42
Depression Score PHQ-9 (int): 3

🔹 Enter additional inputs for Bipolar Disorder Prediction:

Sex (M or F): M
Family History ('Yes', 'No'): Yes
ANK3_rs10994336 ('AA', 'AG', 'GG'): AG
CACNA1C_rs1006737 ('AA', 'AG', 'GG'): AA
ODZ4_rs12576775 ('AA', 'AG', 'GG'): AG
Glutamate_Level ('High', 'Nor