# ***Building RAYA: The Architect of Type 1 Civilization***

“Driving humanity toward a sustainable and intelligent civilization.”



---

### 🌍 ***Problem Statement: "The Human Sustainability Index (HSI) 2.0: AI for a Livable Future"***

The next frontier for humanity isn’t just technological progress; it’s **sustainable survival**.

**HSI 2.0** acts as an intelligent sustainability compass: an AI-powered mirror that reflects how livable and future-ready a city truly is. It goes beyond static metrics to **analyze, predict, and simulate** the balance between human development and natural systems.

By integrating multi-dimensional data on **water, energy, climate, pollution, and waste**, HSI 2.0 identifies **emerging risks, regional disparities, and pathways for sustainable growth**.

Through **AI-driven clustering, predictive modeling, and generative insights**, it empowers **governments, communities, and organizations** to design smarter policies, foster resilience, and ensure a **thriving planet for generations ahead.** 🌱✨

---




### **What's in the Human Sustainability Index (HSI) 2.0?**

**Ans:** This dataset offers a realistic and holistic foundation for Human Sustainability Index (HSI) prediction. It has been synthesized by integrating multiple open-source datasets to create a unified, systematized dataset that reflects real-world sustainability conditions across regions.

Instead of using a direct HSI score, we employed **clustering techniques** to determine optimal category groupings for HSI, resulting in five meaningful sustainability classes: **["Moderately Sustainable", "Critical (Unsustainable)", "Highly Sustainable", "Sustainable", "Low Sustainable"]**.

Additionally, we computed the **Urbanization %** and enriched the dataset with extended features such as **Energy Source, Water Consumption, SGD %, AQI, AQI Bucket, Waste Type, Disposal Method, Recycling Rate (%),** and **Cost of Waste Management (₹/Ton)**, all contributing to a more data-driven and actionable sustainability assessment.



In [None]:
import pandas as pd
import numpy as np

data = pd.read_csv('HSI_data_2.0_updated.csv')

data.tail()

Unnamed: 0,Year,State,District,population,population_proper,Location,EnergySource,WaterConsumption,SGD %,AQI,AQI_Bucket,Waste Type,Disposal Method,Recycling Rate (%),Cost of Waste Management (₹/Ton)
29526,2017,Haryana,Gurugram,7904745,6014234.0,Rural,Mixed,29438040.0,58.3,209,Moderate,Non-Biodegradable,Landfill,87.712836,9468
29527,2018,Chhattisgarh,Bilaspur,478957,335327.2,Suburban,Mixed,265302800.0,90.2,216,Moderate,E-Waste,Landfill,89.98467,4013
29528,2019,Haryana,Gurugram,2825327,1767623.0,Rural,Hydro,160348700.0,66.5,66,Severe,Biodegradable,Recycling,89.289829,3111
29529,2020,Rajasthan,Udaipur,8593998,6641304.0,Urban,Thermal,152209800.0,47.6,227,Moderate,Biodegradable,Incineration,62.247141,8740
29530,2021,Telangana,Hyderabad,5691472,3678529.0,Urban,Hydro,180466100.0,44.4,397,Good,Mixed,Recycling,47.582822,3962


In [None]:
# Group by State and Location to sum population
state_location = data.groupby(['State', 'Location'])['population'].sum().unstack(fill_value=0)

# Compute Urbanization %
state_location['Urbanization %'] = (state_location['Urban'] / state_location.sum(axis=1)) * 100

# Merge back to main df
data = data.merge(
    state_location['Urbanization %'],
    on='State',
    how='left'
)

data.head()

Unnamed: 0,Year,State,District,population,population_proper,Location,EnergySource,WaterConsumption,SGD %,AQI,AQI_Bucket,Waste Type,Disposal Method,Recycling Rate (%),Cost of Waste Management (₹/Ton),Urbanization %
0,2015,ANDHRA PRADESH,Anantapur,5690620,4656456.657,Urban,Mixed,224915200.0,47.1,377,Good,E-Waste,Landfill,20.060841,3056,37.059616
1,2016,ANDHRA PRADESH,Chittoor,4402397,2880483.747,Suburban,Solar,212023800.0,82.7,311,Very Poor,Biodegradable,Landfill,82.181303,2778,37.059616
2,2017,ANDHRA PRADESH,East Godavari,5960353,5499209.272,Rural,Thermal,230763300.0,67.1,135,Very Poor,Mixed,Recycling,70.512784,3390,37.059616
3,2018,ANDHRA PRADESH,Guntur,4472323,3596533.313,Urban,Thermal,118994200.0,72.9,100,Very Poor,Mixed,Landfill,39.78976,1498,37.059616
4,2019,ANDHRA PRADESH,Kadapa,9315873,7518150.908,Suburban,Solar,23832090.0,49.6,94,Severe,Non-Biodegradable,Incineration,14.726121,2221,37.059616
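
To make the Urbanization % step concrete, here is a minimal sketch on a made-up four-row frame. The `toy` frame, its state names, and its numbers are illustrative assumptions, not values from the dataset. It follows the same group, pivot, and merge logic as the cell above: urban population divided by total population per state.

```python
import pandas as pd

# Hypothetical toy data; values are illustrative only
toy = pd.DataFrame({
    "State": ["A", "A", "B", "B"],
    "Location": ["Urban", "Rural", "Urban", "Suburban"],
    "population": [300, 700, 500, 500],
})

# Sum population per (State, Location) and pivot Location into columns
by_location = (
    toy.groupby(["State", "Location"])["population"].sum().unstack(fill_value=0)
)

# Urban share of each state's total population, in percent
urban_pct = (by_location["Urban"] / by_location.sum(axis=1) * 100).rename("Urbanization %")

# Attach the state-level value to every row of that state
toy = toy.merge(urban_pct, on="State", how="left")
print(toy)
```

Because the value is computed per state, every district row of a state carries the same Urbanization %, which is why the column is constant within ANDHRA PRADESH in the output above.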


In [None]:
data.isnull().sum()

Unnamed: 0,0
Year,0
State,0
District,0
population,0
population_proper,0
Location,0
EnergySource,0
WaterConsumption,0
SGD %,0
AQI,0


In [None]:
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

# Select features for clustering
features = data[['population',
 'WaterConsumption', 'SGD %', 'AQI',
 'Recycling Rate (%)', 'Cost of Waste Management (₹/Ton)',
 'Urbanization %']]


# -----------------------------
# Scale features: very important for K-Means
# -----------------------------
scaler = StandardScaler()
features_scaled = scaler.fit_transform(features)
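
Why scaling matters here can be seen in a small sketch. The three rows below are invented for illustration (a population column next to a recycling-rate column); they are not dataset values. K-Means relies on Euclidean distance, so without scaling the population column dominates.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical rows: [population, recycling rate %]; values are illustrative only
X = np.array([
    [1_000_000, 20.0],
    [1_000_500, 90.0],
    [5_000_000, 21.0],
])

X_scaled = StandardScaler().fit_transform(X)

def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return float(np.linalg.norm(a - b))

# Raw space: rows 0 and 1 look close because population dwarfs the 70-point recycling gap
print("raw:   ", dist(X[0], X[1]), dist(X[0], X[2]))

# Scaled space: each column has mean 0 and unit variance, so both features contribute
print("scaled:", dist(X_scaled[0], X_scaled[1]), dist(X_scaled[0], X_scaled[2]))
print(X_scaled.mean(axis=0).round(6), X_scaled.std(axis=0).round(6))
```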

In [None]:
# -----------------------------
# Find optimal k with Elbow Method
# -----------------------------
inertia = []
k_range = range(2, 10)

for k in k_range:
    kmeans = KMeans(n_clusters=k, random_state=42)
    kmeans.fit(features_scaled)
    inertia.append(kmeans.inertia_)  # Distortion / SSE

import plotly.graph_objects as go

fig = go.Figure()

fig.add_trace(
    go.Scatter(
        x=list(k_range),
        y=inertia,
        mode='lines+markers',
        marker=dict(color='royalblue', size=8),
        line=dict(width=2),
        name='Inertia'
    )
)

fig.update_layout(
    title='Elbow Method For Optimal k',
    xaxis_title='Number of clusters (k)',
    yaxis_title='Inertia (SSE)',
    xaxis=dict(tickmode='linear'),
    template='plotly_white',
    width=800,
    height=500
)

fig.show()
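
The elbow in an inertia curve can be hard to read by eye. One complementary check, which is not part of the original notebook, is the mean silhouette score per k. The sketch below uses synthetic blobs (`X_demo`, an assumption) as a stand-in for `features_scaled`.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic stand-in for `features_scaled`; swap in the real array inside the notebook
X_demo, _ = make_blobs(n_samples=500, centers=5, cluster_std=0.8, random_state=42)

silhouette = {}
for k in range(2, 10):
    labels = KMeans(n_clusters=k, n_init=10, random_state=42).fit_predict(X_demo)
    # Mean silhouette lies in [-1, 1]; higher means tighter, better-separated clusters
    silhouette[k] = silhouette_score(X_demo, labels)

best_k = max(silhouette, key=silhouette.get)
print({k: round(float(s), 3) for k, s in silhouette.items()})
print("k with the highest silhouette:", best_k)
```

On roughly 30,000 rows the silhouette computation is noticeably slower than inertia; `silhouette_score` accepts a `sample_size` argument to score a random subsample instead.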

In [None]:
# Suppose you choose k=5 from the elbow plot
optimal_k = 5

kmeans_final = KMeans(n_clusters=optimal_k, random_state=42)
data['HSI_Type'] = kmeans_final.fit_predict(features_scaled)

print(data.groupby('HSI_Type')[['population', 'WaterConsumption', 'SGD %', 'AQI', 'Recycling Rate (%)', 'Cost of Waste Management (₹/Ton)', 'Urbanization %']].mean())

            population  WaterConsumption      SGD %         AQI  \
HSI_Type                                                          
0         3.742767e+06      2.490753e+08  80.764863  300.361031   
1         4.130348e+06      2.396905e+08  57.597607  148.558974   
2         7.276895e+06      2.545164e+08  55.446529  302.310650   
3         2.454338e+06      2.644924e+08  64.325046  198.940901   
4         7.497677e+06      2.489859e+08  78.816467  146.387662   

          Recycling Rate (%)  Cost of Waste Management (₹/Ton)  Urbanization %  
HSI_Type                                                                        
0                  49.516740                       5230.440733       32.781547  
1                  51.063874                       4155.477265       33.934893  
2                  49.234896                       7381.229670       32.992404  
3                  50.675996                       9721.365813       33.035306  
4                  50.303427              

In [None]:
cluster_avg = data.groupby("HSI_Type")[['population', 'WaterConsumption', 'SGD %', 'AQI', 'Recycling Rate (%)', 'Cost of Waste Management (₹/Ton)', 'Urbanization %']].mean().sort_values(by='population')
ordered_clusters = cluster_avg.index.tolist()

In [None]:
ordered_clusters

[3, 0, 1, 2, 4]

In [None]:
# Make a mapping
segment_names = ["Sustainable", "Critical (Unsustainable)", "Low Sustainable", "Moderately Sustainable", "Highly Sustainable"]
cluster_to_label = {cluster: segment_names[i] for i, cluster in enumerate(ordered_clusters)}

# Apply mapping
data["HSI_Label"] = data["HSI_Type"].map(cluster_to_label)

print(data[["District","HSI_Type", "HSI_Label"]].head())
print('--' * 25)
print(data[["HSI_Type", "HSI_Label"]].value_counts())

        District  HSI_Type                 HSI_Label
0      Anantapur         2    Moderately Sustainable
1       Chittoor         0  Critical (Unsustainable)
2  East Godavari         1           Low Sustainable
3         Guntur         1           Low Sustainable
4         Kadapa         1           Low Sustainable
--------------------------------------------------
HSI_Type  HSI_Label               
3         Sustainable                 5973
4         Highly Sustainable          5933
0         Critical (Unsustainable)    5897
2         Moderately Sustainable      5878
1         Low Sustainable             5850
Name: count, dtype: int64
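
The labelling step pairs cluster ids with names purely by position: the i-th cluster in the population ordering receives the i-th entry of `segment_names`. A minimal sketch using the ordering printed above shows the resulting mapping. Note that the cluster ids depend on the K-Means run, so the pairing holds only for this ordering.

```python
# Cluster ids sorted by mean population, as printed by `ordered_clusters` above
ordered_clusters = [3, 0, 1, 2, 4]
segment_names = [
    "Sustainable",
    "Critical (Unsustainable)",
    "Low Sustainable",
    "Moderately Sustainable",
    "Highly Sustainable",
]

# Pair the i-th cluster in the ordering with the i-th name
cluster_to_label = dict(zip(ordered_clusters, segment_names))

for cluster_id in sorted(cluster_to_label):
    print(cluster_id, "->", cluster_to_label[cluster_id])
```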


In [None]:
data['Location'] = data['Location'].map({
    'Urban': 1,
    'Suburban': 2,
    'Rural': 3
})

In [None]:
data['EnergySource'] = data['EnergySource'].map({
    'Hydro': 1,
    'Mixed': 2,
    'Thermal': 3,
    'Solar': 4,
    'Wind': 5
})

In [None]:
data['Waste Type'] = data['Waste Type'].map({
    'E-Waste': 1,
    'Non-Biodegradable': 2,
    'Biodegradable': 3,
    'Mixed': 4

})

In [None]:
data['Disposal Method'] = data['Disposal Method'].map({
    'Incineration': 1,
    'Composting': 2,
    'Landfill': 3,
    'Recycling': 4

})
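
One caveat with dictionary-based encoding is that `Series.map` silently returns NaN for any category missing from the dictionary. The sketch below uses a made-up series with a hypothetical extra category ("Biomass") to show a quick check that could be run after each mapping cell.

```python
import pandas as pd

# Hypothetical column containing one category ("Biomass") that the mapping does not list
energy = pd.Series(["Hydro", "Solar", "Biomass", "Wind"], name="EnergySource")
energy_codes = {"Hydro": 1, "Mixed": 2, "Thermal": 3, "Solar": 4, "Wind": 5}

encoded = energy.map(energy_codes)

# Series.map yields NaN for any value that is missing from the dict
unmapped = sorted(energy[encoded.isna()].unique())
print(encoded.tolist())
print("Unmapped categories:", unmapped)
```

These integer codes also impose an order (for example Hydro < Mixed < Thermal) that the categories do not naturally have. Tree-based models tolerate this reasonably well, but it is worth keeping in mind when reading correlations.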

In [None]:
data.head(8)

Unnamed: 0,State,District,population,population_proper,Location,EnergySource,WaterConsumption,SGD %,AQI,AQI_Bucket,Waste Type,Disposal Method,Recycling Rate (%),Cost of Waste Management (₹/Ton),Urbanization %,HSI_Type,HSI_Label
0,ANDHRA PRADESH,Anantapur,5690620,4656457.0,1,2,224915200.0,47.1,377,Good,1,3,20.060841,3056.0,37.059616,3,Sustainable
1,ANDHRA PRADESH,Chittoor,4402397,2880484.0,2,4,212023800.0,82.7,311,Very Poor,3,3,82.181303,2778.0,37.059616,1,Low Sustainable
2,ANDHRA PRADESH,East Godavari,5960353,5499209.0,3,3,230763300.0,67.1,135,Very Poor,4,4,70.512784,3390.0,37.059616,2,Moderately Sustainable
3,ANDHRA PRADESH,Guntur,4472323,3596533.0,1,3,118994200.0,72.9,100,Very Poor,4,3,39.78976,1498.0,37.059616,4,Highly Sustainable
4,ANDHRA PRADESH,Kadapa,9315873,7518151.0,2,4,23832090.0,49.6,94,Severe,2,1,14.726121,2221.0,37.059616,4,Highly Sustainable
5,ANDHRA PRADESH,Krishna,3744548,2894272.0,3,3,470134800.0,73.6,382,Good,3,4,36.631581,3195.0,37.059616,0,Critical (Unsustainable)
6,ANDHRA PRADESH,Kurnool,1281696,861136.5,1,4,250529800.0,77.3,95,Moderate,2,1,74.664997,3686.0,37.059616,2,Moderately Sustainable
7,ANDHRA PRADESH,Nellore,6054779,5082444.0,2,1,398320600.0,41.4,122,Severe,2,4,34.713708,1791.0,37.059616,0,Critical (Unsustainable)


In [None]:
data_ts = data.copy()

In [None]:
data.columns

Index(['Year', 'State', 'District', 'population', 'population_proper',
       'Location', 'EnergySource', 'WaterConsumption', 'SGD %', 'AQI',
       'AQI_Bucket', 'Waste Type', 'Disposal Method', 'Recycling Rate (%)',
       'Cost of Waste Management (₹/Ton)', 'Urbanization %', 'HSI_Type',
       'HSI_Label'],
      dtype='object')

In [None]:
df_numeric = data.select_dtypes(include=['number'])
corr = df_numeric.corr()

import plotly.express as px

fig = px.imshow(
    corr,
    text_auto=".2f",
    color_continuous_scale='Teal',
    title="Correlation Heatmap (Numeric Columns Only)"
)

fig.update_layout(
    width=850,
    height=600
)

fig.show()


In [None]:
corr_features = [
    'population', 'Location',
       'EnergySource', 'WaterConsumption', 'SGD %', 'AQI',
       'Waste Type', 'Disposal Method', 'Recycling Rate (%)',
       'Cost of Waste Management (₹/Ton)', 'Urbanization %', 'HSI_Type'
]

import plotly.express as px
import plotly.graph_objects as go

# Calculate correlation matrix again
corr_matrix = data[corr_features].corr()

# Select only correlations with target column
target_corr = corr_matrix[['HSI_Type']].drop(index='HSI_Type').reset_index()
target_corr.columns = ['Feature', 'Correlation']
target_corr['AbsCorrelation'] = target_corr['Correlation'].abs()

# Sort by absolute correlation
target_corr = target_corr.sort_values(by='AbsCorrelation', ascending=False)

fig = px.bar(
    target_corr,
    x='Feature',
    y='Correlation',
    color='Correlation',
    title='Feature Correlations with Human Sustainability Index (HSI)',
    color_continuous_scale='Teal',
    text=target_corr['Correlation'].round(3)
)

fig.update_layout(
    width=1050,
    height=650,
    xaxis_title='Feature',
    yaxis_title='Correlation with Human Sustainability Index (HSI)',
    bargap=0.3
)

fig.update_traces(
    textposition='outside'
)

fig.show()


In [None]:
from sklearn.preprocessing import LabelEncoder

le_state = LabelEncoder()
le_city = LabelEncoder()

data['State_encoded'] = le_state.fit_transform(data['State'])
data['District_encoded'] = le_city.fit_transform(data['District'])


In [None]:
data.drop(['State', 'District'], axis=1, inplace=True)


In [None]:
from sklearn.model_selection import train_test_split

X = data[['State_encoded', 'District_encoded','population', 'Location',
       'EnergySource', 'WaterConsumption', 'SGD %', 'AQI',
       'Waste Type', 'Disposal Method', 'Recycling Rate (%)',
       'Cost of Waste Management (₹/Ton)', 'Urbanization %']]

y = data['HSI_Type']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)


In [None]:
import numpy as np
import pandas as pd
from lightgbm import LGBMClassifier
from sklearn.metrics import accuracy_score, confusion_matrix
import plotly.figure_factory as ff

# ✅ Train LightGBM model (no SMOTE)
lgb_model = LGBMClassifier(
    n_estimators=300,
    learning_rate=0.05,
    max_depth=-1,            # -1 means no limit on tree depth
    subsample=0.8,           # row sampling; only takes effect when subsample_freq > 0
    colsample_bytree=0.8,
    random_state=42,
    objective='multiclass',
    num_class=len(np.unique(y_train))
)

# ✅ Fit the model
lgb_model.fit(X_train, y_train)

# ✅ Predictions
y_pred = lgb_model.predict(X_test)

# ✅ Accuracy
print('----' * 16)
print(f"✅ LightGBM Classifier Accuracy: {accuracy_score(y_test, y_pred):.2f}")
print('----' * 16)

# ✅ Confusion Matrix (rows = true class, columns = predicted class)
cm = confusion_matrix(y_test, y_pred)

# Label order follows class ids 0 to 4
labels = ['Critical (Unsustainable)', 'Low Sustainable', 'Moderately Sustainable', 'Sustainable', 'Highly Sustainable']
z_text = [[str(count) for count in row] for row in cm]

# ✅ Plot confusion matrix using Plotly
fig = ff.create_annotated_heatmap(
    z=cm,
    x=labels,
    y=labels,
    annotation_text=z_text,
    colorscale='teal',
    showscale=True
)

fig.update_layout(
    title_text='Confusion Matrix - LightGBM Classifier',
    width=900,
    height=500
)

fig['data'][0]['showscale'] = True
fig.show()


[LightGBM] [Info] Auto-choosing col-wise multi-threading, the overhead of testing was 0.003604 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 1917
[LightGBM] [Info] Number of data points in the train set: 23624, number of used features: 13
[LightGBM] [Info] Start training from score -1.616617
[LightGBM] [Info] Start training from score -1.621318
[LightGBM] [Info] Start training from score -1.603487
[LightGBM] [Info] Start training from score -1.598241
[LightGBM] [Info] Start training from score -1.607704
----------------------------------------------------------------
✅ LightGBM Classifier Accuracy: 0.97
----------------------------------------------------------------


In [None]:
#from sklearn.metrics import classification_report

#print(classification_report(y_test,y_pred))

**Confusion matrix results (for the “Highly Sustainable” class):**


* **True Positive (TP):** “Highly Sustainable” correctly predicted as “Highly Sustainable” → 1180

* **False Positive (FP):** Other classes (“Critical”, “Low Sustainable”, “Moderately Sustainable”, “Sustainable”) predicted as “Highly Sustainable” → 5 + 7 + 6 + 2 = 20

* **False Negative (FN):** “Highly Sustainable” predicted as any other class → 5 + 7 + 6 + 2 = 20

* **True Negative (TN):** Every test sample that is neither truly “Highly Sustainable” nor predicted as such. This covers the correct predictions for the other four classes, plus any confusions among those four classes. With 5,907 test samples (29,531 rows minus the 23,624 training rows) and the counts above, TN = 5907 - 1180 - 20 - 20 = 4687. Of these, the correct predictions for the other classes are:

    * Critical → Critical: 1156

    * Low Sustainable → Low Sustainable: 1136

    * Moderately Sustainable → Moderately Sustainable: 1095

    * Sustainable → Sustainable: 1172


---
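
The one-vs-rest bookkeeping described above can be sketched as a small helper. The function name `one_vs_rest_counts` and the 3-class matrix are illustrative assumptions; the only convention borrowed from the notebook is that rows are true labels and columns are predicted labels, as in `sklearn.metrics.confusion_matrix`.

```python
import numpy as np

def one_vs_rest_counts(cm, class_index):
    """Return (TP, FP, FN, TN) for one class of a multiclass confusion matrix.

    Rows are true labels and columns are predicted labels.
    """
    cm = np.asarray(cm)
    tp = int(cm[class_index, class_index])
    fp = int(cm[:, class_index].sum()) - tp   # predicted as the class, actually another
    fn = int(cm[class_index, :].sum()) - tp   # actually the class, predicted as another
    tn = int(cm.sum()) - tp - fp - fn         # samples not involving the class at all
    return tp, fp, fn, tn

# Hypothetical 3-class matrix; numbers are illustrative only
toy_cm = [
    [50, 2, 3],
    [4, 40, 1],
    [2, 5, 60],
]
print(one_vs_rest_counts(toy_cm, 2))  # (60, 4, 7, 96)
```

Applied to the notebook's `cm` with `class_index=4`, the same helper would give the counts for “Highly Sustainable”.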



In [None]:

# ✅ Get feature importance directly from LightGBM model
importance_df = pd.DataFrame({
    'Feature': lgb_model.feature_name_,
    'Importance': lgb_model.feature_importances_
}).sort_values(by='Importance', ascending=False)

# ✅ Plot with Plotly in teal theme
fig = px.bar(
    importance_df,
    x='Importance',
    y='Feature',
    orientation='h',
    title='💠 Feature Importance - LightGBM',
    color='Importance',
    color_continuous_scale='Tealgrn'
)

# ✅ Style
fig.update_layout(
    width=850,
    height=600,
    title_x=0.5,
    yaxis=dict(autorange="reversed"),
    xaxis_title="Importance Score",
    yaxis_title="Features",
    plot_bgcolor="white"
)

fig.show()


In [None]:
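# Work on an explicit copy so the assignments below do not trigger SettingWithCopyWarning
X_test = X_test.copy()

# Decode the integer ids back to the original state and district names
X_test['State_encoded'] = le_state.inverse_transform(X_test['State_encoded'])
X_test['District_encoded'] = le_city.inverse_transform(X_test['District_encoded'])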


In [None]:
X_test['Predicted_HSI'] = y_pred

In [None]:
# HSI mapping
HSI_map = {
    2: 'Moderately Sustainable',
    0: 'Critical (Unsustainable)',
    4: 'Highly Sustainable',
    3: 'Sustainable',
    1: 'Low Sustainable'
}

Disposal_Method_map = {
    1: 'Incineration',
    2: 'Composting',
    3: 'Landfill',
    4: 'Recycling'

}

Waste_Type_map = {
    1: 'E-Waste',
    2: 'Non-Biodegradable',
    3: 'Biodegradable',
    4: 'Mixed'
}

Energy_Source_map = {
    1: 'Hydro',
    2: 'Mixed',
    3: 'Thermal',
    4: 'Solar',
    5: 'Wind'

}

Location_map = {
    1: 'Urban',
    2: 'Suburban',
    3: 'Rural'
}

X_test['Predicted_HSI'] = X_test['Predicted_HSI'].map(HSI_map)
X_test['Disposal Method'] = X_test['Disposal Method'].map(Disposal_Method_map)
X_test['Waste Type'] = X_test['Waste Type'].map(Waste_Type_map)
X_test['EnergySource'] = X_test['EnergySource'].map(Energy_Source_map)
X_test['Location'] = X_test['Location'].map(Location_map)
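
The decoding dictionaries above are the encoding dictionaries with keys and values swapped. As a minimal sketch, the inverse can be derived rather than retyped, which avoids the two drifting apart; the names `location_codes` and `location_labels` are illustrative and do not appear in the notebook.

```python
# Forward encoding, as used when the Location column was mapped to integers
location_codes = {"Urban": 1, "Suburban": 2, "Rural": 3}

# Swap keys and values to get the decoding map
location_labels = {code: name for name, code in location_codes.items()}

# Inversion is only safe when every code is unique
assert len(location_labels) == len(location_codes)

print(location_labels)
```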

#### **Final Test Data**: Ready for semi-deployment

In [None]:
import plotly.express as px
import pandas as pd

# --------------------------------------------
# 1️⃣ Group data for visualization
# --------------------------------------------
df_treemap = (
    X_test.groupby(['State_encoded', 'District_encoded', 'Predicted_HSI'])
    .size()
    .reset_index(name='Count')
)

# Predicted_HSI already holds readable labels (mapped via HSI_map above),
# so reuse it as the label and derive a numeric id for the color scale
df_treemap['HSI_Label'] = df_treemap['Predicted_HSI']
df_treemap['HSI_Code'] = df_treemap['HSI_Label'].map({
    'Critical (Unsustainable)': 0,
    'Low Sustainable': 1,
    'Moderately Sustainable': 2,
    'Sustainable': 3,
    'Highly Sustainable': 4
})

# --------------------------------------------
# 2️⃣ Build Treemap (Squarify)
# --------------------------------------------
fig = px.treemap(
    df_treemap,
    path=['State_encoded', 'District_encoded', 'HSI_Label'],  # hierarchy
    values='Count',
    color='HSI_Code',  # numeric class id, so the continuous color scale applies
    color_continuous_scale='Tealgrn',
    title="🌍 Human Sustainability Index (HSI) Treemap by States and Districts"
)

# --------------------------------------------
# 3️⃣ Styling
# --------------------------------------------
fig.update_layout(
    width=950,
    height=750,
    title_x=0.5,
    title_font=dict(size=22),
    font=dict(size=12)
)

fig.show()


In [None]:
X_test.columns

Index(['State_encoded', 'District_encoded', 'population', 'Location',
       'EnergySource', 'WaterConsumption', 'SGD %', 'AQI', 'Waste Type',
       'Disposal Method', 'Recycling Rate (%)',
       'Cost of Waste Management (₹/Ton)', 'Urbanization %', 'Predicted_HSI'],
      dtype='object')

#### **Renaming Columns**

In [None]:
X_test.columns = [
    'State', 'District', 'Population', 'Location',
    'Energy Source', 'Water Consumption', 'SGD %', 'AQI', 'Waste Type',
       'Disposal Method', 'Recycling Rate (%)',
       'Cost of Waste Management (₹/Ton)', 'Urbanization %', 'Predicted HSI'
]

### *Semi-Deployment*

In [None]:
import pandas as pd
from IPython.display import display
from ipywidgets import widgets, VBox


# ------------------------------------------
# 🔹 Dropdown widgets
# ------------------------------------------
state_dropdown = widgets.Dropdown(
    options=sorted(X_test['State'].unique().tolist()),
    description='Select State:',
    style={'description_width': 'initial'},
    layout=widgets.Layout(width='50%')
)

district_dropdown = widgets.Dropdown(
    options=[],
    description='Select District:',
    style={'description_width': 'initial'},
    layout=widgets.Layout(width='50%')
)

output = widgets.Output()

# ------------------------------------------
# 🔁 Update districts dynamically
# ------------------------------------------
def update_districts(*args):
    selected_state = state_dropdown.value
    # Sort once so the dropdown order and the default selection agree
    filtered_districts = sorted(
        X_test[X_test['State'] == selected_state]['District'].unique().tolist()
    )
    district_dropdown.options = filtered_districts
    if filtered_districts:
        district_dropdown.value = filtered_districts[0]  # Default to the first district
        show_district_info(None)  # ✅ Show results immediately after state change

state_dropdown.observe(update_districts, 'value')

# ------------------------------------------
# 📊 Display selected district data
# ------------------------------------------
def show_district_info(change):
    with output:
        output.clear_output()
        selected_state = state_dropdown.value
        selected_district = district_dropdown.value

        row = X_test[(X_test['State'] == selected_state) & (X_test['District'] == selected_district)]
        if row.empty:
            print("No data found for this district.")
            return

        # A district can appear in several test rows; show the first one
        row = row.iloc[0]
        print(f"🏙️ District: {row['District']}, State: {row['State']}\n")

        print(f"👥 Population: {row['Population']:,}")
        print(f"📍 Location Type: {row['Location']}")
        print(f"⚡ Energy Source: {row['Energy Source']}")
        print(f"💧 Water Consumption: {row['Water Consumption']} liters/day (or unit in dataset)")
        print(f"🎯 SGD % Achievement: {row['SGD %']}%")
        print(f"🌫️ Air Quality Index (AQI): {row['AQI']}")
        print(f"🗑️ Waste Type: {row['Waste Type']}")
        print(f"🏗️ Disposal Method: {row['Disposal Method']}")
        print(f"♻️ Recycling Rate: {row['Recycling Rate (%)']}%")
        print(f"💰 Waste Management Cost: ₹{row['Cost of Waste Management (₹/Ton)']:,} per ton")
        print(f"🏙️ Urbanization %: {row['Urbanization %']}%\n")

        print(f"🌍 Predicted HSI Category: {row['Predicted HSI']}")


district_dropdown.observe(show_district_info, 'value')

# ------------------------------------------
# 🚀 Initialize and display
# ------------------------------------------
update_districts()  # ✅ Run once to initialize the first state and district
display(VBox([state_dropdown, district_dropdown, output]))



### **Regression/Time Series for Human Sustainability (Forecast)**

***Work in progress***

Please do not proceed beyond this point yet; the forecasting section is unfinished.

In [None]:
from prophet import Prophet
import pandas as pd
import plotly.express as px

# ✅ Rename columns to Prophet's convention (ds = date, y = target)
df = data_ts.rename(columns={'Year': 'ds', 'HSI_Type': 'y'})

# Prophet expects a datetime-like column
df['ds'] = pd.to_datetime(df['ds'], format='%Y')


In [None]:
# Container to store all forecasts
forecasts = []

# Loop through each state
for state, group in df.groupby('State'):
    # Prophet requires at least 2 data points for fitting
    if len(group) < 2:
        print(f"Skipping state {state} due to insufficient data (less than 2 years).")
        continue

    m = Prophet(
        yearly_seasonality=True,
        changepoint_prior_scale=0.3
    )

    m.fit(group[['ds', 'y']])

    # Future dataframe: forecast the next 5 years
    future = m.make_future_dataframe(periods=5, freq='YE')
    forecast = m.predict(future)

    forecast['State'] = state
    forecasts.append(forecast[['ds', 'yhat', 'yhat_lower', 'yhat_upper', 'State']])

# Combine all states
forecast_df = pd.concat(forecasts, ignore_index=True)

INFO:prophet:Disabling weekly seasonality. Run prophet with weekly_seasonality=True to override this.
INFO:prophet:Disabling daily seasonality. Run prophet with daily_seasonality=True to override this.
INFO:prophet:n_changepoints greater than number of observations. Using 1.

'Y' is deprecated and will be removed in a future version, please use 'YE' instead.

INFO:prophet:n_changepoints greater than number of observations. Using 9.
INFO:prophet:n_changepoints greater than number of observations. Using 11.

Skipping state CHANDIGARH due to insufficient data (less than 2 years).

INFO:prophet:n_changepoints greater than number of observations. Using 8.

Skipping state Dadra & Nagar Haveli due to insufficient data (less than 2 years).
Skipping state Daman due to insufficient data (less than 2 years).
Skipping state Diu due to insufficient data (less than 2 years).

INFO:prophet:n_changepoints greater than number of observations. Using 0.

In [None]:
forecast_df

Unnamed: 0,ds,yhat,yhat_lower,yhat_upper,State
0,2015-01-01,2.000000e+00,2.000000e+00,2.000000e+00,A&N ISLAND
1,2016-01-01,2.933969e-08,2.930652e-08,2.937428e-08,A&N ISLAND
2,2025-01-01,-2.239593e-08,-2.242734e-08,-2.236124e-08,A&N ISLAND
3,2025-12-31,-7.993771e+00,-7.993771e+00,-7.993771e+00,A&N ISLAND
4,2026-12-31,-9.990365e+00,-9.990365e+00,-9.990365e+00,A&N ISLAND
...,...,...,...,...,...
1040,2025-12-31,1.924805e+00,1.165987e-01,3.896542e+00,West Bengal
1041,2026-12-31,1.892169e+00,3.004267e-02,3.708545e+00,West Bengal
1042,2027-12-31,1.859031e+00,-3.433928e-03,3.802883e+00,West Bengal
1043,2028-12-31,1.913490e+00,2.771984e-01,3.806125e+00,West Bengal
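
The forecast table shows `yhat` values well outside the valid class ids 0 to 4 (for example about -7.99 for A&N ISLAND). One possible post-processing step, offered only as a sketch and not as part of the notebook, is to round and clamp the forecast. This assumes the class ids can be treated as a numeric scale, which is itself questionable because they are K-Means cluster labels.

```python
import numpy as np

# A few yhat values copied from the forecast table above
yhat = np.array([2.0, 2.933969e-08, -7.993771, -9.990365, 1.924805])

# Round to the nearest integer, then clamp to the valid class ids 0..4
hsi_type = np.clip(np.rint(yhat), 0, 4).astype(int)
print(hsi_type.tolist())  # [2, 0, 0, 0, 2]
```

Clamping hides the symptom rather than fixing it; forecasting the underlying indicators (AQI, recycling rate, and so on) and classifying the forecasted values may be a sounder design for this section.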



---

### 🌍 **Future Work: Towards Intelligent Sustainability for Human Life**

In the upcoming final stage, an **LLM-powered intelligence layer** will transform complex data into **human-readable insights and localized recommendations**, helping **states and cities** enhance livability, align human progress with nature, and take smarter steps toward a **truly sustainable civilization**. ✨



---

