**Install Antigranular Package**

In [1]:
!pip install antigranular &> /dev/null

**Login to the Antigranular Platform.**


In [2]:
import antigranular as ag
session = ag.login(<client_id>,<client_secret>, competition = "Global Economic Forecast Hackathon with Texas A&M Aggie Data Science")


Dataset "Statistical Performance Indicators" loaded to the kernel as statistical_performance_indicators
Key Name                       Value Type     
---------------------------------------------
train_x                        PrivateDataFrame
train_y                        PrivateDataFrame
test_x                         DataFrame      

Connected to Antigranular server session id: f3ea511d-ec2f-4a99-aee4-bcb42916fd9d, the session will time out if idle for 25 minutes
Cell magic '%%ag' registered successfully, use `%%ag` in a notebook cell to execute your python code on Antigranular private python server
🚀 Everything's set up and ready to roll!


In [3]:
%%ag
x_train = statistical_performance_indicators["train_x"]
y_train = statistical_performance_indicators["train_y"]
x_test = statistical_performance_indicators["test_x"]

**Data Description**


| Column Name         | Description                                                |
|-------------------|------------------------------------------------------------|
| intl_org_use      | Measures international organizations' usage of national data. |
| social_stats      | Reflects the quality of social statistics like health and education. |
| economic_stats    | Assesses the reliability of economic data for analysis.    |
| inst_stats        | Evaluates the quality of governance and institutional data. |
| pov_ratio         | Tracks data availability on the poverty headcount ratio.   |
| child_mortality   | Monitors under-5 mortality rate data quality and availability. |
| debt_service      | Measures the quality of national debt service data.        |
| safe_water        | Assesses the availability of safely managed water data.    |
| labor_force       | Evaluates labor force participation data by sex and age.   |
| no_poverty        | Tracks data quality for achieving SDG 1: No Poverty.       |
| zero_hunger       | Monitors data for achieving SDG 2: Zero Hunger.            |
| good_health       | Assesses data for achieving SDG 3: Good Health and Well-being. |
| quality_edu       | Measures data availability for SDG 4: Quality Education.   |
| gender_eq         | Evaluates data for achieving SDG 5: Gender Equality.       |
| clean_water       | Tracks data quality for SDG 6: Clean Water and Sanitation. |
| clean_energy      | Assesses data for achieving SDG 7: Affordable Clean Energy. |
| decent_work       | Monitors data for achieving SDG 8: Decent Work and Growth. |
| innovation        | Measures data for achieving SDG 9: Industry and Innovation. |
| reduced_ineq      | Tracks data availability for SDG 10: Reduced Inequalities. |
| cities            | Evaluates data for SDG 11: Sustainable Cities.             |
| consump_prod      | Measures data for SDG 12: Responsible Consumption.         |
| life_land         | Assesses data quality for SDG 15: Life on Land.            |
| peace_justice     | Tracks data for SDG 16: Peace and Justice.                 |
| partnerships      | Monitors data for SDG 17: Global Partnerships.             |


In [4]:
%%ag
x_train.info()

+----+-----------------+-------------+---------------+---------+----------+
|    | Column          | numerical   | categorical   | dtype   | bounds   |
|----+-----------------+-------------+---------------+---------+----------|
|  0 | intl_org_use    | True        | False         | float64 | (0, 1)   |
|  1 | social_stats    | True        | False         | float64 | (0, 1)   |
|  2 | economic_stats  | True        | False         | float64 | (0, 1)   |
|  3 | inst_stats      | True        | False         | float64 | (0, 1)   |
|  4 | pov_ratio       | True        | False         | float64 | (0, 1)   |
|  5 | child_mortality | True        | False         | float64 | (0, 1)   |
|  6 | debt_service    | True        | False         | float64 | (0, 1)   |
|  7 | safe_water      | True        | False         | float64 | (0, 1)   |
|  8 | labor_force     | True        | False         | float64 | (0, 1)   |
|  9 | no_poverty      | True        | False         | float64 | (0, 1)   |
| 10 | zero_

**Data Summary**

Based on this summary, I will not carry out any additional preprocessing for now, for three reasons:

1. All Columns Are Numerical

Each column has its numerical flag set to True and its categorical flag set to False, so there are no categorical variables that need encoding.

2. Data Type Consistency

All columns have a data type (dtype) of float64, which keeps mathematical and statistical operations uniform.

3. Value Bounds

All columns have bounds of (0, 1), which suggests the features are already scaled to a common range, so no further scaling is applied.
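The checks above can be reproduced on any ordinary table. The sketch below is a minimal, hypothetical illustration in plain Python: the column values are invented (they are not competition data), and `needs_preprocessing` is an illustrative helper, not part of Antigranular.

```python
# Toy stand-in for a feature table: column name -> list of values.
# These numbers are invented for illustration only.
toy_columns = {
    "intl_org_use": [0.10, 0.55, 0.90],
    "social_stats": [0.00, 0.40, 1.00],
    "economic_stats": [0.25, 0.75, 0.50],
}

def needs_preprocessing(columns, lower=0.0, upper=1.0):
    """Return the names of columns that are not float-valued within [lower, upper]."""
    flagged = []
    for name, values in columns.items():
        # The type check runs first, so non-numeric values never reach the comparison.
        ok = all(isinstance(v, float) and lower <= v <= upper for v in values)
        if not ok:
            flagged.append(name)
    return flagged

flagged = needs_preprocessing(toy_columns)
print(flagged)  # an empty list means no encoding or rescaling is needed
```

A column holding strings, or a value outside the bounds, would be returned in the list and would call for encoding or scaling before training.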

In [5]:
%%ag
y_train.info()

+----+----------+-------------+---------------+---------+----------+
|    | Column   | numerical   | categorical   | dtype   | bounds   |
|----+----------+-------------+---------------+---------+----------|
|  0 | income   | True        | False         | int64   | (0, 1)   |
+----+----------+-------------+---------------+---------+----------+



For our `Y` target, `income` is a binary label: 0 groups low and lower middle income, and 1 groups upper middle and high income.

```
Income levels:
    1. 'Low income': 0,
    2. 'Lower middle income': 0,
    3. 'Upper middle income': 1,
    4. 'High income': 1,
```
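As a small illustration of this grouping, the sketch below encodes income-level names as the binary target. The competition data already arrives encoded, so this is only meant to make the mapping concrete; `encode_income` is a hypothetical helper.

```python
# Mapping copied from the income-level list above.
INCOME_TO_LABEL = {
    "Low income": 0,
    "Lower middle income": 0,
    "Upper middle income": 1,
    "High income": 1,
}

def encode_income(levels):
    """Convert a list of income-level names to the binary target."""
    return [INCOME_TO_LABEL[level] for level in levels]

encoded = encode_income(["Low income", "High income", "Upper middle income"])
print(encoded)  # → [0, 1, 1]
```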




**Model Architecture**

A Sequential model is used for its simplicity and suitability for straightforward feedforward neural networks.

*Dense Layers*

Four dense layers with decreasing neuron counts (512 → 256 → 128 → 64) are implemented to allow the model to capture complex patterns in the data while progressively reducing dimensionality.

All layers use the *ReLU activation function* for non-linearity, ensuring efficient learning.

*Output Layer*

The final layer is a dense layer with a single neuron and *sigmoid activation function*, suitable for binary classification tasks.
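To give a sense of the model's size, the sketch below counts trainable parameters for the layer widths described above, assuming 24 input features (the number of columns in the data description). Each dense layer has one weight per input-output pair plus one bias per output unit; `dense_param_count` is an illustrative helper.

```python
# Input width followed by the width of each dense layer, including the output.
layer_sizes = [24, 512, 256, 128, 64, 1]

def dense_param_count(sizes):
    """Return the parameter count (weights + biases) of each dense layer."""
    counts = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        counts.append(n_in * n_out + n_out)
    return counts

per_layer = dense_param_count(layer_sizes)
print(per_layer)       # → [12800, 131328, 32896, 8256, 65]
print(sum(per_layer))  # → 185345
```

Most of the parameters sit in the second layer (512 → 256), which is worth keeping in mind when the training set is small.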

**Kernel Initialization and Regularization**

*Glorot Uniform Initialization*

Selected for all layers to balance the weight distribution. It sets the initial weights based on the number of input and output units of a layer so that the variance of the activations and gradients stays roughly consistent across layers, which makes it easier for the model to converge.
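The rule behind Glorot uniform initialization is short enough to write out: weights are drawn uniformly from `[-limit, limit]` with `limit = sqrt(6 / (fan_in + fan_out))`, which gives a variance of `2 / (fan_in + fan_out)`. The sketch below is a plain-Python illustration of that formula, not the Keras implementation; the function names and the seed are assumptions made for the example.

```python
import math
import random

def glorot_uniform_limit(fan_in, fan_out):
    """Half-width of the uniform range used by Glorot uniform initialization."""
    return math.sqrt(6.0 / (fan_in + fan_out))

def glorot_uniform_weights(fan_in, fan_out, seed=0):
    """Draw a fan_in x fan_out weight matrix from U(-limit, limit)."""
    rng = random.Random(seed)
    limit = glorot_uniform_limit(fan_in, fan_out)
    return [[rng.uniform(-limit, limit) for _ in range(fan_out)]
            for _ in range(fan_in)]

# First hidden layer of the model: 24 inputs, 512 units.
limit = glorot_uniform_limit(24, 512)
weights = glorot_uniform_weights(24, 512)
print(round(limit, 4))  # → 0.1058
```

Wider layers get a smaller range, so the scale of the activations does not grow with the layer width.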


*L2 Regularization*

Added to each hidden layer (coefficient 0.01) to penalize large weight values, helping prevent overfitting and improve generalization.
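Concretely, an L2 kernel regularizer adds the coefficient times the sum of squared weights to the loss. The sketch below computes that penalty for a small made-up kernel; `l2_penalty` and `toy_kernel` are illustrative names, and 0.01 is the coefficient used in this notebook's model.

```python
def l2_penalty(kernel, coefficient=0.01):
    """Return coefficient * sum of squared entries of a 2-D weight matrix."""
    return coefficient * sum(w * w for row in kernel for w in row)

# Invented 2x2 kernel, for illustration only.
toy_kernel = [[0.5, -1.0], [2.0, 0.0]]

penalty = l2_penalty(toy_kernel)
print(penalty)  # approximately 0.0525, i.e. 0.01 * (0.25 + 1.0 + 4.0 + 0.0)
```

Because the penalty grows with the square of each weight, the single large entry (2.0) contributes most of it, which is how the term discourages large weights.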

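On privacy:

L2 Norm Clipping (`l2_norm_clip`) - Clips the L2 norm of each training example's gradient, limiting how much any single example can influence an update.

Noise Multiplier (`noise_multiplier`) - Scales the noise added to the clipped gradients during training to preserve privacy.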
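How these two settings interact can be sketched with the generic DP-SGD recipe: clip each example's gradient, sum the clipped gradients, add Gaussian noise with standard deviation `noise_multiplier * l2_norm_clip`, then average. The plain-Python code below is an illustration under that assumption; the notebook does not show how `PrivateKerasModel` implements this internally, and the function names, toy gradients, and seed are hypothetical.

```python
import math
import random

def clip_gradient(grad, l2_norm_clip):
    """Scale a gradient vector down so its L2 norm is at most l2_norm_clip."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, l2_norm_clip / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]

def private_mean_gradient(per_example_grads, l2_norm_clip=1.0,
                          noise_multiplier=1.0, seed=0):
    """Clip each gradient, sum, add Gaussian noise, and average over the batch."""
    rng = random.Random(seed)
    clipped = [clip_gradient(g, l2_norm_clip) for g in per_example_grads]
    stddev = noise_multiplier * l2_norm_clip  # noise scale tied to the clip norm
    dim = len(clipped[0])
    noisy_sum = [sum(g[i] for g in clipped) + rng.gauss(0.0, stddev)
                 for i in range(dim)]
    return [value / len(clipped) for value in noisy_sum]

# Invented per-example gradients with L2 norms 5.0, 0.5 and 10.0.
toy_grads = [[3.0, 4.0], [0.3, 0.4], [-6.0, 8.0]]
clipped_grads = [clip_gradient(g, 1.0) for g in toy_grads]
noisy_mean = private_mean_gradient(toy_grads)
print(clipped_grads)  # large gradients are rescaled to norm 1, the small one is unchanged
print(noisy_mean)
```

A smaller clip norm or a larger noise multiplier strengthens the privacy guarantee but makes each update noisier, which usually costs accuracy.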


In [37]:
%%ag
import tensorflow as tf
from op_pandas import standard_scaler, PrivateDataFrame
from tensorflow.keras.models import Sequential
from tensorflow.keras import regularizers

from tensorflow.keras.layers import Dense

from op_tensorflow import PrivateKerasModel, PrivateDataLoader

# Keras model
seqM = Sequential([
    Dense(512, activation='relu', kernel_initializer="glorot_uniform", kernel_regularizer=regularizers.l2(0.01), input_shape=(24,)),
    # Dropout(0.2),
    Dense(256, activation='relu', kernel_initializer='glorot_uniform', kernel_regularizer=regularizers.l2(0.01)),
    Dense(128, activation='relu', kernel_initializer='glorot_uniform', kernel_regularizer=regularizers.l2(0.01)),
    Dense(64, activation='relu', kernel_initializer='glorot_uniform',  kernel_regularizer=regularizers.l2(0.01)),
    Dense(1, activation='sigmoid', kernel_initializer='glorot_uniform')
])


# Create DP keras model
dp_model = PrivateKerasModel(model=seqM, l2_norm_clip=1.0, noise_multiplier=1.0)

In [38]:
# %%ag
# optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)
# dp_model.compile(
#     optimizer=optimizer,
#     loss='binary_crossentropy',
#     metrics=["accuracy", "Precision", "Recall", "AUC"]
#     )


In [39]:
# %%ag
# optimizer = tf.keras.optimizers.Adam(learning_rate=0.0001)
# dp_model.compile(
#     optimizer=optimizer,
#     loss='binary_crossentropy',
#     metrics=["accuracy", "Precision", "Recall", "AUC"]
#     )


In [40]:
%%ag
optimizer = tf.keras.optimizers.Adam(learning_rate=0.0004)
dp_model.compile(
    optimizer=optimizer,
    loss='binary_crossentropy',
    metrics=["accuracy", "Precision", "Recall"]
    )


In [41]:
%%ag
data_loader = PrivateDataLoader(feature_df=x_train , label_df=y_train, batch_size=64)

In [42]:
%%ag
# Epoch range: between 50 and 100
dp_model.fit(x=data_loader, epochs=100, target_delta=1e-5)

Epoch 1/100

58/58 - 6s - loss: 0.5811 - accuracy: 0.6752 - precision: 0.6844 - recall: 0.8919 - 6s/epoch - 97ms/step

Epoch 2/100

58/58 - 2s - loss: 0.4829 - accuracy: 0.7361 - precision: 0.8336 - recall: 0.7415 - 2s/epoch - 34ms/step

Epoch 3/100

58/58 - 2s - loss: 0.4458 - accuracy: 0.7633 - precision: 0.8413 - recall: 0.7717 - 2s/epoch - 33ms/step

Epoch 4/100

58/58 - 2s - loss: 0.4212 - accuracy: 0.7852 - precision: 0.8543 - recall: 0.7980 - 2s/epoch - 33ms/step

Epoch 5/100

58/58 - 2s - loss: 0.4270 - accuracy: 0.7749 - precision: 0.8374 - recall: 0.7922 - 2s/epoch - 33ms/step

Epoch 6/100

58/58 - 2s - loss: 0.3956 - accuracy: 0.8024 - precision: 0.8576 - recall: 0.8245 - 2s/epoch - 34ms/step

Epoch 7/100

58/58 - 2s - loss: 0.3969 - accuracy: 0.8015 - precision: 0.8539 - recall: 0.8299 - 2s/epoch - 34ms/step

Epoch 8/100

58/58 - 2s - loss: 0.3962 - accuracy: 0.7952 - precision: 0.8482 - recall: 0.8251 - 2s/epoch - 33ms/step

Epoch 9/100

58/58 - 2s - loss: 0.4003 - accurac

In [43]:
%%ag
y_pred = dp_model.predict(PrivateDataFrame(x_test), label_columns=["output"])





In [44]:
%%ag
def f(x: float) -> float:
  if x > 0.5:
    return 1
  else:
    return 0

y_pred["output"] = y_pred["output"].map(f, output_bounds=(0, 1))
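The thresholding rule in the cell above can be checked outside the private kernel on ordinary numbers. This is a minimal sketch with made-up probabilities; `to_label` mirrors `f` but is a hypothetical stand-alone helper.

```python
def to_label(probability, threshold=0.5):
    """Return 1 when the predicted probability is strictly above the threshold."""
    return 1 if probability > threshold else 0

# Invented sigmoid outputs, for illustration only.
toy_probabilities = [0.12, 0.50, 0.51, 0.97]
labels = [to_label(p) for p in toy_probabilities]
print(labels)  # → [0, 0, 1, 1]
```

Note that a probability of exactly 0.5 maps to class 0, because the comparison is strict.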


In [45]:
# %%ag
# result = submit_predictions(y_pred)

In [46]:
# session.privacy_odometer()
