# Artificial Neural Networks with Dimensions Grant Data

This notebook walks through key neural network concepts using **Dimensions-style grant data** as the running example.

1. **Feed-forward ANNs (Dense networks)**  
   - Inputs, weights, bias, activation functions (ReLU, sigmoid)  
   - Gradient descent (mini-batch via Adam)  
   - Binary classification: AI vs non-AI grants  

2. **Logical functions with tiny networks**  
   - Showing how a single neuron can learn AND / OR-like behavior  

3. **Deep Neural Networks (DNNs) for regression**  
   - Predicting 5-year citations from funding & topic scores  

4. **Overfitting & Dropout**  
   - Adding dropout layers to improve generalization  

5. **1D Convolutional Neural Networks (CNN-style) on sequences**  
   - Using yearly citation counts as a “time-series” input  

6. **Recurrent Neural Networks (RNNs / LSTMs)**  
   - Using citation sequences to classify grants  

We assume a `grants` dataset exported from Dimensions into CSV or a DataFrame, with columns like:

- `grant_id`
- `topic_ai_score`, `topic_bioinfo_score`, `topic_data_repo_score`
- `total_funding`
- `citations_5yr`
- `is_ai_ml` (0/1 label: is this an AI/ML-related grant?)
- `citations_FY19`, `citations_FY20`, ..., `citations_FY25` (per-year citation counts)

You can adapt column names to your actual pipeline.

# Imports & Data Loading

In [None]:
import pandas as pd
import numpy as np

from sklearn.model_selection import train_test_split

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# For reproducibility
np.random.seed(42)
tf.random.set_seed(42)

# --- Load Dimensions-style grants data ---

# Option 1: from CSV exported from Dimensions
# grants = pd.read_csv("grants.csv")

# Option 2: if already in memory, just ensure it has the needed columns
expected_cols = [
    "grant_id",
    "topic_ai_score",
    "topic_bioinfo_score",
    "topic_data_repo_score",
    "total_funding",
    "citations_5yr",
    "is_ai_ml",
    "citations_FY19", "citations_FY20", "citations_FY21",
    "citations_FY22", "citations_FY23", "citations_FY24", "citations_FY25"
]

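# Look up `grants` without raising NameError if it has not been loaded yet
loaded_cols = globals().get("grants", pd.DataFrame()).columns
missing = [c for c in expected_cols if c not in loaded_cols]
if missing:
    print("NOTE: grants DataFrame not found or missing columns:", missing)
    print("Expected at least:", expected_cols)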
    # You can uncomment the CSV line above and reload.
else:
    print("Grants data loaded with columns:", grants.columns.tolist())

# Basic Feed-Forward ANN (Classification)

## 1. Basic Feed-Forward ANN: Classifying AI vs Non-AI Grants

**Goal:**  
Use a simple artificial neural network to classify whether a grant is AI/ML-related (`is_ai_ml = 1`) based on numeric features such as topic scores and funding.

**Key concepts:**

- **Inputs:** topic scores, log funding  
- **Weights & bias:** learned parameters in Dense layers  
- **Activation functions:**
  - ReLU in hidden layers (non-linear, `max(0, x)`)  
  - Sigmoid in output layer (outputs in (0,1), interpretable as the predicted probability that the grant is AI/ML-related)  
- **Training:** mini-batch gradient descent via Adam optimizer  
- **Loss:** binary cross-entropy for 0/1 classification  
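
To make the concepts above concrete, here is a dependency-free sketch of one forward pass through a tiny network, followed by the binary cross-entropy loss. The input values and all weights are hand-picked for illustration (assumptions, not learned parameters), and the helper names `dense` and `binary_cross_entropy` are hypothetical, not Keras functions.

```python
import math

def relu(z):
    return max(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense(inputs, weights, biases, activation):
    """One Dense layer: each unit computes activation(w . x + b)."""
    return [
        activation(sum(w * x for w, x in zip(unit_w, inputs)) + b)
        for unit_w, b in zip(weights, biases)
    ]

def binary_cross_entropy(y_true, p):
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))

# Hypothetical grant: [topic_ai_score, topic_bioinfo_score,
#                      topic_data_repo_score, log funding]
x = [0.9, 0.1, 0.2, 13.0]

# Hand-picked illustrative parameters (a trained model would learn these).
hidden_w = [[2.0, -1.0, 0.0, 0.0], [-1.0, 1.0, 1.0, 0.0]]
hidden_b = [0.0, 0.0]
out_w = [[1.5, -2.0]]
out_b = [-0.5]

hidden = dense(x, hidden_w, hidden_b, relu)      # ReLU zeroes the negative unit
prob = dense(hidden, out_w, out_b, sigmoid)[0]   # value in (0, 1)

loss_if_ai = binary_cross_entropy(1, prob)       # small when prob is near 1
loss_if_not_ai = binary_cross_entropy(0, prob)   # large when prob is near 1
print(hidden, prob, loss_if_ai, loss_if_not_ai)
```

Training adjusts the weights and biases so that the loss, averaged over each mini-batch, goes down.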

# Feed-Forward ANN (Classification)

In [None]:
# --- Prepare features and labels ---

features = ["topic_ai_score", "topic_bioinfo_score", "topic_data_repo_score", "total_funding"]

grants_nn = grants.copy()

# Handle missing values
grants_nn[features] = grants_nn[features].fillna(0.0)

# Stabilize funding with log1p
grants_nn["total_funding"] = np.log1p(grants_nn["total_funding"])

X = grants_nn[features].values
y = grants_nn["is_ai_ml"].astype(int).values

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# --- Define the model: Input -> Dense(ReLU) -> Dense(ReLU) -> Dense(sigmoid) ---

model = keras.Sequential([
    layers.Input(shape=(X_train.shape[1],)),   # input layer
    layers.Dense(16, activation="relu"),       # hidden layer with ReLU
    layers.Dense(8, activation="relu"),        # another hidden layer
    layers.Dense(1, activation="sigmoid")      # output layer with sigmoid
])

model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-3),
    loss="binary_crossentropy",
    metrics=["accuracy"]
)

history = model.fit(
    X_train, y_train,
    epochs=20,
    batch_size=32,          # mini-batch gradient descent
    validation_split=0.2,
    verbose=0
)

test_loss, test_acc = model.evaluate(X_test, y_test, verbose=0)
print("Test accuracy (AI vs non-AI):", test_acc)

# Logical Functions (AND / OR)

## 2. Logical Functions with a Tiny Network

Before scaling to many inputs, show that a **single neuron** can represent simple logical functions like **AND** and **OR**.

**Example:**

- Input 1: `has_ai` (1 if AI topic score > threshold, else 0)  
- Input 2: `has_repo` (1 if data-repository topic score > threshold, else 0)  
- Output: `1` if both are 1 (logical AND), else 0.  

Train a one-layer network to learn this mapping directly from data.
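
Before training anything, it may help to see that suitable weights exist. The sketch below sets the weights and bias of a single sigmoid neuron by hand (illustrative values, not learned ones) and checks the full truth table. The `neuron` helper is hypothetical.

```python
import math

def neuron(x1, x2, w1, w2, b):
    """Single sigmoid neuron: sigmoid(w1*x1 + w2*x2 + b)."""
    return 1.0 / (1.0 + math.exp(-(w1 * x1 + w2 * x2 + b)))

truth_table = [(0, 0), (0, 1), (1, 0), (1, 1)]

# Illustrative, hand-picked parameters (not learned values).
and_params = dict(w1=10.0, w2=10.0, b=-15.0)  # fires only when both inputs are 1
or_params = dict(w1=10.0, w2=10.0, b=-5.0)    # fires when at least one input is 1

and_out = [int(neuron(a, r, **and_params) > 0.5) for a, r in truth_table]
or_out = [int(neuron(a, r, **or_params) > 0.5) for a, r in truth_table]

print("AND:", and_out)  # [0, 0, 0, 1]
print("OR: ", or_out)   # [0, 1, 1, 1]
```

The only difference between the two functions is the bias, which acts as the threshold the weighted sum must exceed.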

In [None]:
# Create binary flags from topic scores
logic_df = grants.copy()
logic_df["has_ai"] = (logic_df["topic_ai_score"] > 0.5).astype(int)
logic_df["has_repo"] = (logic_df["topic_data_repo_score"] > 0.5).astype(int)

# Define label for AND
logic_df["label_and"] = (logic_df["has_ai"] & logic_df["has_repo"]).astype(int)

X_logic = logic_df[["has_ai", "has_repo"]].values
y_logic = logic_df["label_and"].values

model_and = keras.Sequential([
    layers.Input(shape=(2,)),
    layers.Dense(1, activation="sigmoid")   # a single neuron with sigmoid
])

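# Keras's default SGD learning rate (0.01) may be too small to fit AND in
# 500 full-batch steps, so set a larger one explicitly (0.5 is an untuned choice).
model_and.compile(
    optimizer=keras.optimizers.SGD(learning_rate=0.5),
    loss="binary_crossentropy",
    metrics=["accuracy"],
)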

# Batch gradient descent: use all rows as a single batch for simplicity
model_and.fit(
    X_logic, y_logic,
    epochs=500,
    batch_size=len(X_logic),
    verbose=0
)

loss_and, acc_and = model_and.evaluate(X_logic, y_logic, verbose=0)
print("Training accuracy (AND function):", acc_and)
print("Learned weights and bias:", model_and.get_weights())
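
The batch gradient descent used above can also be written out by hand for a single sigmoid neuron. This is a dependency-free sketch, not Keras's implementation. It trains on the four-row AND truth table rather than on the grants data, and the learning rate and epoch count are untuned assumptions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# AND truth table: inputs (has_ai, has_repo) and target labels.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [0, 0, 0, 1]

w1, w2, b = 0.0, 0.0, 0.0
learning_rate = 1.0  # illustrative choice, not tuned

for epoch in range(5000):
    # Full-batch gradient of the mean binary cross-entropy.
    g1 = g2 = gb = 0.0
    for (x1, x2), target in zip(X, y):
        error = sigmoid(w1 * x1 + w2 * x2 + b) - target
        g1 += error * x1
        g2 += error * x2
        gb += error
    n = len(X)
    w1 -= learning_rate * g1 / n
    w2 -= learning_rate * g2 / n
    b -= learning_rate * gb / n

predictions = [int(sigmoid(w1 * x1 + w2 * x2 + b) > 0.5) for x1, x2 in X]
print(predictions, (w1, w2, b))
```

The learned weights come out positive and the bias negative, the same pattern a hand-built AND neuron would use.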

# Deep Neural Network for Regression

## 3. Deep Neural Network (DNN) for Regression

**Goal:**  
Predict **5-year citations** (`citations_5yr`) from grant features.

**Concepts:**

- **Continuous output** → use a single linear neuron (`activation="linear"`) in the last layer.
- **Loss function:** Mean Squared Error (MSE, the L₂ loss) or Mean Absolute Error (MAE, the L₁ loss).
- **Deep network:** multiple hidden layers with ReLU allow modeling non-linear relationships between funding, topic scores, and citations.
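
The two losses respond very differently to large errors. The sketch below computes both on made-up citation counts (illustration only, not Dimensions data). The helpers `mse` and `mae` are hypothetical stand-ins for the Keras `"mse"` loss and `"mae"` metric.

```python
def mse(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mae(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Made-up citation counts for illustration only.
actual = [3, 10, 0, 250]
predicted = [5, 8, 1, 50]

print(mse(actual, predicted))  # (4 + 4 + 1 + 40000) / 4 = 10002.25
print(mae(actual, predicted))  # (2 + 2 + 1 + 200) / 4 = 51.25
```

A single badly predicted, highly cited grant dominates MSE, while MAE grows only linearly with that error.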

# DNN Regression

In [None]:
features_reg = ["topic_ai_score", "topic_bioinfo_score", "topic_data_repo_score", "total_funding"]

reg_df = grants.copy()
reg_df[features_reg] = reg_df[features_reg].fillna(0.0)
reg_df["total_funding"] = np.log1p(reg_df["total_funding"])
reg_df["citations_5yr"] = reg_df["citations_5yr"].fillna(0.0)

Xr = reg_df[features_reg].values
yr = reg_df["citations_5yr"].values

Xr_train, Xr_test, yr_train, yr_test = train_test_split(
    Xr, yr, test_size=0.2, random_state=42
)

dnn = keras.Sequential([
    layers.Input(shape=(Xr_train.shape[1],)),
    layers.Dense(64, activation="relu"),
    layers.Dense(32, activation="relu"),
    layers.Dense(1, activation="linear")   # regression output
])

dnn.compile(optimizer="adam", loss="mse", metrics=["mae"])
dnn.fit(
    Xr_train, yr_train,
    epochs=30,
    batch_size=32,
    validation_split=0.2,
    verbose=0
)

test_mse, test_mae = dnn.evaluate(Xr_test, yr_test, verbose=0)
print("Test MAE (citations regression):", test_mae)

# Overfitting and Dropout

## 4. Overfitting & Dropout

**Overfitting:**  
A model performs very well on training data but poorly on unseen data (test/validation).

**Dropout:**  
During training, randomly "drops" (disables) a fraction of units in a layer at each step.  
This discourages the network from relying too heavily on any one path and usually **improves generalization**.
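
As a rough sketch of the mechanism, the function below applies "inverted" dropout to a list of activations: surviving units are scaled by `1 / (1 - rate)` during training so the expected sum is unchanged, and nothing is dropped at inference time. The `dropout` helper and the activation values are illustrative assumptions. Keras's `Dropout` layer applies the same scaling but operates on tensors.

```python
import random

def dropout(activations, rate, training, rng):
    """Inverted dropout: zero each unit with probability `rate` during training
    and scale survivors by 1 / (1 - rate); pass values through at inference."""
    if not training:
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

rng = random.Random(0)  # seeded for repeatability
hidden = [0.5, 1.2, 0.0, 2.0, 0.7, 1.5]  # made-up hidden-layer activations

train_out = dropout(hidden, rate=0.5, training=True, rng=rng)
infer_out = dropout(hidden, rate=0.5, training=False, rng=rng)

print(train_out)  # each entry is either 0.0 or the original value doubled
print(infer_out)  # identical to `hidden`
```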

Below: a wider variant of the earlier classification model (32 and 16 hidden units) with a **Dropout layer** added after the first hidden layer.

# Classification with Dropout

In [None]:
model_do = keras.Sequential([
    layers.Input(shape=(X_train.shape[1],)),
    layers.Dense(32, activation="relu"),
    layers.Dropout(0.5),              # randomly drop 50% of units during training
    layers.Dense(16, activation="relu"),
    layers.Dense(1, activation="sigmoid")
])

model_do.compile(
    optimizer="adam",
    loss="binary_crossentropy",
    metrics=["accuracy"]
)

history_do = model_do.fit(
    X_train, y_train,
    epochs=30,
    batch_size=32,
    validation_split=0.2,
    verbose=0
)

loss_do, acc_do = model_do.evaluate(X_test, y_test, verbose=0)
print("Test accuracy with dropout:", acc_do)

# CNN on Time Series

## 5. 1D Convolutional Network on Citation Time Series

Although CNNs are famous for **images**, the same ideas apply to **1D sequences**.

We’ll use:

- Input: yearly citations `citations_FY19`–`citations_FY25` (7 time steps).
- Task: classify whether a grant is AI/ML (`is_ai_ml`).

**Concepts:**

- **Convolution (Conv1D):** sliding filters over the sequence to detect local patterns.
- **Pooling (MaxPooling1D):** down-sampling to keep strongest responses and reduce size.
- **Dense layers:** to map extracted features to a classification output.
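
The first two operations can be sketched without any framework. The example below slides one hand-picked filter over a made-up seven-year citation sequence and then max-pools the result. The filter weights and data are illustrative assumptions, and the ReLU that follows the convolution in a real layer is omitted for brevity.

```python
def conv1d_valid(seq, kernel):
    """Slide `kernel` over `seq` without padding (cross-correlation, as in Conv1D)."""
    k = len(kernel)
    return [
        sum(w * x for w, x in zip(kernel, seq[i:i + k]))
        for i in range(len(seq) - k + 1)
    ]

def max_pool1d(seq, pool_size):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [
        max(seq[i:i + pool_size])
        for i in range(0, len(seq) - pool_size + 1, pool_size)
    ]

# Made-up yearly citation counts (FY19..FY25) for illustration only.
citations = [0, 1, 3, 8, 12, 9, 4]
growth_filter = [-1, 0, 1]  # responds to increases across a three-year window

features = conv1d_valid(citations, growth_filter)
pooled = max_pool1d(features, pool_size=2)

print(features)  # [3, 7, 9, 1, -8]
print(pooled)    # [7, 9]
```

Note how quickly the length shrinks: 7 steps become 5 after an unpadded size-3 convolution and 2 after pooling, so a further unpadded size-3 convolution would not fit. Setting `padding="same"` on `Conv1D` is a common way to avoid this on short sequences.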

# 1D CNN on Citation Sequence

In [None]:
# Build sequence array: [citations_FY19 ... citations_FY25]
year_cols = [f"citations_FY{y}" for y in range(19, 26)]

seq_df = grants.copy()
seq_df[year_cols] = seq_df[year_cols].fillna(0.0)
seq_df["is_ai_ml"] = seq_df["is_ai_ml"].astype(int)

X_seq = seq_df[year_cols].values  # (N, 7)
y_seq = seq_df["is_ai_ml"].values

# Reshape for Conv1D: (samples, timesteps, channels)
X_seq = X_seq[..., np.newaxis]  # add channel dim: (N, 7, 1)

X_seq_train, X_seq_test, y_seq_train, y_seq_test = train_test_split(
    X_seq, y_seq, test_size=0.2, random_state=42, stratify=y_seq
)

cnn = keras.Sequential([
    layers.Input(shape=(X_seq_train.shape[1], 1)),  # (7, 1)
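    # padding="same" keeps the sequence length; without it, the 7-step input
    # shrinks to 5, then 2 after pooling, too short for a second size-3 kernel.
    layers.Conv1D(16, kernel_size=3, padding="same", activation="relu"),
    layers.MaxPooling1D(pool_size=2),
    layers.Conv1D(32, kernel_size=3, padding="same", activation="relu"),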
    layers.GlobalMaxPooling1D(),
    layers.Dense(16, activation="relu"),
    layers.Dense(1, activation="sigmoid")
])

cnn.compile(
    optimizer="adam",
    loss="binary_crossentropy",
    metrics=["accuracy"]
)

cnn.fit(
    X_seq_train, y_seq_train,
    epochs=20,
    batch_size=32,
    validation_split=0.2,
    verbose=0
)

loss_cnn, acc_cnn = cnn.evaluate(X_seq_test, y_seq_test, verbose=0)
print("1D CNN accuracy (sequence-based AI classification):", acc_cnn)

# RNN / LSTM on Citation Sequences

## 6. Recurrent Neural Network (LSTM) on Citation Sequences

**Recurrent Neural Networks (RNNs)**, and especially **LSTMs**, are designed for **sequences** where past information matters.

Here, we again use annual citations FY19–FY25 as a time series and try to classify whether a grant is AI/ML-related.

**Concepts:**

- **Recurrent connections:** an internal state carries information forward across time steps.
- **Applications:** language, time-series forecasting, video, translation, etc.
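
The idea of a state carried across time steps can be sketched with a scalar simple RNN. This is deliberately simpler than an LSTM, which adds gates that control what is kept and forgotten. The `simple_rnn` helper, the parameters, and the input values are illustrative assumptions, not learned or real data.

```python
import math

def simple_rnn(sequence, w_x, w_h, b):
    """Scalar simple RNN: h_t = tanh(w_x * x_t + w_h * h_{t-1} + b)."""
    h = 0.0
    states = []
    for x_t in sequence:
        h = math.tanh(w_x * x_t + w_h * h + b)
        states.append(h)
    return states

# Made-up, log-scaled yearly citation values for illustration only.
early_burst = [2.0, 0.0, 0.0, 0.0]
no_activity = [0.0, 0.0, 0.0, 0.0]

# Hand-picked illustrative parameters (a trained network would learn these).
params = dict(w_x=0.5, w_h=0.9, b=0.0)

burst_states = simple_rnn(early_burst, **params)
quiet_states = simple_rnn(no_activity, **params)

print(burst_states)  # state decays but stays positive after the first year
print(quiet_states)  # stays at 0.0 throughout
```

Although both sequences end with three zero-citation years, the final states differ because the first sequence's early burst is still reflected in the hidden state.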

# LSTM on Citation Sequence

In [None]:
X_seq_train, X_seq_test, y_seq_train, y_seq_test = train_test_split(
    X_seq, y_seq, test_size=0.2, random_state=42, stratify=y_seq
)

rnn = keras.Sequential([
    layers.Input(shape=(X_seq_train.shape[1], 1)),  # (7, 1)
    layers.LSTM(32, return_sequences=False),
    layers.Dense(16, activation="relu"),
    layers.Dense(1, activation="sigmoid")
])

rnn.compile(
    optimizer="adam",
    loss="binary_crossentropy",
    metrics=["accuracy"]
)

rnn.fit(
    X_seq_train, y_seq_train,
    epochs=20,
    batch_size=32,
    validation_split=0.2,
    verbose=0
)

loss_rnn, acc_rnn = rnn.evaluate(X_seq_test, y_seq_test, verbose=0)
print("LSTM accuracy (sequence-based AI classification):", acc_rnn)

## 7. Summary

In this notebook, we:

- Built **feed-forward ANNs** to classify AI vs non-AI grants and to predict citation counts.
- Demonstrated how tiny networks can learn **logical functions** like AND.
- Used **deep networks** (multiple hidden layers) to model non-linear relationships.
- Introduced **overfitting** and showed how adding **dropout** can improve generalization.
- Applied **convolutional ideas** (Conv1D + pooling) to Dimensions **time-series** data.
- Applied an **LSTM-based RNN** to yearly citations to classify grants.

You can adapt:

- **Inputs:** swap topic scores, abstract embeddings, funding, or PI-country indicators.  
- **Outputs:** predict probability of being “high-impact,” funding decisions, or future citations.  
- **Architectures:** deeper networks, different activations, or multi-task setups.

Next steps (if you want to extend this):

- Use **text embeddings** (e.g., from a transformer) of abstracts as ANN inputs.
- Add **equity features** (country income level, region) and compare models with and without them.
- Turn models into **screening tools** for highlighting potentially impactful or data-science-heavy grants.