# Phase 1, Step 2: Model & HGC Core Architecture Build

**Objective:** To construct the core HGC module and integrate it into a `bert-base-uncased` model, creating our experimental architecture.

This notebook will:
1. Import the `HolographicKnowledgeManifold` (HKM) class from our new `hgc_core` module.
2. Define a custom PyTorch model, `BertForMaskedLM_With_HGC`, that wraps the standard BERT model.
3. Add the HKM and a projection head to the custom model.
4. Instantiate both the baseline and HGC-augmented models to verify the architecture is built correctly.

## 1. Setup & Dependencies

We import `torch`, `transformers`, and our custom HGC module. We also define key configuration parameters for our model.

In [1]:
import torch
import torch.nn as nn
from transformers import BertModel, BertForMaskedLM, BertConfig
import sys
import os

# Add the project root to the Python path to allow importing from hgc_core
module_path = os.path.abspath(os.path.join('..'))
if module_path not in sys.path:
    sys.path.append(module_path)

from hgc_core.hgc_module import HolographicKnowledgeManifold

# --- Configuration ---
MODEL_NAME = "bert-base-uncased"
HKM_DIMENSIONALITY = 4096 # The 'd' for our holographic vectors.

print("Dependencies loaded successfully.")

Dependencies loaded successfully.
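
The `hgc_core` package itself is not shown in this notebook. For readers who want to run the cells below without it, here is a hypothetical stand-in, not the project's actual implementation. It assumes only what the later cells use: a constructor that takes `d`, and an `add_to_manifold` method that accepts a `(batch_size, d)` tensor. The class name `StandInKnowledgeManifold`, the storage-by-summation behavior, and the `memory` and `count` attributes are illustrative assumptions.

```python
import torch
import torch.nn as nn


class StandInKnowledgeManifold(nn.Module):
    """Hypothetical placeholder for HolographicKnowledgeManifold (assumed API)."""

    def __init__(self, d: int):
        super().__init__()
        self.d = d
        # Buffers follow the model across devices and are saved with it,
        # but they are not trainable parameters.
        self.register_buffer("memory", torch.zeros(d))
        self.count = 0  # number of vectors accumulated so far

    @torch.no_grad()
    def add_to_manifold(self, vectors: torch.Tensor) -> None:
        if vectors.dim() != 2 or vectors.shape[1] != self.d:
            raise ValueError(
                f"expected shape (batch, {self.d}), got {tuple(vectors.shape)}"
            )
        # Assumption: vectors are stored by summing them into one superposed trace.
        self.memory += vectors.sum(dim=0)
        self.count += vectors.shape[0]


hkm = StandInKnowledgeManifold(d=8)
hkm.add_to_manifold(torch.ones(3, 8))
print(hkm.memory)  # every entry is 3.0
print(hkm.count)   # 3
```

The real module may store and retrieve vectors quite differently; this placeholder only makes the data flow in the following sections executable.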


## 2. Defining the HGC-Augmented BERT Architecture

Here, we define the custom PyTorch model class. It holds the pre-trained BERT model, an HKM instance, and a linear projection head that maps BERT's hidden states (768-dimensional for `bert-base-uncased`) into the HKM's `HKM_DIMENSIONALITY`-dimensional space (4096 here).

In [2]:
class BertForMaskedLM_With_HGC(nn.Module):
    """
    A custom BERT model for Masked Language Modeling that integrates a
    Holographic Knowledge Manifold (HKM).
    """
    def __init__(self, model_name: str, hkm_dimensionality: int):
        super().__init__()
        self.model_name = model_name
        self.hkm_dimensionality = hkm_dimensionality

        # 1. Load the pre-trained BERT model for Masked LM
        self.bert_mlm = BertForMaskedLM.from_pretrained(model_name)
        self.config = self.bert_mlm.config

        # 2. Instantiate the Holographic Knowledge Manifold
        self.hkm = HolographicKnowledgeManifold(d=hkm_dimensionality)

        # 3. Create a projection head to map BERT's output to the HKM's dimension
        bert_hidden_size = self.config.hidden_size # Should be 768 for bert-base
        self.projection_head = nn.Linear(bert_hidden_size, hkm_dimensionality)

    def forward(self, input_ids, attention_mask=None, labels=None):
        """
        The forward pass for our custom model.
        """
        # Pass inputs through the standard BERT model.
        # BertForMaskedLM returns a dictionary-like MaskedLMOutput object.
        outputs = self.bert_mlm(
            input_ids=input_ids,
            attention_mask=attention_mask,
            labels=labels,
            output_hidden_states=True # We need the hidden states for the HKM
        )
        
        # The primary loss from the masked language modeling task.
        # This is None when no labels are passed (e.g. at inference time).
        mlm_loss = outputs.loss

        # Get the last hidden state from the BERT outputs
        # Shape: (batch_size, sequence_length, hidden_size)
        last_hidden_state = outputs.hidden_states[-1]

        # For simplicity, we'll use the representation of the [CLS] token
        # as the representation for the entire sequence.
        # Shape: (batch_size, hidden_size)
        cls_representation = last_hidden_state[:, 0, :]

        # Project the CLS representation into the HKM's high-dimensional space
        # Shape: (batch_size, hkm_dimensionality)
        projected_vectors = self.projection_head(cls_representation)
        
        # During training, we would add these vectors to the HKM.
        # For now, we just demonstrate the data flow.
        # self.hkm.add_to_manifold(projected_vectors)
        
        # The model's output will be a dictionary containing the MLM loss
        # and the projected vectors, which can be used for other tasks.
        return {
            "loss": mlm_loss,
            "projected_vectors": projected_vectors
        }

print("Custom HGC-BERT model class defined.")

Custom HGC-BERT model class defined.
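
The shape changes inside `forward` can be checked without downloading any weights. The sketch below substitutes a random tensor for BERT's last hidden state (an assumption made purely for illustration; the batch size and sequence length are arbitrary) and applies the same `[CLS]` slicing and linear projection as the class above.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

batch_size, seq_len = 2, 16        # arbitrary illustrative values
hidden_size, hkm_dim = 768, 4096   # bert-base hidden size and HKM_DIMENSIONALITY

# Random stand-in for outputs.hidden_states[-1]
last_hidden_state = torch.randn(batch_size, seq_len, hidden_size)

# Position 0 holds the [CLS] token, so this keeps one vector per sequence.
cls_representation = last_hidden_state[:, 0, :]

# Same kind of layer as self.projection_head in the class above
projection_head = nn.Linear(hidden_size, hkm_dim)
projected_vectors = projection_head(cls_representation)

print(cls_representation.shape)  # torch.Size([2, 768])
print(projected_vectors.shape)   # torch.Size([2, 4096])
```

Whatever the sequence length, each input sequence contributes exactly one vector of size `hkm_dim`.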


## 3. Architecture Verification

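To confirm that our architecture is correctly defined, we now instantiate both the baseline model and our new custom model. For the baseline we print only its config, since the full structure is long. For the custom model we print the full structure, then assert that the projection head has the expected input and output dimensions.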

In [3]:
# -- 1. Instantiate the Baseline Model --
print("--- Baseline Model (BertForMaskedLM) ---")
baseline_model = BertForMaskedLM.from_pretrained(MODEL_NAME)
print(baseline_model.config)
print("\nBaseline model loaded successfully.")
print("Note: The full model is very large. We printed its config instead.")

# -- 2. Instantiate the HGC-Augmented Model --
print("\n--- HGC-Augmented Model (BertForMaskedLM_With_HGC) ---")
hgc_model = BertForMaskedLM_With_HGC(
    model_name=MODEL_NAME, 
    hkm_dimensionality=HKM_DIMENSIONALITY
)
print(hgc_model)
print("\nHGC-augmented model loaded successfully.")

# Verify the dimensionality of the projection head
proj_head = hgc_model.projection_head
print("\nVerification of Projection Head Dimensions:")
print(f"  Input Features: {proj_head.in_features}")
print(f"  Output Features: {proj_head.out_features}")
assert proj_head.in_features == baseline_model.config.hidden_size
assert proj_head.out_features == HKM_DIMENSIONALITY
print("  Dimensions match correctly!")

--- Baseline Model (BertForMaskedLM) ---


Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertForMaskedLM: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight', 'cls.seq_relationship.bias', 'cls.seq_relationship.weight']
- This IS expected if you are initializing BertForMaskedLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertForMaskedLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


BertConfig {
  "architectures": [
    "BertForMaskedLM"
  ],
  "attention_probs_dropout_prob": 0.1,
  "classifier_dropout": null,
  "dtype": "float32",
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "position_embedding_type": "absolute",
  "transformers_version": "4.56.1",
  "type_vocab_size": 2,
  "use_cache": true,
  "vocab_size": 30522
}


Baseline model loaded successfully.
Note: The full model is very large. We printed its config instead.

--- HGC-Augmented Model (BertForMaskedLM_With_HGC) ---


BertForMaskedLM_With_HGC(
  (bert_mlm): BertForMaskedLM(
    (bert): BertModel(
      (embeddings): BertEmbeddings(
        (word_embeddings): Embedding(30522, 768, padding_idx=0)
        (position_embeddings): Embedding(512, 768)
        (token_type_embeddings): Embedding(2, 768)
        (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
        (dropout): Dropout(p=0.1, inplace=False)
      )
      (encoder): BertEncoder(
        (layer): ModuleList(
          (0-11): 12 x BertLayer(
            (attention): BertAttention(
              (self): BertSdpaSelfAttention(
                (query): Linear(in_features=768, out_features=768, bias=True)
                (key): Linear(in_features=768, out_features=768, bias=True)
                (value): Linear(in_features=768, out_features=768, bias=True)
                (dropout): Dropout(p=0.1, inplace=False)
              )
              (output): BertSelfOutput(
                (dense): Linear(in_features=768, out_features=7
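
The size of the projection head follows directly from the dimensions asserted in the verification cell: a linear layer from 768 to 4096 features has 768 × 4096 weights plus 4096 biases. The sketch below rebuilds an equivalent layer on its own, so no checkpoint is needed, and counts its parameters.

```python
import torch.nn as nn

hidden_size, hkm_dim = 768, 4096
projection_head = nn.Linear(hidden_size, hkm_dim)

n_params = sum(p.numel() for p in projection_head.parameters())
expected = hidden_size * hkm_dim + hkm_dim  # weights + biases

print(n_params)  # 3149824
assert n_params == expected
```

The projection head therefore adds about 3.1 million trainable parameters on top of the pre-trained BERT weights.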

## 4. Conclusion

We have successfully:
1. Created a reusable, batch-friendly `HolographicKnowledgeManifold` module.
2. Defined a custom `BertForMaskedLM_With_HGC` class that integrates the standard BERT model with our HKM via a projection head.
3. Verified that both the baseline and custom models can be instantiated correctly from the `bert-base-uncased` checkpoint.

This completes the architecture build step. We are now ready to move on to the next step: **Deploy (Training & Evaluation)**.