<a href="https://colab.research.google.com/github/Binaz/AddressParser/blob/main/Address_Parser_Main.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>


# Address Parser using BERT-based NER

This notebook fine-tunes a **Transformer model** to extract structured address information such as street number, street name, city, state, and postal code from unstructured text.  

We use **Hugging Face Transformers**, **datasets**, and **seqeval** to build and evaluate the model.  
All training is done in **Google Colab** using a dataset stored in Google Drive.
The training data is a synthetic dataset that covers a variety of address formats and abbreviations.

In [None]:
from google.colab import drive
drive.mount('/content/drive')

### Setup: Mount Google Drive and install required libraries
We mount Google Drive to access the dataset stored in the user's Drive and install the required Python packages such as `transformers`, `datasets`, and `evaluate`.


In [None]:
!pip install evaluate transformers==4.49.0
!pip install datasets==2.14.6
# DatasetDict is a class inside the `datasets` package, so it needs no separate install

### Load Dataset from Google Drive
We specify the CSV path using `os.path` and load it with the `datasets` library.  
The synthetic dataset was created using different address structures.
This dataset contains tokenized addresses with corresponding NER tags.


In [None]:
from datasets import load_dataset
import os

# Build the CSV path on Google Drive with os.path
csv_path = os.path.join("/content/drive", "My Drive", "Colab Notebooks", "Version 3", "Training_Dataset_ForModelVersion_3_2.csv")

# Load the CSV; load_dataset places all rows in a single "train" split
address_ner_dataset = load_dataset('csv', data_files=csv_path)

### Data Cleaning: Convert stringified lists into Python lists
The dataset columns `tokens` and `ner_tags` are stored as strings.  
We convert them back into Python lists for proper token alignment and processing.


In [None]:
import ast

# Convert stringified lists to actual lists if needed.
# ast.literal_eval only parses Python literals, so it is safer than eval on CSV content.
def convert_str_lists(example):
    if isinstance(example["tokens"], str):
        example["tokens"] = ast.literal_eval(example["tokens"])
    if isinstance(example["ner_tags"], str):
        example["ner_tags"] = ast.literal_eval(example["ner_tags"])
    return example

address_ner_dataset = address_ner_dataset.map(convert_str_lists)
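
As a minimal sketch of what this conversion does to one row, the snippet below parses stringified list columns with `ast.literal_eval`, which accepts only Python literals and therefore cannot run arbitrary code found in a CSV cell. The sample row and the helper name `parse_list_columns` are hypothetical and are not taken from the dataset.

```python
import ast

# Hypothetical row, shaped the way a CSV export stores list columns as text.
sample_row = {
    "tokens": "['7890', 'Old', 'Mill', 'Road']",
    "ner_tags": "['B-STREET_NUM', 'B-STREET', 'I-STREET', 'I-STREET']",
}

def parse_list_columns(example, columns=("tokens", "ner_tags")):
    """Return a copy of `example` with stringified list columns parsed into lists."""
    parsed = dict(example)
    for column in columns:
        if isinstance(parsed[column], str):
            parsed[column] = ast.literal_eval(parsed[column])
    return parsed

parsed_row = parse_list_columns(sample_row)
print(parsed_row["tokens"])
# Every token should have exactly one tag.
print(len(parsed_row["tokens"]) == len(parsed_row["ner_tags"]))
```

A length check like the last line is a cheap way to catch rows where tokens and tags have drifted out of step before tokenization.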

### Data Split: Train, Validation, Test
We split the dataset into training (80%), validation (10%), and test (10%) sets using Hugging Face’s built-in `train_test_split()` method.


In [None]:
# This cell randomly splits the records into training, validation, and test sets.
# Note: the next cell redefines `train_dataset`, `validation_dataset`, and `test_dataset`,
# so the splits created there are the ones used for the rest of the notebook.

# First split into train and temp (temp is used for validation + test)
train_dataset, temp_dataset = address_ner_dataset["train"].train_test_split(test_size=0.2).values()

# Then split temp into validation and test
validation_dataset, test_dataset = temp_dataset.train_test_split(test_size=0.5).values()

# Now you have `train_dataset`, `validation_dataset`, and `test_dataset`

# If needed, you can print to confirm the splits
print(f"Training set size: {len(train_dataset)}")
print(f"Validation set size: {len(validation_dataset)}")
print(f"Test set size: {len(test_dataset)}")

### Create a DatasetDict for Training
We organize all splits into a single `DatasetDict` object for convenience when passing data to the model.


In [None]:
# The first 80% goes to training (by order),
# The last 20% is split evenly into validation and test (10% each), randomly.

from datasets import Dataset, DatasetDict

# Access the actual dataset from the DatasetDict
full_dataset = address_ner_dataset["train"]

# Define split percentage
train_split_percent = 0.8
total_samples = len(full_dataset)
train_end = int(total_samples * train_split_percent)

# Deterministic slicing
train_dataset = full_dataset.select(range(train_end))
remaining_dataset = full_dataset.select(range(train_end, total_samples))

# Now split the remaining into validation and test
val_test_split = remaining_dataset.train_test_split(test_size=0.5, seed=42)
validation_dataset = val_test_split['train']
test_dataset = val_test_split['test']

# Optional: Group into a DatasetDict
dataset_dict = DatasetDict({
    'train': train_dataset,
    'validation': validation_dataset,
    'test': test_dataset
})
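
The slicing logic in this cell can be sketched on plain indices, without the `datasets` library. This is an illustration only: the helper `split_indices` is hypothetical, and the shuffle uses Python's `random` module, so the exact row assignment will differ from what `train_test_split(seed=42)` produces.

```python
import random

def split_indices(total, train_fraction=0.8, seed=42):
    """Head slice for training, then a seeded 50/50 shuffle of the remainder."""
    train_end = int(total * train_fraction)
    train = list(range(train_end))            # first 80%, in file order
    remaining = list(range(train_end, total))  # last 20%
    random.Random(seed).shuffle(remaining)     # seeded, so the split is repeatable
    half = len(remaining) // 2
    return train, remaining[:half], remaining[half:]

train_idx, val_idx, test_idx = split_indices(1000)
print(len(train_idx), len(val_idx), len(test_idx))  # 800 100 100
```

One consequence of slicing by order is that training always sees the first 80% of the file. If the CSV happens to be sorted (for example by address format), the validation and test sets could contain patterns the training set never sees, so shuffling the file first may be worth considering.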


In [None]:
print(train_dataset.features)
print(train_dataset[0:50])
print(validation_dataset[0:50])
print(test_dataset[0:50])

### Define Entity Labels and Label Mapping
We define custom entity labels representing components of an address (e.g., `B-STREET`, `I-CITY`, etc.) and create a dictionary mapping label names to integer IDs.

In [None]:
# Label names in ID order: the position of each name in this list is its integer label ID.
label_names = ['O','B-NAME','I-NAME','B-STREET_NUM','I-STREET_NUM','B-STREET','I-STREET','B-UNIT','I-UNIT','B-CITY','I-CITY','B-STATE','I-STATE','B-POSTAL','I-POSTAL']

#label_encoding_dict = {'O':0,'B-NAME':1,'I-NAME':2,'B-STREET_NUMBER':3,'I-STREET_NUMBER':4,'B-STREET_NAME':5,'I-STREET_NAME':6,'B-UNIT_NUMBER':7,'I-UNIT_NUMBER':8,'B-UNIT_DESIGNATOR':9,'I-UNIT_DESIGNATOR':10,'B-CITY':11,'I-CITY':12,'B-STATE_ABBREVIATION':13,'I-STATE_ABBREVIATION':14,'B-PLUS_4':15,'I-PLUS_4':16,'B-STATE_NAME':17,'I-STATE_NAME':18,'B-POSTAL_CODE':19,'I-POSTAL_CODE':20,'B-TRACKING_NUMBER':21,'I-TRACKING_NUMBER':22}
label_encoding_dict = {
    "O": 0,
    "B-NAME": 1,
    "I-NAME": 2,
    "B-STREET_NUM": 3,
    "I-STREET_NUM": 4,
    "B-STREET": 5,
    "I-STREET": 6,
    "B-UNIT": 7,
    "I-UNIT": 8,
    "B-CITY": 9,
    "I-CITY": 10,
    "B-STATE": 11,
    "I-STATE": 12,
    "B-POSTAL": 13,
    "I-POSTAL": 14
}
label_names
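
Because the list and the dictionary encode the same information, one way to keep them in sync is to derive the mapping from the list. The sketch below does that and also checks that every entity type has both a `B-` and an `I-` tag. The variable names (`names`, `derived_label2id`, `missing`) are illustrative and not part of the notebook.

```python
# Same labels as the notebook, copied under a different name for this sketch.
names = ['O', 'B-NAME', 'I-NAME', 'B-STREET_NUM', 'I-STREET_NUM', 'B-STREET', 'I-STREET',
         'B-UNIT', 'I-UNIT', 'B-CITY', 'I-CITY', 'B-STATE', 'I-STATE', 'B-POSTAL', 'I-POSTAL']

# Derive both directions of the mapping from the list order.
derived_label2id = {name: idx for idx, name in enumerate(names)}
derived_id2label = {idx: name for name, idx in derived_label2id.items()}

# Sanity check: each entity type should appear with both prefixes.
entity_types = sorted({name[2:] for name in names if name != 'O'})
missing = [
    f"{prefix}-{entity}"
    for entity in entity_types
    for prefix in ("B", "I")
    if f"{prefix}-{entity}" not in derived_label2id
]

print(derived_label2id["B-POSTAL"])  # 13
print(missing)                       # []
```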

In [None]:
train_dataset[0]

### Tokenization & Label Alignment
This function tokenizes input tokens and aligns their corresponding NER tags.  
It ensures labels correspond correctly to subword tokens generated by BERT.


In [None]:
def tokenize_and_align_labels(batch):
    label_all_tokens = True
    tokenized_inputs = tokenizer(batch['tokens'], truncation=True, is_split_into_words=True, padding='max_length')

    # Define a mapping from beginning (B-) labels to inside (I-) labels
    begin2inside = {
        "B-NAME": "I-NAME",
        "B-STREET_NUM": "I-STREET_NUM",
        "B-STREET": "I-STREET",
        "B-UNIT": "I-UNIT",
        "B-CITY" : "I-CITY",
        "B-STATE" : "I-STATE",
        "B-POSTAL" : "I-POSTAL",
    }

    labels = []
    for i, label in enumerate(batch["ner_tags"]):
        word_ids = tokenized_inputs.word_ids(batch_index=i)
        previous_word_idx = None
        label_ids = []
        for word_idx in word_ids:
            if word_idx is None:
                # Special tokens and padding get -100 so the loss ignores them
                label_ids.append(-100)
            elif label[word_idx] == 'O':
                label_ids.append(0)
            elif word_idx != previous_word_idx:
                label_ids.append(label_encoding_dict[label[word_idx]])
            else:
                labelId = (label_encoding_dict[label[word_idx]])
                # Change B- to I- if the previous word is the same
                if label[word_idx] in begin2inside:
                    labelId = (label_encoding_dict[begin2inside[label[word_idx]]])  # Map B- to I-
                label_ids.append(labelId if label_all_tokens else -100)

            previous_word_idx = word_idx
        labels.append(label_ids)

    tokenized_inputs["labels"] = labels
    return tokenized_inputs
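
To make the alignment rule concrete, here is a small tokenizer-free sketch that applies the same logic to a hand-written `word_ids` list. The helper `align_labels`, the shortened label map, and the assumed subword split are all hypothetical; a real tokenizer may split the words differently.

```python
def align_labels(word_ids, word_labels, label2id, label_all_tokens=True):
    """Spread word-level tags over subword positions.

    - Positions with word id None (special tokens, padding) get -100.
    - The first subword of a word keeps the word's tag.
    - Later subwords get the I- version of the tag, or -100 if
      label_all_tokens is False.
    """
    aligned = []
    previous_word_idx = None
    for word_idx in word_ids:
        if word_idx is None:
            aligned.append(-100)
        elif word_idx != previous_word_idx:
            aligned.append(label2id[word_labels[word_idx]])
        elif label_all_tokens:
            tag = word_labels[word_idx]
            if tag.startswith("B-"):
                tag = "I-" + tag[2:]
            aligned.append(label2id[tag])
        else:
            aligned.append(-100)
        previous_word_idx = word_idx
    return aligned

# Shortened label map using the notebook's IDs for these five tags.
demo_label2id = {"O": 0, "B-CITY": 9, "I-CITY": 10, "B-STATE": 11, "I-STATE": 12}

# Assumed split: [CLS], two subwords for word 0, one subword for word 1, [SEP].
demo_word_ids = [None, 0, 0, 1, None]
demo_word_labels = ["B-CITY", "B-STATE"]

aligned = align_labels(demo_word_ids, demo_word_labels, demo_label2id)
print(aligned)  # [-100, 9, 10, 11, -100]
```

The second subword of the city is tagged `I-CITY` rather than `B-CITY`, which keeps a single word from looking like two separate entities.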

### Initialize Tokenizer
We use the tokenizer that ships with the pretrained `dslim/bert-base-NER-uncased` checkpoint to tokenize the address tokens.  
This ensures consistency with the BERT model vocabulary.


In [None]:
from transformers import AutoTokenizer

# Define the checkpoint you want to use for the tokenizer.
checkpoint = 'dslim/bert-base-NER-uncased'

# `num_labels` is a model setting and `is_split_into_words` is passed per call, so neither is needed here
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

### Test the Tokenization and Label Alignment Function
We apply the function on a single example to verify label alignment correctness.  
This step helps ensure that labels and tokens match as expected.


In [None]:
## The following code tests whether tokenize_and_align_labels works correctly

# Example input (word-level tokens with labels)
#test_batch = {
#    "tokens": [["19", "Waldo", "AvenuewESTeAST","East", "BelfastCORNILANA", ",", "ME", "04915" ,"Consolidated", "Communications"]],
#    "ner_tags": [[
#        "S-STREET_NUM", "B-STREET", "I-STREET", "E-STREET", "S-CITY", "O", "S-STATE", "S-POSTAL","O","O"
#    ]]
#}

# Take a single example from your actual dataset
test_example = train_dataset[3]


# Wrap it in a batch format expected by the function
test_batch = {
    "tokens": [test_example["tokens"]],
    "ner_tags": [test_example["ner_tags"]]
}

# Reverse mapping (ID -> label name), built from the label_encoding_dict defined in an earlier cell
id2label = {v: k for k, v in label_encoding_dict.items()}

# Use your tokenize_and_align_labels function
# Apply the function
encoded = tokenize_and_align_labels(test_batch)

# Visualize token-label mapping
tokens = tokenizer.convert_ids_to_tokens(encoded["input_ids"][0])
labels = encoded["labels"][0]

print("Token\t\tLabel")
print("-" * 30)

for token, label_id in zip(tokens, labels):
    if label_id == -100:
        print(f"{token:10s}\tIGNORED")
    else:
        print(f"{token:10s}\t{id2label[label_id]}")


In [None]:
# Tokenize the first training example from the dataset
token = tokenizer(train_dataset[0]['tokens'],is_split_into_words = True)  ##, return_offsets_mapping=True

# Print the tokenizer object, the tokenized tokens, and the word IDs
print(token, '\n--------------------------------------------------------------------------------------\n',
      token.tokens(),'\n--------------------------------------------------------------------------------------\n',
      token.word_ids(),'\n--------------------------------------------------------------------------------------\n')

### Apply Tokenization to All Dataset Splits
We apply the `tokenize_and_align_labels()` function to the entire train, validation, and test datasets.


In [None]:
tokenized_train_dataset = train_dataset.map(tokenize_and_align_labels, batched=True, remove_columns=train_dataset.column_names)
tokenized_test_dataset = test_dataset.map(tokenize_and_align_labels, batched=True, remove_columns=test_dataset.column_names)
tokenized_validation_dataset = validation_dataset.map(tokenize_and_align_labels, batched=True, remove_columns=validation_dataset.column_names)

### Create Data Collator for Token Classification
The `DataCollatorForTokenClassification` dynamically pads input sequences and labels during training.


In [None]:
from transformers import DataCollatorForTokenClassification

# Create a DataCollatorForTokenClassification object
data_collator = DataCollatorForTokenClassification(tokenizer)
print(tokenized_train_dataset)
# Testing data using the data collator
batch = data_collator([tokenized_train_dataset[i] for i in range(1)])

# Display the resulting batch
batch
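
Dynamic padding means each batch is padded only to the length of its longest example, with labels padded by `-100` so the loss ignores those positions. The sketch below shows the idea in plain Python. It is not the collator's implementation: `pad_batch` is a hypothetical helper and the token IDs are made up, except that `101`, `102`, and `0` are the `[CLS]`, `[SEP]`, and `[PAD]` IDs in the standard BERT uncased vocabulary.

```python
def pad_batch(features, pad_token_id=0, label_pad_id=-100):
    """Pad every example in `features` to the longest example in the batch."""
    max_len = max(len(f["input_ids"]) for f in features)
    batch = {"input_ids": [], "attention_mask": [], "labels": []}
    for f in features:
        n_pad = max_len - len(f["input_ids"])
        batch["input_ids"].append(f["input_ids"] + [pad_token_id] * n_pad)
        batch["attention_mask"].append([1] * len(f["input_ids"]) + [0] * n_pad)
        batch["labels"].append(f["labels"] + [label_pad_id] * n_pad)
    return batch

# Two illustrative examples of different lengths.
demo_features = [
    {"input_ids": [101, 2001, 2002, 102], "labels": [-100, 3, 5, -100]},
    {"input_ids": [101, 2003, 102], "labels": [-100, 9, -100]},
]

padded = pad_batch(demo_features)
print(padded["input_ids"][1])       # [101, 2003, 102, 0]
print(padded["attention_mask"][1])  # [1, 1, 1, 0]
print(padded["labels"][1])          # [-100, 9, -100, -100]
```

Note that `tokenize_and_align_labels` already calls the tokenizer with `padding='max_length'`, so every example reaches the collator at full length and the collator has nothing left to pad. Dropping that argument would let per-batch padding take effect and may shorten training time, at the cost of variable batch shapes.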

### Load Evaluation Metric
We install and import the `seqeval` metric via Hugging Face’s `evaluate` library to compute precision, recall, F1, and accuracy.

In [None]:
# Install seqeval, which the `evaluate` seqeval metric relies on
# (`evaluate` itself was installed in the setup cell)
!pip install seqeval

In [None]:
# Import Hugging Face's `evaluate` library
import evaluate

# Load the seqeval metric which can evaluate NER and other sequence tasks
metric = evaluate.load("seqeval")

### Compute Evaluation Metrics
This function calculates precision, recall, F1-score, and accuracy from the model’s predictions and true labels.


In [None]:
import numpy as np
# Function to compute evaluation metrics from model logits and true labels
def compute_metrics(logits_and_labels):

  # Unpack the logits and labels
  logits, labels = logits_and_labels

  # Get predictions from the logits
  predictions = np.argmax(logits, axis=-1)

  # Remove ignored index (special tokens)
  str_labels = [
    [label_names[t] for t in label if t!=-100] for label in labels
  ]

  str_preds = [
    [label_names[p] for (p, t) in zip(prediction, label) if t != -100]
    for prediction, label in zip(predictions, labels)
  ]

  # Compute metrics
  results = metric.compute(predictions=str_preds, references=str_labels)

  # Extract key metrics
  return {
    "precision": results["overall_precision"],
    "recall": results["overall_recall"],
    "f1": results["overall_f1"],
    "accuracy": results["overall_accuracy"]
  }
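
The filtering step is the part of `compute_metrics` that is easiest to get wrong, so here is a dependency-free sketch of it on a tiny hand-made batch. The label list is shortened, the logits are invented, and `decode_for_seqeval` is a hypothetical helper. It only shows how positions labeled `-100` are dropped from both the references and the predictions before tags are handed to the metric.

```python
IGNORE_INDEX = -100
# Shortened, illustrative label list (the notebook's full list has 15 entries).
demo_names = ["O", "B-CITY", "I-CITY", "B-STATE"]

def argmax(scores):
    """Index of the largest score in a list."""
    return max(range(len(scores)), key=scores.__getitem__)

def decode_for_seqeval(logits, labels, names):
    """Turn logits and label IDs into tag strings, skipping ignored positions."""
    true_tags, pred_tags = [], []
    for seq_logits, seq_labels in zip(logits, labels):
        kept = [
            (argmax(token_logits), label_id)
            for token_logits, label_id in zip(seq_logits, seq_labels)
            if label_id != IGNORE_INDEX
        ]
        pred_tags.append([names[p] for p, _ in kept])
        true_tags.append([names[t] for _, t in kept])
    return true_tags, pred_tags

# One sequence of four positions; the first and last are special tokens.
demo_logits = [[
    [0.9, 0.0, 0.0, 0.1],
    [0.1, 2.0, 0.3, 0.0],
    [0.2, 0.1, 0.1, 3.0],
    [0.9, 0.0, 0.0, 0.1],
]]
demo_labels = [[-100, 1, 2, -100]]

true_tags, pred_tags = decode_for_seqeval(demo_logits, demo_labels, demo_names)
print(true_tags)  # [['B-CITY', 'I-CITY']]
print(pred_tags)  # [['B-CITY', 'B-STATE']]
```

Both lists stay the same length per sequence, which is what a sequence-labeling metric expects when it compares predictions against references.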

In [None]:
# Create mapping from label ID to label string name
id2label = {k: v for k, v in enumerate(label_names)}

# Create reverse mapping from label name to label ID
label2id = {v: k for k, v in enumerate(label_names)}

print(id2label , '\n--------------------\n' , label2id)

### Load Pretrained Model for Token Classification
We load the `dslim/bert-base-NER-uncased` model with the correct number of labels and mappings (`id2label`, `label2id`).


In [None]:
# Load pretrained token classification model from Transformers
from transformers import AutoModelForTokenClassification

# Initialize model object with pretrained weights
model = AutoModelForTokenClassification.from_pretrained(
  checkpoint,
  num_labels=len(label_names),
  # Pass in label mappings
  id2label=id2label,
  label2id=label2id,
  ignore_mismatched_sizes=True
)

### Training Setup and Save Fine-tuned Model
We configure `TrainingArguments`, optimizer, scheduler, and callbacks for fine-tuning the model.
After training completes, we save the fine-tuned model to Google Drive for reuse or deployment.


In [None]:
from transformers import (
    Trainer, TrainingArguments, EarlyStoppingCallback,
    AutoModelForTokenClassification, get_scheduler, AdamW
)

# Create mapping from label ID to string and vice versa
id2label = {k: v for k, v in enumerate(label_names)}
label2id = {v: k for k, v in enumerate(label_names)}

# Load model with label mapping
model = AutoModelForTokenClassification.from_pretrained(
    checkpoint,
    num_labels=len(label_names),
    id2label=id2label,
    label2id=label2id,
    ignore_mismatched_sizes=True
)

# Training arguments
training_args = TrainingArguments(
    output_dir="address_parser_fine_tuned_model",
    evaluation_strategy="epoch",
    learning_rate=2e-5,               # Small learning rate for fine-tuning; values from 1e-5 to 2e-5 are the range considered here
    per_device_train_batch_size=32,   # A smaller batch (e.g. 16) suits small training sets (< 5k records); larger sets can use 32
    per_device_eval_batch_size=32,
    num_train_epochs=15,
    weight_decay=0.01,
    gradient_accumulation_steps=1,
    save_strategy="epoch",
    logging_dir="./logs",
    logging_steps=50,
    load_best_model_at_end=True,
    metric_for_best_model="f1",  # Must match the key returned by compute_metrics
    report_to="none",
    fp16=True                    # Enable mixed-precision training
)

# Optimizer
optimizer = AdamW(model.parameters(), lr=2e-5)

# Scheduler setup
num_training_steps = (
    (len(tokenized_train_dataset) // training_args.per_device_train_batch_size)
    * training_args.num_train_epochs
)

# Compute warmup steps as 10% of total steps
num_warmup_steps = int(0.1 * num_training_steps)

lr_scheduler = get_scheduler(
    name="linear",
    optimizer=optimizer,
    num_warmup_steps=num_warmup_steps,
    num_training_steps=num_training_steps
)

# Early stopping callback
early_stopping_callback = EarlyStoppingCallback(early_stopping_patience=3)  # Stop if F1 has not improved for 3 consecutive evaluations (epochs)

# Trainer setup
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_train_dataset.shuffle(seed=42).select(range(168456)),
    # Use the validation split for early stopping and best-model selection so the test split stays held out
    eval_dataset=tokenized_validation_dataset,
    tokenizer=tokenizer,
    compute_metrics=compute_metrics,
    data_collator=data_collator,
    optimizers=(optimizer, lr_scheduler),  # Correct way to use scheduler
    callbacks=[early_stopping_callback],
)

# Start training
trainer.train()

path = "/content/drive/My Drive/Colab Notebooks/Version 3/Trainer_Model_Version_3_2/"
trainer.save_model(path)
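
The scheduler arithmetic in this cell can be checked by hand. The sketch below assumes a single device, no gradient accumulation, and 168456 training examples (the count selected in the `Trainer` call). The helper names are hypothetical; `linear_multiplier` follows the usual linear warmup then linear decay shape.

```python
import math

def count_optimizer_steps(num_examples, batch_size, epochs):
    """Optimizer steps when the last, partial batch of each epoch is kept."""
    return math.ceil(num_examples / batch_size) * epochs

def linear_multiplier(step, warmup_steps, total_steps):
    """Learning-rate multiplier: ramps 0 -> 1 during warmup, then decays linearly to 0."""
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))

total_steps = count_optimizer_steps(168456, 32, 15)
warmup_steps = total_steps // 10  # 10% warmup

print(total_steps)   # 78975
print(warmup_steps)  # 7897
print(linear_multiplier(0, warmup_steps, total_steps))             # 0.0
print(linear_multiplier(warmup_steps, warmup_steps, total_steps))  # 1.0
print(linear_multiplier(total_steps, warmup_steps, total_steps))   # 0.0
```

The cell above uses floor division, which gives 5264 steps per epoch and 78960 in total. Since 168456 is not a multiple of 32, each epoch actually has one extra partial batch, so under these assumptions the schedule would reach zero 15 steps before training ends. The effect is small, but rounding up avoids it.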


### Load Model for Inference
We load the trained model using `pipeline()` to perform token classification on new address strings.  
This allows easy testing and demonstration of real-world performance.


In [None]:
from transformers import pipeline

nerTrainer_new = pipeline(
    'token-classification',
    '/content/drive/My Drive/Colab Notebooks/Version 3/Trainer_Model_Version_3_2/',
    tokenizer=tokenizer,
    aggregation_strategy='simple',   # merges subword pieces into entity groups; replaces the deprecated grouped_entities flag
    device=-1   # device = 0 for gpu & device = -1 for cpu
)

### Inference Results

Below are sample address strings and the structured address components extracted by the fine-tuned **Address Parser** model.

Each address is converted into a set of `(value, label)` pairs for easy downstream use.
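
One possible way to produce such pairs is sketched below. With an aggregation strategy set, a token-classification pipeline returns a list of dictionaries with the keys `entity_group`, `score`, `word`, `start`, and `end`. The helper `to_value_label_pairs` and the `sample_entities` list are hypothetical: the list is a hand-written stand-in shaped like pipeline output for address 4, not captured model output. Slicing the original text by `start` and `end` keeps the original casing, which the uncased tokenizer would otherwise lowercase in `word`.

```python
def to_value_label_pairs(text, entities, min_score=0.0):
    """Convert aggregated pipeline entities into (value, label) pairs."""
    pairs = []
    for entity in entities:
        if entity["score"] < min_score:
            continue  # optionally drop low-confidence spans
        value = text[entity["start"]:entity["end"]]
        pairs.append((value, entity["entity_group"]))
    return pairs

address = "7890 Old Mill Road, Richmond, VA 23225"

# Hand-written stand-in for pipeline output (illustrative only).
sample_entities = [
    {"entity_group": "STREET_NUM", "score": 1.0, "word": "7890", "start": 0, "end": 4},
    {"entity_group": "STREET", "score": 1.0, "word": "old mill road", "start": 5, "end": 18},
    {"entity_group": "CITY", "score": 1.0, "word": "richmond", "start": 20, "end": 28},
    {"entity_group": "STATE", "score": 1.0, "word": "va", "start": 30, "end": 32},
    {"entity_group": "POSTAL", "score": 1.0, "word": "23225", "start": 33, "end": 38},
]

pairs = to_value_label_pairs(address, sample_entities)
for value, label in pairs:
    print(f"{label}: {value}")
```

In the notebook, `sample_entities` would be replaced by the result of calling `nerTrainer_new(address)`.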

---
# Address Parsing Results

## 1. Address: 754-782 BROADWAY, 23, CHULA VISTA CA 919105372

- **STREET_NUM**: `754 - 782`, Score: `1.0`
- **STREET**: `BROADWAY`, Score: `1.0`
- **CITY**: `CHULA VISTA`, Score: `1.0`
- **STATE**: `CA`, Score: `1.0`
- **POSTAL**: `919105372`, Score: `1.0`

---

## 2. Address: Kashyap Property-1284 Leland Road-Manassas-VA-20111-Prince William County

- **NAME**: `kashyap property`, Score: `1.0`
- **STREET_NUM**: `1284`, Score: `1.0`
- **STREET**: `LELAND ROAD`, Score: `1.0`
- **CITY**: `MANASSAS`, Score: `1.0`
- **STATE**: `VA`, Score: `1.0`
- **POSTAL**: `20111`, Score: `1.0`

---

## 3. Address: The address of the suspected auto service station is 108 Harrison Street Southeast, Leesburg, VA 20198.

- **NAME**: `suspected auto service station`, Score: `0.9999`
- **STREET_NUM**: `108`, Score: `1.0`
- **STREET**: `HARRISON STREET SOUTHEAST`, Score: `1.0`
- **CITY**: `LEESBURG`, Score: `1.0`
- **STATE**: `VA`, Score: `1.0`
- **POSTAL**: `20198`, Score: `1.0`

---

## 4. Address: 7890 Old Mill Road, Richmond, VA 23225

- **STREET_NUM**: `7890`, Score: `1.0`
- **STREET**: `OLD MILL ROAD`, Score: `1.0`
- **CITY**: `RICHMOND`, Score: `1.0`
- **STATE**: `VA`, Score: `1.0`
- **POSTAL**: `23225`, Score: `1.0`

---

## 5. Address: Exit 54 on I-95 South near Fayetteville, NC

- **STREET**: `EXIT 54 ON I-95 SOUTH`, Score: `0.8667`
- **CITY**: `FAYETTEVILLE`, Score: `1.0`
- **STATE**: `NC`, Score: `0.9998`

---

## 6. Address: Various locations throughout the Upper Peninsula, corporate address located at 920 10th Avenue North, varies, MI, 95855-0000, US

- **STREET_NUM**: `920`, Score: `1.0`
- **STREET**: `10TH AVENUE NORTH`, Score: `1.0`
- **CITY**: `VARIES`, Score: `1.0`
- **STATE**: `MI`, Score: `1.0`
- **POSTAL**: `95855-0000`, Score: `1.0`

---

## 7. Address: 1228-1290 Middletown & Warwick Road, Middletown, DE 19709 US

- **STREET_NUM**: `1228 - 1290`, Score: `1.0`
- **STREET**: `MIDDLETOWN & WARWICK ROAD`, Score: `1.0`
- **CITY**: `MIDDLETOWN`, Score: `1.0`
- **STATE**: `DE`, Score: `1.0`
- **POSTAL**: `19709`, Score: `1.0`

---

## 8. Address: Express Trucking Co-700 1st St-Harrison-VA-07029-Frederick County

- **NAME**: `TRUCKING CO`, Score: `0.9005`
- **STREET_NUM**: `700`, Score: `1.0`
- **STREET**: `1ST ST`, Score: `1.0`
- **CITY**: `HARRISON`, Score: `1.0`
- **STATE**: `VA`, Score: `1.0`
- **POSTAL**: `07029`, Score: `1.0`
- **CITY**: `FREDERICK COUNTY`, Score: `1.0`

---

## 9. Address: 4200 Summit Bridge Road, Summit Airport, Middletown, DE 19709 US

- **STREET_NUM**: `4200`, Score: `1.0`
- **STREET**: `SUMMIT BRIDGE ROAD`, Score: `1.0`
- **CITY**: `SUMMIT AIRPORT`, Score: `0.9954`
- **CITY**: `MIDDLETOWN`, Score: `1.0`
- **STATE**: `DE`, Score: `1.0`
- **POSTAL**: `19709`, Score: `1.0`

---

## 10. Address: The actual compost site is located across the street from a community member home: 17397 Count Turf Place.  However, the name of the business, Clairvoux LLC and address is: 40730 Farm Market Road, Leesburg 20176. The compost site is located on land they own in the community.  Route 7 West to Farm Market Road, turn right onto Alysheba Drive, left onto Count Turf.  Compost site is on right about 100 feet.

- **NAME**: `ACTUAL COMPOST SITE`, Score: `0.9638`
- **STREET_NUM**: `17397`, Score: `1.0`
- **STREET**: `COUNT TURF PLACE`, Score: `1.0`
- **NAME**: `CLAIRVOUX LLC`, Score: `0.9998`
- **STREET_NUM**: `40730`, Score: `1.0`
- **STREET**: `FARM MARKET ROAD`, Score: `1.0`
- **CITY**: `LEESBURG`, Score: `1.0`
- **POSTAL**: `20176`, Score: `1.0`

---

## 11. Address: Capitol Fiber, Inc - Recycling Center-6610 Electronic Drive-Springfield-VA-22151-Fairfax County

- **NAME**: `CAPITOL FIBER, INC`, Score: `0.9999`
- **STREET_NUM**: `6610`, Score: `1.0`
- **STREET**: `ELECTRONIC DRIVE`, Score: `1.0`
- **CITY**: `SPRINGFIELD`, Score: `1.0`
- **STATE**: `VA`, Score: `1.0`
- **POSTAL**: `22151`, Score: `1.0`
- **CITY**: `FAIRFAX COUNTY`, Score: `1.0`

---

## 12. Address: This incident took place at 980 Bayshore rd. Cape Charles, VA 23318

- **STREET_NUM**: `980`, Score: `0.9999`
- **STREET**: `BAYSHORE RD.`, Score: `0.9999`
- **CITY**: `CAPE CHARLES`, Score: `1.0`
- **STATE**: `VA`, Score: `1.0`
- **POSTAL**: `23318`, Score: `1.0`

---

## 13. Address: Oyster Farm at Kings Creek 500 Marina Village Cir Cape Charles, VA 23310

- **NAME**: `OYSTER FARM AT KINGS`, Score: `0.9892`
- **STREET_NUM**: `500`, Score: `1.0`
- **STREET**: `MARINA VILLAGE CIR`, Score: `1.0`
- **CITY**: `CAPE CHARLES`, Score: `1.0`
- **STATE**: `VA`, Score: `1.0`
- **POSTAL**: `23310`, Score: `1.0`

---

## 14. Address: On North Curry St. between County St and Sewell Ave. Directly across the Street from 33 N.Curry St, Hampton,Va 23663-5858

- **STREET**: `COUNTY ST AND SEWELL AVE`, Score: `0.9999`
- **STREET_NUM**: `33`, Score: `1.0`
- **STREET**: `N. CURRY ST`, Score: `1.0`
- **CITY**: `HAMPTON`, Score: `1.0`
- **STATE**: `VA`, Score: `1.0`
- **POSTAL**: `23663-5858`, Score: `1.0`

---
