# BERT Language Model (LM) with TensorFlow

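This project fine-tunes a BERT model with TensorFlow. The cell below loads the fine-tuned checkpoint and uses it for extractive question answering: given a context passage and a question, the model predicts the span of the context that answers the question.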
Bidirectional Encoder Representations from Transformers (BERT) is a language model introduced in October 2018 by researchers at Google. It uses an encoder-only transformer architecture and is trained with self-supervised learning to represent text as a sequence of vectors, one per input token.
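To make "self-supervised" concrete, here is a minimal sketch of how a masked-language-modeling training example can be built from unlabeled text, which is the main pretraining objective described for BERT. It is not part of this project's code: the function name `mask_tokens`, the toy vocabulary, and the whitespace-level tokens are assumptions made for illustration, and real pretraining works on WordPiece token IDs. About 15% of tokens are selected; of those, roughly 80% are replaced with `[MASK]`, 10% with a random token, and 10% are left unchanged. The model is then trained to predict the original token at each selected position.

```python
import random

MASK = "[MASK]"
SPECIAL = ("[CLS]", "[SEP]")  # special tokens are never selected for masking


def mask_tokens(tokens, vocab, mask_prob=0.15, seed=0):
    """Build one masked-LM example (hypothetical helper, for illustration).

    Returns (corrupted, labels). labels[i] is the original token at each
    selected position and None everywhere else.
    """
    rng = random.Random(seed)
    corrupted = list(tokens)
    labels = [None] * len(tokens)
    for i, tok in enumerate(tokens):
        if tok in SPECIAL or rng.random() >= mask_prob:
            continue
        labels[i] = tok  # the model must recover this token
        roll = rng.random()
        if roll < 0.8:
            corrupted[i] = MASK               # 80%: replace with [MASK]
        elif roll < 0.9:
            corrupted[i] = rng.choice(vocab)  # 10%: replace with a random token
        # remaining 10%: keep the original token unchanged
    return corrupted, labels


tokens = ["[CLS]", "natural", "language", "processing", "enables", "systems", "[SEP]"]
toy_vocab = ["data", "model", "text", "learn"]  # toy vocabulary, assumed
corrupted, labels = mask_tokens(tokens, toy_vocab, mask_prob=0.5, seed=3)
print(corrupted)
print(labels)
```

No human labels are needed: the training targets come from the text itself, which is what lets the model be pretrained on large unlabeled corpora before being fine-tuned for a task such as question answering.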


In [11]:
import tensorflow as tf
from transformers import TFBertForQuestionAnswering, BertTokenizerFast
import json
import os

# Set workspace_dir to the parent of the current working directory,
# which is assumed to be the root of the project
workspace_dir = os.path.abspath(os.path.join(os.getcwd(), '..'))

# Define paths to the model and dataset
model_path = os.path.join(workspace_dir, 'trained_model')
dataset_path = os.path.join(workspace_dir, 'data', 'dataset.json')

class QAInference:
    def __init__(self, model_path):
        """
        Initializes the model and tokenizer for inference.
        """
        # Load the fine-tuned model and tokenizer
        self.model = TFBertForQuestionAnswering.from_pretrained(model_path)
        self.tokenizer = BertTokenizerFast.from_pretrained(model_path)

    def answer_question(self, context, question):
        """
        Given a context and a question, return the answer predicted by the model.
        """
        # Tokenize the input question and context
        inputs = self.tokenizer(question, context, return_tensors="tf", truncation=True, padding=True)
        
        # Get model outputs
        outputs = self.model(inputs)
        
        # Find the start and end logits of the predicted answer
        start_scores = outputs.start_logits
        end_scores = outputs.end_logits
        
        # Get the start and end positions of the answer
        start_idx = tf.argmax(start_scores, axis=1).numpy()[0]
        end_idx = tf.argmax(end_scores, axis=1).numpy()[0]
        
        # Extract the answer tokens and decode to string
        answer_tokens = inputs["input_ids"][0][start_idx:end_idx + 1]
        answer = self.tokenizer.decode(answer_tokens, skip_special_tokens=True)
        
        return answer

def main():
    # Load dataset from JSON (just for the context)
    with open(dataset_path, 'r', encoding='utf-8') as file:
        dataset = json.load(file)

    # Initialize the QA model
    qa_model = QAInference(model_path)

    # Allow user to input a question
    question = input("Enter your question: ")

    # Use the first context from the dataset (or modify as needed)
    context = dataset[0]["context"]

    # Get the predicted answer for the given question and context
    predicted_answer = qa_model.answer_question(context, question)

    print("\nQuestion:", question)
    print("Predicted Answer:", predicted_answer)

if __name__ == "__main__":
    main()


All model checkpoint layers were used when initializing TFBertForQuestionAnswering.

All the layers of TFBertForQuestionAnswering were initialized from the model checkpoint at c:\GitHub\Machine-Learning\BERT\trained_model.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFBertForQuestionAnswering for predictions without further training.



Question: "What does natural language processing enable?"
Predicted Answer: building systems that learn from data
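
One caveat about `answer_question`: it takes the argmax of the start logits and the argmax of the end logits independently, so the predicted end can fall before the predicted start, in which case the slice is empty and the answer is an empty string. A common alternative is to score each candidate span as the sum of its start and end logits and keep the best one that satisfies `start <= end`. The sketch below shows that idea on plain Python lists. The helper name `best_span` and the `max_answer_len` cap are assumptions for illustration and are not part of the code above.

```python
def best_span(start_logits, end_logits, max_answer_len=30):
    """Return (start, end, score) maximizing start_logits[s] + end_logits[e]
    subject to s <= e < s + max_answer_len, or None for empty input.

    Hypothetical helper: a brute-force search over valid spans.
    """
    best = None
    for s, s_score in enumerate(start_logits):
        last = min(len(end_logits), s + max_answer_len)
        for e in range(s, last):
            score = s_score + end_logits[e]
            if best is None or score > best[2]:
                best = (s, e, score)
    return best


# Toy logits where the independent argmax is inconsistent:
# argmax(start) is index 2, argmax(end) is index 1, so end < start.
start_logits = [0.1, 0.2, 5.0, 0.3]
end_logits = [0.0, 4.0, 0.5, 3.0]

start, end, score = best_span(start_logits, end_logits)
print(start, end)  # prints: 2 3
```

To try this with the model above, the logits for the first example could be converted with something like `outputs.start_logits[0].numpy().tolist()` before calling the helper. A fuller version would also skip positions that belong to the question or to special tokens, which this sketch does not do.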
