This code showcases natural language processing (NLP) techniques for generating personalized educational recommendations from a combination of student performance data and textbook content. It integrates several AI tools and libraries: AWS Bedrock, Hugging Face Transformers, NLTK, BM25 for document ranking, and Sentence Transformers for semantic similarity analysis. The script uses a Large Language Model (LLM) to generate the educational content and demonstrates several key functionalities:

Text Preprocessing: The code preprocesses the input text using tokenization, stemming, and stop-word removal to prepare it for further analysis.
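
The three preprocessing steps can be illustrated without any external dependencies. The sketch below is only a simplified stand-in: the regex tokenizer, the tiny stop-word set, and the `naive_stem` suffix stripper are illustrative assumptions, not the Hugging Face tokenizer, NLTK stop-word list, or Porter stemmer that the notebook actually uses.

```python
import re

# Tiny illustrative stop-word list (assumption); the notebook uses NLTK's English list.
STOP_WORDS = {"a", "an", "and", "are", "for", "in", "is", "of", "the", "to"}

# Crude suffix stripping (assumption); this is NOT the Porter algorithm.
SUFFIXES = ("ing", "ed", "es", "s")


def naive_stem(token):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Lowercase, tokenize on alphanumeric runs, drop stop words, then stem."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [naive_stem(t) for t in tokens if t not in STOP_WORDS]


print(preprocess("The models are learning representations of the languages"))
# ['model', 'learn', 'representation', 'languag']
```

Stems such as `languag` are not dictionary words; that is acceptable here because the same transformation is applied to both the query and the documents before matching.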

Document Ranking with BM25: BM25, a popular ranking function, is used to rank textbook chapters or sections based on their relevance to a query derived from the student's weakest areas. This helps identify the most pertinent content for each student's learning needs.
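
To make the ranking step concrete, here is a minimal pure-Python sketch of BM25 scoring. It assumes the common non-negative IDF variant `log(1 + (N - n + 0.5) / (n + 0.5))`, which is not necessarily the exact formula used inside `rank_bm25`'s `BM25Okapi`, so absolute scores may differ from the notebook's. The toy corpus is hypothetical.

```python
import math
from collections import Counter


def bm25_scores(query_tokens, corpus_tokens, k1=1.5, b=0.75):
    """Score every tokenized document in the corpus against the query."""
    n_docs = len(corpus_tokens)
    avg_len = sum(len(doc) for doc in corpus_tokens) / n_docs

    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter()
    for doc in corpus_tokens:
        doc_freq.update(set(doc))

    scores = []
    for doc in corpus_tokens:
        term_freq = Counter(doc)
        score = 0.0
        for term in query_tokens:
            tf = term_freq.get(term, 0)
            if tf == 0:
                continue
            n_t = doc_freq[term]
            idf = math.log(1 + (n_docs - n_t + 0.5) / (n_t + 0.5))
            # Term-frequency saturation with document-length normalization
            norm = tf + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical toy corpus, already tokenized
corpus = [
    "deep learning uses neural networks".split(),
    "machine translation is an nlp application".split(),
    "transformers rely on attention mechanisms".split(),
]
scores = bm25_scores("attention transformers".split(), corpus)
best = max(range(len(scores)), key=scores.__getitem__)
print(best, scores)
```

Only the third document contains the query terms, so it is the only one with a non-zero score and is selected as the most relevant.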

Model Invocation on AWS Bedrock: The code invokes a large language model hosted on AWS Bedrock to generate detailed explanations or recommendations based on the ranked content. This utilizes AWS's cloud infrastructure for AI and ML workloads, enabling sophisticated natural language understanding.
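
The request and response handling around the Bedrock call can be sketched offline. The `prompt` and `generation` keys come from the code in this notebook; the optional `max_gen_len`, `temperature`, and `top_p` keys are assumed from the Meta Llama request format and should be checked against the current Bedrock documentation. `make_fake_response` is a hypothetical helper that only mimics the shape of the `invoke_model` return value, so no AWS call is made.

```python
import io
import json


def build_llama_body(prompt, max_gen_len=512, temperature=0.5, top_p=0.9):
    """Serialize a request body; the optional keys are assumptions (see lead-in)."""
    return json.dumps({
        "prompt": prompt,
        "max_gen_len": max_gen_len,
        "temperature": temperature,
        "top_p": top_p,
    })


def parse_generation(response):
    """Extract the generated text from an invoke_model-style response dict."""
    payload = json.loads(response["body"].read().decode("utf-8"))
    return payload.get("generation")


def make_fake_response(text):
    """Hypothetical stand-in for the dict returned by client.invoke_model."""
    raw = json.dumps({"generation": text}).encode("utf-8")
    return {"body": io.BytesIO(raw)}


body = build_llama_body("Explain BM25 in one sentence.", max_gen_len=128)
print(json.loads(body)["max_gen_len"])
print(parse_generation(make_fake_response("BM25 ranks documents by term relevance.")))
```

With a real client, the string returned by `build_llama_body` would be passed as the `body` argument of `client.invoke_model`, exactly where the notebook passes `json.dumps({input_key: prompt_text})`.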

Semantic Similarity Computation: Using Sentence Transformers, the code computes the semantic similarity between the generated text and reference texts. This step evaluates how closely the AI-generated content aligns with high-quality educational material.

Performance Evaluation: The code assesses the generated outputs using various metrics, including BLEU score, Exact Match, Semantic Similarity, and relevance metrics (Precision, Recall, F1-Score, nDCG, MAP). These metrics provide a comprehensive understanding of the quality and relevance of the recommendations.
Detailed Description of Outputs

Personalized Learning Recommendations:
The code generates customized learning recommendations for each student by analyzing their performance data to identify weaker areas (e.g., chapters with lower scores). It then formulates a query focused on these areas and searches the textbook content for the most relevant material to improve the student’s understanding. This personalized approach ensures that the guidance is tailored to each student's unique educational needs.
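
The weak-area selection can be shown in isolation. This sketch mirrors the `sorted(...)[:2]` expression used in the code and applies it to Student A's scores from the example data; the function name `weakest_chapters` is introduced here for illustration only.

```python
def weakest_chapters(scores, k=2):
    """Return the k chapter labels with the lowest scores, weakest first."""
    return sorted(scores, key=scores.get)[:k]


student_a = {"Chapter 1": 85, "Chapter 2": 70, "Chapter 3": 60, "Chapter 4": 90, "Chapter 5": 55}
print(weakest_chapters(student_a))
# ['Chapter 5', 'Chapter 3']
```

The selected chapters determine the BM25 query, so they influence which textbook section is retrieved for the student.
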
Evaluation Metrics:

BLEU Score:
Measures the quality of the generated text by evaluating the n-gram overlap with reference texts. A higher BLEU score suggests better lexical similarity with the reference text.
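
The core of BLEU is the clipped (modified) n-gram precision, sketched below. This is only one ingredient: the full score, as computed by NLTK's `sentence_bleu`, also combines several n-gram orders and applies a brevity penalty. The sentences are a standard toy example, not output from this notebook.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, references, n):
    """Clipped n-gram precision of a candidate against multiple references."""
    cand_counts = Counter(ngrams(candidate, n))
    if not cand_counts:
        return 0.0
    # For each n-gram, the maximum count observed in any single reference
    max_ref_counts = Counter()
    for ref in references:
        for gram, count in Counter(ngrams(ref, n)).items():
            max_ref_counts[gram] = max(max_ref_counts[gram], count)
    # A candidate n-gram is credited at most as often as it appears in a reference
    clipped = sum(min(count, max_ref_counts[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


candidate = "the the the the the the the".split()
references = ["the cat is on the mat".split(), "there is a cat on the mat".split()]
p1 = modified_precision(candidate, references, 1)
p2 = modified_precision(candidate, references, 2)
print(p1, p2)
```

Clipping is what prevents a degenerate candidate from scoring well: "the" appears seven times in the candidate but at most twice in any reference, so the unigram precision is 2/7 rather than 1.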

Exact Match:
A binary score indicating whether the generated text exactly matches any of the reference texts. This is useful for tasks where precise matching is critical.

Semantic Similarity:
Computes cosine similarity between embeddings of the generated and reference texts using Sentence Transformers. A higher score indicates strong semantic alignment, even if the wording differs.
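
Cosine similarity itself is a short formula: the dot product of two vectors divided by the product of their lengths. The sketch below uses small hand-written vectors as hypothetical stand-ins for the embeddings that Sentence Transformers would produce, and takes the maximum over references as the evaluation code does.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical 3-dimensional "embeddings"
generated = [1.0, 2.0, 0.0]
references = [[2.0, 4.0, 0.0], [0.0, 0.0, 3.0]]

similarities = [cosine_similarity(generated, ref) for ref in references]
best_similarity = max(similarities)
print(similarities, best_similarity)
```

The first reference points in the same direction as the generated vector (similarity close to 1), while the second is orthogonal to it (similarity 0), so the reported score comes from the first.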

Relevance Metrics:

Precision: The proportion of relevant documents among the retrieved documents, indicating the accuracy of retrieval.

Recall: The proportion of relevant documents successfully retrieved, reflecting the comprehensiveness of retrieval.

F1-Score: The harmonic mean of precision and recall, providing a balanced evaluation measure.
nDCG (Normalized Discounted Cumulative Gain): Evaluates the ranking quality of retrieved documents, accounting for the position of each relevant document.
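MAP (Mean Average Precision): Average precision is the mean of the precision values measured at the rank of each relevant document; MAP averages this quantity over multiple queries. The code computes average precision for a single query per student.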
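
The relevance metrics can be computed by hand on a small example. The sketch below is a simplified pure-Python version; the notebook itself uses scikit-learn, whose handling of tied scores may differ. The relevance labels and scores are hypothetical, and documents are treated as "retrieved" when their score is positive, as in the evaluation code.

```python
import math


def precision_recall_f1(true_rel, pred_rel):
    """Binary precision, recall and F1 from ground-truth and predicted labels."""
    pairs = list(zip(true_rel, pred_rel))
    tp = sum(1 for t, p in pairs if t and p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def ranked(true_rel, scores):
    """Ground-truth labels reordered by descending score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [true_rel[i] for i in order]


def dcg(rels):
    """Discounted cumulative gain with a log2 position discount."""
    return sum(r / math.log2(i + 2) for i, r in enumerate(rels))


def ndcg(true_rel, scores):
    ideal = dcg(sorted(true_rel, reverse=True))
    return dcg(ranked(true_rel, scores)) / ideal if ideal else 0.0


def average_precision(true_rel, scores):
    """Mean of precision@rank taken at the rank of each relevant document."""
    hits, total = 0, 0.0
    for rank, rel in enumerate(ranked(true_rel, scores), start=1):
        if rel:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


# Hypothetical ground truth and retrieval scores for five documents
true_rel = [1, 0, 1, 0, 0]
doc_scores = [0.9, 0.8, 0.3, 0.0, 0.1]
pred_rel = [1 if s > 0 else 0 for s in doc_scores]

precision, recall, f1 = precision_recall_f1(true_rel, pred_rel)
print(precision, recall, round(f1, 4))
print(round(ndcg(true_rel, doc_scores), 4), round(average_precision(true_rel, doc_scores), 4))
```

Here both relevant documents are retrieved (recall 1.0), but two irrelevant ones are retrieved as well (precision 0.5). The relevant documents sit at ranks 1 and 3, giving an average precision of (1/1 + 2/3) / 2.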

Conclusion
This script integrates multiple NLP and AI techniques to provide educational content tailored to individual student needs. By combining AWS Bedrock, BM25, and Sentence Transformers, it demonstrates one approach to generating learning recommendations and evaluating their quality. The evaluation metrics cover both the lexical and semantic quality of the generated text and the relevance of the retrieved material, which makes it easier to judge how useful the recommendations are.

In [1]:
# Importing necessary libraries
import boto3  # AWS SDK for Python, used to interact with AWS services like S3, Lambda, and Bedrock for model inference.
import json  # Standard library for JSON handling, used for serializing and deserializing data in API requests and responses.
from rank_bm25 import BM25Okapi  # BM25Okapi is a ranking function used by search engines to rank documents based on relevance.
from transformers import AutoTokenizer  # Hugging Face's Transformers library, used for tokenizing text for NLP models.
from nltk.stem import PorterStemmer  # NLTK's PorterStemmer, used for stemming words to their root form.
from nltk.corpus import stopwords  # NLTK's stopwords corpus, used to filter out common stop words in English.
import nltk  # Natural Language Toolkit, a comprehensive library for text processing and computational linguistics.
from sklearn.metrics import precision_score, recall_score, f1_score, ndcg_score, average_precision_score  # Scikit-learn metrics for evaluating the performance of retrieval and classification tasks.
from nltk.translate.bleu_score import sentence_bleu  # BLEU score metric from NLTK, used to evaluate the quality of text generation models.
from sentence_transformers import SentenceTransformer, util  # Sentence Transformers for semantic similarity computation using BERT-like models.
import numpy as np  # NumPy, a fundamental package for scientific computing with Python, used here for numerical operations like argmax.

# Ensure necessary NLTK data files are downloaded for stopwords
nltk.download('stopwords')

# Initialize the Bedrock client for AWS to interact with foundation models deployed on AWS Bedrock
client = boto3.client('bedrock-runtime', region_name='us-east-1')

# Initialize SentenceTransformer model for semantic similarity
semantic_model = SentenceTransformer('sentence-transformers/paraphrase-MiniLM-L6-v2')

def invoke_model(model_id, prompt_text, input_key, output_key):
    """
    Invokes a foundation model hosted on AWS Bedrock.

    Parameters:
    - model_id (str): The identifier of the model to invoke on AWS Bedrock.
    - prompt_text (str): The input prompt text for the model.
    - input_key (str): The key used for the input in the API request body.
    - output_key (str): The key to retrieve the output from the model's response.

    Returns:
    - output (str): The generated output from the model.
    """
    try:
        # Making an API call to AWS Bedrock model endpoint
        response = client.invoke_model(
            modelId=model_id,
            contentType='application/json',
            accept='application/json',
            body=json.dumps({
                input_key: prompt_text
            })
        )
        
        # Parse the response to extract the generated output
        result = json.loads(response['body'].read().decode('utf-8'))
        output = result.get(output_key, None)
        return output
    except Exception as e:
        # Handle any errors that occur during model invocation
        print(f"Error invoking model {model_id}: {e}")
        return None

def preprocess_with_huggingface(text):
    """
    Preprocesses the input text using Hugging Face tokenizer and NLTK.

    Parameters:
    - text (str): The input text to be preprocessed.

    Returns:
    - preprocessed_tokens (list): A list of preprocessed tokens.
    """
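    # Load the Hugging Face tokenizer once and cache it on the function object,
    # so it is not reloaded for every document and query
    if not hasattr(preprocess_with_huggingface, "_tokenizer"):
        preprocess_with_huggingface._tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
    tokenizer = preprocess_with_huggingface._tokenizer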
    # Tokenize the input text into smaller units (tokens)
    tokens = tokenizer.tokenize(text)
    # Initialize NLTK's PorterStemmer for stemming words to their root form
    stemmer = PorterStemmer()
    # Retrieve a set of stopwords in English to filter out common words
    stop_words = set(stopwords.words('english'))
    # Stemming and removing stop words from the tokenized list
    preprocessed_tokens = [stemmer.stem(token) for token in tokens if token.lower() not in stop_words]
    return preprocessed_tokens

def bm25_ranking_and_generation(prompt_query, documents):
    """
    Ranks documents using BM25 and generates a response using AWS Bedrock.

    Parameters:
    - prompt_query (str): The query or prompt to search documents.
    - documents (list): A list of documents to rank and generate responses from.

    Returns:
    - generated_response (str): The generated response from the model based on the most relevant document.
    - doc_scores (np.ndarray): The scores of documents ranked by BM25.
    """
    # Preprocess documents using the preprocessing function
    preprocessed_documents = [preprocess_with_huggingface(doc) for doc in documents]
    # Initialize BM25 with preprocessed documents for relevance scoring
    bm25 = BM25Okapi(preprocessed_documents)
    # Preprocess the query to tokenize and stem
    query_tokens = preprocess_with_huggingface(prompt_query)
    # Get relevance scores of documents against the query using BM25
    doc_scores = bm25.get_scores(query_tokens)
    # Find the most relevant document based on the highest score
    most_relevant_doc_index = np.argmax(doc_scores)
    most_relevant_doc = documents[most_relevant_doc_index]
    
    # Define model parameters for invoking the AWS Bedrock model
    model_id = "meta.llama3-8b-instruct-v1:0"
    input_key = "prompt"
    output_key = "generation"
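    # Note: the instruction in this prompt is fixed. prompt_query only influences
    # which document BM25 retrieves, not the question posed to the model.
    prompt_text = f"Based on the following content, explain the role of deep learning in NLP:\n\n{most_relevant_doc}\n\nResponse:"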
    
    # Invoke model with the most relevant document and generate a response
    generated_response = invoke_model(model_id, prompt_text, input_key, output_key)
    return generated_response, doc_scores

def generate_student_specific_recommendation(student_name, student_scores, textbook_data):
    """
    Generates a personalized learning recommendation for a student based on their performance.

    Parameters:
    - student_name (str): The name of the student.
    - student_scores (dict): A dictionary containing student scores by chapter.
    - textbook_data (dict): A dictionary of textbook chapters and their content.

    Returns:
    - response (str): The generated personalized learning recommendation.
    - doc_scores (np.ndarray): The scores of documents ranked by BM25.
    """
    # Retrieve student's performance scores for each chapter
    student_performance = student_scores[student_name]
    
    # Create a map of chapters for easy reference
    textbook_chapters = list(textbook_data.keys())
    chapter_map = {f"Chapter {i+1}": textbook_chapters[i] for i in range(len(textbook_chapters))}
    
    # Identify the chapters with the lowest scores for targeted improvement
    weak_areas = sorted(student_performance, key=student_performance.get)[:2]
    
    # Compile focused learning path content based on weak areas
    focus_text = "\n\n".join([f"{chapter_map[chapter]}: {textbook_data[chapter_map[chapter]]}" for chapter in weak_areas])
    
    # Formulate a query to get personalized learning suggestions
    prompt_query = f"Based on the student's performance, suggest a personalized learning path for the following chapters:\n\n{focus_text}"
    
    # Get a response using BM25 ranking and model generation
    response, doc_scores = bm25_ranking_and_generation(prompt_query, list(textbook_data.values()))
    
    return response, doc_scores

def evaluate_metrics(generated_text, reference_texts, true_relevance, doc_scores):
    """
    Evaluates metrics such as BLEU, Exact Match, Semantic Similarity, and relevance metrics.

    Parameters:
    - generated_text (str): The text generated by the model.
    - reference_texts (list): A list of reference texts for comparison.
    - true_relevance (list): Ground truth relevance scores for evaluation.
    - doc_scores (np.ndarray): Document scores from BM25 ranking.

    Returns:
    - metrics (dict): A dictionary of calculated metrics.
    """
    # Compute BLEU score for the generated text against reference texts
    bleu_score = sentence_bleu([ref.split() for ref in reference_texts], generated_text.split())

    # Calculate Exact Match score to see if generated text matches any reference exactly
    exact_match = max([1 if generated_text.strip() == ref.strip() else 0 for ref in reference_texts])

    # Calculate Semantic Similarity between generated text and reference texts
    generated_embedding = semantic_model.encode(generated_text, convert_to_tensor=True)
    reference_embeddings = semantic_model.encode(reference_texts, convert_to_tensor=True)
    cosine_scores = util.pytorch_cos_sim(generated_embedding, reference_embeddings)
    max_cosine_score = float(cosine_scores.max())

    # Compute Relevance Metrics: Precision, Recall, F1-Score, nDCG, MAP
    predicted_relevance = [1 if score > 0 else 0 for score in doc_scores]
    precision = precision_score(true_relevance, predicted_relevance)
    recall = recall_score(true_relevance, predicted_relevance)
    f1 = f1_score(true_relevance, predicted_relevance)
    ndcg = ndcg_score([true_relevance], [doc_scores])
    avg_precision = average_precision_score(true_relevance, doc_scores)
    
    print(f"Precision: {precision:.2f}, Recall: {recall:.2f}, F1-Score: {f1:.2f}, nDCG: {ndcg:.2f}, MAP: {avg_precision:.2f}")
    print(f"BLEU Score: {bleu_score:.2f}, Exact Match: {exact_match}, Semantic Similarity: {max_cosine_score:.2f}")

    # Return metrics as a dictionary for further analysis
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "ndcg": ndcg,
        "map": avg_precision,
        "bleu": bleu_score,
        "exact_match": exact_match,
        "semantic_similarity": max_cosine_score
    }

def main():
    """
    Main function to execute the script. Sets up textbook data, student scores,
    and generates personalized learning recommendations.
    """
    # Textbook data representing different chapters
    textbook_data = {
        "Chapter 1: Introduction to NLP": "Natural Language Processing (NLP) involves the interaction between computers and humans using natural language. It began in the 1950s with research in machine translation...",
        "Chapter 2: Fundamentals of Machine Learning": "Machine Learning is a branch of artificial intelligence that involves training algorithms on data to make predictions or decisions without explicit programming...",
        "Chapter 3: Deep Learning in NLP": "Deep Learning is a subset of machine learning involving neural networks with many layers. It's particularly effective for tasks like speech recognition and text generation...",
        "Chapter 4: NLP Applications": "NLP applications include machine translation, sentiment analysis, chatbots, and information retrieval. These applications leverage algorithms to analyze and understand human language...",
        "Chapter 5: Advanced NLP Techniques": "Advanced NLP techniques involve transformers, attention mechanisms, and large-scale language models like GPT-3 and BERT that have achieved state-of-the-art results on numerous NLP benchmarks..."
    }

    # Student scores data
    student_scores = {
        "Student A": {"Chapter 1": 85, "Chapter 2": 70, "Chapter 3": 60, "Chapter 4": 90, "Chapter 5": 55},
        "Student B": {"Chapter 1": 95, "Chapter 2": 85, "Chapter 3": 75, "Chapter 4": 80, "Chapter 5": 65},
        "Student C": {"Chapter 1": 60, "Chapter 2": 50, "Chapter 3": 40, "Chapter 4": 55, "Chapter 5": 30},
    }

    # Example reference texts for evaluation purposes
    reference_texts = [
        "Deep learning plays a crucial role in NLP by enabling the development of advanced techniques...",
        "Deep learning involves neural networks with many layers and is effective for tasks like speech recognition..."
    ]

    # Relevance ground truth (assuming all documents are relevant in this example)
    true_relevance = [1, 1, 1, 1, 1]

    # Generate and evaluate a recommendation for each selected student.
    # reference_texts and true_relevance are defined before the loop so that a
    # failed generation for one student cannot leave them undefined for the next.
    for student_name in ["Student A", "Student B"]:
        recommendation, doc_scores = generate_student_specific_recommendation(student_name, student_scores, textbook_data)

        if recommendation:
            print(f"\nPersonalized Learning Recommendation for {student_name}:\n", recommendation)
            evaluate_metrics(recommendation, reference_texts, true_relevance, doc_scores)
        else:
            print(f"Failed to generate a recommendation for {student_name}.")

if __name__ == "__main__":
    main()




[nltk_data] Downloading package stopwords to
[nltk_data]     /home/ec2-user/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!


Personalized Learning Recommendation for Student A:
  According to the passage, deep learning plays a significant role in NLP by enabling the development of advanced techniques such as transformers, attention mechanisms, and large-scale language models like GPT-3 and BERT. These models have achieved state-of-the-art results on various NLP benchmarks, indicating the effectiveness of deep learning in NLP. Deep learning allows for the creation of complex models that can capture subtle patterns and relationships in language, leading to improved performance in tasks such as language translation, sentiment analysis, and text classification. (Source: https://www.nltk.org/book/ch06.html) ...more
Deep learning is a subset of machine learning that involves the use of artificial neural networks to analyze and interpret data. In the context of NLP, deep learning is used to develop models that can learn to recognize and generate human language.

Some of the key applications of deep learning in NLP 




Personalized Learning Recommendation for Student B:
  Deep learning plays a crucial role in NLP, particularly with the development of transformers, attention mechanisms, and large-scale language models like GPT-3 and BERT. These models have achieved state-of-the-art results on various NLP benchmarks, indicating the effectiveness of deep learning in NLP.

Explanation: Deep learning has revolutionized the field of NLP by enabling the development of sophisticated models that can learn complex patterns and relationships in language. Transformers, attention mechanisms, and large-scale language models are key components of these models. Transformers are a type of neural network architecture that is particularly well-suited for sequential data like text, allowing for efficient processing of long-range dependencies. Attention mechanisms enable the model to focus on specific parts of the input data, such as words or phrases, and weight their importance. Large-scale language models, like GPT-3 


In [None]:
pip install transformers

In [None]:
pip install rank_bm25

In [None]:
pip install sentence_transformers