# Course Identification and Recommendation

This notebook builds a program that identifies which course the current user is taking and then recommends the next course.

## Process Description

1. **Document Creation**: Initially, three documents will be created to reflect the content of each course.

2. **Current Course Identification**: Vectors will be used to identify the current course of the user by analyzing and comparing their progress with the course content.

3. **Next Course Recommendation**: Finally, an AI model from IBM will be used to generate the output, recommending the next course.

## Requirements

- Python 3.x
- scikit-learn
- ibm-watson-machine-learning

## Document Creation

Three text variables were loaded with the information from the Word document.

In [1]:
course1_description = """Course: Introduction to SRE
•	Class: Definition: Site Reliability Engineering (SRE) is a discipline that incorporates software engineering and applies it to infrastructure and operations problems, aiming to create scalable and highly reliable software systems. SRE ensures that an organization's systems are reliable and available.
•	Class: Role of an SRE: SREs are responsible for maintaining the reliability, availability, and performance of systems. They do this by automating tasks, monitoring systems, responding to incidents, and working closely with development teams to ensure that services meet user expectations.
•	Class: Relevant Course: IBM Professional SRE: Welcome and Introduction
    o	This course provides a foundational understanding of what SRE is and its significance in maintaining reliable and scalable systems.
"""


In [2]:
course2_description = """
• Class: Embrace Risk: SRE involves making informed decisions about risk. This principle recognizes that risk is a natural part of operating a service, and the goal is to manage and mitigate it effectively.

• Class: Service Level Objectives (SLOs): SLOs are the agreed-upon levels of performance that a service is expected to meet. These are critical in defining the reliability targets for services.

• Class: Relevant Course: Negotiating Service Level Objectives, Service Level
  o This course delves into the process of negotiating and defining SLOs, helping SREs establish clear expectations for service performance.

• Class: Error Budgets: Error budgets are a critical concept in SRE, defining how much unreliability is acceptable in a service. They guide decisions on whether to prioritize reliability or feature development.

• Class: Blameless Postmortems: When incidents occur, SREs conduct blameless postmortems to learn from the failures without assigning blame. This fosters a culture of continuous improvement.

• Class: Apply Software Engineering Principles to Drive Reliability:
  o This course emphasizes applying software engineering principles to improve the reliability and performance of systems, reinforcing the technical foundation of SRE practices.

• Class: Perform Operational Readiness Reviews (ORR): ORRs are comprehensive assessments conducted before launching or modifying a service, ensuring it is ready for production.

• Class: Relevant Courses:
  o Perform Operational Readiness Reviews on IBM: Learn how to conduct effective ORRs to ensure that services are production-ready.
  o Creating an ORR Checklist: This course offers a practical guide for creating detailed ORR checklists, a critical tool in the review process.
"""

In [3]:
course3_description = """
• Class: Trade-Offs in SRE: One of the key challenges in SRE is balancing the need for rapid changes and deployments with the need to maintain high reliability.

• Class: Relevant Course: Manage Trade-Off between Change, Velocity, and Reliability
  o This course focuses on strategies for managing the trade-offs between deploying changes quickly and maintaining system reliability.

• Class: Cost Optimization in SRE: SRE also involves managing the cost-efficiency of running services. Optimizing costs without compromising on reliability is a crucial part of the role.

• Class: Relevant Course: Employ Cost-Optimization Strategies
  o This course teaches strategies to optimize the cost of operations while ensuring that reliability and performance are not compromised.

• Class: Monitoring and Observability: Monitoring is a foundational practice in SRE, providing the data needed to ensure services are operating within their SLOs.

• Class: Relevant Course: Overview of Monitoring
  o This course provides an overview of the key monitoring strategies and tools used in SRE to maintain observability and ensure system health.
"""

In [4]:
!pip install scikit-learn


## Index the Information

Now let's index the knowledge base by title.

In [5]:
from sklearn.feature_extraction.text import TfidfVectorizer

def generate_tfidf_representation(documents):
    vectorizer = TfidfVectorizer()
    tfidf_vector = vectorizer.fit_transform(documents)
    return tfidf_vector

documents = [
    course1_description,
    course2_description,
    course3_description
]

knowledge_base = [ 
    { 
        "title"     : "Course: Introduction to SRE", 
        "Author"    : "IBM team",
        "Published" : "2024",
        "txt"       : course1_description 
    }, 
    {
        "title"     : "Course: Core Principles of SRE",
        "Author"    : "IBM team",
        "Published" : "2024",
        "txt"       : course2_description
    }, 
    {
        "title"     : "Course: Managing Change, Velocity, and Reliability",
        "Author"    : "IBM team",
        "Published" : "2024",
        "txt"       : course3_description
    }
]
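To make the indexing step less of a black box, the following is a minimal pure-Python sketch of the idea behind TF-IDF weighting. It is an illustration only, not the notebook's implementation: `tokenize`, `tfidf_vectors`, and `toy_docs` are hypothetical names, the tokenizer is a plain whitespace split, and the length normalization that `TfidfVectorizer` applies by default is left out.

```python
import math
from collections import Counter

def tokenize(text):
    # Lowercase and split on whitespace; far simpler than scikit-learn's tokenizer.
    return text.lower().split()

def tfidf_vectors(docs):
    """Return one {term: weight} dict per document (simplified TF-IDF)."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term appear?
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            # Smoothed IDF: a term present in every document still gets weight 1 per occurrence.
            term: count * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors

# Toy documents (assumed for illustration, not the course descriptions).
toy_docs = [
    "introduction to sre",
    "error budgets and sre principles",
    "monitoring and cost optimization",
]
vectors = tfidf_vectors(toy_docs)
# "sre" appears in two documents and "introduction" in only one,
# so "introduction" receives the larger weight in the first document.
print(vectors[0])
```

Rare terms end up with larger weights than common ones, which is why a distinctive phrase such as "error budgets" can single out one course.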

## Current Course Identification

Let's use TF-IDF vectors to find the current course.

In [6]:
from sklearn.metrics.pairwise import cosine_similarity

def search( phrase ):
    # Vectorize the phrase together with the documents so they share one vocabulary
    texts = [phrase] + documents
    tfidf_matrix = generate_tfidf_representation(texts)
    phrase_vector = tfidf_matrix[0]      # The TF-IDF vector for the phrase
    document_vectors = tfidf_matrix[1:]  # The TF-IDF vectors for the documents
    # Compute cosine similarity between the phrase and each document
    similarities = cosine_similarity(phrase_vector, document_vectors).flatten()
    # If the phrase shares no terms with any document, report "no match" as -1
    if similarities.max() <= 0:
        return -1
    # Return the index of the document with the highest similarity
    return int(similarities.argmax())
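The comparison step can also be sketched without any library. The example below is a simplified illustration under stated assumptions: the vectors are hand-written `{term: weight}` dictionaries rather than real TF-IDF output, and `cosine` and `closest` are hypothetical helper names. It returns `-1` when nothing overlaps, which is one way to make the `index >= 0` checks used in this notebook meaningful.

```python
import math

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as {term: weight}."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def closest(query_vec, doc_vecs):
    """Index of the most similar document, or -1 if no document shares a term."""
    scores = [cosine(query_vec, d) for d in doc_vecs]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best if scores[best] > 0 else -1

# Hand-written toy vectors (assumed for illustration).
query = {"error": 1.0, "budgets": 1.0}
docs = [
    {"introduction": 1.0, "sre": 1.0},
    {"error": 1.0, "budgets": 1.0, "slo": 1.0},
    {"monitoring": 1.0, "cost": 1.0},
]
print(closest(query, docs))                # 1
print(closest({"kubernetes": 1.0}, docs))  # -1
```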

The following cell displays the results of a few test queries.

In [7]:
test_queries = [
    "Which course contains classes for risk management?",
    "Which course contains information of the error budgets?",
    "Which course contains about monitoring?",
]

# Run each test query and report the best-matching course
for test_query in test_queries:
    index = search( test_query )
    if index >= 0:
        print( "Index: " + str( index ) + "\nArticle: \"" + knowledge_base[index]["title"] + "\"" )
    else:
        print( "No matching content was found" )



Index: 1
Article: "Course: Core Principles of SRE"
Index: 1
Article: "Course: Core Principles of SRE"
Index: 2
Article: "Course: Managing Change, Velocity, and Reliability"


## Prompt Template

Let's create a template that joins the retrieved course content and the user's question into a single prompt, which is then sent to a generative model to produce a fuller response.

In [39]:
prompt_template = """
Article:
###
%s
###

Answer the following question using only information from the article. 
Answer in a complete sentence, with proper capitalization and punctuation. 

Question: %s
Answer: 
"""

def augment( template_in, context_in, query_in ):
    return template_in % ( context_in, query_in )
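As a small illustration of how `%`-style templates behave, the sketch below uses a shortened, hypothetical template (`short_template`) and made-up context strings. It shows that the two `%s` slots are filled in order, and that a literal `%` inside the context is left untouched because only the template itself is interpreted.

```python
# Hypothetical shortened template, for illustration only.
short_template = "Article:\n###\n%s\n###\n\nQuestion: %s\nAnswer: "

def augment(template_in, context_in, query_in):
    # The tuple fills the %s slots in order: context first, then query.
    return template_in % (context_in, query_in)

toy_context = "Error budgets define how much unreliability is acceptable, e.g. 0.1% of requests."
toy_query = "What is an error budget?"

prompt = augment(short_template, toy_context, toy_query)
print(prompt)
```

A literal percent sign in the template itself would have to be written as `%%`, so it is worth keeping such characters in the context rather than in the template.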

In [40]:
query = "Which course contains about monitoring?"

index = search( query )
article_txt = knowledge_base[index]["txt"]

augmented_prompt = augment( prompt_template, article_txt, query )

print( augmented_prompt )


Article:
###

• Class: Trade-Offs in SRE: One of the key challenges in SRE is balancing the need for rapid changes and deployments with the need to maintain high reliability.

• Class: Relevant Course: Manage Trade-Off between Change, Velocity, and Reliability
  o This course focuses on strategies for managing the trade-offs between deploying changes quickly and maintaining system reliability.

• Class: Cost Optimization in SRE: SRE also involves managing the cost-efficiency of running services. Optimizing costs without compromising on reliability is a crucial part of the role.

• Class: Relevant Course: Employ Cost-Optimization Strategies
  o This course teaches strategies to optimize the cost of operations while ensuring that reliability and performance are not compromised.

• Class: Monitoring and Observability: Monitoring is a foundational practice in SRE, providing the data needed to ensure services are operating within their SLOs.

• Class: Relevant Course: Overview of Monitorin

In [41]:
!pip install ibm-watson-machine-learning


## IBM Cloud

The following cell contains the settings needed to connect to the model hosted on IBM Cloud.

In [42]:
import os
from ibm_watson_machine_learning.foundation_models import Model

gen_parms = { 
    "DECODING_METHOD" : "greedy", 
    "MIN_NEW_TOKENS" : 1, 
    "MAX_NEW_TOKENS" : 50 
}

model_id = "google/flan-t5-xxl"
api_key = os.getenv("IBM_API_KEY")
region = os.getenv("IBM_REGION")
project_id = os.getenv("IBM_PROJECT_ID")

credentials = {
    "apikey": f"{api_key}",
    "url": f"https://{region}.ml.cloud.ibm.com"  # Replace with your region
}

model = Model( model_id, credentials, gen_parms, project_id )
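Because `os.getenv` returns `None` for an unset variable, a missing setting would only surface later as a confusing connection error. The following is a minimal sketch of an early check. `REQUIRED_VARS`, `load_settings`, and the `demo_env` values (including the `us-south` region) are assumptions made for illustration; only the three variable names come from the cell above.

```python
import os

REQUIRED_VARS = ("IBM_API_KEY", "IBM_REGION", "IBM_PROJECT_ID")

def load_settings(environ=os.environ):
    """Return the required settings, raising early if any is missing or empty."""
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in REQUIRED_VARS}

# Demonstration with an in-memory mapping instead of the real environment.
demo_env = {
    "IBM_API_KEY": "dummy-key",          # placeholder, not a real key
    "IBM_REGION": "us-south",            # example region value
    "IBM_PROJECT_ID": "dummy-project",   # placeholder
}
settings = load_settings(demo_env)
url = "https://" + settings["IBM_REGION"] + ".ml.cloud.ibm.com"
print(url)
```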

In [43]:
import json

def generate( model_in, augmented_prompt_in ):
    
    generated_response = model_in.generate( augmented_prompt_in )

    if ( "results" in generated_response ) \
       and ( len( generated_response["results"] ) > 0 ) \
       and ( "generated_text" in generated_response["results"][0] ):
        return generated_response["results"][0]["generated_text"]
    else:
        print( "The model failed to generate an answer" )
        print( "\nDebug info:\n" + json.dumps( generated_response, indent=3 ) )
        return ""
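The response-parsing logic can be exercised without calling IBM Cloud by passing in a stand-in object. The sketch below is an assumption-laden illustration: `StubModel`, `EmptyModel`, and `extract_text` are hypothetical names, and the stubs only mimic the `{"results": [{"generated_text": ...}]}` shape that `generate` checks for above.

```python
class StubModel:
    """Hypothetical stand-in that returns a canned response in the expected shape."""
    def __init__(self, text):
        self.text = text

    def generate(self, prompt):
        return {"results": [{"generated_text": self.text}]}

class EmptyModel:
    """Hypothetical stand-in that simulates a response with no results."""
    def generate(self, prompt):
        return {"results": []}

def extract_text(response):
    """Pull the first generated_text out of a response dict, or '' if absent."""
    results = response.get("results") or []
    if results and "generated_text" in results[0]:
        return results[0]["generated_text"]
    return ""

print(repr(extract_text(StubModel("Overview of Monitoring").generate("ignored"))))
print(repr(extract_text(EmptyModel().generate("ignored"))))
```

Since `generate` only relies on the object having a `generate` method, a stub like this could also be passed to it directly in place of the real model.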

In [44]:
output = generate( model, augmented_prompt )
print( output )

Overview of Monitoring


## Next Course Recommendation
The following function will combine everything and prepare a response for the user based on the model's findings for the current course the user is taking.
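The index arithmetic behind the recommendation can be looked at in isolation. This is a minimal sketch, assuming the courses are stored in the order they should be taken; `next_course_index` and the `titles` list are hypothetical helpers for illustration, with the titles copied from the knowledge base.

```python
def next_course_index(current_index, n_courses):
    """Return the index of the following course, or None after the last one."""
    if not 0 <= current_index < n_courses:
        raise IndexError("current_index out of range")
    following = current_index + 1
    return following if following < n_courses else None

titles = [
    "Course: Introduction to SRE",
    "Course: Core Principles of SRE",
    "Course: Managing Change, Velocity, and Reliability",
]

for i, title in enumerate(titles):
    nxt = next_course_index(i, len(titles))
    print(title, "->", titles[nxt] if nxt is not None else "(no further course)")
```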

In [45]:
def searchAndAnswer( knowledge_base_in, model ):
    
    question = input( "Type your question:\n")
    if not question.strip():
        print( "No question")
        return
        
    # Identify the course the question refers to (the user's current course)
    top_matching_index = search( question )
    if top_matching_index < 0:
        print( "No good answer was found in the knowledge base" )
        return
    next_course_index = top_matching_index + 1

    if next_course_index >= len( knowledge_base_in ):
        # The current course is the last one, so there is nothing left to recommend
        asset = knowledge_base_in[top_matching_index]

        print( "\nQuestion:\n" + question )
        print( "\nCurrent course: \"" + asset["title"] + "\", " + asset["Author"] + " (" + asset["Published"] + ")"  )
        print( "\nYou're done. Thank you!\n")
    else:
        asset = knowledge_base_in[next_course_index]
        asset_txt  = asset["txt"]
        # Augment a prompt with the next course's content as context
        augmented_prompt = augment( prompt_template, asset_txt, question )
    
        # Generate output
        output = generate( model, augmented_prompt )
        if not output.strip():
            print( "The model failed to generate an answer")
            return
        print( "\nQuestion:\n" + question )
        print( "\nAnswer:\n" + output )
        print( "\nSource: \"" + asset["title"] + "\", " + asset["Author"] + " (" + asset["Published"] + ")"  )


In [48]:
searchAndAnswer( knowledge_base, model )


Question:
I am done with the course that talk about error bugets, waht is next ?

Answer:
• Class: Trade-Offs in SRE: One of the key challenges in SRE is balancing the need for rapid changes and deployments with the need to maintain high reliability. • Class: Relevant Course: Manage Trade-Off

Source: "Course: Managing Change, Velocity, and Reliability", IBM team (2024)


In [49]:
searchAndAnswer( knowledge_base, model )


Question:
I am at "Cost Optimization in SRE" what is next ?

Current course: "Course: Managing Change, Velocity, and Reliability", IBM team (2024)

You're done. Thank you!

