# Self-learning Text Summarization 🐬

##### 💡 **Research Areas:** Rapid Prototyping, Generative AI, Iterative Refinement, Text Summarization.

#### This is a simple prototype demonstrating iterative refinement, i.e., repeatedly passing a summary back through the LLM to improve it, as a way to generate high-quality text summaries.

<div style="display:flex; align-items:center; padding: 50px;">
<p style="margin-right:10px;">
    <img height="300px" style="width:auto;" width="200px" src="https://avatars.githubusercontent.com/u/192148546?s=400&u=95d76fbb02e6c09671d87c9155f17ca1e4ef8f21&v=4"> 
</p>
</div>

## Description:

This project demonstrates the power of iterative refinement in generating high-quality text summaries. Using an AI-driven pipeline, the system analyzes input text, generates a summary, and improves it iteratively based on self-feedback to enhance clarity, coherence, and conciseness.

- Core Features:

    - Rapid prototyping using LLMs.

    - Iterative refinement for quality improvement.

    - Multi-step summarization with context understanding.

- Applications:

    - Efficient text summarization.

    - Adaptable for multiple formats and contexts (plain text, markdown).

    - Provides metadata and quality evaluations.

This tool simplifies complex content for easier understanding and supports detailed refinement for professional use cases.


## Step 1: Environment Initialization and Dependency Setup

This script serves as a boilerplate for setting up a Python environment in Jupyter notebooks, ensuring that all dependencies are installed and required environment variables are configured. Below is a breakdown of its key functionalities:

---

### 1. **Purpose**

The script is designed to:
  
- Install necessary Python libraries specified in a `requirements.txt` file.

- Verify the presence of essential environment variables (e.g., `OPENAI_API_KEY`).

- Provide retry mechanisms for handling transient installation errors.


---

### 2. **Installing Requirements**
#### `install_requirements()` Function:

- **Purpose:**

  Installs packages listed in the `requirements.txt` file using `pip install -r requirements.txt`.

- **Features:**

  - Uses a `requirements_installed` flag to avoid redundant installations.

  - Implements a retry mechanism (`max_retries` and `retries`) to address transient issues during installation.

  - Terminates with a clear error message if all retries fail.
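
The retry behaviour described above can also be written as a loop instead of recursion. The following is a minimal sketch, not the notebook's implementation: `install_with_retries` and its `runner` parameter are hypothetical names introduced here so the logic can be exercised without actually invoking pip.

```python
import subprocess
import sys


def install_with_retries(command, max_retries=3, runner=subprocess.call):
    """Run `command` until it exits with 0 or the retry budget is used up.

    One initial attempt is made, followed by at most `max_retries` retries.
    Returns the number of attempts on success; raises RuntimeError otherwise.
    `runner` is a hypothetical hook that makes the function testable.
    """
    attempts = 0
    while attempts <= max_retries:
        attempts += 1
        if runner(command) == 0:
            return attempts
        print(f"Attempt {attempts} failed.")
    raise RuntimeError(f"Command failed after {attempts} attempts: {command}")


# Assumed command line; sys.executable ties pip to the running interpreter.
PIP_COMMAND = [sys.executable, "-m", "pip", "install", "-r", "requirements.txt"]
```

Calling `install_with_retries(PIP_COMMAND)` would perform the real installation. Raising an exception instead of calling `exit(1)` is a design choice of this sketch: it leaves the decision to stop the notebook to the caller.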

---

### 3. **Environment Setup**

#### **Loading Environment Variables:**

- Uses `load_dotenv()` from the `python-dotenv` package (imported as `dotenv`) to load variables from a `.env` file into the environment.

- This keeps sensitive credentials such as API keys out of the notebook itself.


#### `setup_env()` Function:

- **Purpose:**

  Ensures all required environment variables are set before proceeding.


- **Process:**

  - Checks against the `REQUIRED_ENV_VARS` list.

  - If an environment variable is missing:

    - Displays a clear error message.

    - Terminates the execution to prevent running the notebook in an incomplete state.
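
As an illustration of the check described above, here is a small sketch that collects every missing variable before reporting, rather than stopping at the first one. The helper name `find_missing_env_vars` and its `environ` parameter are assumptions made for this example. Unlike the notebook's `check_env`, it also treats an empty string as missing.

```python
import os


def find_missing_env_vars(required, environ=None):
    """Return the names in `required` that are unset or empty.

    `environ` defaults to os.environ; passing a plain dict makes the
    function easy to test (hypothetical parameter, not in the notebook).
    """
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]


# Example with an explicit mapping instead of the real environment.
missing = find_missing_env_vars(["OPENAI_API_KEY"], {"OTHER_VAR": "1"})
if missing:
    print(f"Please set: {', '.join(missing)}")
```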


---

### 4. **Clear Output and Final Message**

- **`clear_output()`:**

  Cleans up notebook cell output to improve readability after setup.

- **Success Message:**

  Indicates that the setup is complete and users can proceed confidently.

---

### 5. **Resilience and User Guidance**

- **Error Handling:**

  Incorporates retry logic for dependency installation, reducing the need for manual intervention.
  
- **Informative Feedback:**

  Provides clear and actionable error messages to guide users in resolving issues, such as setting missing environment variables.
  

---

### Summary
This script automates the setup of a Python environment in Jupyter notebooks. By installing dependencies, verifying required environment variables, and retrying failed installations, it lets the rest of the notebook assume that packages and credentials are in place.


In [None]:
# Boilerplate: This block goes into every notebook.
# It sets up the environment, installs the requirements, and checks for the required environment variables.

import os
from IPython.display import clear_output

requirements_installed = False
max_retries = 3
retries = 0
REQUIRED_ENV_VARS = ["OPENAI_API_KEY"]


def install_requirements():
    """Installs the requirements from requirements.txt file"""
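    # `retries` is reassigned below, so it must be declared global too;
    # otherwise `retries += 1` raises UnboundLocalError.
    global requirements_installed, retries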
    if requirements_installed:
        print("Requirements already installed.")
        return

    print("Installing requirements...")
    install_status = os.system("pip install -r requirements.txt")
    if install_status == 0:
        print("Requirements installed successfully.")
        requirements_installed = True
    else:
        print("Failed to install requirements.")
        if retries < max_retries:
            print("Retrying...")
            retries += 1
            return install_requirements()
        exit(1)
    return


def setup_env():
    """Sets up the environment variables"""
    # Imported here rather than at module level so that the import runs only
    # after install_requirements() has run.
    from dotenv import load_dotenv

    def check_env(env_var):
        value = os.getenv(env_var)
        if value is None:
            print(f"Please set the {env_var} environment variable.")
            exit(1)
        else:
            print(f"{env_var} is set.")

    load_dotenv()

    variables_to_check = REQUIRED_ENV_VARS

    for var in variables_to_check:
        check_env(var)


install_requirements()
setup_env()
clear_output()
print("🚀 Setup complete. Continue to the next cell.")

## Step 2: SelfLearningSummarizer Class Implementation

## Imports and Constants

- `import traceback:`  
  Used to provide detailed information about exceptions (errors) that occur during program execution, including the error message and stack trace.

- `from openai import OpenAI:`  
  Imports the OpenAI class to interact with OpenAI's API for generating text responses.

- `import os:`  
  Provides functions for interacting with the operating system, such as environment variables and file paths.

- `from uuid import uuid4:`  
  Used to generate unique identifiers for various sessions or processes (e.g., for iterative refinement).

- `DEFAULT_OPENAI_MODEL = "gpt-4o-mini":`  
  Specifies a default model to use when interacting with OpenAI's API.

## Prompts

- `SIMPLE_SUMMARIZATION_SYSTEM_PROMPT:`  
  A predefined system message that sets the behavior of the AI for summarization tasks.  
  Describes capabilities (e.g., context recognition, semantic comprehension) and expectations for analyzing input text.

- `SIMPLE_SUMMARIZATION_PROMPT:`  
  The main instruction template for summarizing text.  
  Contains detailed steps for analyzing and structuring the summary, ensuring adherence to specific requirements.

- `ITERATIVE_REFINEMENT_SYSTEM_PROMPT:`  
  Defines the AI's role for refining summaries iteratively.

- `ITERATIVE_REFINEMENT_PROMPT:`  
  Instructions for refining a summary in multiple iterations, focusing on clarity, coherence, and other factors.  
  Includes steps like self-evaluation and providing reasons for refinements.
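
To make the placeholder mechanism concrete, here is a minimal sketch of how `str.format` fills `{text}` and `{format}`. `TEMPLATE` is a shortened stand-in for the real prompts and `render_prompt` is a hypothetical helper. Note that `str.format` only parses the template, so curly braces inside the substituted text are passed through unchanged; literal braces in the template itself would need to be doubled (`{{` and `}}`).

```python
# Shortened stand-in for the notebook's prompt templates (illustrative only).
TEMPLATE = "Text: '{text}'\nRespond in the format '{format}' STRICTLY."


def render_prompt(text, format="plain_text"):
    """Fill the template's placeholders with the input text and format."""
    return TEMPLATE.format(text=text, format=format)


# Braces in the input text are not interpreted as placeholders.
print(render_prompt("Use {curly} braces freely.", format="markdown"))
```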

## Class Definition: SelfLearningSummarizer

- `class SelfLearningSummarizer::`  
  Defines a class that encapsulates methods for summarization, refinement, and comparison of summaries.

- `def __init__(self, model="gpt-4o-mini")::`  
  Constructor for initializing the summarizer instance.

- `self.llm = OpenAI(api_key=os.getenv("OPENAI_API_KEY")):`  
  Initializes the OpenAI client using the API key from environment variables.

- `self.model = model:`  
  Sets the default model for the summarizer.

## Method 1: get_summary

- `def get_summary(self, source_text: str, format="plain_text") -> str::`  
  Method to generate a summary for a given text.

- `Parameters:`  
  - `source_text:` The text to summarize.  
  - `format:` Specifies output format (plain_text or markdown).

- `try::`  
  Start of error-handling block for catching exceptions.

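- `if format not in ["plain_text", "markdown"]::`  
  Validates the `format` parameter. An invalid value raises `ValueError` inside the `try` block, so it is caught by the `except` clause and the method returns an empty string.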

- `system = SIMPLE_SUMMARIZATION_SYSTEM_PROMPT:`  
  Assigns the summarization system prompt.

- `prompt = SIMPLE_SUMMARIZATION_PROMPT.format(text=source_text, format=format):`  
  Prepares the summarization prompt by replacing placeholders (`{text}`, `{format}`) with the actual input text and desired format.

- `messages = [{"role": "system", "content": system}, {"role": "user", "content": prompt}]:`  
  Constructs the input message structure required by the OpenAI API.

- `response = self.llm.chat.completions.create(...):`  
  Calls the OpenAI API to generate a summary.

- `summary = response.choices[0].message.content:`  
  Extracts the generated summary from the API response.

- `except Exception as e::`  
  Handles errors, logs the issue using traceback, and returns an empty string.
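
The validation and message-building steps above can be sketched in isolation, without calling the API. The helper names `validate_format` and `build_messages` are assumptions for this example and do not appear in the notebook. Validating before entering any `try` block is a deliberate difference: it lets an invalid format surface as an exception instead of an empty summary.

```python
VALID_FORMATS = ("plain_text", "markdown")


def validate_format(format):
    """Raise ValueError unless `format` is one of the supported values."""
    if format not in VALID_FORMATS:
        raise ValueError("Invalid format. Use 'plain_text' or 'markdown'.")
    return format


def build_messages(system_prompt, user_prompt):
    """Return the system/user message list in chat-completions shape."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]


messages = build_messages("You are a summarizer.", "Summarize: ...")
```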

## Method 2: iterative_refinement

- `def iterative_refinement(self, source_text: str, summary: str, turns=3, format="plain_text") -> str::`  
  Refines a summary iteratively based on feedback.

- `Parameters:`  
  - `source_text:` The original text.  
  - `summary:` Initial summary to refine.  
  - `turns:` Number of refinement iterations.  
  - `format:` Output format.

- `session_id = str(uuid4()):`  
  Generates a unique ID for the refinement session.

- `current_summary = summary:`  
  Starts with the initial summary.

- `while current_turn <= turns::`  
  Loops through the specified number of refinement iterations.

- `prompt = ITERATIVE_REFINEMENT_PROMPT.format(...):`  
  Prepares the prompt for each refinement iteration.

- `messages = [{"role": "system", "content": system}, {"role": "user", "content": prompt}]:`  
  Constructs the input message for the API.

- `current_summary = llm_response.choices[0].message.content:`  
  Updates the summary with the refined version.

- `return current_summary:`  
  Returns the final refined summary after completing all iterations.
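
The control flow of the refinement loop can be shown independently of the OpenAI client. In this sketch, `refine_in_turns` and the `refine_once` callable are hypothetical: `refine_once` stands in for one LLM call that takes the current summary and returns an improved one. As in the notebook's method, a failure part-way through returns the last summary that was successfully produced.

```python
def refine_in_turns(summary, refine_once, turns=3):
    """Apply `refine_once` up to `turns` times.

    Returns (latest_summary, completed_turns). If a turn raises, the loop
    stops and the last successfully refined summary is returned.
    """
    current_summary = summary
    completed_turns = 0
    for turn in range(1, turns + 1):
        try:
            current_summary = refine_once(current_summary)
        except Exception as error:
            print(f"Turn {turn} failed: {error}")
            break
        completed_turns = turn
    return current_summary, completed_turns


# Stand-in for an LLM call: strips whitespace on every turn.
result, done = refine_in_turns("  draft summary  ", str.strip, turns=2)
```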

## Method 3: compare_summaries

- `def compare_summaries(self, source_text: str, summary1: str, summary2: str) -> str::`  
  Compares two summaries and evaluates their quality.

- `Parameters:`  
  - `source_text:` Original text.  
  - `summary1:` Initial summary.  
  - `summary2:` Refined summary.

- `prompt = f"Compare the two summaries below ...":`  
  Prepares the prompt for comparing the summaries.

- `response = self.llm.chat.completions.create(...):`  
  Calls the OpenAI API to evaluate the summaries.

- `return feedback:`  
  Returns the feedback as a markdown table.

## Utility Methods

- `def get(self)::`  
  Leftover helper that delegates to `self.q.get()`. Because `__init__` never creates `self.q`, calling it raises `AttributeError`.

- `def empty(self)::`  
  Leftover helper that delegates to `self.q.empty()`; it fails the same way for the same reason.

Together, these pieces cover text summarization, iterative refinement, and quality comparison using OpenAI's chat completions API.


In [None]:
import traceback
from openai import OpenAI
import os
from uuid import uuid4

DEFAULT_OPENAI_MODEL = "gpt-4o-mini"

SIMPLE_SUMMARIZATION_SYSTEM_PROMPT = """
    You are SummarizerGPT, an advanced AI system specialized in text summarization. Your core function is to process and analyze various types of text input, preparing the groundwork for generating high-quality summaries. Your capabilities include:

    1. Text Analysis: Quickly assess the structure, style, and content of any given text.
    2. Context Recognition: Identify the domain, target audience, and purpose of the text.
    3. Language Processing: Understand and process text in multiple languages and dialects.
    4. Semantic Comprehension: Grasp complex ideas, abstract concepts, and subtle nuances in the text.
    5. Information Hierarchy: Recognize the relative importance of different pieces of information within the text.
    6. Cross-referencing: Identify and connect related ideas across different parts of the text.
    7. Bias Detection: Recognize potential biases or slants in the original text.
    8. Data Extraction: Pull out key statistics, dates, names, and other crucial data points.
    9. Tone Analysis: Understand the emotional tone and rhetorical style of the text.
    10. Multi-format Handling: Process various text formats including plain text, HTML, PDF extracts, and more.

    You do not generate the summary directly. Instead, you prepare a comprehensive analysis of the text, which will be used by the summarization module to create the final output. Your analysis should include:

    - Text type and structure
    - Main topic and key themes
    - Target audience and purpose
    - Important data points and statistics
    - Identified biases or controversial points
    - Tone and style characteristics
    - Any unique or standout elements in the text

    Await the input text, and be ready to provide this detailed analysis to support the summarization process.
"""

SIMPLE_SUMMARIZATION_PROMPT = """
    1. Analyze the input:
    - Determine the text type (article, research paper, conversation, etc.)
    - Identify the main topic and key themes
    - Assess the length and complexity of the content

    2. Generate the summary:
    - Provide a concise yet informative summary
    - Maintain the original tone and style where appropriate
    - Ensure factual accuracy and avoid introducing new information
    - Use clear, coherent language suitable for a general audience

    3. Structure the summary:
    - Begin with a brief overview of the main topic
    - Organize key points logically, using paragraphs or bullet points as appropriate
    - Conclude with the most significant takeaway or implication

    4. Adapt to specific requirements:
    - If a word/character limit is specified, adhere to it strictly
    - If the text contains technical terms, provide brief explanations
    - For multi-section documents, summarize each section separately, then provide an overall summary

    5. Handle edge cases:
    - For very short texts, provide a condensed version without losing essential information
    - For extremely long or complex texts, focus on the most crucial points and indicate that it's a high-level summary
    - If the text contains conflicting viewpoints, present them objectively without bias

    6. Enhance readability:
    - Use transition words to improve flow between ideas
    - Employ varied sentence structures to maintain engagement
    - Highlight key terms or concepts using bold text when appropriate

    7. Quality check:
    - Ensure the summary is self-contained and understandable without the original text
    - Verify that no critical information is omitted
    - Check for consistency in tense, voice, and perspective

    8. Metadata (if applicable):
    - Include the original title, author, and date of publication
    - Mention the word count of the original text and the summary

    Now, summarize the following text, adhering to the above guidelines.

    Text: '{text}'
    Respond in the format '{format}' STRICTLY.
    IF THE FORMAT IS 'plain_text', THEN RESPOND IN PLAIN TEXT ONLY, NOT MARKDOWN.
    IF THE FORMAT IS 'markdown'. DIRECTLY GIVE THE MARKDOWN. DON'T WRAP IT IN ```markdown``` tags.
"""

ITERATIVE_REFINEMENT_SYSTEM_PROMPT = """
    You are a Refinement AI specializing in improving text quality. Your task is to refine the given text based on the given instructions.
"""
ITERATIVE_REFINEMENT_PROMPT = """
   You are a Refinement AI specializing in improving text quality. Your task is to refine the given text in a single iteration. Follow these steps:

    1. Analyze the input:
    - Identify the source text and the summary
    - Assess strengths and weaknesses in content, structure, and style of the summary 

    2. Prioritize improvements:
    - Focus on 2-3 key areas that will have the most significant impact that could be made in the summary
    - Consider clarity, coherence, conciseness, and effectiveness

    3. Refine the text:
    - Make targeted improvements in the summary based on your analysis
    - Maintain the original intent and core message in the source text
    - Ensure changes enhance overall quality without introducing new issues

    4. Provide a summary of changes:
    - Briefly explain the key modifications made in the revised summary 
    - Justify your refinement decisions with clear reasoning

    5. Self-evaluate:
    - Rate the improvement on a scale of 1-10
    - Briefly explain your rating

    Source Text: '{source_text}'

    Summary to be refined: '{summary}'
    
    Respond only with the final revised summary after all improvements are made. 

    Respond in the format '{format}' STRICTLY.
    IF THE FORMAT IS 'plain_text', THEN RESPOND IN PLAIN TEXT ONLY, NOT MARKDOWN.
    IF THE FORMAT IS 'markdown'. DIRECTLY GIVE THE MARKDOWN. DON'T WRAP IT IN ```markdown``` tags.
"""


class SelfLearningSummarizer:
    """Generates, iteratively refines, and compares summaries using an LLM."""

    def __init__(self, model="gpt-4o-mini"):
        self.llm = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
        self.model = model

    def get_summary(self, source_text: str, format="plain_text") -> str:
        """Generates a summary of the given text"""
        try:
            if format not in ["plain_text", "markdown"]:
                raise ValueError("Invalid format. Use 'plain_text' or 'markdown'.")
            system = SIMPLE_SUMMARIZATION_SYSTEM_PROMPT
            prompt = SIMPLE_SUMMARIZATION_PROMPT.format(text=source_text, format=format)
            messages = [
                {
                    "role": "system",
                    "content": system,
                },
                {"role": "user", "content": prompt},
            ]
            response = self.llm.chat.completions.create(
                messages=messages, model=self.model
            )
            summary = response.choices[0].message.content
            return summary
        except Exception as e:
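            # The original message referenced an undefined name (`item`),
            # which would raise NameError inside this handler.
            print("Failed to generate summary.")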
            traceback.print_exc()
            return ""

    def iterative_refinement(
        self, source_text: str, summary: str, turns=3, format="plain_text"
    ) -> str:
        """Iteratively refines the summary based on self-generated feedback for given turns."""
        session_id = str(uuid4())
        print(f"Iterative Refinement ({session_id}): Session ID: {session_id}")
        current_summary = summary
        current_turn = 1
        try:
            while current_turn <= turns:
                print(f"Iterative Refinement ({session_id}): Turn {current_turn}.")
                system = ITERATIVE_REFINEMENT_SYSTEM_PROMPT
                prompt = ITERATIVE_REFINEMENT_PROMPT.format(
                    source_text=source_text, summary=current_summary, format=format
                )
                messages = [
                    {
                        "role": "system",
                        "content": system,
                    },
                    {"role": "user", "content": prompt},
                ]
                llm_response = self.llm.chat.completions.create(
                    messages=messages, model=self.model
                )
                current_summary = llm_response.choices[0].message.content
                # Log before incrementing so the message names the turn that just finished.
                print(
                    f"Iterative Refinement ({session_id}): Turn {current_turn} completed. Updated rolling summary."
                )
                current_turn += 1
            return current_summary
        except Exception as e:
            print(
                f"Iterative Refinement ({session_id}): Failed to complete all turns for {source_text} and {summary}."
            )
            # `current_turn` is the turn that failed, so one fewer was completed.
            print(
                f"Iterative Refinement ({session_id}): Turns completed: {current_turn - 1}"
            )
            traceback.print_exc()
            return current_summary

    def compare_summaries(self, source_text: str, summary1: str, summary2: str) -> str:
        """Compares two summaries and provides feedback on their quality."""
        try:
            print(f"Comparing summaries for {source_text}.")
            system = "You are a Comparison AI specializing in evaluating text quality. Your task is to compare two summaries and provide feedback on their quality."
            prompt = f"""
            Compare the two summaries below and provide feedback on their quality. 
            Provide score comparison for both summaries, the old summary score and the new summary score.
            This will help us compare the two summaries on various parameters.
            Refer to the source text when making your evaluation. \n\n 
            Source Text: {source_text}
            Initial Summary: {summary1} 
            Refined Summary: {summary2}
            STRICTLY PROVIDE YOUR RESPONSE AS MARKDOWN TABLE WITH SCORES AND JUSTIFICATIONS.
            """
            messages = [
                {
                    "role": "system",
                    "content": system,
                },
                {"role": "user", "content": prompt},
            ]
            response = self.llm.chat.completions.create(
                messages=messages, model=self.model
            )
            feedback = response.choices[0].message.content
            return feedback
        except Exception as e:
            print(f"Failed to compare summaries for {source_text}")
            traceback.print_exc()
            return ""

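    # NOTE: `self.q` is never created in __init__, so these two leftover
    # queue helpers raise AttributeError if called. The notebook does not use them.
    def get(self):
        return self.q.get()

    def empty(self):
        return self.q.empty()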

## Step 3: Generate Summary Using SelfLearningSummarizer Class

1. **Creating the Summarizer Object:**

   The line `summarizer = SelfLearningSummarizer()` initializes an object of the `SelfLearningSummarizer` class. 

   This object will be used to process the provided text.

2. **Sample Text for Summarization:**

   The `text` variable holds a sample paragraph discussing how to effectively manage projects as an engineer, focusing on task prioritization, leadership, and ensuring project success.

3. **Choosing the Format:**

   The `format` variable is set to `"plain_text"`, meaning the summary will be generated in plain text format, not markdown.

4. **Generating the Summary:**

   The `summarizer.get_summary(text, format=format)` method is called to generate a summary of the provided text. 

   This will call the `get_summary` method within the `SelfLearningSummarizer` class, which uses the OpenAI API to generate a summary based on the provided system and user prompts.

5. **Outputting the Summary:**

   The `print(summary)` statement will output the generated summary to the console.

   When you run the code, it should generate a concise, informative summary of the provided text.


In [None]:
# Let's get the summary and test our prompts which seem to be solid.

summarizer = SelfLearningSummarizer()

# Credits: Arpit Bhayani
# Post Link: https://www.linkedin.com/posts/arpitbhayani_asliengineering-careergrowth-activity-7280566114894430208-tjB2?utm_source=share&utm_medium=member_desktop

text = """
When working on a new project, we engineers almost always start with the most fascinating part. But, while it's exciting for us, it's not always what's best for the project.

The easiest way to become an effective lead/manager is to break down the project into tasks and prioritize the most important items. So, it is always a good idea that before the work begins, step back and ask

1. what is the most critical piece?
2. which items are highest risk and need early attention?
3. which deliverables provide the most immediate value?

We naturally gravitate towards easily doable, less impactful, and tangential parts of the project. This happens because of a lack of a broader context. So, if you are leading a project, make sure,

1. define a clear roadmap and align it with business outcomes
2. define milestones and priorities

A good leader doesn’t micromanage but ensures that the team starts on the right foot. Check-in periodically to ensure the alignment while giving engineers ownership of their tasks.

Prioritization is what separates effective leads from those simply managing tasks. As a lead, you are not just there to oversee execution but to set the direction.
"""
format = "plain_text"

summary = summarizer.get_summary(text, format=format)
print(summary)

## Step 4: Display Summary in Markdown Format

The code in this step generates the summary and displays it as markdown in the notebook output.

Here's a breakdown of the added steps for markdown display:

### Creating a Markdown Object:

`markdown_summary = Markdown(f"## Summary\n{summary}")` creates a markdown object. The summary generated earlier is placed inside the markdown object with a `## Summary` header to format it as a section heading.

### Displaying the Markdown:

`display(markdown_summary)` displays the markdown content in the notebook. This will render the summary with the appropriate markdown formatting, such as the header for the "Summary" section.

Once the code runs, the summary will be shown in the notebook as formatted markdown.


In [None]:
# Let's try a markdown response now
from IPython.display import Markdown, display

summarizer = SelfLearningSummarizer()
text = """
When working on a new project, we engineers almost always start with the most fascinating part. But, while it's exciting for us, it's not always what's best for the project.

The easiest way to become an effective lead/manager is to break down the project into tasks and prioritize the most important items. So, it is always a good idea that before the work begins, step back and ask

1. what is the most critical piece?
2. which items are highest risk and need early attention?
3. which deliverables provide the most immediate value?

We naturally gravitate towards easily doable, less impactful, and tangential parts of the project. This happens because of a lack of a broader context. So, if you are leading a project, make sure,

1. define a clear roadmap and align it with business outcomes
2. define milestones and priorities

A good leader doesn’t micromanage but ensures that the team starts on the right foot. Check-in periodically to ensure the alignment while giving engineers ownership of their tasks.

Prioritization is what separates effective leads from those simply managing tasks. As a lead, you are not just there to oversee execution but to set the direction.
"""
# Request markdown from the model so the output matches how it is displayed.
format = "markdown"

summary = summarizer.get_summary(text, format=format)
markdown_summary = Markdown(f"## Summary\n{summary}")
display(markdown_summary)

## Step 5: Generate and Display Refined Summary

### Summarizer Initialization:

The `SelfLearningSummarizer` class is instantiated, creating an object that will be used for generating and refining summaries. This object contains methods for the summarization process.

### Text Input:

The text input represents the content that will be summarized. In this case, the text discusses project management, leadership, and prioritization in an engineering context.

### Format for Summary:

The format in which the summary will be generated is set to `plain_text`. This means the output will be returned as simple text, with no markdown or advanced formatting applied.

### Generate Initial Summary:

The summarizer analyzes the input text and generates an initial summary based on the content. This summary is aimed at capturing the main ideas and key points from the original text.

### Iterative Refinement:

The initial summary is then refined iteratively. In this case, the process will go through 3 cycles of refinement. Each cycle involves analyzing the current version of the summary and making improvements to enhance clarity, coherence, and conciseness.

### Prepare the Markdown Object:

Once the final refined summary is obtained, it is formatted as a markdown object, which includes a header that specifies the number of refinement turns applied. The refined summary text is also included under this header.

### Display the Markdown Summary:

The markdown summary, which includes the header and the final refined summary, is displayed in the notebook. This allows the summary to be presented in a structured, easily readable format.

### Summary of the Process:

- First, an initial summary is created based on the provided text.

- Then, that summary goes through a series of refinement steps to improve its quality.

- The final refined summary is displayed in a markdown format, making it clear and structured for easy reading. This process ensures the summary captures the essence of the original text while being concise and coherent.


In [None]:
summarizer = SelfLearningSummarizer()
text = """
When working on a new project, we engineers almost always start with the most fascinating part. But, while it's exciting for us, it's not always what's best for the project.

The easiest way to become an effective lead/manager is to break down the project into tasks and prioritize the most important items. So, it is always a good idea that before the work begins, step back and ask

1. what is the most critical piece?
2. which items are highest risk and need early attention?
3. which deliverables provide the most immediate value?

We naturally gravitate towards easily doable, less impactful, and tangential parts of the project. This happens because of a lack of a broader context. So, if you are leading a project, make sure,

1. define a clear roadmap and align it with business outcomes
2. define milestones and priorities

A good leader doesn’t micromanage but ensures that the team starts on the right foot. Check-in periodically to ensure the alignment while giving engineers ownership of their tasks.

Prioritization is what separates effective leads from those simply managing tasks. As a lead, you are not just there to oversee execution but to set the direction.
"""
format = "plain_text"
turns = 3

summary = summarizer.get_summary(text, format=format)

refined_summary = summarizer.iterative_refinement(
    text, summary, turns=turns, format=format
)

markdown_summary = Markdown(f"## Refined Summary (turns={turns})\n{refined_summary}")

display(markdown_summary)

## Step 6: Iterative Summarization and Comparison

### 1. Importing Necessary Libraries

- `IPython.display`: This is a module that allows you to display rich media, including Markdown text, HTML, and other objects in Jupyter notebooks or IPython environments.

- `Markdown`: A class from the `IPython.display` module. Wrapping text in it and passing the result to `display` renders the text as Markdown in the notebook.

- `clear_output`: Clears the output of the current cell in a Jupyter notebook, which is useful when you want to refresh the display or hide unnecessary outputs during execution.

### 2. Input Text

The `text` variable holds the original content to summarize and process.

It explains the concept of framework-defined infrastructure, its benefits, and how it operates in cloud environments.

### 3. Defining Variables for Summary Iterations

- `turns`: Defines how many iterations the summarizer will undergo to refine the summary.

- In this case, it is set to 5, meaning the summarizer will refine the initial summary in 5 rounds.

### 4. Summarizing the Text

- `summarizer = SelfLearningSummarizer()`: Creates a summarizer object from the `SelfLearningSummarizer` class defined in Step 2. It is responsible for processing the input text and generating a summary.

- `get_summary(text, format=format)`: This method processes the input text and returns a summarized version of it. The output format is specified by the variable `format`, which must be defined before this call and must be either `"plain_text"` or `"markdown"`.

### 5. Iterative Refinement of the Summary

- `iterative_refinement`: This method is key to improving the initial summary over multiple iterations. Each iteration aims to refine the summary by making it more concise and accurate, while retaining important concepts from the original text.

- **Parameters**:
  - `text`: The original input text to help with refining the summary.

  - `summary`: The initial summary created in the previous step.

  - `turns=turns`: Specifies how many times the summary will be refined. Here, it will refine the summary 5 times.

  - `format=format`: Ensures the summary is output in the desired format (e.g., `plain_text`).
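The refinement loop described above can be sketched without any API calls. The sketch below assumes a hypothetical `refine_once` callable standing in for one LLM round trip; it is not the notebook's implementation, only an illustration of how `turns` controls the number of passes.

```python
from typing import Callable


def refine_n_times(
    refine_once: Callable[[str, str], str],
    text: str,
    summary: str,
    turns: int = 5,
) -> str:
    """Feed the latest summary back into refine_once `turns` times."""
    current = summary
    for _ in range(turns):
        # Each pass sees the original text plus the most recent summary.
        current = refine_once(text, current)
    return current


def fake_refine(text: str, summary: str) -> str:
    # Hypothetical stand-in for an LLM call: marks each pass with a "+".
    return summary + "+"


print(refine_n_times(fake_refine, "source text", "draft", turns=5))
```

Passing the original text on every turn mirrors the `text` parameter above: each pass can check the summary against the source rather than only against the previous draft.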


### 6. Comparing Original and Refined Summaries

- `compare_summaries`: This method compares the initial summary with the refined summary after multiple iterations. It helps to assess how much the refinement process improved the quality, clarity, and conciseness of the summary.

- **Parameters**:
  - `text`: The original input text.
  
  - `summary`: The initial summary generated by the summarizer.
  
  - `refined_summary`: The version of the summary after being refined over `turns` iterations.

### 7. Clearing the Output and Displaying Comparison

- `clear_output()`: Clears the output area of the notebook to ensure that only relevant results are shown, particularly when running cells multiple times.

- `Markdown(f"## Comparison\n{comparison}")`: Converts the comparison (which is likely a text-based comparison or report) into Markdown format. The `##` in the string indicates a second-level heading in Markdown.

- `display(markdown)`: Displays the Markdown formatted text in the notebook. This will render the comparison of the original and refined summaries for easy comparison.

### Overall Process and Flow

- Input the text you want to summarize.

- Generate an initial summary using the `SelfLearningSummarizer`.

- Refine the summary iteratively to improve clarity, conciseness, and information retention.

- Compare the original and refined summaries to assess the effectiveness of the refinement process.

- Clear the output to refresh the display and then show the comparison in Markdown format.

### Purpose of this Code

The primary goal of this code is to summarize complex text and refine the summary over multiple iterations. The iterative refinement process ensures that the final summary is both accurate and concise. By comparing the initial and refined summaries, users can see how much improvement was made during the refinement stages.


In [None]:
from IPython.display import Markdown, clear_output, display

turns = 5
text = """
What is framework-defined infrastructure?
Framework-defined infrastructure abstracts over cloud primitives such as servers, message queues, and serverless functions, making them mere implementation details of the frameworks' concepts:

Providing portability between different target infrastructure providers

Eliminating the need to manually configure infrastructure to run an application in production

Increasing the time spent writing product code over system management

Allowing the unchanged use of the framework's native local development tools

Standardizing on pre-reviewed secure services

Frameworks use well-established patterns to provide structure and abstraction to applications, making them easier to write and understand. While the word framework is hard to define, the Hollywood principle, "Don't call us, we call you," probably captures best the inversion of control, where the framework manages the high-level application flow while the developer writes code within the hooks provided by it.

Framework-defined infrastructure takes advantage of both this inversion of control and the predictable structure of framework-based applications to automatically map framework concepts onto the appropriate infrastructure without the need for explicit declaration or configuration of the infrastructure.

Note that this post is giving examples based on Vercel's Platform as a Service offering. The concept, however, can be applied more widely as the basic idea of understanding a framework, and generating IaC configuration for it, can also be used for more traditional infrastructure deployments.
"""


summarizer = SelfLearningSummarizer()
summary = summarizer.get_summary(text, format=format)
refined_summary = summarizer.iterative_refinement(
    text, summary, turns=turns, format=format
)
comparison = summarizer.compare_summaries(text, summary, refined_summary)
clear_output()
markdown = Markdown(f"## Comparison\n{comparison}")
display(markdown)

## Step 7: Environment Initialization and Dependency Setup

This script serves as a boilerplate for setting up a Python environment in Jupyter notebooks, ensuring that all dependencies are installed and required environment variables are configured. Below is a breakdown of its key functionalities:

---

### 1. **Purpose**

The script is designed to:
  
#### - Install necessary Python libraries specified in a `requirements.txt` file.

#### - Verify the presence of essential environment variables (e.g., `OPENAI_API_KEY`).

#### - Provide retry mechanisms for handling transient installation errors.


---

### 2. **Installing Requirements**
#### `install_requirements()` Function:

- **Purpose:**

  Installs packages listed in the `requirements.txt` file using `pip install -r requirements.txt`.

- **Features:**

  - Uses a `requirements_installed` flag to avoid redundant installations.

  - Implements a retry mechanism (`max_retries` and `retries`) to address transient issues during installation.

  - Terminates with a clear error message if all retries fail.
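The retry behaviour listed above can be illustrated in isolation. This is a minimal sketch, assuming a hypothetical `install` callable that returns an exit status (0 for success); the name `run_with_retries` is illustrative and does not appear in the notebook.

```python
from typing import Callable


def run_with_retries(install: Callable[[], int], max_retries: int = 3) -> bool:
    """Run install() once, then retry up to max_retries times on failure."""
    for _ in range(max_retries + 1):
        if install() == 0:
            return True
    return False


attempts = []


def flaky_install() -> int:
    # Hypothetical installer: fails twice, then succeeds.
    attempts.append(1)
    return 0 if len(attempts) >= 3 else 1


print(run_with_retries(flaky_install, max_retries=3), len(attempts))
```

In the notebook's setting, `install` could be a small function wrapping `os.system("pip install -r requirements.txt")`, which returns 0 on success. A loop keeps the retry counter local, so no `global` state is needed.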

---

### 3. **Environment Setup**

#### **Loading Environment Variables:**

- Uses the `dotenv` package to securely load variables from a `.env` file into the environment.

- This is crucial for managing sensitive credentials like API keys.


#### `setup_env()` Function:

- **Purpose:**

  Ensures all required environment variables are set before proceeding.


- **Process:**

  - Checks against the `REQUIRED_ENV_VARS` list.

  - If an environment variable is missing:

    - Displays a clear error message.

    - Terminates the execution to prevent running the notebook in an incomplete state.
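The check described above can be sketched as a small pure function. The helper name `missing_env_vars` is an assumption for illustration; it reports every missing variable at once instead of stopping at the first one.

```python
import os
from typing import Iterable, List, Mapping


def missing_env_vars(
    required: Iterable[str], environ: Mapping[str, str] = os.environ
) -> List[str]:
    """Return the required variable names that are absent from environ."""
    return [name for name in required if environ.get(name) is None]


# Demonstration with an explicit mapping instead of the real environment.
print(missing_env_vars(["OPENAI_API_KEY"], {"OPENAI_API_KEY": "placeholder"}))
print(missing_env_vars(["OPENAI_API_KEY"], {}))
```

Accepting the environment as a parameter makes the check easy to test without modifying real environment variables.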


---

### 4. **Clear Output and Final Message**

- **`clear_output()`:**

  Cleans up notebook cell output to improve readability after setup.

- **Success Message:**

  Indicates that the setup is complete and users can proceed confidently.

---

### 5. **Resilience and User Guidance**

- **Error Handling:**

  Incorporates retry logic for dependency installation, reducing the need for manual intervention.
  
- **Informative Feedback:**

  Provides clear and actionable error messages to guide users in resolving issues, such as setting missing environment variables.
  

---

### Summary
This script automates the setup of a Python environment in Jupyter notebooks. By managing dependencies, verifying configurations, and offering robust error handling, it ensures a seamless and reliable user experience.



In [None]:
# Boilerplate: This block goes into every notebook.
# It sets up the environment, installs the requirements, and checks for the required environment variables.

import os
from IPython.display import clear_output

requirements_installed = False
max_retries = 3
retries = 0
REQUIRED_ENV_VARS = ["OPENAI_API_KEY"]


def install_requirements():
    """Installs the requirements from requirements.txt file"""
    # `retries` is reassigned below, so it must be declared global as well.
    global requirements_installed, retries
    if requirements_installed:
        print("Requirements already installed.")
        return

    print("Installing requirements...")
    install_status = os.system("pip install -r requirements.txt")
    if install_status == 0:
        print("Requirements installed successfully.")
        requirements_installed = True
    else:
        print("Failed to install requirements.")
        if retries < max_retries:
            print("Retrying...")
            retries += 1
            return install_requirements()
        exit(1)
    return


from dotenv import load_dotenv


def setup_env():
    """Sets up the environment variables"""

    def check_env(env_var):
        value = os.getenv(env_var)
        if value is None:
            print(f"Please set the {env_var} environment variable.")
            exit(1)
        else:
            print(f"{env_var} is set.")

    load_dotenv()

    variables_to_check = REQUIRED_ENV_VARS

    for var in variables_to_check:
        check_env(var)


install_requirements()
setup_env()
clear_output()
print("🚀 Setup complete. Continue to the next cell.")

## Step 8: Understanding the SelfLearningSummarizer Class

### 1. Constants and Prompts

- `DEFAULT_OPENAI_MODEL`: This is the default model used to interact with the OpenAI API, set to "gpt-4o-mini".

**Prompt Strings**:

- `SIMPLE_SUMMARIZATION_SYSTEM_PROMPT`: Defines the system's role and capabilities for summarization. This prompt outlines the summarizer's features, including its ability to process and analyze text in various formats, understand language, and identify critical information.

- `SIMPLE_SUMMARIZATION_PROMPT`: A detailed prompt that guides the summarization process, including:

  - Text analysis (identifying key themes, structure, and audience).

  - Generation of a concise summary.

  - Formatting the summary (ensuring readability, structuring, etc.).

  - Handling edge cases like very short or long texts.

- `ITERATIVE_REFINEMENT_SYSTEM_PROMPT`: A system prompt that instructs the AI to refine the summary based on feedback and improve its quality.

- `ITERATIVE_REFINEMENT_PROMPT`: Provides detailed instructions to guide the refinement process, focusing on analyzing the text, prioritizing areas of improvement, making targeted changes, and providing a self-evaluation.

### 2. SelfLearningSummarizer Class

This class manages the summarization, iterative refinement, and comparison of text summaries. It uses the OpenAI API for text generation, refinement, and evaluation.

#### 2.1 `__init__` Method

- **Purpose**: This initializes an instance of the SelfLearningSummarizer class, setting up the OpenAI API connection using an API key (loaded from environment variables). The default model is set to `gpt-4o-mini`.

#### 2.2 `get_summary` Method

- **Purpose**: Generates a summary of the provided `source_text` based on a defined prompt.

- **Steps**:

  - Validates the `format` argument to ensure it’s either 'plain_text' or 'markdown'.

  - Constructs the system and user messages to be sent to the OpenAI API, using the `SIMPLE_SUMMARIZATION_SYSTEM_PROMPT` and `SIMPLE_SUMMARIZATION_PROMPT`.

  - Sends a request to the OpenAI model and retrieves the generated summary.

  - Returns the generated summary.
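The validation and message-construction steps above can be sketched without calling the API. The helper name `build_messages` and the parameter `output_format` are assumptions for illustration; the notebook performs these steps inline in `get_summary`.

```python
from typing import Dict, List

ALLOWED_FORMATS = ("plain_text", "markdown")


def build_messages(
    system_prompt: str, user_template: str, text: str, output_format: str
) -> List[Dict[str, str]]:
    """Validate the format, fill the template, and return chat messages."""
    if output_format not in ALLOWED_FORMATS:
        raise ValueError("Invalid format. Use 'plain_text' or 'markdown'.")
    # The template uses the same {text} and {format} placeholders as the prompts.
    user_prompt = user_template.format(text=text, format=output_format)
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]


messages = build_messages(
    "You are a summarizer.", "Text: '{text}' as {format}", "hello", "markdown"
)
print(messages[1]["content"])
```

The resulting list has the shape expected by the `messages` argument of `chat.completions.create`.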

#### 2.3 `iterative_refinement` Method

- **Purpose**: This method refines the generated summary iteratively over a specified number of turns (`turns`).

- **Steps**:

  - Initializes a session using a unique ID (via `uuid4`).

  - For each turn (up to `turns`):

    - Constructs the system and user messages for iterative refinement using the `ITERATIVE_REFINEMENT_SYSTEM_PROMPT` and `ITERATIVE_REFINEMENT_PROMPT`.

    - Sends the request to the OpenAI API to refine the summary.

    - Updates the summary based on the AI's response.

  - Returns the final refined summary after completing all turns.

#### 2.4 `compare_summaries` Method

- **Purpose**: Compares two summaries (the initial one and the refined one) to evaluate their quality and provide feedback.
- **Steps**:

  - Constructs a system and user message for comparing summaries, using a prompt defined inline in the method.

  - Sends the request to the OpenAI API, which provides a markdown table comparing the two summaries based on quality, structure, and other parameters.

  - Returns the feedback as a markdown table.


#### 2.5 `get` and `empty` Methods

- Both methods delegate to a queue attribute (`self.q`), but `__init__` never creates that attribute, so calling either method on the class as written raises `AttributeError`. They appear to be leftovers from a queue-based design: `get` would fetch the next item from the queue and `empty` would report whether the queue has no items left.
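For `get` and `empty` to work, the queue would have to be created in the constructor. The sketch below shows one way that might look, using the standard library's `queue.Queue`; the class name `QueueBackedResults` and the `put` method are hypothetical and not part of the notebook.

```python
import queue


class QueueBackedResults:
    """Hypothetical holder showing the attribute that get() and empty() expect."""

    def __init__(self) -> None:
        # Without this line, get() and empty() would raise AttributeError.
        self.q: "queue.Queue[str]" = queue.Queue()

    def put(self, item: str) -> None:
        self.q.put(item)

    def get(self) -> str:
        return self.q.get()

    def empty(self) -> bool:
        return self.q.empty()


results = QueueBackedResults()
results.put("first summary")
print(results.get(), results.empty())
```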

### 3. Error Handling

The code uses `try-except` blocks to catch and handle exceptions:

- In `get_summary`: Any exception raised inside the `try` block, including an API failure or the `ValueError` for an unsupported `format`, is caught; a traceback is printed and an empty string is returned.

- In `iterative_refinement`: If an error occurs during any of the iterative refinement turns, the process is stopped, and the current summary is returned. A traceback is also printed for debugging.
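The fail-soft behaviour of `iterative_refinement` can be sketched as follows. This assumes a hypothetical `refine_once` callable in place of the API call; the point is only that a failure mid-loop returns the last summary that was successfully produced.

```python
import traceback
from typing import Callable


def refine_with_fallback(
    refine_once: Callable[[str], str], summary: str, turns: int = 3
) -> str:
    """Refine up to `turns` times; on error, return the last good summary."""
    current = summary
    completed = 0
    try:
        for _ in range(turns):
            current = refine_once(current)
            completed += 1
    except Exception:
        print(f"Stopped after {completed} completed turn(s).")
        traceback.print_exc()
    return current


state = {"calls": 0}


def flaky_refine(summary: str) -> str:
    # Hypothetical refiner that fails on its third call.
    state["calls"] += 1
    if state["calls"] == 3:
        raise RuntimeError("simulated API failure")
    return summary + "!"


print(refine_with_fallback(flaky_refine, "draft", turns=5))
```

Because `current` is only reassigned after a successful call, the value returned on failure is never a partial or empty result.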

### 4. Output Format

- The summary can be returned in either `plain_text` or `markdown` format, depending on the specified `format` argument. This flexibility allows the summarizer to be used in various contexts (e.g., plain text output for applications, markdown output for rendering in web platforms).
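The prompts ask the model not to wrap markdown output in a code fence, but a model may not always comply. As a defensive measure, a caller could strip one wrapping fence before rendering. The helper below is a hypothetical addition, not something the notebook does.

```python
FENCE = "`" * 3  # built indirectly so this example contains no literal fence


def strip_code_fence(response: str) -> str:
    """Remove one wrapping code fence if present; otherwise return the text unchanged."""
    lines = response.strip().splitlines()
    if len(lines) >= 2 and lines[0].startswith(FENCE) and lines[-1].strip() == FENCE:
        return "\n".join(lines[1:-1])
    return response


wrapped = FENCE + "markdown\n## Summary\n- point\n" + FENCE
print(strip_code_fence(wrapped))
```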

### 5. Using the Summarizer

To use the `SelfLearningSummarizer`, instantiate the class, then call `get_summary`, `iterative_refinement`, and `compare_summaries` in that order, passing the output of each step to the next.

This sequence summarizes the text, refines the summary, and compares the initial and refined versions to show how much the refinement improved the final output.
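A minimal sketch of that call sequence is shown here. The function `run_pipeline` and the offline stand-in `EchoSummarizer` are assumptions made so the example runs without an API key; with the real class, `SelfLearningSummarizer()` would be passed in place of the stand-in.

```python
def run_pipeline(summarizer, text: str, turns: int = 3, output_format: str = "plain_text") -> dict:
    """Summarize, refine, and compare, returning all three results."""
    summary = summarizer.get_summary(text, format=output_format)
    refined = summarizer.iterative_refinement(
        text, summary, turns=turns, format=output_format
    )
    comparison = summarizer.compare_summaries(text, summary, refined)
    return {"summary": summary, "refined_summary": refined, "comparison": comparison}


class EchoSummarizer:
    """Hypothetical offline stand-in with the same method names as the class."""

    def get_summary(self, source_text, format="plain_text"):
        return source_text[:20]

    def iterative_refinement(self, source_text, summary, turns=3, format="plain_text"):
        return summary.strip()

    def compare_summaries(self, source_text, summary1, summary2):
        return f"{len(summary1)} -> {len(summary2)} characters"


result = run_pipeline(EchoSummarizer(), "  hello world  ")
print(result["comparison"])
```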

### Summary of Purpose

This class automates the process of summarizing a given text, refining that summary iteratively, and comparing the original summary to a refined version to evaluate its quality. It leverages the OpenAI API for natural language processing tasks and ensures flexibility with different output formats, enabling use in various applications.


In [15]:
import traceback
from openai import OpenAI
import os
from uuid import uuid4

DEFAULT_OPENAI_MODEL = "gpt-4o-mini"

SIMPLE_SUMMARIZATION_SYSTEM_PROMPT = """
    You are SummarizerGPT, an advanced AI system specialized in text summarization. Your core function is to process and analyze various types of text input, preparing the groundwork for generating high-quality summaries. Your capabilities include:

    1. Text Analysis: Quickly assess the structure, style, and content of any given text.
    2. Context Recognition: Identify the domain, target audience, and purpose of the text.
    3. Language Processing: Understand and process text in multiple languages and dialects.
    4. Semantic Comprehension: Grasp complex ideas, abstract concepts, and subtle nuances in the text.
    5. Information Hierarchy: Recognize the relative importance of different pieces of information within the text.
    6. Cross-referencing: Identify and connect related ideas across different parts of the text.
    7. Bias Detection: Recognize potential biases or slants in the original text.
    8. Data Extraction: Pull out key statistics, dates, names, and other crucial data points.
    9. Tone Analysis: Understand the emotional tone and rhetorical style of the text.
    10. Multi-format Handling: Process various text formats including plain text, HTML, PDF extracts, and more.

    You do not generate the summary directly. Instead, you prepare a comprehensive analysis of the text, which will be used by the summarization module to create the final output. Your analysis should include:

    - Text type and structure
    - Main topic and key themes
    - Target audience and purpose
    - Important data points and statistics
    - Identified biases or controversial points
    - Tone and style characteristics
    - Any unique or standout elements in the text

    Await the input text, and be ready to provide this detailed analysis to support the summarization process.
"""

SIMPLE_SUMMARIZATION_PROMPT = """
    1. Analyze the input:
    - Determine the text type (article, research paper, conversation, etc.)
    - Identify the main topic and key themes
    - Assess the length and complexity of the content

    2. Generate the summary:
    - Provide a concise yet informative summary
    - Maintain the original tone and style where appropriate
    - Ensure factual accuracy and avoid introducing new information
    - Use clear, coherent language suitable for a general audience

    3. Structure the summary:
    - Begin with a brief overview of the main topic
    - Organize key points logically, using paragraphs or bullet points as appropriate
    - Conclude with the most significant takeaway or implication

    4. Adapt to specific requirements:
    - If a word/character limit is specified, adhere to it strictly
    - If the text contains technical terms, provide brief explanations
    - For multi-section documents, summarize each section separately, then provide an overall summary

    5. Handle edge cases:
    - For very short texts, provide a condensed version without losing essential information
    - For extremely long or complex texts, focus on the most crucial points and indicate that it's a high-level summary
    - If the text contains conflicting viewpoints, present them objectively without bias

    6. Enhance readability:
    - Use transition words to improve flow between ideas
    - Employ varied sentence structures to maintain engagement
    - Highlight key terms or concepts using bold text when appropriate

    7. Quality check:
    - Ensure the summary is self-contained and understandable without the original text
    - Verify that no critical information is omitted
    - Check for consistency in tense, voice, and perspective

    8. Metadata (if applicable):
    - Include the original title, author, and date of publication
    - Mention the word count of the original text and the summary

    Now, summarize the following text, adhering to the above guidelines.

    Text: '{text}'
    Respond in the format '{format}' STRICTLY.
    IF THE FORMAT IS 'plain_text', THEN RESPOND IN PLAIN TEXT ONLY, NOT MARKDOWN.
    IF THE FORMAT IS 'markdown'. DIRECTLY GIVE THE MARKDOWN. DON'T WRAP IT IN ```markdown``` tags.
"""

ITERATIVE_REFINEMENT_SYSTEM_PROMPT = """
    You are a Refinement AI specializing in improving text quality. Your task is to refine the given text based on the given instructions.
"""
ITERATIVE_REFINEMENT_PROMPT = """
   You are a Refinement AI specializing in improving text quality. Your task is to refine the given text in a single iteration. Follow these steps:

    1. Analyze the input:
    - Identify the source text and the summary
    - Assess strengths and weaknesses in content, structure, and style of the summary 

    2. Prioritize improvements:
    - Focus on 2-3 key areas that will have the most significant impact that could be made in the summary
    - Consider clarity, coherence, conciseness, and effectiveness

    3. Refine the text:
    - Make targeted improvements in the summary based on your analysis
    - Maintain the original intent and core message in the source text
    - Ensure changes enhance overall quality without introducing new issues

    4. Provide a summary of changes:
    - Briefly explain the key modifications made in the revised summary 
    - Justify your refinement decisions with clear reasoning

    5. Self-evaluate:
    - Rate the improvement on a scale of 1-10
    - Briefly explain your rating

    Source Text: '{source_text}'

    Summary to be refined: '{summary}'
    
    Respond only with the final revised summary after all improvements are made. 

    Respond in the format '{format}' STRICTLY.
    IF THE FORMAT IS 'plain_text', THEN RESPOND IN PLAIN TEXT ONLY, NOT MARKDOWN.
    IF THE FORMAT IS 'markdown'. DIRECTLY GIVE THE MARKDOWN. DON'T WRAP IT IN ```markdown``` tags.
"""


class SelfLearningSummarizer:
    """Summarizer that generates, iteratively refines, and compares summaries."""

    def __init__(self, model=DEFAULT_OPENAI_MODEL):
        self.llm = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
        self.model = model

    def get_summary(self, source_text: str, format="plain_text") -> str:
        """Generates a summary of the given text"""
        try:
            if format not in ["plain_text", "markdown"]:
                raise ValueError("Invalid format. Use 'plain_text' or 'markdown'.")
            system = SIMPLE_SUMMARIZATION_SYSTEM_PROMPT
            prompt = SIMPLE_SUMMARIZATION_PROMPT.format(text=source_text, format=format)
            messages = [
                {
                    "role": "system",
                    "content": system,
                },
                {"role": "user", "content": prompt},
            ]
            response = self.llm.chat.completions.create(
                messages=messages, model=self.model
            )
            summary = response.choices[0].message.content
            return summary
        except Exception as e:
            # The original message referenced an undefined name (`item`).
            print("Failed to generate summary.")
            traceback.print_exc()
            return ""

    def iterative_refinement(
        self, source_text: str, summary: str, turns=3, format="plain_text"
    ) -> str:
        """Iteratively refines the summary based on self-generated feedback for given turns."""
        session_id = str(uuid4())
        print(f"Iterative Refinement ({session_id}): Session started.")
        current_summary = summary
        current_turn = 1
        try:
            while current_turn <= turns:
                print(f"Iterative Refinement ({session_id}): Turn {current_turn}.")
                system = ITERATIVE_REFINEMENT_SYSTEM_PROMPT
                prompt = ITERATIVE_REFINEMENT_PROMPT.format(
                    source_text=source_text, summary=current_summary, format=format
                )
                messages = [
                    {
                        "role": "system",
                        "content": system,
                    },
                    {"role": "user", "content": prompt},
                ]
                llm_response = self.llm.chat.completions.create(
                    messages=messages, model=self.model
                )
                current_summary = llm_response.choices[0].message.content
                # Log before incrementing so the message names the turn just finished.
                print(
                    f"Iterative Refinement ({session_id}): Turn {current_turn} completed. Updated rolling summary."
                )
                current_turn += 1
            return current_summary
        except Exception as e:
            print(
                f"Iterative Refinement ({session_id}): Failed to complete all turns for {source_text} and {summary}."
            )
            # The turn in progress failed, so only the earlier turns completed.
            print(
                f"Iterative Refinement ({session_id}): Turns completed: {current_turn - 1}"
            )
            traceback.print_exc()
            return current_summary

    def compare_summaries(self, source_text: str, summary1: str, summary2: str) -> str:
        """Compares two summaries and provides feedback on their quality."""
        try:
            print(f"Comparing summaries for {source_text}.")
            system = "You are a Comparison AI specializing in evaluating text quality. Your task is to compare two summaries and provide feedback on their quality."
            prompt = f"""
            Compare the two summaries below and provide feedback on their quality. 
            Provide score comparison for both summaries, the old summary score and the new summary score.
            This will help us compare the two summaries on various parameters.
            Refer to the source text when making your evaluation. \n\n 
            Source Text: {source_text}
            Initial Summary: {summary1} 
            Refined Summary: {summary2}
            STRICTLY PROVIDE YOUR RESPONSE AS MARKDOWN TABLE WITH SCORES AND JUSTIFICATIONS.
            """
            messages = [
                {
                    "role": "system",
                    "content": system,
                },
                {"role": "user", "content": prompt},
            ]
            response = self.llm.chat.completions.create(
                messages=messages, model=self.model
            )
            feedback = response.choices[0].message.content
            return feedback
        except Exception as e:
            print(f"Failed to compare summaries for {source_text}")
            traceback.print_exc()
            return ""

    def get(self):
        return self.q.get()

    def empty(self):
        return self.q.empty()

## Step 9: AI-Driven Text Summarization and Refinement System

### Explanation of the Text Summarization System

The code is a text summarization system that uses OpenAI's GPT model to generate summaries, refine them iteratively, and compare the results. Its core elements are:

### 1. **SelfLearningSummarizer Class**

- This class is the core of the summarization system. It interacts with OpenAI's GPT model to generate summaries, refine them, and compare the quality of different summaries.

### 2. **Prompts for Summarization and Refinement**

- The class uses predefined system prompts and user instructions to guide the summarization and refinement process. These prompts define how the AI should approach summarizing and improving text.

### 3. **Method Breakdown**

The methods in the SelfLearningSummarizer class are responsible for different stages of the summarization process.

### **Explanation of the Core Functions**:

#### 3.1 **Initialization (`__init__` method)**

- The `SelfLearningSummarizer` class is initialized with a default model (`gpt-4o-mini`), but this can be changed by passing a different model during instantiation.
- `self.llm` is set to the OpenAI API instance with the API key, allowing the summarizer to interact with OpenAI's models for generating summaries and refinements.

#### 3.2 **`get_summary` Method**:

- **Purpose**: This method is responsible for generating an initial summary of a given text.

- **System Prompt**: This prompt instructs the model on how to approach the summarization. It defines the task in detail, such as understanding the text, extracting key themes, and generating a summary.

- **User Prompt**: This prompt contains instructions for the model, specifying how to analyze and summarize the provided text. It also includes the format in which the summary should be returned (either plain_text or markdown).

- The method sends these prompts to OpenAI's API using the `chat.completions.create` method and returns the generated summary.

#### 3.3 **`iterative_refinement` Method**:

- **Purpose**: This method refines the generated summary iteratively.

- **Loop**: It loops through a set number of "turns" (iterations), where the summary is refined in each iteration based on feedback from the AI.

- **System Prompt for Refinement**: This defines the role of the AI in each iteration, where it reviews the summary, looks for weaknesses, and refines it.

- **Improvement Steps**: Each iteration involves an analysis of the current summary and making changes to improve clarity, coherence, and conciseness.

- This process continues for the specified number of turns (turns=3 by default). The final refined summary is returned.

#### 3.4 **`compare_summaries` Method**:

- **Purpose**: This method compares two summaries (the original and the refined version) and provides feedback on their quality.

- **System Prompt for Comparison**: This guides the AI in comparing the two summaries and scoring them against the source text. The prompt does not name specific scoring parameters, so the model chooses criteria such as coherence, clarity, and factual accuracy.

- **Output**: The AI returns the comparison as a markdown table, including scores and justifications for both summaries.

#### 3.5 **Using the Summarizer**:

- You create an instance of the `SelfLearningSummarizer` class (`summarizer = SelfLearningSummarizer()`).

- The `get_summary` method is used to generate a summary of the input text.

- The `iterative_refinement` method is used to refine the summary iteratively.

- Finally, the `compare_summaries` method can be used to compare the original and refined summaries.

### **Example Use Case**:

The final part of the code generates and prints a plain-text summary:

```python
summarizer = SelfLearningSummarizer()

text = """ ... """  # Example text to be summarized
format = "plain_text"

summary = summarizer.get_summary(text, format=format)
print(summary)
```


In [None]:
# Let's get the summary and test our prompts which seem to be solid.

summarizer = SelfLearningSummarizer()

# Credits: Arpit Bhayani
# Post Link: https://www.linkedin.com/posts/arpitbhayani_asliengineering-careergrowth-activity-7280566114894430208-tjB2?utm_source=share&utm_medium=member_desktop

text = """
When working on a new project, we engineers almost always start with the most fascinating part. But, while it's exciting for us, it's not always what's best for the project.

The easiest way to become an effective lead/manager is to break down the project into tasks and prioritize the most important items. So, it is always a good idea that before the work begins, step back and ask

1. what is the most critical piece?
2. which items are highest risk and need early attention?
3. which deliverables provide the most immediate value?

We naturally gravitate towards easily doable, less impactful, and tangential parts of the project. This happens because of a lack of a broader context. So, if you are leading a project, make sure,

1. define a clear roadmap and align it with business outcomes
2. define milestones and priorities

A good leader doesn’t micromanage but ensures that the team starts on the right foot. Check-in periodically to ensure the alignment while giving engineers ownership of their tasks.

Prioritization is what separates effective leads from those simply managing tasks. As a lead, you are not just there to oversee execution but to set the direction.
"""
format = "plain_text"

summary = summarizer.get_summary(text, format=format)
print(summary)

## Step 10: Text Summarization Process

The code defines a process to summarize a given text using the `SelfLearningSummarizer` class. Here's a brief overview:

### 1. **Text Input**

- A block of text is provided, which discusses project management and leadership.

### 2. **Summarization**

- The `SelfLearningSummarizer` instance generates a summary of the text by interacting with the OpenAI model.
- The summary is returned in plain text format.

### 3. **Markdown Output**

- The summary is placed under a markdown header (`## Summary`) and rendered in the Jupyter notebook with `IPython.display`.

### **Purpose of the Code**

- The code automates text summarization and outputs the result in a formatted way suitable for displaying in interactive Python environments like Jupyter notebooks.


In [None]:
# Render the plain-text summary under a Markdown heading
from IPython.display import Markdown, display

summarizer = SelfLearningSummarizer()
text = """
When working on a new project, we engineers almost always start with the most fascinating part. But, while it's exciting for us, it's not always what's best for the project.

The easiest way to become an effective lead/manager is to break down the project into tasks and prioritize the most important items. So, it is always a good idea that before the work begins, step back and ask

1. what is the most critical piece?
2. which items are highest risk and need early attention?
3. which deliverables provide the most immediate value?

We naturally gravitate towards easily doable, less impactful, and tangential parts of the project. This happens because of a lack of a broader context. So, if you are leading a project, make sure,

1. define a clear roadmap and align it with business outcomes
2. define milestones and priorities

A good leader doesn’t micromanage but ensures that the team starts on the right foot. Check-in periodically to ensure the alignment while giving engineers ownership of their tasks.

Prioritization is what separates effective leads from those simply managing tasks. As a lead, you are not just there to oversee execution but to set the direction.
"""
format = "plain_text"

summary = summarizer.get_summary(text, format=format)
markdown_summary = Markdown(f"## Summary\n{summary}")
display(markdown_summary)

## Step 11: Text Summarization and Iterative Refinement Process

The code performs text summarization and iterative refinement:

### 1. **Text Input**

- A block of text related to project management is provided.

### 2. **Initial Summary**

- The `SelfLearningSummarizer` class generates an initial summary of the input text.

### 3. **Iterative Refinement**

- The summary is then refined over a specified number of turns (`turns=3`).
- The `iterative_refinement` method receives the source text, the current summary, the number of turns, and the output format, and returns the refined summary.

### 4. **Markdown Output**

- The final refined summary is displayed in a markdown format with a header that includes the number of refinement turns.

### **Purpose of the Process**

- Refining the summary over several turns aims to improve its clarity, coherence, and accuracy.
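
The refinement loop described above can be sketched in a few lines. This is a minimal illustration, not the implementation of `SelfLearningSummarizer.iterative_refinement`: the names `refine_iteratively`, `rewrite`, and `toy_rewrite` are hypothetical, and `toy_rewrite` is a deterministic stand-in for the LLM call so the example runs without an API key.

```python
from typing import Callable, List


def refine_iteratively(
    text: str,
    summary: str,
    rewrite: Callable[[str, str], str],
    turns: int = 3,
) -> List[str]:
    """Return every version of the summary, starting with the initial one."""
    history = [summary]
    for _ in range(turns):
        # Each turn sees the source text and the most recent summary.
        history.append(rewrite(text, history[-1]))
    return history


def toy_rewrite(text: str, summary: str) -> str:
    """Hypothetical stand-in for an LLM call: drops the last word of the summary."""
    words = summary.split()
    return " ".join(words[:-1]) if len(words) > 1 else summary


initial = "Leads should really always prioritize critical work"
versions = refine_iteratively(
    "Prioritize the critical work first.", initial, toy_rewrite, turns=3
)
print(len(versions))   # initial summary plus one entry per turn
print(versions[-1])    # the final refined summary
```

Keeping the full history, rather than only the last version, makes it possible to inspect how the summary changed from turn to turn.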


In [None]:
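# Import the display helpers so this cell also runs on its own
from IPython.display import Markdown, display

summarizer = SelfLearningSummarizer()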
text = """
When working on a new project, we engineers almost always start with the most fascinating part. But, while it's exciting for us, it's not always what's best for the project.

The easiest way to become an effective lead/manager is to break down the project into tasks and prioritize the most important items. So, it is always a good idea that before the work begins, step back and ask

1. what is the most critical piece?
2. which items are highest risk and need early attention?
3. which deliverables provide the most immediate value?

We naturally gravitate towards easily doable, less impactful, and tangential parts of the project. This happens because of a lack of a broader context. So, if you are leading a project, make sure,

1. define a clear roadmap and align it with business outcomes
2. define milestones and priorities

A good leader doesn’t micromanage but ensures that the team starts on the right foot. Check-in periodically to ensure the alignment while giving engineers ownership of their tasks.

Prioritization is what separates effective leads from those simply managing tasks. As a lead, you are not just there to oversee execution but to set the direction.
"""
format = "plain_text"
turns = 3

summary = summarizer.get_summary(text, format=format)

refined_summary = summarizer.iterative_refinement(
    text, summary, turns=turns, format=format
)

markdown_summary = Markdown(f"## Refined Summary (turns={turns})\n{refined_summary}")

display(markdown_summary)

## Step 12: Text Summarization, Refinement, and Comparison Process

The code performs the following steps:

### 1. **Text Input**

- A block of text explaining "framework-defined infrastructure" is provided.

### 2. **Initial Summary**

- The `SelfLearningSummarizer` class generates an initial summary of the input text.

### 3. **Iterative Refinement**

- The summary undergoes refinement over a specified number of turns (`turns=5`).
- Each turn revises the previous version of the summary, aiming for better clarity, conciseness, and readability.

### 4. **Comparison**

- The `compare_summaries` method receives the source text, the initial summary, and the refined summary.
- It returns a detailed comparison, including feedback on quality and improvements.

### 5. **Markdown Output**

- Any earlier cell output is cleared with `clear_output()`, and the comparison is then rendered as Markdown for easy review.

### **Purpose of the Process**

- This helps track the progress of the summary improvement process and compare the quality of the initial and refined summaries.
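
Alongside the model-generated comparison, simple length statistics give a quick, objective check on how much each summary compresses the source. The sketch below is an assumption-laden illustration: `summary_stats` and `compare_lengths` are hypothetical helpers, not part of `SelfLearningSummarizer`, and word count is only a rough proxy for conciseness.

```python
def summary_stats(source: str, summary: str) -> dict:
    """Word count of the summary and its share of the source length."""
    source_words = len(source.split())
    summary_words = len(summary.split())
    share = round(summary_words / source_words, 2) if source_words else 0.0
    return {"words": summary_words, "compression": share}


def compare_lengths(source: str, initial: str, refined: str) -> str:
    """Build a small Markdown table comparing the two summaries."""
    lines = ["| Summary | Words | Share of source |", "| --- | --- | --- |"]
    for label, candidate in (("initial", initial), ("refined", refined)):
        stats = summary_stats(source, candidate)
        lines.append(f"| {label} | {stats['words']} | {stats['compression']:.0%} |")
    return "\n".join(lines)


source = "one two three four five six seven eight nine ten"
table = compare_lengths(source, "one two three four five", "one two")
print(table)
```

In a notebook, the returned string can be passed to `Markdown(...)` and `display(...)` in the same way as the other outputs in this section.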


In [None]:
from IPython.display import Markdown, clear_output, display

turns = 5
format = "plain_text"  # same output format as in the previous cells
text = """
What is framework-defined infrastructure?
Framework-defined infrastructure abstracts over cloud primitives such as servers, message queues, and serverless functions, making them mere implementation details of the frameworks' concepts:

Providing portability between different target infrastructure providers

Eliminating the need to manually configure infrastructure to run an application in production

Increasing the time spent writing product code over system management

Allowing the unchanged use of the framework's native local development tools

Standardizing on pre-reviewed secure services

Frameworks use well-established patterns to provide structure and abstraction to applications, making them easier to write and understand. While the word framework is hard to define, the Hollywood principle, "Don't call us, we call you," probably captures best the inversion of control, where the framework manages the high-level application flow while the developer writes code within the hooks provided by it.

Framework-defined infrastructure takes advantage of both this inversion of control and the predictable structure of framework-based applications to automatically map framework concepts onto the appropriate infrastructure without the need for explicit declaration or configuration of the infrastructure.

Note that this post is giving examples based on Vercel's Platform as a Service offering. The concept, however, can be applied more widely as the basic idea of understanding a framework, and generating IaC configuration for it, can also be used for more traditional infrastructure deployments.
"""


summarizer = SelfLearningSummarizer()
summary = summarizer.get_summary(text, format=format)
refined_summary = summarizer.iterative_refinement(
    text, summary, turns=turns, format=format
)
comparison = summarizer.compare_summaries(text, summary, refined_summary)
clear_output()
markdown = Markdown(f"## Comparison\n{comparison}")
display(markdown)

## Conclusion

The application uses an OpenAI language model to automate the summarization of complex text and to produce concise summaries quickly.

### Key Features:

- **Initial Summarization**: The app generates an initial summary from a provided block of text, capturing key themes and important points.

- **Iterative Refinement**: The summary is refined over multiple turns, improving clarity, coherence, and conciseness.

- **Comparison of Summaries**: The initial and refined summaries are compared, with feedback on quality and improvements.

- **Markdown Output**: The final summaries and comparisons are displayed in an easily readable markdown format, suitable for presentation in Jupyter notebooks or other interactive Python environments.

### Benefits:

- **Efficiency**: Automates the summarization process, saving time and effort in distilling important information from lengthy text.

- **Quality Control**: The iterative refinement process helps make the final summary clear, accurate, and easy to understand.

- **Customizable Output**: Users can choose to output summaries in plain text or markdown, making it versatile for different applications.

This app is useful for anyone needing to process large volumes of text and produce succinct, high-quality summaries for analysis, presentation, or further processing.


---

# Thank You for visiting The Hackers Playbook! 🌐

If you liked this research material:

- [Subscribe to our newsletter.](https://thehackersplaybook.substack.com)

- [Follow us on LinkedIn.](https://www.linkedin.com/company/the-hackers-playbook/)

- [Leave a star on our GitHub.](https://www.github.com/thehackersplaybook)

<div style="display:flex; align-items:center; padding: 50px;">
<p style="margin-right:10px;">
    <img height="200px" style="width:auto;" width="200px" src="https://avatars.githubusercontent.com/u/192148546?s=400&u=95d76fbb02e6c09671d87c9155f17ca1e4ef8f21&v=4"> 
</p>
</div>