## Building AI Tools for News Summarization and Personal Productivity Apps

#### Task-1 : News Article Summarizer - Recursive Character Text Splitter

In [31]:
#  Take the article text from the website and perform Text Splitter Techniques on the input text.

# Step-1: Input_Text
input_text = ''' 'Helping every dang soul': Beloved camp director was among those lost in Texas flooding 
                 Jane Ragsdale spent her summers by the Guadalupe, the very river that killed her a week ago today in the catastrophic July Fourth flood. Mention her name in Kerrville, Texas, this week, and folks tend to do two things: tear up and smile. "I mean I can't tell you how many people, acquaintances of mine say, 'My dear, dear friend died.' And then they said, 'Did you know Jane Ragsdale?' and I say, 'Yeah, I did,' " said Karen Taylor, who lives in nearby Hunt, Texas. For her, Ragsdale was West Kerr County personified. "Everybody's friendly here, but she embodied that friendliness and generosity and love for others. I just can't imagine life without her," Taylor said. Ragsdale, who was in her late 60s, did a lot of things, but she's best known as the owner and director of Heart O' the Hills camp for girls. She was born into the business.'''

# Step-2: Importing the necessary libraries for text splitting
from langchain_text_splitters import RecursiveCharacterTextSplitter, CharacterTextSplitter

# Step-3: Initialize the RecursiveCharacterTextSplitter
text_splitter1 = RecursiveCharacterTextSplitter(chunk_size=100, chunk_overlap=30)

# Step-4: Split the input text using the RecursiveCharacterTextSplitter
split_text1 = text_splitter1.split_text(input_text)

# Step-5: Print the results of the text splitting using RecursiveCharacterTextSplitter
print("Splitting the input text using RecursiveCharacterTextSplitter:\n")
for i,chunk in enumerate(split_text1) :
    print(f"chunk {i+1}: {chunk}\n")



Splitting the input text using RecursiveCharacterTextSplitter:

chunk 1: 'Helping every dang soul': Beloved camp director was among those lost in Texas flooding

chunk 2: Jane Ragsdale spent her summers by the Guadalupe, the very river that killed her a

chunk 3: very river that killed her a week ago today in the catastrophic July Fourth flood. Mention her name

chunk 4: flood. Mention her name in Kerrville, Texas, this week, and folks tend to do two things: tear up

chunk 5: to do two things: tear up and smile. "I mean I can't tell you how many people, acquaintances of

chunk 6: many people, acquaintances of mine say, 'My dear, dear friend died.' And then they said, 'Did you

chunk 7: And then they said, 'Did you know Jane Ragsdale?' and I say, 'Yeah, I did,' " said Karen Taylor,

chunk 8: I did,' " said Karen Taylor, who lives in nearby Hunt, Texas. For her, Ragsdale was West Kerr

chunk 9: her, Ragsdale was West Kerr County personified. "Everybody's friendly here, but she embodied t

### Code Explanation:

-> input_text = ''' ... '''

🟢 Purpose: This defines a news article-style input as a string.

   -> We are simulating a real news article to test text splitting.

-> from langchain_text_splitters import RecursiveCharacterTextSplitter, CharacterTextSplitter

🟢 Purpose: We import two different text splitter classes from langchain_text_splitters.

   -> These are used to break long texts into smaller, manageable chunks, which is essential for:
      -> Better summarization
      -> Efficient use of LLMs (since input size is limited)

-> text_splitter1 = RecursiveCharacterTextSplitter(chunk_size=100, chunk_overlap=30)

🟢 Purpose: We are creating an instance of the RecursiveCharacterTextSplitter.

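   -> chunk_size=100: Each chunk will have up to 100 characters.
   -> chunk_overlap=30: Consecutive chunks share up to 30 characters (to maintain context continuity).
   -> Recursive splitting tries a list of separators in order (by default paragraph breaks, then line breaks, then spaces, then individual characters) and only falls back to a finer separator when a piece is still too large. It does not look for sentence boundaries unless you pass sentence-ending separators yourself, which is why some chunks above end mid-sentence.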
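To make the interaction between chunk_size and chunk_overlap concrete, here is a minimal pure-Python sketch of the space-level merge step. It is a simplified illustration and not LangChain's implementation: `pack_words` is a hypothetical helper, it handles only the space separator, and it assumes no single word is longer than chunk_size.

```python
def pack_words(text, chunk_size=100, chunk_overlap=30):
    """Greedily pack words into chunks of at most chunk_size characters.

    Hypothetical helper for illustration only; it handles just the
    space separator and assumes no single word exceeds chunk_size.
    """
    chunks, current = [], []
    for word in text.split():
        if current and len(" ".join(current + [word])) > chunk_size:
            chunks.append(" ".join(current))
            # Drop leading words until the carried-over tail fits the overlap
            # budget and still leaves room for the next word.
            while current and (
                len(" ".join(current)) > chunk_overlap
                or len(" ".join(current + [word])) > chunk_size
            ):
                current.pop(0)
        current.append(word)
    if current:
        chunks.append(" ".join(current))
    return chunks


sample = (
    "Jane Ragsdale spent her summers by the Guadalupe, the very river "
    "that killed her a week ago today in the catastrophic July Fourth flood."
)
demo_chunks = pack_words(sample, chunk_size=40, chunk_overlap=10)
for number, chunk in enumerate(demo_chunks, start=1):
    print(f"chunk {number} ({len(chunk)} chars): {chunk}")
```

Each printed chunk stays within the size limit, and the words carried over from the previous chunk are what the overlap setting controls.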

-> split_text1 = text_splitter1.split_text(input_text)

🟢 Purpose: This takes our input_text and breaks it into chunks using the recursive method.

   -> split_text1 is now a list of strings (each one is a chunk).

-> print("Splitting the input text using RecursiveCharacterTextSplitter:\n")
for i,chunk in enumerate(split_text1) :
    print(f"chunk {i+1}: {chunk}\n")
    
🟢 Purpose: Neatly prints each chunk for review.

   -> Helps you visually evaluate:
      -> Are chunks readable?
      -> Do they break mid-sentence or maintain clarity?
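Besides reading the chunks, the overlap can be checked programmatically. The sketch below uses chunk 2 and chunk 3 copied from the printed output above; `shared_overlap` is a hypothetical helper written for this check, not part of LangChain.

```python
def shared_overlap(prev_chunk, next_chunk):
    """Return the longest suffix of prev_chunk that is also a prefix of next_chunk."""
    max_len = min(len(prev_chunk), len(next_chunk))
    for size in range(max_len, 0, -1):
        if prev_chunk.endswith(next_chunk[:size]):
            return next_chunk[:size]
    return ""


# Chunk 2 and chunk 3 as printed in the output above.
chunk_2 = "Jane Ragsdale spent her summers by the Guadalupe, the very river that killed her a"
chunk_3 = ("very river that killed her a week ago today in the catastrophic "
           "July Fourth flood. Mention her name")

overlap = shared_overlap(chunk_2, chunk_3)
print(f"shared text: {overlap!r}")
print(f"overlap length: {len(overlap)} (limit was chunk_overlap=30)")
```

Running the same helper over every consecutive pair in `split_text1` would confirm that no pair shares more than the configured overlap.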

In [32]:
# Step-1: Importing the necessary libraries for text splitting
from langchain_text_splitters import CharacterTextSplitter

# Step-2: Initialize the CharacterTextSplitter
text_splitter2 = CharacterTextSplitter(chunk_size=300, chunk_overlap=30)

# Step-3: Split the input text using CharacterTextSplitter
split_text2 = text_splitter2.split_text(input_text)

# Step-4: Print the results of the text splitting using CharacterTextSplitter
print("Splitting the input text using CharacterTextSplitter:\n")
for i,chunk in enumerate(split_text2) :
    print(f"chunk {i+1}: {chunk}\n")

Splitting the input text using CharacterTextSplitter:

chunk 1: 'Helping every dang soul': Beloved camp director was among those lost in Texas flooding 
                 Jane Ragsdale spent her summers by the Guadalupe, the very river that killed her a week ago today in the catastrophic July Fourth flood. Mention her name in Kerrville, Texas, this week, and folks tend to do two things: tear up and smile. "I mean I can't tell you how many people, acquaintances of mine say, 'My dear, dear friend died.' And then they said, 'Did you know Jane Ragsdale?' and I say, 'Yeah, I did,' " said Karen Taylor, who lives in nearby Hunt, Texas. For her, Ragsdale was West Kerr County personified. "Everybody's friendly here, but she embodied that friendliness and generosity and love for others. I just can't imagine life without her," Taylor said. Ragsdale, who was in her late 60s, did a lot of things, but she's best known as the owner and director of Heart O' the Hills camp for girls. She was born into t

### Code Explanation:

-> text_splitter2 = CharacterTextSplitter(chunk_size=300, chunk_overlap=30)

🟢 Purpose: Now using the simpler CharacterTextSplitter, which splits on a single separator string (by default a blank line, "\n\n") and then merges the pieces up to chunk_size. It has no fallback to finer separators.

   -> Larger chunk size (300) to show difference.

-> split_text2 = text_splitter2.split_text(input_text)

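🟢 Purpose: Same call as before, but with single-separator splitting. Because this article contains no blank lines, there is nothing to split on, so the whole text comes back as one chunk that exceeds chunk_size=300 (as the output above shows).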
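The single chunk in the output follows from how the separator works: CharacterTextSplitter splits on one separator string, which defaults to "\n\n". The plain-Python sketch below mimics only that first splitting step on a shortened stand-in for the article (the `article` string here is an abbreviated assumption, not the full input).

```python
# Shortened stand-in for the article: a headline, one newline, then the body.
article = (
    "'Helping every dang soul': Beloved camp director was among those lost \n"
    "                 Jane Ragsdale spent her summers by the Guadalupe."
)

# Splitting on a blank line finds nothing to split on, so one piece remains.
blank_line_pieces = article.split("\n\n")

# Splitting on a space yields word-sized pieces that could then be merged.
space_pieces = article.split(" ")

print("pieces when splitting on a blank line:", len(blank_line_pieces))
print("pieces when splitting on a space:", len(space_pieces))
```

If word-level chunks are wanted from this splitter, passing `separator=" "` to the constructor is one option to try; the recursive splitter reaches the same point automatically by falling back through its separator list.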

-> print("Splitting the input text using CharacterTextSplitter:\n")
for i,chunk in enumerate(split_text2) :
    print(f"chunk {i+1}: {chunk}\n")

🟢 Purpose: Shows the output of the second splitting method, so we can compare results with the first one.

#### Comparison Between RecursiveCharacterTextSplitter and CharacterTextSplitter

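   -> RecursiveCharacterTextSplitter : tries separators from coarse to fine (paragraph breaks, line breaks, spaces) until each piece fits within chunk_size.  
   -> Result: Many small chunks that stay within the size limit and break at word boundaries, though not necessarily at sentence ends.

   -> CharacterTextSplitter : splits only on one separator (a blank line by default) and does not fall back to anything finer.  
   -> Result: Fewer chunks; here the article has no blank lines, so it stayed as a single chunk far larger than chunk_size.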

### Conclusion:
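RecursiveCharacterTextSplitter produced smaller, more usable chunks for this article, while CharacterTextSplitter left it as one oversized chunk. The recursive splitter is better suited when the goal is to keep chunks within a size limit without breaking words, which in turn helps summarization quality.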

#### Task-2 : Meeting Transcript Section Extractor - HTML Header Text Splitting

In [None]:
# Step-1: Importing the necessary libraries for text splitting
from langchain_text_splitters import HTMLHeaderTextSplitter

# Step-2: Input HTML content
input_html ='''
<h1>Board Meeting</h1>
 <p>Company updates and revenue discussion.</p>
 <h2>Marketing</h2>
 <p>Campaign performance reports.</p>
 <h2>Finance</h2>
 <p>Financial projections for Q3.</p>
 <h3>Q&A Session</h3>
 <p>Open questions from board members.</p>
 '''
 
# Header tags to split on, as (tag, metadata key) pairs
header_to_split_on = [
    ("h1", "header h1"),
    ("h2", "header h2"),
    ("h3", "header h3"),
    ("h4", "header h4"),
    ("h5", "header h5"),
    ("h6", "header h6")
]

# Step-3: Initialize the HTMLHeaderTextSplitter
html_splitter = HTMLHeaderTextSplitter(
    header_to_split_on)

# Step-4: Split the input HTML using HTMLHeaderTextSplitter
split_html = html_splitter.split_text(input_html)

# Step-5: Print the results of the text splitting using HTMLHeaderTextSplitter
print("Splitting the input HTML using HTMLHeaderTextSplitter:\n")
for i, chunk in enumerate(split_html):
    print(f"HTML chunk {i+1}: {chunk}\n")
    print("-"*150)


Splitting the input HTML using HTMLHeaderTextSplitter:

HTML chunk 1: page_content='Board Meeting' metadata={'header h1': 'Board Meeting'}

------------------------------------------------------------------------------------------------------------------------------------------------------
HTML chunk 2: page_content='Company updates and revenue discussion.' metadata={'header h1': 'Board Meeting'}

------------------------------------------------------------------------------------------------------------------------------------------------------
HTML chunk 3: page_content='Marketing' metadata={'header h1': 'Board Meeting', 'header h2': 'Marketing'}

------------------------------------------------------------------------------------------------------------------------------------------------------
HTML chunk 4: page_content='Campaign performance reports.' metadata={'header h1': 'Board Meeting', 'header h2': 'Marketing'}

-----------------------------------------------------------------

### Code Explanation:

-> We imported the `HTMLHeaderTextSplitter` from LangChain's text splitter library.  
   -> This class is used to **break structured HTML documents into sections** based on header tags (`<h1>`, `<h2>`, etc.).

-> input_html:
   -> We defined a sample HTML transcript of a board meeting. It contains:
      -> Headings at three levels (`<h1>`, `<h2>`, `<h3>`)
      -> Paragraphs under each heading
-> This is the type of structured HTML the splitter is designed to work with.

-> header_to_split_on:
   -> This is a list of `(tag, metadata key)` tuples that tells the splitter which HTML header tags to split on.  
      -> For example, `("h1", "header h1")` means: split at every `<h1>` and store its text under the key `"header h1"` in the output metadata.
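The effect of these tuples can be sketched in plain Python. This is only an illustration of the mapping, not how the splitter is implemented internally; `active_headings` is a hypothetical stand-in for the headings in scope while the "Marketing" paragraph is being read.

```python
# Same (tag, metadata key) pairs as in the cell above, shortened to three levels.
header_to_split_on = [
    ("h1", "header h1"),
    ("h2", "header h2"),
    ("h3", "header h3"),
]
tag_to_key = dict(header_to_split_on)

# Hypothetical state while reading the "Marketing" section of the sample HTML.
active_headings = [("h1", "Board Meeting"), ("h2", "Marketing")]

# Each active heading contributes one metadata entry under its configured key.
metadata = {tag_to_key[tag]: title for tag, title in active_headings}
print(metadata)
```

The resulting dictionary has the same shape as the metadata printed for HTML chunk 4 above.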

-> Initialize the HTMLHeaderTextSplitter
   -> html_splitter = HTMLHeaderTextSplitter(header_to_split_on)

🟢 Purpose:

   -> Creates an html_splitter object using the heading rules we defined.
   -> This splitter will now know how to break the input HTML into structured document chunks.

-> split_html = html_splitter.split_text(input_html)

🟢 Purpose:  

   -> Actually performs the **splitting operation**.  
   -> Returns a list of LangChain `Document` objects, each with:
      -> `.page_content`: The content chunk
      -> `.metadata`: The header tag info (e.g., which `h1`/`h2` it belongs to)
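Because every chunk carries its heading path in `.metadata`, the results can be regrouped into sections for later summarization. The sketch below is a minimal example under stated assumptions: `Doc` is a hypothetical stand-in for LangChain's `Document` (so the snippet runs without the library), and the four entries are copied from the printed output above.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for LangChain's Document (same two attributes)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


# The four chunks shown in the printed output above.
docs = [
    Doc("Board Meeting", {"header h1": "Board Meeting"}),
    Doc("Company updates and revenue discussion.", {"header h1": "Board Meeting"}),
    Doc("Marketing", {"header h1": "Board Meeting", "header h2": "Marketing"}),
    Doc("Campaign performance reports.",
        {"header h1": "Board Meeting", "header h2": "Marketing"}),
]


def section_path(doc):
    """Join heading titles from the outermost to the innermost level."""
    keys = sorted(doc.metadata)  # "header h1" sorts before "header h2", etc.
    return " > ".join(doc.metadata[key] for key in keys)


sections = {}
for doc in docs:
    if doc.page_content in doc.metadata.values():
        continue  # skip chunks that only repeat a heading title
    sections.setdefault(section_path(doc), []).append(doc.page_content)

for path, paragraphs in sections.items():
    print(f"{path}: {paragraphs}")
```

With real splitter output, `split_html` would take the place of `docs`, since each element exposes the same `.page_content` and `.metadata` attributes.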

-> Print the results of the text splitting using HTMLHeaderTextSplitter
print("Splitting the input HTML using HTMLHeaderTextSplitter:\n")
for i, chunk in enumerate(split_html):
    print(f"HTML chunk {i+1}: {chunk}\n")
    print("-"*150)

🟢 Purpose:

   -> Loops through the split results and prints them neatly.
   -> Helps us to verify that the text was correctly split into meaningful meeting sections (e.g., "Marketing", "Finance", etc.).

## 📝 Observation:

-> The HTMLHeaderTextSplitter successfully extracted meaningful sections such as:
   -> Board Meeting
   -> Marketing
   -> Finance
   -> Q&A Session
-> This method is useful for processing structured meeting transcripts from HTML files, making it easier to summarize or analyze individual sections.

#### Task-3 : Daily Productivity Prompter - OpenAI API Integration

In [None]:
# Step-1: Importing the necessary libraries for OpenAI API integration
from openai import OpenAI

# Step-2: Load environment variables from .env file
from dotenv import load_dotenv
load_dotenv()

# Step-3: Import os
import os

# Step-4: Initialize the OpenAI client with API key and base URL
client = OpenAI(
    api_key=os.getenv("API_KEY"),
    base_url="https://api.sambanova.ai/v1"
)

# Step-5: Prompt the OpenAI API with various tasks
# Here are some example prompts to test the OpenAI API
prompts = [
    "Suggest 3 study tasks for a student who has 2 hours today.",
    "prepare an exam study plan for a student in a small paragraph.",
    "Give me a morning routine to stay productive in a small paragraph.",
    "Suggest a beginner-friendly fitness plan in a small paragraph.",
]

# Step-6: Function to generate response from OpenAI API
def generate_response(prompt):
    print(f"Prompt: {prompt}")
    try:
        response = client.chat.completions.create(
            model="Meta-Llama-3.3-70B-Instruct",
            messages=[{"role": "user", "content": prompt}],
        )
        print(f"Response:\n{response.choices[0].message.content}")
    except Exception as e:
        print("Error:", e)
        
    print("-" * 150)

# Step-7: Call the function for each prompt
for prompt in prompts:
    generate_response(prompt)


Prompt: Suggest 3 study tasks for a student who has 2 hours today.
Response:
Here are three study tasks that a student can complete in 2 hours:

**Task 1: Review Notes (30 minutes)**
Review your class notes from the past week or two. Go through each page, summarize the key points in your own words, and make sure you understand the concepts. This will help you reinforce your learning and identify areas where you need more practice or review.

**Task 2: Practice Problems (45 minutes)**
Choose a topic or subject that you're currently studying and practice solving problems or past exams. This will help you apply what you've learned and build your problem-solving skills. Try to complete as many problems as you can within the time frame, and check your answers to see where you need to improve.

**Task 3: Flashcard Creation and Review (45 minutes)**
Create flashcards for key terms, concepts, or formulas in your subject. Write the term or question on one side and the definition or answer on th

### Code Explanation:

-> from openai import OpenAI

🟢 Purpose:  

Imports the `OpenAI` class from the `openai` library.  
We are using it to make API calls to the SambaNova-hosted OpenAI-compatible endpoint.

-> Load environment variables from .env file

🟢 Purpose:

Loads our .env file, which contains environment variables such as our API key.
This avoids hardcoding credentials directly in the script, which is good for security.
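One practical detail: `os.getenv` returns `None` when a variable is missing, so a typo in the .env file only surfaces later as an authentication error. A small guard can fail earlier with a clearer message. This is a sketch, not part of the original notebook; `require_env` and the `DEMO_API_KEY` variable are hypothetical names used for illustration.

```python
import os


def require_env(name):
    """Return the value of an environment variable or fail with a clear message."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(
            f"Environment variable {name!r} is not set; check the .env file."
        )
    return value


# Demo with a placeholder value so the snippet runs without a real key.
os.environ["DEMO_API_KEY"] = "placeholder-key"
api_key = require_env("DEMO_API_KEY")
print("key loaded:", bool(api_key))
```

In the notebook, the same helper could be called as `require_env("API_KEY")` after `load_dotenv()` and before creating the client.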

-> import os

🟢 Purpose:  

   -> `os` is used to access environment variables (e.g., `os.getenv("API_KEY")`) loaded from `.env`.

->  Initialize the OpenAI client with API key and base URL

🟢 Purpose:

   -> Creates a client instance of OpenAI using:
      -> Our secret API key (read from the environment rather than hardcoded)
      -> A custom base URL (sambanova.ai) instead of OpenAI’s default URL
   -> The custom base URL is what routes requests to SambaNova's hosted LLMs (e.g., Llama 3.3).

-> prompts

🟢 Purpose:  

This is a list of input prompts we want the model to answer.  
Each prompt represents a **personal productivity request**, just as the assignment requires.

->  Function to generate response from OpenAI API : def generate_response(prompt):

🟢 Purpose:

Defines a function that takes a prompt, sends it to the model through the OpenAI-compatible API, and prints the response.

-> chat completions:
Uses the chat completions endpoint to get a response from the Meta-Llama-3.3-70B-Instruct model.
Each request uses the standard chat format with a single user message; no system message is set.
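The `messages` argument is a list of role/content dictionaries. The sketch below shows how that list could be built with an optional system message placed before the user prompt; `build_messages` and the example system text are hypothetical additions, since the notebook itself sends only a user message.

```python
def build_messages(prompt, system_prompt=None):
    """Build a chat-format message list; the system message is optional."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": prompt})
    return messages


# Same shape as the notebook's request: a single user message.
user_only = build_messages("Suggest 3 study tasks for a student who has 2 hours today.")

# Variant with an illustrative system message to steer tone and length.
with_system = build_messages(
    "Suggest 3 study tasks for a student who has 2 hours today.",
    system_prompt="You are a concise productivity coach.",
)

print(user_only)
print(with_system)
```

Either list could be passed as `messages=` in the `client.chat.completions.create` call shown earlier.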

-> print(f"Response:\n{response.choices[0].message.content}")

🟢 Purpose:  

Prints the model’s generated text (only the actual message content from the first choice).

-> Exception:

🟢 Purpose:

Catches any exceptions (e.g., rate limits, connection errors) so our program won’t crash and we can see what went wrong.
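Printing the error is enough for a notebook, but transient failures such as rate limits often succeed on a second try. The following is a minimal retry sketch under stated assumptions: `call_with_retries` and `flaky_request` are hypothetical helpers, the backoff values are arbitrary, and the fake request stands in for the real API call so the snippet runs offline.

```python
import time


def call_with_retries(func, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func(); on failure wait with exponential backoff and try again."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception as error:
            if attempt == attempts:
                raise  # out of attempts: let the caller see the last error
            delay = base_delay * 2 ** (attempt - 1)
            print(f"Attempt {attempt} failed ({error}); retrying in {delay:.2f}s")
            sleep(delay)


# Fake request that fails twice, then succeeds (stands in for the API call).
calls = {"count": 0}


def flaky_request():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


result = call_with_retries(
    flaky_request, attempts=3, base_delay=0.01, sleep=lambda seconds: None
)
print(result, "after", calls["count"], "calls")
```

In practice it may be better to retry only on specific exception types rather than on every `Exception`, so that configuration mistakes such as a wrong API key fail immediately.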

-> Call the function for each prompt
for prompt in prompts:
    generate_response(prompt)

🟢 Purpose:

Iterates through our list of prompts, calls the API for each, and prints the results.

## 🧠 Observation:

The OpenAI-compatible API (via SambaNova) successfully generated responses for various personal productivity prompts such as study plans, morning routines, and fitness suggestions. Each prompt was sent as an independent single-turn chat request, and the outputs were human-like and helpful. This approach is useful for building productivity tools that offer contextual advice to users.