In [1]:
"""
CodeFusion | Day 3 | Pretrained Models with Hugging Face Pipelines

We will discuss
Session 1: The power of pretrained models
Session 2: Hugging Face pipelines
Session 3: Supercharge your application with AI
"""
print("Welcome to CodeFusion Day 3")

Welcome to CodeFusion Day 3


### Session 1: The Power of Pretrained Models

In [2]:
"""
Key concepts:
[] Pretrained models are neural networks trained on large datasets for general tasks (e.g., BERT, GPT)
   Pretrained model pipeline (data → model → task).
[] They can be fine-tuned for specific tasks with minimal data
[] Benefits: Saves time, reduces the need for large datasets, leverages expert knowledge.

Explain: Pretrained models are like "knowledgeable assistants" pretrained on vast data, ready to adapt to new tasks.

Interactive Question:
"Can you name a task where pretrained models might be useful?"


Applications:
[] Text: Sentiment analysis (e.g., analyzing customer reviews).
[] Vision: Image classification (e.g., identifying objects in photos).
[] Speech: Automatic speech recognition (e.g., transcribing audio).
[] Real-world examples: BERT for search engines, DALL-E for image generation.

Interactive Question: "What's a real-world problem you'd love a pretrained model to solve?"


"""

print("Pretrained models are awesome.")

Pretrained models are awesome.


In [3]:
from transformers import pipeline
classifier = pipeline("sentiment-analysis")
result = classifier("I love learning about AI!")
print(result)

No model was supplied, defaulted to distilbert/distilbert-base-uncased-finetuned-sst-2-english and revision 714eb0f (https://huggingface.co/distilbert/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.





Device set to use cuda:0


[{'label': 'POSITIVE', 'score': 0.99968421459198}]
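
The warning in the log above says that relying on the default model is not recommended in production. A minimal sketch of pinning the checkpoint explicitly, reusing the model name and revision that the log reports. The `PINNED_SENTIMENT` dict and the `load_pinned_classifier` helper are names made up here for illustration, not part of the library.

```python
# Checkpoint details copied from the warning printed by the default pipeline.
PINNED_SENTIMENT = {
    "task": "sentiment-analysis",
    "model": "distilbert/distilbert-base-uncased-finetuned-sst-2-english",
    "revision": "714eb0f",
}


def load_pinned_classifier(config=PINNED_SENTIMENT):
    """Build a pipeline from an explicit model name and revision (hypothetical helper)."""
    # Imported lazily so the configuration can be inspected without loading transformers.
    from transformers import pipeline

    return pipeline(config["task"], model=config["model"], revision=config["revision"])


# Usage (downloads the checkpoint on first run):
# classifier = load_pinned_classifier()
# print(classifier("I love learning about AI!"))
```

Pinning both values means a later change to the default checkpoint should not silently change the predictions.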




In [4]:
""" Hugging Face (https://huggingface.co/)

Interactive Question: How many of you use Hugging Face pipelines?

Key Concepts:
[] Hugging Face is an open-source platform focused on machine learning, especially natural language processing (NLP).
[] It provides tools, models, and datasets for developers and researchers.
[] Famous for the Transformers library, which offers easy access to pre-trained AI models.
[] Founded in 2016, based in New York, with a mission to make AI accessible to everyone.

Key Features:
[] Model Hub: Thousands of pre-trained models for tasks like text generation, translation, and sentiment analysis.
[] Datasets Library: A wide range of publicly available datasets for machine learning tasks.
[] Spaces: A platform for hosting and sharing machine learning apps and demos with an easy-to-use interface.
[] Community-Driven: Encourages collaboration through open-source contributions and discussions.
[] Applications: Widely used in academia, industry, and personal projects for fast AI development.
"""
print("Hugging Face is awesome")

Hugging Face is awesome


### Session 2: Hugging Face Pipelines

In [5]:
""" 
Installation: pip install transformers

[] Hugging Face Transformers is an open-source library with pre-trained models for NLP, vision, audio, and more.
[] The pipeline is a simplified API for inference tasks.

Discuss the questions:
[] Can pipelines handle other languages?
[] How do I choose a model?
"""

print("Sentiment Analysis")
from transformers import pipeline
classifier = pipeline("sentiment-analysis")
texts = [
    "This movie is amazing!", 
    "I did not enjoy the food."
]
results = classifier(texts)
for text, result in zip(texts, results):
    print(f"Text: {text} → {result['label']}")
    print(f"Confidence: {result['score']*100:.2f}%")

No model was supplied, defaulted to distilbert/distilbert-base-uncased-finetuned-sst-2-english and revision 714eb0f (https://huggingface.co/distilbert/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.


Sentiment Analysis


Device set to use cuda:0


Text: This movie is amazing! → POSITIVE
Confidence: 99.99%
Text: I did not enjoy the food. → NEGATIVE
Confidence: 99.95%
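
Each pipeline result is a dict with a `label` and a `score`, so the output can be post-processed with plain Python. The sketch below marks predictions under a confidence cutoff as uncertain. The `flag_uncertain` helper and the 0.8 threshold are assumptions for illustration, and the results are hand-written in the pipeline's output shape rather than produced by a model.

```python
def flag_uncertain(texts, results, threshold=0.8):
    """Pair each text with its label, marking low-confidence predictions as 'UNCERTAIN'."""
    flagged = []
    for text, result in zip(texts, results):
        label = result["label"] if result["score"] >= threshold else "UNCERTAIN"
        flagged.append((text, label, round(result["score"], 4)))
    return flagged


# Hand-written results in the same shape the sentiment pipeline returns.
example_texts = ["This movie is amazing!", "The plot was fine, I guess."]
example_results = [
    {"label": "POSITIVE", "score": 0.9999},
    {"label": "POSITIVE", "score": 0.62},
]
for row in flag_uncertain(example_texts, example_results):
    print(row)
```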


In [6]:
generator = pipeline("text-generation", model="gpt2")
prompt = "Bangladesh is a "
result = generator(prompt, max_length=50)
print(result[0]["generated_text"])

Device set to use cuda:0
Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Bangladesh is a izzarjar region located near the border with Pakistan to southern Bangladesh. It's a part of Jammu Kashmir in Jammu, Delhi and some parts of the Central Valley of Kashmir. It plays a pivotal role. The
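
The log above shows two notices: one about truncation, triggered by passing `max_length`, and one about `pad_token_id` falling back to the end-of-sequence id. A minimal sketch of a more explicit call, assuming that `max_new_tokens` sidesteps the truncation notice in your installed version. The `GENERATION_KWARGS` dict and `generate_text` helper are hypothetical names, and the sampling values are arbitrary choices.

```python
# Generation settings; 50256 is the GPT-2 end-of-sequence id reported in the log above.
GENERATION_KWARGS = {
    "max_new_tokens": 40,   # counts only generated tokens, not the prompt
    "do_sample": True,      # sample instead of greedy decoding
    "top_k": 50,            # restrict sampling to the 50 most likely tokens
    "pad_token_id": 50256,  # set explicitly instead of relying on the fallback
}


def generate_text(prompt, seed=42, **overrides):
    """Generate a continuation with GPT-2 using explicit settings (hypothetical helper)."""
    # Imported lazily so the settings can be inspected without loading transformers.
    from transformers import pipeline, set_seed

    set_seed(seed)  # seeds the random number generators used for sampling
    generator = pipeline("text-generation", model="gpt2")
    kwargs = {**GENERATION_KWARGS, **overrides}
    return generator(prompt, **kwargs)[0]["generated_text"]


# Usage (downloads GPT-2 on first run):
# print(generate_text("Bangladesh is a "))
```

As the sample output shows, GPT-2 produces fluent but factually unreliable text, so generated content should be checked before use.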


### Session 3: Supercharge Your Software with AI

In [7]:
sample_note = [
    {
        "title": "Personal Health Check-In",
        "note": (
            "Took some extra time this morning to reflect on my health and well-being as part of my ongoing personal growth. "
            "Over the past few months, I’ve felt my energy fluctuate quite a lot — sometimes waking up refreshed, other times feeling sluggish — and I want to understand these patterns better. "
            "This reflection has helped me recognize that my habits around sleep, diet, exercise, and stress management all play a role in my day-to-day health. "
            "To address this, I’m setting a clear intention: I will go for a walk every evening after work to stay active, clear my mind, and create a more sustainable routine that fits into my schedule. "
            "I plan to make these walks a space for quiet thinking, listening to my favorite podcasts or simply appreciating my surroundings so that movement also becomes a mental reset. "
            "I also scheduled a doctor’s appointment for next week to follow up on my last check-up, review my lab results, and talk about any lingering health concerns. "
            "Additionally, I want to pay closer attention to my eating habits, especially making sure that I get a good balance of vegetables, fruits, and protein at each meal. "
            "I’m going to prepare a simple meal plan for the coming week to make grocery shopping easier and ensure I don’t skip meals when my schedule gets busy. "
            "Beyond physical health, I recognize that my emotional and mental well-being deserve regular care too. "
            "That’s why I plan to dedicate a few minutes every morning to light stretching and mindfulness exercises — this practice will help me ground myself before starting the day. "
            "At the end of the week, I’d like to take some time to journal my progress, acknowledge my wins (no matter how small), and note any challenges so that I can continuously improve. "
            "Writing this all down reminds me that health is a lifelong journey — one that I can approach with patience, consistency, and compassion toward myself. "
            "I feel hopeful that these small steps will add up over time and help me feel more energetic, centered, and at peace with my body and mind."
        )
    },
    {
        "title": "Product Development Sync-Up",
        "note": (
            "The team gathered to discuss the current status of the mobile app development project. "
            "We reviewed the sprint backlog and identified blockers in the payment integration feature. "
            "The team agreed to split the task into front-end and back-end subtasks and assigned owners. "
            "Next steps include preparing a demo for the client next week and drafting the release notes for version 1.2."
        )
    },
    {
        "title": "Data Science Workshop Recap",
        "note": (
            "The class focused on training machine learning models and evaluating performance metrics. "
            "Students practiced hands-on exercises to fine-tune hyperparameters and learned best practices for avoiding overfitting. "
            "We ended with a Q&A session where students raised questions about model interpretability and deployment. "
            "Upcoming sessions will explore feature engineering and advanced deep learning techniques."
        )
    },
    {
        "title": "Feeling Overwhelmed but Hopeful",
        "note": (
            "Today felt quite overwhelming with so many tasks and deadlines piling up. "
            "Despite the stress, I feel hopeful that by organizing my time better and reaching out for support, I can make progress. "
            "Writing this down to acknowledge my emotions and remind myself that it's okay to take one step at a time."
        )
    }
]


In [8]:
def preprocess_text(text: str) -> str:
    """Clean and prepare text for processing."""
    return text.strip().replace("\n", " ")

# Example usage:
raw_text = "   This is a sample text.\nIt contains newlines and extra spaces.   "
clean_text = preprocess_text(raw_text)

print("Original text:")
print(repr(raw_text))
print("\nCleaned text:")
print(repr(clean_text))

Original text:
'   This is a sample text.\nIt contains newlines and extra spaces.   '

Cleaned text:
'This is a sample text. It contains newlines and extra spaces.'
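
`preprocess_text` trims the ends and replaces newlines, but tabs and repeated spaces inside the text survive. If that matters for your notes, one possible variant collapses every run of whitespace. The `normalize_whitespace` name is made up for this sketch.

```python
def normalize_whitespace(text: str) -> str:
    """Collapse every run of whitespace (spaces, tabs, newlines) into a single space."""
    # str.split() with no arguments splits on any whitespace and drops empty pieces.
    return " ".join(text.split())


messy = "  Line one.\n\nLine two\twith a tab   and   extra spaces.  "
print(repr(normalize_whitespace(messy)))
# → 'Line one. Line two with a tab and extra spaces.'
```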


In [9]:
"""Task: Summarization"""

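def summarize_note(note_data):
    """Generate a summary of the note using a pretrained summarization model."""
    try:
        # BART fine-tuned on CNN/DailyMail; note the pipeline is rebuilt on every call
        summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
        text = f"{note_data['title']}\n{note_data['note']}"
        summary = summarizer(
            preprocess_text(text),
            max_length=100,   # upper bound on summary length, in tokens
            min_length=20,    # lower bound on summary length, in tokens
            do_sample=False   # deterministic decoding instead of sampling
        )[0]['summary_text']
        return summary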
    except Exception as e:
        return f"Error summarizing note: {str(e)}"
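
`summarize_note` builds a new pipeline on every call, which reloads the model each time. One way to avoid that is to cache the loaded pipeline. The sketch below uses `functools.lru_cache`; `get_pipeline`, `_default_factory`, and `fake_factory` are hypothetical names, and the demonstration uses a stand-in factory so that nothing is downloaded.

```python
from functools import lru_cache


def _default_factory(task, model):
    """Load a real Hugging Face pipeline; imported lazily to keep this cell lightweight."""
    from transformers import pipeline

    return pipeline(task, model=model)


@lru_cache(maxsize=None)
def get_pipeline(task, model, factory=_default_factory):
    """Return a cached pipeline so repeated calls reuse the already-loaded model."""
    return factory(task, model)


# Demonstration with a stand-in factory, so no model is downloaded here.
def fake_factory(task, model):
    return {"task": task, "model": model}


first = get_pipeline("summarization", "facebook/bart-large-cnn", fake_factory)
second = get_pipeline("summarization", "facebook/bart-large-cnn", fake_factory)
print(first is second)  # → True
```

Inside a helper such as `summarize_note`, the call would become `get_pipeline("summarization", "facebook/bart-large-cnn")`, which loads the model once per session.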

In [10]:
summarize_note(sample_note[0])

Device set to use cuda:0


'"I’m setting a clear intention: I will go for a walk every evening after work to stay active, clear my mind, and create a more sustainable routine" "I want to pay closer attention to my eating habits, especially making sure that I get a good balance of vegetables, fruits, and protein at each meal" "Health is a lifelong journey — one that I can approach with patience, consistency, and compassion"'

In [11]:
def assign_tag(note_data):
    """Assign a single tag to a note using zero-shot classification."""
    try:
        # BART fine-tuned on MNLI scores each candidate label against the text
        classifier = pipeline(
            "zero-shot-classification",
            model="facebook/bart-large-mnli"
        )
        text = f"{note_data['title']}\n{note_data['note']}"

        # Explicit, mutually exclusive candidate labels
        candidate_labels = ["Educational session", "Project meeting", "Personal note"]

        # multi_label=False makes the label scores sum to 1
        result = classifier(preprocess_text(text), candidate_labels, multi_label=False)

        # Labels come back sorted by descending score, so the first is the best match
        return result['labels'][0]
    except Exception as e:
        # Return a string, matching the error format of the other helpers
        return f"Error assigning tag: {e}"

In [12]:
print(assign_tag(sample_note[2]))  # Expected: "Educational session"

Device set to use cuda:0


Educational session
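
The zero-shot pipeline returns a dict with `sequence`, `labels`, and `scores`, with labels ordered from highest to lowest score. Taking `labels[0]` always yields a tag, even when no label fits well. A minimal sketch of a guarded version follows; the `pick_tag` helper, the 0.5 cutoff, and the "Uncategorized" fallback are assumptions, and both result dicts are hand-written examples rather than model output.

```python
def pick_tag(result, min_score=0.5, fallback="Uncategorized"):
    """Return the top label, or a fallback when the best score is below min_score."""
    top_label, top_score = result["labels"][0], result["scores"][0]
    return top_label if top_score >= min_score else fallback


# Hand-written results in the shape returned by the zero-shot pipeline.
confident = {
    "sequence": "...",
    "labels": ["Educational session", "Project meeting", "Personal note"],
    "scores": [0.91, 0.06, 0.03],
}
ambiguous = {
    "sequence": "...",
    "labels": ["Project meeting", "Personal note", "Educational session"],
    "scores": [0.40, 0.35, 0.25],
}
print(pick_tag(confident))  # → Educational session
print(pick_tag(ambiguous))  # → Uncategorized
```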


In [13]:
def analyze_sentiment(note_data) -> str:
    """Analyze sentiment and map to an emoji."""
    try:
        sentiment_analyzer = pipeline("sentiment-analysis", model="distilbert-base-uncased-finetuned-sst-2-english")
        text = f"{note_data['title']}\n{note_data['note']}"
        sentiment = sentiment_analyzer(preprocess_text(text))[0]
        label, score = sentiment['label'], sentiment['score']
        if label == "POSITIVE" and score > 0.9:
            return "😊"
        elif label == "POSITIVE":
            return "🙂"
        elif label == "NEGATIVE" and score > 0.9:
            return "😟"
        else:
            return "😐"
    except Exception as e:
        return f"Error analyzing sentiment: {str(e)}"

In [14]:
analyze_sentiment(sample_note[3])

Device set to use cuda:0


'😊'
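
The emoji mapping in `analyze_sentiment` can only be exercised by loading the model. Pulling the mapping into its own function makes the thresholds easy to check in isolation. The sketch below keeps the same rules; `sentiment_to_emoji` and its `strong` parameter are names introduced here for illustration.

```python
def sentiment_to_emoji(label: str, score: float, strong: float = 0.9) -> str:
    """Map a sentiment label and confidence score to an emoji."""
    if label == "POSITIVE":
        # Confident positives get a bigger smile than weak ones.
        return "😊" if score > strong else "🙂"
    if label == "NEGATIVE" and score > strong:
        return "😟"
    # Weak negatives and any unexpected label fall back to neutral.
    return "😐"


print(sentiment_to_emoji("POSITIVE", 0.97))  # → 😊
print(sentiment_to_emoji("POSITIVE", 0.70))  # → 🙂
print(sentiment_to_emoji("NEGATIVE", 0.95))  # → 😟
print(sentiment_to_emoji("NEGATIVE", 0.60))  # → 😐
```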

In [15]:
def process_notes(notes):
    """Process a list of notes and return a summary, tag, and sentiment for each."""
    results = []
    for note_data in notes:
        try:
            result = {
                "title": note_data['title'],
                "note": note_data['note'],
                "summary": summarize_note(note_data),
                "tag": assign_tag(note_data),
                "sentiment": analyze_sentiment(note_data)
            }
            results.append(result)
        except KeyError as e:
            results.append({
                "title": note_data.get('title', 'Unknown'),
                "note": note_data.get('note', ''),
                "error": f"Missing key {e} in note_data. Expected format: {{'title': str, 'note': str}}"
            })

    return results
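
Since `process_notes` returns a list of plain dicts, the results can be handed to another part of an application as JSON. A minimal sketch, using a placeholder record in the same shape; the values in `example_results` are invented for illustration, not real model output.

```python
import json

# Placeholder record with the same keys that process_notes produces.
example_results = [
    {
        "title": "Example note",
        "note": "A short placeholder note.",
        "summary": "Placeholder summary.",
        "tag": "Personal note",
        "sentiment": "😊",
    }
]

# ensure_ascii=False keeps the emoji readable instead of escaping it as \ud83d\ude0a.
serialized = json.dumps(example_results, ensure_ascii=False, indent=2)
print(serialized)

# Round-trip check: loading the string gives back the original structure.
assert json.loads(serialized) == example_results
```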

In [16]:
processed_notes = process_notes(sample_note[:1])
for result in processed_notes:
    print(f"Title: {result['title']}")
    print(f"Original Note: {result['note']}")
    print(f"Summary: {result['summary']}")
    print(f"Tag: {result['tag']}")
    print(f"Sentiment: {result['sentiment']}\n")

Device set to use cuda:0
Device set to use cuda:0
Device set to use cuda:0


Title: Personal Health Check-In
Original Note: Took some extra time this morning to reflect on my health and well-being as part of my ongoing personal growth. Over the past few months, I’ve felt my energy fluctuate quite a lot — sometimes waking up refreshed, other times feeling sluggish — and I want to understand these patterns better. This reflection has helped me recognize that my habits around sleep, diet, exercise, and stress management all play a role in my day-to-day health. To address this, I’m setting a clear intention: I will go for a walk every evening after work to stay active, clear my mind, and create a more sustainable routine that fits into my schedule. I plan to make these walks a space for quiet thinking, listening to my favorite podcasts or simply appreciating my surroundings so that movement also becomes a mental reset. I also scheduled a doctor’s appointment for next week to follow up on my last check-up, review my lab results, and talk about any lingering health c