<a href="https://colab.research.google.com/github/Yuting-TinaL/Test/blob/main/GenAI_1.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Generative AI: Introduction and Hands-On with Hugging Face

**Certification by ShiftKey Labs**  
**Content created by Vansh Sood**

This notebook is designed to give you hands-on experience with the tools and concepts you'll need to work with generative AI models. We'll be using Google Colab to run Python code, explore some basic concepts, and dive into working with pre-trained models from Hugging Face.

## Sections Covered
1. Introduction and Overview
2. Playing Around with Google Colab
3. Coding in Python: Getting Comfortable with Google Colab
4. Setting Up Libraries from Hugging Face
5. Importing Necessary Modules
6. Understanding the Imports
7. Loading a Pre-trained Model using Transformers
8. Tokenizing Input and Model Inference
9. Asking Different Questions to the NLP Model
10. Conclusion and Preview for Next Session

Let's begin!

## 1. Introduction and Overview

In this notebook, we'll walk through setting up your environment in Google Colab, coding in Python to get familiar with the platform, and working with pre-trained Transformer models from Hugging Face.

By the end of this session, you'll have a better understanding of how to use Google Colab, work with Python code in the cloud, and start using powerful AI models from Hugging Face for various natural language processing (NLP) tasks.

## 2. Playing Around with Google Colab

Google Colab is a great tool for running Python code in the cloud, with access to free GPUs. Before diving into the main content, let's explore the Colab environment.

### Task: Simple Arithmetic Operations

We'll start with something simple: basic arithmetic in Python. This helps you get comfortable with running code cells in Colab.

In [None]:
# Let's do some basic arithmetic
addition = 5 + 3
multiplication = 7 * 6
division = 16 / 4
subtraction = 10 - 2

# Display the results (Colab shows the value of the last expression in a cell)
addition, multiplication, division, subtraction
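
Note that `division` above is `4.0`, not `4`: in Python 3 the `/` operator always returns a float. The short sketch below contrasts it with a few related operators; the variable names are only illustrative.

```python
# True division (/) always returns a float, even when the result is whole.
true_division = 16 / 4

# Floor division (//) rounds down to the nearest integer.
floor_division = 17 // 4

# The modulo operator (%) gives the remainder of a division.
remainder = 17 % 4

# Exponentiation uses ** rather than ^.
power = 2 ** 5

print(true_division, floor_division, remainder, power)
```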

## 3. Coding in Python: Getting Comfortable with Google Colab

Now that we've done some basic arithmetic, let's try something a bit more complex. We'll write a simple Python function and see how it works.

### Task: Write a Function to Check Even or Odd Numbers

Let's write a Python function that checks whether a number is even or odd.

In [None]:
# Define a function to check if a number is even or odd
def check_even_odd(number):
    if number % 2 == 0:
        return f"{number} is even."
    else:
        return f"{number} is odd."

# Test the function with a few examples
check_even_odd(7), check_even_odd(10)
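
To see how the function behaves on edge cases such as zero and negative numbers, it can be applied to several values at once. This is a small sketch that repeats the function so the block runs on its own.

```python
def check_even_odd(number):
    """Same logic as the cell above, repeated so this block is self-contained."""
    if number % 2 == 0:
        return f"{number} is even."
    return f"{number} is odd."

# In Python, % takes the sign of the divisor, so -3 % 2 is 1 and
# negative numbers are classified correctly.
results = [check_even_odd(n) for n in (-3, 0, 7, 10)]
print(results)
```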

## 4. Setting Up Libraries from Hugging Face

Now that we're comfortable with the basics, let's set up the libraries we'll need to work with pre-trained models from Hugging Face.

### Task: Install the Hugging Face Transformers Library

Hugging Face provides the `transformers` library, which makes it easy to use state-of-the-art NLP models. We'll start by installing this library.


In [None]:
# Install the Hugging Face Transformers library
!pip install transformers
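
After installing, it can be useful to confirm which version ended up in the environment. One way to do this, sketched below with the standard library's `importlib.metadata`, is a small helper; `installed_version` is a hypothetical name chosen for this example.

```python
from importlib.metadata import PackageNotFoundError, version

def installed_version(package_name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None

# Prints the version if transformers is installed, otherwise None.
print("transformers:", installed_version("transformers"))
```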

## 5. Importing Necessary Modules

After installing the library, the next step is to import the necessary modules that we'll use to work with models and tokenizers.

### Task: Import Modules from Hugging Face

In [None]:
# Importing necessary modules from Hugging Face
from transformers import AutoTokenizer, AutoModelForSequenceClassification

print("Modules imported successfully.")

## 6. Understanding the Imports

In the previous section, we imported:
- `AutoTokenizer`: A class that provides all the necessary tools to preprocess text data for the model.
- `AutoModelForSequenceClassification`: A class that loads a pre-trained model for sequence classification tasks.

These imports are critical for tokenizing input data and loading the pre-trained models.
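
To build intuition for what "tokenizing" means, here is a deliberately simplified toy tokenizer in plain Python. It is an assumption-laden sketch, not how `AutoTokenizer` works internally: real BERT tokenizers split text into subword pieces and use a fixed, pre-built vocabulary. The helper names `build_vocab` and `encode` are hypothetical.

```python
def build_vocab(sentences):
    """Assign an integer ID to every lowercase word seen in the sentences."""
    vocab = {"[PAD]": 0, "[UNK]": 1}  # special tokens, loosely modelled on BERT's
    for sentence in sentences:
        for word in sentence.lower().split():
            vocab.setdefault(word, len(vocab))
    return vocab

def encode(sentence, vocab):
    """Map each word to its ID, falling back to [UNK] for unseen words."""
    return [vocab.get(word, vocab["[UNK]"]) for word in sentence.lower().split()]

vocab = build_vocab(["generative ai is fun"])
ids = encode("Generative AI is transforming", vocab)
print(ids)  # the unseen word "transforming" maps to the [UNK] ID
```

The key idea carries over to the real library: a model never sees raw text, only sequences of integer IDs.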

In [None]:
import torch

# Hugging Face pipelines take a device index: 0 selects the first GPU, -1 the CPU
device = 0 if torch.cuda.is_available() else -1
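
The same device check can be wrapped in a function that also copes with environments where PyTorch is not installed. This is a minimal sketch; `pick_pipeline_device` is a hypothetical helper name, and it assumes the pipeline convention of `0` for the first GPU and `-1` for the CPU.

```python
def pick_pipeline_device():
    """Return 0 (first GPU) if CUDA is usable, otherwise -1 (CPU)."""
    try:
        import torch
    except ImportError:
        # Without PyTorch there is no GPU support to query, so use the CPU.
        return -1
    return 0 if torch.cuda.is_available() else -1

device = pick_pipeline_device()
print("Running on GPU 0" if device == 0 else "Running on CPU")
```

In Colab, a GPU is only available after selecting one under Runtime > Change runtime type.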

## 7. Loading a Pre-trained Model using Transformers

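We will now load a pre-trained model using the Hugging Face `transformers` library. This model will be used for a sequence classification task. Note that `bert-base-uncased` is a general-purpose checkpoint that has not been fine-tuned for classification, so `transformers` attaches a new, randomly initialized classification head and warns that the model should be trained before its predictions are used.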

### Task: Load a Pre-trained BERT Model

We'll load the BERT model, which is widely used in NLP tasks.

In [None]:
# Load a pre-trained BERT model and its tokenizer
model_name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

print(f"Model {model_name} loaded successfully.")

## 8. Tokenizing Input and Model Inference

Now that we've loaded the model and tokenizer, the next step is to prepare some text input for the model. We'll tokenize the input and then pass it through the model to get a prediction.

### Task: Tokenize a Sentence and Get Model Output

In [None]:
import torch

# Tokenize an example input
input_text = "Generative AI is transforming technology."
inputs = tokenizer(input_text, return_tensors="pt")

# Perform inference; no_grad() skips gradient tracking, which is only needed for training
with torch.no_grad():
    outputs = model(**inputs)

# Display the outputs
outputs

### Explanation

In this code:
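- We defined an input sentence, `Generative AI is transforming technology.`
- The `tokenizer` converts this sentence into the format required by the BERT model: token IDs, token type IDs, and an attention mask. The `return_tensors="pt"` argument specifies that the output should be PyTorch tensors.
- We then pass the tokenized inputs to the model using `model(**inputs)`, which unpacks the tokenizer's dictionary into keyword arguments and runs a forward pass.
- The displayed output contains `logits`, the raw, unnormalized score for each class. Because the classification head on `bert-base-uncased` has not been fine-tuned, these scores are not yet meaningful predictions.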

This step demonstrates how you can input text into a pre-trained model and get a prediction.
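
Logits are usually converted into probabilities with the softmax function before they are interpreted. The sketch below shows that step in plain Python; the logit values are made up for illustration and are not output from the model above.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    largest = max(logits)
    # Subtracting the maximum first keeps exp() numerically stable.
    exps = [math.exp(x - largest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

example_logits = [0.2, -0.1]  # hypothetical scores for two classes
probabilities = softmax(example_logits)
predicted_class = probabilities.index(max(probabilities))

print(probabilities, predicted_class)
```

With a fine-tuned classifier, `predicted_class` would be looked up in the model's label mapping to get a human-readable label.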


## 9. Asking Different Questions to the NLP Model

In this section, we will use the `Flan-T5 base` model to perform various NLP tasks, including translation and summarization. While the model can handle these tasks, the results might not always be perfect. We'll explore how to improve them using prompt engineering in the next session.

### Task: Text Translation
Let's use the `Flan-T5 base` model to translate a sentence from English to French.


In [None]:
# Import the high-level pipeline helper
from transformers import pipeline

# Create a translation pipeline backed by Flan-T5 base, on the device selected earlier
translator = pipeline("translation_en_to_fr", model="google/flan-t5-base", device=device)

# Translate a sentence from English to French
translation = translator("Generative AI is transforming the world of technology.")
print("Translation:", translation[0]['translation_text'])

### Explanation

Here, we used the `pipeline` function with the `google/flan-t5-base` model to translate the sentence "Generative AI is transforming the world of technology." from English to French. The pipeline handles tokenization, text generation, and decoding in a single call.

The translation may be accurate for simple sentences, but more complex sentences could result in less precise translations. We'll work on improving this in the next session.
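
T5-style models treat every task as text-to-text, and the task is typically signalled by a short instruction placed in front of the input. The sketch below shows what such a prompt might look like as a plain string. `build_translation_prompt` is a hypothetical helper, and the exact wording of the prefix is an assumption for illustration rather than something taken from this notebook.

```python
def build_translation_prompt(text, source="English", target="French"):
    """Prepend a T5-style task instruction to the text to translate."""
    return f"translate {source} to {target}: {text.strip()}"

prompt = build_translation_prompt(
    "Generative AI is transforming the world of technology."
)
print(prompt)
```

Changing the wording of this instruction is the simplest form of prompt engineering.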

### Task: Text Summarization

Next, let's use the same `Flan-T5 base` model for summarizing a longer text.

In [None]:
# Create a pipeline for summarization using Flan-T5 base
summarizer = pipeline("summarization", model="google/flan-t5-base", device=device)

# Example paragraph to summarize
text = """
summarize:
Generative AI is a field within artificial intelligence focused on creating models that can generate new content.
These models can produce text, images, music, and even code. Companies are increasingly investing in generative AI
to develop innovative products and services, transforming industries and driving technological advancement.
"""

# Generate a summary of the paragraph
summary = summarizer(text, max_length=50, min_length=25, do_sample=False)
print("Summary:", summary[0]['summary_text'])

### Explanation

In this section, we used the `Flan-T5 base` model for text summarization. The same model can be adapted for various tasks, showcasing its versatility.

The summary provided is a concise version of the original text. As with translation, the summarization may not always capture all the nuances of the original content. We will explore how to refine these summaries using prompt engineering in the next session.
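
The triple-quoted paragraph above contains leading, trailing, and embedded newlines. One optional preprocessing step, sketched here as an assumption rather than a requirement of the pipeline, is to collapse that whitespace before passing the text to the model. `clean_for_summary` is a hypothetical helper name.

```python
def clean_for_summary(text):
    """Collapse all runs of whitespace (including newlines) into single spaces."""
    return " ".join(text.split())

raw_text = """
Generative AI is a field within artificial intelligence
focused on creating models that can generate new content.
"""

cleaned = clean_for_summary(raw_text)
print(cleaned)
print("Word count:", len(cleaned.split()))
```

Keep in mind that `max_length` and `min_length` in the summarizer call count tokens, not words, so a word count is only a rough guide.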


## 10. Conclusion and Preview for Next Session

In this session, we got comfortable with Google Colab, loaded a pre-trained BERT model with the `transformers` library, and used the `Flan-T5 base` model for translation and summarization. The `Flan-T5 base` model is flexible, but the results we obtained are not always perfect.

In our next session, we will focus on **prompt engineering**, a technique that allows us to refine these outputs by optimizing the input prompts. By learning how to craft better prompts, we can significantly enhance the performance of these models.

See you in the next session!
