# Understanding Semi-Supervised Learning Algorithms

## Technical requirements

The following are required to run the code in this chapter:
- Python 3.9 or above
- pip
- TensorFlow (with CUDA if you want to train models on GPUs)
    - Keras is installed as a dependency of TensorFlow
- scikit-learn Python library
    - NumPy is installed as a dependency of scikit-learn
- Matplotlib library
- Hugging Face's transformers library
- LangChain library
- Jupyter Notebook, if running the code directly from Jupyter

In [None]:
! python3 -m pip install --upgrade pip

### For M1+ MacBook (64-bit ARM-based processor)

In [None]:
! arch -arm64 pip3 install --upgrade pip
! arch -arm64 pip3 install tensorflow
! arch -arm64 pip3 install -U scikit-learn
! arch -arm64 pip3 install matplotlib

### For Other Computer Systems

In [None]:
! pip3 install --upgrade pip
! pip3 install tensorflow
! pip3 install -U scikit-learn
! pip3 install matplotlib

## 5. Hands-on Fine Tuning in Python

### 5.1 Training pre-trained model
First we import the required libraries.

In [None]:
import numpy as np
import tensorflow as tf
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.datasets import cifar10
from tensorflow.keras import layers, models
import matplotlib.pyplot as plt
from tensorflow.keras.models import Model

Then we load and prepare the data.

In [None]:
def load_data():
    (x_train, y_train), (x_test, y_test) = cifar10.load_data()
    x_train = x_train.astype('float32') / 255
    x_test = x_test.astype('float32') / 255
    # One-hot encode the labels
    y_train = to_categorical(y_train, num_classes=10)
    y_test = to_categorical(y_test, num_classes=10)
    return x_train, y_train, x_test, y_test

x_train, y_train, x_test, y_test = load_data()
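To make the one-hot step concrete, here is a minimal plain-Python sketch of the kind of encoding that `to_categorical` produces. The helper `one_hot` is a hypothetical stand-in written for illustration, not the Keras implementation. One-hot labels like these pair with the `categorical_crossentropy` loss used for the classification models, while the integer rotation labels created later are used with `sparse_categorical_crossentropy`.

```python
def one_hot(labels, num_classes):
    """Turn integer class IDs into one-hot rows (illustrative helper)."""
    rows = []
    for label in labels:
        row = [0.0] * num_classes
        row[label] = 1.0  # mark the position of the true class
        rows.append(row)
    return rows

# Three example class IDs, encoded over the 10 CIFAR-10 classes.
integer_labels = [3, 0, 9]
encoded = one_hot(integer_labels, num_classes=10)

for label, row in zip(integer_labels, encoded):
    print(label, row)
```

Each row contains a single 1.0 at the index of the class, so taking the index of the maximum recovers the original integer label.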

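We then create a function that rotates each image by a random multiple of 90° and returns the number of quarter turns as the label. Predicting this label is the pretext task, and it needs no human annotation.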

In [None]:
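def create_rotated_images(images):
    # Label k in {0, 1, 2, 3} means a rotation of k * 90 degrees
    # (tf.image.rot90 rotates counter-clockwise).
    # np.random.randint excludes the upper bound, so use 4 to cover all four classes.
    angle_labels = np.random.randint(0, 4, size=len(images))
    rotated_images = np.array([
        tf.image.rot90(images[i], k=angle_labels[i]).numpy()
        for i in range(len(images))
    ])
    return rotated_images, angle_labels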

x_train_rotated, y_train_rotated = create_rotated_images(x_train)
x_test_rotated, y_test_rotated = create_rotated_images(x_test)
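The relationship between a label `k` and the rotation it encodes can be checked on a tiny grid. The sketch below is a plain-Python stand-in for `tf.image.rot90` (the helpers `rot90_ccw` and `make_rotation_labels` are hypothetical names introduced here), assuming the same counter-clockwise convention. It also draws labels from all four classes, 0 through 3.

```python
import random

def rot90_ccw(grid, k=1):
    """Rotate a 2-D list counter-clockwise by 90 degrees, k times."""
    for _ in range(k % 4):
        # Transpose, then reverse the row order: one counter-clockwise quarter turn.
        grid = [list(row) for row in zip(*grid)][::-1]
    return grid

def make_rotation_labels(n, seed=None):
    """Draw n labels uniformly from {0, 1, 2, 3} (illustrative helper)."""
    rng = random.Random(seed)
    return [rng.randrange(4) for _ in range(n)]

image = [[1, 2],
         [3, 4]]

for k in range(4):
    print(f"k={k} ({k * 90} degrees): {rot90_ccw(image, k)}")
```

Four quarter turns return the original grid, which is why four labels are enough to describe every rotation used in the pretext task.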

In [None]:
def plot_rotated_images(images, labels):
    plt.figure(figsize=(10, 5))
    for i in range(10):
        idx = np.random.randint(0, images.shape[0])
        plt.subplot(2, 5, i + 1)
        plt.imshow(images[idx])
        plt.title(f'Rotation: {labels[idx]*90}°')
        plt.axis('off')
    plt.tight_layout()
    plt.show()

plot_rotated_images(x_train_rotated, y_train_rotated)

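For the pretext task, we'll use a simple CNN whose softmax output has four units, one for each rotation angle. Because the rotation labels are integers rather than one-hot vectors, the model is compiled with the sparse categorical cross-entropy loss.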

In [None]:
def build_model(input_shape):
    inputs = layers.Input(shape=input_shape)
    x = layers.Conv2D(32, (3, 3), padding='same', activation='relu')(inputs)
    x = layers.BatchNormalization()(x)
    x = layers.Conv2D(32, (3, 3), activation='relu')(x)
    x = layers.MaxPooling2D((2, 2))(x)
    x = layers.Conv2D(64, (3, 3), padding='same', activation='relu')(x)
    x = layers.BatchNormalization()(x)
    x = layers.Conv2D(64, (3, 3), activation='relu')(x)
    x = layers.MaxPooling2D((2, 2))(x)
    x = layers.Conv2D(128, (3, 3), padding='same', activation='relu')(x)
    x = layers.BatchNormalization()(x)
    x = layers.Conv2D(128, (3, 3), activation='relu')(x)
    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Flatten()(x)
    x = layers.Dense(128, activation='relu')(x)
    outputs = layers.Dense(4, activation='softmax')(x)
    
    model = models.Model(inputs=inputs, outputs=outputs)
    model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
    return model

pretext_model = build_model(x_train_rotated[0].shape)
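To see how the feature maps shrink through this network, the following sketch traces the spatial size for a 32x32 CIFAR-10 input. It is plain arithmetic rather than a Keras call, and it assumes stride-1 3x3 convolutions and 2x2 max pooling with Keras defaults, as in `build_model`. The helper names are hypothetical.

```python
def conv_out(size, kernel=3, padding="valid"):
    """Output size of a stride-1 convolution along one spatial axis."""
    return size if padding == "same" else size - kernel + 1

def pool_out(size, pool=2):
    """Output size of non-overlapping max pooling (floor division)."""
    return size // pool

size = 32  # CIFAR-10 images are 32x32 pixels
trace = []
for filters in (32, 64, 128):
    size = conv_out(size, padding="same")   # first conv of the block keeps the size
    size = conv_out(size, padding="valid")  # second conv trims one pixel per side
    trace.append((size, size, filters))
    if filters != 128:                      # the last block has no max pooling
        size = pool_out(size)
        trace.append((size, size, filters))

for shape in trace:
    print(shape)
```

Under these assumptions the last convolution produces 4x4 maps with 128 channels. Global average pooling then reduces each map to a single value, so the `Flatten` layer that follows it receives an already flat 128-value vector.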

Then we train the model as follows.

In [None]:
pretext_model.fit(x_train_rotated, y_train_rotated, epochs=10, batch_size=64,
                    validation_data=(x_test_rotated, y_test_rotated))

### 5.2 Fine tuning classification model

Now that we have a pre-trained model, we can discard its final pooling and dense layers and reuse the convolutional layers as a feature extractor in a model that predicts the CIFAR-10 class labels. We freeze the layers taken from the pre-trained model so that their weights are not updated when we fit the transfer model. For comparison, we will also train another model from scratch and compare the accuracy of the two.

In [None]:
def build_transfer_model(base_model, input_shape, num_classes):
    # Cut the network at the last Conv2D layer (index -5): this drops the
    # global average pooling, flatten, and two dense layers that follow it
    base_model = models.Model(inputs=base_model.input,
                              outputs=base_model.get_layer(index=-5).output)
    
    for layer in base_model.layers:
        layer.trainable = False  # Freeze the convolutional layers

    new_model = models.Sequential([
        base_model,
        layers.Flatten(),
        layers.Dense(256, activation='relu'),
        layers.Dense(num_classes, activation='softmax')
    ])

    new_model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
    return new_model

transfer_model = build_transfer_model(pretext_model, x_train[0].shape, 10)
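The sketch below illustrates which layer `index=-5` selects and roughly how many parameters remain trainable in the new head. The layer names are hypothetical labels that mirror the order of layers in `build_model` (Keras assigns its own names), and the parameter count assumes a 4x4x128 output from the last convolution for 32x32 inputs.

```python
# Layer order of the pretext model; names are illustrative, not Keras's own.
layer_names = [
    "input", "conv_1a", "bn_1", "conv_1b", "pool_1",
    "conv_2a", "bn_2", "conv_2b", "pool_2",
    "conv_3a", "bn_3", "conv_3b",
    "global_avg_pool", "flatten", "dense_128", "dense_4",
]

cut = layer_names[-5]       # layer whose output feeds the new head
dropped = layer_names[-4:]  # layers discarded from the pretext model

def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

flat_features = 4 * 4 * 128  # assumed size of the flattened feature map
head_params = dense_params(flat_features, 256) + dense_params(256, 10)

print("cut at:", cut)
print("dropped:", dropped)
print("trainable head parameters:", head_params)
```

Only the two new dense layers are updated during fine-tuning, because every layer up to the cut point is frozen.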

Next, we fine-tune the transfer model for classification on the original data with class labels.

In [None]:
transfer_model.fit(x_train, y_train, epochs=10, validation_data=(x_test, y_test))

In [None]:
loss, accuracy = transfer_model.evaluate(x_test, y_test)
print(f"Test accuracy: {accuracy * 100:.2f}%")

### 5.3 Training classification model from scratch

Let's train a classification model from scratch for comparison.

In [None]:
def build_scratch_model(image, num_classes):
    scratch_model = build_model(image.shape)
    
    scratch_model = models.Model(inputs=scratch_model.input,
                                  outputs=scratch_model.get_layer(index=-2).output)
    
    scratch_model = models.Sequential([
            scratch_model,
            layers.Dense(num_classes, activation='softmax')
        ])
    
    scratch_model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
    return scratch_model

scratch_model = build_scratch_model(x_train[0], 10)

In [None]:
scratch_model.fit(x_train, y_train, epochs=10, validation_data=(x_test, y_test))

As you will notice, training the model from scratch takes considerably longer per epoch (compare the time per step in the training logs), because every layer is updated rather than only the new dense layers. The accuracy of the two models is comparable, so the transfer model gives a large saving in training time and therefore in computational resources. This shows how beneficial pre-trained models can be and why they apply to many use cases in practice.
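To put numbers on the training-time comparison, one option is to time each `fit` call. The following is a minimal sketch using only the standard library. The `timed` context manager and the `timings` dictionary are hypothetical helpers introduced here, and the workload inside the `with` block is a placeholder for a real training call.

```python
import time
from contextlib import contextmanager

@contextmanager
def timed(label, results):
    """Store the wall-clock duration of the enclosed block under results[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start

timings = {}

# Placeholder workload; in the notebook this would wrap a call such as
# transfer_model.fit(...) or scratch_model.fit(...).
with timed("demo", timings):
    sum(i * i for i in range(100_000))

print(f"demo took {timings['demo']:.4f} s")
```

Wrapping both training calls with different labels gives two durations that can be compared directly, alongside the per-step times that Keras prints.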

### 5.4 Introduction to Hugging Face and LangChain Libraries

#### 5.4.1 Hugging Face _transformers_ Library
To get started, you first need to install the transformers library. It is generally used with a backend such as PyTorch or TensorFlow. Here, we'll show an example using PyTorch. We'll also install the backwards-compatible tf-keras package used by the _transformers_ library.

In [None]:
## For M1+ MacBook (64-bit ARM-based processor)
! arch -arm64 pip3 install transformers torch
! arch -arm64 pip3 install tf-keras

## For Other Computer Systems
# ! pip3 install transformers torch
# ! pip3 install tf-keras

One of the simplest uses of the _transformers_ library is to load a pre-trained model and use it for inference. Let's use a BERT-based model for a sentiment analysis task. We first import the _pipeline_ function from the _transformers_ library. The _pipeline_ function abstracts away many details and makes it easy to load a pre-trained model for NLP tasks. We then pass the task name _sentiment-analysis_ so that a default model trained for sentiment analysis is selected and loaded automatically. Many other tasks are available, such as _text-generation_ (full list: https://huggingface.co/docs/transformers/en/main_classes/pipelines#transformers.pipeline.task), and we can also ask the _pipeline_ to load a specific model. Finally, we apply the classifier returned by the _pipeline_ to a sentence to determine its sentiment.

In [None]:
import tensorflow.keras as tf_keras
from transformers import pipeline

sentiment_pipeline = pipeline('sentiment-analysis')

text = "I love learning new things about AI and I am excited to read the book on Self-Supervised learning."
result = sentiment_pipeline(text)

print(result)

The output is a list of dictionaries, each containing the label (e.g., 'POSITIVE' or 'NEGATIVE') and score (a confidence level between 0 and 1).

Under the hood, the _pipeline_ selects a default model and tokenizer for the specified task. The input sentence is first tokenized to convert it into the input format the model expects, and the tokens are then fed into the model to produce predictions or generated text. In the following code, we explicitly specify the model and its tokenizer: a DistilBERT model fine-tuned for English sentiment classification. We will learn more about tokenizers and the BERT model in Chapter 6.

In [None]:
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_name = "distilbert-base-uncased-finetuned-sst-2-english"
model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

sentiment_pipeline = pipeline('sentiment-analysis', model=model, tokenizer=tokenizer)

text = "I love learning new things about AI and I am excited to read the book on Self-Supervised learning."
result = sentiment_pipeline(text)

print(result)

As you can see, the result is exactly the same. You now know what goes into a pipeline from the Hugging Face _transformers_ library. The library becomes even more valuable when you fine-tune pre-trained models using existing modules from TensorFlow or PyTorch. You can read more about fine-tuning a pre-trained model in TensorFlow with Keras or in native PyTorch in the official Hugging Face documentation: https://huggingface.co/docs/transformers/training
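The last step of a classification pipeline, turning the model's raw scores (logits) into a label and a confidence score, can be sketched in plain Python. This is an illustration under stated assumptions, not the library's code: it assumes a softmax over two classes, the label mapping `{0: "NEGATIVE", 1: "POSITIVE"}`, and made-up logits rather than real model output.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def to_prediction(logits, id2label):
    """Pick the most probable class and report its label and probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return {"label": id2label[best], "score": probs[best]}

id2label = {0: "NEGATIVE", 1: "POSITIVE"}  # assumed label mapping
example_logits = [-2.0, 3.0]               # made-up values, not real model output

prediction = to_prediction(example_logits, id2label)
print(prediction)
```

The returned dictionary has the same two keys, `label` and `score`, as each entry in the list printed by the pipeline.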

#### 5.4.2 LangChain Library
We first need to install the langchain package along with its necessary dependencies. Since LangChain can integrate with several backend providers, such as OpenAI for GPT models, you will also need API keys for those services if you plan to use their models. You can create an OpenAI API account and generate an API access key here: https://platform.openai.com/

In [None]:
## For M1+ MacBook (64-bit ARM-based processor)
! arch -arm64 pip3 install langchain langchain-community langchain-core openai

## For Other Computer Systems
# ! pip3 install langchain langchain-community langchain-core openai

To demonstrate the basic capabilities of LangChain, let's create a simple application that uses an OpenAI GPT model to suggest a book title for a given genre. This example assumes you have an API key for OpenAI. We first import the necessary modules.

In [None]:
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain, SequentialChain

We set the API key for OpenAI as the environment variable _OPENAI_API_KEY_, which the _ChatOpenAI_ class instance reads directly. The model name is passed to the _ChatOpenAI_ class to initialize the language model.

In [None]:
from getpass import getpass
import os
OPENAI_API_KEY = getpass()
os.environ["OPENAI_API_KEY"] = OPENAI_API_KEY

In [None]:
from langchain_community.chat_models import ChatOpenAI
llm = ChatOpenAI(model_name="gpt-3.5-turbo-0125")

_PromptTemplate_ structures the input to the language model so that it fits the task. For our example, we want a book title suggestion for a given genre. The template contains a single placeholder, _{genre}_, which is filled in when the prompt is built.

In [None]:
prompt_template_title = PromptTemplate(
    input_variables=['genre'],
    template="Suggest a book title from the genre {genre}."
)
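Conceptually, filling a prompt template is string substitution. The sketch below uses Python's built-in `str.format` to show the idea. It is an analogy for what happens to the `{genre}` placeholder, not LangChain's implementation, and `fill` is a hypothetical helper.

```python
def fill(template, **variables):
    """Substitute named placeholders in a template string."""
    return template.format(**variables)

template = "Suggest a book title from the genre {genre}."
prompt_text = fill(template, genre="Romance")

print(prompt_text)
```

The resulting string is what gets sent to the language model as the prompt.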

The _LLMChain_ class in LangChain is a type of chain that runs queries against an LLM. Chains in LangChain orchestrate the flow: they send formatted prompts to the language model and process its outputs.

In [None]:
title_chain = LLMChain(
    llm=llm,
    prompt=prompt_template_title,
    output_key="book_title"
)

The _run()_ method takes the genre as a keyword argument, fills in the prompt template, passes the prompt through the configured chain, and returns the generated text.

In [None]:
genre = "Romance"
answer = title_chain.run(genre=genre)

print("Answer:", answer)

We can subsequently chain the output from the LLM as an input to another prompt and get the final result as follows.

In [None]:
prompt_template_content = PromptTemplate(
    input_variables=['book_title'],
    template="Generate a sentence for the book titled {book_title}."
)

content_chain = LLMChain(
    llm=llm,
    prompt=prompt_template_content,
    output_key="book_content"
)

chain = SequentialChain(
    chains=[title_chain, content_chain],
    input_variables=['genre'],
    output_variables=['book_title', 'book_content']
)

content_from_seq_chain = chain({'genre':'Romance'})
print(content_from_seq_chain)
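The data flow of a sequential chain can be sketched without any API calls. In the plain-Python illustration below, each step fills its template from a shared dictionary, calls a model, and stores the reply under its output key so the next step can use it. The functions `make_step`, `run_sequence`, and `fake_llm` are hypothetical stand-ins and not part of LangChain. `fake_llm` only echoes its prompt, so the output shows the wiring rather than real generated text.

```python
def make_step(template, output_key, llm):
    """Build a step that formats a prompt, calls the model, and stores the reply."""
    def step(state):
        prompt = template.format(**state)  # unused keys in state are ignored
        return {**state, output_key: llm(prompt)}
    return step

def run_sequence(steps, inputs):
    """Run the steps in order, threading the growing state through them."""
    state = dict(inputs)
    for step in steps:
        state = step(state)
    return state

def fake_llm(prompt):
    """Stand-in for a language model: echoes the prompt it received."""
    return f"<response to: {prompt}>"

title_step = make_step(
    "Suggest a book title from the genre {genre}.", "book_title", fake_llm
)
content_step = make_step(
    "Generate a sentence for the book titled {book_title}.", "book_content", fake_llm
)

result = run_sequence([title_step, content_step], {"genre": "Romance"})
print(result)
```

The final dictionary holds the original input together with both intermediate outputs, the same set of keys requested through `output_variables` plus the input `genre`.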