# Transfer learning using ART

Hi! In this tutorial we will walk you through using ART for transfer learning. We will use the [Yelp Reviews](https://huggingface.co/datasets/yelp_review_full) dataset and the `bert-base-cased` model from HuggingFace. We will train a classifier to predict the sentiment of a review (one of five star-rating classes) and use ART to structure the transfer learning. Most of the code follows [HF's tutorial](https://huggingface.co/docs/transformers/training), with some modifications to make it work with ART.

We'll do everything in a script. Your task is to fill in `run.py` according to the instructions in this tutorial.

In [None]:
!pip install art nltk wordcloud

## Data Analysis

First, we need to download the data and analyze it. We'll use the `datasets` library from HuggingFace to do this, and we'll wrap the dataset in Lightning's `DataModule` to make it easier to use with PyTorch Lightning. We prepared the dataset for you in `dataset.py`; check it out there. The main function in `run.py` is ready to download the data and show you a sample from it:

In [None]:
!python run.py

Now we can become one with the data. We want some statistics that will be helpful later. We prepared a data analysis step for you in `steps.py`.
Now it's your turn! Fill in the `...` placeholders in `steps.py`.

Once you've done that, modify the `main()` function as follows:
* read the data
* start the ART project
* add our data analysis step, together with checks that its results exist
* run all the steps (for now we have just one)

<details>

<summary>Correct TextDataAnalysis</summary>

```python
    def do(self, previous_states):
        targets = []
        texts = []

        # Loop through batches in the YelpReviews datamodule train dataloader
        for batch in self.datamodule.train_dataloader():
            # 'label' holds the review score of each example
            targets.extend(batch['label'])
            # 'text' holds the review text of each example
            texts.extend(batch['text'])

        # Calculate the number of unique classes (review scores) in the targets
        number_of_classes = len(np.unique(targets))

        # Now tell me what the scores are
        class_names = [str(i) for i in sorted(np.unique(targets))]

        # Create a dictionary of class names and their counts
        targets_ints = [int(i) for i in targets]
        class_counts = Counter(targets_ints)

        # count number of unique words
        unique_words = set()
        for text in texts:
            unique_words.update(text.split())
        number_of_unique_words = len(unique_words)

        # Create a word cloud
        wordcloud = WordCloud().generate(' '.join(texts))
        fig = plt.figure(figsize=(12, 12))
        plt.imshow(wordcloud, interpolation='bilinear')
        plt.axis("off")
        MatplotLibSaver().save(
            fig, self.get_step_id(), self.name, "wordcloud"
        )

        self.results.update(
            {
                "number_of_classes": number_of_classes,
                "class_names": class_names,
                "number_of_reviews_in_each_class": class_counts,
            }
        )
```
</details>
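
To get a feel for what the `do` method above collects, the same statistics can be tried out on a handful of made-up reviews. This is only a sketch: the `reviews` list is invented for illustration and stands in for the batches coming from the datamodule.

```python
from collections import Counter

# Made-up (label, text) pairs standing in for batches from the datamodule.
reviews = [
    (4, "Great food and great service"),
    (0, "Terrible service"),
    (4, "Great coffee"),
    (2, "Average food"),
]

labels = [label for label, _ in reviews]
texts = [text for _, text in reviews]

# Class statistics: how many examples carry each label.
class_counts = Counter(labels)
class_names = [str(label) for label in sorted(class_counts)]
number_of_classes = len(class_counts)

# Whitespace tokenisation, as in the step: case is kept,
# so "Great" and "great" count as two different words.
unique_words = set()
for text in texts:
    unique_words.update(text.split())

print(number_of_classes, class_names, dict(class_counts), len(unique_words))
```

Because the split is on whitespace only, the unique-word count is an upper bound on the real vocabulary; lowercasing and stripping punctuation would give a smaller number.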

<details>

<summary>Correct main()</summary>

```py
def main():
    data = YelpReviews()
    project = ArtProject("yelpreviews", data)
    project.add_step(TextDataAnalysis(), [
                     CheckResultExists("number_of_classes"),
                     CheckResultExists("class_names"),
                     CheckResultExists("number_of_reviews_in_each_class")])
    project.run_all()
```
</details>

In [None]:
!python run.py

If you see the output below and `wordcloud.png` in the checkpoints folder, we're good!
```
Steps status:
data_analysis_Data analysis: Completed. Results:
        number_of_classes: 5
        class_names: ['0', '1', '2', '3', '4']
        number_of_reviews_in_each_class: Counter({1: 240, 2: 208, 4: 189, 0: 189, 3: 174})
```


## Baselines

Now that we have the data, we can work on the models that will solve the sentiment analysis problem! We start with a simple baseline. But before that, we need to define the metrics that we'll use throughout the entire experiment. To do so, add the following lines after adding the data analysis step:



In [None]:
from torchmetrics import Accuracy, Precision, Recall

# get the number of classes calculated in the previous step
NUM_CLASSES = project.get_step(0).get_latest_run()["number_of_classes"]
# define metrics (recent torchmetrics versions require the `task` argument)
METRICS = [
    Accuracy(num_classes=NUM_CLASSES, average='macro', task='multiclass'),
    Precision(num_classes=NUM_CLASSES, average='macro', task='multiclass'),
    Recall(num_classes=NUM_CLASSES, average='macro', task='multiclass'),
]
project.register_metrics(METRICS)  # register metrics in the project

!python run.py

<details>

<summary>Correct main()</summary>

```py
def main():
    data = YelpReviews()
    project = ArtProject("yelpreviews", data)

    project.add_step(TextDataAnalysis(), [
                     CheckResultExists("number_of_classes"),
                     CheckResultExists("class_names"),
                     CheckResultExists("number_of_reviews_in_each_class")])
    # get the number of classes calculated in the previous step
    NUM_CLASSES = project.get_step(0).get_latest_run()["number_of_classes"]
    METRICS = [
        Accuracy(num_classes=NUM_CLASSES, average='macro', task='multiclass'),
        Precision(num_classes=NUM_CLASSES, average='macro', task='multiclass'),
        Recall(num_classes=NUM_CLASSES, average='macro', task='multiclass'),
    ]  # define metrics
    project.register_metrics(METRICS)  # register metrics in the project

    project.run_all()
```
</details>

At this stage you should see that the first step was skipped, because we have already executed it.

We prepared one baseline for you in `models/simple_baseline.py`. Add it to the project and run it by adding the following lines to the `main()` function:

In [None]:
from models.simple_baseline import HeuristicBaseline

baseline = HeuristicBaseline()
project.add_step(
    step=EvaluateBaseline(baseline),
    checks=[CheckScoreExists(metric=metric) for metric in METRICS],
)
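
The contents of `models/simple_baseline.py` are not reproduced in this tutorial. Purely for intuition, a heuristic baseline for five rating classes could be as simple as counting sentiment words. Everything in the sketch below (the word lists, the function name, the clamping rule) is an assumption made for illustration and not the actual `HeuristicBaseline`.

```python
# Hypothetical word lists; a real baseline would need a far larger lexicon.
POSITIVE = {"great", "good", "amazing", "love", "delicious"}
NEGATIVE = {"bad", "terrible", "awful", "rude", "worst"}


def heuristic_rating(text: str) -> int:
    """Map a review to a class in 0..4 from its positive/negative word balance."""
    words = [word.strip(".,!?").lower() for word in text.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    # Clamp the balance to [-2, 2] and shift it so that 2 is the neutral class.
    return max(-2, min(2, score)) + 2


positive_example = heuristic_rating("Great food, amazing staff!")
negative_example = heuristic_rating("Rude waiter.")
neutral_example = heuristic_rating("We had lunch.")
print(positive_example, negative_example, neutral_example)
```

A rule like this needs no training, which is what makes it useful as a floor: any trained model should beat it on the registered metrics.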

<details>

<summary>Correct main()</summary>

```py
def main():
    data = YelpReviews()
    project = ArtProject("yelpreviews", data)

    project.add_step(TextDataAnalysis(), [
                     CheckResultExists("number_of_classes"),
                     CheckResultExists("class_names"),
                     CheckResultExists("number_of_reviews_in_each_class")])
    # number of classes reported by the data analysis step
    NUM_CLASSES = 5
    METRICS = [
        Accuracy(num_classes=NUM_CLASSES, average='macro', task='multiclass'),
        Precision(num_classes=NUM_CLASSES, average='macro', task='multiclass'),
        Recall(num_classes=NUM_CLASSES, average='macro', task='multiclass')
    ]  # define metrics
    project.register_metrics(METRICS)  # register metrics in the project

    baseline = HeuristicBaseline()
    project.add_step(
        step=EvaluateBaseline(baseline),
        checks=[CheckScoreExists(metric=metric) for metric in METRICS],
    )

    project.run_all()
```
</details>

In [None]:
!python run.py

The correct output should look like this:
```
Steps status:

data_analysis_Data analysis: Skipped. Results:
        number_of_classes: 5
        class_names: ['0', '1', '2', '3', '4']
        number_of_reviews_in_each_class: {'4': 189, '1': 240, '3': 174, '0': 189, '2': 208}

        
HeuristicBaseline_2_Evaluate Baseline: Completed. Results:
        MulticlassAccuracy-HeuristicBaseline-validate-Evaluate Baseline: 0.30702152848243713
        MulticlassPrecision-HeuristicBaseline-validate-Evaluate Baseline: 0.3316725790500641
        MulticlassRecall-HeuristicBaseline-validate-Evaluate Baseline: 0.30702152848243713
```
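
Note that accuracy and recall are identical in the output above. With `average='macro'`, each metric is computed per class and then averaged with equal weight, and our understanding is that macro-averaged multiclass accuracy in torchmetrics is the mean of per-class recalls (treat this as an assumption and check the torchmetrics docs for your version). The plain-Python sketch below shows macro precision and recall on toy labels; the function name and the data are made up for illustration.

```python
from collections import Counter


def macro_precision_recall(targets, preds, num_classes):
    """Return (macro precision, macro recall) over classes 0..num_classes-1."""
    true_pos = Counter(t for t, p in zip(targets, preds) if t == p)
    pred_count = Counter(preds)
    target_count = Counter(targets)
    # Classes with an empty denominator contribute 0 in this sketch.
    precision = [
        true_pos[c] / pred_count[c] if pred_count[c] else 0.0
        for c in range(num_classes)
    ]
    recall = [
        true_pos[c] / target_count[c] if target_count[c] else 0.0
        for c in range(num_classes)
    ]
    return sum(precision) / num_classes, sum(recall) / num_classes


targets = [0, 0, 1, 1, 2, 2]
preds = [0, 1, 1, 1, 2, 0]
macro_p, macro_r = macro_precision_recall(targets, preds, num_classes=3)
print(f"macro precision={macro_p:.4f}, macro recall={macro_r:.4f}")
```

Macro averaging treats every class equally regardless of how many reviews it has, which suits this dataset because the class counts found in the data analysis step are not perfectly balanced.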

## Training the proper model...

... but first in experimental mode:
* Check loss on init with frozen backbone (only the classification head is trained)
* Overfitting one batch with frozen backbone
* Overfitting entire dataset with frozen backbone
* Check loss on init with unfrozen backbone
* Overfitting one batch with unfrozen backbone
* Overfitting entire dataset with unfrozen backbone
* Training on entire dataset - first with frozen backbone, then with unfrozen backbone and reduced learning rate
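
For the loss-on-init checks, it helps to know what value to expect. A freshly initialised classification head should predict roughly uniform probabilities, so the cross-entropy should start near ln(5) ≈ 1.61 for our five classes. This is a general property of cross-entropy rather than anything ART-specific; the helper below is a hand-written sketch, not part of ART or PyTorch.

```python
import math

NUM_CLASSES = 5  # value reported by the data analysis step


def cross_entropy(logits, target):
    """Cross-entropy of one example, computed from raw logits."""
    max_logit = max(logits)  # subtract the max for numerical stability
    log_sum_exp = max_logit + math.log(sum(math.exp(x - max_logit) for x in logits))
    return log_sum_exp - logits[target]


# All-zero logits correspond to a uniform prediction over the classes.
loss_at_init = cross_entropy([0.0] * NUM_CLASSES, target=3)
expected_initial_loss = math.log(NUM_CLASSES)
print(f"{loss_at_init:.4f} vs {expected_initial_loss:.4f}")
```

If the measured loss on the first batch is far from this value, something is likely off with the labels, the head initialisation, or the loss reduction.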


In [1]:
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained("bert-base-cased", num_labels=5)

Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-cased and are newly initialized: ['classifier.weight', 'classifier.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


In [6]:
[name for name, _ in model.named_parameters()][:5]

['bert.embeddings.word_embeddings.weight',
 'bert.embeddings.position_embeddings.weight',
 'bert.embeddings.token_type_embeddings.weight',
 'bert.embeddings.LayerNorm.weight',
 'bert.embeddings.LayerNorm.bias']
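
The parameter names suggest a simple way to freeze the backbone: everything prefixed with `bert.` belongs to the pretrained encoder, while the newly initialised `classifier.weight` and `classifier.bias` form the head. The sketch below assumes this prefix convention holds for the whole model; `freeze_backbone` and `TinyStandIn` are hypothetical names, and the stand-in only mimics `named_parameters()` so that the example runs without downloading BERT.

```python
from types import SimpleNamespace


def freeze_backbone(model, backbone_prefix="bert."):
    """Disable gradients for backbone parameters; return the names left trainable."""
    trainable = []
    for name, param in model.named_parameters():
        # Backbone parameters are frozen, all others are (re-)enabled.
        param.requires_grad = not name.startswith(backbone_prefix)
        if param.requires_grad:
            trainable.append(name)
    return trainable


class TinyStandIn:
    """Stand-in exposing named_parameters() so the sketch runs without BERT."""

    def __init__(self):
        names = [
            "bert.embeddings.word_embeddings.weight",
            "bert.embeddings.LayerNorm.bias",
            "classifier.weight",
            "classifier.bias",
        ]
        self._params = [(name, SimpleNamespace(requires_grad=True)) for name in names]

    def named_parameters(self):
        return iter(self._params)


stand_in = TinyStandIn()
trainable_names = freeze_backbone(stand_in)
print(trainable_names)
```

The same function should work on the `model` loaded above, since it only relies on `named_parameters()` and the `requires_grad` attribute. Unfreezing for the later stages amounts to setting `requires_grad = True` on the backbone parameters again.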