# Fine-Tuning
Fine-tuning takes a pretrained model and continues training it on your own task-specific data, so that it performs better on that task.

# Processing the data

In [3]:
import torch
from torch.optim import AdamW
from transformers import AutoTokenizer, AutoModelForSequenceClassification

In [4]:
checkpoint = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint)

Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


In [5]:
sequences = [
    "I've been waiting for a HuggingFace course my whole life.",
    "This course is amazing!",
]

In [6]:
batch = tokenizer(sequences, padding=True, truncation=True, return_tensors="pt")
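With `padding=True`, the tokenizer pads the shorter sequence to the length of the longest one in the batch and returns an attention mask marking which positions are real tokens. The sketch below imitates that behavior in plain Python. The token IDs are made-up illustrative values, and `pad_batch` is a hypothetical helper, not part of the `transformers` API.

```python
# Illustrative token IDs only; these are not real tokenizer output.
PAD_ID = 0  # bert-base-uncased uses ID 0 for the [PAD] token


def pad_batch(sequences, pad_id=PAD_ID):
    """Right-pad every sequence to the length of the longest one."""
    max_len = max(len(seq) for seq in sequences)
    input_ids = [seq + [pad_id] * (max_len - len(seq)) for seq in sequences]
    # 1 marks a real token, 0 marks padding the model should ignore.
    attention_mask = [
        [1] * len(seq) + [0] * (max_len - len(seq)) for seq in sequences
    ]
    return {"input_ids": input_ids, "attention_mask": attention_mask}


toy_batch = pad_batch([[101, 7, 8, 9, 102], [101, 7, 102]])
```

The real tokenizer call additionally truncates sequences that exceed the model's maximum length (`truncation=True`) and converts the lists to PyTorch tensors (`return_tensors="pt"`).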

In [7]:
# New step: add labels so the model returns a loss along with the logits
batch["labels"] = torch.tensor([1, 1])

In [8]:
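optimizer = AdamW(model.parameters())  # default hyperparameters
loss = model(**batch).loss  # forward pass; the loss is computed because the batch contains "labels"
loss.backward()  # backpropagate to compute gradients
optimizer.step()  # apply a single weight update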

In [13]:
loss

tensor(0.5749, grad_fn=<NllLossBackward0>)
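The `NllLossBackward0` in the output indicates that the loss is a negative log-likelihood, which is how cross-entropy over the logits is computed for single-label classification. As a rough sketch of that computation, the plain-Python function below averages the negative log-probability of the correct class over a batch. The logits are hypothetical values chosen for illustration, not the ones the model produced above.

```python
import math


def cross_entropy(logits, labels):
    """Mean negative log-likelihood of the correct class over a batch."""
    total = 0.0
    for row, label in zip(logits, labels):
        # Log-sum-exp with the row maximum subtracted for numerical stability.
        row_max = max(row)
        log_norm = row_max + math.log(sum(math.exp(x - row_max) for x in row))
        # -log softmax(row)[label] == log_norm - row[label]
        total += log_norm - row[label]
    return total / len(labels)


toy_logits = [[0.2, 0.4], [0.3, -0.1]]  # hypothetical logits, one row per sequence
toy_labels = [1, 1]                     # same labels as in the batch above
toy_loss = cross_entropy(toy_logits, toy_labels)
```

A lower value means the model assigns more probability to the labeled class. For two classes, logits that are equal in every row give a loss of `log(2)`, about 0.693.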

In [10]:
optimizer

AdamW (
Parameter Group 0
    amsgrad: False
    betas: (0.9, 0.999)
    capturable: False
    differentiable: False
    eps: 1e-08
    foreach: None
    fused: None
    lr: 0.001
    maximize: False
    weight_decay: 0.01
)
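To make the hyperparameters in this printout more concrete, here is a minimal sketch of one AdamW update for a single scalar parameter, using the same defaults (`lr`, `betas`, `eps`, `weight_decay`). It follows the standard AdamW update rule and ignores options such as `amsgrad` and `maximize`. `adamw_step` is a hypothetical helper written for illustration, not PyTorch's implementation.

```python
import math


def adamw_step(param, grad, m, v, t,
               lr=1e-3, betas=(0.9, 0.999), eps=1e-8, weight_decay=0.01):
    """Apply one AdamW update to a scalar parameter (t is the 1-based step count)."""
    beta1, beta2 = betas
    # Decoupled weight decay: shrink the parameter directly, not via the gradient.
    param = param * (1 - lr * weight_decay)
    # Exponential moving averages of the gradient and of its square.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad
    # Bias correction compensates for m and v starting at zero.
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


new_param, m, v = adamw_step(param=1.0, grad=0.5, m=0.0, v=0.0, t=1)
```

On the first step the bias-corrected ratio `m_hat / sqrt(v_hat)` has magnitude close to 1, so the parameter moves by roughly `lr` regardless of the gradient's scale.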

In [11]:
outputs = model(**batch)
predictions = torch.argmax(outputs.logits, dim=-1)

In [12]:
predictions

tensor([1, 0])
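`torch.argmax(..., dim=-1)` picks, for each sequence, the index of the largest logit, and that index is the predicted class. A plain-Python equivalent is sketched below. The logits are hypothetical values chosen so that the result has the same form as the output above.

```python
def argmax_rows(logits):
    """Return the index of the largest value in each row."""
    return [max(range(len(row)), key=row.__getitem__) for row in logits]


toy_predictions = argmax_rows([[-0.3, 0.6], [0.2, -0.1]])  # -> [1, 0]
```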

# Loading a dataset from the Hub

In [1]:
from datasets import load_dataset

