Here's a simplified example of how DeepSeek might be fine-tuned for a classification task without any sensitive data. Before we begin, let's install the Python packages the code needs: transformers for model handling, torch for tensor processing, and accelerate, which the Trainer API requires when used with PyTorch.

### Install the required libraries by running the following command:

In [1]:
!pip install transformers torch accelerate

#### Now, you can import the necessary libraries for using the Deepseek model:

In [None]:
import torch
from torch.utils.data import Dataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification, Trainer, TrainingArguments

Loading pre-trained Deepseek model and tokenizer: We load the pre-trained Deepseek tokenizer and model to use for sentiment classification:

AutoTokenizer.from_pretrained: Loads the pre-trained Deepseek tokenizer

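AutoModelForSequenceClassification.from_pretrained: Loads the pre-trained Deepseek model and adds a sequence classification head on top. Because the checkpoint is a base language model, that head is newly initialized and only becomes useful after fine-tuning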

In [None]:
tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/deepseek-llm-7b-base", trust_remote_code=True)
model = AutoModelForSequenceClassification.from_pretrained(
    "deepseek-ai/deepseek-llm-7b-base",
    num_labels=2,  # Binary classification
    trust_remote_code=True
)

# Set padding token if not already set
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Keep the model config in sync with the tokenizer in every case: the
# classification head uses pad_token_id to locate the last real token
model.config.pad_token_id = tokenizer.pad_token_id
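
Before loading a model of this size, it helps to estimate how much memory full fine-tuning might need. The sketch below is back-of-the-envelope arithmetic only, under assumptions that do not come from the original notebook: roughly 7 billion parameters, weights and gradients held in 32-bit floats, and an Adam-style optimizer that keeps two extra values per parameter. Activation memory is ignored, so real usage will be higher.

```python
# Rough memory estimate for full fine-tuning. Every figure here is an
# assumption for illustration, not a measurement of the actual model.
PARAMS = 7e9        # assumed parameter count for a "7B" model
BYTES_FP32 = 4      # bytes per 32-bit float


def gib(n_bytes: float) -> float:
    """Convert a byte count to gibibytes."""
    return n_bytes / 2**30


weights = PARAMS * BYTES_FP32
gradients = PARAMS * BYTES_FP32
adam_states = PARAMS * BYTES_FP32 * 2   # first and second moment estimates
total = weights + gradients + adam_states

for name, size in [("weights", weights), ("gradients", gradients),
                   ("optimizer states", adam_states), ("total", total)]:
    print(f"{name:>16}: {gib(size):6.1f} GiB")
```

Under these assumptions the total is 16 bytes per parameter, which is why the training settings later in this example use a batch size of 1 together with memory-saving options.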

Example text and data tokenization: Here we define two example text samples and their corresponding sentiment labels, and tokenize the input text to convert it into a format suitable for Deepseek.

tokenizer: Converts the text into token IDs that Deepseek can process

padding=True: Pads every text in the batch to the length of the longest one

truncation=True: Truncates longer texts to fit Deepseek's maximum input length

return_tensors="pt": Returns PyTorch tensors for compatibility with the model
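
To make the padding and truncation options above concrete, here is a toy sketch in plain Python. The helper `pad_and_truncate` is hypothetical and is not the tokenizer's implementation. It assumes right-side padding with a pad ID of 0 and uses made-up token IDs, whereas a real tokenizer may pad on either side and uses its own vocabulary.

```python
def pad_and_truncate(batch, max_length, pad_id=0):
    """Toy stand-in for tokenizer(..., padding=True, truncation=True).

    batch: list of token-ID lists (made-up IDs, not real vocabulary entries).
    """
    # truncation=True: cut every sequence to at most max_length tokens
    truncated = [ids[:max_length] for ids in batch]
    # padding=True: pad up to the longest sequence left in the batch
    longest = max(len(ids) for ids in truncated)
    input_ids = [ids + [pad_id] * (longest - len(ids)) for ids in truncated]
    # attention_mask marks real tokens with 1 and padding with 0
    attention_mask = [[1] * len(ids) + [0] * (longest - len(ids))
                      for ids in truncated]
    return {"input_ids": input_ids, "attention_mask": attention_mask}


batch = [[11, 12, 13], [21, 22, 23, 24, 25, 26]]
encoded = pad_and_truncate(batch, max_length=5)
print(encoded["input_ids"])       # [[11, 12, 13, 0, 0], [21, 22, 23, 24, 25]]
print(encoded["attention_mask"])  # [[1, 1, 1, 0, 0], [1, 1, 1, 1, 1]]
```

The attention mask is what lets the model ignore the padded positions, which is why it is passed alongside the token IDs.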

In [None]:
texts = ["I love machine learning.", "This is an amazing tutorial on Deeplearning."]
# Labels for sentiment (1 = positive, 0 = negative). These are toy values that
# only demonstrate the format; real training needs many correctly labeled examples.
labels = [1, 0]
inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")


# Custom Dataset class that returns each example as a dict, which is the
# format the Trainer passes to the model as keyword arguments
class CustomDataset(Dataset):
    def __init__(self, inputs, labels):
        self.input_ids = inputs['input_ids']
        self.attention_mask = inputs['attention_mask']
        self.labels = torch.tensor(labels)

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        return {
            'input_ids': self.input_ids[idx],
            'attention_mask': self.attention_mask[idx],
            'labels': self.labels[idx]
        }

#### Prepare the dataset for training: We create a dataset from the tokenized inputs and the labels:

In [None]:
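# Use the CustomDataset defined above: the Trainer expects dict-style items
# (input_ids, attention_mask, labels), which a plain TensorDataset of tuples does not provide
train_data = CustomDataset(inputs, labels)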

Next, define training arguments. We specify the settings for the training process, such as the number of epochs, batch size, and logging:

In [None]:
training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=3,
    per_device_train_batch_size=1,  # Reduced batch size due to model size
    logging_dir="./logs",
    logging_steps=10,
    save_steps=50,
    learning_rate=1e-5,
    gradient_accumulation_steps=16,  # Add gradient accumulation for large model
    fp16=True,  # Enable mixed precision training
    gradient_checkpointing=True  # Enable gradient checkpointing to save memory
)
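
The batch size and gradient accumulation settings above interact: gradients from several small batches are summed before each optimizer update. The short calculation below shows the resulting effective batch size. The single-device count is an assumption, since the notebook does not say how many GPUs are used.

```python
# Values taken from the TrainingArguments above
per_device_train_batch_size = 1
gradient_accumulation_steps = 16

# Assumption: training runs on a single device
num_devices = 1

# Number of examples that contribute to each optimizer update
effective_batch_size = (
    per_device_train_batch_size * gradient_accumulation_steps * num_devices
)
print(f"Effective batch size: {effective_batch_size}")  # Effective batch size: 16
```

With only two toy examples, the dataset is smaller than one effective batch, so these settings are only meaningful once a realistically sized dataset is used.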

#### Next, we fine-tune the Deepseek model with the prepared training data using the Trainer API:

In [None]:
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_data
)

trainer.train()