## Step 1: Install Dependencies and Set Up Environment

Run the cell below to install the required Python libraries.

**Important:** Before running the cells in the later steps, make sure you have uploaded your data and model files to the correct locations in the Colab file explorer:

1.  **Dataset Files**: Create a `dataset/ml-1m` directory and upload your `.pkl` files there. The final paths should look like this:
    - `/content/dataset/ml-1m/train_ood2.pkl`
    - `/content/dataset/ml-1m/valid_ood2.pkl`
    - `/content/dataset/ml-1m/test_ood2.pkl`

2.  **Model Code**: Upload the `minigpt4` and `CoRA` folders from the project. The structure should be:
    - `/content/minigpt4/...`
    - `/content/CoRA/...`

3.  **Vicuna Weights**: Upload the Vicuna weights to a directory named `vicuna_weight_working`:
    - `/content/vicuna_weight_working/...`

4.  **Pre-trained Model**: Upload the `mf_movielens_best.pth` file to the following path:
    - `/content/dataset/pretrained/mf_movielens_best.pth`
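
The checklist above can be verified before training starts. The following is a minimal sketch that only tests whether each path exists; the helper name `find_missing` and the `REQUIRED` list are illustrative and not part of the project, and the root is assumed to be `/content` as on Colab.

```python
from pathlib import Path

# Paths copied from the checklist above, relative to the Colab root.
REQUIRED = [
    "dataset/ml-1m/train_ood2.pkl",
    "dataset/ml-1m/valid_ood2.pkl",
    "dataset/ml-1m/test_ood2.pkl",
    "minigpt4",
    "CoRA",
    "vicuna_weight_working",
    "dataset/pretrained/mf_movielens_best.pth",
]


def find_missing(root, required=REQUIRED):
    """Return the required paths (relative to root) that do not exist."""
    root = Path(root)
    return [rel for rel in required if not (root / rel).exists()]


# Assumption: everything was uploaded under /content.
missing = find_missing("/content")
if missing:
    print("Missing uploads:")
    for rel in missing:
        print("  -", rel)
else:
    print("All expected files and folders are in place.")
```

This check does not inspect the contents of the folders, so an empty `vicuna_weight_working` directory would still pass.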

In [None]:
!pip install accelerate==0.16.0 aiohttp==3.8.4 aiosignal==1.3.1 async-timeout==4.0.2 attrs==22.2.0 bitsandbytes==0.37.0 cchardet==2.1.7 chardet==5.1.0 contourpy==1.0.7 cycler==0.11.0 filelock==3.9.0 fonttools==4.38.0 frozenlist==1.3.3 huggingface-hub==0.13.4 importlib-resources==5.12.0 kiwisolver==1.4.4 matplotlib==3.7.0 multidict==6.0.4 openai==0.27.0 packaging==23.0 psutil==5.9.4 pycocotools==2.0.6 pyparsing==3.0.9 python-dateutil==2.8.2 pyyaml==6.0 regex==2022.10.31 tokenizers==0.13.2 tqdm==4.64.1 transformers==4.28.0 timm==0.6.13 spacy==3.5.1 webdataset==0.2.48 scikit-learn==1.2.2 scipy==1.10.1 yarl==1.8.2 zipp==3.14.0 omegaconf==2.3.0 opencv-python==4.7.0.72 iopath==0.1.10 decord==0.6.0 tenacity==8.2.2 peft pycocoevalcap sentence-transformers umap-learn notebook gradio==3.24.1 gradio-client==0.0.8 wandb
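
After the install finishes, it can be useful to confirm that the pinned versions are the ones actually present, for example if pip reported a dependency conflict. The sketch below checks a handful of the pins from the install command using the standard library; the `PINS` selection and the helper names are illustrative choices, not part of the project.

```python
from importlib import metadata

# A few of the pins from the install command above; extend as needed.
PINS = {
    "transformers": "4.28.0",
    "tokenizers": "0.13.2",
    "accelerate": "0.16.0",
    "bitsandbytes": "0.37.0",
}


def _installed_version(name):
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pins, get_version=_installed_version):
    """Return {package: (expected, found)} for every pin that is not satisfied."""
    mismatches = {}
    for name, expected in pins.items():
        found = get_version(name)
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


# Print one line per pin that does not match the environment.
for name, (expected, found) in find_mismatches(PINS).items():
    print(f"{name}: expected {expected}, found {found}")
```

No output means every listed pin matches the installed version.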

## Step 2: Create the YAML Configuration File

Run this cell to write the training configuration to `/content/CoRA/train_configs/plora_pretrain_mf_ood_movie.yaml`.

In [None]:
import os

config_str = """
model:
  arch: mini_gpt4rec_vx
  model_type: pretrain_vicuna
  freeze_rec: True
  freeze_proj: True
  freeze_lora: False
  freeze_bias: True
  max_txt_len: 1024
  proj_token_num: 1
  proj_drop: 0
  proj_mid_times: 10
  end_sym: "###"
  prompt_path: "CoRA/prompts/tallrec_movie.txt"
  prompt_template: '{}'
  llama_model: "vicuna_weight_working"
  user_num: -100
  item_num: -100
  ans_type: 'v2'
  lora_config:
    use_lora: True
    r: 16
    alpha: 32
    target_modules: ["q_proj", "v_proj", "o_proj", "k_proj"]
    dropout: 0.05
  rec_config:
    user_num: -100
    item_num: -100
    embedding_size: 256
    pretrained_path: "/content/dataset/pretrained/mf_movielens_best.pth"
  ckpt: null

datasets:
  movie_ood:
    path: /content/dataset/ml-1m/
    data_type: default
    build_info:
      storage: /content/dataset/ml-1m/

run:
  task: rec_pretrain
  lr_sched: "linear_warmup_cosine_lr"
  init_lr: 1e-3
  min_lr: 8e-5
  warmup_lr: 1e-5
  mode: 'v2'
  weight_decay: 1e-3
  max_epoch: 1000
  iters_per_epoch: 100
  batch_size_train: 8
  batch_size_eval: 32
  num_workers: 4
  warmup_steps: 200
  seed: 42
  output_dir: logs/test/
  amp: True
  resume_ckpt_path: null
  evaluate: False
  train_splits: ["train"]
  valid_splits: ["valid"]
  test_splits: ["test_warm", "test_cold", "test"]
  device: "cuda"
  world_size: 1
  dist_url: "env://"
  distributed: False
"""

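# Directory where the training script expects its config files.
config_dir = '/content/CoRA/train_configs'
os.makedirs(config_dir, exist_ok=True)  # create it if it does not exist yet
config_path = os.path.join(config_dir, 'plora_pretrain_mf_ood_movie.yaml')

# Write the YAML string to disk, overwriting any previous version.
with open(config_path, 'w') as f:
    f.write(config_str)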
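
To get a feel for the learning-rate settings in the `run` section, the sketch below plots out a generic linear-warmup-then-cosine schedule using the values from the config (`init_lr`, `min_lr`, `warmup_lr`, `warmup_steps`, `max_epoch`, `iters_per_epoch`). It is an assumption about the general shape that the name `linear_warmup_cosine_lr` suggests, with cosine decay applied per epoch; the project's actual scheduler may differ in its details, and `sketch_lr` is a hypothetical helper.

```python
import math


def sketch_lr(step, init_lr=1e-3, min_lr=8e-5, warmup_lr=1e-5,
              warmup_steps=200, max_epoch=1000, iters_per_epoch=100):
    """Approximate learning rate at a given global training step.

    Assumed shape: linear ramp from warmup_lr to init_lr over warmup_steps,
    then cosine decay from init_lr to min_lr, updated once per epoch.
    """
    if step < warmup_steps:
        # Linear warmup phase.
        return warmup_lr + (init_lr - warmup_lr) * step / warmup_steps
    # Cosine decay phase, indexed by epoch.
    epoch = step // iters_per_epoch
    cosine = 0.5 * (1.0 + math.cos(math.pi * epoch / max_epoch))
    return min_lr + (init_lr - min_lr) * cosine


# With the config above, a full run is max_epoch * iters_per_epoch steps.
total_steps = 1000 * 100
for step in (0, 100, 200, total_steps // 2, total_steps):
    print(f"step {step:>6}: lr = {sketch_lr(step):.6f}")
```

Under these settings a full run is 100,000 iterations, so lowering `max_epoch` is the simplest way to shorten a trial run.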

## Step 3: Run the CoRA Tuning Script

Finally, run this cell to start training with the configuration file written in Step 2.

In [None]:
!python /content/CoRA/train_collm_mf_din.py --cfg-path=/content/CoRA/train_configs/plora_pretrain_mf_ood_movie.yaml