MiniCPM_FT

Introduction

MiniCPM_FT is a repository for fine-tuning the base model MiniCPM-2B-sft-fp32 on datasets from Hugging Face Datasets and Cosmos. It provides the tools and code needed to fine-tune the model and evaluate its performance on those datasets.

Paper

The paper associated with this repository is available here.

Dataset

This repository uses the Cosmos, Trivia QA Wikipedia, and Trivia QA Web datasets.

Repository Structure

  • cleandata: Contains the Cosmos, Trivia QA Wikipedia, and Trivia QA Web datasets, each reconstructed separately and reformatted into standard question-answer pairs.
  • mergedata: Contains the composite dataset built from the cleaned datasets, split into train, development, and test sets.
  • finetune: Contains all the code for the complete fine-tuning process. To produce a fine-tuned model, run bash {}_finetune.sh.
  • models: Stores examples of the base and fine-tuned models.
  • evaluate: Contains code and sample data for evaluating the fine-tuned model's performance on given datasets. Results are stored in the result folder.

Usage

To fine-tune the base model, follow these steps:

  1. Clone the repository:

    git clone https://github.com/InezYu0928/MiniCPM_FT.git
  2. Navigate to the finetune folder:

    cd MiniCPM_FT/finetune
  3. Execute the fine-tuning script:

    bash xxx_finetune.sh
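
The xxx prefix in the last step is a placeholder, and the actual script names are not listed here. As a minimal sketch, assuming the scripts in the finetune folder follow the *_finetune.sh naming pattern, a small helper can list the candidates before one is chosen. The function name list_finetune_scripts is hypothetical and not part of the repository.

```shell
# Hypothetical helper: print the names of files matching *_finetune.sh
# in the given directory (defaults to the current directory).
# Assumption: the fine-tuning scripts follow this naming pattern.
list_finetune_scripts() {
    local dir="${1:-.}"
    local script
    for script in "$dir"/*_finetune.sh; do
        # When nothing matches, the glob stays unexpanded; skip it.
        [ -e "$script" ] || continue
        basename "$script"
    done
}

# Run from inside MiniCPM_FT/finetune to see which scripts are available.
list_finetune_scripts .
```

Any name printed by the helper can then be passed to bash in place of xxx_finetune.sh.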
    
