### **Clone My Professor**
This code helps researchers clone the writing style of their PI or professor, reducing the number of revisions needed to adjust the language.



*   Professors have their own writing styles.
*   These styles can be observed in their publications.
* When you bring them a draft to edit, they will naturally try to revise it to match their own style.
* Adapting to that style may take you multiple revisions.
* What if a locally running LLM could learn their writing style and help you rewrite your draft?
* This code helps you do that.

**Tech Stack**: Ollama, PyPDF2, OpenAI API, Unsloth



**Read all the PDFs in a folder**


*   Load your professor's publications into a folder named `papers`.
*   Remember: you will have to repeat this every time you load the notebook.



In [25]:
import os

# Get all files and directories in the papers folder
all_files = os.listdir('/content/papers/')

# Filter for files ending with '.pdf'
pdf_files = [file for file in all_files if file.endswith('.pdf')]

# Print the identified PDF files
print("Identified PDF files:", pdf_files)

Identified PDF files: ['JOV paper.pdf']


In [2]:
%%capture
pip install PyPDF2

## Extract Text from PDFs

Iterate through the list of identified PDF files (`pdf_files`), open each file, and extract its text content page by page. Store the extracted text from each PDF in a list.


In [3]:
from PyPDF2 import PdfReader

extracted_texts = []

for pdf_file_name in pdf_files:
    full_path = os.path.join('/content/papers/', pdf_file_name)
    current_pdf_text = []
    try:
        with open(full_path, 'rb') as f:
            reader = PdfReader(f)
            for page in reader.pages:
                # Guard against pages with no extractable text (e.g. scanned images)
                current_pdf_text.append(page.extract_text() or "")
        extracted_texts.append(" ".join(current_pdf_text)) # Join pages text for each PDF
        print(f"Successfully extracted text from {pdf_file_name}")
    except Exception as e:
        print(f"Error extracting text from {pdf_file_name}: {e}")

# Print the first 500 characters of the first extracted PDF text as an example
if extracted_texts:
    print("\nFirst 500 characters of the first extracted PDF text:")
    print(extracted_texts[0][:500])
else:
    print("No text was extracted.")

## Consolidate extracted text
all_combined_text = " ".join(extracted_texts)
print(f"\nTotal number of characters in all extracted text: {len(all_combined_text)}")

Successfully extracted text from JOV paper.pdf

First 500 characters of the first extracted PDF text:
Investigating Vocal Tract Configurations Across Different 
Belting Qualities in Female and Male Musical Theater Singers 
Using Real-Time Dynamic MRI
*,†,‡Denis Michael Rudisch, §Kai Tobias Block, ¶Matt Edwards, and ∥Aaron M. Johnson, ⁎†‡Madison, WI, §∥New 
York, NY, and ¬∂Winchester, VA 
Summary: Objectives/Hypothesis. To identify vocal tract configuration patterns in vocally healthy con-
temporary commercial music (CCM) singers during the production of five industry-typical vocal qualities, 
inc

Total number of characters in all extracted text: 59727


## Clean Combined Text with Regex

Use regular expressions to clean `all_combined_text` by removing soft hyphens and normalizing whitespace, preparing it for further analysis.

In [4]:
import re

def clean_text(text):
    # Remove the \xad character (soft hyphen)
    text = text.replace('\xad', '')
    # Replace multiple newlines with a single space
    text = re.sub(r'\n+', ' ', text)
    # Replace multiple spaces with a single space
    text = re.sub(r'\s+', ' ', text)
    # Strip leading/trailing whitespace
    text = text.strip()
    return text

cleaned_combined_text = clean_text(all_combined_text)

print("First 1000 characters of the cleaned combined text:")
print(cleaned_combined_text[:1000])

print(f"\nTotal number of characters in the cleaned combined text: {len(cleaned_combined_text)}")

First 1000 characters of the cleaned combined text:
Investigating Vocal Tract Configurations Across Different Belting Qualities in Female and Male Musical Theater Singers Using Real-Time Dynamic MRI *,†,‡Denis Michael Rudisch, §Kai Tobias Block, ¶Matt Edwards, and ∥Aaron M. Johnson, ⁎†‡Madison, WI, §∥New York, NY, and ¶Winchester, VA Summary: Objectives/Hypothesis. To identify vocal tract configuration patterns in vocally healthy con- temporary commercial music (CCM) singers during the production of five industry-typical vocal qualities, including various belting qualities and traditional/legit musical theater singing. Study Design. Prospective, observational study. Methods. Seven professional musical theater singers (four females, three males) performed arpeggiated pat- terns using five different vocal qualities: traditional/legit, neutral belt, brassy belt, warm belt, and rock belt. Real-time magnetic resonance imaging captured midsagittal vocal tract configurations
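
The cleaned output above still contains words split by line-break hyphens, such as `con- temporary` and `pat- terns`. Below is a minimal sketch of one way to rejoin them. It assumes that a hyphen followed by a single space between two lowercase letters marks a line break. The function name `join_hyphenated_words` is hypothetical, and the heuristic would also wrongly merge a genuine compound that happened to break at its hyphen (for example `industry- typical`).

```python
import re

def join_hyphenated_words(text):
    """Rejoin words that were split as 'con- temporary' by a PDF line break."""
    # Only match lowercase letter + "- " + lowercase letter, so "Real-time"
    # and spaced dashes such as "legit - belt" are left alone.
    return re.sub(r"([a-z])- ([a-z])", r"\1\2", text)

sample = "arpeggiated pat- terns in con- temporary Real-time music"
print(join_hyphenated_words(sample))
# prints: arpeggiated patterns in contemporary Real-time music
```

If this were used, it would be applied to `cleaned_combined_text` before chunking, so that the training targets do not teach the model to reproduce broken words.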

## Chunk Text into 2000 Character Segments

Divide the `cleaned_combined_text` into smaller segments, each containing a maximum of 2000 characters, to facilitate processing or analysis of smaller text blocks.

In [5]:
def chunk_text(text, chunk_size):
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

chunk_size = 2000
text_chunks = chunk_text(cleaned_combined_text, chunk_size)

print(f"Total number of chunks: {len(text_chunks)}")
print("\nFirst chunk (2000 characters):")
print(text_chunks[0])

Total number of chunks: 30

First chunk (2000 characters):
Investigating Vocal Tract Configurations Across Different Belting Qualities in Female and Male Musical Theater Singers Using Real-Time Dynamic MRI *,†,‡Denis Michael Rudisch, §Kai Tobias Block, ¶Matt Edwards, and ∥Aaron M. Johnson, ⁎†‡Madison, WI, §∥New York, NY, and ¶Winchester, VA Summary: Objectives/Hypothesis. To identify vocal tract configuration patterns in vocally healthy con- temporary commercial music (CCM) singers during the production of five industry-typical vocal qualities, including various belting qualities and traditional/legit musical theater singing. Study Design. Prospective, observational study. Methods. Seven professional musical theater singers (four females, three males) performed arpeggiated pat- terns using five different vocal qualities: traditional/legit, neutral belt, brassy belt, warm belt, and rock belt. Real-time magnetic resonance imaging captured midsagittal vocal tract configu
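
Fixed-width slicing cuts words in half at chunk boundaries (the first chunk above ends in `configu`). As an alternative, here is a sketch of a chunker that only breaks on whitespace. The name `chunk_text_on_words` is hypothetical, and it assumes the text has already been whitespace-normalized, as `clean_text` does. A single word longer than `chunk_size` would still end up in an oversized chunk of its own.

```python
def chunk_text_on_words(text, chunk_size):
    """Split text into chunks of at most chunk_size characters without cutting words."""
    chunks, current = [], ""
    for word in text.split():
        # Start a new chunk when adding this word (plus a space) would exceed the limit
        if current and len(current) + 1 + len(word) > chunk_size:
            chunks.append(current)
            current = word
        else:
            current = f"{current} {word}" if current else word
    if current:
        chunks.append(current)
    return chunks

sample_sentence = "Real-time magnetic resonance imaging captured midsagittal vocal tract configurations"
for piece in chunk_text_on_words(sample_sentence, 30):
    print(repr(piece))
```

Splitting on sentence boundaries instead would keep each training target even more self-contained, at the cost of less uniform chunk lengths.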

## Simplify Text Chunks using OpenAI

### Why are we doing this?
The task we are trying to teach the LLM is
**Given a paragraph in layman language --> rewrite it in the language of your professor**

To build training pairs, each of your professor's paragraphs needs a layman-language version. You may have hundreds of paragraphs to rewrite, so this step is automated with the OpenAI API.

The code below shows how it is done.

You will have to purchase an API key; $5 of credit will be more than enough for this task.

## Securely Provide OpenAI API Key

Securely provide your OpenAI API key using Colab's User Secrets feature.

### Instructions:
1. Click on the 'key' icon (🔑) in the left sidebar of Google Colab to open the 'Secrets' panel.
2. Click 'Add new secret'.
3. For the **Name**, enter `OPENAI_API_KEY` (this exact name is important for the next step).
4. For the **Value**, paste your actual OpenAI API key.
5. Ensure 'Notebook access' is toggled ON for this notebook.

Once the secret is added, the next code block will retrieve and use it.

In [6]:
import os
from openai import OpenAI

from google.colab import userdata
OPENAI_API_KEY = userdata.get('OPENAI_API_KEY')  # Must match the secret name set in the instructions above

if not OPENAI_API_KEY:
    raise ValueError("OPENAI_API_KEY not found. Please set it in Colab Secrets.")

# Initialize the OpenAI client
client = OpenAI(api_key=OPENAI_API_KEY)

OPENAI_MODEL = "gpt-5-nano" # You can change this to a different OpenAI model if desired
simplified_chunks = []

print(f"Starting simplification of {len(text_chunks)} chunks using OpenAI model: {OPENAI_MODEL}\n")

for i, chunk in enumerate(text_chunks):
    try:
        prompt = f"Rewrite the following text in very simple words, like how a first draft for a non-expert audience would look like. Focus on clarity and brevity.\n\nText: {chunk}"
        response = client.chat.completions.create(
            model=OPENAI_MODEL,
            messages=[
                {"role": "system", "content": "You are a helpful assistant that simplifies complex text."},
                {"role": "user", "content": prompt}
            ],
            )
        simplified_text = response.choices[0].message.content.strip()
        simplified_chunks.append(simplified_text)
        print(f"Successfully simplified chunk {i+1}/{len(text_chunks)}")

        # Print the first original and simplified chunk for verification
        if i == 0:
            print("\n--- Original First Chunk ---")
            print(chunk[:500]) # Print first 500 chars of original for brevity
            print("\n--- Simplified First Chunk (OpenAI) ---")
            print(simplified_text[:500]) # Print first 500 chars of simplified for brevity

    except Exception as e:
        print(f"Error simplifying chunk {i+1}: {e}")
        simplified_chunks.append(f"Error: {e}") # Append error message to maintain array length

print(f"\nTotal simplified chunks: {len(simplified_chunks)}")


Starting simplification of 30 chunks using OpenAI model: gpt-5-nano

Successfully simplified chunk 1/30

--- Original First Chunk ---
Investigating Vocal Tract Configurations Across Different Belting Qualities in Female and Male Musical Theater Singers Using Real-Time Dynamic MRI *,†,‡Denis Michael Rudisch, §Kai Tobias Block, ¶Matt Edwards, and ∥Aaron M. Johnson, ⁎†‡Madison, WI, §∥New York, NY, and ¶Winchester, VA Summary: Objectives/Hypothesis. To identify vocal tract configuration patterns in vocally healthy con- temporary commercial music (CCM) singers during the production of five industry-typical vocal qualities, includin

--- Simplified First Chunk (OpenAI) ---
Here is a simple, first-draft-style summary of the study.

What they wanted to learn
- How the mouth and throat shape changes when professional musical theatre singers use different belting styles, for both women and men.

Who was in the study
- Seven professional theatre singers: 4 women and 3 men.

What
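
The loop above gives up on a chunk after a single failed API call. A small retry wrapper with exponential backoff could make the run more robust against temporary errors. This is only a sketch: `with_retries` and `flaky_call` are hypothetical names, and in the notebook the wrapped call would be the `client.chat.completions.create(...)` request.

```python
import time

def with_retries(call, attempts=3, base_delay=1.0):
    """Run call(); on failure wait base_delay * 2**n seconds and try again."""
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            if attempt == attempts - 1:
                raise  # Out of attempts: surface the last error
            time.sleep(base_delay * 2 ** attempt)

# Stand-in for an API request that fails twice before succeeding
demo_state = {"calls": 0}

def flaky_call():
    demo_state["calls"] += 1
    if demo_state["calls"] < 3:
        raise RuntimeError("temporary failure")
    return "simplified text"

print(with_retries(flaky_call, attempts=3, base_delay=0))
# prints: simplified text
```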

In [7]:
import numpy as np
np.save('simple_chunks.npy', simplified_chunks)

In [8]:
len(simplified_chunks)

30
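
When an API call fails, the simplification loop stores an `Error: ...` string in place of the simplified text, and that string would later become a training input. Here is a sketch of filtering such pairs out before fine-tuning. The helper name `drop_failed_pairs` is hypothetical, and it assumes failures are marked exactly by the `Error:` prefix used in the loop.

```python
def drop_failed_pairs(originals, simplified, error_prefix="Error:"):
    """Keep only (original, simplified) pairs whose simplification succeeded."""
    kept = [
        (orig, simp)
        for orig, simp in zip(originals, simplified)
        if not simp.startswith(error_prefix)
    ]
    kept_originals = [orig for orig, _ in kept]
    kept_simplified = [simp for _, simp in kept]
    return kept_originals, kept_simplified

demo_originals = ["chunk A", "chunk B", "chunk C"]
demo_simplified = ["simple A", "Error: rate limit reached", "simple C"]
print(drop_failed_pairs(demo_originals, demo_simplified))
# prints: (['chunk A', 'chunk C'], ['simple A', 'simple C'])
```

In the notebook this would be called as `drop_failed_pairs(text_chunks, simplified_chunks)`, and the two returned lists would replace the originals when building the training data.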

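### Fine-tuning the LLM


*   Fine-tuning is done with Unsloth.
*   I chose Llama 3.2 (1B Instruct); you can choose any of the available models.
* Fine-tuning uses LoRA adapters. Because the base model is loaded in 4-bit (`load_in_4bit = True`), this setup is effectively QLoRA; set it to `False` for plain LoRA.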



In [9]:
%%capture
import os, re
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    # Do this only in Colab notebooks! Otherwise use pip install unsloth
    import torch; v = re.match(r"[0-9]{1,}\.[0-9]{1,}", str(torch.__version__)).group(0)
    xformers = "xformers==" + ("0.0.33.post1" if v=="2.9" else "0.0.32.post2" if v=="2.8" else "0.0.29.post3")
    !pip install --no-deps bitsandbytes accelerate {xformers} peft trl triton cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf "datasets==4.3.0" "huggingface_hub>=0.34.0" hf_transfer
    !pip install --no-deps unsloth
!pip install transformers==4.56.2
!pip install --no-deps trl==0.22.2

In [10]:
from unsloth import FastLanguageModel
import torch
max_seq_length = 2048 # Choose any! We auto support RoPE Scaling internally!
dtype = None # None for auto detection. Float16 for Tesla T4, V100, Bfloat16 for Ampere+
load_in_4bit = True # Use 4bit quantization to reduce memory usage. Can be False.

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/Llama-3.2-1B-Instruct",# or choose "unsloth/Llama-3.2-3B-Instruct"
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
    # token = "hf_...", # use one if using gated models like meta-llama/Llama-2-7b-hf
)

🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.
🦥 Unsloth Zoo will now patch everything to make training faster!
==((====))==  Unsloth 2025.11.4: Fast Llama patching. Transformers: 4.56.2.
   \\   /|    Tesla T4. Num GPUs = 1. Max memory: 14.741 GB. Platform: Linux.
O^O/ \_/ \    Torch: 2.9.0+cu126. CUDA: 7.5. CUDA Toolkit: 12.6. Triton: 3.5.0
\        /    Bfloat16 = FALSE. FA [Xformers = 0.0.33.post1. FA2 = False]
 "-____-"     Free license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!


We now add LoRA adapters so we only need to update a small fraction of all parameters (0.45% in this run, according to the training log further down).

In [11]:
model = FastLanguageModel.get_peft_model(
    model,
    r = 8, # Choose any number > 0 ! Suggested 8, 16, 32, 64, 128
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj",],
    lora_alpha = 16,
    lora_dropout = 0, # Supports any, but = 0 is optimized
    bias = "none",    # Supports any, but = "none" is optimized
    # [NEW] "unsloth" uses 30% less VRAM, fits 2x larger batch sizes!
    use_gradient_checkpointing = "unsloth", # True or "unsloth" for very long context
    random_state = 3407,
    use_rslora = False,  # We support rank stabilized LoRA
    loftq_config = None, # And LoftQ
)

Unsloth 2025.11.4 patched 16 layers with 16 QKV layers, 16 O layers and 16 MLP layers.


<a name="Data"></a>
### Data Prep
We now use the `Llama-3.1` chat format for conversation-style finetunes. We arrange our dataset as lists of role/content messages:
```
{"role": "system", "content": "You are an assistant"}
{"role": "user", "content": "What is 2+2?"}
{"role": "assistant", "content": "It's 4."}
```

In [12]:
formatted_data = []

for i in range(len(simplified_chunks)):
    entry = [
        {"role": "system", "content": "You are an assistant who is helping to rewrite a given paragraph"},
        {"role": "user", "content": simplified_chunks[i]},
        {"role": "assistant", "content": text_chunks[i]}
    ]
    formatted_data.append(entry)

# Print the first entry to verify the format
print("First formatted data entry:")
print(formatted_data[0])

print(f"Total number of formatted entries: {len(formatted_data)}")

First formatted data entry:
[{'role': 'system', 'content': 'You are an assistant who is helping to rewrite a given paragraph'}, {'role': 'user', 'content': 'Here is a simple, first-draft-style summary of the study.\n\nWhat they wanted to learn\n- How the mouth and throat shape changes when professional musical theatre singers use different belting styles, for both women and men.\n\nWho was in the study\n- Seven professional theatre singers: 4 women and 3 men.\n\nWhat they did\n- Singers used five voice styles: traditional/legit, neutral belt, brassy belt, warm belt, and rock belt.\n- They sang short, arpeggio-like patterns.\n- Real-time MRI watched their vocal tracts (mouth to throat) while they sang.\n\nWhat they measured\n- Eight features of the mouth and throat:\n  - lip opening\n  - jaw opening\n  - how far the jaw sticks out\n  - tongue height\n  - uvula height\n  - back-of-throat opening (oropharyngeal opening)\n  - larynx (voice box) height\n  - laryngeal tilt\n\nWhat they found

In [13]:
from unsloth.chat_templates import get_chat_template
from datasets import Dataset

dataset = []
for entry in formatted_data:
    # Render each conversation into a single training string using the tokenizer's chat template
    chat_text = tokenizer.apply_chat_template(entry, tokenize = False, add_generation_prompt = False)
    dataset.append({"conversations":entry,
                    "text":chat_text})
dataset = Dataset.from_list(dataset)

In [14]:
dataset[5]["conversations"]

[{'content': 'You are an assistant who is helping to rewrite a given paragraph',
  'role': 'system'},
 {'content': 'This is a small early study about singing in modern commercial music (CCM) and belt-style singing used in musical theater. The goal was to look at how singers shape their mouth and throat (the vocal tract) when producing five common singing styles, including different belt sounds and a traditional “legit” musical theater style. By studying these voice shapes, researchers want to understand how belting works, what different belt styles do, and to find unusual voice movements that could harm the voice over time.\n\nWhat they did and who took part:\n- The study got approval from NYU’s ethics board.\n- Seven professional musical theater singers from New York City participated. They can sing in several styles.\n- The singers were classified using a standard system and shown to be high-skill, market-selected theater performers (International Musical Theater Principal; Nat

In [15]:
dataset[5]["text"]

'<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\nCutting Knowledge Date: December 2023\nToday Date: 30 Nov 2025\n\nYou are an assistant who is helping to rewrite a given paragraph<|eot_id|><|start_header_id|>user<|end_header_id|>\n\nThis is a small early study about singing in modern commercial music (CCM) and belt-style singing used in musical theater. The goal was to look at how singers shape their mouth and throat (the vocal tract) when producing five common singing styles, including different belt sounds and a traditional “legit” musical theater style. By studying these voice shapes, researchers want to understand how belting works, what different belt styles do, and to find unusual voice movements that could harm the voice over time.\n\nWhat they did and who took part:\n- The study got approval from NYU’s ethics board.\n- Seven professional musical theater singers from New York City participated. They can sing in several styles.\n- The singers were classified 

<a name="Train"></a>
### Train the model
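Now let's train our model. We train for 60 steps; with only 30 examples and an effective batch size of 8, that is many passes over the data (the training log below reports 15 epochs). For a single pass, set `num_train_epochs = 1` and remove `max_steps`. Unsloth also supports TRL's `DPOTrainer`!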
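
To see how the step count relates to epochs for this dataset, the arithmetic can be sketched as follows. The values are taken from the training configuration and logs in this notebook; the rounding mirrors what the log reports here and may differ slightly across library versions.

```python
import math

num_examples = 30          # size of the training dataset
per_device_batch_size = 2  # per_device_train_batch_size
grad_accum_steps = 4       # gradient_accumulation_steps
max_steps = 60

# Batches per epoch, then optimizer updates per epoch (both rounded up)
batches_per_epoch = math.ceil(num_examples / per_device_batch_size)
steps_per_epoch = math.ceil(batches_per_epoch / grad_accum_steps)

# Number of passes over the data needed to reach max_steps
epochs = math.ceil(max_steps / steps_per_epoch)

print(steps_per_epoch, epochs)
# prints: 4 15
```

So a single epoch corresponds to only about 4 optimizer steps, which is worth keeping in mind when judging the risk of overfitting on 30 examples.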

In [16]:
from trl import SFTConfig, SFTTrainer
from transformers import DataCollatorForSeq2Seq
trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,
    dataset_text_field = "text",
    max_seq_length = max_seq_length,
    data_collator = DataCollatorForSeq2Seq(tokenizer = tokenizer),
    packing = False, # Can make training 5x faster for short sequences.
    args = SFTConfig(
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 4,
        warmup_steps = 5,
        # num_train_epochs = 1, # Set this for 1 full training run.
        max_steps = 60,
        learning_rate = 2e-4,
        logging_steps = 1,
        optim = "adamw_8bit",
        weight_decay = 0.001,
        lr_scheduler_type = "linear",
        seed = 3407,
        output_dir = "outputs",
        report_to = "none", # Use TrackIO/WandB etc
    ),
)

Unsloth: Tokenizing ["text"] (num_proc=6):   0%|          | 0/30 [00:00<?, ? examples/s]

We also use Unsloth's `train_on_responses_only` method to train only on the assistant outputs and ignore the loss on the user's inputs.

In [17]:
from unsloth.chat_templates import train_on_responses_only
trainer = train_on_responses_only(
    trainer,
    instruction_part = "<|start_header_id|>user<|end_header_id|>\n\n",
    response_part = "<|start_header_id|>assistant<|end_header_id|>\n\n",
)

Map (num_proc=6):   0%|          | 0/30 [00:00<?, ? examples/s]

In [18]:
tokenizer.decode(trainer.train_dataset[5]["input_ids"])

'<|begin_of_text|><|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\nCutting Knowledge Date: December 2023\nToday Date: 30 Nov 2025\n\nYou are an assistant who is helping to rewrite a given paragraph<|eot_id|><|start_header_id|>user<|end_header_id|>\n\nThis is a small early study about singing in modern commercial music (CCM) and belt-style singing used in musical theater. The goal was to look at how singers shape their mouth and throat (the vocal tract) when producing five common singing styles, including different belt sounds and a traditional “legit” musical theater style. By studying these voice shapes, researchers want to understand how belting works, what different belt styles do, and to find unusual voice movements that could harm the voice over time.\n\nWhat they did and who took part:\n- The study got approval from NYU’s ethics board.\n- Seven professional musical theater singers from New York City participated. They can sing in several styles.\n- The singers

In [19]:
space = tokenizer(" ", add_special_tokens = False).input_ids[0]
tokenizer.decode([space if x == -100 else x for x in trainer.train_dataset[5]["labels"]])

'                                                                                                                                                                                                                                                                                           rences and trends in industry-relevant CCM singing qualities. The goal of this descriptive pilot study was to identify vocal tract configuration patterns and interactions between anatomical structures within and across vocally healthy CCM singers during the production of five industry-typical vocal qualities, including various belting qualities and a traditional/legit approach used in musical theater singing. The study of vocal tract configurations and differences in vocal tract shapes during CCM singing is necessary to shed light on the ongoing discussion on the production of belting voice qualities, to further describe the typical and distinct functions used in different belting qualities, and to subseque
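
The blank run at the start of the decoded labels above is the masked prompt: those positions carry the label `-100`, which PyTorch's cross-entropy loss ignores, so only the assistant's tokens contribute to training. The toy sketch below illustrates the idea with made-up token ids. It is not Unsloth's implementation, and `mask_prompt_labels` is a hypothetical helper.

```python
IGNORE_INDEX = -100  # label value skipped by PyTorch's cross-entropy loss

def mask_prompt_labels(input_ids, response_start):
    """Copy input_ids to labels, hiding every token before response_start."""
    return [IGNORE_INDEX] * response_start + list(input_ids[response_start:])

# Toy token ids: the first four stand for the system and user turns,
# the last three for the assistant's response.
toy_ids = [11, 12, 13, 14, 21, 22, 23]
print(mask_prompt_labels(toy_ids, response_start=4))
# prints: [-100, -100, -100, -100, 21, 22, 23]
```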

In [20]:
# @title Show current memory stats
gpu_stats = torch.cuda.get_device_properties(0)
start_gpu_memory = round(torch.cuda.max_memory_reserved() / 1024 / 1024 / 1024, 3)
max_memory = round(gpu_stats.total_memory / 1024 / 1024 / 1024, 3)
print(f"GPU = {gpu_stats.name}. Max memory = {max_memory} GB.")
print(f"{start_gpu_memory} GB of memory reserved.")

GPU = Tesla T4. Max memory = 14.741 GB.
1.203 GB of memory reserved.


In [21]:
trainer_stats = trainer.train()

==((====))==  Unsloth - 2x faster free finetuning | Num GPUs used = 1
   \\   /|    Num examples = 30 | Num Epochs = 15 | Total steps = 60
O^O/ \_/ \    Batch size per device = 2 | Gradient accumulation steps = 4
\        /    Data Parallel GPUs = 1 | Total batch size (2 x 4 x 1) = 8
 "-____-"     Trainable parameters = 5,636,096 of 1,241,450,496 (0.45% trained)


Unsloth: Will smartly offload gradients to save VRAM!


Step,Training Loss
1,2.4616
2,2.26
3,2.6679
4,2.5273
5,2.0606
6,2.3602
7,2.5226
8,2.4838
9,2.4102
10,2.108


### Inference using a text streamer

You can also use a `TextStreamer` for streaming inference, so you can see the generation token by token instead of waiting for the whole output!

In [22]:
FastLanguageModel.for_inference(model) # Enable native 2x faster inference

messages = [
    {"role": "user", "content": "rewrite this sentence; The goal of this descriptive pilot study was to identify vocal tract configuration patterns and interactions between anatomical structures within and across vocally healthy CCM singers during the production of five industry-typical vocal qualities, including various belting qualities and a traditional/legit approach used in musical theater singing. The study of vocal tract configurations and differences in vocal tract shapes during CCM singing is necessary to shed light on the ongoing discussion on the production of belting voice qualities, to further describe the typical and distinct functions used in different belting qualities, and to subsequently identify atypical vocal tract adjustments that might contribute to phonotrauma-related injury"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize = True,
    add_generation_prompt = True, # Must add for generation
    return_tensors = "pt",
).to("cuda")

from transformers import TextStreamer
text_streamer = TextStreamer(tokenizer, skip_prompt = True)
_ = model.generate(input_ids = inputs, streamer = text_streamer, max_new_tokens = 128,
                   use_cache = True, temperature = 1.5, min_p = 0.1)

The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.


in the vocally healthy singing subset of CCM singers. Investigating Vocal Tract Configurations and Dynamics 6 The purpose of this descriptive pilot study was to identify vocal tract configuration patterns, anatomical interactions within and across vocal tract configurations, and interactions be- tween anatomical structures within and across vocally healthy CCM singers performing five industry- typical vocal qualities (includ- ing diverse belting voice qualities and a modified version of the traditional/legit approach often used in musical theater singing). Investigating vocal tract configurations and dynamics has been essential to shed light on the ongoing “belting” voice in- vestigation as
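
The inference prompt above differs from the training examples: training used a system message plus the bare paragraph as the user turn, while the prompt here has no system message and prefixes the text with "rewrite this sentence;". Matching the training format at inference time may give more consistent results, though that is untested here. Here is a sketch of building the messages that way; `build_messages` is a hypothetical helper, and the system prompt is copied from the data-prep cell.

```python
SYSTEM_PROMPT = "You are an assistant who is helping to rewrite a given paragraph"

def build_messages(paragraph):
    """Build a chat in the same system/user layout used for the training data."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": paragraph.strip()},
    ]

draft = "  The study looked at how singers shape their mouth and throat.  "
for message in build_messages(draft):
    print(message["role"], "->", message["content"])
```

The resulting list can be passed to `tokenizer.apply_chat_template` in place of `messages`.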


### Results
* Input - rewrite this sentence; The goal of this descriptive pilot study was to identify vocal tract configuration patterns and interactions between anatomical structures within and across vocally healthy CCM singers during the production of five industry-typical vocal qualities, including various belting qualities and a traditional/legit approach used in musical theater singing. The study of vocal tract configurations and differences in vocal tract shapes during CCM singing is necessary to shed light on the ongoing discussion on the production of belting voice qualities, to further describe the typical and distinct functions used in different belting qualities, and to subsequently identify atypical vocal tract adjustments that might contribute to phonotrauma-related injury

* Output - in the vocally healthy CCM singer with the most diverse vocal qualities described during five industry-typical vocal qualities, including various belting qualities and a traditional/legit voice approach used in musical theater singing. Description of vocal tract configurations, interaction of anatomical structures within and across species, anatomical structures in and out of vocally healthy CCM singers During this descriptive pilot study, the goal was to identify vocal tract configurations and vocal tract shape interactions between anatomical structures of anatomical structures across different vocal qualities used in CCM singing during the production of five industry-typical vocal qualities, including various belting qualities, traditional/legit


<a name="Save"></a>
### Saving, loading finetuned models
To save the final model as LoRA adapters, use either Hugging Face's `push_to_hub` for an online save or `save_pretrained` for a local save.

**[NOTE]** This ONLY saves the LoRA adapters, and not the full model. To save to 16bit or GGUF, scroll down!

In [23]:
model.save_pretrained("lora_model")  # Local saving
tokenizer.save_pretrained("lora_model")

('lora_model/tokenizer_config.json',
 'lora_model/special_tokens_map.json',
 'lora_model/chat_template.jinja',
 'lora_model/tokenizer.json')

In [24]:
# This step requires more VRAM
#model.save_pretrained_gguf("model", tokenizer, quantization_method = "q4_k_m")