
Conversation

@quic-swatia
Contributor

@quic-swatia quic-swatia commented Jan 21, 2025

  1. Added support for resuming fine-tuning from the checkpoints of a previous run that stopped partway.
  2. Checkpoints, both intermediate and end-of-epoch, are now saved for every epoch.
  3. There is no need to pass tokenizer_name when model_name is passed; it defaults to model_name.
    If a tokenizer_name different from model_name is required, it can still be passed separately as an argument in the command.
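Points 1 and 3 could be sketched roughly as follows. The helper names (`resolve_tokenizer_name`, `latest_checkpoint`) and the checkpoint file layout are hypothetical, illustrating the fallback and resume logic rather than the actual code in this PR:

```python
import os
from typing import Optional


def resolve_tokenizer_name(model_name: str, tokenizer_name: Optional[str] = None) -> str:
    """Point 3: tokenizer_name defaults to model_name unless given explicitly."""
    return tokenizer_name if tokenizer_name is not None else model_name


def latest_checkpoint(output_dir: str) -> Optional[str]:
    """Point 1: pick the most recently written checkpoint file (if any) to resume from.

    Assumes checkpoints are .pt files in output_dir; returns None on a fresh run.
    """
    if not os.path.isdir(output_dir):
        return None
    ckpts = [f for f in os.listdir(output_dir) if f.endswith(".pt")]
    if not ckpts:
        return None
    return max(ckpts, key=lambda f: os.path.getmtime(os.path.join(output_dir, f)))
```

With this shape, a fresh run (no checkpoint found) starts from scratch, while a rerun with the same output_dir resumes from the newest checkpoint.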

… prev run which would have stopped in between. There's no necessity to pass tokenizer_name if a model_name is passed. It will take the same name as model_name by default. If a different tokenizer_name is required than the model_name, then it can be passed separately as an argument.

Signed-off-by: Swati Allabadi <quic_sallabad@quicinc.com>
@quic-swatia quic-swatia marked this pull request as ready for review January 28, 2025 07:44
…ers into finetune

Signed-off-by: Swati Allabadi <quic_sallabad@quicinc.com>
Signed-off-by: Swati Allabadi <quic_sallabad@quicinc.com>
…ts and check for loss convergence.

Signed-off-by: Swati Allabadi <quic_sallabad@quicinc.com>
@quic-mamta
Contributor

If we don't change output_dir, then after resuming fine-tuning, will the new TensorBoard data be appended to the previous TensorBoard log files?

@quic-swatia
Contributor Author

quic-swatia commented Feb 24, 2025

Irrespective of the value of output_dir, the TensorBoard files are saved inside a directory named runs. A new subdirectory is created inside runs for each fine-tuning job, so if we run `tensorboard --logdir runs --bind_all`, the TensorBoard data from both jobs will show up together in a single plot.
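This matches the default behaviour of torch's SummaryWriter, which, when no log_dir is given, writes each run to a new timestamped subdirectory under runs. A small pure-Python mimic of that layout (the helper `new_run_dir` is hypothetical, not code from this PR):

```python
import datetime
import os
import socket


def new_run_dir(root: str = "runs") -> str:
    """Mimic SummaryWriter's default log_dir: runs/<MonDD_HH-MM-SS>_<hostname>.

    Each fine-tuning job gets its own fresh subdirectory, so pointing
    `tensorboard --logdir runs` at the root shows every job side by side
    instead of appending new events to an old run's files.
    """
    stamp = datetime.datetime.now().strftime("%b%d_%H-%M-%S")
    path = os.path.join(root, f"{stamp}_{socket.gethostname()}")
    os.makedirs(path, exist_ok=True)
    return path
```

Because every job lands in its own subdirectory, resuming never appends to the previous job's event files; the two runs simply appear as separate curves in the same plot.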

@quic-swatia quic-swatia merged commit f3d87ad into quic:main Mar 18, 2025
4 checks passed
quic-swatia pushed a commit that referenced this pull request Mar 20, 2025
…re computed (#233)

1) Adding the support to resume the fine tuning using checkpoints from a
prev run which would have stopped in between.
2) Checkpoints, both intermediate and for complete epoch, will get saved
for each epoch through these changes.
3) There's no necessity to pass tokenizer_name if a model_name is
passed. It will take the same name as model_name by default.
If a different tokenizer_name is required than the model_name, then it
can be passed separately as an argument in the command.

---------

Signed-off-by: Swati Allabadi <quic_swatia@quicinc.com>
Co-authored-by: Swati Allabadi <quic-swatia@quicinc.com>
quic-rishinr pushed a commit to quic-rishinr/efficient-transformers that referenced this pull request Mar 21, 2025
qcdipankar pushed a commit to qcdipankar/efficient-transformers that referenced this pull request Apr 1, 2025
eplatero97 pushed a commit to eplatero97/efficient-transformers that referenced this pull request Apr 29, 2025
eplatero97 pushed a commit to eplatero97/efficient-transformers that referenced this pull request Apr 29, 2025
eplatero97 pushed a commit to eplatero97/efficient-transformers that referenced this pull request Apr 29, 2025
eplatero97 pushed a commit to eplatero97/efficient-transformers that referenced this pull request Apr 29, 2025
eplatero97 pushed a commit to eplatero97/efficient-transformers that referenced this pull request Apr 29, 2025
eplatero97 pushed a commit to eplatero97/efficient-transformers that referenced this pull request Apr 29, 2025

4 participants