Dev finetune update #1698
Conversation
…o dev_finetune_update
Note to self: we may need to think about the role of adapter retries at (i) bootstrapping time and (ii) inference time with the fine-tuned model. My basic intuition is that (i) is fine, but that (ii) needs the adapter with which finetuning was conducted to be permanently attached to the program and saved with it.
```python
if not data_format:
    adapter = self.infer_adapter()
    data_format = infer_data_format(adapter)
validate_data_format(data=train_data, data_format=data_format)
```
Data validation applicable to all finetunes
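A minimal sketch of what validation shared across all finetunes could look like. The `infer_data_format`/`validate_data_format` names come from the diff above, but the chat-format shape checked here (OpenAI-style `messages` records) is an assumption, not the actual DSPy implementation:

```python
def validate_chat_data(data):
    """Check that every record is an OpenAI-style chat example.

    Assumed record shape: {"messages": [{"role": ..., "content": ...}, ...]}
    """
    valid_roles = {"system", "user", "assistant"}
    for ind, record in enumerate(data):
        messages = record.get("messages")
        if not isinstance(messages, list) or not messages:
            raise ValueError(f"Record {ind} has no 'messages' list")
        for message in messages:
            if message.get("role") not in valid_roles:
                raise ValueError(f"Record {ind} has invalid role {message.get('role')!r}")
            if not isinstance(message.get("content"), str):
                raise ValueError(f"Record {ind} has non-string content")

good = [{"messages": [{"role": "user", "content": "hi"},
                      {"role": "assistant", "content": "hello"}]}]
validate_chat_data(good)  # passes silently
```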
chenmoneygithub left a comment:
Awesome work! Can we create a notebook (Colab or other format) to demonstrate our use case from end to end? We may want to ship a code example together with this PR.
```python
return format_turn(signature, values, role, incomplete)


# TODO(PR): Looks ok?
def format_finetune_data(self, signature, demos, inputs, outputs):
```
I wonder if this is generally true for all finetuning services?
This function looks good to me. The provider class for an unusual service can just reformat the data.
In our case, DataFormat is the object that represents the format for fine-tuning. The purpose of this adapter method is to re-assemble the actual call that's made, with its completion and prompt. We could rename it to make this clear.
```python
from dspy.clients.finetune import (
    FinetuneJob,
    TrainingMethod,
    # TrainingMethod,
```
Is this expected?
Yes, this is a ruff fix change. The anyscale.py file needs to be separately updated to match our new interface.
hmm should we remove this TrainingMethod instead of commenting it out?
Oh, yes!
dspy/clients/lm.py (outdated)
```python
# Remember to update LM.copy() if you modify the constructor!
if launch_kwargs is not None or provider is not None:
    import dspy
    assert dspy.settings.experimental, "The `launch_kwargs` and `provider` arguments are experimental. Set `dspy.settings.experimental` to `True` to use them."
```
In general, assert statements are discouraged: https://google.github.io/styleguide/pyguide.html#244-decision. For this case, we can throw a ValueError.
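A hedged sketch of the suggested change. The check and error message come from the diff, but the standalone function and its `experimental` parameter (standing in for `dspy.settings.experimental`) are illustrative:

```python
def check_experimental(launch_kwargs=None, provider=None, experimental=False):
    # Illustrative stand-in for the constructor check; `experimental`
    # plays the role of dspy.settings.experimental.
    if (launch_kwargs is not None or provider is not None) and not experimental:
        raise ValueError(
            "The `launch_kwargs` and `provider` arguments are experimental. "
            "Set `dspy.settings.experimental` to `True` to use them."
        )
```

Unlike `assert`, a `ValueError` survives running Python with `-O`, which strips assertions.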
it's fine for now
```python
return job


def _run_finetune_job(self, job: TrainingJob):
    # TODO(enhance): We should listen for keyboard interrupts somewhere.
```
Do we need it? I was assuming finetune will return immediately.
That's a good point -- finetune returns immediately after starting a thread, which runs this blocking function.
It would be nicer to switch all of this to async; perhaps we should discuss for the next iteration.
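That flow -- `finetune` returning immediately while a worker thread runs the blocking job -- can be sketched with `concurrent.futures` (all names here are hypothetical stand-ins, not the PR's actual API):

```python
import time
from concurrent.futures import ThreadPoolExecutor

_executor = ThreadPoolExecutor(max_workers=4)

def run_finetune_job(job_id):
    # Stand-in for the blocking provider call.
    time.sleep(0.1)
    return f"model-for-{job_id}"

def finetune(job_id):
    # Returns immediately; the blocking work happens on a worker thread.
    return _executor.submit(run_finetune_job, job_id)

future = finetune("job-1")
print(future.done())    # may be False while the job is still running
print(future.result())  # blocks until the job finishes
```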
```python
key_to_job[key] = lm.finetune(**finetune_kwargs)


key_to_lm = {}
for ind, (key, job) in enumerate(key_to_job.items()):
```
Do we need this chunk? We are waiting for all jobs to finish in the order of creation. Shall we create an event for each job thread?
Event loop is a good idea!
Here we don't mind waiting because we need all the predictors' LMs to be trained in order to have a complete, optimized program.
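Since every predictor's LM must finish before the optimized program is complete, waiting in creation order is equivalent to waiting on all jobs; a sketch with `concurrent.futures.wait` (the `train` function and keys are hypothetical):

```python
from concurrent.futures import ThreadPoolExecutor, wait

def train(key):
    # Stand-in for a blocking fine-tuning job.
    return f"lm-for-{key}"

executor = ThreadPoolExecutor()
key_to_job = {key: executor.submit(train, key) for key in ["pred0", "pred1"]}

# Block until every job finishes, regardless of completion order.
wait(key_to_job.values())
key_to_lm = {key: job.result() for key, job in key_to_job.items()}
```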
```python
class BootstrapFinetune(Teleprompter):

    # TODO(PR) check with team
```
Let's add a docstring to explain what these args do (they are not very self-explanatory).
Sounds great! Holding off for now in case we decide to change them; making a note for my next update.
```python
data = []
adapter = self.adapter[lm] or lm.infer_adapter()
data_format = infer_data_format(adapter)
```
Instead of having the constructor take in adapters and inferring the data format from the adapter, shall we set data_format in the constructor?
I think the interaction here could be improved, but I don't have concrete proposals.
Either way, we would need access to an "adapter" that may not be directly inferable from the data_format. (Here it can be inferred, but that is not true for all possible data formats we could have, say prompt, chosen, rejected.) Making a note of this for a future discussion.
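A sketch of why a data format alone may not determine the adapter: a preference-style format (`prompt`, `chosen`, `rejected`) could have been rendered by more than one adapter. The enum and mapping below are assumptions for illustration, not the actual DSPy `DataFormat`:

```python
from enum import Enum

class DataFormat(Enum):
    CHAT = "chat"               # {"messages": [...]}
    COMPLETION = "completion"   # {"prompt": ..., "completion": ...}
    PREFERENCE = "preference"   # {"prompt": ..., "chosen": ..., "rejected": ...}

def infer_data_format(adapter_name):
    # Each adapter maps to exactly one format...
    return {"ChatAdapter": DataFormat.CHAT}.get(adapter_name)

# ...but the reverse is not a function: a preference record says nothing
# about which adapter rendered its prompt, so the adapter must be supplied.
```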
```python
)
# TODO(PR): Should "trace" not be included in the lambda function?
_metric = metric if metric else lambda example, prediction: 1
evaluator(program, metric=_metric)
```
Why are we doing a full eval here?
I was using the evaluator to make use of the threaded sampling
```python
    prediction=prediction,
    trace=trace,
)
if metric:
```
Do we need to filter out examples that are below some threshold?
This function leaves that to the caller and just worries about collecting the data -- BootstrapFinetune filters the examples (but other optimizers may not).
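That division of labor can be sketched as: the collector keeps every trace with its score, and the optimizer decides the cutoff (all names and the threshold value are hypothetical):

```python
def collect_traces(examples, metric):
    # Collector: record everything and attach the metric score; no filtering.
    return [{"example": ex, "score": metric(ex)} for ex in examples]

def filter_for_finetuning(traces, threshold=1.0):
    # Optimizer (e.g. BootstrapFinetune): keep only high-scoring traces.
    return [t["example"] for t in traces if t["score"] >= threshold]

traces = collect_traces([{"q": "a"}, {"q": "b"}],
                        metric=lambda ex: 1.0 if ex["q"] == "a" else 0.0)
print(filter_for_finetuning(traces))  # [{'q': 'a'}]
```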
I accidentally clicked on the approval button...
```python
messages = self.format(signature, demos, inputs)


# Add the assistant message
role = "assistant"
```
nit: just pass the keyword=value to the function directly
```python
from dspy.utils.logging import logger


class TrainingJob(Future):
```
We need a docstring that says why we inherit from Future and what we are trying to gain from that here.
Sounds good!
(It allows us to use the following methods to check on the training status of a job: done(), result(), and set_result().)
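A minimal sketch of that pattern -- subclassing `concurrent.futures.Future` so the training thread publishes its result via `set_result()` while callers use `done()`/`result()`. The `start`/`_run` methods and the returned model name are illustrative, not the PR's actual code:

```python
import threading
from concurrent.futures import Future

class TrainingJob(Future):
    """A Future so callers get done()/result() and the worker calls set_result()."""

    def start(self):
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        # Stand-in for the blocking fine-tuning call.
        self.set_result("ft:model-123")

job = TrainingJob()
job.start()
print(job.result())  # blocks until the worker thread calls set_result()
```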
Force-pushed 6040baf to 218b71e.