
Investigate crewAI-training, see how it could apply to our agents #298

Closed
evangriffiths opened this issue Jul 2, 2024 · 3 comments

Comments

@evangriffiths
Contributor

https://docs.crewai.com/core-concepts/Training-Crew

Have only had a glance so far. Looks like it does some form of RL to tune the prompts! This is exactly the kind of feature we want for our PM agents to learn over time!

@evangriffiths
Contributor Author

Here's the commit where the training feature was added: crewAIInc/crewAI@175d5b3.

My understanding of how it works:

  • it builds on top of an existing 'human feedback' feature that crewai has, where, if enabled:
    • when an agent has finished its task, it gives the proposed output to the user, who can give feedback
    • the agent re-runs with this feedback to produce the final output
  • crew.train turns this into an iterative process, where it runs the agent normally, with the 'human feedback' feature enabled, but...
  • each iteration, the proposed output, the human feedback, and the final output are appended to a .pkl file
  • at the start of each iteration, this accumulated data is loaded and appended to each agent's prompt
  • then at the end of the training process, the contents of the .pkl file (the "training data") for each agent are synthesized/summarized into useful learnings, and these are saved to a separate .pkl file
  • now when the crew is run normally, outside of the train process, it loads the contents of the second .pkl file into the agent's prompt (a rough sketch of this whole flow is below)
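
To make the above concrete, here's a minimal, self-contained sketch of what I think the training loop boils down to. None of this is the actual crewAI code or API -- the function names (`run_agent`, `ask_human`, `summarize`), the record keys, and the file names are all placeholders I've made up for illustration:

```python
import pickle
from pathlib import Path

# File names are placeholders, not the ones crewAI actually uses.
TRAINING_FILE = Path("training_data.pkl")         # raw per-iteration records
LEARNINGS_FILE = Path("trained_agents_data.pkl")  # distilled learnings for normal runs


def run_agent(prompt: str) -> str:
    """Stand-in for the real agent/LLM call."""
    return f"<output for: {prompt[:40]}...>"


def ask_human(proposed_output: str) -> str:
    """Stand-in for crewAI's human-in-the-loop feedback step."""
    return input(f"Proposed output:\n{proposed_output}\nYour feedback: ")


def summarize(records: list[dict]) -> str:
    """Stand-in for the LLM call that distils raw feedback into 'learnings'."""
    return "\n".join(r["human_feedback"] for r in records)


def train(base_prompt: str, n_iterations: int) -> None:
    records: list[dict] = []
    for _ in range(n_iterations):
        # Feedback collected so far is folded back into the prompt each iteration.
        prompt = base_prompt
        if records:
            prompt += "\n\nPrevious feedback:\n" + "\n".join(
                r["human_feedback"] for r in records
            )

        proposed = run_agent(prompt)
        feedback = ask_human(proposed)
        final = run_agent(prompt + "\n\nHuman feedback: " + feedback)

        records.append(
            {
                "initial_output": proposed,
                "human_feedback": feedback,
                "improved_output": final,
            }
        )
        TRAINING_FILE.write_bytes(pickle.dumps(records))

    # End of training: distil the raw records into learnings and save them
    # separately; normal runs load this second file into the agent's prompt.
    LEARNINGS_FILE.write_bytes(pickle.dumps(summarize(records)))
```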

All in all, this is quite similar to what we were trying to do with RememberPastActions, but with some important differences:

  • it uses human-in-the-loop feedback to evaluate agent actions -- we don't currently have any explicit functionality for the agent to evaluate its strategy.
  • it is much more structured -- we just let the agent decide when to use RememberPastActions, whereas here the human gives feedback every iteration, and that feedback is appended to the prompt in a fixed way.

I haven't tried crewai.train, so I don't know how well it works in practice, but I think we could definitely take inspiration from it for the general agent.
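
For example, the inference-time half of the pattern (the part we'd borrow for our agents) could look something like the sketch below. Again, this is just an illustration under my assumptions: the file name and prompt wording are placeholders, and how we'd actually hook it into our agent's prompt construction is an open question.

```python
import pickle
from pathlib import Path

# Placeholder path for the distilled learnings produced during training.
LEARNINGS_FILE = Path("trained_agents_data.pkl")


def build_prompt(base_prompt: str) -> str:
    """Prepend any stored learnings to the agent's prompt on a normal run."""
    if not LEARNINGS_FILE.exists():
        return base_prompt
    learnings: str = pickle.loads(LEARNINGS_FILE.read_bytes())
    return (
        base_prompt
        + "\n\nLearnings from past human feedback (apply these where relevant):\n"
        + learnings
    )
```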

@evangriffiths
Contributor Author

evangriffiths commented Jul 18, 2024

Overlaps with #207. We should think about these issues together.

@evangriffiths
Contributor Author

I've written up an idea for how to apply the learnings from this feature in a new issue, #370, so I'm closing this one.

@kongzii removed this from the General Agent - Task Queue milestone on Aug 7, 2024