Define a strategy for prompt testing #144
-
Totally. This is actually a very good point, especially considering that some of the recent changes have introduced bugs. It's also likely we'll need to change the prompt again at some point. In that case, I recommend writing down a list of "questions/prompts" we can ask PandasAI to benchmark whether there are regressions. I suggest we collect them in this conversation and, once we have, say, 50 different use cases, add them to the documentation. In the long run, it would also be cool to run them in CI whenever a prompt changes (a little expensive, but that shouldn't happen too often). What do you think?
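To make the idea concrete, here is a minimal sketch of what such a regression suite could look like as a pytest module. It assumes the `PandasAI(llm).run(df, prompt=...)` entry point from the current API; the `BENCHMARK_PROMPTS` list, the fixture dataframe, and the `check` predicates are illustrative placeholders, not an agreed-upon benchmark.

```python
# Sketch of a prompt regression suite for PandasAI.
# Assumes OPENAI_API_KEY is set in the environment (or pass api_token
# explicitly to OpenAI()); the benchmark entries below are hypothetical.
import pandas as pd
import pytest

from pandasai import PandasAI
from pandasai.llm.openai import OpenAI

# Each entry pairs a benchmark prompt with a loose predicate over the
# answer. The idea is to grow this list in this conversation until we
# have ~50 use cases.
BENCHMARK_PROMPTS = [
    ("Which is the happiest country?",
     lambda out: "Canada" in str(out)),
    ("What is the sum of the GDPs of the 2 unhappiest countries?",
     lambda out: "7160000" in str(out).replace(",", "")),
]


@pytest.fixture(scope="module")
def pandas_ai():
    # Each run costs real tokens, which is why the suite should only
    # trigger when a prompt file changes.
    return PandasAI(OpenAI())


@pytest.fixture(scope="module")
def df():
    # Tiny fixture dataframe so the expected answers are stable enough
    # to assert on.
    return pd.DataFrame({
        "country": ["Canada", "France", "Japan"],
        "gdp": [1_930_000, 2_780_000, 4_380_000],
        "happiness_index": [7.23, 6.66, 5.87],
    })


@pytest.mark.parametrize("prompt,check", BENCHMARK_PROMPTS)
def test_prompt_regression(pandas_ai, df, prompt, check):
    answer = pandas_ai.run(df, prompt=prompt)
    assert check(answer), f"Regression on benchmark prompt: {prompt!r}"
```

In CI, this could be gated so it only runs when the prompt files change (e.g. a paths filter on the workflow trigger), which keeps the token cost bounded as suggested above.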
-
Hi,
we've changed and adjusted the code generation prompt a couple of times over the past few days. I highly suggest defining a strategy for prompt evaluation, so we can measure improvements and make sure a change isn't patching one failing scenario while breaking several others. Do we have one at the moment?