During Agent Evaluation:
According to this sentence mentioned in the doc, for my case:
tests/agent_evaluations/test_agent_evaluations.py::test_agent_evaluations[journal_agent.test] Summary: `EvalStatus.NOT_EVALUATED` for Metric: `tool_trajectory_avg_score`. Expected threshold: `0.6`, actual value: `None`.
+----+-------------------+---------+-------------+---------------------------------------------------------------+---------------------+-------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| | eval_status | score | threshold | prompt | expected_response | actual_response | expected_tool_calls | actual_tool_calls |
+====+===================+=========+=============+===============================================================+=====================+===================+===========================================================================================================================================================================================================================================+================================================================================================================================================================================================================================================+
| 0 | EvalStatus.FAILED | 0 | 0.6 | I need to create a journal entry to accrue $50,000 in revenue | | | id=None args={'agent_name': 'journal_agent'} name='transfer_to_agent' | id='call_AAN5F6rdDkHFtytOr1UZvsmc' args={'agent_name': 'journal_agent'} name='transfer_to_agent' |
| | | | | | | | id=None args={'request': "Create a journal entry to accrue $50,000 in revenue. Set journal type to Accrual and amount to 50,000. Leave ledger and period for user selection.'} name='form_generation_agent"} name='form_generation_agent' | id='call_Vd8tuvc74i9UFKhkJOJUqjUX' args={'request': 'Create a journal entry to accrue $50,000 in revenue. Journal type: Accrual, Amount: 50,000 (credit revenue, debit accounts receivable or accrued revenue).'} name='form_generation_agent' |
+----+-------------------+---------+-------------+---------------------------------------------------------------+---------------------+-------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| 1 | EvalStatus.FAILED | 0 | 0.6 | I need to create a journal entry to accrue $50,000 in revenue | | | id=None args={'agent_name': 'journal_agent'} name='transfer_to_agent' | id='call_Ds9YDzxii51DyWt62Erzvxc9' args={'agent_name': 'journal_agent'} name='transfer_to_agent' |
| | | | | | | | id=None args={'request': "Create a journal entry to accrue $50,000 in revenue. Set journal type to Accrual and amount to 50,000. Leave ledger and period for user selection.'} name='form_generation_agent"} name='form_generation_agent' | id='call_mHIHBlNcx0WQKYP1VbhasbZM' args={'request': 'Create a journal entry to accrue $50,000 in revenue. Amount: 50000, type: Accrual, purpose: revenue accrual'} name='form_generation_agent' |
+----+-------------------+---------+-------------+---------------------------------------------------------------+---------------------+-------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
For the first test case, the score should be 0.5 because the first tool call match exactly giving it 1 if we don't consider id and second tool call will be 0 as it doesn't match but its still saying 0 in the tool trajectory avg score.
Desktop (please complete the following information):
- OS: [e.g. macOS, Linux, Windows] Mac
- Python version(python -V): 3.12
- ADK version(pip show google-adk): 2.0.0
Model Information:
- Are you using LiteLLM: Yes
During Agent Evaluation:
According to this sentence mentioned in the doc, for my case:
For the first test case, the score should be 0.5 because the first tool call match exactly giving it 1 if we don't consider
idand second tool call will be 0 as it doesn't match but its still saying 0 in the tool trajectory avg score.Desktop (please complete the following information):
Model Information: