Push on main #16993

GitHub Actions / promptflow-evals test result succeeded May 22, 2024 in 0s

All 66 tests pass in 15m 39s

 12 files   12 suites   15m 39s ⏱️
 66 tests  66 ✅ 0 💤 0 ❌
792 runs  792 ✅ 0 💤 0 ❌

Results for commit f17b01e.

Annotations

66 tests found

There are 66 tests; see "Raw output" for the full list.
Raw output
tests.evals.unittests.test_batch_run_context.TestBatchRunContext ‑ test_with_codeclient
tests.evals.unittests.test_batch_run_context.TestBatchRunContext ‑ test_with_pfclient
tests.evals.unittests.test_built_in_evaluator.TestBuiltInEvaluators ‑ test_fluency_evaluator
tests.evals.unittests.test_built_in_evaluator.TestBuiltInEvaluators ‑ test_fluency_evaluator_empty_string
tests.evals.unittests.test_built_in_evaluator.TestBuiltInEvaluators ‑ test_fluency_evaluator_non_string_inputs
tests.evals.unittests.test_chat_evaluator.TestChatEvaluator ‑ test_conversation_validation_invalid_citations
tests.evals.unittests.test_chat_evaluator.TestChatEvaluator ‑ test_conversation_validation_missing_role
tests.evals.unittests.test_chat_evaluator.TestChatEvaluator ‑ test_conversation_validation_normal
tests.evals.unittests.test_chat_evaluator.TestChatEvaluator ‑ test_conversation_validation_question_answer_not_paired
tests.evals.unittests.test_chat_evaluator.TestChatEvaluator ‑ test_per_turn_results_aggregation
tests.evals.unittests.test_content_safety_chat_evaluator.TestChatEvaluator ‑ test_conversation_validation_missing_role
tests.evals.unittests.test_content_safety_chat_evaluator.TestChatEvaluator ‑ test_conversation_validation_normal
tests.evals.unittests.test_content_safety_chat_evaluator.TestChatEvaluator ‑ test_conversation_validation_question_answer_not_paired
tests.evals.unittests.test_content_safety_chat_evaluator.TestChatEvaluator ‑ test_per_turn_results_aggregation
tests.evals.unittests.test_content_safety_defect_rate.TestContentSafetyDefectRate ‑ test_content_safety_defect_rate
tests.evals.unittests.test_evaluate.TestEvaluate ‑ test_apply_column_mapping
tests.evals.unittests.test_evaluate.TestEvaluate ‑ test_apply_column_mapping_target[json_data0-inputs_mapping0-I'm fine]
tests.evals.unittests.test_evaluate.TestEvaluate ‑ test_apply_column_mapping_target[json_data1-inputs_mapping1-I'm great]
tests.evals.unittests.test_evaluate.TestEvaluate ‑ test_apply_column_mapping_target[json_data2-inputs_mapping2-I'm fine]
tests.evals.unittests.test_evaluate.TestEvaluate ‑ test_apply_column_mapping_target[json_data3-inputs_mapping3-I'm fine]
tests.evals.unittests.test_evaluate.TestEvaluate ‑ test_apply_column_mapping_target[json_data4-inputs_mapping4-I'm great]
tests.evals.unittests.test_evaluate.TestEvaluate ‑ test_apply_column_mapping_target[json_data5-inputs_mapping5-I'm fine]
tests.evals.unittests.test_evaluate.TestEvaluate ‑ test_apply_target_to_data[questions.jsonl-questions_answers.jsonl-expected_columns0-_target_fn]
tests.evals.unittests.test_evaluate.TestEvaluate ‑ test_apply_target_to_data[questions_ground_truth.jsonl-questions_answers_ground_truth.jsonl-expected_columns1-_target_fn2]
tests.evals.unittests.test_evaluate.TestEvaluate ‑ test_evaluate_evaluators_not_a_dict
tests.evals.unittests.test_evaluate.TestEvaluate ‑ test_evaluate_invalid_data
tests.evals.unittests.test_evaluate.TestEvaluate ‑ test_evaluate_invalid_evaluator_config
tests.evals.unittests.test_evaluate.TestEvaluate ‑ test_evaluate_invalid_jsonl_data
tests.evals.unittests.test_evaluate.TestEvaluate ‑ test_evaluate_missing_data
tests.evals.unittests.test_evaluate.TestEvaluate ‑ test_evaluate_missing_required_inputs
tests.evals.unittests.test_evaluate.TestEvaluate ‑ test_evaluate_missing_required_inputs_target
tests.evals.unittests.test_evaluate.TestEvaluate ‑ test_evaluate_output_path[False]
tests.evals.unittests.test_evaluate.TestEvaluate ‑ test_evaluate_output_path[True]
tests.evals.unittests.test_evaluate.TestEvaluate ‑ test_evaluate_with_errors
tests.evals.unittests.test_evaluate.TestEvaluate ‑ test_renaming_column
tests.evals.unittests.test_evaluate.TestEvaluate ‑ test_target_raises_on_outputs
tests.evals.unittests.test_evaluate.TestEvaluate ‑ test_wrong_target
tests.evals.unittests.test_get_trace_destination_config.TestGetTraceDestinationConfig ‑ test_get_trace_destination_config[NONE-None]
tests.evals.unittests.test_get_trace_destination_config.TestGetTraceDestinationConfig ‑ test_get_trace_destination_config[NoNe-None]
tests.evals.unittests.test_get_trace_destination_config.TestGetTraceDestinationConfig ‑ test_get_trace_destination_config[None-None]
tests.evals.unittests.test_get_trace_destination_config.TestGetTraceDestinationConfig ‑ test_get_trace_destination_config[none-None]
tests.evals.unittests.test_get_trace_destination_config.TestGetTraceDestinationConfig ‑ test_get_trace_destination_config_with_override
tests.evals.unittests.test_save_eval.TestSaveEval ‑ test_load_and_run_evaluators
tests.evals.unittests.test_save_eval.TestSaveEval ‑ test_save_evaluators[ChatEvaluator]
tests.evals.unittests.test_save_eval.TestSaveEval ‑ test_save_evaluators[CoherenceEvaluator]
tests.evals.unittests.test_save_eval.TestSaveEval ‑ test_save_evaluators[ContentSafetyChatEvaluator]
tests.evals.unittests.test_save_eval.TestSaveEval ‑ test_save_evaluators[ContentSafetyEvaluator]
tests.evals.unittests.test_save_eval.TestSaveEval ‑ test_save_evaluators[F1ScoreEvaluator]
tests.evals.unittests.test_save_eval.TestSaveEval ‑ test_save_evaluators[FluencyEvaluator]
tests.evals.unittests.test_save_eval.TestSaveEval ‑ test_save_evaluators[GroundednessEvaluator]
tests.evals.unittests.test_save_eval.TestSaveEval ‑ test_save_evaluators[HateUnfairnessEvaluator]
tests.evals.unittests.test_save_eval.TestSaveEval ‑ test_save_evaluators[QAEvaluator]
tests.evals.unittests.test_save_eval.TestSaveEval ‑ test_save_evaluators[RelevanceEvaluator]
tests.evals.unittests.test_save_eval.TestSaveEval ‑ test_save_evaluators[SelfHarmEvaluator]
tests.evals.unittests.test_save_eval.TestSaveEval ‑ test_save_evaluators[SexualEvaluator]
tests.evals.unittests.test_save_eval.TestSaveEval ‑ test_save_evaluators[SimilarityEvaluator]
tests.evals.unittests.test_save_eval.TestSaveEval ‑ test_save_evaluators[ViolenceEvaluator]
tests.evals.unittests.test_simulator.TestSimulator ‑ test_initialization_with_all_valid_scenarios
tests.evals.unittests.test_simulator.TestSimulator ‑ test_simulator_raises_validation_error_with_unsupported_scenario
tests.evals.unittests.test_synthetic_callback_conv_bot.TestCallbackConversationBot ‑ test_generate_response_with_callback_exception
tests.evals.unittests.test_synthetic_callback_conv_bot.TestCallbackConversationBot ‑ test_generate_response_with_no_callback_response
tests.evals.unittests.test_synthetic_callback_conv_bot.TestCallbackConversationBot ‑ test_generate_response_with_valid_callback
tests.evals.unittests.test_synthetic_conversation_bot.TestConversationBot ‑ test_conversation_bot_initialization_assistant
tests.evals.unittests.test_synthetic_conversation_bot.TestConversationBot ‑ test_conversation_bot_initialization_user
tests.evals.unittests.test_synthetic_conversation_bot.TestConversationBot ‑ test_generate_response_first_turn_with_starter
tests.evals.unittests.test_synthetic_conversation_bot.TestConversationBot ‑ test_generate_response_with_history_and_role