Feature/implement the mentalchat16k dataset support for clinical evaluation #1218
Merged
chakravarthik27 merged 56 commits into release/2.7.0 from feature/implement-the-mentalchat16k-dataset-support-for-clinical-evaluation on Sep 9, 2025
Conversation
This pull request introduces a new mental health evaluation capability to the codebase, enabling the assessment of AI-generated mental health counseling responses against a set of clinical consultation metrics. It adds a new evaluation prompt and schema, a corresponding evaluation class, and integrates this functionality into the clinical test transformation pipeline. The changes also include a new SimplePrompt sample type to support these evaluations and ensure results are parsed and scored appropriately.

Mental Health Evaluation Integration

- Added MENTAL_HEALTH_EVAL_PROMPT and the MHCEvaluation schema in eval_prompts.py to define a structured prompt and scoring rubric for mental health counseling response evaluation.
- Added a RatingEval class in llm_eval.py, which uses the new prompt and schema to parse and score AI responses, including batch evaluation support.
- Added a mental_health test type with a dedicated MentalHealth class that loads data, transforms samples, and runs evaluations using the new prompt and scoring system. [1] [2]
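To make the rubric-and-parse flow concrete, here is a minimal sketch of how a structured evaluation schema and a rating parser could fit together. The metric names, the 1-5 score range, and the clamping behavior are all assumptions for illustration; the actual MHCEvaluation schema and RatingEval class in langtest may differ.

```python
from dataclasses import dataclass
from statistics import mean

# Hypothetical sketch only: the real MHCEvaluation schema in
# eval_prompts.py may use different metrics and field names.
@dataclass
class MHCEvaluation:
    """Scores (1-5) for a counseling response on clinical consultation metrics."""
    active_listening: int
    empathy: int
    safety: int
    clarity: int

    def overall(self) -> float:
        # Aggregate the rubric scores into a single rating.
        return mean([self.active_listening, self.empathy,
                     self.safety, self.clarity])


def parse_rating(raw: dict) -> MHCEvaluation:
    """Parse a judge model's JSON-like output into the schema,
    clamping each score to the assumed 1-5 rubric range."""
    clamp = lambda v: max(1, min(5, int(v)))
    return MHCEvaluation(**{k: clamp(v) for k, v in raw.items()})


result = parse_rating({"active_listening": 4, "empathy": 5,
                       "safety": 7, "clarity": 3})
print(result.overall())  # -> 4.25 (safety clamped to 5)
```

Clamping out-of-range scores keeps a single malformed judge response from skewing batch aggregates, which matters when many samples are evaluated at once.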
Sample Type Extension

- Added a SimplePrompt class to sample.py, designed for prompt-response pairs, with methods for evaluation, scoring, and pass/fail determination using the mental health metrics and the new evaluation pipeline.

Internal Imports and Wiring
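The SimplePrompt sample type described above could be sketched roughly as follows. The field names, the pass threshold, and the all-metrics-must-pass rule are illustrative assumptions, not langtest's actual interface.

```python
from dataclasses import dataclass, field

# Illustrative sketch: langtest's real SimplePrompt class in sample.py
# may expose a different interface; the 3.0 threshold is an assumption.
@dataclass
class SimplePrompt:
    """A prompt-response pair scored against mental-health metrics."""
    prompt: str
    expected_response: str = ""
    actual_response: str = ""
    scores: dict = field(default_factory=dict)
    pass_threshold: float = 3.0  # assumed cutoff on a 1-5 rubric

    def run(self, model) -> None:
        # `model` is any callable that maps a prompt to response text.
        self.actual_response = model(self.prompt)

    def is_pass(self) -> bool:
        # Pass only when every rubric metric meets the threshold;
        # an unscored sample never passes.
        return bool(self.scores) and all(
            s >= self.pass_threshold for s in self.scores.values())


sample = SimplePrompt(prompt="I feel overwhelmed lately.")
sample.run(lambda p: "It sounds like you're carrying a lot right now.")
sample.scores = {"empathy": 4.5, "safety": 5.0, "clarity": 3.5}
print(sample.is_pass())  # -> True
```

Keeping prompt, response, and scores on one sample object lets the test harness serialize pass/fail results uniformly alongside other sample types.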