refactor: back2onehypo #805

Draft · wants to merge 1 commit into base: main
82 changes: 82 additions & 0 deletions rdagent/scenarios/data_science/proposal/exp_gen/prompts_v2.yaml
@@ -212,3 +212,85 @@ output_format:
}
},
}

hypothesis_v2: |-
{
"component": "The component name that the hypothesis {% if pipeline %}mainly {% endif %}focuses on. Must be one of ('DataLoadSpec', 'FeatureEng', 'Model', 'Ensemble', 'Workflow').",
"hypothesis": "Start with [Ablate with Trial xx]. A concise, testable statement derived from previous experimental outcomes. Limit it to one or two sentences that clearly specify the expected change or improvement in the <component>'s performance.",
"reason": "A brief explanation, also in one or two sentences, outlining the rationale behind the hypothesis. It should reference specific trends or failures from past experiments and explain how the proposed approach may address these issues.",
}




hypo_task_gen:
system: |-
{% include "scenarios.data_science.share:scen.role" %}
The user is iteratively improving a Kaggle competition implementation through a trace of experiments, where each new experiment is modified from the current SOTA in the trace, not necessarily from its immediate predecessor.

You will be given a competition scenario, a description of the trace history, and the current SOTA implementation. You need to carefully analyze the past experiments and their corresponding results to determine which techniques are appropriate for the current competition context. Then, based on the current experiment, propose a new hypothesis (which must be grounded in previous shortcomings and designed to produce an ablation effect). Finally, formulate a detailed experimental task that aligns with the hypothesis and addresses the targeted areas for improvement.

## Step 1: Analyze Previous Experiments and Feedback
Carefully review prior experiments and their results; identify which techniques have been tried, which have failed, and which gaps remain.
Do not output this analysis explicitly.

## Step 2: Hypothesis Proposal
### Hypothesis Definition
A hypothesis is a precise, testable, and actionable statement that proposes a specific modification or improvement to address an identified problem in a Kaggle competition implementation. Be careful not to use vague phrases like "xx technique"—you must be specific and clearly state the exact method or approach being proposed.
{% if not pipeline %}
Each hypothesis should focus on one of the following 5 components of an implementation:
{{ component_desc }}
{% else %}
Although we don't require each hypothesis to focus on a single specific component, you should still respond with the main component that the hypothesis focuses on.
Candidate components are:
{{ component_desc }}
{% endif %}

### Hypothesis Specification
{{ hypothesis_spec }}

### Hypothesis Output Format
{{ hypothesis_output_format }}

## Step 3: Task Design
### Task Design Definition
Your first task is to generate a new solution based on the proposed hypothesis. The task description should be very detailed, with specific steps and instructions. The task should be specific and fine-grained, avoiding general or vague statements.

### Task Design Specification
{{ task_specification }}

### Task Design Guidelines
The task should be concise, with several steps, each only a few sentences long.
DO NOT repeat details that are already included in the SOTA code. If the SOTA code already covers a step, do not repeat it in detail.
DO NOT write any code in the task description!
Observe the reasons behind failed experiments and their feedback to avoid repeating similar mistakes in analogous situations.

### [Partial Response Format 1] Task Output Format:
{{ task_output_format }}

{% if workflow_check %}
## Step 4: Workflow Update
Since components have dependencies, your second task is to update the workflow to reflect the changes made to the target component. Please also decide whether the workflow needs to be updated and provide a brief description of the change task.
{{ component_desc }}
[Partial Response Format 2] Your generated workflow description should be plain text; a following agent will do the implementation. If you think the workflow should not be updated, just respond with "No update needed".
{% endif %}

Your final output should strictly adhere to the following JSON format.
{
"hypothesis": ---The dict corresponding to Hypothesis Output Format---,
"task_design": ---The dict corresponding to Task Output Format---,
{% if workflow_check %}"workflow_update": ---A string corresponding to workflow description--- {% endif %}
}

user: |-
# Scenario Description
{{ scenario_desc }}

# Previous Experiments and Feedbacks:
{{ exp_and_feedback_list_desc }}

# Current SOTA Implementation
{{ sota_exp_desc }}

# Feedback from Previous Failed Experiments (e.g., experiments that did not pass evaluation, encountered bugs, or failed to surpass SOTA performance):
{{ failed_exp_and_feedback_list_desc }}
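Both templates above instruct the model to return a single JSON object (a hypothesis dict plus a task design, and optionally a workflow update). A minimal sketch of how a caller might validate such a response before using it; the function name `parse_hypo_task_response` and the validation details are hypothetical illustrations, not part of the RD-Agent codebase:

```python
import json

# Allowed component names, mirroring the hypothesis_v2 schema above.
VALID_COMPONENTS = {"DataLoadSpec", "FeatureEng", "Model", "Ensemble", "Workflow"}


def parse_hypo_task_response(raw: str, workflow_check: bool = False) -> dict:
    """Parse and sanity-check the combined hypothesis/task JSON response.

    Hypothetical helper: key names mirror the "final output" schema in the
    hypo_task_gen system prompt; everything else is an assumption.
    """
    resp = json.loads(raw)
    required = {"hypothesis", "task_design"}
    if workflow_check:
        required.add("workflow_update")
    missing = required - resp.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    component = resp["hypothesis"].get("component")
    if component not in VALID_COMPONENTS:
        raise ValueError(f"invalid component: {component!r}")
    return resp
```

A malformed or incomplete model response then fails fast with a `ValueError` instead of propagating into task generation.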
23 changes: 23 additions & 0 deletions rdagent/scenarios/data_science/proposal/exp_gen/proposal.py
@@ -354,6 +354,8 @@ def task_gen(
task_spec = sota_exp.experiment_workspace.file_dict[component_info["spec_file"]]
else:
task_spec = T(f"scenarios.data_science.share:component_spec.{hypothesis.component}").r()


sys_prompt = T(".prompts_v2:task_gen.system").r(
targets=component_info["target_name"],
task_specification=task_spec,
@@ -430,6 +432,27 @@ def gen(self, trace: DSTrace, pipeline: bool = False) -> DSExperiment:
type="failed",
)

if pipeline:
component_info = COMPONENT_TASK_MAPPING["Pipeline"]
else:
component_info = COMPONENT_TASK_MAPPING.get(hypothesis.component)


sys_prompt = T(".prompts_v2:hypo_task_gen.system").r(
component_desc=component_desc,
hypothesis_spec=T(".prompts_v2:specification.hypothesis").r(),
hypothesis_output_format=T(".prompts_v2:output_format.hypothesis_v2").r(pipeline=pipeline),
task_specification=task_spec,
task_output_format=component_info["task_output_format"],
workflow_check=not pipeline and hypothesis.component != "Workflow",
)
user_prompt = T(".prompts_v2:hypo_task_gen.user").r(
scenario_desc=scenario_desc,
exp_and_feedback_list_desc=exp_feedback_list_desc,
sota_exp_desc=sota_exp_desc,
failed_exp_and_feedback_list_desc=failed_exp_feedback_list_desc,
)

# Step 1: Identify problems
scen_problems = self.identify_scenario_problem(
scenario_desc=scenario_desc,
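The `workflow_check=not pipeline and hypothesis.component != "Workflow"` argument in the hunk above gates the optional Step 4 of the prompt: a workflow update is only requested when a single component (other than the workflow itself) is being edited. The same rule isolated into a standalone helper for illustration; the function name is hypothetical:

```python
def needs_workflow_check(pipeline: bool, component: str) -> bool:
    # A workflow update prompt is only needed when editing a single
    # component (not the whole pipeline) and that component is not
    # itself "Workflow".
    return not pipeline and component != "Workflow"
```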