EnvCommons/SocialData

SocialData

OpenReward Environment

Description

SocialData is a collection of data science competition environments sourced from DrivenData. It contains 5 multi-turn sandboxed tasks where agents develop machine learning models to solve real-world prediction problems spanning public health, infrastructure, natural language processing, and disaster response.

Capabilities

  • Data exploration and feature engineering
  • Machine learning model development
  • Time series prediction
  • Multi-target classification and regression
  • Document summarization with LLMs

Compute Requirements

Agents are given a sandboxed environment with 1 CPU and 4 GB RAM, with access to scientific Python libraries (pandas, scikit-learn, etc.).

Tasks

There are 5 environment variants, each with a train split:

| Variant | Description | Metric |
| --- | --- | --- |
| FluVaccinePrediction | Predict H1N1 and seasonal flu vaccination probabilities | Mean ROC AUC |
| PumpItUpPrediction | Classify water pump functionality in Tanzania | F1-micro |
| DocSumTask | Summarize social science research papers | ROUGE-2 F1 |
| DengAIPrediction | Predict weekly dengue fever case counts | Mean Absolute Error |
| RichterPrediction | Predict earthquake building damage grades | F1-micro |

Reward Structure

This is a multi-turn environment. Agents explore data, develop models, generate predictions, and submit them with the submit_predictions tool, which ends the episode. The reward is computed from the variant's own evaluation metric:

  • FluVaccinePrediction: Mean ROC AUC across H1N1 and seasonal targets (0-1)
  • PumpItUpPrediction: Micro-averaged F1 across 3 classes (0-1)
  • DocSumTask: ROUGE-2 F1 score (0-1)
  • DengAIPrediction: Inverted MAE (lower error = higher reward)
  • RichterPrediction: Micro-averaged F1 across 3 damage grades (0-1)
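
For local validation before submitting, the metrics listed above can be approximated in plain Python. The following is a minimal sketch, not the environment's scoring code: the function names are hypothetical, the ROUGE-2 version uses simple lowercased whitespace tokenization (real ROUGE implementations may tokenize or stem differently), and only raw MAE is shown because this README does not specify how the error is inverted into a reward.

```python
from collections import Counter


def micro_f1(y_true, y_pred):
    """Micro-averaged F1 for single-label multiclass predictions.

    Each wrong prediction counts as one false positive (for the predicted
    class) and one false negative (for the true class), so with one label
    per sample this value equals plain accuracy.
    """
    tp = sum(t == p for t, p in zip(y_true, y_pred))
    errors = len(y_true) - tp
    denom = 2 * tp + 2 * errors
    return 2 * tp / denom if denom else 0.0


def roc_auc(y_true, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Quadratic in the number of samples,
    which is fine for a sketch but slow on large validation sets.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mean_roc_auc(targets):
    """Average ROC AUC over several (y_true, scores) pairs, one per target."""
    return sum(roc_auc(y, s) for y, s in targets) / len(targets)


def mean_absolute_error(y_true, y_pred):
    """Raw MAE; lower is better."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rouge2_f1(reference, candidate):
    """Simplified ROUGE-2 F1: clipped bigram overlap between two texts."""
    def bigrams(text):
        tokens = text.lower().split()
        return Counter(zip(tokens, tokens[1:]))

    ref, cand = bigrams(reference), bigrams(candidate)
    overlap = sum((ref & cand).values())  # min count per shared bigram
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

Scores from helpers like these are best treated as a rough guide on a held-out slice of the training data; the reward returned by the environment is the authoritative number.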

Data

Training data is mounted read-only at /orwd_data. Each competition includes:

  • Training features and labels
  • Test features (labels hidden)
  • Data dictionaries and descriptions

Data is sourced from DrivenData competitions and stored on the OpenReward platform.
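
A first step in most episodes is simply finding out what is under the data mount. The sketch below lists every file beneath a root directory; the helper name is hypothetical, and the file layout inside /orwd_data is not documented here, so the code makes no assumption about specific file names.

```python
from pathlib import Path


def list_data_files(root="/orwd_data"):
    """Return sorted relative paths of all files under `root`.

    Returns an empty list if the directory does not exist, so the helper
    can be called safely outside the sandbox as well.
    """
    root_path = Path(root)
    if not root_path.is_dir():
        return []
    return sorted(
        p.relative_to(root_path).as_posix()
        for p in root_path.rglob("*")
        if p.is_file()
    )
```

Because the mount is read-only, any derived files (cleaned features, model checkpoints, the submission CSV) need to be written somewhere else in the sandbox.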

Tools

Each variant provides CLI tools plus a submission tool:

| Tool | Description |
| --- | --- |
| bash | Execute shell commands in the sandbox |
| glob | Find files by pattern |
| grep | Search file contents |
| ls | List directory contents |
| read | Read file contents |
| write | Write file contents |
| edit | Edit existing files |
| multi_edit | Make multiple edits |
| todo_write | Track task progress |
| submit_predictions | Submit predictions CSV for evaluation. Ends the episode. |

Time Horizon

Multi-turn. Agents explore data, develop and train models, generate predictions, save to submission.csv, and submit for evaluation.
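
The final step, saving predictions to submission.csv, can be sketched with the standard library alone. The helper name and the column names in the usage note are hypothetical; the columns each variant expects are not listed in this README and should be taken from the competition's data descriptions.

```python
import csv


def write_submission(rows, fieldnames, path="submission.csv"):
    """Write prediction rows (a list of dicts) to a CSV file with a header.

    `fieldnames` fixes the column order. The required columns differ per
    variant and are an assumption of the caller, not of this helper.
    """
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return path
```

For example, `write_submission([{"id": 1, "label": 2}], ["id", "label"])` would produce a two-line file with a header row and one prediction. Since submit_predictions ends the episode, it is worth reading the file back and checking row counts and column names before submitting.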

Environment Difficulty

[Put environment difficulty here]

Other Environment Requirements

None. All evaluation is deterministic using competition-specific metrics.

Safety

Agents in SocialData work within sandboxed environments to develop ML models. The environment does not present direct safety risks.

Citation

@software{socialdata_openreward,
  title={SocialData: DrivenData Competition Environments for OpenReward},
  author={GeneralReasoning},
  year={2025},
  url={https://openreward.ai/GeneralReasoning/SocialData}
}
