diff --git a/plugins/everyrow/skills/everyrow-sdk/SKILL.md b/plugins/everyrow/skills/everyrow-sdk/SKILL.md
index f9825fc6..d1ed88aa 100644
--- a/plugins/everyrow/skills/everyrow-sdk/SKILL.md
+++ b/plugins/everyrow/skills/everyrow-sdk/SKILL.md
@@ -178,3 +178,12 @@ async with create_session(name="Async Ranking") as session:
     # Continue with other work...
     result = await task.await_result()
 ```
+
+## Best Practices
+
+Everyrow operations have associated costs. To avoid re-running them unnecessarily:
+
+- **Separate data processing from analysis**: Save everyrow results to a file (CSV, Parquet, etc.), then do analysis in a separate script. This way, if the analysis code has bugs, you don't re-trigger the everyrow step.
+- **Use intermediate checkpoints**: For multi-step pipelines, consider saving results after each everyrow operation.
+  - You can chain multiple operations together via the SDK without downloading and re-uploading intermediate results. However, for the most control, implement each step as a dedicated job, possibly orchestrated by a tool such as Apache Airflow or Prefect.
+- **Test with `preview=True`**: Operations like `rank`, `screen`, and `merge` support `preview=True` to process only a few rows first.
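
The checkpointing advice in the added "Best Practices" section could be illustrated with a small sketch along these lines. It assumes the expensive step can be wrapped in a plain callable that returns a list of row dictionaries; `load_or_compute` and the `compute` argument are hypothetical names for illustration and are not part of the everyrow SDK. It uses only the Python standard library.

```python
import csv
from pathlib import Path
from typing import Callable


def load_or_compute(
    checkpoint: Path,
    compute: Callable[[], list[dict[str, str]]],
) -> list[dict[str, str]]:
    """Return rows from `checkpoint` if it exists; otherwise run `compute` and save.

    `compute` is a stand-in (assumption) for whatever costly operation
    produces the rows; it is only called when no checkpoint file is present.
    """
    if checkpoint.exists():
        # Reuse the saved result instead of paying for the operation again.
        with checkpoint.open(newline="", encoding="utf-8") as f:
            return list(csv.DictReader(f))

    rows = compute()
    if not rows:
        # Nothing to persist; skip writing a file with no header.
        return rows

    # Write to a temporary file first so an interrupted run does not
    # leave a half-written checkpoint behind.
    tmp = checkpoint.with_suffix(checkpoint.suffix + ".tmp")
    with tmp.open("w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
    tmp.replace(checkpoint)
    return rows
```

An analysis script can then call `load_or_compute(Path("ranked.csv"), run_step)` repeatedly while being debugged, and `run_step` only executes on the first call. Note that CSV stores every value as text, so rows read back from the checkpoint contain strings; Parquet or another typed format may be a better fit if column types matter.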