@rdheekonda (Contributor) commented on Aug 29, 2025

Key Changes:

  • Migrated export_runs to paginated endpoint for better performance
  • Changed from in-memory DataFrame to disk-based file exports
  • Added support for multiple output formats (parquet, CSV, JSON, and JSONL)

Added:

  • Paginated export endpoint integration (/export/paginated)
  • Disk-based export with organized directory structure
  • Multiple format support: parquet, CSV, JSON, JSONL
  • Intelligent file naming based on chunk size and run count
  • Helper functions for loading exported data back into DataFrames
  • Progress logging and automatic cleanup of old exports
  • New parameters: format and base_dir for export customization
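The paginated, disk-based flow listed above could look roughly like the sketch below. This is an illustration under stated assumptions, not the SDK's implementation: `fetch_page`, `export_pages_to_disk`, `fake_fetch_page`, the `runs_NNNN.jsonl` naming scheme, and the shape of a page are all hypothetical. Only the idea of fetching page by page and writing each chunk to disk comes from this PR.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]
# Hypothetical page fetcher: takes a 1-based page number and returns
# (rows on that page, total number of pages).
PageFetcher = Callable[[int], Tuple[List[Row], int]]


def export_pages_to_disk(fetch_page: PageFetcher, base_dir: str) -> str:
    """Write each page to its own JSONL file and return the directory path."""
    out_dir = Path(base_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    page = 1
    total_pages = 1  # placeholder; replaced by the value from the first response
    while page <= total_pages:
        rows, total_pages = fetch_page(page)
        # A zero-padded index keeps files in page order when sorted by name.
        chunk_path = out_dir / f"runs_{page:04d}.jsonl"
        with chunk_path.open("w", encoding="utf-8") as fh:
            for row in rows:
                fh.write(json.dumps(row) + "\n")
        page += 1
    return str(out_dir)


def fake_fetch_page(page: int) -> Tuple[List[Row], int]:
    """Stand-in for a server: 5 rows served in pages of 2 (3 pages total)."""
    data = [{"run_id": i} for i in range(5)]
    size = 2
    start = (page - 1) * size
    return data[start:start + size], 3


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        export_dir = export_pages_to_disk(fake_fetch_page, tmp)
        print(sorted(p.name for p in Path(export_dir).iterdir()))
```

Only one page of rows is held in memory at a time, which is the property the disk-based design relies on.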

Changed:

  • export_runs() return type from DataFrame to directory path string
  • Updated API client to handle paginated responses with headers
  • Enhanced error handling and logging throughout export process
  • Modified function signatures to include new export parameters
  • Updated method documentation with new parameter descriptions
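Because `export_runs()` now returns a directory path rather than a DataFrame, callers need to read the exported files back themselves. The sketch below shows one way to do that for JSONL chunks using only the standard library. The names `iter_exported_rows` and `load_exported_rows` are hypothetical and are not the helper functions added in this PR; the `*.jsonl` layout is likewise an assumption.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, Iterator, List


def iter_exported_rows(export_dir: str) -> Iterator[Dict[str, object]]:
    """Yield rows from every *.jsonl file in export_dir, in filename order."""
    for chunk in sorted(Path(export_dir).glob("*.jsonl")):
        with chunk.open(encoding="utf-8") as fh:
            for line in fh:
                if line.strip():  # skip blank lines
                    yield json.loads(line)


def load_exported_rows(export_dir: str) -> List[Dict[str, object]]:
    """Collect all exported rows into a single list."""
    return list(iter_exported_rows(export_dir))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Simulate an export directory containing two chunk files.
        (Path(tmp) / "runs_0001.jsonl").write_text('{"run_id": 0}\n', encoding="utf-8")
        (Path(tmp) / "runs_0002.jsonl").write_text('{"run_id": 1}\n', encoding="utf-8")
        print(load_exported_rows(tmp))
```

If pandas is installed, the resulting list of dictionaries can be passed to `pandas.DataFrame(rows)` to recover a DataFrame similar to the old return value.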

Documentation Updates:

  • Updated export guide with new paginated workflow examples
  • Added file structure documentation and loading patterns
  • Updated all export_runs usage examples across documentation
  • Added helper function examples for loading exported files
  • Enhanced filtering and export format examples

Generated Summary:

  • Introduced paginated export functionality for run data, enhancing the export mechanism:
    • Exports data to disk instead of returning a DataFrame directly.
    • Incorporates robust logging to monitor the export process and its outcomes.
    • Provides an intuitive way to specify export formats (parquet, csv, json, jsonl) and base directory options.
  • Updated export_runs method to handle pagination, ensuring all data is written to appropriately named files.
  • Made structural updates in the API documentation reflecting new export behaviors and parameters:
    • Included format and base_dir parameters in the export_runs method documentation.
    • Adjusted example code throughout the documentation to use the new method signatures consistently.
  • Enhanced error handling so that export failures are reported clearly.
  • Fixed import statements so that libraries are imported where methods use them, avoiding unused-import warnings.
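As a rough illustration of how a `format` parameter and clear failure reporting might fit together, the sketch below dispatches on the requested format and rejects unknown values with an explicit error. `write_chunk` and `SUPPORTED_FORMATS` are hypothetical names, and parquet is left out because it needs a third-party library; none of this is taken from the PR's code.

```python
import csv
import json
import tempfile
from pathlib import Path
from typing import Dict, List

# Assumption: parquet is omitted here because it requires a third-party library.
SUPPORTED_FORMATS = ("csv", "json", "jsonl")


def write_chunk(rows: List[Dict[str, object]], path: Path, fmt: str) -> Path:
    """Write rows to path with the extension for fmt and return the new path."""
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(
            f"Unsupported format {fmt!r}; expected one of {SUPPORTED_FORMATS}"
        )
    target = path.with_suffix(f".{fmt}")
    if fmt == "csv":
        # Use the union of all keys so that sparse rows still fit the header.
        fieldnames = sorted({key for row in rows for key in row})
        with target.open("w", encoding="utf-8", newline="") as fh:
            writer = csv.DictWriter(fh, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(rows)
    elif fmt == "json":
        target.write_text(json.dumps(rows), encoding="utf-8")
    else:  # jsonl: one JSON object per line
        with target.open("w", encoding="utf-8") as fh:
            for row in rows:
                fh.write(json.dumps(row) + "\n")
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        rows = [{"run_id": 1, "status": "ok"}]
        for fmt in SUPPORTED_FORMATS:
            print(write_chunk(rows, Path(tmp) / "chunk", fmt).name)
```

Validating the format before any file is opened means a bad value fails early, without leaving a partial export behind.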

This summary was generated with ❤️ by rigging

This update improves export performance for large datasets by using server-side pagination and organized file-based storage. Because results are written to disk in chunks rather than held in a single in-memory DataFrame, exports are no longer limited by available memory, and the output format remains configurable.

@dreadnode-renovate-bot (bot) added the area/docs (Changes to documentation and guides) and type/docs (Documentation updates and improvements) labels on Aug 29, 2025
@rdheekonda merged commit 1ef3266 into main on Sep 2, 2025
4 of 8 checks passed