Replace explorations with notebook-centric API and redesign dataset module #271

Merged
cristian-tamblay merged 219 commits into develop from improvement/dataset-exploration
Sep 5, 2025

Irozuku (Collaborator) commented Aug 22, 2025

This pull request introduces significant changes to the backend API and frontend design, primarily focusing on migrating functionality from datasets and explorations to a new notebook-centric model, removing the explorations endpoint, and expanding dataset file access capabilities. The changes also update validation logic and add new endpoints for interacting with datasets via file paths.

Notebook-centric API migration and endpoint updates:

  • Removed the entire explorations API (endpoints/explorations.py), and updated the main API router to drop the /exploration endpoints and add /notebook endpoints, reflecting a shift from exploration-based to notebook-based workflows. [1] [2] [3]
  • Refactored the converters endpoint to operate on notebooks instead of datasets: updated models, parameters, and logic to use notebook_id, and added a new endpoint to fetch finished converters by notebook. [1] [2] [3] [4] [5]
  • Updated explorers endpoint validation and logic to reference notebooks (and their datasets) instead of explorations, ensuring all explorer operations are now notebook-centric. [1] [2] [3] [4]
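As an illustration of the "finished converters by notebook" lookup described above, here is a minimal, framework-free sketch. Only `notebook_id` comes from the PR description; `ConverterJob`, its `status` values, and `finished_converters_for_notebook` are hypothetical names chosen for the example and may not match the actual code.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class ConverterJob:
    """Hypothetical converter job record keyed by notebook (not the PR's model)."""
    id: int
    notebook_id: int
    status: str  # assumed values: "pending", "running", "finished", "failed"


def finished_converters_for_notebook(
    jobs: Iterable[ConverterJob], notebook_id: int
) -> List[ConverterJob]:
    """Return the jobs that belong to `notebook_id` and have status 'finished'."""
    return [
        job
        for job in jobs
        if job.notebook_id == notebook_id and job.status == "finished"
    ]


# Small in-memory stand-in for whatever storage the backend actually uses.
jobs = [
    ConverterJob(id=1, notebook_id=10, status="finished"),
    ConverterJob(id=2, notebook_id=10, status="running"),
    ConverterJob(id=3, notebook_id=11, status="finished"),
]
finished = finished_converters_for_notebook(jobs, notebook_id=10)
print([job.id for job in finished])
```

Keying the filter on the notebook rather than the dataset mirrors the shift the bullets describe: the notebook becomes the unit that owns converter and explorer runs.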

Enhanced dataset file access and utility endpoints:

  • Added multiple new endpoints to the datasets API to:
    • Fetch a sample of rows directly from a dataset file path.
    • Retrieve dataset info and column types by file path.
    • Paginate through dataset rows by file, returning row data and total count. [1] [2] [3]
  • Improved error handling and JSON serialization for dataset file access. [1] [2]
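The file-based access pattern above (sample rows, column types, and paginated rows with a total count) can be sketched with the standard library alone. This is an assumption-laden illustration, not the PR's implementation: the description does not state the file format, so CSV is assumed, and the function names (`read_rows`, `sample_rows`, `infer_column_types`, `paginate_rows`), the 1-indexed `page` parameter, and the `"numeric"`/`"text"` labels are all hypothetical.

```python
import csv
import os
import tempfile
from typing import Any, Dict, List


def read_rows(file_path: str) -> List[Dict[str, str]]:
    """Load every row of a CSV file as a dict; fail clearly if the file is missing."""
    if not os.path.isfile(file_path):
        raise FileNotFoundError(f"Dataset file not found: {file_path}")
    with open(file_path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def sample_rows(file_path: str, n: int = 5) -> List[Dict[str, str]]:
    """Return the first `n` rows (a head-style sample, assumed for simplicity)."""
    return read_rows(file_path)[:n]


def _is_number(value: Any) -> bool:
    """True if `value` parses as a float."""
    try:
        float(value)
        return True
    except (TypeError, ValueError):
        return False


def infer_column_types(file_path: str) -> Dict[str, str]:
    """Label a column 'numeric' if all its non-empty values parse as floats, else 'text'."""
    rows = read_rows(file_path)
    columns = list(rows[0].keys()) if rows else []
    types: Dict[str, str] = {}
    for column in columns:
        values = [row[column] for row in rows if row[column] not in (None, "")]
        is_numeric = bool(values) and all(_is_number(v) for v in values)
        types[column] = "numeric" if is_numeric else "text"
    return types


def paginate_rows(file_path: str, page: int, page_size: int) -> Dict[str, Any]:
    """Return one 1-indexed page of rows together with the total row count."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    rows = read_rows(file_path)
    start = (page - 1) * page_size
    return {"rows": rows[start:start + page_size], "total": len(rows)}


# Demo on a throwaway file so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "people.csv")
    with open(demo_path, "w", newline="", encoding="utf-8") as handle:
        handle.write("name,age\nana,31\nluis,27\neva,45\n")
    print(paginate_rows(demo_path, page=2, page_size=2))
    print(infer_column_types(demo_path))
```

Returning the total alongside each page lets a client render pagination controls without a second request. Reading the whole file per call keeps the sketch short; a real endpoint would more likely stream or cache.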

These changes collectively modernize the backend data model, streamline API usage around notebooks, and provide more flexible, file-based dataset access for downstream consumers.

Irozuku and others added 30 commits August 4, 2025 17:57
…ate converter_job with the new ConverterList model
Irozuku and others added 21 commits August 21, 2025 13:08
Irozuku force-pushed the improvement/dataset-exploration branch from 6d97c26 to 1955118 August 26, 2025 13:13
cristian-tamblay (Member) left a comment

In general it looks good; I couldn't review it completely.

cristian-tamblay merged commit a3ec6f3 into develop Sep 5, 2025
8 checks passed
cristian-tamblay deleted the improvement/dataset-exploration branch September 5, 2025 18:12
4 participants