Upgrade dataset analysis and visualization with extended metadata and interactive UI #377

Merged
cristian-tamblay merged 13 commits into develop from feat/dataset-info on Nov 11, 2025

@Irozuku (Collaborator) commented Nov 11, 2025

This pull request upgrades the dataset analysis and visualization features in DashAI. The backend now computes and stores extended metadata for each dataset, including detailed statistical summaries and quality indicators. The frontend is refactored to present this information in an interactive, tabbed interface where users can explore the overview, numeric and categorical analysis, data quality, and correlations of their datasets. The frontend also adds new dependencies and UI improvements to support these visualizations.

Backend: Extended Metadata Computation and Storage

  • The nan_per_column method is replaced by compute_metadata, which calculates comprehensive metadata for each dataset (NaN counts, column types, numeric/categorical/text stats, quality indicators, and correlations) and stores it in self.splits for frontend visualization.
  • Metadata saving and retrieval logic is updated to ensure the new extended metadata is correctly persisted and loaded for frontend use.
  • The job pipeline now calls compute_metadata instead of the old NaN calculation method, ensuring all new metadata is available after dataset processing.
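
As a rough illustration of the kind of metadata described above, here is a minimal sketch using pandas. It is not the code from this PR: the standalone function compute_metadata_sketch, the dictionary keys, and the choice of statistics are all assumptions made for the example, and the real compute_metadata is a method that stores its result in self.splits.

```python
import pandas as pd


def compute_metadata_sketch(df: pd.DataFrame) -> dict:
    """Hypothetical stand-in for compute_metadata; all key names are assumptions."""
    numeric = df.select_dtypes(include="number")
    categorical = df.select_dtypes(exclude="number")

    return {
        # NaN count per column (what nan_per_column used to provide)
        "nan": {col: int(n) for col, n in df.isna().sum().items()},
        # Column dtypes as plain strings so the result is JSON-friendly
        "column_types": {col: str(dtype) for col, dtype in df.dtypes.items()},
        # Basic descriptive statistics for numeric columns
        "numeric_stats": {
            col: {
                "mean": float(numeric[col].mean()),
                "std": float(numeric[col].std()),
                "min": float(numeric[col].min()),
                "max": float(numeric[col].max()),
            }
            for col in numeric.columns
        },
        # Cardinality and most frequent value for non-numeric columns
        "categorical_stats": {
            col: {
                "n_unique": int(categorical[col].nunique()),
                "top": (
                    categorical[col].mode().iloc[0]
                    if not categorical[col].dropna().empty
                    else None
                ),
            }
            for col in categorical.columns
        },
        # Simple quality indicators (assumed, not the PR's actual set)
        "quality": {
            "duplicate_rows": int(df.duplicated().sum()),
            "missing_ratio": float(df.isna().mean().mean()) if df.size else 0.0,
        },
        # Pairwise Pearson correlations between numeric columns
        "correlations": numeric.corr().to_dict() if numeric.shape[1] > 1 else {},
    }
```

Returning plain Python types (int, float, str) keeps the result easy to serialize when the metadata is saved and later loaded by the frontend.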

Frontend: Visualization Refactor and UI Enhancements

  • The dataset visualization component is refactored to use a tabbed interface (Tabs, Tab), with dedicated tabs for Overview, Numerical Analysis, Categorical, Data Quality, and Correlations, each powered by the new backend metadata.
  • The quick stats and quality score alert are improved, providing users with a clear summary of dataset health and key indicators at a glance.
  • New UI components and dependencies are added (e.g., recharts for charts, new MUI icons and controls) to support richer data presentation and interactivity.
Video_251111121229.mp4
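
The quality score alert mentioned above could be driven by a single number derived from the quality indicators. The sketch below is purely illustrative: the formula, the thresholds, and the function names quality_score and alert_severity are assumptions, not the PR's implementation. The returned strings match the severity values accepted by MUI's Alert component.

```python
def quality_score(n_rows: int, n_cols: int, missing_cells: int, duplicate_rows: int) -> float:
    """Hypothetical 0-100 score that penalizes missing cells and duplicate rows."""
    total_cells = n_rows * n_cols
    if total_cells == 0:
        # An empty dataset gets the lowest score rather than dividing by zero
        return 0.0
    missing_ratio = missing_cells / total_cells
    duplicate_ratio = duplicate_rows / n_rows
    return round(100.0 * (1.0 - missing_ratio) * (1.0 - duplicate_ratio), 1)


def alert_severity(score: float) -> str:
    """Map a score to an alert severity; the thresholds are assumed."""
    if score >= 90:
        return "success"
    if score >= 70:
        return "warning"
    return "error"
```

Computing the score on the backend and sending only the number and its inputs would let the frontend choose the alert color without re-deriving any statistics.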

@cristian-tamblay cristian-tamblay merged commit 99375f8 into develop Nov 11, 2025
18 checks passed
@cristian-tamblay cristian-tamblay deleted the feat/dataset-info branch November 11, 2025 17:48