Inspect • Visualize • Clean • Generate Insights
Stay inside your editor. Explore datasets faster.
- Overview
- Why DataGuard
- Features
- Design Goals
- Supported Formats
- Smart Activation
- Screenshots
- Demonstration
- Dataset Attribution
- Workflow
- Installation
- Requirements
- Tech Stack
- Privacy
- Performance
- Distribution
- Security
- License
- Team
DataGuard is a Visual Studio Code extension designed to make dataset exploration feel native to the editor experience.
Open a supported dataset to instantly access an interactive workspace for analysis, inspection, visualization, and safe cleaning operations, without switching to notebooks or external tools.
Core processing runs locally.
Tip: AI capabilities can be enabled for deeper dataset insights.
Traditional dataset workflows usually involve unnecessary context switching:
Open Dataset
→ Launch Notebook
→ Inspect
→ Clean
→ Export
→ Return to Editor
DataGuard reduces that workflow into a single environment.
- Local-first execution
- Fast feedback loops
- Minimal context switching
- Safe modification workflow
- Optional AI augmentation
- Automatic dataset profiling
- Dataset overview and metadata
- Row and column inspection
- Column type detection
- Missing value analysis
- Duplicate discovery
- Statistical summaries
- Dataset composition
- Numerical distributions
- Missing value breakdown
- Column exploration
- Interactive charts
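To make the profiling features above concrete, here is a minimal, stdlib-only sketch of missing-value counts, duplicate discovery, and numeric summaries. It is not DataGuard's implementation (the extension's data engine is Pandas); the function name `profile_rows`, the rule that blank strings count as missing, and the sample rows are all assumptions made for illustration.

```python
import csv
import io
import statistics
from collections import Counter


def profile_rows(rows):
    """Summarize a list of dict rows: missing values, duplicates, numeric stats."""
    columns = list(rows[0].keys()) if rows else []
    # Assumption: empty or whitespace-only cells count as missing.
    missing = {c: sum(1 for r in rows if not (r[c] or "").strip()) for c in columns}

    # Count rows that exactly repeat an earlier row.
    seen = Counter(tuple(r[c] for c in columns) for r in rows)
    duplicates = sum(n - 1 for n in seen.values())

    numeric = {}
    for c in columns:
        values = []
        for r in rows:
            try:
                values.append(float(r[c]))
            except (TypeError, ValueError):
                pass
        present = len(rows) - missing[c]
        # Treat a column as numeric only if every non-missing value parses as a float.
        if values and len(values) == present:
            numeric[c] = {
                "min": min(values),
                "max": max(values),
                "mean": statistics.fmean(values),
            }

    return {
        "rows": len(rows),
        "columns": columns,
        "missing": missing,
        "duplicates": duplicates,
        "numeric": numeric,
    }


# Made-up sample data, used only to exercise the sketch.
SAMPLE = "app,rating,installs\nMaps,4.5,1000\nMail,,500\nMaps,4.5,1000\n"
report = profile_rows(list(csv.DictReader(io.StringIO(SAMPLE))))
```

For the sample above, `report["missing"]` flags one empty `rating` cell and `report["duplicates"]` counts the repeated `Maps` row.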
Perform cleaning operations directly inside VS Code:
- Remove duplicates
- Fill missing values
- Convert data types
- Column operations
- Safe save workflow
Note: Changes are applied locally.
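The cleaning operations listed above can be sketched in plain Python as follows. This README does not describe how the safe save workflow is implemented, so the write-to-temporary-file-then-replace approach shown here is an assumption, as are the helper names `remove_duplicates`, `fill_missing`, and `safe_save`.

```python
import csv
import os
import tempfile


def remove_duplicates(rows):
    """Keep the first occurrence of each row, preserving order."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def fill_missing(rows, column, value):
    """Return copies of the rows with empty cells in `column` set to `value`."""
    filled = []
    for row in rows:
        is_empty = not (row.get(column) or "").strip()
        filled.append({**row, column: value} if is_empty else dict(row))
    return filled


def safe_save(rows, fieldnames, path):
    """Write to a temp file in the target directory, then swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", newline="", encoding="utf-8") as handle:
            writer = csv.DictWriter(handle, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(rows)
        # The original file is replaced only after the new one is fully written.
        os.replace(tmp_path, path)
    except BaseException:
        os.remove(tmp_path)
        raise
```

A typical call chain would be `safe_save(fill_missing(remove_duplicates(rows), "rating", "0"), fieldnames, path)`. Writing to a sibling temporary file first means a failed write leaves the original dataset untouched.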
Configure an AI provider to generate:
- Dataset summaries
- Pattern discovery
- Cleaning suggestions
- High-level observations
Compatible with configurable AI providers:
- OpenAI
- Anthropic
- Google Gemini
- Groq
- Cohere
Note: AI features are optional and remain disabled until explicitly configured.
- Fast startup
- Local-first processing
- Minimal workflow interruption
- Safe cleaning operations
- Optional AI assistance
| Format | Supported |
|---|---|
| CSV | ✓ |
| TSV | ✓ |
| JSON | ✓ |
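As a rough illustration of handling the three formats in the table, the sketch below picks a parser from the file extension. The function name `parse_dataset` and the assumption that a JSON dataset is an array of flat objects are hypothetical; the README does not specify which JSON shapes DataGuard accepts.

```python
import csv
import io
import json
from pathlib import Path

# Delimiter per tabular format.
DELIMITERS = {".csv": ",", ".tsv": "\t"}


def parse_dataset(name, text):
    """Parse dataset text into a list of dict rows based on the file extension."""
    suffix = Path(name).suffix.lower()
    if suffix in DELIMITERS:
        reader = csv.DictReader(io.StringIO(text), delimiter=DELIMITERS[suffix])
        return list(reader)
    if suffix == ".json":
        data = json.loads(text)
        # Assumption: a JSON dataset is an array of objects (records).
        if isinstance(data, list) and all(isinstance(item, dict) for item in data):
            return data
        raise ValueError("expected a JSON array of objects")
    raise ValueError(f"unsupported format: {suffix or name}")
```

For example, `parse_dataset("apps.tsv", "app\trating\nMaps\t4.5\n")` yields one row keyed by `app` and `rating`.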
DataGuard activates only for supported dataset files.
To avoid unnecessary interruptions during development workflows, common project and configuration JSON files are intentionally ignored.
Examples:
package.json
package-lock.json
tsconfig.json
jsconfig.json
launch.json
settings.json
extensions.json
tasks.json
manifest.json
devcontainer.json
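The activation rule described above amounts to a filename check. The sketch below is one way to express it, not the extension's actual logic (which would live in its TypeScript layer): `should_activate` is a hypothetical name, the case-insensitive match is an assumption, and the ignore set contains only the examples listed above, which may not be the full list.

```python
from pathlib import PurePath

SUPPORTED_SUFFIXES = {".csv", ".tsv", ".json"}

# Only the example names from this README; the real ignore list may be longer.
IGNORED_JSON_NAMES = {
    "package.json", "package-lock.json", "tsconfig.json", "jsconfig.json",
    "launch.json", "settings.json", "extensions.json", "tasks.json",
    "manifest.json", "devcontainer.json",
}


def should_activate(path):
    """Return True for supported dataset files that are not known config files."""
    name = PurePath(path).name.lower()
    if name in IGNORED_JSON_NAMES:
        return False
    return PurePath(name).suffix in SUPPORTED_SUFFIXES
```

With this rule, `should_activate("data/googleplaystore.csv")` is true, while `should_activate(".vscode/settings.json")` is false.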
Watch the demo on YouTube:
Screenshots, demonstrations, and promotional materials shown in this repository may include examples generated using the googleplaystore.csv dataset.
Dataset source:
- L. Gupta, "Google Play Store Apps," Feb 2019. [Online]. Available: Kaggle
Usage purpose:
- Product demonstration
- Dashboard showcase
- Visualization examples
- Documentation screenshots
Important: DataGuard is not affiliated with, endorsed by, or associated with the dataset maintainers.
Note: The extension itself is dataset-agnostic and supports analysis of user-provided datasets in supported formats.
Open Dataset
↓
Automatic Detection
↓
Local Processing
↓
Interactive Dashboard
↓
Visualize • Inspect • Clean
↓
Save Changes
- Open Visual Studio Code
- Open Extensions tab
- Search for DataGuard
- Click Install
- Open a supported dataset
Repository releases include installable extension packages.
- Open the GitHub Releases page
- Download the available .vsix package
- Open Visual Studio Code
- Extensions → Install from VSIX
- Select downloaded package
| Requirement | Version |
|---|---|
| VS Code | Latest Stable |
| Python | 3.10+ |
DataGuard automatically detects required Python dependencies.
If manual installation is needed:
pip install pandas numpy
If Python path detection does not work:
Open Command Palette → DataGuard: Set Python Path
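If you want to verify an interpreter against the requirements above before pointing DataGuard at it, a small check along these lines may help. It is a sketch, not the extension's own detection logic; `check_environment` and the constants are names chosen for this example.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 10)
REQUIRED_PACKAGES = ("pandas", "numpy")


def check_environment(version_info=sys.version_info, packages=REQUIRED_PACKAGES):
    """Report whether the interpreter and packages meet the stated requirements."""
    # find_spec returns None when a top-level package cannot be located.
    missing = [name for name in packages if importlib.util.find_spec(name) is None]
    return {
        "python_ok": tuple(version_info[:2]) >= MIN_PYTHON,
        "missing_packages": missing,
    }


if __name__ == "__main__":
    print(check_environment())
```

Run it with the interpreter you intend to use; any names listed under `missing_packages` can be installed with the `pip` command above.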
| Layer | Technology |
|---|---|
| Extension Platform | VS Code Extension API |
| Runtime | TypeScript |
| Processing | Python |
| Data Engine | Pandas |
| Visualization | Chart.js |
| Packaging | VSCE |
DataGuard processes datasets locally.
AI features require explicit configuration.
No data leaves your machine unless an AI provider is enabled.
Performance depends on:
- Dataset size
- Available memory
- Python environment
Note: Designed to support analysis across datasets of varying sizes.
DataGuard is available through the Extensions tab in Visual Studio Code, which is the primary installation channel.
Install and update directly inside Visual Studio Code.
This repository serves as a project showcase and distribution companion for the extension.
- README
- Branding assets
- Screenshots
- Demonstration material
- License
- Security policy
- VSIX release packages
Note: This repository does not contain the extension source code. It is intended for distribution assets and release artifacts only.
Important: Additional releases are published only when a new version of the extension is available.
Security practices and responsible reporting guidance are available in: SECURITY.md
Usage rights and restrictions are documented in the LICENSE file.
Note: Commercial use and redistribution are prohibited unless explicitly permitted by the license.
Built and maintained by the DataGuard Team.
Lead Developer
Developer
Developer
Made for faster dataset workflows inside Visual Studio Code.




