Cube Creator User Guide

Overview

Cube Creator transforms your CSV data into semantic RDF data cubes that can be published and visualized on platforms like visualize.admin.ch and opendata.swiss. The tool ensures your data follows W3C Data Cube standards and includes proper metadata for discoverability and reuse.

Prerequisites

Before starting with Cube Creator, ensure you have:

Access to the Cube Creator platform
CSV data files prepared according to the requirements
Basic understanding of your data structure and dimensions
Knowledge of the target publication platforms (visualize.admin.ch, opendata.swiss and LINDAS to query your data through SPARQL)

Step-by-Step Workflow

Step 0: Data Preparation

Purpose: Prepare your input data in valid CSV format that meets Cube Creator requirements. You can validate your files with the CSV validation tool.

Basic Requirements

✅ UTF-8 Encoding: All CSV files must use UTF-8 encoding. On Windows, you can use Notepad to detect and change the encoding
✅ No Empty Lines: Remove empty lines, especially at the end of files
✅ Syntactic Validity: Follow RFC4180 standards
- Use commas as delimiters (preferred)
- Quote strings containing commas or quotes
- Double quotes within strings: "Hans ""Johnny"" Müller"
✅ No Umlauts in Filenames: Avoid special characters (ä, ö, ü) in CSV filenames
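
The basic requirements above can be checked before upload with a few lines of Python. This is only a minimal sketch, not the CSV validation tool mentioned in this guide: the function name `check_csv_basics` is hypothetical, and the empty-line check is deliberately naive (a blank line inside a quoted multi-line field would also be reported).

```python
from pathlib import Path


def check_csv_basics(path: str) -> list[str]:
    """Return a list of problems found; an empty list means the basic checks passed."""
    problems = []
    file = Path(path)

    # Filenames should avoid umlauts and other non-ASCII characters.
    if not file.name.isascii():
        problems.append(f"filename contains non-ASCII characters: {file.name}")

    # The file content must decode cleanly as UTF-8.
    try:
        text = file.read_bytes().decode("utf-8")
    except UnicodeDecodeError as err:
        problems.append(f"not valid UTF-8 (byte offset {err.start})")
        return problems

    # Report empty or whitespace-only lines by 1-based line number.
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            problems.append(f"empty line at line {number}")
    return problems
```

Calling `check_csv_basics("data_clean.csv")` returns an empty list when none of these three problems is detected. It does not check RFC4180 quoting.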

Data Formats

Data Type	Format	Example
Date	YYYY-MM-DD	`2001-01-31`
DateTime	YYYY-MM-DDThh:mm:ss	`2001-01-31T17:30:00`
Time	hh:mm:ss	`21:32:52`
Boolean	true/false	`true`, `false`
Decimal	Use . as separator	`123.456`
Integer	No thousands separators	`-2147483648`
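
To find type mismatches before running a transformation, the formats in the table can be expressed as simple checks. The sketch below is an approximation written for this guide, not Cube Creator's own validation logic: the names `PATTERNS` and `is_valid` are hypothetical, and the rules are intentionally strict (for example, only lowercase `true` and `false` pass the boolean check).

```python
import re
from datetime import date, datetime, time


def _parses(parser, value: str) -> bool:
    """Return True if the ISO parser accepts the value (rejects e.g. 2001-02-30)."""
    try:
        parser(value)
        return True
    except ValueError:
        return False


# One (shape, parser) pair per data type from the table above.
# The parser is None where the shape alone is sufficient.
PATTERNS = {
    "date": (r"\d{4}-\d{2}-\d{2}", date.fromisoformat),
    "datetime": (r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}", datetime.fromisoformat),
    "time": (r"\d{2}:\d{2}:\d{2}", time.fromisoformat),
    "boolean": (r"true|false", None),
    "decimal": (r"[+-]?\d+(\.\d+)?", None),
    "integer": (r"[+-]?\d+", None),
}


def is_valid(data_type: str, value: str) -> bool:
    """Check one cell value against the expected format for its data type."""
    pattern, parser = PATTERNS[data_type]
    if re.fullmatch(pattern, value, flags=re.ASCII) is None:
        return False
    return parser is None or _parses(parser, value)
```

For example, `is_valid("date", "2001-01-31")` is true, while `is_valid("date", "31.01.2001")` and `is_valid("decimal", "123,456")` are false.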

Tips for Excel Users

Change system delimiter settings for consistent CSV export
Consider data transposition if you have cross-tabulated data
Use the CSV validation tool to check syntax
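
Transposing cross-tabulated data means turning one column per category (for example, one column per year) into one row per observation. The following sketch shows the idea with the Python standard library. The function name `unpivot` and the column names in the usage example are hypothetical, and the output uses plain `\n` line endings for readability.

```python
import csv
import io


def unpivot(csv_text: str, id_column: str, variable_name: str, value_name: str) -> str:
    """Turn a wide table (one column per category) into a long table (one row per value)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([id_column, variable_name, value_name])
    for row in reader:
        for column in reader.fieldnames:
            # Every column except the identifier becomes its own output row.
            if column != id_column:
                writer.writerow([row[id_column], column, row[column]])
    return out.getvalue()
```

With the input `canton,2020,2021` / `ZH,10,12`, calling `unpivot(text, "canton", "year", "value")` yields the rows `ZH,2020,10` and `ZH,2021,12` under the header `canton,year,value`.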

Step 1: CSV Mapping

Purpose: Upload CSV files and map them to cube dimensions and tables.

Key Concepts

Dimensions: Categories that organize your data (e.g., geography, time, measurements). They correspond to the columns in the CSV file
Cube Table: The main table containing observations and dimensions. It describes the structure of your cube
Shared Dimensions: Concepts or terms that recur across many cubes and can be reused inside your data cubes. Shared dimensions let you link your data to other existing data in LINDAS
Concept Tables: Additional tables for multilingual labels and metadata. A concept table lets you treat the values of a dimension as URLs pointing to new resources (concepts). This is similar to pointing to the URL of a Shared Dimension's term, except that the concepts are created for the cube and uploaded with it

Workflow

Upload CSV Files
- Click "+" to upload syntactically valid CSV files
- Preview columns and first three rows
- Multiple CSV files can be uploaded for complex cubes
Create Tables

Cube Table (Required)
- Select the columns/dimensions to include in the generated RDF cube, then check the "Cube table" checkbox for your main data table
- Contains observations with key and measurement dimensions
- Must distinguish between different dimension types
Concept Tables (Optional)
- Create for multilingual concepts. Select the columns that you would like to include in the concept table. A concept table is created without checking the "Cube table" checkbox
- Link to cube table using "Link to another table". This allows us to link a dimension to a concept (to treat the values of a dimension as a resource)
- Provide translations and additional metadata
Configure Table Settings

Identifier Template
- Creates unique URIs for each row
- Use format: table-name/{column1}/{column2}. Recommendation: use columns whose values contain no special characters
- Auto-complete available with { trigger
- Leave empty for auto-generated identifiers
Display Color
- Visual connection between CSV inputs and mapped tables
- Used only within Cube Creator interface
Map Columns

Target Properties
- For Cube Tables: Dimension properties
- For Concept Tables: schema:name, schema:description, schema:position
- Auto-complete for common ontologies (e.g., "schema"). Cube Creator preloads the following ontologies: rdfs, schema, qb, sdmx, dcterms, cd11, skos, skosxl, xskos, xsd, wgs.
Data Types
- Choose appropriate type for validation
- Critical for measurement dimensions
- Transformation fails on type mismatches
Language Settings
- Specify language for string dimensions
- Use concept tables for multilingual content
- schema:name mandatory in concept tables
Default Values
- Handle missing values with meaningful defaults
- Prevents explicitly missing values in final cube
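
To illustrate the identifier template from the table settings, the sketch below expands a template such as table-name/{column1}/{column2} for each row and reports identifiers that are not unique. It is an illustration only: the function names are hypothetical, and the percent-encoding of values is an assumption made here to show why special characters are best avoided, not a description of how Cube Creator builds URIs internally.

```python
import re
from collections import Counter
from urllib.parse import quote


def expand_template(template: str, row: dict) -> str:
    """Replace each {column} placeholder with the row's value for that column."""
    def substitute(match: re.Match) -> str:
        column = match.group(1)
        # Assumption: values are percent-encoded so they are safe inside a URI.
        return quote(str(row[column]), safe="")

    return re.sub(r"\{([^{}]+)\}", substitute, template)


def duplicate_identifiers(template: str, rows: list[dict]) -> list[str]:
    """Return identifiers produced by more than one row (these would not be unique)."""
    counts = Counter(expand_template(template, row) for row in rows)
    return sorted(identifier for identifier, n in counts.items() if n > 1)
```

Under this assumption, a row with `column1` = `Zürich` and `column2` = `2020` expands to `table-name/Z%C3%BCrich/2020`, which is valid but harder to read than an identifier built from a plain code such as `ZH`.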

Step 2: Transformation

Purpose: Convert mapped CSV data into RDF cube format.

Process

Start Transformation
- Click "Start transformation" button after mapping completion
- Monitor progress with status indicators:
  - Grey: Pending
  - Blue (blinking): Running
  - Red: Failed
  - Green: Successful
Monitor Jobs
- View transformation history
- Access detailed logs for debugging
- Check error messages for failed transformations
Handle Errors
- Invalid Datatype: Data doesn't match selected type
- Check error line numbers in logs
- Verify data format matches dimension settings
Replace CSV (Optional)
- Update cube with new data
- Same separator character required
- Column names must match (case-sensitive)
- All original columns must be present
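
Before replacing a CSV, the column requirements above can be checked by comparing the header rows of the original and the replacement file. This is a minimal sketch with hypothetical function names. It assumes both files use the same delimiter, which is itself one of the requirements.

```python
import csv
import io


def read_header(csv_text: str, delimiter: str = ",") -> list[str]:
    """Return the column names from the first row of a CSV document."""
    return next(csv.reader(io.StringIO(csv_text), delimiter=delimiter))


def missing_columns(original_csv: str, replacement_csv: str, delimiter: str = ",") -> list[str]:
    """List original columns that the replacement lacks (comparison is case-sensitive)."""
    original = read_header(original_csv, delimiter)
    replacement = set(read_header(replacement_csv, delimiter))
    return [column for column in original if column not in replacement]
```

An empty result means every original column is present. A replacement header `canton,Year,Value` checked against an original `Canton,Year,Value` reports `Canton` as missing because of the different capitalization.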

💡 Tip: Use sample CSV files for initial testing to avoid long wait times with large datasets.

Step 3: Cube Designer

Purpose: Add metadata, verify data correctness, and prepare for publication. Every dimension must have a scale of measure.

Metadata Configuration

Dataset/Cube Metadata

Access via 🖊️ icon next to cube title
Complete all required fields
Provide multilingual descriptions

Status Settings

Draft: Shows in applications as draft status
Published: Supersedes draft versions
Choose publication targets (visualize.admin.ch, opendata.swiss)

Dimension Metadata

Click 🖊️ for each dimension column

Essential Fields:

Name: Descriptive name for the dimension
Description: Detailed explanation of dimension content
Dimension Type:
- Measurement: Contains data values/observations
- Key: Used for filtering, for instance in visualize.admin.ch (cannot be deleted by users)
- Optional: Available as optional filter

Scale of Measure

Nominal: Named categories (e.g., cantons, colors)
Ordinal: Ordered categories (e.g., small, medium, large)
Interval: Equal intervals without a true zero point (e.g., temperature in Celsius)
Ratio: Has meaningful zero point (e.g., mass)

Units (for Measurement Dimensions - every measurement dimension must have a unit)

Select from QUDT-based unit list
Use "Number (#)" for counts
Use "Percent (%)" for percentages

Data Kind (Optional)

Geographic coordinates: For lat/long data
Geographic shape: For shape data from Shared Dimensions
Time description: For temporal data types

Linking to Shared Dimensions

Available for nominal/ordinal concepts
Click 🔗 symbol to map to existing concept hierarchies
Enables geographic shapes and standardized concepts
Requires re-transformation after mapping

Data Verification

Completeness Checks

Compare observation count with CSV line count
Page through generated observations
Verify all data was converted correctly
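
For the count comparison, the number to compare against is the number of data records in the CSV, which can differ from the raw line count when the file has a header row or quoted fields that span several lines. A small sketch, with a hypothetical function name:

```python
import csv
import io


def count_data_rows(csv_text: str) -> int:
    """Count CSV records, excluding the header row and any empty lines."""
    # newline="" lets the csv module handle line breaks inside quoted fields.
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    next(reader, None)  # skip the header row
    return sum(1 for row in reader if row)
```

The result is the figure to compare with the observation count shown in the Cube Designer, assuming each CSV record of the cube table becomes one observation.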

Quality Checks

Test links to concept tables
Verify multilingual content (switch language in top right)
Check metadata completeness for all languages
Validate dimension mappings

Step 4: Publication

Purpose: Publish the completed cube to public databases and platforms.

Pre-Publication Checklist

✅ Verify all data and metadata accuracy
✅ Confirm publication status (draft/published)
✅ Set publication targets (visualize.admin.ch, opendata.swiss)
✅ Ensure non-sensitive data only

⚠️ Warning: For visualize.admin.ch republishing, maintain cube structure to avoid breaking existing visualizations.

Publication Process

Start Publication
- Click "Start publication" button
- Monitor job in "Previous publications" list
- Wait for green status indicator
Verify Publication
- Check availability on target platforms
- Test cube functionality in visualize.admin.ch
- Confirm metadata appears on opendata.swiss
- Run the SPARQL query provided by Cube Creator in LINDAS
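
As a complement to the manual checks, the number of published observations can also be queried over SPARQL. The sketch below rests on several assumptions that should be verified before use: that the LINDAS SPARQL endpoint is reachable at https://lindas.admin.ch/query, that the cube follows the cube.link vocabulary (cube:observationSet, cube:observation), and that you substitute the IRI of your own cube. The function names are hypothetical, and this is not the query Cube Creator itself provides.

```python
import json
import urllib.parse
import urllib.request

# Assumption: public LINDAS SPARQL endpoint; confirm the URL for your environment.
ENDPOINT = "https://lindas.admin.ch/query"


def build_count_query(cube_iri: str) -> str:
    """Build a SPARQL query counting the observations of one cube."""
    return (
        "PREFIX cube: <https://cube.link/>\n"
        "SELECT (COUNT(?observation) AS ?count) WHERE {\n"
        f"  <{cube_iri}> cube:observationSet ?set .\n"
        "  ?set cube:observation ?observation .\n"
        "}"
    )


def count_observations(cube_iri: str, endpoint: str = ENDPOINT) -> int:
    """Send the query with an HTTP POST and read the count from the JSON result."""
    body = urllib.parse.urlencode({"query": build_count_query(cube_iri)}).encode("utf-8")
    request = urllib.request.Request(
        endpoint,
        data=body,
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        result = json.load(response)
    return int(result["results"]["bindings"][0]["count"]["value"])
```

The returned number can then be compared with the CSV record count used during data verification.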

Publication Management

Version Control

Each publication creates a complete new version
Previous versions marked as expired automatically
Old versions remain accessible for compatibility

Management Actions

Expired: Automatic flag for superseded versions
Unlist: Makes cube unavailable (reversible by republishing)
Delete: Manual database removal (irreversible)

Best Practices

Data Preparation

Use consistent delimiters across all CSV files
Validate CSV syntax before upload
Keep backup copies of original data
Document data transformations and decisions

Mapping Strategy

Start with sample data for complex cubes
Use meaningful identifier templates
Link to Shared Dimensions when appropriate
Maintain consistent naming conventions

Metadata Quality

Provide descriptions in all relevant languages
Use specific, descriptive dimension names
Choose appropriate scales of measure
Include units for all measurements

Publication Management

Test with draft status before publishing
Maintain cube structure for existing visualizations
Document changes between versions
Monitor platform availability after publication

Troubleshooting

Common Upload Issues

Encoding Problems: Convert files to UTF-8
Syntax Errors: Use CSV validation tool
Empty Lines: Remove all empty rows

Transformation Failures

Type Mismatches: Check data format against selected types
Missing Values: Set appropriate default values
Invalid Characters: Review identifier templates

Publication Issues

Missing Metadata: Complete all required fields
Platform Unavailability: Check platform status and configuration
Visualization Errors: Verify cube structure and metadata completeness

Getting Help

Check transformation logs for detailed error messages
Use the cube-checker tool for visualize.admin.ch cubes
Review the official wiki for detailed documentation
Submit issues to the GitHub repository

This guide covers the essential workflow for using Cube Creator. For advanced features and specific use cases, consult the official documentation.
