
:kolena-area-of-interest-20: Core Concepts

In this section, we explore Kolena's core concepts, focusing on the key features that facilitate model evaluation and testing. For a quick overview, refer to the Quickstart Guide.

:kolena-dataset-20: Dataset

A dataset is a structured assembly of datapoints, designed for model evaluation. This structure is immutable, meaning once a datapoint is added, it cannot be altered without creating a new version of the dataset. This immutability ensures the integrity and traceability of the data used in testing models.

Datapoints

A datapoint is an immutable set of inputs that you want to test your models on. Datapoints have the following key characteristics:

  • Unified Object Structure: Datapoints are singular, grab-bag objects that can embody various types of data, including images, as indicated by the presence of a data_type field.

  • Immutability: Once a datapoint is added to a dataset, it cannot be altered. Any update to a datapoint creates a new datapoint, which in turn creates a new version of the dataset.

  • Exclusive Association with Datasets: Datapoints are exclusive to the dataset they belong to and are not shared across different datasets. This exclusivity ensures clear demarcation and management of data within specific datasets.

  • Role in Data Ingestion: Datapoints play a central role in the data ingestion process. They are represented in a DataFrame structure with special treatment for certain columns like locator and text.

  • Extension of Data Classes: Datapoints extend data classes, allowing for flexibility and customization. For instance, they can include annotation objects like [BoundingBox][kolena.annotation.BoundingBox], and these objects can be further extended as needed.
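The immutability described above can be illustrated with a plain Python frozen dataclass. This is only an analogy under stated assumptions: the `Datapoint` class, its fields, and the example values are hypothetical and are not part of the Kolena SDK.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Datapoint:
    """Hypothetical stand-in for a datapoint; not a Kolena SDK class."""

    locator: str
    ground_truth: str


# Made-up values, for illustration only.
original = Datapoint(locator="s3://example-bucket/images/0001.png", ground_truth="cat")

# Assigning to a field of a frozen dataclass raises FrozenInstanceError,
# so an "update" has to produce a new object instead of changing the old one.
updated = replace(original, ground_truth="dog")

print(original.ground_truth)  # cat
print(updated.ground_truth)   # dog
print(updated is original)    # False
```

The original object is left untouched, which mirrors how an update to a datapoint yields a new datapoint and a new dataset version rather than an in-place change.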

Consider a single row within the :kolena-widget-16: Classification (CIFAR-10) ↗ dataset with the following columns:

| locator | ground_truth | image_brightness | image_contrast |
| --- | --- | --- | --- |
| s3://kolena-public-examples/cifar10/data/horse0000.png | horse | 153.994 | 84.126 |

This datapoint points to the image horse0000.png, which has a ground_truth classification of horse along with brightness and contrast values.

Datapoint Components

Unique Identifier: each datapoint should have a hashable unique identifier.

You can select one or more fields as the ID during import, either in the Web App under :kolena-dataset-16: Datasets or in the SDK with the upload_dataset function.
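Before choosing ID fields, it can help to confirm that the chosen combination really is unique across rows. The sketch below uses only the standard library; `find_duplicate_ids` and the sample rows are hypothetical and not part of the Kolena SDK, and the exact way ID fields are passed to upload_dataset should be taken from the SDK reference.

```python
from typing import Dict, Hashable, List, Sequence, Tuple

Row = Dict[str, Hashable]


def find_duplicate_ids(rows: Sequence[Row], id_fields: Sequence[str]) -> List[Tuple[Hashable, ...]]:
    """Return ID tuples that occur more than once, in first-seen order."""
    seen = set()
    duplicates: List[Tuple[Hashable, ...]] = []
    for row in rows:
        # The ID of a row is the tuple of its values in the chosen fields.
        key = tuple(row[field] for field in id_fields)
        if key in seen and key not in duplicates:
            duplicates.append(key)
        seen.add(key)
    return duplicates


# Made-up rows: two frames share one locator.
rows = [
    {"locator": "s3://example-bucket/a.png", "frame": 0},
    {"locator": "s3://example-bucket/a.png", "frame": 1},
    {"locator": "s3://example-bucket/b.png", "frame": 0},
]

print(find_duplicate_ids(rows, ["locator"]))           # [('s3://example-bucket/a.png',)]
print(find_duplicate_ids(rows, ["locator", "frame"]))  # []
```

Here locator alone is not a valid ID, while the pair of locator and frame is, which is the kind of case where selecting more than one ID field is useful.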

Metadata: you can add additional information about a datapoint by adding columns to the dataset, with the metadata name as the column header and the value in each row.

Referenced Files: each datapoint can contain a primary reference to a file stored in your cloud storage. Kolena automatically renders the referenced file when the column is named locator. References in columns with any other name appear as text. The table below outlines which file extensions are supported for optimal visualization.

| Data Type | Supported file formats |
| --- | --- |
| Image | jpg, jpeg, png, gif, bmp and other web browser supported image types. |
| Audio | flac, mp3, wav, aac, ogg, ra and other web browser supported audio types. |
| Video | mov, mp4, mpeg, avi and other web browser supported video types. |
| Document | txt and pdf files. |
| Point Cloud | pcd files. |
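As a rough illustration of the table above, the following sketch maps a locator's file extension to a data type. `guess_data_type` and `EXTENSIONS_BY_TYPE` are hypothetical helpers written for this example. They do not describe how Kolena itself detects file types, and they cover only the extensions listed explicitly.

```python
from pathlib import PurePosixPath
from typing import Optional

# Extensions listed explicitly in the table; browsers may support more.
EXTENSIONS_BY_TYPE = {
    "Image": {"jpg", "jpeg", "png", "gif", "bmp"},
    "Audio": {"flac", "mp3", "wav", "aac", "ogg", "ra"},  # AAC audio uses .aac
    "Video": {"mov", "mp4", "mpeg", "avi"},
    "Document": {"txt", "pdf"},
    "Point Cloud": {"pcd"},
}


def guess_data_type(locator: str) -> Optional[str]:
    """Hypothetical helper: return the data type for a locator, or None."""
    extension = PurePosixPath(locator).suffix.lstrip(".").lower()
    for data_type, extensions in EXTENSIONS_BY_TYPE.items():
        if extension in extensions:
            return data_type
    return None


print(guess_data_type("s3://kolena-public-examples/cifar10/data/horse0000.png"))  # Image
print(guess_data_type("s3://example-bucket/scan.PCD"))                            # Point Cloud
print(guess_data_type("s3://example-bucket/archive.zip"))                         # None
```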

Assets: let you connect multiple referenced files to a single datapoint for visualization and analysis.

| Asset Type | Description |
| --- | --- |
| ImageAsset | Useful if your data is modeled as multiple related images. |
| BinaryAsset | Useful if you want to attach any segmentation or bitmap masks. |
| AudioAsset | Useful if you want to attach an audio file. |
| VideoAsset | Useful if you want to attach a video file. |
| PointCloudAsset | Useful for attaching 3D point cloud data. |

Annotations: let you visualize overlays on top of datapoints. Kolena currently supports 10 different types of annotations, each enabling a specific modality.

??? "How to generate datapoints"
    You can structure your dataset as a CSV file. Each row in the file should represent a distinct datapoint.
    For complete information on creating datasets, visit formatting your datasets.
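A minimal sketch of producing such a CSV with Python's standard csv module is shown below, using the CIFAR-10 row from earlier in this section. Writing to an in-memory buffer is a choice made for this example only; in practice you would write to a file.

```python
import csv
import io

FIELDNAMES = ["locator", "ground_truth", "image_brightness", "image_contrast"]

# One dictionary per datapoint; each becomes one CSV row.
datapoints = [
    {
        "locator": "s3://kolena-public-examples/cifar10/data/horse0000.png",
        "ground_truth": "horse",
        "image_brightness": 153.994,
        "image_contrast": 84.126,
    },
]

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, lineterminator="\n")
writer.writeheader()
writer.writerows(datapoints)

csv_text = buffer.getvalue()
print(csv_text)
# locator,ground_truth,image_brightness,image_contrast
# s3://kolena-public-examples/cifar10/data/horse0000.png,horse,153.994,84.126
```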

:kolena-quality-standard-20: Quality Standard

A Quality Standard tracks a standardized process for how a team evaluates a model's performance on a dataset. Users may define and manage quality standards for a dataset in the Kolena web application using the Quality Standards tab.

A Quality Standard is composed of Test Cases and Metrics.

Test Cases

Test cases allow users to evaluate their datasets at various levels of division. They show how models perform on different subsets of the full dataset and help mitigate failures caused by hidden stratifications.

Kolena supports easy test case creation through dividing a dataset along categorical or numeric datapoint properties. For example, if you have a dataset with images of faces of individuals, you may wish to create a set of test cases that divides your dataset by datapoint.race (categorical) or datapoint.age (numeric).
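The idea of dividing a dataset along a categorical or a numeric property can be sketched in plain Python. The functions below are hypothetical illustrations of the concept, not Kolena's implementation. The field names come from the CIFAR-10 example, and the rows and bin edges are made up.

```python
from collections import defaultdict
from typing import Dict, List, Sequence

Datapoint = Dict[str, object]


def split_by_category(datapoints: Sequence[Datapoint], field: str) -> Dict[str, List[Datapoint]]:
    """Group datapoints by the value of a categorical field."""
    groups: Dict[str, List[Datapoint]] = defaultdict(list)
    for dp in datapoints:
        groups[str(dp[field])].append(dp)
    return dict(groups)


def split_by_numeric_bins(
    datapoints: Sequence[Datapoint], field: str, edges: Sequence[float]
) -> Dict[str, List[Datapoint]]:
    """Group datapoints into half-open bins [edges[i], edges[i + 1])."""
    bins = list(zip(edges, edges[1:]))
    groups: Dict[str, List[Datapoint]] = {f"[{lo}, {hi})": [] for lo, hi in bins}
    for dp in datapoints:
        for lo, hi in bins:
            if lo <= dp[field] < hi:
                groups[f"[{lo}, {hi})"].append(dp)
                break  # values outside every bin are left out
    return groups


# Made-up rows for illustration.
rows = [
    {"id": 1, "ground_truth": "horse", "image_brightness": 153.994},
    {"id": 2, "ground_truth": "ship", "image_brightness": 90.0},
    {"id": 3, "ground_truth": "horse", "image_brightness": 40.5},
]

by_class = split_by_category(rows, "ground_truth")
by_brightness = split_by_numeric_bins(rows, "image_brightness", [0, 100, 256])

print({name: [dp["id"] for dp in dps] for name, dps in by_class.items()})
# {'horse': [1, 3], 'ship': [2]}
print({name: [dp["id"] for dp in dps] for name, dps in by_brightness.items()})
# {'[0, 100)': [2, 3], '[100, 256)': [1]}
```

Each resulting group plays the role of one test case: a named subset of the dataset on which model results can be aggregated separately.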

The quickstart guide provides a more hands-on example of defining test cases.

Metrics

Metrics describe the criteria used to evaluate the performance of a model and compare it with other models over a given dataset and its test cases.

Kolena supports defining metrics by applying standard aggregations over datapoint-level results or by leveraging common machine learning aggregations, such as Precision or F1 Score. Once a metric is defined, users may also specify how it is highlighted by indicating whether higher or lower values are better.
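To make the two kinds of aggregation concrete, here is a small sketch that computes a standard aggregation (a mean) and the usual precision, recall, and F1 formulas from datapoint-level results. The result fields (`tp`, `fp`, `fn`, `latency_ms`) and their values are assumptions made for this example and are not a schema required by Kolena.

```python
from statistics import mean
from typing import Dict


def precision_recall_f1(tp: int, fp: int, fn: int) -> Dict[str, float]:
    """Compute precision, recall and F1 from counts, returning 0.0 when undefined."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Made-up datapoint-level results for one model.
results = [
    {"tp": 2, "fp": 1, "fn": 0, "latency_ms": 10.0},
    {"tp": 1, "fp": 0, "fn": 1, "latency_ms": 14.0},
]

# Standard aggregation: mean of a numeric result field.
mean_latency = mean(r["latency_ms"] for r in results)

# Machine learning aggregation: sum the counts, then apply the formulas.
totals = {key: sum(r[key] for r in results) for key in ("tp", "fp", "fn")}
metrics = precision_recall_f1(**totals)

print(mean_latency)  # 12.0
print(metrics)       # {'precision': 0.75, 'recall': 0.75, 'f1': 0.75}
```

In this example latency would be a "lower is better" metric, while precision, recall, and F1 would be "higher is better".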

The datasets quickstart provides a more hands-on example of defining metrics. For more details on out-of-the-box and custom metrics, visit Task Metrics.

Model Comparison

Once you've defined your test cases and metrics, you can view and compare model results in the Quality Standards tab, which provides a quick and standardized high level overview of which models perform best over your different test cases.
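The comparison this view performs can be sketched as picking, for each test case, the model with the best value of a metric while respecting the metric's direction. The function name, model names, and scores below are all hypothetical.

```python
from typing import Dict


def best_model_per_test_case(
    scores: Dict[str, Dict[str, float]], higher_is_better: bool = True
) -> Dict[str, str]:
    """Return the best-scoring model name for each test case."""
    pick = max if higher_is_better else min
    return {
        test_case: pick(model_scores, key=model_scores.get)
        for test_case, model_scores in scores.items()
    }


# Made-up metric values: test case -> model -> score.
scores = {
    "horse": {"model_a": 0.91, "model_b": 0.88},
    "ship": {"model_a": 0.79, "model_b": 0.85},
}

print(best_model_per_test_case(scores))
# {'horse': 'model_a', 'ship': 'model_b'}
print(best_model_per_test_case(scores, higher_is_better=False))
# {'horse': 'model_b', 'ship': 'model_a'}
```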

For step-by-step instructions, take a look at the quickstart for model comparison.

Debugging

The Debugger tab of a dataset allows users to experiment with test cases and metrics without saving them to the team-level quality standards. Users can search for meaningful test cases and try different metrics, then save the updated values to their quality standards once they are comfortable, without the risk of accidentally replacing what the team has previously defined. The Debugger also provides a view for visualizing results and relations in plots.

For step-by-step instructions, take a look at the quickstart for results exploration.