# CIFAR-10 Dataset Handling with Atria

## Setup and Auto-reloading Modules
We enable IPython's `autoreload` extension in mode 2, so that edits to imported modules are picked up before each cell runs, without restarting the kernel.

In [1]:
%load_ext autoreload
%autoreload 2

## Importing Dependencies
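The imports share a cell with the loading code below: we import `AtriaImageDataset` and `FileStorageType` from `atria_datasets` and look up the base path of the `atria` package.

## Loading the CIFAR-10 Dataset
We load CIFAR-10 from the registry with `AtriaImageDataset.load_from_registry`, capping the train, test, and validation splits at 1,000 samples each through `build_kwargs`, and then display the training split as a DataFrame.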

In [6]:
from atria_core.utilities.imports import _get_package_base_path

from atria_datasets import AtriaImageDataset, FileStorageType

package_path = _get_package_base_path("atria")
dataset = AtriaImageDataset.load_from_registry(
    name="cifar10",
    provider="atria_datasets",
    build_kwargs={
        "max_train_samples": 1000,
        "max_test_samples": 1000,
        "max_validation_samples": 1000,
    }
)
dataset.train.dataframe()


[2025-07-11 15:22:32][atria_datasets.core.dataset.atria_dataset][INFO] Loading dataset cifar10 from registry.
[2025-07-11 15:22:32][atria_datasets.core.dataset.atria_dataset][INFO] Caching dataset to storage dir: /mnt/hephaistos/.atria/datasets/cifar10/main
[2025-07-11 15:22:32][atria_datasets.core.dataset.atria_dataset][INFO] Loading dataset split train from cached storage: /mnt/hephaistos/.atria/datasets/cifar10/main/delta/train
[2025-07-11 15:22:32][atria_datasets.core.dataset.atria_dataset][INFO] Loading dataset split test from cached storage: /mnt/hephaistos/.atria/datasets/cifar10/main/delta/test


| | index | sample_id | image_file_path | image_content | image_width | image_height | gt_classification | gt_ser | gt_ocr | gt_qa | gt_vqa | gt_layout |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 3576b4e0-8e49-4cba-9706-25961865ad78 | | `b'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00\x00\...` | 32 | 32 | `{"label": {"value": 6, "name": "frog"}}` | | | | | |
| 1 | 1 | 994c31a7-e664-4772-af41-f69f890dfeb4 | | `b'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00\x00\...` | 32 | 32 | `{"label": {"value": 9, "name": "truck"}}` | | | | | |
| 2 | 2 | a5c7c216-4f9a-4df7-9ae2-d4c13bd2f186 | | `b'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00\x00\...` | 32 | 32 | `{"label": {"value": 9, "name": "truck"}}` | | | | | |
| 3 | 3 | a542f5ca-98cb-4000-b125-a2721cecb714 | | `b'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00\x00\...` | 32 | 32 | `{"label": {"value": 4, "name": "deer"}}` | | | | | |
| 4 | 4 | bdaa5a1e-4e4e-4a7a-b4b8-4f0051178244 | | `b'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00\x00\...` | 32 | 32 | `{"label": {"value": 1, "name": "automobile"}}` | | | | | |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 995 | 995 | dbe47cdd-e126-4b50-b608-015460cdc310 | | `b'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00\x00\...` | 32 | 32 | `{"label": {"value": 3, "name": "cat"}}` | | | | | |
| 996 | 996 | cfeacf75-1c46-4a0b-ac2b-2c8d5740a7ab | | `b'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00\x00\...` | 32 | 32 | `{"label": {"value": 5, "name": "dog"}}` | | | | | |
| 997 | 997 | 8ce04f7c-8195-40e8-b2c7-38d6e8c20f25 | | `b'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00\x00\...` | 32 | 32 | `{"label": {"value": 1, "name": "automobile"}}` | | | | | |
| 998 | 998 | 78a83987-f2d6-4fd0-ac00-53628da5c239 | | `b'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00\x00\...` | 32 | 32 | `{"label": {"value": 3, "name": "cat"}}` | | | | | |
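
The preview above shows that each row carries the encoded PNG in `image_content` and the label record in `gt_classification`. The following is a minimal sketch of inspecting one such row with only the standard library. It assumes that `gt_classification` is available as a JSON string (as it appears in the preview) and that `image_content` holds PNG-encoded bytes; `png_size` and `parse_label` are hypothetical helper names, not part of Atria.

```python
import json
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(data: bytes) -> tuple:
    """Read (width, height) from the IHDR chunk of PNG-encoded bytes."""
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG byte string")
    # The IHDR payload starts at offset 16: width, then height, as big-endian uint32.
    width, height = struct.unpack(">II", data[16:24])
    return width, height


def parse_label(gt_classification: str) -> tuple:
    """Extract (class index, class name) from the JSON label record."""
    label = json.loads(gt_classification)["label"]
    return label["value"], label["name"]


# Stand-in for one DataFrame row (assumption): only the PNG header is needed
# to read the image size, so the pixel data is omitted here.
header = PNG_SIGNATURE + struct.pack(">I", 13) + b"IHDR" + struct.pack(">II", 32, 32)
row = {
    "image_content": header,
    "gt_classification": '{"label": {"value": 6, "name": "frog"}}',
}

print(png_size(row["image_content"]))
print(parse_label(row["gt_classification"]))
```

Assuming `dataset.train.dataframe()` returns a pandas DataFrame, the same helpers could be applied to a real row such as `dataset.train.dataframe().iloc[0]`. Decoding the actual pixels would need an imaging library such as Pillow.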


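## Uploading the Dataset to the Hub
We upload the loaded dataset to the hub under the name `cifar10-2`. In this run the health and credentials requests succeed, but the call then fails with an `AttributeError`, so the upload does not complete.

In [5]: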
dataset.upload_to_hub(name="cifar10-2")

[2025-07-11 15:13:21][atria_datasets.core.dataset.atria_dataset][INFO] Uploading dataset Cifar10 to hub with name cifar10-2 and config main.
HTTP Request: GET http://127.0.0.1:8000/api/v1/health/ "HTTP/1.1 200 OK"
HTTP Request: GET http://127.0.0.1:8000/api/v1/credentials/AKIAJ2NNAE7KY6KJDFVQ "HTTP/1.1 200 OK"


AttributeError: 'AtriaHubClient' object has no attribute '_conf'
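
The traceback above is truncated, so the actual cause inside `AtriaHubClient` is not visible here. Purely as a general illustration, the sketch below reproduces the same class of error with a hypothetical `HubClient` class (this is not Atria's code): an attribute that a method reads but `__init__` never assigns fails at call time. Wrapping the call lets a notebook report the failure and continue.

```python
from typing import Optional


class HubClient:
    """Hypothetical client; stands in for any class with a missing attribute."""

    def __init__(self, base_url: str) -> None:
        self.base_url = base_url
        # Bug being illustrated: self._conf is never assigned here.

    def upload(self, name: str) -> str:
        # Reading the unset attribute raises AttributeError at call time.
        return f"{self._conf['bucket']}/{name}"


def try_upload(client: HubClient, name: str) -> Optional[str]:
    """Attempt the upload; return None instead of raising on AttributeError."""
    try:
        return client.upload(name)
    except AttributeError as exc:
        print(f"upload failed: {exc}")
        return None


client = HubClient("http://127.0.0.1:8000")
result = try_upload(client, "cifar10-2")  # fails: _conf was never set

# Once the attribute exists, the same call succeeds.
client._conf = {"bucket": "example-bucket"}  # illustrative value only
fixed = try_upload(client, "cifar10-2")
```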

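## Loading the Dataset from the Hub
We load the `cifar10-2` dataset back from the hub, requesting the `test2` branch.

In [8]: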
dataset.load_from_hub(
    name="cifar10-2",
    branch="test2",
)

HTTP Request: GET http://127.0.0.1:8000/api/v1/health/ "HTTP/1.1 200 OK"
HTTP Request: GET http://127.0.0.1:8000/api/v1/credentials/AKIAJ2NNAE7KY6KJDFVQ "HTTP/1.1 200 OK"
HTTP Request: GET http://127.0.0.1:8000/api/v1/dataset/find_one/?name=cifar10-2 "HTTP/1.1 200 OK"
[2025-07-11 15:47:20][atria_datasets.core.dataset.atria_dataset][INFO] Loading dataset cifar10-2 from hub with branch test2 into storage directory /mnt/hephaistos/.atria/datasets/cifar10-2/test2.
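
The log lines in this notebook suggest that datasets are cached under one directory per dataset name and config or branch: `.../datasets/cifar10/main` for the registry load and `.../datasets/cifar10-2/test2` for the hub load, with registry splits stored under `delta/<split>`. The sketch below mirrors that layout. It is inferred only from these logs; `storage_dir` and `split_dir` are hypothetical helpers rather than Atria functions, and the real implementation may differ (in particular, the `delta` level is only confirmed for the registry load).

```python
from pathlib import PurePosixPath

# Root directory as it appears in the log output of this notebook.
ATRIA_DATASETS_ROOT = PurePosixPath("/mnt/hephaistos/.atria/datasets")


def storage_dir(name: str, config_or_branch: str = "main") -> PurePosixPath:
    """Directory for one dataset and config/branch (layout inferred from logs)."""
    return ATRIA_DATASETS_ROOT / name / config_or_branch


def split_dir(name: str, split: str, config_or_branch: str = "main") -> PurePosixPath:
    """Directory for one cached split; the 'delta' level is taken from the logs."""
    return storage_dir(name, config_or_branch) / "delta" / split


print(storage_dir("cifar10-2", "test2"))
print(split_dir("cifar10", "train"))
```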
