# Use `DataFolder` to load data for models

This tutorial shows how to use the `DataFolder` class to load data for models in three steps: fetch the data, generate the `.npy` files, and load those `.npy` files with NumPy using the paths that `DataFolder` provides.

## Imports

In [1]:
import sys

# Make the parent directory importable so that rna_data can be found
sys.path.append('..')
from rna_data import DataFolder

## Define your local repositories

Each entry is a dataset name that is passed to `DataFolder.from_huggingface` in the next step.

In [2]:
my_local_folders = ['for_testing']

## Load the data from HuggingFace

You could also load the data from a local folder.

In [3]:
data = {folder: DataFolder.from_huggingface(folder) for folder in my_local_folders}

## Generate the .npy files

You don't need to run this cell if you already have the .npy files.

In [11]:
for repo in data:
    data[repo].generate_npy()
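
The "skip if the files already exist" advice can be automated with a small guard. This is a minimal sketch, not part of `rna_data`: `ensure_npy`, its `generate` callback, and `fake_generate` are hypothetical names used only for illustration, and the sketch assumes the generated file lives at a path you know in advance.

```python
import os
import tempfile

import numpy as np


def ensure_npy(path, generate):
    """Call generate(path) only if no file exists at path yet.

    Returns True if the file was generated, False if it was already there.
    """
    if os.path.exists(path):
        return False
    generate(path)
    return True


def fake_generate(path):
    # Stand-in for the real generation step (hypothetical, for the demo only)
    np.save(path, np.zeros(3))


demo_path = os.path.join(tempfile.mkdtemp(), "dms.npy")
first = ensure_npy(demo_path, fake_generate)   # file is missing, so it is generated
second = ensure_npy(demo_path, fake_generate)  # file exists, so generation is skipped
print(first, second)
```

In the tutorial, the `generate` role would be played by `generate_npy()`; whether that method already skips existing files is not stated here, so the guard is only a precaution.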

## Use the .npy files to train a model

In [8]:
import numpy as np

# Pick one of the loaded datasets by name
i_like_this_data = data['for_testing']

# allow_pickle=True is required by np.load when the file stores an object array
dms_data = np.load(i_like_this_data.get_dms_npy(), allow_pickle=True)
print(f"I like this DMS data, its shape is {dms_data.shape}")

structure_data = np.load(i_like_this_data.get_base_pairs_npy(), allow_pickle=True)
print(f"I like this structure data, its shape is {structure_data.shape}")

I like this DMS data, its shape is (2,)
I like this structure data, its shape is (2,)
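
A shape of `(2,)` loaded with `allow_pickle=True` is what NumPy reports for a one-dimensional object array. The sketch below assumes, without confirmation from this tutorial, that each element is a per-sequence array of variable length. It uses made-up values and a hypothetical helper, `pad_batch`, to show how such an array round-trips through `.npy` and how it could be padded into a rectangular batch for a model.

```python
import os
import tempfile

import numpy as np

# Two made-up per-sequence signals of different lengths (illustrative values only)
signals = np.empty(2, dtype=object)
signals[0] = np.array([0.1, 0.5, 0.2])
signals[1] = np.array([0.3, 0.4, 0.9, 0.0, 0.7])

# Object arrays are stored with pickle, so loading needs allow_pickle=True
path = os.path.join(tempfile.mkdtemp(), "ragged.npy")
np.save(path, signals, allow_pickle=True)
loaded = np.load(path, allow_pickle=True)
print(loaded.shape)  # one entry per sequence, not per nucleotide


def pad_batch(arrays, pad_value=np.nan):
    """Stack variable-length 1-D arrays into a (n, max_len) float array."""
    max_len = max(len(a) for a in arrays)
    batch = np.full((len(arrays), max_len), pad_value, dtype=float)
    for i, a in enumerate(arrays):
        batch[i, :len(a)] = a
    return batch


batch = pad_batch(loaded)
print(batch.shape)
```

Padding with `NaN` keeps padded positions distinguishable from real measurements, so a loss function can mask them out. A different sentinel may suit your model better.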
