In [1]:
import random
import numpy as np
import tensorflow as tf
import pandas as pd
from src.basenet import BaseNetDatabase
from IPython.display import display

# BaseNetDatabase

## Advanced use tutorial

In this Jupyter Notebook we will learn further uses and specifications of the ``BaseNetDatabase`` class.

### Contents

1. About ``BaseNetDatabase``.
2. Construction.
    1. Build from raw data.
    2. Build from ``numpy.array`` class.
    3. Build from ``tensorflow.data.Dataset`` class.
    4. Build from ``pandas.DataFrame`` class.
3. Save and load databases.
4. Split databases.
5. Merge databases.
6. Specify train, validation and test datasets explicitly.

## 1. About BaseNetDatabase.

The BaseNetDatabase is a Python class that contains the information the API needs to work with a database. The datasets are stored as ``np.ndarray`` objects.

    The BaseNetDatabase class converts a set of inputs (x) and solutions (y) into the API wrapper database.
    The BaseNetDatabase will create a new set of attributes from the database randomly:

        *   xtrain: A subset of (x) with the train inputs of the network.
        *   ytrain: A subset of (y) with the train solutions of the network.
        *   xval: A subset of (x) with the training validation inputs of the network.
        *   yval: A subset of (y) with the training validation solutions of the network.
        *   xtest: A subset of (x) with excluded inputs of the network; for future testing.
        *   ytest: A subset of (y) with excluded solutions of the network; for future testing.

        *   dtype: Data type of the input data (x) and output data (y) in a tuple of strings (x_dtype, y_dtype).
        *   name: Name of the database.
        *   distribution: Train, validation and test distribution of the input database.
        *   batch_size: Current batch size of the database.
        *   size: The size of the database (train, validation, test).

    The BaseNetDatabase can be loaded and saved with its own methods.
    
## 2. Construction.

To build a BaseNetDatabase we usually give it two parameters: 'x' and 'y'. 'x' contains the input values of the model and 'y' contains the solutions.

        BaseNetDatabase(x, y=None, 
                        distribution: dict = None, 
                        name='unnamed_database', 
                        batch_size: int = None,
                        rescale: float = 1.0, 
                        dtype: tuple[str, str] = ('float', 'float'), 
                        bits: tuple[int, int] = (32, 32))
        
        This class builds a BaseNetDatabase, compatible with the NetBase API.
        :x: Inputs of the dataset.
        :y: Solutions of the dataset.
        :distribution: The distribution of the datasets, default: {'train': 70, 'val': 20, 'test': 10}
        :name: The database name.
        :batch_size: Custom batch size for training.
        :rescale: Rescale factor, all the values in x are divided by this factor, in case rescale is needed.
        :dtype: Data type of the dataset. ('input', 'output') (x, y)
        :bits: Bits used for the data type. ('input', 'output') (x, y)

In [2]:
def __init__(self, 
             x, y=None, distribution: dict = None, name='unnamed_database', batch_size: int = None,
             rescale: float = 1.0, dtype: tuple[str, str] = ('float', 'float'), bits: tuple[int, int] = (32, 32)):
    [...]

#### 2.1. Build from raw data.

You can build your database from raw data. We will create a small random dataset, where 'x' will be random data and 'y' random solutions.

In [3]:
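def create_random_dataset(x_dim, y_dim):
    # 'x': y_dim instances with x_dim random features each, in [0, 1).
    x_values = [[random.random() for _ in range(x_dim)] for _ in range(y_dim)]
    # 'y': one random integer solution in [0, 10] per instance.
    y_values = [random.randint(0, 10) for _ in range(y_dim)]
    return x_values, y_values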

x, y = create_random_dataset(3, 10)
x, y

([[0.9502457351725069, 0.8436442369903798, 0.23317060471725104],
  [0.5792361744164586, 0.25472658344652943, 0.14503465503541246],
  [0.7727576933534599, 0.9214589688527682, 0.7578435485725693],
  [0.8843969141157557, 0.9177107612724367, 0.10917853819455048],
  [0.6303568906101952, 0.7995476070335449, 0.1094439008630681],
  [0.6147708775678057, 0.8298626546772362, 0.851563057821567],
  [0.7335646932915939, 0.8592629871967927, 0.8827126794747593],
  [0.47398027733282877, 0.4067264792280043, 0.2600070460463293],
  [0.9914459488212445, 0.287847889984154, 0.9885820498408971],
  [0.6330518709648455, 0.3404476138595036, 0.012274734441484525]],
 [6, 5, 4, 8, 1, 7, 8, 4, 3, 2])

We can define the training, validation and test distribution (in percentages) and a name for the database:

In [4]:
distribution = {'train': 60, 'val': 20, 'test': 20}
name = 'random_database'
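
To get a feeling for what these percentages mean, the following minimal sketch turns a distribution into instance counts. The helper ``split_sizes`` is hypothetical and not part of the API, and the way it handles rounding (any remainder goes to the test subset) is an assumption, not necessarily what ``BaseNetDatabase`` does internally.

```python
def split_sizes(n_instances, distribution):
    """Turn percentage shares into instance counts (hypothetical helper)."""
    n_train = n_instances * distribution['train'] // 100
    n_val = n_instances * distribution['val'] // 100
    # Assumption: any rounding remainder is assigned to the test subset.
    n_test = n_instances - n_train - n_val
    return n_train, n_val, n_test


# 10 instances with a 60/20/20 distribution.
print(split_sizes(10, {'train': 60, 'val': 20, 'test': 20}))  # (6, 2, 2)
```

With our 10 instances this gives 6 training, 2 validation and 2 test instances.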

The data types are also a useful parameter. We will define 'x' as float32 and 'y' as int8.

In [5]:
dtype = ('float', 'int')
bits = (32, 8)
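
As a rough illustration of how the two tuples pair up, the sketch below joins each type name with its number of bits. The helper ``full_dtype_names`` is hypothetical; we only assume that the names are combined by simple concatenation, which is consistent with the types printed later in this tutorial.

```python
def full_dtype_names(dtype, bits):
    """Pair each type name with its bit width (hypothetical helper)."""
    return tuple(f'{kind}{width}' for kind, width in zip(dtype, bits))


# First element describes 'x', second element describes 'y'.
print(full_dtype_names(('float', 'int'), (32, 8)))  # ('float32', 'int8')
```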

The rescale parameter is a number that divides all the data in 'x'. We will set it to 2, so our data will be in the range [0, 0.5] instead of [0, 1] with the default rescale.

This is useful, for example, for images, where the data is usually in the range [0, 255], so a rescale of 255 is needed.

In [6]:
rescale = 2
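
The effect of the rescale factor can be sketched in plain Python. The helper ``rescale_rows`` is hypothetical and only mimics the division described above; it says nothing about how the API implements it.

```python
def rescale_rows(rows, rescale):
    """Divide every value of every row by the rescale factor (hypothetical helper)."""
    return [[value / rescale for value in row] for row in rows]


# First row of the random 'x' generated above, divided by 2.
first_row = [0.9502457351725069, 0.8436442369903798, 0.23317060471725104]
scaled = rescale_rows([first_row], 2)
print(scaled)

# Image-like values in [0, 255] mapped to [0, 1] with a rescale of 255.
pixels = rescale_rows([[0, 128, 255]], 255)
print(pixels)
```

The first call yields values close to 0.475, 0.422 and 0.117, which is one of the rows that shows up when the database is printed later in this section.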

The batch size is a parameter that will be used by our model. A batch is a group of instances that is evaluated as a whole during the training process of Deep Learning models. Refer to Machine Learning theory to understand more about this feature.

In [7]:
batch_size = 2

Now we build our database with our specifications:

In [8]:
my_db = BaseNetDatabase(x, y, 
                        distribution=distribution, 
                        name=name, 
                        batch_size=batch_size,
                        rescale=rescale, 
                        dtype=dtype, 
                        bits=bits)
my_db

BaseNetDatabase with 10 instances.

The default representation of the database shows the number of instances. Now let's see the datasets:

In [9]:
for dataset in ('train', 'val', 'test'):
    print('')
    for element in ('x', 'y'):
        fulldataset = f'{element}{dataset}'
        print(f'my_db.{fulldataset}:\n{getattr(my_db, f"{fulldataset}")}')


my_db.xtrain:
[[0.49572298 0.14392394 0.49429104]
 [0.31517845 0.3997738  0.05472195]
 [0.23699014 0.20336324 0.13000353]
 [0.36678234 0.4296315  0.44135633]
 [0.28961807 0.1273633  0.07251733]
 [0.47512287 0.42182213 0.1165853 ]]
my_db.ytrain:
[[0 0 0 1 0 0 0 0 0]
 [0 1 0 0 0 0 0 0 0]
 [0 0 0 0 1 0 0 0 0]
 [0 0 0 0 0 0 0 0 1]
 [0 0 0 0 0 1 0 0 0]
 [0 0 0 0 0 0 1 0 0]]

my_db.xval:
[[0.44219846 0.4588554  0.05458927]
 [0.38637885 0.46072948 0.37892178]]
my_db.yval:
[[0 0 0 0 0 0 0 0 1]
 [0 0 0 0 1 0 0 0 0]]

my_db.xtest:
[[0.30738544 0.41493133 0.42578152]
 [0.31652594 0.1702238  0.00613737]]
my_db.ytest:
[[0 0 0 0 0 0 0 1 0]
 [0 0 1 0 0 0 0 0 0]]


As you can see, the 'y' values are automatically binarized. This is because Deep Learning models do not work with integer labels directly, but with probabilities: each output of the model represents the probability that the input belongs to the corresponding class. In this case, the largest value in our random 'y' is 8, which is why there are 9 elements (classes 0 to 8) in each row of the 'y' datasets.

We can also see that the values of the original data are halved due to the rescale factor of 2.

Note that this feature cannot predict the total range of your values: it takes the maximum found in the data to binarize, so feel free to binarize your 'y' values beforehand (which is slightly recommended).
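
The binarization described above can be sketched as a one-hot encoding. The helper ``binarize`` and its optional ``n_classes`` argument are hypothetical and only illustrate the idea; the width rule (maximum value plus one) is inferred from the printed output, not taken from the API source.

```python
def binarize(labels, n_classes=None):
    """One-hot encode integer labels (hypothetical helper)."""
    # Without an explicit class count, the width follows the largest label seen.
    width = max(labels) + 1 if n_classes is None else n_classes
    return [[1 if index == label else 0 for index in range(width)] for label in labels]


example_y = [6, 5, 4, 8, 1, 7, 8, 4, 3, 2]

inferred = binarize(example_y)
print(len(inferred[0]))  # 9
print(inferred[0])       # [0, 0, 0, 0, 0, 0, 1, 0, 0]

# Binarizing beforehand with a fixed width keeps room for labels 9 and 10.
fixed = binarize(example_y, n_classes=11)
print(len(fixed[0]))     # 11
```

This is why binarizing beforehand can be safer: a sample that happens to miss the largest classes would otherwise produce narrower rows.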

We also check that the 'y' values are of type ``int8``:

In [10]:
print('Type of y:', my_db.ytest.dtype, '\nType of x:', my_db.xtest.dtype)

Type of y: int8 
Type of x: float32


The batch size is automatically assigned if it is not provided. The batch size that will be assigned depends on the number of instances provided for training and follows the function:

     batch_size = 2^round(log_2(len(xtrain) / 256))
     
Which is similar to:

        batch_size = round(len(xtrain) / 256)  # This is not the function.
    
But the latter does not provide a number that can be expressed as a power of 2.

This is a good way to estimate your batch size, but the best value can be different for every problem, so it is recommended to define it beforehand.
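
As a sketch, the formula can be written as follows. The helper ``auto_batch_size`` is hypothetical. For very small datasets the raw formula yields a value below 1, so we assume here that the result is clamped to a minimum of 1; this clamp is our assumption and may differ from what the API actually does.

```python
import math


def auto_batch_size(n_train):
    """Estimate a power-of-two batch size from the training set size (hypothetical helper)."""
    exponent = round(math.log2(n_train / 256))
    # Assumption: never return less than one instance per batch.
    return max(1, 2 ** exponent)


for n_train in (6, 1024, 100000):
    print(n_train, auto_batch_size(n_train), round(n_train / 256))
```

For 100000 training instances, the power-of-two estimate is 512, whereas the plain division gives 391.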

If we used auto-batch:

In [11]:
my_db_autobatch = BaseNetDatabase(x, y, 
                                  distribution=distribution, 
                                  name=name,
                                  rescale=rescale, 
                                  dtype=dtype, 
                                  bits=bits)
print('\nBatch size:\t', my_db.batch_size, '\nAutobatch:\t', my_db_autobatch.batch_size)


Batch size:	 2 
Autobatch:	 1


#### 2.2. Build from ``numpy.array`` class.

You can build your database from different sources. We will rebuild our dataset as a numpy array and feed it to the constructor, as usual.

In [12]:
x_np = np.array(x)
y_np = np.array(y)

Now we call the constructor as we did before:

In [13]:
def build_and_print_db(x_v, y_v):
    my_db = BaseNetDatabase(x_v, y_v, 
                            distribution=distribution, 
                            name=name, 
                            batch_size=batch_size,
                            rescale=rescale, 
                            dtype=dtype, 
                            bits=bits)
    print_db(my_db)
    
def print_db(my_db):
    for dataset in ('train', 'val', 'test'):
        print('')
        for element in ('x', 'y'):
            fulldataset = f'{element}{dataset}'
            print(f'my_db.{fulldataset}:\n{getattr(my_db, f"{fulldataset}")}')
    print('\nType of y:', my_db.ytest.dtype, '\nType of x:', my_db.xtest.dtype)
            
build_and_print_db(x_np, y_np)


my_db.xtrain:
[[0.49572298 0.14392394 0.49429104]
 [0.31517845 0.3997738  0.05472195]
 [0.44219846 0.4588554  0.05458927]
 [0.36678234 0.4296315  0.44135633]
 [0.38637885 0.46072948 0.37892178]
 [0.28961807 0.1273633  0.07251733]]
my_db.ytrain:
[[0 0 0 1 0 0 0 0 0]
 [0 1 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 1]
 [0 0 0 0 0 0 0 0 1]
 [0 0 0 0 1 0 0 0 0]
 [0 0 0 0 0 1 0 0 0]]

my_db.xval:
[[0.23699014 0.20336324 0.13000353]
 [0.31652594 0.1702238  0.00613737]]
my_db.yval:
[[0 0 0 0 1 0 0 0 0]
 [0 0 1 0 0 0 0 0 0]]

my_db.xtest:
[[0.47512287 0.42182213 0.1165853 ]
 [0.30738544 0.41493133 0.42578152]]
my_db.ytest:
[[0 0 0 0 0 0 1 0 0]
 [0 0 0 0 0 0 0 1 0]]

Type of y: int8 
Type of x: float32


As we can see, it is the same result, but randomly shuffled.

#### 2.3. Build from ``tensorflow.data.Dataset`` class.

If you have used TensorFlow before, you will have noticed that it has its own way of storing datasets. The API automatically creates a BaseNetDatabase from a given ``tensorflow.data.Dataset`` passed as 'x', as long as you set 'y' to a tuple with the keys that identify 'x' and 'y' (default value).

In [14]:
x_is = 'x_values'
y_is = 'y_values'

tfds = tf.data.Dataset.from_tensor_slices({x_is: x, y_is: y})
tfds

<TensorSliceDataset element_spec={'x_values': TensorSpec(shape=(3,), dtype=tf.float32, name=None), 'y_values': TensorSpec(shape=(), dtype=tf.int32, name=None)}>

And build the BaseNetDatabase with the arguments:

        my_db = BaseNetDatabase(tfds, ('x_values', 'y_values'), 
                               distribution=distribution, 
                               name=name, 
                               batch_size=batch_size,
                               rescale=rescale, 
                               dtype=dtype, 
                               bits=bits)

In [15]:
my_db = BaseNetDatabase(tfds, ('x_values', 'y_values'), 
                       distribution=distribution, 
                       name=name, 
                       batch_size=batch_size,
                       rescale=rescale, 
                       dtype=dtype, 
                       bits=bits)
print_db(my_db)


my_db.xtrain:
[[0.28961807 0.1273633  0.07251733]
 [0.47512287 0.42182213 0.1165853 ]
 [0.44219846 0.4588554  0.05458927]
 [0.31517845 0.3997738  0.05472195]
 [0.30738544 0.41493133 0.42578152]
 [0.38637885 0.46072948 0.37892178]]
my_db.ytrain:
[[0 0 0 0 0 1 0 0 0]
 [0 0 0 0 0 0 1 0 0]
 [0 0 0 0 0 0 0 0 1]
 [0 1 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 1 0]
 [0 0 0 0 1 0 0 0 0]]

my_db.xval:
[[0.23699014 0.20336324 0.13000353]
 [0.36678234 0.4296315  0.44135633]]
my_db.yval:
[[0 0 0 0 1 0 0 0 0]
 [0 0 0 0 0 0 0 0 1]]

my_db.xtest:
[[0.49572298 0.14392394 0.49429104]
 [0.31652594 0.1702238  0.00613737]]
my_db.ytest:
[[0 0 0 1 0 0 0 0 0]
 [0 0 1 0 0 0 0 0 0]]

Type of y: int8 
Type of x: float32


We obtained the same database as we did from raw data!

#### 2.4. Build from ``pandas.DataFrame`` class.

If you have used Pandas before, you will have noticed that it has its own way of storing datasets (in ``pandas.DataFrame`` objects). The API automatically creates a BaseNetDatabase from a given ``pandas.DataFrame`` as long as you specify, as a string, the name of the column that identifies 'y'. The remaining columns of the DataFrame will be 'x'. Let's create the DataFrame from a numpy array.

In [16]:
column_names = ['a', 'b', 'c', 'y_column']
x_np = np.array(x)
y_np = np.array([y], dtype='int8').T
np_data = np.concatenate([x_np, y_np], axis=1)
df = pd.DataFrame(data=np_data, columns=column_names)
df.head(None)

|   | a | b | c | y_column |
|---|---|---|---|---|
| 0 | 0.950246 | 0.843644 | 0.233171 | 6.0 |
| 1 | 0.579236 | 0.254727 | 0.145035 | 5.0 |
| 2 | 0.772758 | 0.921459 | 0.757844 | 4.0 |
| 3 | 0.884397 | 0.917711 | 0.109179 | 8.0 |
| 4 | 0.630357 | 0.799548 | 0.109444 | 1.0 |
| 5 | 0.614771 | 0.829863 | 0.851563 | 7.0 |
| 6 | 0.733565 | 0.859263 | 0.882713 | 8.0 |
| 7 | 0.47398 | 0.406726 | 0.260007 | 4.0 |
| 8 | 0.991446 | 0.287848 | 0.988582 | 3.0 |
| 9 | 0.633052 | 0.340448 | 0.012275 | 2.0 |


Now, it's time to build and print the database with:

    my_db = BaseNetDatabase(df, 'y_column', 
                            distribution=distribution, 
                            name=name, 
                            batch_size=batch_size,
                            rescale=rescale, 
                            dtype=dtype, 
                            bits=bits)
                            
Note: If you do not specify a 'y' value or set it to None, the 'y' value will be the last column of the DataFrame by default.

In [17]:
my_db = BaseNetDatabase(df, 'y_column', 
                        distribution=distribution, 
                        name=name, 
                        batch_size=batch_size,
                        rescale=rescale, 
                        dtype=dtype, 
                        bits=bits)
print_db(my_db)


my_db.xtrain:
[[0.49572298 0.14392394 0.49429104]
 [0.36678234 0.4296315  0.44135633]
 [0.47512287 0.42182213 0.1165853 ]
 [0.38637885 0.46072948 0.37892178]
 [0.28961807 0.1273633  0.07251733]
 [0.31652594 0.1702238  0.00613737]]
my_db.ytrain:
[[0 0 0 1 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 1]
 [0 0 0 0 0 0 1 0 0]
 [0 0 0 0 1 0 0 0 0]
 [0 0 0 0 0 1 0 0 0]
 [0 0 1 0 0 0 0 0 0]]

my_db.xval:
[[0.23699014 0.20336324 0.13000353]
 [0.30738544 0.41493133 0.42578152]]
my_db.yval:
[[0 0 0 0 1 0 0 0 0]
 [0 0 0 0 0 0 0 1 0]]

my_db.xtest:
[[0.44219846 0.4588554  0.05458927]
 [0.31517845 0.3997738  0.05472195]]
my_db.ytest:
[[0 0 0 0 0 0 0 0 1]
 [0 1 0 0 0 0 0 0 0]]

Type of y: int8 
Type of x: float32


## 3. Save and load databases.

The way to save and load databases is using the ``.save()`` and ``.load()`` methods.

In [18]:
my_db.save('./my_test_db.db')

True

In [19]:
my_loaded_db = BaseNetDatabase.load('./my_test_db.db')
print_db(my_loaded_db)


my_db.xtrain:
[[0.49572298 0.14392394 0.49429104]
 [0.36678234 0.4296315  0.44135633]
 [0.47512287 0.42182213 0.1165853 ]
 [0.38637885 0.46072948 0.37892178]
 [0.28961807 0.1273633  0.07251733]
 [0.31652594 0.1702238  0.00613737]]
my_db.ytrain:
[[0 0 0 1 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 1]
 [0 0 0 0 0 0 1 0 0]
 [0 0 0 0 1 0 0 0 0]
 [0 0 0 0 0 1 0 0 0]
 [0 0 1 0 0 0 0 0 0]]

my_db.xval:
[[0.23699014 0.20336324 0.13000353]
 [0.30738544 0.41493133 0.42578152]]
my_db.yval:
[[0 0 0 0 1 0 0 0 0]
 [0 0 0 0 0 0 0 1 0]]

my_db.xtest:
[[0.44219846 0.4588554  0.05458927]
 [0.31517845 0.3997738  0.05472195]]
my_db.ytest:
[[0 0 0 0 0 0 0 0 1]
 [0 1 0 0 0 0 0 0 0]]

Type of y: int8 
Type of x: float32


## 4. Split databases.

You can split a BaseNetDatabase by calling the ``split()`` method or by dividing it by an integer.
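
Judging from the outputs in this section, each subset (train, validation and test) appears to be cut into contiguous parts. The sketch below reproduces that behaviour on a plain list. The helper ``split_subset`` is hypothetical, and dropping rows that do not fill a complete part is our assumption.

```python
def split_subset(rows, n_parts):
    """Cut a list into n_parts contiguous chunks of equal size (hypothetical helper)."""
    chunk = len(rows) // n_parts
    # Assumption: leftover rows that do not fill a whole chunk are dropped.
    return [rows[i * chunk:(i + 1) * chunk] for i in range(n_parts)]


train_ids = [0, 1, 2, 3, 4, 5]
halves = split_subset(train_ids, 2)
print(halves)  # [[0, 1, 2], [3, 4, 5]]
```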

In [20]:
splitted_db = my_db / 2
print('First database: =======================================\n')
print_db(splitted_db[0])
print('\n\nSecond database: =======================================\n')
print_db(splitted_db[1])



my_db.xtrain:
[[0.49572298 0.14392394 0.49429104]
 [0.36678234 0.4296315  0.44135633]
 [0.47512287 0.42182213 0.1165853 ]]
my_db.ytrain:
[[0 0 0 1 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 1]
 [0 0 0 0 0 0 1 0 0]]

my_db.xval:
[[0.23699014 0.20336324 0.13000353]]
my_db.yval:
[[0 0 0 0 1 0 0 0 0]]

my_db.xtest:
[[0.44219846 0.4588554  0.05458927]]
my_db.ytest:
[[0 0 0 0 0 0 0 0 1]]

Type of y: int8 
Type of x: float32




my_db.xtrain:
[[0.38637885 0.46072948 0.37892178]
 [0.28961807 0.1273633  0.07251733]
 [0.31652594 0.1702238  0.00613737]]
my_db.ytrain:
[[0 0 0 0 1 0 0 0 0]
 [0 0 0 0 0 1 0 0 0]
 [0 0 1 0 0 0 0 0 0]]

my_db.xval:
[[0.30738544 0.41493133 0.42578152]]
my_db.yval:
[[0 0 0 0 0 0 0 1 0]]

my_db.xtest:
[[0.31517845 0.3997738  0.05472195]]
my_db.ytest:
[[0 1 0 0 0 0 0 0 0]]

Type of y: int8 
Type of x: float32


In [21]:
splitted_db = my_db.split(2)
print('First database: =======================================\n')
print_db(splitted_db[0])
print('\n\nSecond database: =======================================\n')
print_db(splitted_db[1])



my_db.xtrain:
[[0.49572298 0.14392394 0.49429104]
 [0.36678234 0.4296315  0.44135633]
 [0.47512287 0.42182213 0.1165853 ]]
my_db.ytrain:
[[0 0 0 1 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 1]
 [0 0 0 0 0 0 1 0 0]]

my_db.xval:
[[0.23699014 0.20336324 0.13000353]]
my_db.yval:
[[0 0 0 0 1 0 0 0 0]]

my_db.xtest:
[[0.44219846 0.4588554  0.05458927]]
my_db.ytest:
[[0 0 0 0 0 0 0 0 1]]

Type of y: int8 
Type of x: float32




my_db.xtrain:
[[0.38637885 0.46072948 0.37892178]
 [0.28961807 0.1273633  0.07251733]
 [0.31652594 0.1702238  0.00613737]]
my_db.ytrain:
[[0 0 0 0 1 0 0 0 0]
 [0 0 0 0 0 1 0 0 0]
 [0 0 1 0 0 0 0 0 0]]

my_db.xval:
[[0.30738544 0.41493133 0.42578152]]
my_db.yval:
[[0 0 0 0 0 0 0 1 0]]

my_db.xtest:
[[0.31517845 0.3997738  0.05472195]]
my_db.ytest:
[[0 1 0 0 0 0 0 0 0]]

Type of y: int8 
Type of x: float32


## 5. Merge databases.

The opposite of the ``split()`` method is the ``merge()`` method. You can merge two BaseNetDatabases into one just by calling ``merge()`` or using the ``+`` operator.
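
Conceptually, merging can be pictured as concatenating the matching subsets of both databases. The sketch below uses plain dictionaries as stand-ins for databases; ``merge_subsets`` is a hypothetical helper and not the API's implementation.

```python
def merge_subsets(first, second):
    """Concatenate matching subsets of two database-like dicts (hypothetical helper)."""
    return {key: first[key] + second[key] for key in first}


# Stand-ins for two halves of a database, using row identifiers instead of data.
half_a = {'xtrain': [0, 1, 2], 'xval': [6], 'xtest': [8]}
half_b = {'xtrain': [3, 4, 5], 'xval': [7], 'xtest': [9]}
merged = merge_subsets(half_a, half_b)
print(merged)
```

The rows of the first operand come first in every subset, which matches the order seen in the outputs of this section.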

In [22]:
reconstructed_db = splitted_db[0] + splitted_db[1]
print_db(reconstructed_db)


my_db.xtrain:
[[0.49572298 0.14392394 0.49429104]
 [0.36678234 0.4296315  0.44135633]
 [0.47512287 0.42182213 0.1165853 ]
 [0.38637885 0.46072948 0.37892178]
 [0.28961807 0.1273633  0.07251733]
 [0.31652594 0.1702238  0.00613737]]
my_db.ytrain:
[[0 0 0 1 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 1]
 [0 0 0 0 0 0 1 0 0]
 [0 0 0 0 1 0 0 0 0]
 [0 0 0 0 0 1 0 0 0]
 [0 0 1 0 0 0 0 0 0]]

my_db.xval:
[[0.23699014 0.20336324 0.13000353]
 [0.30738544 0.41493133 0.42578152]]
my_db.yval:
[[0 0 0 0 1 0 0 0 0]
 [0 0 0 0 0 0 0 1 0]]

my_db.xtest:
[[0.44219846 0.4588554  0.05458927]
 [0.31517845 0.3997738  0.05472195]]
my_db.ytest:
[[0 0 0 0 0 0 0 0 1]
 [0 1 0 0 0 0 0 0 0]]

Type of y: int8 
Type of x: float32


In [23]:
reconstructed_db = splitted_db[0].merge(splitted_db[1])
print_db(reconstructed_db)


my_db.xtrain:
[[0.49572298 0.14392394 0.49429104]
 [0.36678234 0.4296315  0.44135633]
 [0.47512287 0.42182213 0.1165853 ]
 [0.38637885 0.46072948 0.37892178]
 [0.28961807 0.1273633  0.07251733]
 [0.31652594 0.1702238  0.00613737]]
my_db.ytrain:
[[0 0 0 1 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 1]
 [0 0 0 0 0 0 1 0 0]
 [0 0 0 0 1 0 0 0 0]
 [0 0 0 0 0 1 0 0 0]
 [0 0 1 0 0 0 0 0 0]]

my_db.xval:
[[0.23699014 0.20336324 0.13000353]
 [0.30738544 0.41493133 0.42578152]]
my_db.yval:
[[0 0 0 0 1 0 0 0 0]
 [0 0 0 0 0 0 0 1 0]]

my_db.xtest:
[[0.44219846 0.4588554  0.05458927]
 [0.31517845 0.3997738  0.05472195]]
my_db.ytest:
[[0 0 0 0 0 0 0 0 1]
 [0 1 0 0 0 0 0 0 0]]

Type of y: int8 
Type of x: float32


## 6. Specify train, validation and test datasets explicitly.

If you do not feel comfortable with BaseNetDatabase randomly shuffling your data into train, validation and test datasets from a single block of data, you can pass the datasets explicitly to the ``from_datasets()`` method.

    my_manual_db = BaseNetDatabase.from_datasets(train=(train_x, train_y), val=(val_x, val_y), test=(test_x, test_y), 
                                                 batch_size=batch_size,
                                                 name=name, dtype=dtype,
                                                 bits=bits, rescale=rescale)
                                                 
Note: When you provide the datasets manually, the 'y' values are not binarized, as we assume that you have already pre-processed your datasets correctly. In the example below, the 'x' values are still divided by the ``rescale`` factor passed to the method.

In [26]:
# Definition:
@staticmethod
def from_datasets(train: tuple, val: tuple, test: tuple, batch_size: int = None,
                  name: str = 'unnamed_database', dtype: tuple[str, str] = ('float', 'float'),
                  bits: tuple[int, int] = (32, 32), rescale: float = 1.):
    [...]

In [25]:
train_x = x[0:5]
train_y = y[0:5]
val_x = x[5:7]
val_y = y[5:7]
test_x = x[7:]
test_y = y[7:]
my_manual_db = BaseNetDatabase.from_datasets(train=(train_x, train_y), val=(val_x, val_y), test=(test_x, test_y), 
                                             batch_size=batch_size,
                                             name=name, dtype=dtype,
                                             bits=bits, rescale=rescale)
print_db(my_manual_db)


my_db.xtrain:
[[0.47512287 0.42182213 0.1165853 ]
 [0.28961807 0.1273633  0.07251733]
 [0.38637885 0.46072948 0.37892178]
 [0.44219846 0.4588554  0.05458927]
 [0.31517845 0.3997738  0.05472195]]
my_db.ytrain:
[6 5 4 8 1]

my_db.xval:
[[0.30738544 0.41493133 0.42578152]
 [0.36678234 0.4296315  0.44135633]]
my_db.yval:
[7 8]

my_db.xtest:
[[0.23699014 0.20336324 0.13000353]
 [0.49572298 0.14392394 0.49429104]
 [0.31652594 0.1702238  0.00613737]]
my_db.ytest:
[4 3 2]

Type of y: int8 
Type of x: float32


As you can see, there is no binarization of the 'y' values.

You have reached the end of the advanced BaseNetDatabase tutorial. Congratulations!