
About failed to read the hdf5 file #57

Closed
2 tasks done
CHENGHUAN555 opened this issue Aug 18, 2023 · 4 comments
CHENGHUAN555 commented Aug 18, 2023

Is there an existing issue for this?

  • I have searched the existing issues

Bug description

When running the script below, an error was reported at line 30:

continuous_label = cebra.load_data(file="auxiliary_behavior_data.h5", key="auxiliary_variables", columns=["continuous1", "continuous2", "continuous3"])

The error occurs when cebra.load_data() tries to read the .h5 file. I do not know how to solve it, and I hope the author can help.
---------------------------------------------------------------------------------------------------------------------------------------------
test.py
---------------------------------------------------------------------------------------------------------------------------------------------
# Create a .h5 file, containing a pd.DataFrame
import pandas as pd
import numpy as np
X_continuous = np.random.normal(0,1,(100,3))
X_discrete = np.random.randint(0,10,(100, ))
df = pd.DataFrame(np.array(X_continuous), columns=["continuous1", "continuous2", "continuous3"])
df["discrete"] = X_discrete
df.to_hdf("auxiliary_behavior_data.h5", key="auxiliary_variables")


import cebra
from numpy.random import uniform, randint
from sklearn.model_selection import train_test_split

# 1. Define a CEBRA model
cebra_model = cebra.CEBRA(
    model_architecture = "offset10-model",
    batch_size = 512,
    learning_rate = 1e-4,
    max_iterations = 10, # TODO(user): to change to at least 10'000
    max_adapt_iterations = 10, # TODO(user): to change to ~100-500
    time_offsets = 10,
    output_dimension = 8,
    verbose = False
)

# 2. Load example data
neural_data = cebra.load_data(file="neural_data.npz", key="neural")
new_neural_data = cebra.load_data(file="neural_data.npz", key="new_neural")
continuous_label = cebra.load_data(file="auxiliary_behavior_data.h5", key="auxiliary_variables", columns=["continuous1", "continuous2", "continuous3"])
discrete_label = cebra.load_data(file="auxiliary_behavior_data.h5", key="auxiliary_variables", columns=["discrete"]).flatten()

assert neural_data.shape == (100, 3)
assert new_neural_data.shape == (100, 4)
assert discrete_label.shape == (100, )
assert continuous_label.shape == (100, 3)

# 3. Split data and labels
(
    train_data,
    valid_data,
    train_discrete_label,
    valid_discrete_label,
    train_continuous_label,
    valid_continuous_label,
) = train_test_split(neural_data,
                    discrete_label,
                    continuous_label,
                    test_size=0.3)

# 4. Fit the model
# time contrastive learning
cebra_model.fit(train_data)
# discrete behavior contrastive learning
cebra_model.fit(train_data, train_discrete_label,)
# continuous behavior contrastive learning
cebra_model.fit(train_data, train_continuous_label)
# mixed behavior contrastive learning
cebra_model.fit(train_data, train_discrete_label, train_continuous_label)

# 5. Save the model
cebra_model.save('/tmp/foo.pt')

# 6. Load the model and compute an embedding
cebra_model = cebra.CEBRA.load('/tmp/foo.pt')
train_embedding = cebra_model.transform(train_data)
valid_embedding = cebra_model.transform(valid_data)
assert train_embedding.shape == (70, 8)
assert valid_embedding.shape == (30, 8)

# 7. Evaluate the model performances
goodness_of_fit = cebra.sklearn.metrics.infonce_loss(cebra_model,
                                                     valid_data,
                                                     valid_discrete_label,
                                                     valid_continuous_label,
                                                     num_batches=5)

# 8. Adapt the model to a new session
cebra_model.fit(new_neural_data, adapt = True)

# 9. Decode discrete labels behavior from the embedding
decoder = cebra.KNNDecoder()
decoder.fit(train_embedding, train_discrete_label)
prediction = decoder.predict(valid_embedding)
assert prediction.shape == (30,)

Operating System

windows 10

CEBRA version

cebra version 0.2.0

Device type

gpu

Steps To Reproduce

No response

Relevant log output

Traceback (most recent call last):
  File "E:\crop\injuryrun4\test.py", line 30, in <module>
    continuous_label = cebra.load_data(file="auxiliary_behavior_data.h5", key="auxiliary_variables", columns=["continuous1", "continuous2", "continuous3"])
  File "E:\anaconda\envs\injuryrun4test\lib\site-packages\cebra\data\load.py", line 661, in load
    data = loader.load(file, key=key, columns=columns)
  File "E:\anaconda\envs\injuryrun4test\lib\site-packages\cebra\data\load.py", line 211, in load
    raise ModuleNotFoundError()
ModuleNotFoundError

Anything else?

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct

@MMathisLab (Member)

@gonlairo can you take a look?

@MMathisLab (Member)

@CHENGHUAN555 did you install the data dependencies with pip install cebra[datasets]? Otherwise the module is indeed not loaded. I suggest checking out the demos here: https://cebra.ai/docs/demos.html; they use a particular data loader, but you get the idea. See the install instructions here: https://cebra.ai/docs/installation.html#id1
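
This can be sanity-checked before reinstalling anything. A stdlib-only sketch (the specific backend names, h5py and tables, are my assumption about what the datasets extra provides):

```python
# Check which optional HDF5 backends are importable in the current environment.
# If they all come back MISSING, `pip install 'cebra[datasets]'` should pull them in.
import importlib.util

def backend_status(names=("h5py", "tables", "pandas")):
    # Map each candidate package name to whether it can be imported.
    return {name: importlib.util.find_spec(name) is not None for name in names}

for name, available in backend_status().items():
    print(f"{name}: {'OK' if available else 'MISSING'}")
```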

@EricThomson

This solved it for me when I hit this error while working through the code on the Usage page.

A couple of notes (feel free to ignore 😄): I found the ModuleNotFoundError message a bit hard to interpret, since it didn't say which module was missing, so I wasn't sure how to proceed. Also, the installation page says the datasets optional dependency is for working with the datasets on Figshare, so when I got the error on the Usage page while working with synthetic data, I didn't consider the correct solution.

Anyway, minor wrinkles. Congrats on the cool package; I'm having fun with it so far!
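
Regarding the hard-to-interpret message: a guard along these lines (a hypothetical sketch, not cebra's actual loader code) would name both the missing module and the extra that provides it:

```python
# Hypothetical guard for an HDF5 loader: instead of a bare ModuleNotFoundError,
# re-raise with a message naming the missing package and the pip extra that supplies it.
def require_hdf5_backend():
    try:
        import h5py  # assumed backend; shipped with the "datasets" extra
    except ModuleNotFoundError as exc:
        raise ModuleNotFoundError(
            f"Missing optional dependency '{exc.name}' needed to read .h5 files. "
            "Install it with: pip install 'cebra[datasets]'"
        ) from exc
    return h5py
```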

@stes (Member) commented Sep 28, 2023

@EricThomson , thanks for flagging. I created a new issue to track these potential improvements here: #77
