Unexpected pipeline behavior with NDArrayField #271

Open
rmchurch opened this issue Dec 14, 2022 · 0 comments

I don't think this is a bug, but as a new user it surprised me. Perhaps it's already documented, but if not, it probably should be.
I define a dataset like so:

import numpy as np
from ffcv.writer import DatasetWriter
from ffcv.fields import NDArrayField

fshape = (1, 31, 32)
writer = DatasetWriter(write_path, {
    'data': NDArrayField(shape=fshape, dtype=np.dtype('float64')),
    'target': NDArrayField(shape=fshape, dtype=np.dtype('float64')),
    'vol': NDArrayField(shape=(1, fshape[-1]), dtype=np.dtype('float64')),
    'temp': NDArrayField(shape=(1,), dtype=np.dtype('float64')),
}, num_workers=64)

After writing the .beton file, I first tried creating a loader that uses the same pipeline for every field:

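from ffcv.loader import Loader, OrderOption
from ffcv.fields.decoders import NDArrayDecoder
from ffcv.transforms import ToTensor

# One list object, reused for every field below
float_pipeline = [NDArrayDecoder(), ToTensor()]

# Pipeline for each data field
pipelines = {
    'data': float_pipeline,
    'target': float_pipeline,
    'vol': float_pipeline,
    'temp': float_pipeline
}

loader = Loader(ffcv_file, batch_size=64, num_workers=8,
                order=OrderOption.RANDOM, pipelines=pipelines)
data, target, vol, temp = next(iter(loader))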

However, every returned tensor has the shape of the smallest array, in this case temp (i.e. data has shape (Nbatch, 1) when it should be (Nbatch, 1, 31, 32)).

When I create a separate pipeline for each variable that has a different size, the shapes come out correctly:

float_pipeline = [NDArrayDecoder(), ToTensor()]
vol_pipeline = [NDArrayDecoder(), ToTensor()]
T_pipeline = [NDArrayDecoder(), ToTensor()]

# Pipeline for each data field
pipelines = {
    'data': float_pipeline,
    'target': float_pipeline,
    'vol': vol_pipeline,
    'temp': T_pipeline
}      
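
One possible explanation, offered as a guess rather than a confirmed description of FFCV internals: if the operations in a pipeline hold per-field state (such as the shape to decode), then reusing the same list means every field shares the same operation instances, and whichever field is set up last overwrites the others. The toy sketch below reproduces that aliasing effect without FFCV. `ToyDecoder`, `bind`, `bind_all`, and `make_pipeline` are hypothetical names made up for illustration and are not part of the FFCV API.

```python
# Toy stand-in for a stateful decoder. This is NOT ffcv's NDArrayDecoder;
# it only illustrates what happens when one instance is shared across fields.
class ToyDecoder:
    def __init__(self):
        self.shape = None

    def bind(self, shape):
        # Hypothetical hook: remember the shape of the field being decoded.
        self.shape = shape


# Field shapes taken from the dataset definition in this issue.
field_shapes = {
    'data': (1, 31, 32),
    'target': (1, 31, 32),
    'vol': (1, 32),
    'temp': (1,),
}


def bind_all(pipelines):
    """Bind each field's shape to the first operation of its pipeline,
    then report which shape each field's decoder ended up holding."""
    for name, shape in field_shapes.items():
        pipelines[name][0].bind(shape)
    return {name: ops[0].shape for name, ops in pipelines.items()}


# Case 1: one list object reused for every field, so all keys share one decoder.
shared = [ToyDecoder()]
shared_result = bind_all({name: shared for name in field_shapes})


# Case 2: a small factory builds a fresh list (and fresh decoder) per field.
def make_pipeline():
    return [ToyDecoder()]


fresh_result = bind_all({name: make_pipeline() for name in field_shapes})

print(shared_result)
print(fresh_result)
```

In the shared case every field reports the shape bound last ('temp'), which matches the symptom described above. It would also be consistent with 'data' and 'target' working while sharing float_pipeline, since those two fields have identical shapes. If the guess holds, building `[NDArrayDecoder(), ToTensor()]` anew for each key, for example through a small factory function like the one in the sketch, should avoid the problem.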