proposal: remove prtDataSetCellArray #52

Closed
patrickkwang opened this issue Jan 29, 2017 · 3 comments

Comments

@patrickkwang
Collaborator

I believe that prtDataSetClassReshape solves the same problem better.

Three things use prtDataSetCellArray: prtDataGenCifar1, prtDataSetTimeSeries, and prtDataGenMsrcorid. The first two appear to be replaceable with prtDataSetClassReshape with no issues. For prtDataGenMsrcorid, the observations are actually different shapes... It's an older dataset anyway and probably superseded by something like ImageNet, but I don't know that there's a good reason to allow prtDataSets with observations of arbitrary sizes.

Are there other use cases for this of which I'm not aware?

@peterTorrione
Collaborator

I think you're right that most of the CIFAR/image ones can be done with prtDataSetClassReshape as long as the image chips are the same size. BUT prtDataSetCellArray is specifically for the case where observations are different sizes. For example, prtDataSetTimeSeries can handle time series of arbitrary length, and that's important, I think?

I propose switching anything that CAN use prtDataSetClassReshape to use it, but leaving prtDataSetCellArray, since its specific use case is images of varying sizes, graphs, time series of unknown lengths, etc.

@patrickkwang
Collaborator Author

@peterTorrione pointed out that the PRT machinery is useful for data sets with observations of different sizes, provided that you have some sort of shape/size-invariant feature extractor (as a prtAction). I can't find any such feature extractor currently included with PRT, but nevertheless I concede the point.

When observations have a consistent shape, as with prtDataGenCifar1, prtDataSetClassReshape should be used instead of prtDataSetCellArray. It stores data more efficiently and provides some additional useful methods.
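
To illustrate the storage argument, here is a minimal plain-MATLAB sketch that uses no PRT classes at all. The chip size, the observation count, and every variable name are assumptions made up for illustration, and this is not a description of how either PRT class stores data internally.

```matlab
% Hypothetical data: 100 image chips, all 32x32x3 (sizes are assumptions).
nObs = 100;
chipSize = [32 32 3];

% Cell-array storage: one cell per observation.
chipsAsCell = cell(nObs, 1);
for i = 1:nObs
    chipsAsCell{i} = rand(chipSize);
end

% Matrix storage: one flattened row per observation.
chipsAsMatrix = zeros(nObs, prod(chipSize));
for i = 1:nObs
    chipsAsMatrix(i, :) = chipsAsCell{i}(:)';
end

% A row can be reshaped back to the original chip on demand.
chip5 = reshape(chipsAsMatrix(5, :), chipSize);
assert(isequal(chip5, chipsAsCell{5}));

% Compare memory footprints of the two layouts.
infoCell = whos('chipsAsCell');
infoMatrix = whos('chipsAsMatrix');
fprintf('cell: %d bytes, matrix: %d bytes\n', infoCell.bytes, infoMatrix.bytes);

% When observations differ in size, the rows no longer line up,
% so a cell array (or something like it) is still required.
variableLength = {rand(1, 10), rand(1, 25), rand(1, 7)};
lengths = cellfun(@numel, variableLength);
```

The matrix layout only works because every chip flattens to the same number of elements; the last two lines show the variable-length case where that assumption fails.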

@peterTorrione
Collaborator

Just as an example - I think you can actually do this with

dsTimeSeries = ... % time series data as a prtDataSetCellArray
classifier = prtClassMap('rvs',prtRvHmm); % MAP classifier with an HMM as the class-conditional model
yOut = classifier.kfolds(dsTimeSeries);

But we're in agreement.
