Refactor real data generators #30

Open
kwinkunks opened this issue Sep 16, 2021 · 0 comments
Labels
bug Something isn't working

Comments

@kwinkunks
Member

Not strictly a bug, but its current behaviour is not ideal.

Right now, the data are not shuffled, and hitting Regenerate does nothing. So you always get the same test data points with these datasets.

Current method: slice 2n positive and n negative samples from the beginning of the array for Train, and 2n positive and n negative samples from the end of the array for Test (hence the negative slices in the Test function). It's not a great way to do it, and because both sets are tied to fixed positions in the array, it also means you can't shuffle the data.
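For illustration, here is a rough Python sketch of the slicing scheme described above. It is not the project's actual code: the function names `slice_train` and `slice_test`, and the assumption that positive and negative samples are held in two separate lists, are hypothetical.

```python
def slice_train(pos, neg, n):
    """Train set: first 2n positive and first n negative samples (n >= 1)."""
    return pos[:2 * n] + neg[:n]


def slice_test(pos, neg, n):
    """Test set: last 2n positive and last n negative samples (n >= 1)."""
    return pos[-2 * n:] + neg[-n:]


# Hypothetical stand-in data: ten positive and ten negative samples.
pos = [("pos", i) for i in range(10)]
neg = [("neg", i) for i in range(10)]

train = slice_train(pos, neg, 2)
test = slice_test(pos, neg, 2)

# Nothing here is random, so a second call returns exactly the same samples.
same_again = slice_test(pos, neg, 2) == test
```

Because the two sets are defined purely by position, shuffling the array before one call but not the other could put the same sample in both sets.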

The problem is that there are two separate functions and they don't know about each other. So how does the Test generator know which samples were used for Train? I think the Train generator has to pass back the indices of the Test set, which we can then pass to the Test generator.
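A minimal Python sketch of that idea, under stated assumptions: the names `generate_train` and `generate_test` are hypothetical, the data is a plain list, and the 2n:n class ratio is ignored for brevity. The Train generator shuffles the indices and hands back the ones it did not use, so the Test generator never has to guess.

```python
import random


def generate_train(n_samples, n_train, rng):
    """Shuffle all indices; return (train indices, leftover test indices)."""
    indices = list(range(n_samples))
    rng.shuffle(indices)
    return indices[:n_train], indices[n_train:]


def generate_test(data, test_indices):
    """Build the test set from the indices passed back by the Train generator."""
    return [data[i] for i in test_indices]


# Hypothetical stand-in data.
data = [("sample", i) for i in range(10)]

# A new seed (or an unseeded Random) would give a fresh split on Regenerate.
rng = random.Random(42)
train_idx, test_idx = generate_train(len(data), 6, rng)

train = [data[i] for i in train_idx]
test = generate_test(data, test_idx)
```

With this shape, Regenerate only needs to call the Train generator again and forward the new test indices. The split stays disjoint whatever the shuffle order.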

This combines and closes #15 and #23.

@kwinkunks added the bug label Sep 16, 2021