Make error message on invalid training column values more clear #102

Open
fedarko opened this issue Jan 8, 2020 · 1 comment
Labels
ui Improving the user interface

Comments

@fedarko
Collaborator

fedarko commented Jan 8, 2020

@vjcantu and I ran into this today: if the values in the training column are labelled e.g. train and test (lowercase), then 0 "Train" samples are identified (see line 189 below), and TensorFlow later fails with an obscure error saying that num_classes should be positive, got 0.

songbird/songbird/util.py

Lines 178 to 197 in 22ec2b5

def split_training(dense_table, metadata, design, training_column=None,
                   num_random_test_examples=10, seed=None):
    if training_column is None:
        np.random.seed(seed)
        idx = np.random.random(design.shape[0])
        i = np.argsort(idx)[num_random_test_examples]
        threshold = idx[i]
        train_idx = ~(idx < threshold)
    else:
        train_idx = metadata.loc[design.index, training_column] == "Train"
    trainX = design.loc[train_idx].values
    testX = design.loc[~train_idx].values
    trainY = dense_table.loc[train_idx].values
    testY = dense_table.loc[~train_idx].values
    return trainX, testX, trainY, testY
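
For illustration, here is a minimal sketch of why the comparison above selects no training samples when the labels are lowercase. The metadata table, the column name "Testing", and the sample IDs are made up for this example; only the `== "Train"` comparison is taken from the snippet.

```python
import pandas as pd

# Hypothetical sample metadata whose training column uses lowercase labels.
metadata = pd.DataFrame(
    {"Testing": ["train", "train", "test", "train"]},
    index=["s1", "s2", "s3", "s4"],
)

# Same kind of comparison as in split_training: an exact, case-sensitive match.
train_idx = metadata["Testing"] == "Train"

# No value equals "Train" exactly, so every sample ends up in the test split.
num_train = int(train_idx.sum())
num_test = int((~train_idx).sum())
print(num_train, num_test)
```

With these labels the training split is empty, which is presumably what leads to the num_classes error further downstream.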

Even if we'd prefer to keep this case-sensitive, it might be good to add a note to the README/FAQs explaining where this error comes from and/or that the training column values are case-sensitive.
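
One possible shape for a clearer error is sketched below. This is not songbird code: `check_training_column` is a hypothetical helper, and it assumes that "Train" and "Test" are the only accepted labels, which the snippet above does not itself enforce (it treats anything other than "Train" as a test sample).

```python
import pandas as pd


def check_training_column(metadata, training_column):
    """Hypothetical helper: validate labels before splitting the samples.

    Assumes the only valid labels are exactly "Train" and "Test".
    Returns a boolean Series that is True for training samples.
    """
    values = metadata[training_column]
    train_idx = values == "Train"
    test_idx = values == "Test"

    # Collect anything that is neither "Train" nor "Test" (e.g. "train").
    unexpected = sorted(set(values[~(train_idx | test_idx)].astype(str)))
    if unexpected:
        raise ValueError(
            f"Column '{training_column}' contains unexpected values "
            f"{unexpected}; expected only 'Train' and 'Test' "
            "(case-sensitive)."
        )
    if not train_idx.any():
        raise ValueError(
            f"Column '{training_column}' contains no 'Train' samples."
        )
    return train_idx
```

Called on a column labelled train/test, this would raise a ValueError naming the offending values instead of letting the empty training split reach TensorFlow. Whether to reject unexpected labels outright or only warn about them is a design choice left open here.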

@mortonjt
Collaborator

mortonjt commented Jan 8, 2020

👍 on the README update.

Note that we already have this explicit in the README and the CLI documentation (see this line). So maybe screenshots or notes on the exact casing are necessary.

fedarko added a commit to fedarko/songbird that referenced this issue Feb 4, 2020
Ideally the error message given would be better, but I guess that's
a future TODO.
fedarko changed the title from "Making training/test column split case-insensitive?" to "Make error message on invalid training column values more clear" on Feb 4, 2020
fedarko added the ui (Improving the user interface) label on Feb 4, 2020