porting an example from tensorflow #86
Comments
Hi, you can train directly on multi-dimensional numpy data as explained in the documentation: https://ydf.readthedocs.io/en/latest/tutorial/multidimensional_feature The super short version of it is (with random data)
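The pattern described above might be sketched as follows. This is a minimal sketch under stated assumptions, not the snippet from the documentation: the helper names `make_random_dataset` and `train_and_predict`, the feature name "features", and the synthetic label rule are illustrative only, while the `GradientBoostedTreesLearner(...).train(...)` and `predict` calls mirror the ones used elsewhere in this thread.

```python
import numpy as np


def make_random_dataset(num_examples=1500, num_dims=20, seed=0):
    """Builds a dict dataset holding one 2-D feature and a binary label.

    The data is random; the label rule below is a hypothetical one, chosen
    only so that the label actually depends on the features.
    """
    rng = np.random.default_rng(seed)
    features = rng.normal(size=(num_examples, num_dims)).astype(np.float32)
    label = (features[:, 0] > 0).astype(np.int64)
    return {"features": features, "label": label}


def train_and_predict(train_ds, test_features):
    """Trains a Gradient Boosted Trees model and predicts on new rows."""
    # Imported here so the data helper above can be used without ydf.
    import ydf

    model = ydf.GradientBoostedTreesLearner(label="label").train(train_ds)
    # The prediction dataset uses the same feature name as the training one.
    return model.predict({"features": test_features})
```

With ydf installed, calling `train_and_predict(make_random_dataset(), make_random_dataset(num_examples=5, seed=1)["features"])` should return one prediction per test row.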
Hi,
OK, here is the test. Extract the files (train.npy, test.npy) from the attached zip file.

import numpy as np
import ydf
train_data = np.load('train.npy')
train_label = np.random.randint(0, 2, size=(train_data.shape[0]))
print(train_data.shape)
train_ds = {"features": train_data, "label": train_label}
model = ydf.GradientBoostedTreesLearner(label="label").train(train_ds)
test_data = {"features": np.load('test.npy')}
predictions = model.predict(test_data)
print(predictions)

For the same data, TensorFlow's predictions are 99% correct, but ydf's predictions look random to me. Am I missing something?
This notebook shows how to train a model on this dataset and make predictions with a Random Forest and a Gradient Boosted Trees model. The notebook also runs a cross-validation to evaluate the quality of predictions on this small dataset. The model self evaluation (

You mention that "TensorFlow's predictions are 99% correct". Are you sure you are using the same dataset? If so, are you sure you are not evaluating on the training dataset?
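One way to rule out evaluating on the training data is to hold out part of the examples before training. The following is a minimal sketch, not the notebook's code: `split_train_test` and `heldout_accuracy` are hypothetical helper names, the feature name "features" follows the snippet earlier in the thread, and the sketch assumes that `model.evaluate(...)` returns an object exposing an `accuracy` attribute for classification models.

```python
import numpy as np


def split_train_test(features, labels, test_ratio=0.2, seed=0):
    """Shuffles the examples and splits them into disjoint train/test dicts."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(features))
    num_test = int(len(features) * test_ratio)
    test_idx, train_idx = order[:num_test], order[num_test:]
    train_ds = {"features": features[train_idx], "label": labels[train_idx]}
    test_ds = {"features": features[test_idx], "label": labels[test_idx]}
    return train_ds, test_ds


def heldout_accuracy(train_ds, test_ds):
    """Trains on train_ds only and reports accuracy on the unseen test_ds."""
    # Imported here so the split helper above can be used without ydf.
    import ydf

    model = ydf.GradientBoostedTreesLearner(label="label").train(train_ds)
    # Assumption: the evaluation object exposes an `accuracy` attribute.
    return model.evaluate(test_ds).accuracy
```

Note that held-out accuracy is only meaningful when the labels are the real ones: labels drawn with `np.random.randint` are independent of the features, so chance-level predictions would be expected in that case.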
Hello,
I'm trying to port a simple project that detects silence (low noise) in audio files from TensorFlow to ydf.
The input data is a single NumPy array of shape (1500, 20): 1500 samples of Mel-frequency cepstral coefficients (MFCC), with 20 floats in each.
How do I train on this data using ydf?
Later, I would like to generate predictions for a single MFCC array of 20 floats.
Thanks
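For the single-array case, one possible approach is to wrap the 20 floats as a batch containing one example, so that the prediction input has the same layout as the (1500, 20) training array. This is a sketch under assumptions: `as_single_example_batch` and `predict_single` are hypothetical helper names, the feature is assumed to be named "features", and `model` is assumed to be an already trained ydf model with a `predict` method that accepts a dict of NumPy arrays.

```python
import numpy as np


def as_single_example_batch(mfcc_vector):
    """Turns one MFCC vector of shape (20,) into a batch of shape (1, 20)."""
    mfcc_vector = np.asarray(mfcc_vector, dtype=np.float32)
    if mfcc_vector.ndim != 1:
        raise ValueError(
            f"expected a 1-D vector, got shape {mfcc_vector.shape}"
        )
    # Add a leading "example" axis: (num_dims,) -> (1, num_dims).
    return {"features": mfcc_vector[np.newaxis, :]}


def predict_single(model, mfcc_vector):
    """Returns the prediction for a single MFCC vector.

    `model` is assumed to be trained on a feature named "features".
    """
    predictions = model.predict(as_single_example_batch(mfcc_vector))
    # The batch holds exactly one example, so take the first prediction.
    return predictions[0]
```

The leading axis matters because the model is trained on a batch of examples; a bare array of shape (20,) could otherwise be read as 20 separate one-dimensional examples.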