Blank invalid exception while creating classifier #176

Closed
gaetschwartz opened this issue Feb 2, 2021 · 13 comments

@gaetschwartz

Here is my data:

(src, day, time, dest)
(0, 0, 450, 4)
(1, 0, 110, 5)
(0, 1, 450, 4)
(1, 1, 110, 5)
(0, 2, 450, 4)
(1, 2, 110, 5)
(0, 3, 450, 4)
(1, 3, 110, 5)
(0, 4, 450, 4)
(1, 4, 110, 5)
(0, 5, 450, 4)
(1, 5, 110, 5)
(2, 6, 660, 6)
(3, 6, 1170, 7)
(0, 0, 450, 4)
(1, 0, 110, 5)
(0, 1, 450, 4)
(1, 1, 110, 5)
(0, 2, 450, 4)
(1, 2, 110, 5)
(0, 3, 450, 4)
(1, 3, 110, 5)
(0, 4, 450, 4)
(1, 4, 110, 5)
(0, 5, 450, 4)
(1, 5, 110, 5)
(2, 6, 660, 6)
(3, 6, 1170, 8)

It then throws this exception while trying to create the classifier:

Unhandled exception:
Invalid argument(s)
#0      _TypedList._setFloat32 (dart:typed_data-patch/typed_data_patch.dart:2126:36)
#1      _Float32ArrayView.[]= (dart:typed_data-patch/typed_data_patch.dart:4461:16)
#2      new Float32MatrixDataManager.fromList
package:ml_linalg//data_manager/float32_matrix_data_manager.dart:37

#3      MatrixFactoryImpl.fromList
package:ml_linalg//matrix/matrix_factory_impl.dart:21
#4      new Matrix.fromList
package:ml_linalg/matrix.dart:42
#5      DataFrameImpl.toMatrix
package:ml_dataframe//data_frame/data_frame_impl.dart:143
#6      createLogLikelihoodOptimizer
package:ml_algo//_helpers/create_log_likelihood_optimizer.dart:46

#7      LogisticRegressorFactoryImpl.create
package:ml_algo//logistic_regressor/logistic_regressor_factory_impl.dart:58
#8      new LogisticRegressor
package:ml_algo//logistic_regressor/logistic_regressor.dart:153
#9      main.<anonymous closure>
bin\knn.dart:41
#10     main
bin\knn.dart:53
<asynchronous suspension>

The classifier is constructed this way:

 final createClassifier = (DataFrame samples) => LogisticRegressor(
        samples,
        targetColumnName,
        optimizerType: LinearOptimizerType.gradient,
        iterationsLimit: 90,
        learningRateType: LearningRateType.decreasingAdaptive,
        batchSize: samples.rows.length,
        probabilityThreshold: 0.7,
      );
@gyrdym
Owner

gyrdym commented Feb 15, 2021

@gaetschwartz thank you for creating the issue, I'll take a look at this

@gyrdym gyrdym self-assigned this Feb 15, 2021
@jose-almir

Hello everyone, this happens to me too. Here is my sample data:

Q1_1,Q1_2,Q1_3,Q1_4,Q2_1,Q3_1,Q3_2,Q3_3,Q3_4,Q3_5,Q3_6,Q3_7,Q3_8,Q3_9,Q3_10,Q3_11,Q3_12,Q3_13,Q3_14,Q4_1,Q4_2,Q4_3,Q4_4,Q4_5,Q4_6,Q4_7,Q4_8,Q4_9,Q4_10,Q4_11,Q4_12,Q4_13,MORT
2,4,4,4,3,3,1,2,2,3,5,6,2,3,5,5,1,2,2,2,2,2,3,5,1,1,3,3,4,1,1,1,I
2,1,1,1,3,3,2,4,4,5,5,6,2,3,5,7,1,2,2,2,3,4,4,7,1,1,3,3,2,3,1,1,I
2,3,4,1,3,3,1,2,4,2,3,6,2,3,5,8,1,2,2,2,3,4,4,7,1,1,3,4,1,2,1,1,O
2,1,1,1,3,1,3,4,2,3,1,6,1,3,2,2,1,2,2,5,3,4,4,7,1,1,3,5,4,2,1,1,M
2,3,4,1,3,2,1,2,4,4,3,6,2,3,5,5,1,2,2,3,3,4,4,7,1,1,3,2,3,1,4,1,O
2,1,1,1,3,2,3,2,3,4,1,6,2,3,5,4,1,2,2,3,3,4,4,7,1,1,2,2,3,1,1,1,M
2,2,3,1,3,3,3,4,3,5,5,6,2,3,5,8,1,2,2,2,3,4,4,7,2,1,3,3,2,1,1,1,O
1,3,3,1,1,2,2,1,4,3,4,6,2,2,2,5,1,2,2,3,3,4,4,7,1,1,3,3,4,1,2,1,O
2,1,1,1,3,2,3,4,4,3,5,6,2,2,5,5,1,2,2,3,3,4,4,7,1,1,3,3,4,1,2,1,O
2,1,1,1,3,2,1,2,4,3,3,6,2,2,5,7,1,2,1,3,3,4,3,7,2,1,3,3,3,2,1,1,I
2,2,4,1,3,4,2,1,3,2,1,6,2,3,2,8,1,2,2,2,3,4,4,7,2,1,3,3,4,1,1,1,O
2,1,1,1,4,3,1,1,4,6,4,6,2,2,5,5,2,2,2,3,3,4,4,7,2,1,3,3,4,1,1,1,I
1,1,1,1,1,2,2,4,4,5,5,2,1,3,2,6,1,2,2,2,3,4,4,7,2,1,3,4,2,1,1,1,I
2,3,4,1,3,2,3,2,3,3,4,6,2,3,2,7,1,2,2,3,3,4,4,7,2,1,3,3,3,1,1,1,M
2,1,1,1,3,3,2,2,2,5,5,6,1,3,5,9,1,2,2,4,3,4,4,7,2,1,3,2,2,1,1,1,O
2,1,1,1,3,4,2,2,3,4,3,6,2,2,1,9,1,2,2,2,2,2,3,4,2,1,3,3,4,3,1,1,O
4,2,2,1,3,4,3,2,3,4,4,6,2,2,1,9,1,2,2,2,2,2,3,4,2,1,3,3,4,3,1,1,I
2,3,4,3,3,4,1,2,3,2,2,4,2,3,5,6,1,1,2,3,3,4,4,7,2,1,2,4,2,3,1,2,I
2,3,1,1,3,3,2,1,2,1,4,3,2,3,5,4,1,2,2,2,3,4,4,7,2,1,2,4,2,3,1,2,I
2,3,4,4,3,3,2,3,4,1,1,1,2,2,2,6,1,1,2,3,3,2,2,6,1,1,2,3,3,3,1,2,I
2,2,4,1,3,2,1,2,3,5,1,6,2,3,2,7,1,2,1,3,3,4,3,6,2,1,2,3,4,1,1,1,I
2,1,1,1,3,2,3,4,3,4,1,6,2,3,5,5,1,2,2,4,3,4,4,6,1,1,2,3,3,1,1,1,M
2,2,1,4,1,2,1,1,4,4,5,4,2,3,3,7,1,2,2,2,3,4,3,7,1,1,3,5,4,2,1,1,M
2,1,1,1,1,3,1,2,4,4,5,5,2,3,3,7,1,2,2,2,3,4,3,7,1,1,3,5,5,1,1,1,O
3,3,1,4,3,3,1,2,3,5,4,5,2,3,3,9,1,2,2,2,3,4,3,6,1,1,1,1,3,1,1,1,I
3,1,1,1,3,4,2,2,3,5,4,6,2,2,1,9,1,2,2,2,3,4,4,7,1,1,3,4,5,1,1,1,M
4,1,1,1,3,4,2,2,3,4,2,6,2,3,3,9,1,2,2,2,3,4,4,7,1,1,3,2,3,1,1,1,M
3,2,1,1,3,4,1,1,2,4,4,2,2,2,2,3,1,1,2,2,3,4,3,7,1,1,3,4,4,2,1,1,I
2,1,1,1,3,5,2,1,3,1,5,6,2,3,2,9,1,2,2,2,3,4,3,6,1,1,3,2,2,1,1,1,O
2,3,4,3,3,4,1,2,3,2,2,4,2,3,5,6,1,1,2,3,3,4,4,7,2,1,2,4,4,1,1,2,I
2,3,3,4,3,4,1,2,3,2,2,4,2,3,5,6,1,1,2,3,3,4,4,7,2,1,2,6,5,2,1,2,I
2,3,4,4,3,4,1,2,3,2,2,4,2,3,5,6,1,1,2,3,3,4,4,7,2,1,2,5,3,1,1,2,M
2,3,4,4,3,4,1,2,3,2,2,4,2,3,5,6,1,1,2,3,3,4,4,7,2,1,2,5,5,1,1,2,M
2,3,4,3,3,4,1,2,3,2,2,4,2,3,5,6,1,1,2,3,3,4,4,7,2,1,2,6,4,2,1,2,I
2,2,4,4,3,4,1,2,3,2,2,4,2,3,5,6,1,1,2,3,3,4,4,7,2,1,2,6,4,1,1,2,I
2,3,4,4,3,4,1,2,3,2,2,4,2,3,5,6,1,1,2,3,3,4,4,7,2,1,2,6,5,2,1,2,I
2,3,4,4,3,4,1,2,3,2,2,4,2,3,5,6,1,1,2,3,3,4,4,7,2,1,2,5,3,2,1,2,I
2,3,4,4,3,4,1,2,3,2,2,4,2,3,5,6,1,1,2,3,3,4,4,7,2,1,2,5,3,2,1,2,I
2,3,4,4,3,4,1,2,3,2,2,4,2,3,5,6,1,1,2,3,3,4,4,7,2,1,2,6,4,2,1,2,I
2,3,4,4,3,4,1,2,3,2,2,4,2,3,5,6,1,1,2,3,3,4,4,7,2,1,2,6,3,2,1,2,I
2,3,4,4,3,3,2,1,2,1,4,3,2,3,5,4,1,2,2,2,3,4,4,7,2,1,2,5,2,3,1,2,O
2,3,4,4,3,3,2,1,2,1,4,3,2,3,5,4,1,2,2,2,3,4,4,7,2,1,2,5,2,3,1,2,M
2,3,4,4,3,3,2,1,2,1,4,3,2,3,5,4,1,2,2,2,3,4,4,7,2,1,2,5,2,2,1,2,I
2,3,4,4,3,3,2,1,2,1,4,3,2,3,5,4,1,2,2,2,3,4,4,7,2,1,2,5,2,2,1,2,O
2,3,4,4,3,3,2,1,2,1,4,3,2,3,5,4,1,2,2,2,3,4,4,7,2,1,2,5,2,2,1,2,O
2,4,4,4,3,3,2,1,2,1,1,3,2,3,5,4,1,2,2,2,3,4,4,7,2,1,2,5,4,1,1,2,O
2,4,4,4,3,3,2,1,2,1,1,3,2,3,5,4,1,2,2,2,3,4,4,7,2,1,2,5,4,1,1,2,I
2,4,4,4,3,3,2,1,2,1,1,3,2,3,5,4,1,2,2,2,3,4,4,7,2,1,2,5,4,2,1,2,I
2,4,4,4,3,3,2,1,2,1,1,3,2,3,5,4,1,2,2,2,3,4,4,7,2,1,2,5,4,1,1,2,O
2,4,4,4,3,3,2,1,2,1,1,3,2,3,5,4,1,2,2,2,3,4,4,7,2,1,2,5,4,1,1,2,O
2,1,1,1,3,5,2,1,3,1,5,6,2,3,2,9,1,2,2,2,3,4,3,6,1,1,3,3,5,1,1,1,O
2,2,1,1,3,5,2,1,3,1,5,6,2,3,2,9,1,2,2,2,3,4,3,6,1,1,3,3,4,1,1,1,O
2,1,1,1,3,5,2,1,3,1,5,6,2,3,2,9,1,2,2,2,3,4,3,6,1,1,3,3,3,1,1,1,O
2,1,1,1,3,5,2,1,3,1,5,6,2,3,2,9,1,2,2,2,3,4,3,6,1,1,3,3,4,1,1,1,M
2,1,1,1,3,5,2,1,3,1,5,6,2,3,2,9,1,2,2,2,3,4,3,6,1,1,3,3,5,1,1,1,M
2,2,1,1,3,5,2,1,3,1,5,6,2,3,2,9,1,2,2,2,3,4,3,6,1,1,3,3,5,1,1,1,M
2,2,1,1,3,5,2,1,3,1,5,6,2,3,2,9,1,2,2,2,2,3,3,6,1,1,3,4,5,1,1,1,O
2,2,1,1,3,5,2,1,3,1,5,6,2,3,2,9,1,2,2,2,2,3,3,6,1,1,3,2,2,1,1,1,O
2,2,1,1,3,5,2,1,3,1,5,6,2,3,2,9,1,2,2,2,2,3,3,6,1,1,3,4,5,1,1,1,O
2,2,1,1,3,5,2,1,3,1,5,6,2,3,2,9,1,2,2,2,2,3,3,6,1,1,3,4,5,1,1,1,O
2,1,1,1,3,4,2,2,3,4,3,6,2,2,1,9,1,2,2,2,2,2,3,4,2,1,3,3,4,2,1,1,O
2,1,1,1,3,4,2,2,3,4,3,6,2,2,1,9,1,2,2,2,2,2,3,4,2,1,3,3,4,1,1,1,O
2,1,1,1,3,4,2,2,3,4,3,6,2,2,1,9,1,2,2,2,2,2,3,4,2,1,3,3,3,2,1,1,M
2,1,1,1,3,2,2,2,3,4,3,6,2,2,1,9,1,2,2,2,2,2,3,4,2,1,3,3,4,1,1,1,O
2,1,1,1,3,4,2,2,3,4,3,6,2,2,1,9,1,2,2,2,2,2,3,4,2,1,3,4,4,3,1,1,M
2,1,1,1,3,4,2,2,3,4,3,6,2,2,1,9,1,2,2,2,2,2,3,4,2,1,3,3,3,2,1,1,M
2,1,1,1,3,4,2,2,3,4,3,6,2,2,1,9,1,2,2,2,2,2,3,4,2,1,3,3,3,2,1,1,M
2,1,1,1,3,4,2,2,3,4,3,6,2,2,1,9,1,2,2,2,2,2,3,4,2,1,3,3,3,2,1,1,M
2,1,1,1,3,4,2,2,3,4,3,6,2,2,1,9,1,2,2,2,2,2,3,4,2,1,3,3,3,1,1,1,O
2,1,1,1,3,4,2,2,3,4,3,6,2,2,1,9,1,2,2,2,2,2,3,4,2,1,3,3,3,2,1,1,M
4,2,2,1,3,4,3,2,3,4,4,6,2,2,1,9,1,2,2,2,2,2,3,4,2,1,3,5,2,2,1,1,M
4,2,2,1,3,4,3,2,3,4,4,6,2,2,1,9,1,2,2,2,2,2,3,4,2,1,3,5,3,3,1,1,I
4,2,2,1,3,4,3,2,3,4,4,6,2,2,1,9,1,2,2,2,2,2,3,4,2,1,3,5,3,1,1,1,O
4,2,2,1,3,4,3,2,3,4,4,6,2,2,1,9,1,2,2,2,2,2,3,4,2,1,3,5,3,2,1,1,M
4,2,2,1,3,4,3,2,3,4,4,6,2,2,1,9,1,2,2,2,2,2,3,4,2,1,3,5,3,2,1,1,M
4,2,2,1,3,4,3,2,3,4,4,6,2,2,1,9,1,2,2,2,2,2,3,4,2,1,3,5,4,2,1,1,I
4,2,2,1,3,4,3,2,3,4,4,6,2,2,1,9,1,2,2,2,2,2,3,4,2,1,3,5,4,3,1,1,I
4,2,2,1,3,4,3,2,3,4,4,6,2,2,1,9,1,2,2,2,2,2,3,4,2,1,3,5,4,2,1,1,M
4,2,2,1,3,4,3,2,3,4,4,6,2,2,1,9,1,2,2,2,2,2,3,4,2,1,3,5,4,3,1,1,I
4,2,2,1,3,4,3,2,3,4,4,6,2,2,1,9,1,2,2,2,2,2,3,4,2,1,3,5,4,3,1,1,I

This is my code:

import 'package:ml_algo/ml_algo.dart';
import 'package:ml_dataframe/ml_dataframe.dart';

Future<void> main(List<String> arguments) async {
  final samples = await fromCsv('./bin/questionario.csv', headerExists: true);
  final targetColumnName = 'MORT';
  final splits = splitData(samples, [0.6]);
  final validationData = splits[0];
  final testData = splits[1];
  final validator = CrossValidator.kFold(validationData, numberOfFolds: 5);
  final createClassifier = (DataFrame samples) => LogisticRegressor(
        samples,
        targetColumnName,
        optimizerType: LinearOptimizerType.gradient,
        iterationsLimit: 90,
        learningRateType: LearningRateType.decreasingAdaptive,
        batchSize: samples.rows.length,
        probabilityThreshold: 0.7,
      );
  final scores =
      await validator.evaluate(createClassifier, MetricType.accuracy);
  final accuracy = scores.mean();

  print('accuracy on k fold validation: ${accuracy.toStringAsFixed(2)}');

  final testSplits = splitData(testData, [0.8]);
  final classifier = createClassifier(testSplits[0]);
  final finalScore = classifier.assess(testSplits[1], MetricType.accuracy);

  print(finalScore.toStringAsFixed(2));

  await classifier.saveAsJson('diabetes_classifier.json');
}

This is the error message:

Unhandled exception:
Invalid argument(s)
#0 _TypedList._setFloat32 (dart:typed_data-patch/typed_data_patch.dart:2106:36)
#1 _Float32ArrayView.[]= (dart:typed_data-patch/typed_data_patch.dart:4296:16)
#2 new Float32MatrixDataManager.fromList
package:ml_linalg/…/data_manager/float32_matrix_data_manager.dart:37
#3 MatrixFactoryImpl.fromList
package:ml_linalg/…/matrix/matrix_factory_impl.dart:21
#4 new Matrix.fromList
package:ml_linalg/matrix.dart:42
#5 DataFrameImpl.toMatrix
package:ml_dataframe/…/data_frame/data_frame_impl.dart:151
#6 CrossValidatorImpl.evaluate
package:ml_algo/…/cross_validator/cross_validator_impl.dart:31
#7 main
bin/reg_log.dart:21

#8 _startIsolate.<anonymous closure> (dart:isolate-patch/isolate_patch.dart:299:32)
#9 _RawReceivePortImpl._handleMessage (dart:isolate-patch/isolate_patch.dart:168:12)

@gyrdym
Owner

gyrdym commented Feb 16, 2021

@Jrcodev ok, thank you very much for the feedback, I'll fix that soon

@gyrdym
Owner

gyrdym commented Feb 18, 2021

@gaetschwartz you're trying to use LogisticRegressor for a multiclass classification problem, but LogisticRegressor can only be used for binary classification. Please use SoftmaxRegressor instead.
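A quick way to see whether a dataset is binary or multiclass is to count the distinct values in the target column. The sketch below is plain Dart with no ml_algo calls; `countClasses` is a hypothetical helper, and it assumes that `dest` (the last column of the data posted above) is the target column.

```dart
/// Counts the distinct values found in one column of row-major data.
/// Hypothetical helper for illustration, not part of ml_algo.
int countClasses(List<List<num>> rows, int targetIndex) =>
    rows.map((row) => row[targetIndex]).toSet().length;

void main() {
  // A few of the rows posted above: (src, day, time, dest).
  // Assumption: `dest` (index 3) is the target column.
  final rows = <List<num>>[
    [0, 0, 450, 4],
    [1, 0, 110, 5],
    [2, 6, 660, 6],
    [3, 6, 1170, 7],
    [3, 6, 1170, 8],
  ];

  final classes = countClasses(rows, 3);

  // More than two distinct target values means the problem is multiclass.
  print(classes > 2
      ? 'multiclass ($classes classes)'
      : 'binary ($classes classes)');
}
```

With the rows above this reports five classes, so a binary classifier is not a fit.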

@gyrdym
Owner

gyrdym commented Feb 18, 2021

@Jrcodev you need to preprocess your data first, since your dataset contains strings. To do so, please refer to the https://github.com/gyrdym/ml_preprocessing library. You also have the same problem as @gaetschwartz: since you have more than two classes, please use SoftmaxRegressor to classify your records.
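To find which columns need preprocessing, the data can be scanned for non-numeric cells before it is turned into a matrix. This is a minimal plain-Dart sketch; `nonNumericColumns` is a hypothetical helper (not part of ml_dataframe or ml_preprocessing), and the rows are a three-column excerpt of the CSV posted above.

```dart
/// Returns the names of columns that contain at least one non-numeric cell.
/// Hypothetical helper for illustration only.
List<String> nonNumericColumns(
    List<String> header, List<List<Object?>> rows) {
  final result = <String>[];
  for (var i = 0; i < header.length; i++) {
    if (rows.any((row) => row[i] is! num)) {
      result.add(header[i]);
    }
  }
  return result;
}

void main() {
  // Columns Q1_1, Q1_2 and MORT from the first four rows of the sample data.
  final header = ['Q1_1', 'Q1_2', 'MORT'];
  final rows = <List<Object?>>[
    [2, 4, 'I'],
    [2, 1, 'I'],
    [2, 3, 'O'],
    [2, 1, 'M'],
  ];

  print(nonNumericColumns(header, rows)); // prints [MORT]
}
```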

@jose-almir

Thanks I will try this

@gyrdym
Owner

gyrdym commented Feb 19, 2021

@Jrcodev you're welcome, don't hesitate to ask me if you face any troubles connected to data preprocessing

@jose-almir

@Jrcodev you're welcome, don't hesitate to ask me if you face any troubles connected to data preprocessing

I am totally new to this subject; I received a project to create classifications. In reality my data comes from Firebase, and I am unsure about the JSON format that DataFrame.fromJson accepts. I tried to understand how the DataFrame.fromJson function works, but because of codegen I couldn't.

@gyrdym
Owner

gyrdym commented Feb 21, 2021

@Jrcodev DataFrame.fromJson restores a previously created dataframe. That's my fault, I should've named it more clearly, e.g. DataFrame.restoreFromJson. I suggest you convert your data into a list of rows, and don't forget the header of your dataset: either add the header as the first row of the list or specify the header parameter. I definitely need to add some docs to the https://github.com/gyrdym/ml_dataframe lib.
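The conversion into a list of rows can be sketched in plain Dart. The example assumes the Firebase data arrives as a list of maps keyed by column name, which is a guess about the shape and not something stated in this thread; `toRows` is a hypothetical helper that puts the header in the first row.

```dart
/// Converts a list of records (maps keyed by column name) into a list of
/// rows whose first row is the header. Hypothetical helper for illustration.
List<List<Object?>> toRows(
        List<Map<String, Object?>> records, List<String> header) =>
    [
      header,
      for (final record in records)
        // A key missing from a record becomes null in that cell.
        [for (final name in header) record[name]],
    ];

void main() {
  // Assumed shape of the decoded Firebase data: one map per record.
  final records = <Map<String, Object?>>[
    {'Q1_1': 2, 'MORT': 'I'},
    {'Q1_1': 2, 'MORT': 'O'},
  ];

  final rows = toRows(records, ['Q1_1', 'MORT']);
  print(rows); // prints [[Q1_1, MORT], [2, I], [2, O]]

  // Assumption: a list shaped like `rows` is what the DataFrame constructor
  // of ml_dataframe expects when the header is the first row.
}
```

Passing an explicit column list keeps the column order stable, since map key order may differ from record to record.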

@gaetschwartz
Author

@gyrdym Alright, correct. I was probably tired. Another thing: when creating a SoftmaxRegressor, the second argument is targetNames. Why can't it take only one targetName?

@gyrdym
Owner

gyrdym commented Feb 26, 2021

@gaetschwartz the thing is that you need to encode your target column first, since it may contain raw data (e.g. string labels of classes), but SoftmaxRegressor can only deal with numeric data. Usually, the target class column turns into several columns after encoding (e.g. after one-hot encoding), and the number of columns equals the number of classes. Please refer to https://github.com/gyrdym/ml_preprocessing#one-hot-encoding for more information. Thank you very much for writing this, I'll definitely add some documentation on this to the ml_algo lib since it looks a bit vague.
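To illustrate why one target column becomes several, here is a minimal one-hot encoding sketch in plain Dart. It does not use the ml_preprocessing API; `oneHot` is a hypothetical helper, and the column order it produces (sorted labels) is an assumption that may differ from what the library does.

```dart
/// Maps each distinct label to a one-hot vector with one slot per class.
/// Hypothetical helper for illustration, not the ml_preprocessing encoder.
Map<String, List<int>> oneHot(List<String> labels) {
  // Sort the distinct labels so the column order is deterministic.
  final classes = labels.toSet().toList()..sort();
  return {
    for (final label in classes)
      label: [for (final other in classes) other == label ? 1 : 0],
  };
}

void main() {
  // MORT values from the first four rows of the sample data above.
  final encoding = oneHot(['I', 'I', 'O', 'M']);

  // Three classes produce three target columns.
  print(encoding); // prints {I: [1, 0, 0], M: [0, 1, 0], O: [0, 0, 1]}
}
```

Each slot of the vector corresponds to one encoded target column, which is why a list of target names is needed, not a single name.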

@gyrdym
Owner

gyrdym commented Mar 17, 2021

@gaetschwartz @Jrcodev hi everyone, is there anything I can help you with? Are the problems discussed above still relevant?

@jose-almir

The problem was solved

@gyrdym gyrdym closed this as completed Mar 18, 2021

3 participants