PredictGender test investigate #972
Comments
Also adding a note on reworking the example to better align with best practices.
It predicts only one class because it needs tuning. Keeping the issue open for reworking the example to better align with best practices.
The maxNameLength = 88; setting in the code is the root of the issue. It affects the number of units in the network, and most of these receive an input value of 0 because of padding. In the data I used, the average name length is 10 characters. If we cap the input name length at 88, 50, 25, and 10, the confusion matrix changes in each case: with 88, the network predicts only one category because most of the input values are 0. The best predictions came from a maxNameLength of 10, which was the average in the data I used. To cap the input name length, I set maxNameLength and touched two lines of code in the method nameToBinary() in class GenderRecordReader. Here are the two lines (one added, one changed):
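The exact two lines are not preserved in this thread, so the following is only a rough sketch of the idea and not the actual GenderRecordReader code. The class name NameCapSketch, the 5-bit per-character encoding, and the helper paddingFraction() are all assumptions made up for illustration. It shows how capping a name before encoding shrinks the share of input units that are pure zero padding.

```java
import java.util.Locale;

// Hypothetical sketch; NOT the real GenderRecordReader from the example.
class NameCapSketch {
    // Assumption: letters 'a'..'z' map to 1..26, which fits in 5 bits.
    static final int BITS_PER_CHAR = 5;

    // Truncate the name so it never exceeds maxLen characters.
    static String capName(String name, int maxLen) {
        return name.length() > maxLen ? name.substring(0, maxLen) : name;
    }

    // Encode a capped name as a fixed-width bit vector; unused slots stay 0 (padding).
    static int[] nameToBinary(String name, int maxLen) {
        String capped = capName(name.toLowerCase(Locale.ROOT), maxLen);
        int[] bits = new int[maxLen * BITS_PER_CHAR];
        for (int i = 0; i < capped.length(); i++) {
            char c = capped.charAt(i);
            // Non-letters are encoded as 0, the same as padding (an assumption).
            int code = (c >= 'a' && c <= 'z') ? (c - 'a' + 1) : 0;
            for (int b = 0; b < BITS_PER_CHAR; b++) {
                // Most significant bit first.
                bits[i * BITS_PER_CHAR + b] = (code >> (BITS_PER_CHAR - 1 - b)) & 1;
            }
        }
        return bits;
    }

    // Fraction of character slots that are padding for a given name and cap.
    static double paddingFraction(String name, int maxLen) {
        int used = Math.min(name.length(), maxLen);
        return 1.0 - (double) used / maxLen;
    }

    public static void main(String[] args) {
        String name = "Alexandria"; // 10 characters, the average length reported above
        for (int maxLen : new int[] {88, 50, 25, 10}) {
            System.out.printf("maxNameLength=%d inputs=%d padding=%.2f%n",
                    maxLen, nameToBinary(name, maxLen).length, paddingFraction(name, maxLen));
        }
    }
}
```

With a cap of 88, a 10-character name leaves most of the input vector at 0, which matches the behaviour described above; with a cap of 10 the same name fills every slot.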
Looks like predictions are very skewed towards one class even though training metrics look okay.
Assign me.