fixing various errors on the file classification_with_grn_and_vsn #2011
Dataset preparation errors
The example file `classification_with_grn_and_vsn.py` from `structured_data` appears to use the wrong dataset: the current `data_url`, https://archive.ics.uci.edu/static/public/20/census+income.zip, downloads an incorrect dataset. I believe the correct `data_url` should be https://archive.ics.uci.edu/static/public/117/census+income+kdd.zip.

A fix has been added to extract the `.tar.gz` file downloaded via `keras.utils.get_file`. A fix was also added to clean up the directory that the files were extracted to during download, so that the script can be run again without errors.
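As a rough illustration of the extract-and-clean-up idea described above, here is a minimal sketch that uses only the Python standard library. It is not the code from this pull request: the helper names `extract_fresh` and `find_file` are hypothetical, and the sketch assumes the archive has already been downloaded to a local path.

```python
import os
import shutil


def extract_fresh(archive_path, extract_dir):
    """Extract an archive into extract_dir, removing stale contents first.

    Hypothetical helper: deleting the directory before extracting means a
    second run starts from a clean state instead of tripping over old files.
    """
    if os.path.isdir(extract_dir):
        shutil.rmtree(extract_dir)
    os.makedirs(extract_dir)
    # unpack_archive infers the format (.zip, .tar.gz, ...) from the extension.
    shutil.unpack_archive(archive_path, extract_dir)
    return extract_dir


def find_file(root, filename):
    """Return the path of filename anywhere under root.

    Hypothetical helper: searching avoids hard-coding whichever
    subdirectory the extraction step happened to create.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        if filename in filenames:
            return os.path.join(dirpath, filename)
    raise FileNotFoundError(f"{filename} not found under {root}")
```

Searching for the data files by name, rather than joining fixed path segments, keeps the script working even if the layout of the extracted archive changes.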
Additionally, the original script has the code snippet:
The above snippet doesn't account for the directory created when `keras.utils.get_file` extracts `census+income+kdd.zip`, which leads to an incorrect path for both `train_data_path` and `test_data_path`. A fix has been added.

Additional training errors
Beyond the dataset preparation issues above, the script also fails during model training. The error is detailed below, followed by an attempted solution:
Attempted solution:
I believe I have traced the error to the following; here is the pdb output:
The function `_convert_inputs_to_tensors` creates a `zip` iterator pairing together `flat_inputs` and `self._inputs`. As per the `pdb` output above, the first element (age) has a `float32` dtype in both `flat_inputs` and `self._inputs`. The second element (capital_gains), however, has a `float32` dtype on one side and a `string` dtype on the other. This discrepancy causes the error.

The main issue is that the `inputs` argument passed to the method `Functional.call` is an `OrderedDict`, and in the function `_standardize_inputs` the line `flat_inputs = tree.flatten(inputs)` does not actually order/sort the `OrderedDict` as the documentation for `tree.flatten` describes. This contributes to the mismatch between `self._inputs` (the model's inputs) and `flat_inputs`. Hence a fix has been provided in the script function `process` to convert the features to a `dict`.

The fix is provided, and I think the `tree.flatten` functionality must be assessed and rectified.
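To make the ordering problem concrete, here is a small pure-Python sketch of how positional pairing with `zip` goes wrong when two sequences are ordered differently. It does not call Keras. `toy_flatten` is a hypothetical stand-in, written under the assumption (taken from the description above) that a plain `dict` is flattened in sorted-key order while an `OrderedDict` keeps its insertion order. The feature name `class_of_worker` and the column order are also illustrative.

```python
from collections import OrderedDict


def toy_flatten(mapping):
    """Toy stand-in for a tree-flatten call (NOT the Keras implementation).

    Assumption for illustration: a plain dict is flattened in sorted-key
    order, while an OrderedDict keeps its insertion order.
    """
    if isinstance(mapping, OrderedDict):
        return list(mapping.values())
    return [mapping[key] for key in sorted(mapping)]


# Hypothetical model inputs as (name, expected dtype), in sorted-name order.
model_inputs = [
    ("age", "float32"),
    ("capital_gains", "float32"),
    ("class_of_worker", "string"),
]

# Hypothetical feature batch in file-column order; values stand in for dtypes.
batch = OrderedDict(
    [("age", "float32"), ("class_of_worker", "string"), ("capital_gains", "float32")]
)


def mismatches(features):
    """Pair flattened features with model inputs by position, as zip does."""
    flat = toy_flatten(features)
    return [
        (name, expected, got)
        for (name, expected), got in zip(model_inputs, flat)
        if expected != got
    ]


# Insertion order differs from sorted order, so positions no longer line up.
print(mismatches(batch))
# Converting to a plain dict restores sorted-key order in this toy model.
print(mismatches(dict(batch)))
```

In this toy model the first element (`age`) matches in both cases, while the second position pairs `capital_gains` with a `string` value when the `OrderedDict` is used. That mirrors the pattern reported in the `pdb` output, and it disappears after converting to a `dict`.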
Environment