Remove the parameter drop_duplicates #182

Dref360 merged 4 commits into keras-team:master from rragundez:remove-drop-duplicates on Mar 13, 2019



rragundez commented Mar 8, 2019


Currently, the `drop_duplicates` parameter of `flow_from_dataframe` and `DataFrameIterator` can deduplicate filenames in the user's input. My arguments for removing it are:

1.- Just as `sort` and `has_ext` are unrelated to the logic the user expects `DataFrameIterator` to handle, `drop_duplicates` is also outside the scope of what the class should do. See #123 and #122 for more explanation.
2.- It produces behavior that users do not expect, and several issues have already been raised because of it: #96, #175 and #181.
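
If the parameter is removed, a user who still wants unique filenames could deduplicate the DataFrame explicitly before handing it to the generator. The following is only a minimal sketch of that idea using pandas; the column names `filename` and `class` and the file names are hypothetical and chosen for illustration.

```python
import pandas as pd

# Hypothetical input: column names and file names are illustrative only.
df = pd.DataFrame({
    "filename": ["cat_1.jpg", "dog_1.jpg", "cat_1.jpg", "dog_2.jpg"],
    "class": ["cat", "dog", "cat", "dog"],
})

# Explicit, caller-side deduplication on the filename column.
# keep="first" retains the first occurrence of each filename.
deduplicated = (
    df.drop_duplicates(subset="filename", keep="first")
      .reset_index(drop=True)
)

print(len(df), len(deduplicated))
```

The deduplicated frame would then be the one passed to `flow_from_dataframe`. A caller who repeats rows on purpose would simply skip this step, so the choice stays visible in user code rather than hidden behind a default.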

Related Issues

#96 #175 #181

PR Overview

  • [n] This PR requires new unit tests [y/n] (make sure tests are included)
  • [y] This PR requires a documentation update [y/n] (make sure the docs are up-to-date)
  • [y] This PR is backwards compatible [y/n]
  • [n] This PR changes the current API [y/n] (all API changes need to be approved by fchollet)

rragundez requested a review from Dref360 on Mar 8, 2019

rragundez force-pushed the rragundez:remove-drop-duplicates branch from c5f2351 to 11c5a69 on Mar 13, 2019

Dref360 merged commit 7c3e2f6 into keras-team:master on Mar 13, 2019

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed

rragundez deleted the rragundez:remove-drop-duplicates branch on Mar 13, 2019
