consider limiting the scope of the library #2

kmike · 2018-09-26T22:09:12Z

Hey! I came here from the http://explained.ai/decision-tree-viz/index.html article (which is a great write-up 👍). One thing bothered me => unsolicited suggestion.

In the README you have:

This is the start of a python machine learning library to augment scikit-learn. At the moment, all we have is functionality for decision tree visualization and model interpretation.

From a user perspective: why not have a package just for tree visualization and interpretation, and have this planned "python machine learning library to augment scikit-learn" depend on a tree visualization & inspection package, if needed?

The task of inspecting trees looks quite specific; do you really need to add solutions for unrelated problems (some ML algorithms?) to the same package? Obviously, I don't know what you've planned for the whole library, so please excuse me if that doesn't make any sense :)

A few use cases for having a dedicated tree viz library:

- tree visualization has a few dependencies, which other code may not need; it can be the other way around as well: people who want tree visualization may have to install unrelated dependencies. It is solvable, but needs care.
- https://github.com/TeamHG-Memex/eli5 is not developed actively right now, but we'd absolutely consider making your tree visualization a default, which implies having it as a dependency (probably an optional one); depending on a general-purpose ML library just for its visualization features is less nice than depending on a visualization/inspection library.
- by having separate packages you may get different release schedules, different contributors, etc. In a large package some code usually gets outdated and deprecated over time. If deprecated code is a separate package, one can just leave it as-is: no need to remove it from an all-in-one library, and no need to maintain it if there is no motivation.
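
To make the "optional dependency" point concrete, here is a minimal sketch of how a downstream library might guard an optional visualization package. The function names and the default module name `hypothetical_tree_viz` are assumptions made up for illustration; they are not part of eli5 or of this repository. In a real project the optional package would typically also be declared under `extras_require` in `setup.py`.

```python
import importlib.util


def has_optional_dependency(module_name: str) -> bool:
    """Return True if a top-level module with this name is importable."""
    return importlib.util.find_spec(module_name) is not None


def render_tree(tree_description: str,
                viz_module: str = "hypothetical_tree_viz") -> str:
    """Hypothetical entry point, for illustration only.

    Uses the richer visualization path when the optional package is
    installed, and falls back to plain text otherwise.
    """
    if has_optional_dependency(viz_module):
        return f"[rich rendering via {viz_module}] {tree_description}"
    return f"[plain-text fallback] {tree_description}"
```

With a small, focused visualization package, the guarded install stays cheap; with an all-in-one ML library, the same guard would pull in everything that library depends on.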

parrt · 2018-09-28T00:25:42Z

Hiya! An excellent question. I thought about that, but I was actually planning on incorporating all of the goodies that I use regularly for machine learning into one repository. Still, what you are saying makes sense. Jeremy @jph00, what are your thoughts here? Should we isolate the graphics part with all of its dependencies? If we did that, then I should probably rename this repository to something like dtreeviz.

parrt · 2018-09-29T15:31:25Z

I'm going to rename to dtreeviz. :)

parrt · 2018-09-29T15:57:52Z

Fixed with 8411604: renamed library to dtreeviz; adjusted packages, notebooks, and README. @kmike

kmike · 2018-09-29T16:33:19Z

Thanks a ton @parrt!

parrt mentioned this issue Sep 28, 2018

enormous repo size (~400mb) #3

Closed

parrt closed this as completed Sep 29, 2018

parrt added the enhancement (New feature or request) label Sep 29, 2018
