Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify namespaces #387

Closed
breznak opened this issue Apr 9, 2019 · 43 comments
Closed

Simplify namespaces #387

breznak opened this issue Apr 9, 2019 · 43 comments
Labels
API_BREAKING_CHANGE code code enhancement, optimization, cleanup..programmer stuff question Further information is requested

Comments

@breznak
Copy link
Member

breznak commented Apr 9, 2019

I'd like to propose the following namespace reorganization in order to simplify its use.

nupic //contains all (Algorithms + NetworkAPI + basic types (UInt), not Tesing, Examples
nupic::testing // only for tests, users don't need this
nupic::examples // separate examples, not part of the library
(optional) nupic::networkapi // all of NetworkAPI, but don't expose Algorithms (for readability)
(optional) nupic::algorithms // all of nupic/algorithms,encoders, ... , but you don't see NetworkAPI; see #183 (but that did much more, this is just enough with the namespaces) 
nupic::bindings::py (later others) 
subnamespace ::internal // eg SpatialPooler is in nupic::algorithms, but 
  • simplified use of our lib, just using namespace nupic; or using TM = nupic::TemporalMemory;
  • would break API, but fix is really easy (keep only using namespace nupic;)
  • we could(should?) even keep using existing much more fine-grained namespace structure, and just expose all of them to a public namespace
// somewhere that gets imported everywhere (Log, Types?) 
namespace nupic {
using namespace nupic::algorithms::temporal_memory;
using namespace nupic::.... 
  • Q1: not necessary, but I'd like to offer separate nupic::networkapi & nupic::algorithms, because when working with these, you almost never mix the 2 layers, so the interface would be clearer.
  • Q2: do we want a subnamespace ::extras (or experimental, advanced,...)? to cleanup some of the numenta's (too public) API? Like SP: "public" ::nupic: compute, get/setXXX,...;[think of what is used in Hotgym example, what a consumer building an app uses]; ::extras (getOverlaps, TM::growSynapses, probably Connections... ) [think of what a developer needs for finetuning some algorithm, developing a new alternative implementation, etc ]
  • Q3: do we need subnamespace ::internal for stuff that is dependency, but should not be exposed to public ::nupic? //eg ::nupic::networkapi : TMRegion, while ::n::networkapi::internal : CppRegionImpl
    • or these would just keep using their specific namespaces, and only won't be exposed to "public" nupic::networkapi ? (I'm not sure I'm clear on this one, or if it really is an issue)

TL;DR: Q1,2,3 are nit necessary to solve (now), even without them this would be an improvement

@breznak breznak added question Further information is requested API_BREAKING_CHANGE code code enhancement, optimization, cleanup..programmer stuff labels Apr 9, 2019
@breznak breznak added this to the Release version 1.0 milestone Apr 9, 2019
@dkeeney
Copy link

dkeeney commented Apr 9, 2019

The purpose of namespaces is to protect ourselves from name conflicts in code that we don't control. We control all of nupic C++ code therefore only one namespace is needed.

So, in my opinion anything more than namespace nupic is an unnecessary complication.

@breznak
Copy link
Member Author

breznak commented Apr 9, 2019 via email

@dkeeney
Copy link

dkeeney commented Apr 9, 2019

I see namespaces also as organizational feature. Ie IDE would offer only code from imported ns,

Hmmm, never saw an IDE that does that.
No, the filesystem should be the organizational feature. We don't need two of them.
Also, I don't think we want access control at all. The objective is for users of the system to use this as a base for experimenting so ALL code should be accessible to them...maybe not all accessible from python but a C++ user should see anything they want.

It would be nice to identify the official API boundary but namespaces don't do that for us unless we drastically change the layout of our classes. Not worth it. Much more effective to just identify the API boundary with documentation.

@breznak breznak added question Further information is requested and removed question Further information is requested labels Apr 9, 2019
@breznak
Copy link
Member Author

breznak commented Apr 23, 2019

adding #389 (comment)

Matt wants to keep the nupic namespace for htmresearch and be separate from us. He may have been thinking PYPI namespace but I think it should also apply to the C++ namespace. The purpose of a namespace is to avoid collision of names. If they want to control the nupic namespce then we need something else. htm seems like a likely candidate if we can get it from PYPI.

We should validate this is true. If so, it must happen before v1.0, it'll also cause API breakage, although very simple to fix.

@dkeeney
Copy link

dkeeney commented Apr 24, 2019

This has been validated: https://discourse.numenta.org/t/python-3-pypi-community-fork-namespace/5819
Matt really does want us to change the namespace name. It is good in that it means we completely control the htm namespace but it will be some work.

This is going to be a big PR.
It is also not something that we can break up into multiple PR's and expect each to build properly.
So I propose the following:

  • Find a point when all on-going PR's have been pushed to Master then don't start any new PR's until this has been completed.
  • This will be basically a mass-replace of nupic to htm.
  • We start one PR and we all contribute parts of it. We divide the work so that we don't duplicate work.
    • I will take the NetworkAPI
    • @breznak would you can take the C++ algorithms.
    • @ctrl-z-9000-times would you can take all of the py/ and bindings/
  • After everyone has contributed all is pushed to that one PR it should build. I am not expecting much in terms of conflicts or munging of code that would break things.
  • We then need to change the repository name. I suggest HTM.core but I am open to most anything.
  • I will try to put together something what will allow our repository to be published to PYPI.
  • Since this is an API breaking change we will need some more documentation. Definitely in the README.md but perhaps also in a Wiki describing how to use the algorithms, both by direct calls and via the NetworkAPI. Also, building HTM apps via Python and via C++.
  • Advertise in the Forum the fact that the namespace names have changed.

@breznak
Copy link
Member Author

breznak commented Apr 24, 2019

Find a point when all on-going PR's have been pushed to Master then don't start any new PR's until ..

yes, but likely won't happen. I don't think we need to worry much, the conflict resolution will be very simple for any open PRs.

This will be basically a mass-replace of nupic to htm.

Do we go just with the substitution of nupic->htm, or we also proceed with the proposed simplification above?

breznak would you can take the C++ algorithms

👍

We then need to change the repository name. I suggest HTM.core

I'm not a fan of another repo renaming, but if needed be... Do other community projects use nupic.XXX, or htm.XXX? Eventually, htm.core sounds good to me.

This has been validated: https://discourse.numenta.org/t/python-3-pypi-community-fork-namespace/5819
Matt really does want us to change the namespace name.

Thanks for asking this, @dkeeney 👍

@dkeeney
Copy link

dkeeney commented Apr 24, 2019

Do we go just with the substitution of nupic->htm, or we also proceed with the proposed simplification above?

Most of the work is just a mass replace of nupic to htm.

As for the simplification, you know I am not a fan of layers of namespaces unless there is a real need for it. Anyone using our API would have to add all of the usings so this affects the users. Perhaps we need a discussion about this. Can you convince me it serves a purpose?

@breznak
Copy link
Member Author

breznak commented Apr 24, 2019

I am not a fan of layers of namespaces unless there is a real need for it

that's why I've proposed the simplification. But true that I would be some more spacific that just 1 namespace "htm".

Can you convince me it serves a purpose?

https://cppdepend.com/blog/?p=336
esp points 2,5.

  • for point 2, I'd like namespaces
    • ::htm (current nupic; + nupic::algorithms::*),
    • ::htm::networkapi
    • ::htm::testing, htm::examples ..as we have now for extra code, that is not necessarily needed for the library
  • for point 5, (this is really discutable, though I find it useful)
  • *::advanced to shield any extra/advanced functions. For example:
using htm; 
//you have SpatialPooler, and SP::compute() here
using htm::advanced (or if we keep matching with numenta's htm::algorithms::temporal_memory::advanced) 
would give you some "advanced" functions of that class, not only compute and get/set-ers. 

@dkeeney
Copy link

dkeeney commented Apr 24, 2019

https://cppdepend.com/blog/?p=336

  • Ok, I agree with point 1.
  • point 2 does not make since to me. The argument sounds like that just because it is a new feature of C++ we should use it. This results in namespaces with high cohesion and high modularity, and with minimal coupling between namespaces. ... I fail to see how namespace-by-feature does this.
  • point 5 is to hide details. This I agree is a good argument, but we are not trying to hide the algorithm entry points. Hiding the Connections class for example has a rational for a sub-namespace because we don't want normal users to be calling it directly.

Modularity is a good thing but that is achieved by controlling who calls what. I don't see how namespaces would prevent algorithm code from calling NetworkAPI entry points without also hiding them from the user.

What I would like is for our users to only need to know one namespace, and that should be htm.

@ctrl-z-9000-times
Copy link
Collaborator

I dont think that testing needs its own namespace, since the unit tests are not linked with the library, and there are no header files for the unit tests. The unit tests are effectively an application instead of a part of the library.

This also applies to the python bindings, which the user can not link with since they have no headr files and are not included in the nupic library.

@ctrl-z-9000-times
Copy link
Collaborator

I like the idea of simplifying namespaces, but i agree with dkeeney that the file system should be the main organizational feature, since it shows up in every include statement. Furthermore, I think that the namespaces should mirror the file system.

Also, we should not hide things from the users, esp. not the connections class.

@dkeeney
Copy link

dkeeney commented May 31, 2019

When I finish PR #490 I will be ready to switch "nupic" namespace to "htm".

This will include changing the file directory names as well....every place there is a "nupic" we change it to "htm". This will break the API big-time but this seems to be Matt's intent.

  • Do we want to change the name of the repository at the same time?
  • Do we want to decouple git from the numenta repository?
  • Do we want to change the names of anything while we are doing this? (perhaps "engine" becomes "networkapi" or something like that)
  • Do we want to re-structure anything in the process.

Give this some thought over the next few days.

@ctrl-z-9000-times
Copy link
Collaborator

ctrl-z-9000-times commented Jun 2, 2019

  1. These duplicate folders should be merged: Done

    • py/src/nupic/test/
    • py/src/nupic/tests/
  2. I think we should remove the folder py/src/, it's not needed, we can just put the python materials directly in the py folder. Done

    • py/src/nupic -> py/nupic
  3. The tests do not belong inside of the nupic library, but rather they should reside alongside of the library: Done

    • py/src/nupic/tests/ -> py/src/tests

@ctrl-z-9000-times
Copy link
Collaborator

@ctrl-z-9000-times would you can take all of the py/ and bindings/

Yes, I will take care of this.

@ctrl-z-9000-times
Copy link
Collaborator

ctrl-z-9000-times commented Jun 2, 2019

Do we want to re-structure anything in the process.

Here are some of my thoughts about this ...

We could remove the algorithms & encoders namespaces and just dump everything into nupic. This is what Breznak suggested in the first post, and I don't object to it. Done.

  • It would make it simpler to use the library (less searching for the right namespace).
  • On the other hand the names "encoder" and "algorithm" are good for organizing stuff and have clear meanings.
    • IMO "Algorithms" referes to biologically constrained algorithms, by which logic the SDR Classifier & Predictor classes should be moved to utils.

The namespace nupic::algorithms contains a namespace for every algorithm. We should definitely drop the innermost namespaces and just use at most nupic::algorithms.

  • nupic::algorithms::SpatialPooler
  • nupic::algorithms::TemporalMemory
  • Etc.

We should remove the namespace nupic::math because it only contains one file Topology. All of the other files in the folder "src/nupic/math/" use the namespace nupic. Done.

Also remove namespace nupic::sdr. This namespace has a single item in it, and every other file in the folder "src/nupic/types" uses the namespace nupic. The SDR class will be used by every application, so it should be included by default with using namespace nupic Done.


I'd like to merge the folders "src/nupic/math/" and "src/nupic/types/" into folder "src/nupic/utils/". There does not seem to be a real distinction between a type, utility, and math.

  • It's C++, everything is a class (a type)
  • It's all useful and tangential to the purpose of the library, making them all utilities
  • It's a computer, its all math

UPDATE: Math directory was merged into utils.


I'd like to clean up the python namespace too. I don't want the user to need to search through the bindings, and it's very easy to add a link from a pure python module to the bindings.

  • Add links from nupic.algorithms to nupic.bindings.algorithms
  • Add links from nupic to nupic.bindings.X for several of the miscellaneous but still useful things, like Random and SDR.

@dkeeney
Copy link

dkeeney commented Jun 4, 2019

I would like to fold ntypes into engine so that all networkAPI files are together.
And to be consistent with what @ctrl-z-9000-times is proposing, I will add networkapi namespace to all of them.

@dkeeney
Copy link

dkeeney commented Jun 4, 2019

Our nupic.cpp repository has diverged sufficiently from the Numenta's nupic.core repository that being a clone of their repository is no longer useful. Since we are making a major change to remove nupic from everything, I propose that we start a new repository... something like htm.core which is not a clone of anything. We will pre-populate it with code from nupic.cpp with nupic changed to htm.

The advantage of doing so will be to remove the git history baggage that has accumulated.

@ctrl-z-9000-times
Copy link
Collaborator

The advantage of doing so will be to remove the git history baggage that has accumulated.

+1 good idea. Since you bring up this issue I checked the size of the repo:
A new clone of nupic.cpp is 301M
After removing the version control .git/ it is 3.6M

@dkeeney
Copy link

dkeeney commented Jun 7, 2019

When PR #490 is merged to master, the next big project is to replace nupic with htm. This will change all of the file system paths and will require an entirely new repository. I will wait until all three of us have completed all of our current projects. Then we can do a mass replace of nupic to htm and place the modified files in in the new repository. This should not need to contain any real logic changes...just renaming and maybe moving a few files between directories.

So, let me know when you are ready to start on this and each of us can all grab a snapshot of the nupic.cpp at the same time and start our part of this conversion.

  • We will need to decide if we should break the git clone of the numenta nupic.core and start fresh.
  • The new repository needs to be created and setup with write permissions (@breznak )
  • Each of us Grab a snapshot of nupic.cpp
  • Each of us do the mass replace and check in out parts to the new repository with PRs. The PR's will not compile until all parts are back in place in the new repository so the merge blocking will need to be overwritten for this step.
  • Each of us will need to write a wiki for our parts to describe how to use it with the new names. Perhaps using the Numenta's docs as a guide. http://nupic.docs.numenta.org/stable/guides/index.html

Let me know what you think about this plan.

@ctrl-z-9000-times
Copy link
Collaborator

This [...] will require an entirely new repository.

Actually I think an admin can rename an existing repository.

@dkeeney
Copy link

dkeeney commented Jun 8, 2019

Actually I think an admin can rename an existing repository.

But do we want to keep the old history?

@ctrl-z-9000-times
Copy link
Collaborator

But do we want to keep the old history?

No, I dont think the git history is useful. But, I want to keep the currently open issues and PR's. Maybe there is some way to squash all of the commits into one big commit, and then do a force-push to the repo? I'd try it on a test fork first...

@dkeeney
Copy link

dkeeney commented Jun 10, 2019

squashing would be a good way to do it. Yes, the open issues we want to keep. The open PR's maybe not so much.

@dkeeney
Copy link

dkeeney commented Jun 10, 2019

We probably want to keep an official snapshot of the current repository because once we start this namespace name change the NetworkAPI will no longer be API compatible.

@dkeeney
Copy link

dkeeney commented Jun 10, 2019

One thing I would like to do before we change the namespaces is to combine the engine/ and ntypes/ folders into networkapi/. This would not break the API. I will create a PR to do that.

@breznak
Copy link
Member Author

breznak commented Jun 10, 2019

Do we want to decouple git from the numenta repository?

no, what would be an advantage? This gives still some "visibility" and makes it easier to merge/review patches around.

IMO "Algorithms" referes to biologically constrained algorithms, by which logic the SDR Classifier & Predictor classes should be moved to utils.

possibly. although it adds another confusion for a switching user.

'd like to merge the folders "src/nupic/math/" and "src/nupic/types/" into folder "src/nupic/utils/". There does not seem to be a real distinction between a type, utility, and math.

I'm ok do do whatever with Math, Math.hpp is now just dummy file, can be removed.
I'd keep types & utils, as there's a logical distinction in the two (UInt, SDR vs. support VectorFiles, StringUtils, ...)

I will add networkapi namespace to all of them.

I like the idea of networkapi namespace, and everything needed for NetworkAPI under one folder and common namespace (this was suggestion of #183 )

I propose that we start a new repository...

I don't think this is a good idea, this repo is still a logical successor of nupic.core, what advantage would this change bring? It would break any existing forks and make PRs more difficult.

A new clone of nupic.cpp is 301M

ok, but honestly, who today cares of downloading 300MB? And it's a one time event (clone)

TL;DR I tried to catch up with the ongoing discussion. Can we keep this task low, "just" rename nupic namespace to htm (or what?) and do the discussed merges of ns, ... and leave the other for separate tasks?

@dkeeney
Copy link

dkeeney commented Jun 10, 2019

Can we keep this task low, "just" rename nupic namespace to htm

Yes, I agree that the nupic namespace change PR should have nothing other than the renaming of everything nupic to htm.

These other things should be separate PR's.

@dkeeney
Copy link

dkeeney commented Jun 10, 2019

I don't think this is a good idea, this repo is still a logical successor of nupic.core, what advantage would this change bring? It would break any existing forks and make PRs more difficult.

Well, ok but I don't see the point of keeping it a fork of nupic.core if all of the names are intentionally different.

As a side note..., is there a way to fix this so the default repository is htm_comminity repository rather than Numenta's repository when I create a new branch. If I forget to change it my pr ends up in nupic.core rather than nupic.cpp.

@ctrl-z-9000-times
Copy link
Collaborator

Hi Breznak, Im glad your back.

I'm ok do do whatever with Math, Math.hpp is now just dummy file, can be removed.

Currently it contains the definition for Epsilon the floating point accuracy.

I will submit a PR to move that to "nupic/types/Types.hpp" and delete the file Math.hpp

@breznak
Copy link
Member Author

breznak commented Jun 11, 2019

I will submit a PR to move that to "nupic/types/Types.hpp" and delete the file Math.hpp

maybe have Epsilon in Connections.hpp? but Types is also OK for constants

@dkeeney
Copy link

dkeeney commented Jun 11, 2019

Would you like me to add the networkapi namespace to engine and ntypes now or wait until after the nupic to htm rename?

@ctrl-z-9000-times
Copy link
Collaborator

ctrl-z-9000-times commented Jun 11, 2019

Would you like me to [...] now or wait until after the nupic to htm rename?

You should probably go ahead and do it now. Renaming the namespace & repo can be the last thing we do before we release it. I don't think there will ever be a convenient time when there are no open PRs, so it should not hold up other necessary tasks.


In the next day or two I will submit PR's to remove the following namespaces, unless there are any objections:

  • nupic::encoders
  • nupic::algorithms
  • nupic::algorithms::spatial_pooler
  • nupic::algorithms::sdr_classifier
  • nupic::algorithms::anomaly
  • nupic::algorithms::temporal_memory
  • nupic::algorithms::connections

All of these things will be dumped into namespace nupic

EDIT, also remove

  • nupic::util

@dkeeney
Copy link

dkeeney commented Jun 11, 2019

Hmmm, if all of these are being dumped into namespace nupic, then I should leave all of networkAPI in the nupic namespace also.

@dkeeney
Copy link

dkeeney commented Jun 12, 2019

For the nupic to htm name switch I am proposing the following script be ran from the top directory:

rm -rf build
rm -rf .pytest_cache
find . -type f -name "*.py" -print0 | xargs -0 -n 1 sed -i -e 's/nupic/htm/g' 
find . -type f -name "*.hpp" -print0 | xargs -0 -n 1 sed -i -e 's/nupic/htm/g' 
find . -type f -name "*.cpp" -print0 | xargs -0 -n 1 sed -i -e 's/nupic/htm/g' 
find . -type f -name "*.md" -print0 | xargs -0 -n 1 sed -i -e 's/nupic/htm/g' 
find . -type f -name "*.txt" -print0 | xargs -0 -n 1 sed -i -e 's/nupic/htm/g' 
find . -type f -name "*.cmake" -print0 | xargs -0 -n 1 sed -i -e 's/nupic/htm/g'
find . -type f -name "*.yml" -print0 | xargs -0  -n 1 sed -i -e 's/nupic/htm/g' 
find . -type f -name "*.cfg" -print0 | xargs -0  -n 1 sed -i -e 's/nupic/htm/g'
find . -type f -name "*.bat" -print0 | xargs -0  -n 1 sed -i -e 's/nupic/htm/g'

find . -type f -name "*.bat" -print0 | xargs -0  -n 1 sed -i -e 's/NUPIC/HTM/g'
find . -type f -name "*.sh" -print0 | xargs -0  -n 1 sed -i -e 's/NUPIC/HTM/g'
find . -type f -name "*.md" -print0 | xargs -0  -n 1 sed -i -e 's/NUPIC/HTM/g'
find . -type f -name "*.txt" -print0 | xargs -0 -n 1 sed -i -e 's/NUPIC/HTM/g'
find . -type f -name "*.hpp" -print0 | xargs -0  -n 1 sed -i -e 's/NUPIC/HTM/g'
find . -type f -name "*.cpp" -print0 | xargs -0  -n 1 ksed -i -e 's/NUPIC/HTM/g'
find . -type f -name "*.in" -print0 | xargs -0  -n 1 sed -i -e 's/NUPIC/HTM/g'

git mv src/nupic src/htm
git mv py/nupic py/htm

I am doing a trial run now to see if this builds correctly.

But before we can run this for real, all currently open projects should be completed and merged to master. Git might be able to figure out the file name changes but this avoids problems if there should be conflicts in PR's started before the name switch.

@dkeeney
Copy link

dkeeney commented Jun 12, 2019

There is one more name switch related thing....In the top of every source file is the copyright notice, starting with Numenta Platform for Intelligent Computing (NuPIC).
Should this be changed to something like HTM community version of Numenta's Platform for Intelligent Computing (NuPIC). I am open to better wording.

@dkeeney
Copy link

dkeeney commented Jun 12, 2019

I missed one rename in the proposed script:
git mv bindings/py/src/nupic bindings/py/src/htm
With that fix, the unit tests all run.

@breznak
Copy link
Member Author

breznak commented Jun 12, 2019

find . -type f -name "*.py" -print0 | xargs -0 -n 1 sed -i -e 's/nupic/htm/g'

I don't think this would work, we don't want, need to replace all NuPIC (implementation, "community") for HTM (theory), just the namespace nupic { to namespace htm {

@dkeeney
Copy link

dkeeney commented Jun 12, 2019

Yes I think that is the point. If I understand correctly, what @rhyolight is looking for is a complete break.
Any place the name nupic appears should become htm. The nupic name is to be reserved for Numenta's exclusive use.

@breznak
Copy link
Member Author

breznak commented Jun 12, 2019

hmm..is that the case? i think NuPIC stands for Numenta's (community) & platform for intelligent computing (HTM): "Numenta Platform for Intelligent Computing is an implementation of Hierarchical Temporal Memory (HTM), a theory of intelligence based strictly on the neuroscience of the neocortex"

ie the whole time I was talking about nupic namespace (?)

@dkeeney
Copy link

dkeeney commented Jun 12, 2019

No, NuPIC does not include 'community'. It might have been the intent at one time but as I understand it, they now want the nupic name to be exclusive for Numenta's use. And they have the right to ask for that.

Anyway, the script as I wrote it does not change NuPIC in the copyright notice and I was wondering if it should...or the copyright notice text might need some minor rewording to make the context more clear. I think we need to defer to @rhyolight for clarification on that.

@ctrl-z-9000-times
Copy link
Collaborator

ctrl-z-9000-times commented Jun 15, 2019

@dkeeney,
I think that this weekend would be a good time to run your script and do the renaming.

Also, so that we don't forget:

  • TODO: Update the API changelog, and python's change log too
  • TODO: Update README
  • TODO: Update license notice?
  • TODO: Craft an announcement for Numentas forums, we can post drafts of it here for discussion & collaboration.
  • Remove namespaces for python bindings and unit tests.

@dkeeney
Copy link

dkeeney commented Jun 15, 2019

Ok, I will create the PR to do the C++ code.
I would like @breznak to review it to make sure he is in agreement as to what is renamed.

@ctrl-z-9000-times
Copy link
Collaborator

Resolved in many PR's. Great work guys!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API_BREAKING_CHANGE code code enhancement, optimization, cleanup..programmer stuff question Further information is requested
Projects
None yet
Development

No branches or pull requests

3 participants