This repository has been archived by the owner on Sep 1, 2023. It is now read-only.

Document the proposed directory structure for nupic.core and nupic #591

Closed
rhyolight opened this issue Jan 29, 2014 · 73 comments

Comments

@rhyolight
Member

As a part of the nupic.core extraction, we need to create some documentation of the proposed directory structure after the extraction is complete. This should include both repositories, as well as details about where the nupic.core dependency exists within nupic.

@subutai
Member

subutai commented Jan 30, 2014

@david-ragazzi If I remember correctly you took a crack at laying out the directory structure before. Do you want to propose something?

Cheers, Subutai

@subutai subutai mentioned this issue Feb 4, 2014
@david-ragazzi
Contributor

Yes @subutai !

I think we should have a default directory structure for each repo so that the Nupic repositories follow the most widely used convention in open-source projects.

PR #499 has the details of an example for the Nupic repository:

But this is the structure that I suggested later. :-) The only difference is the location of the directories. Just to compare:

  • $NUPIC becomes $REPOSITORY/Source
  • $NTA becomes $REPOSITORY/Release
  • /tmp/builddir could stay the same,
    and Build_System (IDE solution file or Makefiles generated by CMake) would be placed outside /Source (which would leave the /Source folder even cleaner).

On first configuration, CMake would create default values for $NUPIC and $NTA based on the source dir from which CMake was called. After that, the user could feel free to change the release location just by changing the $NTA value (thanks @breznak for this observation!!).

Below is a screenshot of this model:
[Screenshot: screen shot 2013-12-16 at 9 35 49 pm]

In the above case, $NTA is set to "~/Desktop/nupic-master/Release". But if a user wants to change to another location, it is just a matter of reconfiguring $NTA.

@scottpurdy commented on something similar in other messages when he suggested a "src" folder that holds only code. I don't remember where.... hehe

Summary of the discussion

Proposed structure so far:

nupic.core
    |-- LICENSE.TXT
    |-- README.md
    |-- build
    |    |-- ALL_GENERATED_FILES_GO_HERE
    |    |-- release
    |    |    |-- bin
    |    |    |-- include
    |    |    |-- lib
    |    |-- scripts
    |    |    |-- BUILD_SCRIPTS_OR_IDE_GENERATED_BY_CMAKE
    |-- doc
    |    |-- BOTH_GENERATED_PLUS_MANUALLY_WRITTEN_DOCS
    |-- external
    |    |-- MIMIC_NUPIC_FOR_NOW_STOP_GAP_MEASURE
    |-- include
    |    |-- NUPIC_INCLUDE_FILES_REPRESENTING_EXTERNAL_API
    |-- src
    |    |-- CMAKELISTS.TXT
    |    |-- main
    |    |    |-- SOURCE_FILES_RELATED_TO_PROJECT
    |    |-- test
    |    |    |-- SOURCE_FILES_RELATED_TO_AUTOMATED_TESTS
    |    |-- examples
    |    |    |-- SOURCE_FILES_RELATED_HELLO_WORLDS_AND_FULL_EXAMPLES
    |    |    |-- some_example
    |    |    |-- app1
    |    |    |-- app2

In the Travis file of the nupic.core repository, the process is something like:


# start at the repository root; create and enter the scripts folder,
# i.e. the folder that will hold the Makefiles generated by CMake
mkdir -p build/scripts
cd build/scripts
# call CMake on the source folder; the Makefiles are generated in the current folder
# and "build/release" is passed as the release (install) directory
cmake ../../src -DPROJECT_BUILD_RELEASE_DIR="$PWD/../release"
# call Make to build the project;
# the binary files will be located in the "build/release" folder
make

@scottpurdy
Contributor

I like @david-ragazzi's suggestions. I am more familiar with C++ code living inside a src directory in the root of the repo and building into a build directory before being copied to the install location (make builds into build/bin, build/lib, etc. dirs and make install copies to installation locations). But I am open to whatever people think will be most obvious for newbies joining the project.

I looked for a C++ project on Github and the first one I found was https://github.com/rethinkdb/rethinkdb
It has code in src and builds to build/release. So kind of similar to David's suggestion and kind of similar to what I am used to seeing. And that is just one data point.

@david-ragazzi
Contributor

@scottpurdy I liked the project structure in your suggested link.

Just to avoid the same confusion as in PR #499:
The "build_system" folder is not the same as the "build" folder. "build_system" just stores the files generated by CMake (i.e. IDE solution or Makefiles), while "build" is the compiled output produced from the files in "build_system". Anyway, we could rename "build_system" to "gen" or "generated" or something similar, to avoid this confusion of terms...

@rhyolight
Member Author

I never liked the name build_system, so I would prefer a rename as well. generated sounds good to me, but is there any name that is standard for a directory that contains these types of generated files for C++ projects?

@david-ragazzi
Contributor

@rhyolight I have no idea.. but I believe there's no default name...

Some suggestions:

  • generated
  • project (it's quite intuitive for both IDE solutions and make scripts.. I like it.. )
  • ide (it is intuitive, although CMake generates Make scripts in case the user doesn't want an IDE solution)
  • cmake

What do you think?

Ah.. this folder name is only for internal use (i.e. the Travis build), so any other name wouldn't cause any problem with the CMake file. The user is free to choose any name. However, I believe that even so everyone should adopt this convention to avoid future confusion over names on the mailing list. Just a suggestion..

@deanhorak

I've seen "work", "temp" or "scratchpad" used variously for such transient directories. I'm not aware of any standard name however.

@fergalbyrne
Member

I'd suggest keeping all the directory names lowercase, and src, build (with build/libs, build/bin) etc. are more conventional than the other suggestions. Everyone will simply understand those names.

@subutai
Member

subutai commented Feb 6, 2014

I like @fergalbyrne 's suggestions. With python packages that require C++ compilation, the convention is also to put everything under build. So, build/lib, build/bin, etc. They also put generated files in there. So build/temp would contain the generated files. This way the user just has one directory to delete if they want to manually clean everything.

@david-ragazzi
Contributor

I like @subutai's idea of putting everything related to the build under build. Although I think build/temp is not very intuitive; maybe build/scripts could be better. Something like:

  • doc: doc files
  • src: source files
  • build: no files
  • build/scripts: generated build scripts from CMake.
  • build/release: compiled files (libraries and binaries)

This way we combine @subutai, @scottpurdy and @fergalbyrne suggestions in a single structure.

@subutai
Member

subutai commented Feb 6, 2014

Sounds good to me.

@deanhorak

Yes, everything generated should definitely be placed in a subdirectory under "build". Within the build directory most projects have various directories indicating what gets placed in each rather than a catchall "temp" directory. For instance, Chromium has "master", "scripts", "site-config", "slave", "test", etc all within the build directory.

@sjmackenzie

git clone nupic.core
mkdir build
cd build
cmake ../nupic.core

Therefore build does not need to be part of the git repo at all.
My suggestion is to keep the standard src, include, doc, etc. format and forget about build, since we are moving away from autotools. Having build as part of the repo is an autotools mentality, I believe.

Completely separating build from the repo means we never have to worry about committing generated artifacts into the repo by mistake.

As we will eventually be building nupic.core separately from nupic, we don't need to worry about a build dir. This approach works equally well during the stop-gap period, though, in that nupic.core can exist as part of nupic's directory structure (i.e. nta).
For example:

git clone nupic
cd nupic
git submodule init
git submodule update  # this pulls nupic.core into nta and is abstracted out in build.sh
cd ..
mkdir build
cd build
cmake ../nupic

CMake would plough into nupic, and the generated files for nta/nupic.core would also be put into build without any fuss at all.

So we could, for example, have this directory structure:

(mkdir) numenta
----------> (git clone) nupic.python
----------> (git clone) nupic.python-test-feature1
----------> (git clone) nupic.core
----------> (git clone) nupic.core-test-feature3
----------> (mkdir) builds
------------------> (mkdir) nupic.python
------------------> (mkdir) nupic.python-test-feature1
------------------> (mkdir) nupic.core
------------------> (mkdir) nupic.core-test-feature3

Say you wanted to build numenta/builds/nupic.core; you would:

cd numenta/builds/nupic.core
cmake ../../nupic.core
make

Sometimes one just wants a clean git clone for testing.
This approach just keeps things clean.
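The out-of-source layout above can be sketched in a few lines of Python. This is a rough illustration, not an existing tool: `out_of_source_plan` is a hypothetical helper that only computes the build directory and the commands one would run, assuming sources live in `<workspace>/<checkout>` and builds in `<workspace>/builds/<checkout>` as in the tree above.

```python
import os
from pathlib import Path


def out_of_source_plan(workspace, checkout):
    """Return (build_dir, commands) for building `checkout` outside its source tree.

    Sources are expected in <workspace>/<checkout> and the build output in
    <workspace>/builds/<checkout>, mirroring the layout described above.
    """
    workspace = Path(workspace)
    source_dir = workspace / checkout
    build_dir = workspace / "builds" / checkout
    # Path from the build dir back to the source dir, e.g. ../../nupic.core
    rel_source = os.path.relpath(source_dir, build_dir)
    # Nothing is executed here; the caller decides how to run these commands.
    commands = [["cmake", rel_source], ["make"]]
    return build_dir, commands


if __name__ == "__main__":
    build_dir, commands = out_of_source_plan("numenta", "nupic.core")
    print("cd", build_dir.as_posix())
    for command in commands:
        print(" ".join(command))
```

Because the source checkout is never written to, a feature-branch clone such as nupic.core-test-feature3 gets its own build folder simply by passing a different `checkout` name.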

@rhyolight
Member Author

@david-ragazzi Can you take the suggestions we've received above and re-draft your initial proposal?

@david-ragazzi
Contributor

@sjmackenzie I understand your concern, but I don't believe this is Autotools-specific.. And although Autotools uses a similar convention, the concept is not restricted to that tool.

@ALL: Travis itself could update the build folder when it compiles the repo (i.e. we wouldn't delete these folders in each build made by Travis). This way any newbie could directly download the generated binaries in case they aren't familiar with the source or even with C++ code!

@sjmackenzie

Ah yes correct there is the -prefix flag. Forgive me you are correct.

@sjmackenzie

@david-ragazzi One could easily put the build folder in the repo for travis, for whatever reason. It makes no difference. Secondly newbies wanting to dip their hands into compiling nupic.core will be reading the build instructions so the directory structure is painfully easy to follow. From a development point of view (not Travis, nor newbies) this structure is fluid and easy to follow.

That's great that Travis has a 'download binaries' feature!

@sjmackenzie

don't forget the include folder.

@david-ragazzi
Contributor

@sjmackenzie

Secondly newbies wanting to dip their hands into nupic.core will be reading the build instructions for so the directory structure is painfully easy to follow.

Isn't such information about how to get only the binaries supposed to be in Readme.md?

And although I'm not a GitHub expert, I believe that it has some package management, i.e. it packages only the source or only the binaries, which users could download separately..

don't forget the include folder.

Do you mean the folder with header files when you say include? Isn't such a folder supposed to be a subfolder of src?

@sjmackenzie

I amended my comment by adding into compiling - ie into compiling nupic.core

@sjmackenzie

@david-ragazzi

Typically you do not need to worry about binary distribution.

  • Linux distros package maintainers will handle this.
  • Also a make install can be achieved by the slightly advanced newbies
  • Numenta will most likely have a download section somewhere that allows complete newbies to download the needed "Matt's stamp of approval" binaries.

Our main concern is making life easy for developers and achieving a flexible yet standardized development environment that becomes the 'culture' of nupic development communicated via the Readme.

@david-ragazzi
Contributor

@sjmackenzie

Maybe I am confusing things.. but.. isn't Nupic dependent on Nupic.Core, and not the inverse? From my understanding, Nupic should only get the output generated by Nupic.Core, not interfere in the Nupic.Core build process.
Furthermore, this is a default structure used by many projects, as expressed by the majority of the members. I don't think it is a painful structure to follow (except in the special case that you cited).

@sjmackenzie

Dependency is not part of my discussion. But I will include it now as I see a discontinuity between what we are saying.

This is what we are working towards that we do not have now:
nupic.core is an independent artifact that exists as an installed library somewhere on the system, installed manually or via a package manager
nupic.python is dependent on 'nupic.core' being installed manually or via a package manager.
nupic.python does not build or install nupic.core
nupic.core does not build or install nupic.python

What we have now ( a temporary stop-gap situation that will help us to the above goal) is to allow nupic to drive the building process.
We first must make sure nupic is completely moved over to CMake.
As a byproduct of doing the transition nupic.core should be able to be built independently.

The transition:
I suggest that nupic completes its transition to CMake. Once this is done we can start evolving the directory structure of nupic.core.

This way the transition is safer and core builds independently. Changes are made in small steps while all of nupic stays stable.

If you want to focus only on getting nupic.core to build with cmake and change directory structure at the same time, yet make sure nupic existing build system can talk to nupic.core cmake build system then go for it. But I don't suggest it.

Hence it is better to get nupic as a whole building with cmake (in mainline).

That is why build being included in the repo or not makes absolutely no difference in cmake world. Directory structure is not important at this stage. src include doc etc is the standard layout structure. It seems obvious that we should adopt it.

@rhyolight
Member Author

@sjmackenzie 👍

@david-ragazzi
Contributor

@sjmackenzie Now I understand you and I agree on many points.

However, since CMake can run in parallel with Autotools, we can implement and test this without any headache.. Once the CMake files in each repo are working OK, we just remove the Autotools stuff.. and voilà.

Anyway, your suggestion is fine. It's a top-down approach to this job, while mine is bottom-up.. But remember that the nupic.core CMake file will already set the $NTA environment variable, so once we have this variable configured we can just reference it from the nupic CMake file (on Travis or a local machine). In this case, the $NTA value will be nupic.core/build/release

To tell the truth, I sincerely don't know how to do it without first ensuring that nupic.core is generating its output correctly and then referencing it (static or dyn libs) through the $NTA variable (considering that Travis could share this variable between repos, of course).
Any ideas are welcome. :-)

That is why build being included in the repo or not makes absolutely no difference in cmake world. Directory structure is not important at this stage. src include doc etc is the standard layout structure. It seems obvious that we should adopt it.

Yes, as I said in another message, users themselves could choose another location, but CMAKE_INSTALL_PREFIX needs to have an initial value. So while $NTA is still not configured, CMAKE_INSTALL_PREFIX and $NTA are set to the subfolder build/release.
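The defaulting rule described here can be sketched in Python as follows. This is a hedged illustration of the idea only, not the actual CMake logic in nupic.core: `resolve_install_prefix` and `cmake_configure_args` are hypothetical helpers, and the assumption that the top-level CMakeLists.txt sits in `src` comes from the proposed tree earlier in this thread. `CMAKE_INSTALL_PREFIX` itself is a standard CMake variable.

```python
import os
from pathlib import Path


def resolve_install_prefix(repo_root, environ=None):
    """Pick the release location: $NTA if it is set, else <repo_root>/build/release."""
    environ = os.environ if environ is None else environ
    nta = environ.get("NTA")
    if nta:
        # The user has already configured a custom release location.
        return Path(nta).expanduser()
    # Fallback described above: the build/release subfolder of the repository.
    return Path(repo_root) / "build" / "release"


def cmake_configure_args(repo_root, environ=None):
    """Compose a CMake configure command that uses the resolved prefix."""
    prefix = resolve_install_prefix(repo_root, environ)
    # Assumption: the top-level CMakeLists.txt lives in <repo_root>/src.
    source_dir = Path(repo_root) / "src"
    return ["cmake", str(source_dir), f"-DCMAKE_INSTALL_PREFIX={prefix}"]


if __name__ == "__main__":
    print(" ".join(cmake_configure_args("nupic.core", environ={})))
    print(" ".join(cmake_configure_args("nupic.core", environ={"NTA": "/opt/nta"})))
```

Passing the environment in as a parameter keeps the rule easy to test, and would make it simple to drop the `$NTA` lookup later if environment variables are phased out.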

@scottpurdy
Contributor

I am not familiar with CMake. In my experience, doing a make will compile everything into build/... and make install will copy the bin, lib, etc files from build into appropriate system locations (or an arbitrary location specified by --prefix). So my expectation was that $NTA would point to the installation location, not the build location.

We also need to remove the need for environment variables to support simple installation mechanisms like pip or other package managers so please try not to rely on $NTA or similar.

@rhyolight
Member Author

We also need to remove the need for environment variables to support simple installation mechanisms like pip or other package managers so please try not to rely on $NTA or similar.

This would be a good topic for the nupic-hackers mailing list.

@rhyolight
Member Author

To tell the truth, I sincerely don't know how to do it without first ensuring that nupic.core is generating its output correctly and then referencing it (static or dyn libs) through the $NTA variable (considering that Travis could share this variable between repos, of course).
Any ideas are welcome. :-)

Can we agree that the nupic.core build will output to a build directory, but that it might be controlled by an option to the build script? I'm fine with there being a build/scripts and build/release.

I think we're at this point now for nupic.core, right?

  • doc: doc files
  • src: source files
  • include: external includes
  • build: no files
  • build/scripts: generated build scripts from CMake.
  • build/release: compiled files (libraries and binaries)

Does anyone disagree strongly with this? If not, let's move to discussing the nupic directory structure.

@rhyolight
Member Author

@david-ragazzi You're talking about #9 now. My suggestion is to make a space for tests, but don't worry about getting them running until after the src folder builds as envisioned. Then we can merge that PR and confirm it's all building in Travis before moving on to #9.

@subutai
Member

subutai commented Feb 10, 2014

@rhyolight OK, here's the issue: numenta/nupic.core-legacy#19

@rhyolight rhyolight modified the milestones: Sprint 16, Sprint 15 Feb 14, 2014
@rhyolight rhyolight modified the milestones: Sprint 17, Sprint 16 Feb 28, 2014
@david-ragazzi
Contributor

Just a little change: I moved the test and examples folders into the src folder (which is partially inspired by the Apache convention [http://maven.apache.org/guides/introduction/introduction-to-the-standard-directory-layout.html] and is more appropriate given the nature of these files). test will contain the source of the current tests, examples will contain the source of hello worlds and others, and finally main will contain the project itself.

@subutai
Member

subutai commented Mar 6, 2014

OK, sounds good. Will test also contain all the tests currently under unittests? I assume we'll have one directory under test for each main src directory? Where will executable apps go?

@david-ragazzi
Contributor

Will test also contain all the tests currently under unittests?

I really can't give a concrete answer yet, but I'll research more on how to address this.. At first, they should continue to be spread over /src/main; only TestEverythingMain.cpp will be in /src/test/testeverything

I assume we'll have one directory under test for each main src directory?

Yes, if any project has tests, they should go in the test folder inside the src folder, at the same level as the main folder (I updated the structure above: #591 (comment)), because they are source files and because they stand apart just like the examples files (except unittests, as mentioned above).

Where will executable apps go?

They are supposed to go into build/release/bin together with the example apps, just like any binary output generated by nupic..

@subutai
Member

subutai commented Mar 6, 2014

Thanks, sounds good!

Where will executable apps go?
They are supposed to go into build/release/bin together with the example apps, just like any binary output generated by nupic..

Sorry, I meant the source code for executable applications. (Currently there's a separate apps directory structure.) One application could be a command line utility for loading and running HTM networks.

src/
   test
   main
   examples
      some_example
   apps
      app1
      app2

or

src/
   test
   main
   examples
      app1
      app2
      some_example

@david-ragazzi
Contributor

I like the second one.. In the end, apps are also examples.. :-)

@subutai
Member

subutai commented Mar 6, 2014

OK with me! :-)

@iandanforth
Contributor

Re: examples

I find it strange that examples would be a child of src. Examples are, of course, available as source code, but they are not part of the required source code of nupic. Examples are consumers of nupic, and their primary purpose is to be an entry point.

The first time I clone a repo I start looking for examples as a way to understand what is possible and how to use the code. Many times those examples are totally external to the repo, but when they are available as part of the repo, I want to interact with them BEFORE I dive into the source of the project itself. So while it sounds like it might be non-standard I will suggest that examples should be a top level directory.

@david-ragazzi
Contributor

@iandanforth I think the future of Nupic will be a global, all-in-one repo. Through it, newbies will have their first contact with the HTM approach, as you said. That said, what could be confusing you is the main folder. Initially it contains Python code (and a partial CLA in Python code), but (I think) it will be moved to a new repo called nupic.python or nupic.core.python (see #724 and step 4 of the core extraction plan ).

So if I'm correct we won't have code related to CLA in this repo.. My feeling is that Nupic will have CLA (nupic.core and its bindings), OPF, etc., and put it all in the same place, i.e. the Nupic repo. But it won't contain much code; instead, it will consume these subprojects through dynamic/static libraries. Users will be able to see their application through the examples contained in the folder of the same name..

@rhyolight rhyolight modified the milestones: Sprint 18, Sprint 17 Mar 14, 2014
@h2suzuki

Hi guys,
I'm tracking the changes mentioned by the following mail posted by Matt.
[nupic-hackers] a centralized discussion about directory structures Fri Mar 21 19:04:17 EDT 2014
I'm impressed by the progress. Congrats!

I have a small query about include directory now.
We have .h{pp} for both external and internal APIs, right? What is the supposed structure for them? If I see Linux source tree, for example, I can see /linux/include/linux, /linux/arch/x86/include, /linux/drivers/gpu/drm/nouveau/core/include, /linux/security/apparmor/include/, /linux/tools/include, ...,etc. On the other hand, a source directory usually mixes .h and .c in Linux.

Should we define a small set of rules for include file locations?
I believe nupic.core should be kept small and the rules should also be small.

@david-ragazzi
Contributor

@h2suzuki

I believe nupic.core should be kept small and the rules should also be small.

I agree with you.

The current convention adopted by Numenta (which is very easy to follow) is:


external
|-- common
    |-- include (all headers for external APIs; .h, .hpp and .cpp files are put together in the same folder)
|-- win32
    |-- bin (executables specific to the target platform, e.g. *.exe)
    |-- lib (static and/or dynamic libraries specific to the target platform, e.g. *.lib and *.dll)
|-- darwin64
    |-- bin (executables specific to the target platform)
    |-- lib (static and/or dynamic libraries specific to the target platform, e.g. *.a, *.so and *.dylib)
|-- linux32
    |-- bin (executables specific to the target platform)
    |-- lib (static and/or dynamic libraries specific to the target platform, e.g. *.a and *.so)

IMO I don't see any need to separate .h from .hpp or .c from .cpp..
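As an illustration of how a build script might pick the right folder from this external layout, here is a small Python sketch. It is an assumption-laden example, not code from nupic: the `_SYSTEM_PREFIX` mapping and both helper functions are hypothetical, and only the three platform folders listed above (win32, darwin64, linux32) are known to exist.

```python
import platform
import struct
from pathlib import Path

# Hypothetical mapping from platform.system() values to the folder prefixes above.
_SYSTEM_PREFIX = {"Windows": "win", "Darwin": "darwin", "Linux": "linux"}


def external_platform_dir(system=None, bits=None):
    """Return a folder name such as 'darwin64' for the given or current platform."""
    system = platform.system() if system is None else system
    # Pointer size in bytes times 8 gives the bitness of the running interpreter.
    bits = struct.calcsize("P") * 8 if bits is None else bits
    try:
        prefix = _SYSTEM_PREFIX[system]
    except KeyError:
        raise ValueError(f"unsupported system: {system!r}") from None
    return f"{prefix}{bits}"


def external_paths(external_root, system=None, bits=None):
    """Return the include, bin and lib folders to use for the given platform."""
    root = Path(external_root)
    platform_root = root / external_platform_dir(system, bits)
    return {
        "include": root / "common" / "include",  # headers are shared by all platforms
        "bin": platform_root / "bin",
        "lib": platform_root / "lib",
    }


if __name__ == "__main__":
    for name, path in external_paths("external", "Darwin", 64).items():
        print(name, path.as_posix())
```

The sketch may produce names such as linux64 that are not present in the listing above, so a real script would need to check that the folder exists.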

@h2suzuki

@david-ragazzi
Thank you for your reply.
Though I have a little confusion now.

We have nupic.core/include, nupic.core/external/common/include, but not nupic.core/src/include. Is this what you mean? I'm wondering what kind of include files reside in each directory.

@fergalbyrne
Member

Is external not the place where you put your dependencies? I assume the build products of nupic.core would go in build/release/*.


@fergalbyrne
Member

Sorry guys, ignore last.

Hideaki,

At the moment nupic.core depends on some common external (i.e. non-nupic) libraries such as zlib. These go in external/<platform>/lib, and their header files in external/common/include, so that nupic.core can use them.

The nupic.core build process will output libraries (for your platform) in build/release. A client using libnupic.so finds the header files in /include.
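From a client's point of view, the split described above could translate into compiler and linker flags roughly as in the following Python sketch. It is only an illustration: `client_compile_flags` is a hypothetical helper, and it assumes that the libraries end up in `build/release/lib` as in the proposed tree earlier in this thread. The `-I`, `-L` and `-l` flags are the usual GCC/Clang options for include paths, library paths and library names.

```python
from pathlib import Path


def client_compile_flags(core_root, library="nupic"):
    """Return flags a client might pass to its compiler to use nupic.core.

    Headers come from <core_root>/include, and libnupic.so is assumed to be
    in <core_root>/build/release/lib.
    """
    root = Path(core_root)
    return [
        f"-I{root / 'include'}",                   # public headers (external API)
        f"-L{root / 'build' / 'release' / 'lib'}",  # where the built library is searched for
        f"-l{library}",                            # resolves to libnupic.so / libnupic.a
    ]


if __name__ == "__main__":
    print(" ".join(client_compile_flags("nupic.core")))
```

Headers of third-party dependencies under external/common/include would only need to be added to these flags if the public nupic.core headers include them.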


@rhyolight rhyolight modified the milestones: Sprint 19, Sprint 18 Mar 28, 2014
@rhyolight rhyolight removed this from the Sprint 19 milestone Apr 11, 2014
@david-ragazzi
Contributor

This was solved by #965 and numenta/nupic.core-legacy#47
