Split opencog into multiple github repos! #3391
Comments
I think it's a good idea. The main con is that it's a lot of work, but I suppose it could be done iteratively, taking opencog apart piece by piece. The task could be distributed across the maintainers of each part: for instance, I would be in charge of moving the pattern miner and pln to their own repositories, @misgeatgit would be in charge of moving attention allocation, etc. Worth considering at the very least...
I am not very involved in opencog repo support, and it is not clear to me which particular things will be simplified. Do you mean that (1) it will allow updating core component A while component B keeps using an old version of A? Or that (2) components will be better decoupled when they are in different repositories? Maybe we can find a way of keeping things in one repo and solving both issues? For (1), I would propose looking at the dependency tree first to see if there are such cases. For (2), I would start by having one folder for each component in the root of the repository.
I would think the pros are in (2); in fact, I see (1) as a potential con. The reason I lean towards that proposal is that these components are still going to grow, as is the number of people involved, most likely, so more compartmentalization feels right. But maybe the granularity of the split can be much coarser than what was originally suggested, TBD. Certainly we have time to weigh the pros and cons.
I'm not sure we can reap the benefit of splitting if things stay in the same repo, though I suppose flattening on the root directory (or rather
I just added: "Different repos can have different owners, different merge policies" -- so, for example, the space-time server, which is pre-alpha, should allow any kind of merges and breakages; a spacetime server commit should not be held up because unit tests fail. Different repos can have different owners, controllers, review policies, etc. Everyone who works in repo X should be an expert in repo X. Right now, the pattern miner experts know nothing about ghost, and vice versa. @vsgbod: right now, the different components are already almost in different root directories. There's some overlap, but not much.
Just to be clear: "Different repos can have different owners, different merge policies" is really a people-management issue, not a code-management issue. It allows different groups of people to set different policies and different mechanisms of control, to suit their collaborative style, to suit the maturity of that component.
Ok, if it is more about formal code ownership and different release processes, then it makes sense. Regarding code ownership, I don't see a big difference from using separate folders. GitHub usually suggests reviewers among people who have contributed to the code before, so having different repos is just a bit better. I have experience working in a setup where people just owned folders. It was comfortable, and using the same repository was simpler because you don't need to pull and rebuild many repositories. But it required that the whole team knew who was responsible for each component. The release process is a more complex thing: if we need different release and merge policies, then we need to use different repos.
Some links on how to move files from one repository to another while preserving history:
In a few words:
Thanks. See, however, the discussion in #3593 about the loss of git history.
I just split out language-learning to here: https://github.com/opencog/learn and plan to replace the contents of the
@linas, I like the idea of splitting, as you know, but I'm not sure about the name. In opencog there is a
I've convinced myself that the generic algorithm implemented there works just fine for biology, vision, facial expressions, robot movements, whatever; it's not just an NLP thing. I've kind of been saying that for five years now, and if no one wants to believe me ... I dunno. I get to say it again? At this time, I do not want to mash other code bases into that repo, mostly because the other code bases are distinct and stand alone.
Regarding URE, PLN & pattern mining: perhaps it makes sense to re-group these. Perhaps they are sufficiently tightly coupled that they should move together as a group, whereas URE does not have a lot in common with the atomspace. This is partly a technical decision, and partly a marketing decision: it is kind of nice to say, in a marketing blurb: "New! Improved! The atomspace comes with an inference engine!!" But one could split it out, and then the sales brochure says: "New! Improved! The inference engine now comes with a pattern miner!" I don't have a strong opinion one way or the other. The URE did not come out the way I had hoped it would. I was assuming that the URE would implement what I call "sheaves", but instead, it implements something else. I had also hoped/planned to use the pattern miner instead of having the multi-step pipeline of MI-pairs->MST->disjuncts->factorization, but the pattern miner does not give me graph factors. If it did, we could just unleash the pattern miner on the raw language data, and get the syntax graph popping out. But the pattern miner didn't come out that way either. Whatever. I still think the MI-pairs->MST->disjuncts->factorization pipeline is a valid alternative to pattern mining; we can continue to improve both, and see which one does better at extracting structure for which kinds of problems. So anyway, we now have multiple subsystems that are similar, share some similar goals, have some similar ideas, take some similar inputs, but really ended up quite different. Because they are more different than they are similar, it makes sense to keep them separated.
After moving parts of the repo into separate repos, the following functionality is not available in
To fix it, additional changes should be made in
The same goes for unit tests.
Closing. This task is more or less completed. All the parts that seem to be independently useful outside of the chatbot framework have been moved to their own repos. All that remains here is the chatbot code. (It might still be interesting to split out openpsi on its own, or at least, the scheduling/prioritization part of openpsi.)
This is an issue open for debate: there are various pros and cons. The proposal:
Split this repo into multiple parts. These would be:
What's left to be done:
The above are fairly tightly knit together in the current code base, so it might not be possible to split them up cleanly into distinct components. So maybe we can leave these as they are!?
Pros:
Cons:
If you have new pros & cons to add to this list, please edit this first top-post. Other comments, including proposals for splitting along different directions, should go into comments below. I expect this to be somewhat controversial. It seems like a good idea, at least right now.