Split framework and console/engine into separate projects #1598

Closed
CharliePoole opened this issue Jun 21, 2016 · 41 comments

Comments

@CharliePoole
Contributor

CharliePoole commented Jun 21, 2016

This is the biggest split of the bunch! We will initially create separate repos for the framework and the engine plus console. For the moment at least the engine and console will stay together.

We will have to decide on the mechanics. Alternatives...

  1. Copy both repos and delete what is not being kept, so both have a full history.
  2. Export files to create one of the repos and delete files in the other. Only one will have the full history in that case.
  3. Copy both repos as in (1) and then delete the entire history of the removed folders. This will give a full history for both repos, but only for the portion that is being retained. Each repo will appear as if it was always separate. This requires advanced techniques and the result has to be reviewed carefully to ensure nothing is lost but what we want to lose.
@CharliePoole
Contributor Author

@nunit/core-team @nunit/contributors

I've done a trial split to see what was involved and how well it worked. The results are mixed.

I was able to remove both the working directory files and the history for framework using git filter-branch. It's harder to actually make the objects go away in order to reduce the size of the repo. I suspect that one reason is the continued existence of tags in the repository. We need to decide if we want history in the new repositories and if that history should include tags before I try this again.
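A minimal sketch of such a filtering pass, written as Python driving the git command line. The function names are hypothetical, and `NUnitFramework` is used only because that directory is named further down in this thread. It assumes unwanted tags are deleted up front: tags that are not rewritten keep pointing at the original commits, which keeps the old objects reachable. The cleanup steps (dropping `refs/original/`, expiring the reflog, garbage collecting) follow the usual recipe for shrinking a repository after `git filter-branch`.

```python
import shlex
import subprocess


def filter_branch_argv(removed_dir, keep_tags=False):
    """Build the filter-branch call that drops removed_dir from every commit."""
    index_filter = "git rm -r --cached --ignore-unmatch " + shlex.quote(removed_dir)
    argv = ["git", "filter-branch", "--index-filter", index_filter, "--prune-empty"]
    if keep_tags:
        # Rewrite tags so they point at the filtered commits.
        argv += ["--tag-name-filter", "cat"]
    return argv + ["--", "--all"]


def cleanup_argvs(original_refs):
    """Commands that let git actually discard the filtered-out objects."""
    argvs = [["git", "update-ref", "-d", ref] for ref in original_refs]
    argvs.append(["git", "reflog", "expire", "--expire=now", "--all"])
    argvs.append(["git", "gc", "--prune=now", "--aggressive"])
    return argvs


def split_repo(repo, removed_dir, keep_tags=False):
    """Run one filtering pass in a scratch clone (assumption: repo is disposable)."""
    def git(*args):
        result = subprocess.run(["git", *args], cwd=repo, check=True,
                                capture_output=True, text=True)
        return result.stdout

    if not keep_tags:
        # Old tags would otherwise keep the unfiltered history alive.
        for tag in git("tag", "-l").split():
            git("tag", "-d", tag)
    subprocess.run(filter_branch_argv(removed_dir, keep_tags), cwd=repo, check=True)
    backups = git("for-each-ref", "--format=%(refname)", "refs/original/").split()
    for argv in cleanup_argvs(backups):
        subprocess.run(argv, cwd=repo, check=True)
```

Calling `split_repo("/path/to/scratch-clone", "NUnitFramework")` would rewrite every branch in that clone, so it should only ever be pointed at a throwaway copy. Remote-tracking refs in the clone can also keep old objects alive, which this sketch does not handle.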

We also need to decide how many repos we want to have after this is done:

  • Will the old repo continue to exist, possibly renamed, or should the old repo become one of the new repos - e.g. nunit.framework?
  • Should we split into two repos or three? That is, should console and engine stay together?

One thing that I realized in doing the trial split is that it's actually easier to split the extensions out first. I originally thought it didn't matter, but it turned out that I would have to do some work in the extensions for both splits unless I remove them first. So that's my next priority.

There is a lot of dependency of different test projects on mock-assembly, which is built as a part of the framework build but used by engine, console and extension tests! Should we split that out into a repository?

So, summarizing the decisions we need to make:

  1. Should our target be splitting in two or splitting in three parts? (framework, engine, console) My experience leads me to think we don't want to do this twice if we can avoid it.
  2. As to the technical part of the split, how do we want to do it:
    2.1 Keep the old repo for the history and start fresh in the new repos.
    2.2 Duplicate the history in every repo.
    2.3 Use filter-branch to "fake" the history in each repo.

NOTE: Starting fresh is what I have done for the "minor" splits like teamcity extension.

  3. If we decide to fake the history in each repo, what do we do about tags? They will survive unless I remove them, but they will no longer be pointing to the same thing as before, only to parts of each release.
  4. Do we want a separate mock-assembly repo for use in all our tests? That might be a way to discourage folks from changing it, which I have been trying to do because of its widespread use.

Please think about this and let me have your opinions. I already posed some of these questions once, at the start of this issue and nobody has commented as yet. Your contribution of ideas is just as important to this project as your code!

Since I'll focus next on splitting out the extensions, we have a bit of time to discuss these items, but not a lot!

Charlie

@ChrisMaddock
Member

I think keeping history is important if at all possible. I often look through the history (perhaps in other projects more than here) in an attempt to track down the reason for code which looks a little odd. I'd go with filter-branch - splitting the history with where the files are going feels like the tidiest solution to me.
Keeping the tags doesn't seem like an issue? They're still relevant to the assembly...there will always be a v3.0.1 of the console dll. :-)

@CharliePoole
Contributor Author

That's the approach that makes sense to me if I can pull it off. There's the rub!

As for tags, one issue I see is that you can't pull down a tag and necessarily build it after we remove stuff from the history. That may not be a problem if we all agree that's OK.

Maybe I should create the new repos but keep the old one around for a release.

@jcansdale

A serious consideration with GitHub is what happens to the issues and pull requests? They appear to be sequential and linked to the local repo. It seems they aren't automatically copied across when you clone a repo. That's what happened with nunit-tdnet-adaptor, and links to old issues now link to new issues on the cloned repo. 😞

I don't know if it's possible to properly clone a GitHub repo, along with all of its issues and pull requests. Something to investigate anyway!

@CharliePoole
Contributor Author

@jcansdale I planned on moving all relevant issues using ZenHub. That's what we have done in other cases. Are you saying that there were old nunit issues that became adapter issues? That shouldn't happen, since any nunit issue is supposed to be fixed for all users, not just those using the adapter.

In my experience as a wandering coach, I've noticed that developers are sometimes willing to spend weeks creating an automatic process to do something that can be done manually in a matter of hours. That's why I always consider the manual option. Since ZenHub allows us to do this by clicking twice, there would need to be a heck of a lot of them before I would want to worry about automating it further.

@jcansdale

> Are you saying that there were old nunit issues that became adapter issues?

No. There were issues for the adapter when it was hosted at github/jcansdale. When these are referenced in the commit comments, they point at new issues in the new repo location.

I think pull requests are implemented as a special kind of issue. We'll need to copy over all of the issues and pull-requests and make sure they have the same IDs in the copied repo. Does that make sense?

@CharliePoole
Contributor Author

Got it! Not sure how you can make issues and requests have the same id without a bunch of work, though. How about just pointing to the initial issue for tracking purposes?

@jcansdale

> How about just pointing to the initial issue for tracking purposes?

Unless there is a straightforward way to amend the commit comments, copying in the original issue description probably makes sense. It's not really a huge deal, just unfortunate and a learning experience.

It looks like the NUnit repo has 1000 odd issues and 600+ pull requests! I wonder if there is a way to completely clone a repo along with its issues and requests?

@CharliePoole
Contributor Author

Actually, I think the existence of old PRs and issues is a reason to keep the original NUnit repo - probably renamed. That would leave "only" a few hundred issues and a small handful of PRs to transfer.

@jcansdale

I guess another possibility would be to keep nunit/nunit as a central repo for general NUnit issues (I believe this is what the xUnit folks do). When console and engine are split into separate repos, any comments with local issue/request links (i.e. # issue-number) could be converted to absolute links (i.e. nunit/nunit # issue-number). I wonder if there is a filter branch tool that will do this?

@CharliePoole
Contributor Author

"keep nunit/nunit as a central repo for general NUnit issues"

What do @nunit/core-team and @nunit/contributors think about this?

Is this contrary to the notion of separate teams and release schedules? Or is it better for users who may not know where to put an issue?

@rprouse
Member

rprouse commented Jul 31, 2016

I was leaning towards a central repository for issues for all projects, but I have gone off the idea now for the following reasons:

1 - Different projects have their own milestones and release dates
2 - It will require work to tag all the issues based on the project they come from
3 - I auto-generate the CHANGES.txt from closed issues in a milestone
4 - We have so many issues in the main repo, it can sometimes be hard to find what you are looking for. That will be even harder with all the issues in one repo.

Why don't we keep the main NUnit repo as the framework repo rather than spinning a new repo off for it? We can rename it to NUnit.Framework if people prefer, but we shouldn't truncate history in this repo, so that the full history survives somewhere.

@CharliePoole
Contributor Author

I like Rob's idea, which is a hybrid of my listed options. I'll restate this to make sure we are on the same page:

  1. For framework, we keep the entire existing repo, changing its name. Tags remain. Issues remain.
  2. For the "other" we use filter-branch to limit the retained history and remove all tags.

@rprouse What's your take on having separate engine and console repos rather than just the one?

@rprouse
Member

rprouse commented Jul 31, 2016

@CharliePoole exactly what I was thinking. Basically we are just moving the engine/console back to another repo like it originally was.

As for separate engine/console repos, I think that is overkill, don't you? I am also reluctant to create a mockassembly repo since it is only one class.

@jcansdale

Do we worry about legacy issues / pull requests in the "other" repo(s)? My feeling is the links do appear quite prominently in the commit comments. It would be a shame to break all these links (or worse, have them linking to the wrong issues / requests).


@CharliePoole
Contributor Author

@jcansdale Could you spell out a hypothetical link situation such as you are concerned about? I'm not quite sure I get it. :-)

@rprouse A full breakdown with engine and console in separate repos makes sense in some ways. But I would only advocate doing it now rather than later if we were sure we wanted it eventually. It can wait.

MockAssembly is small enough to copy, so that's what I'll do.

@jcansdale

> Could you spell out a hypothetical link situation such as you are concerned about? I'm not quite sure I get it. :-)

Yes, take for example the following commit comment: Merge pull request # 1717 from nunit/suffix-change. The # 1717 is automatically hyperlinked to an issue or pull-request in the local repo.

If we split part of the repo off into a new repo, these links will now point at issues/requests in the new repo. As issues and pull-requests are added, links from old commit comments will point at these new issues.

Here is a concrete example. Take the following commit:
nunit/nunit3-tdnet-adapter@3459487

There is a link to pull-request # 1. If you click on this link, it will actually take you to issue # 1 in the copied repo. 😞

@CharliePoole
Contributor Author

Ah... I see. You said "commit comments" but I was thinking of comments on the PR or issue, which is a different matter. PR and issue comments with links are solved by Rob's suggestion that we don't move them. Commits obviously have to be moved.

I'll try to think about a way to resolve this. We don't generally put links into commit comments, but github does it on merging PRs. If there is a relatively small number of them (say, < 100) then it's reasonable to just edit them all. More than that, I'd say no.
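One way to find out which side of that threshold the repo falls on is to count the commit messages that carry a local reference. The sketch below is only an illustration: the function names are hypothetical, the pattern assumes GitHub's usual `#1717` form with no space, and the limit of 100 is simply the figure suggested above.

```python
import re
import subprocess

# A bare "#123" that is not already qualified as "owner/repo#123".
LOCAL_REF = re.compile(r"(?<![\w/])#(\d+)\b")


def messages_with_refs(messages):
    """Return the commit messages that contain at least one local reference."""
    return [m for m in messages if LOCAL_REF.search(m)]


def manual_edit_is_reasonable(messages, limit=100):
    """True when few enough messages are affected to fix them by hand."""
    return len(messages_with_refs(messages)) < limit


def read_messages(repo="."):
    """Read every commit message in the repo, using NUL as a separator."""
    out = subprocess.run(
        ["git", "log", "--all", "--format=%B%x00"],
        cwd=repo, check=True, capture_output=True,
        encoding="utf-8", errors="replace",
    ).stdout
    return [m.strip() for m in out.split("\x00") if m.strip()]
```

Running `manual_edit_is_reasonable(read_messages("/path/to/clone"))` against a local clone would give a quick answer before any history is rewritten.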

@CharliePoole
Contributor Author

Considering the issues that @jcansdale brings up, would we be better to...

  1. Keep all existing history in the nunit (nunit-framework) repository.
  2. Create the new repos using export, so there is no history to start.

If we did this, there would be no ambiguity. New history would be added, obviously, as we worked in the new repos. After a certain amount of time, there would probably be little use of the older history anyway.

This is what I have done with the small repos like NUnit.System.Linq and teamcity-event-listener. Should we do it with the console/engine repo as well?

@rprouse
Member

rprouse commented Aug 2, 2016

Maybe it is just me, but I find looking at history for deleted files a pain, so I would prefer not truncating history in the new repos beyond some filtering. For the console/engine, we could probably truncate the history back to when we first imported it into this repo.

I also think we should keep all history in the nunit repo.


@CharliePoole
Contributor Author

@rprouse I guess it's possible to truncate history, but it's not anything I've looked at doing.

For each repo, there are three possibilities I see...

  1. Keep all history and just delete the working files
  2. Delete both working files and history using filter-branch
  3. Not have any history at all - just create a new repo starting with the current state of the code.

For the NUnit repo, which will become nunit framework, we have decided on option 1 I think.

For the engine/console repo I was originally thinking of doing 2, but I just suggested 3. I think you are suggesting 1... correct?

For the new extension repos, I would do 3, just to keep it simple. That's what I already did with teamcity.

@jcansdale

I think we need a home for the old pull-requests and issues. Keeping that in nunit/nunit makes sense. If it was possible to run a regex on commit comments after doing a filter-branch, we could automatically replace all # issue with nunit/nunit # issue. That way we keep all history without breaking links. It would be a shame to forget history.
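A rough sketch of that regex idea, written as a small script that `git filter-branch --msg-filter` could call (the filter receives each commit message on standard input and writes the replacement to standard output). The function names and the default `nunit/nunit` target are assumptions for illustration, and the pattern assumes references appear in GitHub's usual `#1717` form.

```python
import re
import sys

# A bare "#123" that is not already qualified as "owner/repo#123".
LOCAL_REF = re.compile(r"(?<![\w/])#(\d+)\b")


def qualify_refs(message, repo="nunit/nunit"):
    """Rewrite local references so they keep pointing at the original repo."""
    return LOCAL_REF.sub(lambda m: f"{repo}#{m.group(1)}", message)


def main():
    # filter-branch pipes the old message in and reads the new one back.
    sys.stdout.write(qualify_refs(sys.stdin.read()))
```

Saved as, say, `qualify_refs.py` with a final call to `main()`, it could be wired in as `git filter-branch --msg-filter 'python3 /absolute/path/qualify_refs.py' -- --all`. The absolute path matters because the filter runs from a temporary working directory. References that are already qualified are left alone, so a second pass does no harm.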

@rprouse
Member

rprouse commented Aug 2, 2016

I was suggesting 1 for the nunit repo and 2 for the console/engine repo but thought we might also want to delete tags and truncate the history from before we merged the console/engine in. That probably isn't necessary if you filter-branch though.


@CharliePoole
Contributor Author

@rprouse It may take an extra filtering pass, now that I think of it. Initially, I'll filter out all commits under the NUnitFramework directory but that will leave the earlier history before the framework was moved to a subdirectory. That probably explains why my dry run didn't reduce the repo size as much as I thought it should.

@rprouse
Member

rprouse commented Aug 4, 2016

I've been thinking a bit more about the split, specifically the idea of renaming this repository after the split. This has been our main repository for a long time now and there are probably hundreds of links to it and probably to issues within it, so I think we should keep the name the same. GitHub redirects git remotes to the renamed repository, but I don't know what else they redirect or how long it lasts.

I think if we update the description on the front page of the repo along with the README, it will be clear enough what this repo is.

@CharliePoole
Contributor Author

I'm good with keeping the name.

I'm still a bit uncertain about keeping the history, even filtered, in the new repo. Do we really need it in two places?

@jcansdale

I'd be reluctant to lose history. I find blame in particular very useful. Even though history would be preserved in the old repo, I don't imagine navigating between them would be at all intuitive.

@jcansdale

It looks like there may be a way to amend the commit messages while doing a filter-branch. See:
http://stackoverflow.com/questions/5032374/accidentally-pushed-commit-change-git-commit-message/13394873#13394873

Does that make any sense to you? 😉

@CharliePoole
Contributor Author

I guess my thinking is that I very rarely need to use old history. After a few months, I imagine the only history I care about would be in the new repo. For that short period, I'd be fine with the extra effort of looking in the old repo. In fact, I've done that already on the rare occasions where I wanted to look at ancient history, back to our SourceForge days.

Do you find you often have to go back a long way in the history? I'm generally just trying to find out what broke since the last release.

@CharliePoole
Contributor Author

I do think this is something we need to decide before finalizing the split. We can't add the history if we don't keep it. We could keep it and subsequently delete it, but it's significantly easier to make the decision in advance.

@jcansdale

> Do you find you often have to go back a long way in the history?

It can be useful if you come across a weird bit of code. At least you can find out who wrote it. 😉

@CharliePoole
Contributor Author

Yup. That's exactly what sent me back into the ancient history of NUnit one or two times - who decided to do this crazy thing?!? Still, it has only been one or two times for me.

As I read the comments, it seems our views on this split are as follows:

  • You prefer a completely clean history with updated comments. (most work)
  • Rob prefers a filtered history but can live with the invalid comments (less work)
  • I prefer just exporting the files and starting history anew - clean but limited. (easiest)

It's only coincidence that I'm the guy doing it. :-) But seriously, I'll do it however we decide.

Any more @nunit/core-team opinions? Can we make a decision?

@ChrisMaddock
Member

I don't do enough work on NUnit for this to really matter, but in other projects I find the history invaluable. Normal use is to trace a line of code back to a bug tracker issue, and work out what the heck the thinking was when it was written. 😉

For that to work, it seems it requires the "most work" options - but to my mind, putting that effort in at this point is justified. 😊 Sorry!

@rprouse
Member

rprouse commented Aug 4, 2016

I think full history and updated comments would be nice, but not worth the effort if we are maintaining the history in this repo. It is a bit more work to spelunk in another repo, but we don't do it often. I also find history invaluable at times, that is why I would like to maintain it in this repo so we at least have it somewhere.

I would even be fine with option 3. Worst case, Charlie or I can usually remember who wrote something and roughly why 😄

@CharliePoole
Contributor Author

I guess I see where the different views come from. If you've been working on the project for years, then you probably don't go back very far that often. But if you are relatively newer to it, then you may need to look back to understand something.

@ChrisMaddock If you were working on the engine and you needed to do that, would you find switching to the nunit repo to do your research too much of an effort? Note that if I correct the comments on the history in the engine/console repo, they would take you there eventually anyway.

My take on the three options is that I'd rather have no history than dubious history. Option 2 is dubious, because the merge comments are incorrect. Option 1 seems dubious as well, because the history is incomplete and the links in merge comments take you to a different repo, perhaps without your realizing what is happening. Option 3 forces you to take an extra step to find the history, but once you are there it's complete and correct.

@CharliePoole
Contributor Author

@ChrisMaddock OTOH, we're willing to do more work if it will make you do more work. :-)

@jcansdale

> Option 1 seems dubious as well, because the history is incomplete and the links in merge comments take you to a different repo, perhaps without your realizing what is happening.

How do you mean about the history being incomplete? Won't it preserve all commits that touch a file on the retained branch?

If it takes you to the correct issue/PR without you realizing, isn't that a good thing? 😉

@ChrisMaddock
Member

> @ChrisMaddock OTOH, we're willing to do more work if it will make you do more work. :-)

Haha...clever 😉 Not sure I can promise that!

> If you were working on the engine and you needed to do that, would you find switching to the nunit repo to do your research too much of an effort?

No, sure. But I would need to know that I needed to do that, which future new developers may not.

> I'd rather have no history than dubious history

Agreed! Having links to the wrong issue sounds very confusing!

I'm missing the dubious part of option 1, maybe I've misunderstood? I thought option 1 was to edit all the links so that they still linked to the correct issue?

All this said, I don't really have any idea how much work this involves - I was imagining it was just a regex passed over the history - I'm not a regular git user! I can see how time may be better spent elsewhere.

@CharliePoole
Contributor Author

Why I call option 1 dubious history...

  1. Commits that involved concurrent changes to the engine and framework (lots of that in our early history) will be falsified to look as if only the engine changed.
  2. The falsified comments will link to an issue in the full history (nunit repo).
  3. Basically, so long as you stay in the engine history, you will be seeing one thing but the moment you link to the nunit repo you'll see something else.
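To gauge how many commits fall into the first category, each commit could be classified by the paths it touches. This is a sketch under stated assumptions: the function names are hypothetical, `NUnitFramework/` stands in for whatever prefix is being filtered out, and it only covers the period after the framework moved into that subdirectory.

```python
import subprocess
from collections import Counter


def classify_commit(paths, removed_prefix="NUnitFramework/"):
    """Label a commit by whether it touches removed files, kept files, or both."""
    if not paths:
        return "empty"  # e.g. merge commits, which list no files by default
    removed = [p for p in paths if p.startswith(removed_prefix)]
    if not removed:
        return "kept-only"
    if len(removed) == len(paths):
        return "removed-only"
    return "mixed"  # would look like a kept-only change after filtering


def read_commits(repo="."):
    """Yield (hash, paths) for every commit, using NUL to mark each record."""
    out = subprocess.run(
        ["git", "log", "--all", "--name-only", "--format=%x00%H"],
        cwd=repo, check=True, capture_output=True,
        encoding="utf-8", errors="replace",
    ).stdout
    for chunk in out.split("\x00")[1:]:
        lines = [line for line in chunk.splitlines() if line]
        yield lines[0], lines[1:]


def summarize(repo=".", removed_prefix="NUnitFramework/"):
    """Count commits per category for a local clone."""
    return Counter(classify_commit(paths, removed_prefix)
                   for _, paths in read_commits(repo))
```

The `mixed` count from `summarize("/path/to/clone")` is the number of commits whose filtered version would show only part of the original change.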

@ChrisMaddock As far as new guys go, if they are experienced, they know that every repo starts somewhere, with its first import. If there is earlier history, it's elsewhere. I've run into this in lots of open source projects. OTOH, if they are inexperienced, they will be confused whatever we do. :-)

To the level of work... It's certainly doable. More a question of priorities. However, this is a basis for further progress, so I want to do the right thing.

@ChrisMaddock
Member

> If there is earlier history, it's elsewhere. I've run into this in lots of open source projects.

True - but it feels like it would still be best to avoid if at all possible! But then I'm the kind of person who gets upset when he breaks up a class and loses the revision history on a method...this is a different level! 😉

Option 1 still seems the lesser of evils to me, however I don't have much understanding of how much work is involved, and equally, I won't be the person doing the work! So could understand that as a justification for going for option 3.

@CharliePoole
Contributor Author

I'm going to give it a shot anyway. I'll just keep on filtering locally until it looks right.


4 participants