Problem importing CVS module #10
Hi York, Is that correct? Is the CVS repository publicly accessible? Unfortunately, I never wrote code to support the "modules" file in crap-clone. If all the subdirectories are in one place, then you might get a good enough clone using the real server-side directory:
Or you could remove the "CVSROOT/modules" file and clone the entire CVS repo:
(If you end up with the CVS repo cloned but the directory layout wrong, then "git filter-branch" can be used to modify the git repo.) Let me know if this is any use. There are a couple of other cvs-to-git converters around; you could see if they do better. Cheers, Ralph. |
Hi Ralph Loader, Thank you very much for your help.
In the file CVSROOT/modules, there is a line referring to the module AB+:
Unfortunately, the repository is highly confidential, and nobody is allowed to disclose anything from it publicly. I guess I don't have access to the entire repository either.
I've tried to clone "project", but I got the following error:
I don't think I have access to remove the "CVSROOT/modules" file on the server. But I did try to clone the entire repo and got the error:
So far, crap-clone is the only tool that has worked (partly) for me. I didn't bother with cvs2git because, as far as I know, it requires a local account for the CVS repo, which I don't have. I've also tried git cvsimport:
But I got the error:
However, I was able to clone project/1-1 using git-cvsimport:
But it was extremely slow. I also tried cvsclone, which gave me the "1.32 Segmentation fault" error. I would appreciate it if you could give me some help or suggestions. Thanks a lot, York |
FYI, I just tried using another tool called cvsclone to clone the entire CVS repo under "project" to my local drive. It seemed to work well until the point when it tried to enter the directory "project/Network". Here's the error it reported:
|
It looks like you don't have permissions for that part of the repository on the server side. You have a couple of options to work around this, neither especially pleasant. First, you could clone each directory in project/ separately, and then use git to merge them all together. It should be possible using git-filter-branch and git-stitch-repo. You would have to work out the details yourself though. Alternatively, you could modify crap-clone (or cvsclone) to change the 'cvs rlog' command it sends to the server; in my crap-clone.c this looks like:
Instead of sending one 'Argument' line for the top level directory (stream.module), send one for each sub-directory that you are interested in. I haven't tried this, you will need to experiment a bit to get it right... |
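For readers following along, the idea of sending one 'Argument' line per sub-directory can be sketched roughly as below. This is not the code from crap-clone.c: the function name `send_rlog_for_dirs` and its parameters are hypothetical, and a real client sends other requests (root, options, and so on) around this fragment. It only assumes the CVS client/server convention of `Argument` lines followed by the `rlog` request line.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper, not taken from crap-clone.c: write a CVS
 * client/server 'rlog' request that names several sub-directories of a
 * module instead of the module's single top-level directory.
 * Returns 0 on success and -1 on a write error. */
int send_rlog_for_dirs(FILE *out, const char *module,
                       const char *const *dirs, size_t ndirs)
{
    assert(out != NULL && module != NULL);

    for (size_t i = 0; i < ndirs; ++i) {
        /* One 'Argument' line per sub-directory,
         * e.g. "Argument project/1-1". */
        if (fprintf(out, "Argument %s/%s\n", module, dirs[i]) < 0)
            return -1;
    }

    /* The request line itself comes after all of its arguments. */
    if (fputs("rlog\n", out) == EOF)
        return -1;
    return 0;
}
```

Calling it with the module "project" and the directories 1-1, 1-2 and 1-3 would leave Network out of the request entirely, which is the effect described above.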
Actually, I wrote a quick hack to let you do this. Try the branch directory-limit from my repo. This adds a command-line option to list the directories you want to clone. So you should be able to do:
and this will include the directories 1-1, 1-2, 1-3 but ignore Network. |
This is super amazing, Ralph. Thank you so much! I'm currently cloning all the projects. It takes a while. I'll let you know the results tomorrow morning after I arrive at my office. Thanks again! York |
Hi Ralph, I seem to have successfully cloned all the projects within the module AB+ into a single git repository. However, one thing I've noticed is that each git commit contains only one file. In other words, a single CVS commit has been split into several git commits, one commit per file. I guess this is because CVS doesn't have the changeset concept, right? If this is the case, I guess it would not be straightforward to re-assemble the changesets to create one git commit, right? Thanks, York |
Hi York,
If you want to debug this yourself, the relevant code is in changeset.c: the create_changesets() function sorts the revisions and then aggregates them into changesets. It uses the function strings_match(), which compares author / commitid / branch-name / log-message (and an internal flag). (It is intentional that the strings are compared by pointer: I keep only one copy of each unique string content.) |
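To illustrate why comparing by pointer is safe when only one copy of each unique string is kept, here is a minimal sketch of string interning. It is not the code from changeset.c: the `intern` table, the `revision` struct and its field names, and `same_changeset` are all assumptions made for the example, and the internal flag mentioned above is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative only: a tiny linear-scan intern table. */
#define MAX_INTERNED 256
static const char *interned[MAX_INTERNED];
static size_t n_interned;

/* Return the single shared copy of 's', storing it on first sight. */
const char *intern(const char *s)
{
    assert(s != NULL);
    for (size_t i = 0; i < n_interned; ++i)
        if (strcmp(interned[i], s) == 0)
            return interned[i];

    assert(n_interned < MAX_INTERNED);
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);
    assert(copy != NULL);
    memcpy(copy, s, len);
    interned[n_interned++] = copy;
    return copy;
}

/* Hypothetical revision record; the field names are assumptions. */
typedef struct {
    const char *author;
    const char *commitid;
    const char *branch;
    const char *log;
} revision;

/* Every field is interned, so equal pointers mean equal contents and
 * different pointers mean different contents. */
bool same_changeset(const revision *a, const revision *b)
{
    return a->author == b->author
        && a->commitid == b->commitid
        && a->branch == b->branch
        && a->log == b->log;
}
```

With this arrangement, two revisions built from separately read copies of the same author and log message still compare equal, because `intern` hands back the same pointer both times.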
Hi Ralph, I think you've done a really good job grouping things into changesets; it looks like you did it correctly! I checked a few cases carefully and noticed that even though those CVS commits have the same commit messages, their commit timestamps are quite different: they really were committed separately, and they have different CVS commit IDs. In the cases where several files really were committed in one go, they have exactly the same timestamps and CVS commit IDs, and you have put them into one single git commit! I was under the wrong impression because I thought the "Checkin Notice Emails" I receive whenever somebody commits something were automatically generated. It turns out they were not; those check-in notices were manually composed by the developers. I apologize for reporting a non-bug. I'll ask for your help when I have a new problem. Thank you very much! York |
Hi Ralph, In case one truly does commit multiple times with the same commit message, I think it's a good idea to combine the consecutive commits into one single git commit. Therefore, I tried your "no-commitid" branch which seems to work. Great job! On the other hand, I wanted to make sure that:
Thanks again, York |
Hi York,
The only case I've seen in real life where 3. is necessary is where my first attempt at building the commits has two versions of the same file. Presumably what happened was that someone committed, fixed a problem immediately, and then committed the fix with the same log message. |
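The situation described here, a candidate commit that ends up holding two versions of the same file, could be detected with a check along the following lines. This is only a sketch under assumptions: the `file_version` struct and `has_duplicate_file` are hypothetical names, and crap-clone's own data structures and handling may look quite different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified view of one file revision in a candidate commit. */
typedef struct {
    const char *path;     /* file path within the repository */
    const char *version;  /* CVS revision number, e.g. "1.4" */
} file_version;

/* Return true if any path appears more than once among the n entries,
 * i.e. the candidate commit holds two versions of the same file. */
bool has_duplicate_file(const file_version *v, size_t n)
{
    assert(v != NULL || n == 0);
    for (size_t i = 0; i < n; ++i)
        for (size_t j = i + 1; j < n; ++j)
            if (strcmp(v[i].path, v[j].path) == 0)
                return true;
    return false;
}
```

A candidate commit for which this returns true cannot be a single git commit as it stands, since a commit records only one state per file.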
Hi Ralph, Thank you very much for your explanation. I just took a quick look into the file changeset.c, and your code looks neat and nice. Amazing job!
Definitely, in my opinion! Maybe add a command line switch for this? Also, don't forget the extremely useful "-d DIRECTORY" option! I guess after you merges the Thanks again, York |
Importing a CVS module into a single Git repository can now be achieved by passing all the directories defined by the CVS module on the command line, using the new "-d DIRECTORY" option. |
Hi Ralph Loader,
First of all, thank you so much for this amazing tool. It's really promising, except for one thing that is stopping me from proceeding.
I'm trying to import a CVS module at work into git repositories. Let's say the module's name is AB+, which contains about 20 subdirectories inside a directory called "project", say, project/1-1, project/1-2, ... project/1-20. I was able to import each directory 1-1 to 1-20 individually into 20 git repositories by using:
where N is an integer in the range [1, 20].
However, I was unable to import the module AB+ into a single git repository by:
and I always get the error:
I've tried to work around this by using 20 git repositories. Unfortunately, that doesn't work well because people always commit changes to different directories (1-1 to 1-20) in one CVS commit. The 20 subdirectories cannot be treated as 20 independent repositories; they are closely related and really form one single project.
I would really appreciate it if you would help me with this.
Thanks in advance!
York