This repository was archived by the owner on Jan 3, 2018. It is now read-only.

ahmadia commented Oct 14, 2013

Please do not merge this Pull Request yet

This injects about 6 MB of boot-camps history into the bc repository, specifically all history for files in the tree at swcarpentry/DEPRECATED-boot-camps@5b42aeb.

I got here by following these steps:

Cleaning up boot-camps

git ls-files > keep-these.txt
git filter-branch --force --index-filter "git rm --ignore-unmatch --cached -qr . ; cat $PWD/keep-these.txt | gxargs git reset -q \$GIT_COMMIT --" --prune-empty --tag-name-filter cat -- --all
rm -rf .git/refs/original/
git reflog expire --expire=now --all
git gc --prune=now
git gc --aggressive --prune=now
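
To see what the cleanup buys, the packed size can be inspected once `git gc` has run. This is a minimal sketch on a throwaway repository; the `demo` directory and `lesson.md` are hypothetical stand-ins, not files from boot-camps:

```shell
set -e
demo=$(mktemp -d)                      # hypothetical scratch repository
cd "$demo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"
echo "hello" > lesson.md
git add lesson.md
git commit -q -m "Add lesson"

# Pack everything and drop unreachable objects, as in the cleanup steps.
git gc -q --prune=now

# count-objects -v reports loose objects ("count"), packed objects
# ("in-pack"), and the packed size in KiB ("size-pack").
stats=$(git count-objects -v)
loose=$(echo "$stats" | awk '/^count:/ {print $2}')
in_pack=$(echo "$stats" | awk '/^in-pack:/ {print $2}')
echo "$stats"
```

Running `git count-objects -v` before and after the filter-branch step in a real clone would give a rough measure of how much history weight was removed.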

Merging in history with state reset to current state of bc

git branch bc-first-commit 328d742231cea360e11469b8a024e20d0686e723
git read-tree --reset bc-first-commit
git commit -m "Sync with bc original state"
git merge bc-first-commit
git clean -xdf
git checkout README.md
git merge --no-ff swcarpentry/gh-pages

This PR needs discussion and review before we merge it in.

We should check that it correctly:

[ ] Restores commit history/provenance of files (commands like git blame and git log --follow -p file work)
[ ] Achieves the right balance of size vs. preserved history
[ ] Doesn't miss anything important

Note that this does not automatically resolve #41 because the original reproducible_workflow file was stored in an archival branch and never merged to master.

I'll need to manually go through and merge those files in. I'm doing everything possible to avoid rebasing, since bc is now effectively in the wild. Any help in identifying files without proper provenance in bc and their history in boot-camps, testing provenance of files that exist, and verifying that tools like git blame and git log --follow -p work is appreciated.
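
One possible way to spot-check provenance is to count, for each tracked file, how many commits `git log --follow` can reach, and how many distinct commits `git blame` credits. The sketch below is an illustration under stated assumptions: it builds a throwaway repository, and the names `notes.txt` and `lesson.txt` are hypothetical.

```shell
set -e
demo=$(mktemp -d)                      # hypothetical scratch repository
cd "$demo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

printf 'alpha\n' > notes.txt
git add notes.txt
git commit -q -m "Add notes"
printf 'alpha\nbeta\n' > notes.txt
git commit -q -am "Extend notes"
git mv notes.txt lesson.txt
git commit -q -m "Rename notes to lesson"

# For every tracked file, count the commits visible when following renames.
# A count of 1 on an old file would suggest its earlier history was lost.
report=$(git ls-files | while IFS= read -r path; do
    n=$(git log --oneline --follow -- "$path" | wc -l | tr -d ' ')
    echo "$n $path"
done)
echo "$report"

# Count the distinct commits that blame credits for the current lines.
blame_commits=$(git blame -s lesson.txt | cut -d' ' -f1 | sort -u | wc -l | tr -d ' ')
echo "$blame_commits"
```

In a real clone, the loop could be run on both repositories and the per-file counts compared to find files whose history stops short.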

wking and others added 30 commits December 29, 2012 14:48
.mailmap: Standardize identifiers for Lynne and Matt
This will keep things from getting too cluttered as the workshop
information fills in.
Includes license, workflow, and file formats sections.
…ps into 2013-01-chicago

Conflicts:
	python/intro/animals.txt
	python/intro/big_animals.txt
This makes the README easier to read in a terminal or editor, while
leaving the Markdown output unchanged.
There will be a lot of other stuff in the boot camp repository.  Stay
organized by keeping "how to set up and test your machine" stuff in a
subdirectory.
README.md: Line wrapping and reference-style links

wking commented Oct 14, 2013

I think restoring the development history lost during the boot-camps →
bc transition is a good thing. If you're feeling especially brave,
there was also some history lost during the earlier
thehackerwithin/Python2010/wiki → workshops → boot-camps transitions.
For example the root of the current thw-shell content is in
shellExample (clone
git://github.com/thehackerwithin/Python2010.wiki.git). While I agree
that rebasing bc would be awkward, the whole bc repository is
effectively a rebased version of the boot-camps repository, and we
survived the boot-camps → bc transition, so it can't be that awkward
;). I don't understand why this project keeps destroying its history
and getting into situations like this in the first place :p. By
merging or cherry-picking instead of creating copy-paste commits, we
can keep ourselves out of similar situations in the future.

I'll try to audit any of my commits in this branch against their
originals sometime this week.

gvwilson commented

This feels like bikeshedding to me [1]: importing the history won't make any difference at all to our learners, and every hour that's put into this could instead be put into creating exercises, updating lessons, or (right now) helping instructor trainees get up to speed with Git and GitHub, all of which would actually help the scientists we're supposedly here to help.
That said, it's clearly important to some people to have the whole history in one repo, and I apologize for not being more sensitive to this earlier. If you think this is worth your time, I'll respect that; the only thing I insist on is that no patches that rewrite history be merged until after the current crop of instructor trainees have finished their work: it's unfair to ask novice GitHub users to re-roll their pull requests because the ground has shifted under their feet. I would also ask that the volume of material imported be kept to an absolute minimum, since people already find the repo overweight.
[1] http://en.wiktionary.org/wiki/bikeshedding

ahmadia commented Oct 15, 2013

@wking - There were a number of problems with boot-camps that needed to be resolved, including the fact that the repository had gotten unreasonably heavy. That said, keeping the provenance of content is really important to me, so I hope this is a reasonable compromise.

@gvwilson - Ditto. I understand that you're more concerned with enabling instructors to contribute material than you are with the provenance of said material, but I disagree with your characterization of this as bikeshedding. I think we should practice what we preach with regards to version control, and understanding the who, how and why files have changed in the past is crucial for understanding how to improve them as well as giving credit where it is due.

I don't want to drag this into an extended discussion of the transition period, and I want to make sure you're both happy here. It seems that you are both on-board with this PR because it restores a reasonable amount of history (+1 from @wking) without a rebase (+1 from @gvwilson). Do I have that right?

wking commented Oct 15, 2013

On Tue, Oct 15, 2013 at 08:28:19AM -0700, Aron Ahmadia wrote:

> @wking - There were a number of problems with boot-camps that
> needed to be resolved, including the fact that the repository had
> gotten unreasonably heavy. That said, keeping the provenance of
> content is really important to me, so I hope this is a reasonable
> compromise.

I'm ok with it, I just wish we'd used filter-branch to strip whatever
we didn't like out of boot-camps. That would effectively create a new
bc-style repository without the need for your bc tweaks and a merge
;).

> I think we should practice what we preach with regards to version
> control, and understanding the who, how and why files have changed
> in the past is crucial for understanding how to improve them as well
> as giving credit where it is due.

+1

> It seems that you are both on-board with this PR because it restores
> a reasonable amount of history (+1 from @wking)

Yeah, although I still want to look over my commits before this gets
merged.
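
For illustration, stripping an unwanted path out of every commit with filter-branch could look roughly like the sketch below. It runs on a throwaway repository, and the `big/` directory and `lesson.md` are hypothetical; which paths would actually be removed from boot-camps is not specified here.

```shell
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1  # skip the deprecation notice in newer Git
demo=$(mktemp -d)                       # hypothetical scratch repository
cd "$demo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo "intro" > lesson.md
git add lesson.md
git commit -q -m "Add lesson"
mkdir big
echo "bulky data" > big/data.bin
git add big
git commit -q -m "Add bulky data"
echo "more" >> lesson.md
git commit -q -am "Extend lesson"

# Remove big/ from the index of every commit; --prune-empty drops commits
# that end up with no changes once big/ is gone.
git filter-branch --force \
    --index-filter 'git rm -r -q --cached --ignore-unmatch big' \
    --prune-empty -- --all >/dev/null 2>&1

remaining=$(git rev-list --count HEAD)
big_history=$(git log --oneline HEAD -- big)
echo "$remaining"
```

The rewritten branch keeps the commits that touched `lesson.md` and no longer has any commit touching `big/`. The original commits stay reachable under `refs/original/` until that namespace is deleted.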

wking commented Oct 20, 2013

On Tue, Oct 15, 2013 at 09:45:49AM -0700, W. Trevor King wrote:

> On Tue, Oct 15, 2013 at 08:28:19AM -0700, Aron Ahmadia wrote:
>
> > It seems that you are both on-board with this PR because it restores
> > a reasonable amount of history (+1 from @wking)
>
> Yeah, although I still want to look over my commits before this gets
> merged.

Alright, it looks like this procedure is dropping history on renames.
For example:

$ git log --oneline --follow boot-camps/master -- setup/swc-installation-test-1.py | cat
f4340fb swc-installation-test-1.py: Link to s-c.o's terminal.html
e99502e setup: Move installation test scripts under setup/
cfd8549 swc-installation-test: Return 1 on failure
5bc65af swc-installation-test-1.py: Give instructions for 'python command not found'
a5200a8 swc-installation-test-1.py: Give suggested install hints
4e2cfb3 swc-installation-test: Consolidate and reorganize test scripts
$ git log --oneline --follow bc/pr/79 -- setup/swc-installation-test-1.py | cat
609ff5a Sync with bc original state
10360b0 swc-installation-test-1.py: Link to s-c.o's terminal.html
467ba36 setup: Move installation test scripts under setup/

I'll play around a bit more and see if I can find a way to restore the
old history beyond renames, but it may be hard to do automatically.

ahmadia commented Oct 21, 2013

@wking - It would be appreciated. I'm guessing we would need to augment keep-these.txt with prior filenames. Is there a way to automatically extract that from the boot-camps history?
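
One possible approach, sketched here on a throwaway repository rather than on boot-camps itself, is to ask `git log --follow --name-only` for every path a file has carried. The file name `test-script.py` and the `setup/` move are hypothetical examples:

```shell
set -e
demo=$(mktemp -d)                      # hypothetical scratch repository
cd "$demo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

printf 'line 1\nline 2\nline 3\n' > test-script.py
git add test-script.py
git commit -q -m "Add test script"
mkdir setup
git mv test-script.py setup/test-script.py
git commit -q -m "Move test script under setup/"

# --follow tracks the file across renames, --name-only prints the path each
# commit touched, and the empty format suppresses the commit header lines.
old_names=$(git log --follow --name-only --pretty=format: -- setup/test-script.py \
    | sed '/^$/d' | sort -u)
echo "$old_names"
```

Looping this over each line of keep-these.txt and appending the results would produce an augmented keep list. Note that `--follow` relies on rename detection by content similarity, so files that changed heavily during a move may still need to be traced by hand.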

ahmadia commented Oct 22, 2013

This PR is on standby while we sort out #89.

ahmadia commented Nov 20, 2013

@wking - I've gotten lost on where we are with the history restoration. I'm going to close this PR for now while you work on the individual pieces.

Successfully merging this pull request may close these issues.

Reproducible workflow tutorial misplaced?