Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds a little more to the SHA1 identifier explanation #51

Closed
wants to merge 1 commit into from
Closed

Adds a little more to the SHA1 identifier explanation #51

wants to merge 1 commit into from

Conversation

hlapp
Copy link
Contributor

@hlapp hlapp commented Feb 13, 2015

No description provided.

@rgaiacs
Copy link
Contributor

rgaiacs commented Feb 13, 2015

@hlapp Thanks for the pull request. We try to not add much information at the novice material because there is already too much things to students learn. But since you wrote a very brief explanation I'm 👍 to merge this.

@wking
Copy link
Contributor

wking commented Feb 13, 2015

On Fri, Feb 13, 2015 at 04:54:30AM -0800, Raniere Silva wrote:

But since you wrote a very brief explanation I'm 👍 to merge this.

I'm less sure ;). “will be globally unique” is too strong, since you
can have hash collisions (they're just not very likely). In any
case, while I love Git's hash-based implentation, I'd rather keep new
learners out of that particular rabbit-hole. Still, it's only a few
lines, so I'm fine if the consensus swings towards including this
change.

@hlapp
Copy link
Contributor Author

hlapp commented Feb 13, 2015

So, the only new thing is introduction of the SHA checksum term and the link, which are hard to use when teaching live.

What if you moved your explanation down, to the "Exploring History" section, where they see for the first time the whole beast

So, when I taught this, I did show them the whole beast right away (git log). I feel there's an important teaching objective here right away that can be well put into the context of the distributed nature of Git, and why that matters in a research context for assigning and tracking (for all intents and purposes) unique identifiers to software versions. (Each commit is a version.) This is connecting back to the motivation (as I rewrote it).

@hlapp
Copy link
Contributor Author

hlapp commented Feb 13, 2015

“will be globally unique” is too strong, since you can have hash collisions (they're just not very likely).

True. However, isn't this a detail that will be (rightfully) lost on students, and for all intents and purposes for why, where, and to whom we are teaching this, aren't these identifiers close enough to globally unique to be called that way? Linus Torvalds is quoted as saying:

I guarantee you, if you put your data in Git, you can trust the fact that five years later, after it was converted from your hard disk to DVD to whatever new technology and you copied it along, five years later you can verify that the data you get back out is the exact same data you put in.

@wking
Copy link
Contributor

wking commented Feb 13, 2015

On Fri, Feb 13, 2015 at 03:02:10PM -0800, Hilmar Lapp wrote:

“will be globally unique” is too strong, since you can have hash
collisions (they're just not very likely).

True. However, isn't this a detail that will be (rightfully) lost on
students…

Absolutely. I meant "this wording is too brief to be correct, and
mentioning the hash algorithm, or hashing at all, probably isn't
enough of a gain to be worth pointing the students in this direction.

I agree that starting with the data model is a valid approach, and I
point folks to 1 frequently because I found it so helpful. I just
don't think throwing out a one-off reference to hashing in a lesson
that otherwise ignores it is the right approach ;).

For what it's worth, the Git user manual [2,3] also talks about
exploring the history 4 before talking about generating history
5, so going in with ‘git log’ would work too (but again, that's not
the approach we've been using here).

@iglpdc
Copy link
Contributor

iglpdc commented Feb 14, 2015

So, when I taught this, I did show them the whole beast right away (git log).

Yes, but you have to do a commit first :) Right after the first commit, some learners are going to ask about the output of git commit and what's that bunch of numbers and letters there. After the commit, it feels more natural to me to do a git status, as in the lesson, to show the effects of the commit in the index.

Your changes are very valuable and I see two places to include them. We can add them after the first time the full checksum shows up, i.e. after git log. In this case, we should remove the similar text from the "Exploring History" section.

The other place to add them is the first time you need to write the commit-id at all, i.e. when git diff'ing in the "Exploring History" section.

Either is fine to me. I also think an entry in the glossary is pertinent. It could include not only the link, but also a couple of short sentences that help the learners to understand:

  • why use checksums instead of correlative numbers, as e.g. Subversion does.
  • that they don't have to write the whole thing --note that this could be really hard for someone that can't copy/paste on the command line-- when they want to refer to the commit.

@hlapp
Copy link
Contributor Author

hlapp commented Feb 14, 2015

So, when I taught this, I did show them the whole beast right away (git log).

Yes, but you have to do a commit first :)

Yes, indeed, I meant after the first commit. Sorry if that wasn't clear.

@wking
Copy link
Contributor

wking commented Feb 14, 2015

On Fri, Feb 13, 2015 at 04:17:36PM -0800, Ivan Gonzalez wrote:

So, when I taught this, I did show them the whole beast right away
(git log).

Yes, but you have to do a commit first :)...

Or you can follow the Git user manual and start with a clone 1, then
look through the log 2, and then start writing commits 3. I'm
not saying that we need to do things that way, I'm just saying that
you can take a reading-before-writing approach to Git if you want
to.

... Either is fine to me.

These both sound like reasonable places to me, but I'd consider moving
all the details to the glossary, linking from "short identifier"
(and anywhere else we discuss hash IDs), and not mentioning the word
"hash", "checksum", or anything more technical in the lesson text.
You can say "unique ID derived from the identified content" in the
lesson and explain that you can abbreviate it, but I don't think it's
worth talking about how those unique IDs are generated in the lesson
itself.

@gvwilson
Copy link
Contributor

I vote for:

  1. Putting this discussion in discussion.md (because it's useful and interesting).
  2. Putting a link to the discussion.md section into the main lesson (because it's a distraction from the main thrust of the lesson, especially for newcomers).

@iglpdc
Copy link
Contributor

iglpdc commented Mar 5, 2015

@hlapp Could you move the new discussion as a new glossary entry in discussion.md as @gvwilson said?

@hlapp
Copy link
Contributor Author

hlapp commented Mar 9, 2015

I can do that but likely not before April. Sorry, been booked solid.

@iglpdc
Copy link
Contributor

iglpdc commented Mar 9, 2015

No problem, we can wait. I'll merge other PRs then and you may need to rebase. Thanks a lot!

El 09/03/2015, a las 11:06, Hilmar Lapp notifications@github.com escribió:

I can do that but likely not before April. Sorry, been booked solid.


Reply to this email directly or view it on GitHub.

@daisieh
Copy link
Contributor

daisieh commented Mar 17, 2015

@hlapp : I merged the big reorg PR #58, so please refactor the change against that new file.

@hlapp
Copy link
Contributor Author

hlapp commented Jul 6, 2015

Redone as and superseded by #164. Closing.

@hlapp hlapp closed this Jul 6, 2015
@hlapp hlapp deleted the id-clarify branch July 6, 2015 03:56
zkamvar pushed a commit that referenced this pull request May 8, 2023
This is essentially a redo of 989aff7, in response to feedback in #51.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants