fix content length calculation in git objects blob header #1439

okhwaja · 2020-06-15T00:41:17Z

Hi, first time contributing so please bear with me if I missed something.

While testing some code based on the contents of the internals section, I believe I found a small mistake. The blob header should contain the size of the content in bytes, but content.length in the example returns the number of characters. Using bytesize returns the right value: https://ruby-doc.org/core-2.4.0/String.html#method-i-bytesize

The example works with its own content (what is up, doc?) because every character there is a single byte, but the issue shows up as soon as the content contains multibyte characters:

irb(main):002:0> content = "i have €5 in my pocket"
=> "i have €5 in my pocket"
irb(main):003:0> content.length
=> 22
irb(main):004:0> content.bytesize
=> 24
irb(main):005:0> sha_w_length = Digest::SHA1.hexdigest("blob #{content.length}\0" + content)
=> "9859c2c849cc5591aa2223a6fb697aeaa9a5f7fe"
irb(main):006:0> sha_w_bytesize = Digest::SHA1.hexdigest("blob #{content.bytesize}\0" + content)
=> "3d446f4f877e1bea82e603328a845ad7b036338e"

As expected, the two calls produce different SHAs. The correct one is the one computed with bytesize, which matches what git itself reports:

$ echo -n 'i have €5 in my pocket' | git hash-object --stdin
3d446f4f877e1bea82e603328a845ad7b036338e

This PR just tweaks the example to use bytesize instead of length.
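For anyone who wants to reproduce the comparison outside irb, here is a minimal sketch of the corrected calculation. The helper name git_blob_sha1 is made up for illustration and is not part of the book's example; the euro sign is written as the escape \u20AC so the string is UTF-8 regardless of how the script is run.

```ruby
require 'digest/sha1'

# Hypothetical helper (not from the book): hashes content the way the
# corrected example does, with a "blob <size>\0" header in front of it.
# The size is taken with bytesize, not length.
def git_blob_sha1(content)
  header = "blob #{content.bytesize}\0"
  Digest::SHA1.hexdigest(header + content)
end

content = "i have \u20AC5 in my pocket"

puts content.length          # number of characters
puts content.bytesize        # number of bytes (the euro sign takes 3 in UTF-8)
puts git_blob_sha1(content)  # compare against: git hash-object --stdin
```

For ASCII-only content, length and bytesize agree, which is why the original example never showed the problem.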

ben · 2020-06-16T14:42:19Z

Brilliant! I'm not terribly surprised that a Unicode issue snuck in; that's a blind spot we English-speaking Americans tend to have. Thanks!

Co-Authored-By: Osman Khwaja <osman.khwaja@gmail.com>

use bytesize

d23bf5e

ben merged commit 69addf3 into progit:master Jun 16, 2020

max123kl added a commit to max123kl/progit2-de_main that referenced this pull request Jul 3, 2020

include engl Commit progit/progit2#1439

6986688

Co-Authored-By: Osman Khwaja <osman.khwaja@gmail.com>
