stricter pointer spec #246

Merged
merged 16 commits into from Apr 24, 2015
Changes from 3 commits
40 changes: 30 additions & 10 deletions docs/spec.md
@@ -9,23 +9,43 @@ for other tools.
The core Git LFS idea is that instead of writing large blobs to a Git repository,
only a pointer file is written.

* Pointer files are text files which MUST contain only UTF-8 characters.
* Each line MUST be of the format `{key} {value}\n` (trailing unix newline).
* Only a single space character between `{key}` and `{value}`.
* Keys MUST only use the characters `[a-z] [0-9] . -`.
* The first key is _always_ `version`.
* Lines of key/value pairs MUST be sorted alphabetically in ascending order
(with the exception of `version`, which is always first).
* Values MUST NOT contain return or newline characters.
* Pointer files SHOULD NOT have the executable bit set when checked-in in Git.

s/checked-in in/checked into/


The required keys are:

* `version` is a URL that identifies the pointer file spec. Parsers MUST use
simple string comparison on the version, without any URL parsing or
normalization. It is case sensitive, and %-encoding is discouraged.
* `oid` tracks the unique object id for the file, prefixed by its hashing

Do you want to be explicit about the prefix being followed by a ":" or is that clear from the example?

Contributor

The spec should be 100% self-contained: examples are not the spec.

I agree the colon needs to be mentioned here.
On Apr 20, 2015 12:39 PM, "David Ginsburg" notifications@github.com wrote:

Do you want to be explicit about the prefix being followed by a ":" or is
that clear from the example?



Contributor Author

I added a string format example here: {hash-method}:{hash}

method. Currently, only `sha256` is supported.
* `size` is in bytes.
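
As a rough illustration of the required keys, the `oid` and `size` values for a file's contents could be derived along these lines. This is a sketch in Python, not the Git LFS implementation: the helper name `oid_and_size` is made up for the example, and it assumes the `{hash-method}:{hash}` format with `sha256` as the only method and a lowercase hex digest.

```python
import hashlib
from typing import Tuple


def oid_and_size(data: bytes) -> Tuple[str, int]:
    """Return (oid, size) for raw file contents.

    Hypothetical helper: the oid is the hashing method, a colon,
    and the hex SHA-256 digest; the size is the length in bytes.
    """
    digest = hashlib.sha256(data).hexdigest()
    return "sha256:" + digest, len(data)
```

The returned values would then be written as the `oid` and `size` lines of the pointer file.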

Example of a v1 text pointer:

```
version https://git-lfs.github.com/spec/v1
oid sha256:4d7a214614ab2935c943f9e0ff69d22eadbb8f32b1258daaa5e2ca24d17e2393
size 12345
(ending \n)
```
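
To make the line rules concrete, here is a minimal parsing sketch in Python that checks a pointer like the one above against the rules proposed in this diff: `{key} {value}\n` lines, a single separating space, the restricted key alphabet, `version` first, and the remaining keys in ascending order. The function name `parse_pointer` and its error messages are assumptions for illustration; the diff does not say how empty values or duplicate keys should be handled, so this sketch does not reject them.

```python
import re
from typing import List, Tuple

# Keys may only use the characters [a-z] [0-9] . -
KEY_RE = re.compile(r"[a-z0-9.-]+")


def parse_pointer(text: str) -> List[Tuple[str, str]]:
    """Parse pointer text into ordered (key, value) pairs (hypothetical helper)."""
    if not text.endswith("\n"):
        raise ValueError("pointer must end with a unix newline")
    pairs = []
    for line in text[:-1].split("\n"):
        key, sep, value = line.partition(" ")
        if not sep or not KEY_RE.fullmatch(key):
            raise ValueError("malformed line: %r" % line)
        if value.startswith(" "):
            raise ValueError("more than one space after key: %r" % line)
        if "\r" in value:
            raise ValueError("value contains a return character: %r" % line)
        pairs.append((key, value))
    if pairs[0][0] != "version":
        raise ValueError("first key must be 'version'")
    rest = [key for key, _ in pairs[1:]]
    if rest != sorted(rest):
        raise ValueError("keys after 'version' must be sorted")
    return pairs
```

Returning a list of pairs rather than a dictionary keeps the original key order visible to the caller, which is convenient when comparing against a regenerated pointer.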

The pointer file should be small (less than 200 bytes), and consist of only
ASCII characters. Libraries that generate this should write the file
identically, so that different implementations write consistent pointers that
translate to the same Git blob OID. This means:
For testing compliance of any tool generating its own pointer files, the
reference is this official Git LFS tool:

NOTE: exact pointer command behavior TBD!

* Use properties "version", "oid", and "size" in that order.
* Separate the property from its value with a single space.
* Oid has a "sha256:" prefix. No other hashing methods are currently supported
for Git LFS oids.
* Size is in bytes.
* Run `git lfs pointer` to generate a pointer file for the given local file.
* Run `git lfs pointer` to compare the blob OID of the generated pointer files.
* Tools that parse and regenerate pointer files MUST preserve keys that they
don't know or care about.
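
Since the exact `git lfs pointer` behavior is still TBD, the following Python sketch only illustrates the underlying idea: if every tool writes `version` first and the remaining keys in sorted order, including keys it does not recognize, the resulting bytes (and therefore the Git blob OID) come out identical. The names `write_pointer` and `git_blob_oid` and the `x-custom` key are hypothetical, and the blob OID shown assumes a SHA-1 Git repository, where a blob is hashed as `blob {size}\0` followed by its contents.

```python
import hashlib
from typing import Dict


def write_pointer(fields: Dict[str, str]) -> bytes:
    """Serialize key/value pairs: 'version' first, other keys sorted.

    Unknown keys are written back out rather than dropped.
    """
    keys = sorted(key for key in fields if key != "version")
    lines = ["version " + fields["version"]]
    lines += [key + " " + fields[key] for key in keys]
    return ("\n".join(lines) + "\n").encode("utf-8")


def git_blob_oid(blob: bytes) -> str:
    """Compute the SHA-1 object id Git assigns to a blob with these bytes."""
    header = b"blob %d\0" % len(blob)
    return hashlib.sha1(header + blob).hexdigest()


pointer = write_pointer({
    "size": "12345",
    "x-custom": "keep-me",  # hypothetical key this tool does not understand
    "oid": "sha256:4d7a214614ab2935c943f9e0ff69d22eadbb8f32b1258daaa5e2ca24d17e2393",
    "version": "https://git-lfs.github.com/spec/v1",
})
print(pointer.decode("utf-8"), end="")
print(git_blob_oid(pointer))
```

Two implementations can be compared by checking that their blob OIDs match for the same input file.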

Note: Earlier versions only contained the OID, with a `# comment` above it.

This wasn't touched by the current PR but including this part about earlier versions and some out-of-date code for parsing older pointer files has been bothering me 😜

Any reason not to 🔥 this section through current line 66?

Contributor Author

Sure, then Git LFS tools don't know how to parse older pointer files. I was thinking of removing this section though, because the chance of anyone besides me seeing these old blobs is pretty minimal.

Contributor

If this spec is supposed to indicate how to parse older versions, then it
would need one section per version, and each section should be
self-contained, explaining exactly what version it is, how to identify it
from the header, and of course what the format is, with at least one example
pointer file.

It would be the best way to let third parties write correct tools.
On Apr 20, 2015 12:55 PM, "risk danger olson" notifications@github.com
wrote:

In docs/spec.md
#246 (comment):

-* Use properties "version", "oid", and "size" in that order.
-* Separate the property from its value with a single space.
-* Oid has a "sha256:" prefix. No other hashing methods are currently supported
-for Git LFS oids.
-* Size is in bytes.
+* Run git lfs pointer to generate a pointer file for the given local file.
+* Run git lfs pointer to compare the blob OID of the generated pointer files.
+* Tools that parse and regenerate pointer files MUST preserve keys that they
+don't know or care about.

Note: Earlier versions only contained the OID, with a # comment above it.

Sure, then Git LFS tools don't know how to parse older pointer files. I
was thinking of removing this section though, because the chance of anyone
besides me seeing these old blobs is pretty minimal.



Contributor Author

I removed support for the really old pointer files. Nothing but internal test repos ever used those. But I did add an example of the pre-release pointer. Again, the chances of any tool encountering this are pretty low...

Here's some ruby code to parse older pointer files.
@@ -122,4 +142,4 @@ $ cat .gitattributes
*.zip filter=lfs -crlf
```

Use the `git lfs path` command to view and add to `.gitattributes`.
Use the `git lfs track` command to view and add to `.gitattributes`.