Add a version number to the .hdt.index files #7

RubenVerborgh · 2014-11-21T10:22:50Z

Different .hdt.index files for the same .hdt file are incompatible with each other; this causes problems if different applications read them. Maybe they should receive some kind of version number.

But this then creates the problem: how to find the index?
Perhaps, instead of supplying the .hdt file name, we can provide the index file name.

RubenVerborgh · 2014-11-21T10:23:12Z

See also LinkedDataFragments/HDT-Node#1.

artob · 2015-11-13T21:12:53Z

Just encountered "ERROR: Trying to read a LOGArray but data is not LogArray" myself when trying to use tools/hdtSearch on some indexed HDT files prepared by a third party. It would indeed be good to figure out some strategy here, e.g., a common header with a format version number and feature flags.
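A minimal sketch of what such a common header could look like, assuming a fixed-size layout of magic bytes, a three-part format version, and a bit field of feature flags. The magic value `HDTI`, the field widths, and the function names are all hypothetical; none of them come from the HDT specification or the existing library.

```python
import struct

# Hypothetical layout (NOT the real HDT format):
# 4-byte magic, three 1-byte version fields, 4-byte little-endian flag field.
MAGIC = b"HDTI"
HEADER = struct.Struct("<4sBBBI")


def write_header(major, minor, patch, flags):
    """Serialize the header that would precede the index payload."""
    return HEADER.pack(MAGIC, major, minor, patch, flags)


def read_header(data):
    """Parse the header and fail with a clear message on foreign files."""
    if len(data) < HEADER.size:
        raise ValueError("file too short to contain an index header")
    magic, major, minor, patch, flags = HEADER.unpack_from(data)
    if magic != MAGIC:
        raise ValueError("not an index file (bad magic bytes)")
    return (major, minor, patch), flags
```

A reader built this way could reject an incompatible index up front with a version message, rather than failing deep inside array decoding.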

mielvds · 2015-12-03T08:37:12Z

My proposal for a versioning strategy:

Given that the version number is x.y.z:
- A change in x introduces a breaking change in the HDT files that can be generated or read, including a breaking change in the generated index file.
- A change in y introduces a breaking change in the generated index file, but ensures compatible HDT files.
- All other changes increment z.

Original release: 1.0.0
This release: 1.1.1
Next release: 1.1.2

Index files can, for instance, be named <filename>.v11.hdt.index

Any thoughts?
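The rules above can be sketched as follows. This is only an illustration of the proposal, not code from the library: the `Version` type and the helper names are hypothetical, and the file naming assumes that `<filename>` means the path without its `.hdt` extension.

```python
from typing import NamedTuple


class Version(NamedTuple):
    x: int  # breaking change in the HDT file (and therefore the index)
    y: int  # breaking change in the index file only
    z: int  # everything else


def parse(text):
    """Parse a version string such as "1.1.2"."""
    x, y, z = (int(part) for part in text.split("."))
    return Version(x, y, z)


def hdt_compatible(a, b):
    """HDT files are interchangeable when x matches."""
    return a.x == b.x


def index_compatible(a, b):
    """Index files are interchangeable only when both x and y match."""
    return a.x == b.x and a.y == b.y


def index_filename(hdt_path, version):
    """Build a name of the form <filename>.v11.hdt.index (assumed reading)."""
    stem = hdt_path[:-4] if hdt_path.endswith(".hdt") else hdt_path
    return f"{stem}.v{version.x}{version.y}.hdt.index"
```

Under this reading, 1.1.1 and 1.1.2 share index files, while 1.0.0 and 1.1.1 share only HDT files. Note that concatenating x and y without a separator becomes ambiguous once either number reaches two digits, so a real scheme might need a delimiter.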

RubenVerborgh · 2015-12-03T09:26:47Z

Makes sense to me (I'd just put the v11 either after the .hdt or .index to avoid any confusion).

artob · 2016-06-12T05:34:53Z

Has anyone perchance made progress on this front in recent months?

mielvds · 2016-06-28T07:59:47Z

Would it be acceptable to put the version number in the makefile?

mielvds · 2016-06-28T08:03:24Z

Nah, I guess that won't work because of other build strategies.

RubenVerborgh · 2016-06-28T17:02:56Z

Simply put it in an include file?

mielvds · 2016-06-29T07:19:21Z

Sounds reasonable. Probably this will have to happen in the Java version as well for compatibility?

mielvds · 2016-06-29T07:39:44Z

I'll use this approach: http://stackoverflow.com/questions/27395120/correct-way-to-encode-embed-version-number-in-program-code

mielvds · 2016-07-07T12:56:55Z

@bendiken Fixed in #36, please review.

mielvds · 2016-12-04T18:23:25Z

Fixed with merge of #36

RubenVerborgh · 2016-12-04T18:25:16Z

Excellent. Shall we publish and tag a v1.2.0 soon then?

mielvds · 2016-12-04T18:29:58Z

Depends. Did the index change in a breaking way? Otherwise it's 1.1.2 :)

We should include the versioning strategy in the readme.

RubenVerborgh · 2016-12-04T18:32:08Z

Not sure I agree:

- I would want to follow the SemVer convention of minor version = "new backwards-compatible features".
- HDTVersion.hpp has places for HDT_VERSION, INDEX_VERSION, RELEASE_VERSION, but nothing implies that this is tied to the version number of the software itself. In my opinion, they should be separate: the software can go through minor and major releases without changing compatibility with a certain major HDT format.
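To illustrate the separation being argued for, here is a small sketch in which the file format versions and the software release version are tracked independently. The class and field names are hypothetical and do not reflect the actual contents of HDTVersion.hpp.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FormatVersions:
    hdt: int    # version of the HDT file format
    index: int  # version of the index file format


@dataclass(frozen=True)
class Build:
    software: str            # SemVer release of the library itself
    formats: FormatVersions  # formats this build reads and writes


def can_share_index(a, b):
    """Two builds can exchange index files when their format versions match,
    regardless of how far apart their software releases are."""
    return a.formats == b.formats
```

With this split, a library release such as 2.0.0 could still read indexes written by 1.2.0, provided neither format version changed in between.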

wouterbeek · 2016-12-04T18:37:53Z

Sorry, I might be going off in a very different direction from what you've been discussing here...

I would very much prefer the index and HDT file to be one and the same. It is technically trivial to do so.

IIUC, the only reason the index and HDT are not one and the same file is that you save the size of the index file when using HDT as a transmission format. However, the index is not that big, so the benefit is small. If we use HDT as a storage format, the size difference does not matter at all (because disk is so cheap these days).

Having 1 file with no versioning/synchronization overhead between HDT and index would significantly simplify handling HDTs.

mielvds · 2016-12-04T20:01:47Z

@RubenVerborgh keeping them separate just seems confusing to me. But if this is common practice, by all means.

@wouterbeek from my experience, indexes can be quite large. But I have to admit, it would simplify things.

wouterbeek · 2016-12-04T20:11:22Z

@mielvds The index files are between 10% and 40% of the size of the HDT file. There may be exceptions, but this is the ballpark figure IINM.

Having a versioning system is of course better than the current situation where we have to do the bookkeeping ourselves.

RubenVerborgh · 2016-12-04T20:54:36Z

@mielvds What I suggested is SemVer, which is becoming more and more common. But I don't mind too much in this case; I'd just want a new release sometime soon.

@wouterbeek Not sure I follow the argument that the index should be the same everywhere. We recently had a commit in which the index was improved for certain lookups. And as you know, the index file is not information by itself, as it can be computed in its entirety from the HDT file. In almost all cases, it will be faster to generate it than to download it.
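Because the index can always be recomputed from the HDT file, one possible way to handle a version mismatch is to regenerate it locally. The sketch below assumes a hypothetical index file that starts with a short version tag; `load_or_rebuild` and the `build_index` callback are illustrative names, not part of any HDT library.

```python
import os


def load_or_rebuild(index_path, expected, build_index):
    """Return the index payload, rebuilding it when the stored version
    tag is missing or differs from the expected one.

    expected    -- tuple of small ints, e.g. (1, 1), used as the version tag
    build_index -- callable that recomputes the payload from the HDT file
    """
    tag = bytes(expected)  # (1, 1) becomes b"\x01\x01"
    if os.path.exists(index_path):
        with open(index_path, "rb") as f:
            data = f.read()
        if data[:len(tag)] == tag:
            return data[len(tag):]  # compatible index, reuse it
    # Missing or incompatible: recompute and overwrite.
    payload = build_index()
    with open(index_path, "wb") as f:
        f.write(tag + payload)
    return payload
```

An application would call this once at load time, so that an index written by a different library version is replaced silently instead of causing a read error.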

RubenVerborgh · 2016-12-04T21:45:51Z

@mielvds Perhaps hold off on releasing a new version until I have resolved this. I suspect something went wrong recently with this codebase.

mielvds · 2016-12-05T09:34:14Z

@RubenVerborgh sounds reasonable, but we'll need to add that distinction to the code then.

@wouterbeek I couldn't remember the argument against it, and @RubenVerborgh found it. Some indexes are optional, and we don't know which ones will be added in the future.

RubenVerborgh · 2016-12-05T22:19:44Z

@mielvds The blocking issue for a new version is #43.

RubenVerborgh mentioned this issue Nov 21, 2014

.hdt.index files are incompatible with other HDT library versions LinkedDataFragments/HDT-Node#1

Closed

mielvds mentioned this issue Nov 30, 2015

Improve API to production quality #18

Closed

artob added the enhancement label Jun 14, 2016

mielvds self-assigned this Jun 27, 2016

artob mentioned this issue Nov 24, 2016

HDT-CPP is not able to load HDT file created in HDT-Java rdfhdt/hdt-java#30

Closed

mielvds closed this as completed Dec 4, 2016

webdata mentioned this issue Dec 9, 2016

HDT format identifier uses only HDT version. #43

Merged
