Change interfaces to work with larger data (required when merging long-dict-id branch to devel) #63

Closed
webdata opened this issue May 30, 2017 · 7 comments

Comments

@webdata
Contributor

webdata commented May 30, 2017

Hi all,

We recently created the LOD-a-lot HDT dataset, which contains 28B triples. @MarioAriasGa made some datatype changes in the branch https://github.com/rdfhdt/hdt-cpp/tree/long-dict-id to handle that many triples.

We should merge this branch into develop (https://github.com/rdfhdt/hdt-cpp/tree/develop) and later into master. Given that the interface has changed, we suggest creating a new major revision.

Opinions?

See the required changes (some of them affect only the HDT-it app, so please filter by hdt-lib/): https://github.com/rdfhdt/hdt-cpp/compare/develop...long-dict-id?diff=unified&name=long-dict-id

@wouterbeek
Contributor

wouterbeek commented May 30, 2017

For @JanWielemaker and me it would be easier to have these changes in the development branch.

I will be using LOD-a-lot intensively over the coming month or so, so I can help with bug hunting and testing on the development branch.

@JanWielemaker
Contributor

@wouterbeek A branch is fine too. It just means creating a branch in the Prolog interface and pointing the hdt-lib submodule at a branch. No problem.

@wouterbeek
Contributor

@JanWielemaker Oops, I stand corrected!

@RubenVerborgh
Member

  • happy with merging this into develop
  • +1 for major version change, but careful: this should not change the HDT format version number
  • Is the index file affected?

@RubenVerborgh
Member

Ah, and of course on the condition that the CI works (it currently fails on the head of long-dict-id).

@MarioAriasGa
Member

The HDT format stays the same (including the index), but the change can break builds of software that compiles against the library. The main change is using size_t for dictionary IDs throughout.
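
To illustrate why the wider type matters, here is a minimal sketch, not taken from hdt-cpp. It assumes the previous ID type was 32 bits wide, which this thread does not state, and every name in it (fits_in, wrap_to_32_bits, checked_narrow, kTripleCount) is hypothetical. It shows that a count of 28B cannot be represented in 32 bits and what value a 32-bit field would silently hold instead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical helpers for illustration; none of these names come from hdt-cpp.

// True when `value` can be represented by the unsigned integer type T.
template <typename T>
constexpr bool fits_in(std::uint64_t value) {
    return value <= static_cast<std::uint64_t>(std::numeric_limits<T>::max());
}

// What a 32-bit field would actually store: the value modulo 2^32.
constexpr std::uint32_t wrap_to_32_bits(std::uint64_t value) {
    return static_cast<std::uint32_t>(value);
}

// Narrow to 32 bits only when no information is lost (checked in debug builds).
inline std::uint32_t checked_narrow(std::uint64_t value) {
    assert(fits_in<std::uint32_t>(value));
    return static_cast<std::uint32_t>(value);
}

// 28B, the triple count quoted for LOD-a-lot in this thread.
constexpr std::uint64_t kTripleCount = 28000000000ULL;

// Compile-time check: the count is outside the 32-bit range.
static_assert(!fits_in<std::uint32_t>(kTripleCount),
              "28B does not fit in a 32-bit unsigned integer");
```

On a typical 64-bit platform size_t is 64 bits wide, so fits_in<std::size_t>(kTripleCount) holds there; on a 32-bit platform it would not. The source-level breakage mentioned above would come from downstream code that declares or overrides library functions with the old narrower type, since those signatures no longer match.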

@RubenVerborgh
Member

Fine with me; please create a pull request for a rebased long-dict-id.


5 participants