
Clarify in the README how the data is modeled in LevelGraph #1

Open
mcollina opened this issue Jul 18, 2013 · 11 comments

Comments

@mcollina (Collaborator)

From @lanthaler:

Makes sense... You should probably clarify that in the README though, as I believe a lot of people would otherwise assume that they get the same document back in a get.

elf-pavlik pushed a commit to elf-pavlik/levelgraph-jsonld that referenced this issue Dec 8, 2013
@elf-pavlik (Member)

@lanthaler do you know if we could extract the frame from the input document before/while deserializing it into triples/quads with jsonld.js toRDF()? We could then save it and provide an option to use it on get, as we start brainstorming framing in #2.

@lanthaler
Sorry, but I don’t think I understand what you are saying. Why don’t you simply store the input document as is if you want to restore its structure?

@elf-pavlik (Member)

Good point! 😄

I posted a hopefully more clearly stated question to the JSON-LD mailing list: http://lists.w3.org/Archives/Public/public-linked-json/2013Dec/0005.html

@elf-pavlik (Member)

Closing, since I've already added a paragraph saying:

Please keep in mind that LevelGraph-JSONLD doesn't store the original JSON-LD document but decomposes it into triples!

Framing we can discuss in #2 and #8.
More detailed information we can add after resolving #4 and #5.
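
A minimal sketch of the distinction that README paragraph points at, assuming the put/get API shown in the levelgraph-jsonld README (the document, context, and IRIs below are made up):

```js
var levelgraph = require('levelgraph');
var jsonld = require('levelgraph-jsonld');

var db = jsonld(levelgraph('yourdb'));

var doc = {
  '@context': { 'name': 'http://xmlns.com/foaf/0.1/name' },
  '@id': 'http://example.org/matteo',
  'name': 'Matteo'
};

db.jsonld.put(doc, function (err) {
  if (err) throw err;
  // get() recomposes a document from the stored triples; its shape can differ
  // from the original input (ordering, nesting, compaction), because the
  // original document itself is never stored.
  db.jsonld.get(doc['@id'], { '@context': doc['@context'] }, function (err, out) {
    if (err) throw err;
    console.log(out);
  });
});
```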

@elf-pavlik (Member)

I'm starting to realize the usefulness of storing somewhere the original document that we decompose into triples. We could maybe use triple properties to somehow reference the document a given triple originates from. No clue at the moment where we could store those original documents...

@mcollina (Collaborator, Author)

I agree! You can store them using their id, but you would need to get down one level and just use levelup, e.g. storing them under the main @id: obj::http://iamaniri.org/matteo

However, as we are duplicating data, we need to take great care to avoid differences.
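
A rough sketch of that idea (the obj:: key prefix and helper names are illustrative, not an existing levelgraph-jsonld feature; store is assumed to be any levelup-style key/value store):

```js
// Keep the original JSON-LD document one level down, keyed by its main @id.
function saveOriginal(store, doc, cb) {
  // assumes the document actually has a main @id; see the discussion below
  // for documents that do not
  store.put('obj::' + doc['@id'], JSON.stringify(doc), cb);
}

function loadOriginal(store, id, cb) {
  store.get('obj::' + id, function (err, value) {
    if (err) return cb(err);
    cb(null, JSON.parse(value));
  });
}
```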


@elf-pavlik (Member)

It can get tricky to use the @id IRI, since some documents may not have a main resource at all, e.g. a serialized array. At this moment I can think of a few possible approaches:

  • if the document came from making an HTTP request, one can just use its IRI (maybe + timestamp or HTTP ETag?)
  • generating UUIDs
  • creating a hash of the original document

At first I would consider starting with something simple like generating UUIDs or a hash, just so we can somehow reference the original document that a given statement came from! Which might get even more interesting once we get to implementing SPARQL...
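
A hedged sketch of those last two options (the doc:: prefix and helper names are made up here; nothing below is existing levelgraph-jsonld behaviour):

```js
var crypto = require('crypto');
var uuid = require('node-uuid');

// content hash: the same document always maps to the same key
function docKeyByHash(doc) {
  return 'doc::' + crypto.createHash('sha1')
    .update(JSON.stringify(doc))
    .digest('hex');
}

// random UUID: a fresh key per insertion, even for identical documents
function docKeyByUUID() {
  return 'doc::' + uuid.v4();
}
```

Either key could then be kept alongside the triples as a reference back to the stored original.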

We may also need to give good thought to handling documents with embedded (nested) resources. I can imagine someone first creating a resource using a big number of statements and then inserting another document with the same resource embedded but including only a few relevant statements. We can NOT just delete all the statements about this resource and replace them with the few statements from the embedded one! I would prefer not to go into this topic in more depth here, and maybe I'll create a separate issue for it... The previous issue relevant to this topic, #8, I would prefer to leave for discussing blank nodes -- a big topic on its own 😉

BTW, I have the impression that lately we are dealing here with many general RDF issues, relevant not only to the JSON-LD serialization...

@mcollina (Collaborator, Author)

JSON-LD is RDF, and we are developing an RDF data storage on top of LevelGraph. I think having a separate issue is better. Also, can you create a 'v0.3.0-wip' pull-request with all the stuff that you plan to release there?

@elf-pavlik (Member)

I've just renamed #10 to v0.3.0-wip and would only like to include #11 in it before merging!

The discussion about which issues relate to RDF in general and which stay specific to its various serializations we can continue elsewhere...

@elf-pavlik (Member)

I've just been playing around with parsing my maildir with http://npm.im/mailparser, generating a UUID-based Skolem IRI for @id, adding an @context with just some bogus @vocab, and saving it all in LevelGraph.

@mcollina could you imagine writing a small snippet suggesting how you would go about storing the original source of each email in LevelDB, maybe simply using the generated UUID as the key? For now I would just look at adding an additional property with this UUID to each stored triple...
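
For illustration, roughly what that ingestion could look like (mailparser usage follows its 0.x stream API as I understand it, and the vocab and Skolem IRIs are placeholders, so treat this as a sketch rather than the project's approach):

```js
var MailParser = require('mailparser').MailParser;
var uuid = require('node-uuid');

// sources: any levelup-style store for the raw email bodies
function ingest(sources, rawEmail, cb) {
  var parser = new MailParser();
  parser.on('end', function (mail) {
    var id = uuid.v4();
    var doc = {
      '@context': { '@vocab': 'http://example.org/vocab#' },   // bogus @vocab
      '@id': 'http://example.org/.well-known/genid/' + id,     // UUID-based Skolem IRI
      'subject': mail.subject
    };
    // keep the original source retrievable under the same UUID
    sources.put(id, rawEmail, function (err) {
      cb(err, doc); // the caller can then db.jsonld.put(doc, ...)
    });
  });
  parser.write(rawEmail);
  parser.end();
}
```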

@mcollina (Collaborator, Author) · Jan 2, 2014

Cool project!

I suggest using sublevel: one sublevel for the graph in which you store each email's metadata, plus one sublevel for the email parts. How does that sound?
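
One possible shape for that setup (the use of level-sublevel and the sublevel names are assumptions, not something the project prescribes):

```js
var level = require('level');
var sublevel = require('level-sublevel');
var levelgraph = require('levelgraph');
var jsonld = require('levelgraph-jsonld');

var db = sublevel(level('./mail'));

// one sublevel for the graph holding each email's metadata as triples
var graph = jsonld(levelgraph(db.sublevel('graph')));

// one sublevel for the raw email parts, keyed by the generated UUID
var parts = db.sublevel('parts');

// parts.put(id, rawEmail, ...) alongside graph.jsonld.put(doc, ...)
```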
