Refactor/valid expanded json ld by zonotope · Pull Request #44 · fluree/json-ld

zonotope · 2025-07-08T03:47:54Z

This patch updates the expand output to be valid JSON-LD instead of our proprietary format while retaining keyword context functionality that allows for namespaced keywords as compact iris. It changes the :id, :value, :type, :list, and :language keyword keys found in the expanded output to @id, @value, @type , @list, and @language keys, respectively, and it also removes the :idx key from the expanded output while still reporting on the current index when any errors occur.

This changes the output contract, so we should bump the library version to 2.0.0 if we choose to merge this patch.

@id

This change updates the JSON-LD expansion implementation to produce output that conforms to the W3C JSON-LD specification, replacing the previous proprietary format with standard JSON-LD keywords. Key changes: - Use "@value" instead of ":value" for literal values - Use "@id" instead of ":id" for node identifiers - Use "@type" instead of ":type" for data types - Use "@language" instead of ":language" for language tags - Use "@list" instead of ":list" for ordered lists - Remove proprietary ":idx" metadata from expanded output The expansion now produces valid JSON-LD expanded form that can be processed by other JSON-LD processors, while maintaining backward compatibility for input formats (both compact JSON-LD and Clojure keyword forms are still supported). Example transformation: Before: {:id "uri", "prop" [{:value "text", :type nil, :idx [...]}]} After: {"@id" "uri", "prop" [{"@value" "text"}]} Updated corresponding tests to expect the new standard format.

- Remove unused binding parameters (idx) from parse-node-val methods - Replace unnecessary str wrapping of string literals in error messages - All methods now use underscore (_) for unused parameters to satisfy clj-kondo - Fixes 6 linting warnings with no functional changes to the code

@id

…rmat - Update all test expectations from old format (:id, :idx, :type, :value) to JSON-LD format (@id, @type, @value, @graph) - Remove :idx fields entirely as they are no longer part of the expanded output - Change :json type to "@JSON" string to match JSON-LD specification - Update graph tests to use "@graph" instead of :graph keyword - All tests now pass with the refactored expansion that produces valid JSON-LD This completes the migration from the proprietary data format to standard JSON-LD expansion output, ensuring compatibility with JSON-LD processors and tools.

bplatz · 2025-07-13T20:02:08Z

I'm curious the motivation for this change, and downstream impacts from it. There are other standards-compliant libraries out there that were passed up because there were tradeoffs that were important at least at the time. If the time and need has changed then that may be the current situation.

The original tradeoffs were:
a) speed - keyword based lookups (although I'm not sure what real world difference would be at this point and perhaps minimal)
b) optimizations [speed again] - the :type-id made short work of nested contexts (a context map in a context map) and if I recall proved useful beyond that
c) metadata - the :idx key allows us to give specific user feedback exactly where issues we discover downstream originated from - something that is completely lost in standard expansion
d) clojure keywords for the main "special values" that make it easy to check for conditions in idiomatic clojure code instead of a bunch of string-based lookups

If these things are no longer needed (or no longer the case, e.g. it isn't actually any faster) then we should decide if we want to have a library that does the same thing other libraries already do instead of continuing to maintain this.

zonotope · 2025-07-14T20:37:31Z

I think on balance we should make this change, but I don't feel particularly strongly about it. We've had a number of bugs related to our expanded data not being valid json-ld, but those bugs have been fixed.

The motivation for this change expanded data no longer being valid json-ld breaks json-ld semantics. We currently can't expand anything twice, for example. This makes the code base harder to work with and I expect that in the future it will make it more difficult for new people to work on and understand the code base because lots of the pre-existing W3C documentation won't apply.

I remember off of the top of my head bugs due to us serializing expanded documents and treating them as json-ld upon de-serialization, transmitting expanded documents between servers and treating them as json-ld on the receiver side, and attempting to expand a document that was already expanded throwing errors because it wasn't valid json-ld. Now, in each of these cases the best thing to do was to either compact the document beforehand or avoid trying to expand it more than once, but that won't always be practical, especially if the document is small and the compression benefits won't matter as much or we're dealing with input from multiple paths.

I'll respond to the tradeoffs you mentioned in line.

a) speed - keyword based look ups (although I'm not sure what real world difference would be at this point and perhaps minimal)

I think the real world differences here are minimal if anything. The difference in speed between keyword based and string based look ups come from the runtime being able to compare memory addresses for keywords because they're interned vs a character by character equality check for strings. Both comparisons are fast, but keyword checks are slightly faster since the equality check complexity is constant instead of linear. I think this would matter slightly more if we converted arbitrary iris into keywords, but we only convert a limited set of iris, all of which happen to be short strings.

b) optimizations [speed again] - the :type-id made short work of nested contexts (a context map in a context map) and if I recall proved useful beyond that

I'm not sure what you mean by :type-id. Could you elaborate?

c) metadata - the :idx key allows us to give specific user feedback exactly where issues we discover downstream originated from - something that is completely lost in standard expansion

I think this matters most during the expansion process, and we still keep track of the current index in the document during the expansion, and any expansion errors we throw include the index into the document where the error occurred. We don't currently use the idx metadata for any other type of error reporting, and if we have downstream code that runs into a semantically different type of error post expansion, we'll also have much more context related to that specific error to include in the error message, including the expanded and compact form of any iris related to the error.

d) clojure keywords for the main "special values" that make it easy to check for conditions in idiomatic clojure code instead of a bunch of string-based look ups

We'll never be able to avoid string based look ups because we can't convert arbitrary iris to keywords, and I think managing maps with mixed key types is even less idiomatic then having all string keys. Another problem that comes from this is that some input data we receive could already be in expanded form, so we could always have "@id" as the id key anyway, so we still always have to check for it.

If these things are no longer needed (or no longer the case, e.g. it isn't actually any faster) then we should decide if we want to have a library that does the same thing other libraries already do instead of continuing to maintain this.

I usually prefer to use a well maintained library if one exists instead of re-implementing that functionality ourselves, but our library is relatively stable at this point, so I think we should probably stick to it for now unless we find thorny bugs or large gaps in the feature set.

Like I said, I don't feel particularly strongly about this, but if we do decide to stick with the current functionality, then I think we should at least change the name of our function. expand has a well established meaning in a json-ld context, and what we currently do is different, so it should have a different name.

dpetran · 2025-07-15T16:23:10Z

As to the availability of other libraries, the ones that I tested were about an order of magnitude slower than ours, though they cover all of the json-ld processor api.

…ed-json-ld

zonotope · 2025-08-14T14:56:08Z

@fluree/core this is ready for review

dpetran · 2025-08-14T15:35:18Z

CHANGELOG.md

+### Changed
+- **BREAKING**: Expansion now produces valid JSON-LD output conforming to JSON-LD 1.1 specification
+  - Expansion output now uses standard JSON-LD keywords (`@id`, `@type`, `@value`, `@graph`, `@list`) instead of proprietary format
+  - Removed `:idx` metadata from expansion output - expansion now produces clean JSON-LD


Can we update this - it looks like we actually do retain :idx as metadata. Nice idea, btw, I like that we don't have to deal with it while processing the data but that it's still there if we ever want to use it.

dpetran

🍇

zonotope added 7 commits July 7, 2025 15:47

cljfmt

18b79f1

remove redundant ifs

37bc1b1

use when-let and if-let where appropriate

3dbdd56

remove redundant when-let

a32b5ba

zonotope requested review from a team and bplatz July 8, 2025 03:47

zonotope self-assigned this Jul 8, 2025

update readme and changelog

34327d4

Merge remote-tracking branch 'origin/main' into refactor/valid-expand…

bca4c3a

…ed-json-ld

zonotope mentioned this pull request Aug 13, 2025

Use valid expanded json ld fluree/db#1089

Merged

zonotope added 5 commits August 14, 2025 11:45

save index into the document for each node as metadata

0d730df

fix formatting

229421e

add get-idx api function

3c93fb1

update docstring to reflect new library behavior

a4871f1

update graalvm compatibility tests

1db2f86

zonotope mentioned this pull request Aug 14, 2025

Refactor/valid json ld fluree/server#170

Merged

dpetran reviewed Aug 14, 2025

View reviewed changes

dpetran approved these changes Aug 14, 2025

View reviewed changes

update changelog

fa242da

zonotope merged commit 7408353 into main Aug 14, 2025
10 checks passed

zonotope deleted the refactor/valid-expanded-json-ld branch August 14, 2025 18:24

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Refactor/valid expanded json ld#44

Refactor/valid expanded json ld#44
zonotope merged 15 commits intomainfrom
refactor/valid-expanded-json-ld

zonotope commented Jul 8, 2025

Uh oh!

bplatz commented Jul 13, 2025

Uh oh!

zonotope commented Jul 14, 2025 •

edited

Loading

Uh oh!

dpetran commented Jul 15, 2025

Uh oh!

zonotope commented Aug 14, 2025

Uh oh!

dpetran Aug 14, 2025

Uh oh!

dpetran left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

zonotope commented Jul 8, 2025

Uh oh!

bplatz commented Jul 13, 2025

Uh oh!

zonotope commented Jul 14, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

dpetran commented Jul 15, 2025

Uh oh!

zonotope commented Aug 14, 2025

Uh oh!

dpetran Aug 14, 2025

Choose a reason for hiding this comment

Uh oh!

dpetran left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

zonotope commented Jul 14, 2025 •

edited

Loading