Performance of JSON-LD framing for larger documents #248
Comments
It's possible you're the first to try framing with docs that size. Our use cases have involved smaller docs than that. Perhaps someone else in the community has tried larger docs? If you're running in Node, a good first step would be to start up in debug mode and hook in the Chrome dev tools. Then you can do a quick profile, and maybe it'll be obvious that some code has gone exponential. Maybe also write a quick test using the Ruby implementation to see if it has the same problems. Perhaps there's some insight in that code on better data structures to use: https://github.com/ruby-rdf/json-ld Is the data available somewhere for others to test?
Thanks for the hints! I'm running the JSON-LD framing in Node via the jsonld-cli. The data is unfortunately internal and thus unavailable. I can, however, try generating synthetic data of similar size and structure to see whether it suffers from the same problems.
Another thing to try is to cut the data size in half, then in half again, and so on, checking the timing at each step. I'm guessing the performance graph is not going to be linear. What is the structure of your dataset? If it's a collection of many similar small items and it only fails when the number of them is large, it should be fairly easy to build a similar test by algorithmically generating a test dataset of any size. If it's something like a social graph, where the links are the problem, it may be harder to simulate.
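The halving experiment suggested above could be sketched roughly as follows. This is only an illustration: `makeSyntheticDoc` and `timeAtSizes` are hypothetical helper names, the `http://example.org/` vocabulary is invented, and the generated shape (many small items with a configurable number of links each) is an assumption about what the real data might look like.

```javascript
// Build a synthetic JSON-LD document with `n` small, similar items.
// The vocabulary (http://example.org/...) and the item shape are made up
// for illustration; they do not come from any real dataset.
function makeSyntheticDoc(n, linksPerItem = 1) {
  const graph = [];
  for (let i = 0; i < n; i++) {
    const knows = [];
    for (let k = 1; k <= linksPerItem; k++) {
      // Link each item to the next `linksPerItem` items, wrapping around.
      knows.push({ '@id': `http://example.org/item/${(i + k) % n}` });
    }
    graph.push({
      '@id': `http://example.org/item/${i}`,
      '@type': 'http://example.org/Item',
      'http://example.org/name': `Item ${i}`,
      'http://example.org/knows': knows,
    });
  }
  return { '@graph': graph };
}

// Run `fn(doc)` once per size and return [{ size, ms }] rows.
// `fn` may be synchronous or return a promise.
async function timeAtSizes(fn, sizes, linksPerItem = 1) {
  const rows = [];
  for (const size of sizes) {
    const doc = makeSyntheticDoc(size, linksPerItem);
    const start = process.hrtime.bigint();
    await fn(doc);
    const ms = Number(process.hrtime.bigint() - start) / 1e6;
    rows.push({ size, ms });
  }
  return rows;
}
```

With jsonld.js installed, one might call `timeAtSizes((doc) => jsonld.frame(doc, frame), [1000, 500, 250, 125])`, where `frame` is your own frame object, and compare the `ms` column across sizes. Raising `linksPerItem` gives a crude way to test whether interlinking, rather than size alone, drives the slowdown.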
I encountered a similar framing problem, where even a 200 kB document can be enough to cause a wait of several tens of seconds. Below is example data and four frames. Processing all four of them takes about 70 seconds in Chrome on an average Core i5. I thought I was doing something wrong, but if @jindrichmynarz also thinks framing might be slow, maybe there actually is something suboptimal in the algorithm? Processing similar documents of sizes up to 150 kB takes just a few seconds. Maybe the problem is the higher amount of interlinking in this one, but I haven't investigated that yet. @jindrichmynarz, have you found any workarounds in your case?
I haven't investigated this much more. I tried to frame the larger documents using jsonld-java, but it had similar performance problems, and while I tried profiling the code, I haven't found a clear cause. I think the key question here is to what extent the poor performance is caused by the size of the input data and to what extent by its structure.
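One rough way to separate size from structure is to measure how interlinked a document is before framing it. The sketch below is an assumption-laden illustration, not part of any library: `linkStats` is a hypothetical helper, and it assumes the document has already been brought into a flat `{ "@graph": [...] }` shape in which references to other nodes appear as objects whose only key is `@id`.

```javascript
// Count node references in a document shaped like
// { "@graph": [ ...node objects... ] }.
// A "reference" here means an object whose only key is "@id".
// Assumption: the input is already flat; nested nodes are not descended into.
function linkStats(doc) {
  const nodes = doc['@graph'] || [];
  let links = 0;
  for (const node of nodes) {
    for (const [key, value] of Object.entries(node)) {
      if (key.startsWith('@')) continue; // skip keywords such as @id and @type
      const values = Array.isArray(value) ? value : [value];
      for (const v of values) {
        const isReference =
          v !== null &&
          typeof v === 'object' &&
          Object.keys(v).length === 1 &&
          '@id' in v;
        if (isReference) links++;
      }
    }
  }
  return {
    nodes: nodes.length,
    links,
    linksPerNode: nodes.length ? links / nodes.length : 0,
  };
}
```

Comparing `linksPerNode` between a document that frames in a few seconds and one that takes much longer would give a first hint as to whether interlinking is the distinguishing factor.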
Hello. With this document: https://gist.github.com/jblemee/41a5c8fa56fffc17896d3b58f42adf43 I get 52% of my CPU time spent in the function "removedependents", both on the playground and in my app. Here is the function:
The problem is exponential. With 1/4 of the JSON it works. Each time you add an element to the JSON list, the execution time roughly doubles.
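The "roughly doubles per added element" observation can be checked against measurements with a small helper. This is a heuristic sketch only: `stepRatios` and `looksExponential` are hypothetical names, and the `minRatio` threshold of 1.5 is an arbitrary assumption. It is only meaningful when the list is already reasonably long, since polynomial growth also shows large step ratios at very small sizes.

```javascript
// Given timings measured at consecutive list lengths (n, n+1, n+2, ...),
// return the ratio of each timing to the one before it.
function stepRatios(timingsMs) {
  const ratios = [];
  for (let i = 1; i < timingsMs.length; i++) {
    ratios.push(timingsMs[i] / timingsMs[i - 1]);
  }
  return ratios;
}

// Heuristic: if every step multiplies the time by at least `minRatio`,
// the growth is consistent with exponential behaviour. For polynomial
// growth the step ratio drifts towards 1 as the list gets longer.
// The default threshold is an arbitrary illustrative choice.
function looksExponential(timingsMs, minRatio = 1.5) {
  const ratios = stepRatios(timingsMs);
  return ratios.length >= 3 && ratios.every((r) => r >= minRatio);
}
```

For example, timings of 10, 20, 40 and 80 ms for four consecutive list lengths give step ratios of 2, 2 and 2, which matches the doubling described above.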
With a quick glance, I'm not sure if that
Did we just break SoLiD? :-)
No attempt has been made to optimize the remove embed code, so I suspect there is much that could be gained. We'd be very happy to accept a PR that improved performance.
I have a larger JSON-LD document (24 MB expanded). Framing it gets stuck with 1 CPU fully used (little memory is used). I have a few questions: