Tags: APIDevTools/json-schema-ref-parser
fix(BREAKING CHANGE): dereference caching to prevent infinite loops on circular schemas (#380)

We've come across an interesting case with one of our customers' OpenAPI definitions where, due to the number of circular references it uses, `$RefParser.dereference()` ends up in an infinite loop, which has unfortunately been causing us some pain.

The following is a video of running the `dereference.circular=ignore` test in this PR without my eventual fix:

https://github.com/user-attachments/assets/077129bd-997a-40b7-aa57-8129bd7df87f

I killed the process after 20 seconds, but we've seen cases where it can run for almost an hour trying to dereference this circular API definition.

In investigating this we've identified a couple of issues:

1. When dereferencing this specific schema, the event loop gets blocked, which prevents the `options.timeoutMs` check and exception from ever being hit.
    * We were able to resolve this by adding a supplementary timeout check when the dereference crawler processes each property within an object.
2. The dereference cache isn't being fully taken advantage of.

In investigating the cache issues we noticed a couple more issues:

#### Core issues with `Set` caches

The `Set` objects used for `parents` and `processedObjects` don't appear to be fully utilized because they often store objects, and `Set` compares objects by reference, so it does not dedupe structurally equal objects:

```js
const set = new Set();
set.add({ type: 'string' });
set.add({ type: 'string' });

// Each object literal is a distinct reference, so both are stored
// and a lookup with a third literal fails.
console.log({ set: Array.from(set), has: set.has({ type: 'string' }) });
// > {set: Array(2), has: false}
```

I'm not convinced that any of the `.has()` checks being executed on these stores are currently working. I made an attempt at pulling in [flatted](https://npm.im/flatted)[^1] to create consistent and unique keys from these values, but a couple of unit tests broke in unexpected ways and I ended up moving on.

I would love to spend some more time investigating this because I think in extreme cases like ours we could really improve memory usage during dereferencing.

#### The dereference cache is only being used for non-circular objects

After crawling and dereferencing an object, and either updating the original schema or resetting it to the original `$ref` if `dereference.circular=ignore` is set, the resulting object is saved into the dereferenced cache `Map`. This map is consulted at the beginning of the `dereference$Ref` function, but the cached value is only used if the found object is **not** circular. I was unable to uncover a reason why this was the case, but without the cache being used the dereferencer would continuously crawl and re-dereference objects it had already seen, even with `dereference.circular=ignore` configured.

Changing this logic to always return the cached value when one is found brought dereferencing our use case down from ∞ to less than a second.

https://github.com/user-attachments/assets/4d38a619-7b0b-4ec8-8f10-a3e728855a84

<sub>Ignore the logs in this video, it was recorded while I was still working on this fix.</sub>

[^1]: Because `JSON.stringify()` cannot serialize circular objects.

BREAKING CHANGE: dereference caching to prevent infinite loops on circular schemas
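
To illustrate the two fixes described above, here is a minimal sketch of a toy dereferencer. It is not the json-schema-ref-parser implementation: `resolvePointer`, `dereference`, `useCache`, `inProgress`, and `resolveCount` are hypothetical names invented for this example, it only handles internal `#/...` pointers, and it throws a plain `Error` on timeout. It assumes a cache keyed by the `$ref` string that is consulted on every hit regardless of circularity, plus a timeout check on each property visited.

```javascript
// Toy dereferencer for internal "#/..." pointers only. Every name here is
// hypothetical; this is not the json-schema-ref-parser implementation.
function resolvePointer(root, ref) {
  // "#/definitions/node" -> root.definitions.node
  return ref
    .slice(2)
    .split("/")
    .reduce((node, key) => node[key], root);
}

function dereference(root, { useCache = true, timeoutMs = 1000 } = {}) {
  const startTime = Date.now();
  const cache = new Map(); // $ref string -> dereferenced copy
  const inProgress = new Set(); // $ref strings currently being resolved
  let resolveCount = 0;

  function crawl(node) {
    if (node === null || typeof node !== "object") return node;
    if (Array.isArray(node)) return node.map((item) => crawl(item));

    if (typeof node.$ref === "string") {
      const ref = node.$ref;
      // Consult the cache on every hit, whether or not the target is circular.
      if (useCache && cache.has(ref)) return cache.get(ref);
      // A $ref that is already being resolved is circular: keep it as a $ref.
      if (inProgress.has(ref)) return { $ref: ref };

      inProgress.add(ref);
      resolveCount += 1;
      const resolved = crawl(resolvePointer(root, ref));
      inProgress.delete(ref);
      cache.set(ref, resolved);
      return resolved;
    }

    const copy = {};
    for (const [key, value] of Object.entries(node)) {
      // Supplementary timeout check per property, since a synchronous crawl
      // never yields to the event loop.
      if (Date.now() - startTime > timeoutMs) {
        throw new Error(`Dereferencing timed out after ${timeoutMs}ms`);
      }
      copy[key] = crawl(value);
    }
    return copy;
  }

  const schema = crawl(root);
  return { schema, resolveCount };
}

const selfRef = { $ref: "#/definitions/node" };
const root = {
  properties: { a: selfRef, b: selfRef },
  definitions: {
    node: { type: "object", properties: { child: selfRef, sibling: selfRef } },
  },
};

const cached = dereference(root);
const uncached = dereference(root, { useCache: false });
console.log(cached.resolveCount, uncached.resolveCount); // prints: 1 4
```

Even in this tiny schema the uncached run resolves the same `$ref` four times instead of once. With many mutually circular definitions that repeated work compounds, which is consistent with the slowdown described above. The cache is keyed by the `$ref` string rather than by the resolved object because, as the `Set` example shows, object keys only match by reference.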