Dump BDDs from `dd.cudd.BDD` to a JSON file, and load them too. A two-stage approach seems most promising:

1. dump the BDD to a `shelve` file, and
2. iteratively load the shelf file, and dump it iteratively to a JSON file.
The first step uses the shelf as a cache for the depth-first traversal (typically implemented recursively) of the BDD, to find which nodes to dump. We cannot know which nodes to dump without first finding those reachable from the roots we want to dump. Naively, we could traverse the BDD and dump each reachable node as we encounter it. Due to a BDD's structure as a DAG with shared subgraphs, this can result in an exponential amount of duplicate work.
This is why visited nodes are memoized, which corresponds to maintaining a set of "visited" nodes in a graph search. In other algorithmic applications, one would simply mark each visited node as visited. In CUDD, there is no spare space in a node for such "marking". We could instead add the visited nodes to a separate set in main memory. But in demanding use cases CUDD fills most of main memory, so this isn't possible: it would essentially duplicate, within main memory, the BDD we want to save.
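The memoized traversal can be sketched as follows, here with a plain in-memory store (exactly what main memory cannot afford in the demanding cases above, but it isolates the idea; the `(level, low, high)` node encoding is hypothetical, not CUDD's representation):

```python
def collect_nodes(root, visited):
    """Depth-first traversal of a BDD-like DAG.

    Without the `visited` check, a subgraph shared by several
    parents would be re-traversed once per path that reaches it,
    which is exponential in the worst case.

    Nodes are hypothetical `(level, low, high)` tuples;
    terminal nodes are represented by `None` children.
    """
    if root is None or id(root) in visited:
        return
    visited[id(root)] = root  # memoize: each node is handled once
    _level, low, high = root
    collect_nodes(low, visited)
    collect_nodes(high, visited)

# a tiny DAG with sharing: `shared` is reachable via both branches
shared = (2, None, None)
root = (1, shared, shared)
visited = {}
collect_nodes(root, visited)
assert len(visited) == 2  # `shared` is collected only once
```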
I think that DDDMP approaches this problem by removing and adding nodes to the unique table. I consider this undesirable, because it affects the existing cache (hashing information, repeatability, etc.). A dumping operation is extraneous to the BDD manager, so it shouldn't have side effects on the manager.
Main memory cannot serve for storing "visited" information during traversal without interfering with CUDD, but there's the disk. Nowadays, the disk is vastly larger than main memory. So, why not use the disk to store the "visited" status of each node? (For example, the enumerative model checker TLC uses the disk to store the state space.)
Even better, since all we want to do is dump to the disk, why not use the target file itself as the store of "visited" information? The only challenge is a `dict`-like interface for quickly checking containment of nodes in the file. The `shelve` module from Python's standard library provides exactly this interface.
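A sketch of this idea, with a `shelve` file serving both as the dump target and as the on-disk "visited" store (the string keys and the `(level, low, high)` node encoding are hypothetical, not `dd`'s actual format):

```python
import os
import shelve
import tempfile

def dump_nodes(root, shelf):
    """Traverse a BDD-like DAG, using the shelf itself as the
    set of visited nodes: containment checks and insertions go
    to the file on disk, not to a structure in main memory.

    Nodes are hypothetical `(level, low, high)` tuples;
    terminals have `None` children. Shelf keys must be strings.
    """
    if root is None:
        return
    key = str(id(root))
    if key in shelf:  # dict-like containment check, backed by the file
        return
    shelf[key] = root  # record the node in the target file itself
    _level, low, high = root
    dump_nodes(low, shelf)
    dump_nodes(high, shelf)

tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, 'bdd_shelf')
shared = (2, None, None)
root = (1, shared, shared)
with shelve.open(path) as shelf:
    dump_nodes(root, shelf)
    n_nodes = len(shelf)
assert n_nodes == 2  # the shared node was dumped only once
```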
The second step is just a conversion of the entire shelf file to a JSON file. In other words, the first step identifies and isolates the information that we want to store, and the second step puts this information in the target file format.
`ijson` seems suitable for step 2. Compared to `json-streamer`, `ijson` is preferred.
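Step 2 can be sketched as a streaming conversion: iterate over the shelf entries and write one JSON object member per node, so memory use stays bounded by a single node rather than by the BDD's size (`ijson` would play the symmetric streaming role when loading the JSON file back; the node encoding below is hypothetical):

```python
import json
import os
import shelve
import tempfile

def shelf_to_json(shelf_path, json_path):
    """Iteratively convert a shelf of BDD nodes to a JSON file.

    Entries are written one at a time, so the whole node table
    is never materialized in main memory.
    """
    with shelve.open(shelf_path) as shelf, \
            open(json_path, 'w') as f:
        f.write('{')
        first = True
        for key, node in shelf.items():
            if not first:
                f.write(',')
            first = False
            f.write(json.dumps(key))
            f.write(':')
            f.write(json.dumps(node))
        f.write('}')

tmpdir = tempfile.mkdtemp()
shelf_path = os.path.join(tmpdir, 'nodes_shelf')
json_path = os.path.join(tmpdir, 'nodes.json')
with shelve.open(shelf_path) as shelf:
    # hypothetical encoding: level, then keys of low/high successors
    shelf['1'] = [1, '2', '2']
    shelf['2'] = [2, None, None]
shelf_to_json(shelf_path, json_path)
with open(json_path) as f:
    loaded = json.load(f)
assert loaded == {'1': [1, '2', '2'], '2': [2, None, None]}
```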
Addressed in 61b60bf, specifically by the functions `dump_json` and `load_json`. The shelve file is used as the cache during the traversal that dumps nodes from the BDD manager to the JSON file.