Serialize memory improvement #729

Merged: dhalperi merged 3 commits into master from serialize-memory-improvement on Dec 9, 2017

Conversation

dhalperi (Member) commented Dec 9, 2017

Instead of first building the serialized output in memory, serialize directly to disk. On a small-scale testbed (~1100 configs), these two changes let me reduce memory requirements from a couple of gigabytes to under 500 MiB.



Batfish: serialize directly to disk

Minimal time-to-serialize change (4s vs 3s, likely a rounding error),
but a huge memory improvement by not keeping a serialized copy of
everything in memory.

Time with change (one step, direct to disk, about 4 seconds):

```
.... Fri Dec 08 18:04:07 PST 2017: Convert configurations to vendor-independent format: 1086/1086
.... Fri Dec 08 18:04:13 PST 2017: Serializing 'org.batfish.datamodel.Configuration' instances to disk: 1086/1086
.... Fri Dec 08 18:04:17 PST 2017: TERMINATEDNORMALLY
```

Time without change (two steps, serialize then pack/write; about 3 seconds):

```
.... Fri Dec 08 18:15:28 PST 2017: Convert configurations to vendor-independent format: 1086/1086
.... Fri Dec 08 18:15:34 PST 2017: Serializing 'org.batfish.datamodel.Configuration' instances: 1086/1086
.... Fri Dec 08 18:15:37 PST 2017: Packing and writing 'org.batfish.datamodel.Configuration' instances to disk: 1086/1086
.... Fri Dec 08 18:15:37 PST 2017: TERMINATEDNORMALLY
```
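
As a rough illustration of the difference between the two runs above (not the Batfish code itself), here is a minimal sketch in plain Java serialization. The class name `DirectSerialization` and its methods are hypothetical; the assumption is only that the old path filled a heap buffer before writing, while the new path streams to the file.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;

// Hypothetical illustration class; not part of Batfish.
final class DirectSerialization {

  /** Two-step approach: serialize into a heap buffer, then write the buffer out. */
  static void writeBuffered(Serializable obj, Path file) {
    try {
      ByteArrayOutputStream buffer = new ByteArrayOutputStream();
      try (ObjectOutputStream oos = new ObjectOutputStream(buffer)) {
        oos.writeObject(obj);
      }
      // The complete serialized form lives on the heap until this write finishes.
      Files.write(file, buffer.toByteArray());
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** One-step approach: stream the serialized form straight to the file. */
  static void writeDirect(Serializable obj, Path file) {
    try (OutputStream out = new BufferedOutputStream(Files.newOutputStream(file));
        ObjectOutputStream oos = new ObjectOutputStream(out)) {
      // Only the stream's small internal buffers are held in memory.
      oos.writeObject(obj);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Reads a single serialized object back from the file. */
  static Object readBack(Path file) {
    try (ObjectInputStream ois = new ObjectInputStream(Files.newInputStream(file))) {
      return ois.readObject();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    } catch (ClassNotFoundException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) throws IOException {
    ArrayList<String> configs = new ArrayList<>(Arrays.asList("router1", "router2", "router3"));
    Path buffered = Files.createTempFile("buffered", ".ser");
    Path direct = Files.createTempFile("direct", ".ser");

    writeBuffered(configs, buffered);
    writeDirect(configs, direct);

    // Compare the two files and check that the direct file round-trips.
    boolean sameBytes = Arrays.equals(Files.readAllBytes(buffered), Files.readAllBytes(direct));
    System.out.println("same bytes on disk: " + sameBytes);
    System.out.println("round trip ok: " + configs.equals(readBack(direct)));

    Files.delete(buffered);
    Files.delete(direct);
  }
}
```

With many large objects, the buffered variant's peak heap use grows with the total serialized size, which is the cost this PR avoids. The direct variant trades that for slightly more time spent in I/O during serialization.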

Write JSON directly to output files

Don't buffer the output in memory first

arifogel (Member) commented Dec 9, 2017

LGTM


Reviewed 1 of 1 files at r1.
Review status: all files reviewed at latest revision, all discussions resolved, some commit checks failed.



BatfishObjectMapper: override PrettyPrinter properly

The JsonFactory override does not use our custom printer on all paths.

dhalperi (Member, Author) commented Dec 9, 2017

Reviewed 1 of 1 files at r2.
Review status: all files reviewed at latest revision, all discussions resolved.



dhalperi merged commit db75d92 into master Dec 9, 2017
dhalperi deleted the serialize-memory-improvement branch December 9, 2017 05:36
dhalperi added a commit that referenced this pull request Dec 12, 2017:

* Batfish: serialize directly to disk

Minimal time-to-serialize change (4s vs 3s, likely a rounding error),
but a huge memory improvement by not keeping a serialized copy of
everything in memory.

Time with change (one step, direct to disk, about 4 seconds):

```
.... Fri Dec 08 18:04:07 PST 2017: Convert configurations to vendor-independent format: 1086/1086
.... Fri Dec 08 18:04:13 PST 2017: Serializing 'org.batfish.datamodel.Configuration' instances to disk: 1086/1086
.... Fri Dec 08 18:04:17 PST 2017: TERMINATEDNORMALLY
```

Time without change (two steps, serialize then pack/write; about 3 seconds):

```
.... Fri Dec 08 18:15:28 PST 2017: Convert configurations to vendor-independent format: 1086/1086
.... Fri Dec 08 18:15:34 PST 2017: Serializing 'org.batfish.datamodel.Configuration' instances: 1086/1086
.... Fri Dec 08 18:15:37 PST 2017: Packing and writing 'org.batfish.datamodel.Configuration' instances to disk: 1086/1086
.... Fri Dec 08 18:15:37 PST 2017: TERMINATEDNORMALLY
```

* Write JSON directly to output files

Don't buffer the output in memory first

* BatfishObjectMapper: override PrettyPrinter properly

The JsonFactory override does not use our custom printer on all paths.