Pretty print json for human readable files #2706

berland · 2022-01-12T13:32:59Z

Issue
Resolves #318 once more..

Approach
Add indent=4 to occurences of json.dump() whenever it writes to a file that a human might read.

There are some occurences of indent=1 already in the code base, not sure if consistency with that is important, or if we should try to be consistent with how black would format the same dictionaries in source code

Changed existing indent=1 into indent=4.

Considered to add sort_keys=True also, but skipped it, might confuse.

Occurences of json.dump() in tests are not touched.

Pre review checklist

Added appropriate labels

Adding labels helps the maintainers when writing release notes, see sections and the
corresponding labels here: https://github.com/equinor/ert/blob/main/.github/release.yml

markusdregi · 2022-01-12T13:34:45Z

There are some occurences of indent=1 already in the code base, not sure if consistency with that
is important

I would say that is not important and that you should feel free to bump those to 4 as well in this PR...

jondequinor · 2022-01-12T14:09:26Z

job_runner/reporting/file.py

@@ -198,4 +198,4 @@ def _dump_ok_file(self):

    def _dump_status_json(self):
        with open(self.STATUS_json, "w") as fp:
-            json.dump(self.status_dict, fp, indent=1)
+            json.dump(self.status_dict, fp, indent=4)


There's a home made JSON parser somewhere. Just make sure that this file either is not read by that parser, or if it is, that the parser can deal with this indentation. 😅

I was hoping for the test to detect that 😁

I'm not even sure the C-layer reads this file anymore... 🤔 It definitively checks whether it exists (and removes it upon starting a forward model) to deduce running state, but I'm not sure it actually loads the file. It appears that this is all done in the Python layer... If so, there is no worries related to parsing it :)

codecov-commenter · 2022-01-13T10:33:34Z

Codecov Report

Merging #2706 (837eb66) into main (19ba42c) will decrease coverage by 0.09%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main    #2706      +/-   ##
==========================================
- Coverage   65.05%   64.95%   -0.10%     
==========================================
  Files         651      651              
  Lines       53907    53657     -250     
  Branches     4610     4732     +122     
==========================================
- Hits        35069    34854     -215     
+ Misses      17358    17323      -35     
  Partials     1480     1480

Impacted Files	Coverage Δ
ert/serialization/_serializer.py	`90.90% <100.00%> (ø)`
ert3/workspace/_workspace.py	`89.88% <100.00%> (ø)`
ert_shared/services/_base_service.py	`93.20% <100.00%> (ø)`
libres/lib/enkf/block_fs_driver.cpp	`81.35% <0.00%> (-0.57%)`	⬇️
res/enkf/config_keys.py	`100.00% <0.00%> (ø)`
libres/lib/python/init.cpp	`0.00% <0.00%> (ø)`
libres/lib/enkf/config_keys.cpp	`0.00% <0.00%> (ø)`
libres/lib/res_util/block_fs.cpp	`53.38% <0.00%> (+0.14%)`	⬆️
libres/lib/enkf/time_map.cpp	`54.00% <0.00%> (+0.91%)`	⬆️
res/enkf/enkf_fs_manager.py	`94.00% <0.00%> (+1.00%)`	⬆️
... and 4 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 19ba42c...837eb66. Read the comment docs.

markusdregi · 2022-01-14T07:48:48Z

ert/serialization/_serializer.py

@@ -31,6 +31,8 @@ async def decode_from_path(

 class _json_serializer(Serializer):
    def encode(self, obj: Any, *args: Any, **kwargs: Any) -> str:
+        if "indent" not in kwargs:


Is it possible to move this into the function declaration as an optional argument and then afterwards passed explicitly to dumps. It should yield the same behaviour and maybe look a bit less wacky :)

Yes, that worked! Looks better now.

Nope, pylint complained:

************* Module ert.serialization._serializer ert/serialization/_serializer.py:33:4: W0221: Number of parameters was 4 in 'Serializer.encode' and is now 5 in overridden '_json_serializer.encode' method (arguments-differ)

So either go wacky, add pyllint exception, or add indent as a argument to all encode() implementations.

markusdregi · 2022-01-14T07:51:12Z

tests/ert_tests/ert3/data/test_serializer.py

-    )
+
+    # Most compact json serialization:
+    assert json_serializer.encode(


An alternative method here would be to just drop all whitespaces both from obj_json and the result of json_serializer. It might be easier to read in the future and it removes the need to ever toggle indent.

I am not sure I got your question? There are multiple ways to reach a compact json serialization, one way is through the options to encode(). This test is basically testing that separators is passed on while we are overriding indent.

Sorry for the late reply. I retract my point :)

jondequinor · 2022-01-14T09:15:18Z

An argument against this, is that raw JSON will use more space (a lot of and \n). Some of these indented JSON files will surely be large, and they will touch "hot" cloud storage, which is expensive.

Another argument against, it that transport will marginally be slowed down due to more stuff to compress.

Both of these arguments are also against the use of JSON at all, but that's beyond the scope of this PR.

Wouldn't cat some.json|jsonpp take care of the human readable part of this PR?

berland · 2022-01-14T12:23:23Z

That is correct. jq . < somefile.json | less works on-prem, but is not as user-friendly as less somefile.json. The extra whitespace shuffled around might be acceptable until ert3 is mature enough for few users needing to debug in the terminal.

Avoiding whitespace for speed reasons is possibly premature optimization.

jondequinor · 2022-01-14T13:36:11Z

That is correct. jq . < somefile.json | less works on-prem, but is not as user-friendly as less somefile.json. The extra whitespace shuffled around might be acceptable until ert3 is mature enough for few users needing to debug in the terminal.

Fair enough.

berland · 2022-01-14T14:05:19Z

Setting indent=1 as default may be a compromise though. indent=4 does not improve readability by much compared to 1.

jondequinor · 2022-01-14T14:53:59Z

No, the debugging argument is good, and 4 spaces is the sweet spot.

markusdregi

There are good arguments both ways, but this is also easy to change at any point based on new data. I suggest we merge this as is now :)

berland added feature ert3 labels Jan 12, 2022

berland force-pushed the json_pp branch from 4abe650 to c4cf899 Compare January 12, 2022 13:41

jondequinor reviewed Jan 12, 2022

View reviewed changes

Pretty print json for human readable files

6f88042

berland force-pushed the json_pp branch from c4cf899 to 6f88042 Compare January 13, 2022 10:13

berland enabled auto-merge (rebase) January 13, 2022 11:54

markusdregi reviewed Jan 14, 2022

View reviewed changes

berland added 2 commits January 14, 2022 09:24

address reviewer comments

837eb66

Add pylint exception

1e4d727

markusdregi approved these changes Jan 19, 2022

View reviewed changes

berland merged commit 2a7245e into equinor:main Jan 19, 2022

markusdregi mentioned this pull request Jan 19, 2022

Clean-up status parsing logic in the C-implementation of the queue #2741

Open

berland deleted the json_pp branch February 10, 2022 14:37

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Pretty print json for human readable files #2706

Pretty print json for human readable files #2706

berland commented Jan 12, 2022 •

edited

markusdregi commented Jan 12, 2022

jondequinor Jan 12, 2022 •

edited

markusdregi Jan 12, 2022

berland Jan 13, 2022

markusdregi Jan 14, 2022

codecov-commenter commented Jan 13, 2022 •

edited

markusdregi Jan 14, 2022

berland Jan 14, 2022

berland Jan 14, 2022

markusdregi Jan 14, 2022

berland Jan 14, 2022

markusdregi Jan 19, 2022

jondequinor commented Jan 14, 2022 •

edited

berland commented Jan 14, 2022

jondequinor commented Jan 14, 2022 •

edited

berland commented Jan 14, 2022

jondequinor commented Jan 14, 2022

markusdregi left a comment

Pretty print json for human readable files #2706

Pretty print json for human readable files #2706

Conversation

berland commented Jan 12, 2022 • edited

Pre review checklist

markusdregi commented Jan 12, 2022

jondequinor Jan 12, 2022 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

codecov-commenter commented Jan 13, 2022 • edited

Codecov Report

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jondequinor commented Jan 14, 2022 • edited

berland commented Jan 14, 2022

jondequinor commented Jan 14, 2022 • edited

berland commented Jan 14, 2022

jondequinor commented Jan 14, 2022

markusdregi left a comment

Choose a reason for hiding this comment

berland commented Jan 12, 2022 •

edited

jondequinor Jan 12, 2022 •

edited

codecov-commenter commented Jan 13, 2022 •

edited

jondequinor commented Jan 14, 2022 •

edited

jondequinor commented Jan 14, 2022 •

edited