turn exponential notation back on for config dump #929

@stas00

Description
Currently, with ZeRO-3's huge numbers in its params, the config dump looks like this:

[2021-04-05 20:09:01,945] [INFO] [config.py:741:print]   zero_config .................. {
    "allgather_bucket_size": 500000000,
    "allgather_partitions": true,

    "zero_optimization":{
        "contiguous_gradients":true,
        "cpu_offload":true,
        "cpu_offload_params":true,
        "cpu_offload_use_pin_memory":true,
        "overlap_comm":true,
        "reduce_bucket_size":262144,
        "stage":3,
        "stage3_gather_fp16_weights_on_model_save":true,
        "stage3_max_live_parameters":1000000000.0,
        "stage3_max_reuse_distance":1000000000.0,
        "stage3_param_persistence_threshold":5120,
        "stage3_prefetch_bucket_size":235929.6,
        "sub_group_size":100000000000000.0
    }
}

100000000000000 isn't quite readable, is it? So if the intention of this dump is to help debug problems, this output isn't very human-readable.
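Python's `e` presentation type already does the heavy lifting for this; a minimal sketch:

```python
# Python's "e" format spec renders any float in scientific notation
big = 100_000_000_000_000.0
print(f"{big:e}")    # -> 1.000000e+14
print(f"{big:.1e}")  # -> 1.0e+14 (precision is adjustable)
```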

I adapted the formatting function:

import json
from collections.abc import Mapping, Sequence

# adapted from https://stackoverflow.com/a/50701137/9201239
class ScientificNotationEncoder(json.JSONEncoder):
    """JSON encoder that renders floats in scientific notation."""
    def iterencode(self, o, _one_shot=False, level=0):
        indent = self.indent if self.indent is not None else 4
        prefix_close = " " * level * indent
        level += 1
        prefix = " " * level * indent
        if isinstance(o, float):
            return f"{o:e}"  # e.g. 1e14 -> "1.000000e+14"
        elif isinstance(o, Mapping):
            x = [f'\n{prefix}"{k}": {self.iterencode(v, level=level)}'
                 for k, v in o.items()]
            return "{" + ", ".join(x) + f"\n{prefix_close}}}"
        elif isinstance(o, Sequence) and not isinstance(o, str):
            # pass level through so nested containers keep their indentation
            return f"[{', '.join(self.iterencode(i, level=level) for i in o)}]"
        # fall back to the default encoder for everything else (ints, bools, strings, ...)
        return "".join(super().iterencode(o, _one_shot))

# x is the config dict shown above
print(json.dumps(x, indent=4, cls=ScientificNotationEncoder))

Now we get back the more readable scientific-notation format:

    "zero_optimization": {
        "stage": 3, 
        "cpu_offload": true, 
        "cpu_offload_params": true, 
        "cpu_offload_use_pin_memory": true, 
        "overlap_comm": true, 
        "contiguous_gradients": true, 
        "sub_group_size": 1.000000e+14, 
        "reduce_bucket_size": 1.000000e+06, 
        "stage3_prefetch_bucket_size": 9.487879e+05, 
        "stage3_param_persistence_threshold": 1.000000e+04, 
        "stage3_max_live_parameters": 1.000000e+09, 
        "stage3_max_reuse_distance": 1.000000e+09, 
        "stage3_gather_fp16_weights_on_model_save": true
    }, 

1.000000e+14 is much more human-readable than 100000000000000 in the current output ;)
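One property worth noting (an observation, not from the original issue): JSON's number grammar allows exponents, so the prettified dump is still valid JSON and parses back to the same values:

```python
import json

# exponent notation is legal JSON, so the dump round-trips cleanly
cfg = json.loads('{"sub_group_size": 1.000000e+14}')
assert cfg["sub_group_size"] == 100_000_000_000_000.0
```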

Not sure if you want this or not, but since I spent time hacking it together, I thought I'd leave it here for posterity.
