# Compressor Module Test

## 1. Compress the Structured Array


**Step 1. Create a Structured Array.**

We first create a random 3D array with shape (100, 255, 255) to simulate a structured array with dimensions (time, latitude, longitude). The array is cast to float32.

In [2]:
import scicodec as sc
import numpy as np

# Create a structured array with dimensions (time, lat, lon).
example_array = np.random.rand(100, 255, 255).astype(np.float32)


**Step 1. Select the proper compressor and error bound.**


**i. Available Compression Methods**



| Compression Method| Mode| Explanation| Supported Error Mode| Extended Package| Reference |
|---|---|---|---|---|---|
|SZ3|Lossy|Prediction-based lossy compressor|abs_precision|libpressio|https://github.com/szcompressor/sz3|
|MGARD|Lossy|Multigrid-based lossy compressor|abs_precision|libpressio|https://github.com/LLNL/mgard|
|ZFP|Lossy|Transform-based lossy compressor|fixed_ratio,bit_precision[0-64]|libpressio|https://github.com/LLNL/zfp|
|MBT2018|Lossy|Deep learning based compressor|level[1-8]|None|https://github.com/mbt2018/compression|
|bmshj2018_factorized|Lossy|Deep learning based compressor|level[1-8]|None|https://github.com/InterDigitalInc/CompressAI|
|cheng2020_anchor|Lossy|Deep learning based compressor|level[1-6]|None|https://github.com/cheng2020/compression|
|Zlib|Lossless|General purpose compressor|level[0-9]|numcodecs|Built-in|
|Blosc|Lossless|High-performance compressor|level[0-9]|numcodecs|None|
|LZ4|Lossless|Fast compression|level[0-12]|numcodecs|None|
|Zstd|Lossless|Fast general-purpose compressor|level[0-9]|numcodecs|https://numcodecs.readthedocs.io/en/stable/compression/zstd.html|
|GZip|Lossless|gzip compression using zlib|level[0-9]|numcodecs||
|FPZip|Lossy|Floating-point compressor|rel_precision,bit_precision[0-64]|FPZip,numcodecs|https://github.com/LLNL/fpzip|




**ii. Error Bound**

The compressor module supports multiple error bounds for lossy compression. The error bound determines how much error is allowed during compression.

**Supported Error Bounds**

|Error Bound|Parameter in Code|Formula|Range|
|---|---|---|---|
|Absolute Error|abs_precision|max(\|x - x'\|) ≤ ε|0 < ε < 1|
|Relative Error|rel_precision|max(\|x - x'\|/\|x\|) ≤ ε|0 < ε < 1|
|Bit Precision|bit_precision||0 < bit_precision < 64|
|Mean Absolute Error|mae|mean(\|x - x'\|) ≤ ε|0 < ε < 1|
|Fixed Ratio|fixed_ratio|size(x')/size(x) ≤ ratio|0 < ratio < 1|
|Level of Detail|level|1 (Z_BEST_SPEED) is fastest and produces the least compression; 9 (Z_BEST_COMPRESSION) is slowest and produces the most; 0 (Z_NO_COMPRESSION) is no compression.|Method-dependent (see the table of compression methods)|

Here x is the original data, x' is the reconstructed (decompressed) data, and ε is the error bound.
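The formulas in the error bound table can be checked directly with NumPy once an array has been decoded. The sketch below is independent of the compressor module: `error_metrics` is a hypothetical helper, and the lossy round trip is imitated by uniform quantization with step 2ε, which keeps every value within ε of the original (up to floating-point rounding).

```python
import numpy as np

def error_metrics(x, x_rec):
    """Compute the error measures from the table for data x and its reconstruction x_rec."""
    x = np.asarray(x, dtype=np.float64)
    x_rec = np.asarray(x_rec, dtype=np.float64)
    abs_err = np.abs(x - x_rec)
    nonzero = x != 0  # the relative error is undefined where x == 0
    rel = (abs_err[nonzero] / np.abs(x[nonzero])).max() if nonzero.any() else 0.0
    return {
        "max_abs_error": float(abs_err.max()),
        "max_rel_error": float(rel),
        "mean_abs_error": float(abs_err.mean()),
    }

# Stand-in for a lossy encode/decode: quantize to a grid with spacing 2 * eps.
rng = np.random.default_rng(0)
x = rng.random((10, 32, 32)).astype(np.float32).astype(np.float64)
eps = 1e-3
x_rec = np.round(x / (2 * eps)) * (2 * eps)

metrics = error_metrics(x, x_rec)
print(metrics)
```

The same helper can be applied to the output of a real decode call to confirm that the requested bound was respected.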


In [3]:
# List the available compressors.
print("Available compressors:", sc.compression.supported_compressors)
# List the error bounds supported by a given compressor, here SZ3.
print(sc.compression.supported_error_bounds['sz3'])

Available compressors: ['zlib', 'gzip', 'bz2', 'lzma', 'lz4', 'blosc', 'fpzip', 'zstd', 'sz3', 'mgard', 'zfp', 'mbt2018', 'cheng2020_anchor', 'bmshj2018_factorized']
['abs_precision']
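Because each method accepts only certain error modes, it can help to validate a configuration before constructing a compressor. The sketch below is a hypothetical helper, not part of scicodec: the `SUPPORTED_MODES` mapping is transcribed by hand from the table of compression methods above, and `check_config` is an illustrative name.

```python
# Supported error modes per method, copied from the table above (not read from the library).
SUPPORTED_MODES = {
    "sz3": {"abs_precision"},
    "mgard": {"abs_precision"},
    "zfp": {"fixed_ratio", "bit_precision"},
    "fpzip": {"rel_precision", "bit_precision"},
    "mbt2018": {"level"},
    "bmshj2018_factorized": {"level"},
    "cheng2020_anchor": {"level"},
    "zlib": {"level"},
    "blosc": {"level"},
    "lz4": {"level"},
    "zstd": {"level"},
    "gzip": {"level"},
}

def check_config(method, **params):
    """Raise ValueError if a parameter is not a listed error mode for the method."""
    key = method.lower()
    if key not in SUPPORTED_MODES:
        raise ValueError(f"unknown method: {method!r}")
    unsupported = set(params) - SUPPORTED_MODES[key]
    if unsupported:
        raise ValueError(f"{method} does not support: {sorted(unsupported)}")
    return {"method": key, **params}

print(check_config("sz3", abs_precision=1e-3))
```

In practice the mapping could be built from `sc.compression.supported_error_bounds` instead of being hard-coded, assuming that attribute lists every method the way it does for `'sz3'`.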


In [7]:
# Each line below initializes one of the supported compressors.
# Only the last assignment takes effect; keep the one you need.

# compressor = sc.compression.Compressor(method='sz3', abs_precision=1e-3)
compressor = sc.compression.Compressor(method='zlib', level=1)
compressor = sc.compression.Compressor(method='lz4', level=1)
compressor = sc.compression.Compressor(method='gzip', level=1)
compressor = sc.compression.Compressor(method='zstd', level=1)
compressor = sc.compression.Compressor(method='blosc', level=1)
compressor = sc.compression.Compressor(method='fpzip', rel_precision=0.0001)
compressor = sc.compression.Compressor(method='sz3', abs_precision=1)
compressor = sc.compression.Compressor(method='mgard', abs_precision=1)

compressor = sc.compression.Compressor(method='cheng2020_anchor', level=5)
compressor = sc.compression.Compressor(method='mbt2018', level=5)
compressor = sc.compression.Compressor(method='bmshj2018_factorized', level=5)
compressor = sc.compression.Compressor(method='zfp', bit_precision=5)

**Step 2. Encode the Structured Array.**


In [8]:
# Encode the array into compressed bitstrings and report their length.
bitstrings = compressor.encode(example_array)
len(bitstrings)

1414904

**Step 3. Decode**

In [29]:
# Decode the bitstrings back to an array.
decompressed_array = compressor.decode(bitstrings)
decompressed_array.shape



{'quality': 5, 'pretrained': True, 'device': 'cpu'}


(100, 255, 255)
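For comparison, a lossless round trip can be reproduced with Python's built-in `zlib` alone. The sketch below does not use scicodec; it only illustrates what any encode/decode pair has to preserve: the shape, the dtype, and (for a lossless method) every value bit for bit.

```python
import zlib
import numpy as np

rng = np.random.default_rng(0)
array = rng.random((4, 16, 16)).astype(np.float32)

# Encode: serialize the raw bytes and compress them.
# Shape and dtype are not stored in the payload and must be kept alongside it.
payload = zlib.compress(array.tobytes(), level=1)

# Decode: decompress, reinterpret the bytes, and restore the original shape.
restored = np.frombuffer(zlib.decompress(payload), dtype=array.dtype).reshape(array.shape)

print(len(payload), restored.shape, np.array_equal(array, restored))
```

For a lossy method the final equality check would be replaced by a comparison against the chosen error bound.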

**Step 4. Check the efficiency metrics.**

|Metric|Unit|Explanation|
|---|---|---|
|Compression Ratio|dimensionless (>1)|Ratio of original size to compressed size; higher values mean better compression|
|Encoding Speed|MB/s|Speed of compression in megabytes per second|
|Decoding Speed|MB/s|Speed of decompression in megabytes per second|
|Bitrate|bits/byte|Number of bits needed to represent each byte of original data after compression|
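These definitions can be made concrete with a small, self-contained measurement. The sketch below uses Python's built-in `zlib` rather than the compressor module, assumes 1 MB = 10^6 bytes, and uses a hypothetical helper named `measure`; it is meant to show how the four quantities relate, not how the module computes them internally.

```python
import time
import zlib
import numpy as np

def measure(array, level=1):
    """Time a zlib round trip and derive the metrics defined in the table."""
    raw = array.tobytes()
    t0 = time.perf_counter()
    compressed = zlib.compress(raw, level)
    t1 = time.perf_counter()
    zlib.decompress(compressed)
    t2 = time.perf_counter()
    megabytes = len(raw) / 1e6  # assumption: 1 MB = 10**6 bytes
    return {
        "compression_ratio": len(raw) / len(compressed),
        "encoding_speed": megabytes / max(t1 - t0, 1e-9),
        "decoding_speed": megabytes / max(t2 - t1, 1e-9),
        "bitrate": 8 * len(compressed) / len(raw),
    }

# An all-zero array is a highly compressible stand-in for real data.
smooth = np.zeros((10, 64, 64), dtype=np.float32)
result = measure(smooth)
print(result)
```

With these definitions the bitrate is always 8 divided by the compression ratio, so either one determines the other.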


In [6]:
compressor.efficiency_metrics 

{'encoding_speed': 33.65233617532337,
 'compression_ratio': 765000.0,
 'original_bits': 26010000,
 'decoding_speed': 4.0303783843694816e-05}

## 2. Compress the Xarray


In [7]:
import xarray as xr

# Load an example Xarray dataset.
example_ds = sc.dataset.download_benchmark("wrf_short")
example_ds

Dataset wrf_short already exists at ./datasets/wrf_short.nc, reloaded




**Step 1. Create Encoding Dictionary**

The compressor can be initialized with different methods and parameters for different variables. The encoding dictionary maps each variable name to its compression settings:

- compressor: compression method ('sz3', 'zfp', etc.)
- an error-bound parameter supported by that method, such as abs_precision (absolute error tolerance)

For example:
```python
encode_dict = {
    "var1": {"compressor": "sz3", "abs_precision": 1e-3},
    "var2": {"compressor": "zfp", "bit_precision": 16}
}
```


In [8]:
# Example 1: zlib for T2 and PSFC, fpzip for all other variables.
encode_dict = {
    "T2": {"compressor": "zlib", "level": 1},
    'U10': {'compressor': "fpzip", 'rel_precision': 1e-4},
    'V10': {'compressor': "fpzip", 'rel_precision': 1e-4},    
    "PSFC": {"compressor": "zlib", "level": 1},
    'LAI': {'compressor': "fpzip", 'rel_precision': 1e-4},
    'ALBEDO': {'compressor': "fpzip", 'rel_precision': 1e-4},     
    'RAINC': {'compressor': "fpzip", 'rel_precision': 1e-4},     
}


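# Example 2: mix lossless, error-bounded, and deep learning based compressors per variable.
# This assignment replaces any earlier encode_dict, so it is the one used below.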
encode_dict = {
    "T2": {"compressor": "zlib", "level": 1},
    'U10': {'compressor': "sz3", 'abs_precision': 1},
    'V10': {'compressor': "bmshj2018_factorized", 'level': 1},    
    "PSFC": {"compressor": "zlib", "level": 1},
    'LAI': {'compressor': "cheng2020_anchor", 'level': 1},
    'ALBEDO': {'compressor': "mbt2018", 'level': 1},     
    'RAINC': {'compressor': "fpzip", 'rel_precision': 1e-4},     
}



compressor = sc.compression.Compressor(method=encode_dict)


**Step 2. Compress and Decompress**

In [9]:
# Encode the dataset and save the compressed variables and efficiency metrics under save_dir.
saved_dir = "./datasets/Output"
compressor.encode(example_ds, save_dir=saved_dir)


Processing variables: 100%|██████████| 7/7 [00:10<00:00,  1.52s/it]


'./datasets/Output'

The structure under the given save directory:
```
saved_dir/
├── variable1/
│   └── bin
├── variable2/
│   └── bin 
├── meta.nc
├── efficiency_metrics.pkl
└── encoding_list.yaml
```

In [10]:

# During decoding, the compression method and error bound are detected automatically,
# so the compressor does not need to be configured again.
# Any one of the following initializations works:
compressor = sc.compression.Compressor(method=encode_dict)  # or
compressor = sc.compression.Compressor()  # or
compressor = sc.compression.Compressor(method=None)

decompressed_ds = compressor.decode(saved_dir)


# Check the efficiency performance of the compressor
print("\nCompression Results:")
compressor.efficiency_metrics  # a DataFrame




Compression Results:


Unnamed: 0,variable,method,encoding_speed,decoding_speed,compression_ratio,original_bits
0,T2,zlib,52.997772,121.9721,1.447572,14773500
1,U10,sz3,248.906973,5.624439,110.983,14773500
2,V10,bmshj2018_factorized,41.559505,1.459105e-05,2110500.0,14773500
3,PSFC,zlib,47.392278,121.706,1.42698,14773500
4,LAI,cheng2020_anchor,3.252694,3.268071e-07,2110500.0,14773500
5,ALBEDO,mbt2018,2.79392,5.123799e-07,2110500.0,14773500
6,RAINC,fpzip,446.073251,0.01251549,26475.81,14773500


**Step 3. (Optional) Encode and decode using a single method for all variables.**

In [11]:

compressor = sc.compression.Compressor(method='zlib', level=5)

# Compress the Xarray dataset.
compressor.encode(example_ds, save_dir=saved_dir)

# Decode the Xarray dataset.
decompressed_ds = compressor.decode(saved_dir)

# Check the efficiency performance of the compressor.
print(decompressed_ds["T2"].attrs["encoding_speed"])
# or
compressor.efficiency_metrics

Processing variables: 100%|██████████| 7/7 [00:02<00:00,  2.42it/s]


23.88225941838571


Unnamed: 0,variable,method,encoding_speed,decoding_speed,compression_ratio,original_bits
0,T2,zlib,23.882259,128.37797,1.445869,14773500
1,U10,zlib,26.47462,175.943204,1.1277,14773500
2,V10,zlib,27.17869,179.016489,1.118393,14773500
3,PSFC,zlib,24.469109,133.017528,1.398495,14773500
4,LAI,zlib,38.120857,136.183765,2.012842,14773500
5,ALBEDO,zlib,60.880351,99.34679,3.55642,14773500
6,RAINC,zlib,430.591887,1.033761,1027.65025,14773500


## Find the Optimal Compression Parameters for Each Variable

In [1]:
import scicodec as sc
example_ds = sc.dataset.download_benchmark("wrf")
compressor = sc.compression.Compressor(method="sz3", abs_precision=0.0001)
config = compressor.find_config_with_rate(example_ds, compression_rate=100)
print(config)


Dataset wrf already exists at ./datasets/wrf.nc, reloaded


Finding optimal configs:   0%|          | 0/7 [00:00<?, ?it/s]

Variable: T2, Compression Rate: 5144.538230485533, Absolute Precision: 5.0
Variable: T2, Compression Rate: 696.8000920095855, Absolute Precision: 2.5
Variable: T2, Compression Rate: 192.59743044952424, Absolute Precision: 1.25
Variable: T2, Compression Rate: 86.0904166817779, Absolute Precision: 0.625
Variable: T2, Compression Rate: 133.95661914883718, Absolute Precision: 0.9375
Variable: T2, Compression Rate: 109.08582995037492, Absolute Precision: 0.78125
Variable: T2, Compression Rate: 97.38423492392207, Absolute Precision: 0.703125
Variable: T2, Compression Rate: 103.22229311332623, Absolute Precision: 0.7421875


Finding optimal configs:  14%|█▍        | 1/7 [00:15<01:35, 15.95s/it]

Variable: T2, Compression Rate: 100.25322664538045, Absolute Precision: 0.72265625
Variable: U10, Compression Rate: 1645.250524882256, Absolute Precision: 5.0
Variable: U10, Compression Rate: 350.07002789041843, Absolute Precision: 2.5
Variable: U10, Compression Rate: 120.53086261388862, Absolute Precision: 1.25
Variable: U10, Compression Rate: 55.67369894525422, Absolute Precision: 0.625
Variable: U10, Compression Rate: 85.29510710336056, Absolute Precision: 0.9375
Variable: U10, Compression Rate: 102.08249347963364, Absolute Precision: 1.09375
Variable: U10, Compression Rate: 93.4526648960858, Absolute Precision: 1.015625
Variable: U10, Compression Rate: 97.70873316206041, Absolute Precision: 1.0546875


Finding optimal configs:  29%|██▊       | 2/7 [00:32<01:20, 16.16s/it]

Variable: U10, Compression Rate: 99.92056462868702, Absolute Precision: 1.07421875
Variable: V10, Compression Rate: 971.3641998056886, Absolute Precision: 5.0
Variable: V10, Compression Rate: 267.2543258028602, Absolute Precision: 2.5
Variable: V10, Compression Rate: 105.73491220438495, Absolute Precision: 1.25
Variable: V10, Compression Rate: 52.089852539800496, Absolute Precision: 0.625
Variable: V10, Compression Rate: 77.08804269777336, Absolute Precision: 0.9375
Variable: V10, Compression Rate: 91.02728063512865, Absolute Precision: 1.09375
Variable: V10, Compression Rate: 98.24738854576162, Absolute Precision: 1.171875


Finding optimal configs:  43%|████▎     | 3/7 [00:48<01:04, 16.16s/it]

Variable: V10, Compression Rate: 102.00579135879812, Absolute Precision: 1.2109375
Variable: V10, Compression Rate: 100.04473330080002, Absolute Precision: 1.19140625
Variable: PSFC, Compression Rate: 18.535765001718325, Absolute Precision: 5.0
Variable: PSFC, Compression Rate: 19.50120251638462, Absolute Precision: 7.5
Variable: PSFC, Compression Rate: 20.315630372385282, Absolute Precision: 8.75
Variable: PSFC, Compression Rate: 20.335556126444512, Absolute Precision: 9.375
Variable: PSFC, Compression Rate: 20.49222597432219, Absolute Precision: 9.6875
Variable: PSFC, Compression Rate: 20.839050021122976, Absolute Precision: 9.84375
Variable: PSFC, Compression Rate: 20.582979955909206, Absolute Precision: 9.921875
Variable: PSFC, Compression Rate: 20.022335744086295, Absolute Precision: 9.9609375
Variable: PSFC, Compression Rate: 19.956168349995203, Absolute Precision: 9.98046875


Finding optimal configs:  57%|█████▋    | 4/7 [01:16<01:02, 20.73s/it]

Variable: PSFC, Compression Rate: 18.34971023180687, Absolute Precision: 9.990234375
Variable: LAI, Compression Rate: 3657.9210597458773, Absolute Precision: 5.0
Variable: LAI, Compression Rate: 932.3402476474412, Absolute Precision: 2.5
Variable: LAI, Compression Rate: 872.2829765790701, Absolute Precision: 1.25
Variable: LAI, Compression Rate: 292.11058650310986, Absolute Precision: 0.625
Variable: LAI, Compression Rate: 462.15813724720005, Absolute Precision: 0.3125
Variable: LAI, Compression Rate: 283.8413305987792, Absolute Precision: 0.15625
Variable: LAI, Compression Rate: 235.72670488596927, Absolute Precision: 0.078125
Variable: LAI, Compression Rate: 36.92962101875914, Absolute Precision: 0.0390625
Variable: LAI, Compression Rate: 124.439337127953, Absolute Precision: 0.05859375


Finding optimal configs:  71%|███████▏  | 5/7 [01:30<00:37, 18.54s/it]

Variable: LAI, Compression Rate: 178.12729148501174, Absolute Precision: 0.048828125
Variable: ALBEDO, Compression Rate: 1663604.5081967213, Absolute Precision: 5.0
Variable: ALBEDO, Compression Rate: 1663604.5081967213, Absolute Precision: 2.5
Variable: ALBEDO, Compression Rate: 1663604.5081967213, Absolute Precision: 1.25
Variable: ALBEDO, Compression Rate: 270432.7115256496, Absolute Precision: 0.625
Variable: ALBEDO, Compression Rate: 49928.59778597786, Absolute Precision: 0.3125
Variable: ALBEDO, Compression Rate: 2671.858955793686, Absolute Precision: 0.15625
Variable: ALBEDO, Compression Rate: 280.1367145040907, Absolute Precision: 0.078125
Variable: ALBEDO, Compression Rate: 660.3441628477865, Absolute Precision: 0.0390625
Variable: ALBEDO, Compression Rate: 1333.6646252513438, Absolute Precision: 0.01953125


Finding optimal configs:  86%|████████▌ | 6/7 [01:44<00:16, 16.80s/it]

Variable: ALBEDO, Compression Rate: 849.2306243919788, Absolute Precision: 0.009765625
Variable: RAINC, Compression Rate: 1663604.5081967213, Absolute Precision: 5.0
Variable: RAINC, Compression Rate: 1663604.5081967213, Absolute Precision: 2.5
Variable: RAINC, Compression Rate: 1663604.5081967213, Absolute Precision: 1.25
Variable: RAINC, Compression Rate: 1663604.5081967213, Absolute Precision: 0.625
Variable: RAINC, Compression Rate: 1663604.5081967213, Absolute Precision: 0.3125
Variable: RAINC, Compression Rate: 1663604.5081967213, Absolute Precision: 0.15625
Variable: RAINC, Compression Rate: 1663604.5081967213, Absolute Precision: 0.078125
Variable: RAINC, Compression Rate: 1663604.5081967213, Absolute Precision: 0.0390625
Variable: RAINC, Compression Rate: 1656814.2857142857, Absolute Precision: 0.01953125


Finding optimal configs: 100%|██████████| 7/7 [01:56<00:00, 16.65s/it]

Variable: RAINC, Compression Rate: 1663604.5081967213, Absolute Precision: 0.009765625
{'T2': {'abs_precision': 0.72265625, 'compression_rate': 100.25322664538045}, 'U10': {'abs_precision': 1.07421875, 'compression_rate': 99.92056462868702}, 'V10': {'abs_precision': 1.19140625, 'compression_rate': 100.04473330080002}, 'PSFC': {'abs_precision': 9.990234375, 'compression_rate': 18.34971023180687}, 'LAI': {'abs_precision': 0.048828125, 'compression_rate': 178.12729148501174}, 'ALBEDO': {'abs_precision': 0.009765625, 'compression_rate': 849.2306243919788}, 'RAINC': {'abs_precision': 0.009765625, 'compression_rate': 1663604.5081967213}}
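The search log above starts at 5.0 and then halves or adjusts the step each time, which is consistent with a bisection over `abs_precision` on the interval (0, 10). The sketch below illustrates that idea only. It is not scicodec's implementation: `toy_rate` is a stand-in compressor (quantization followed by zlib), and the interval, step count, and tolerance are assumptions chosen to resemble the log.

```python
import zlib
import numpy as np

def toy_rate(data, abs_precision):
    """Stand-in lossy compressor: quantize to the bound, then zlib. Returns original/compressed size."""
    quantized = np.round(data / (2 * abs_precision)).astype(np.int32)
    return data.nbytes / len(zlib.compress(quantized.tobytes(), 1))

def bisect_precision(data, target_rate, low=0.0, high=10.0, steps=10, rel_tol=0.01):
    """Bisection on abs_precision, assuming the rate grows as the bound loosens."""
    best = None
    for _ in range(steps):
        mid = (low + high) / 2
        rate = toy_rate(data, mid)
        # Remember the candidate whose rate is closest to the target so far.
        if best is None or abs(rate - target_rate) < abs(best[1] - target_rate):
            best = (mid, rate)
        if abs(rate - target_rate) <= rel_tol * target_rate:
            break
        if rate > target_rate:
            high = mid  # too much compression: tighten the bound
        else:
            low = mid   # too little compression: loosen the bound
    return {"abs_precision": best[0], "compression_rate": best[1]}

rng = np.random.default_rng(0)
field = np.cumsum(rng.normal(size=(64, 64)), axis=1).astype(np.float32)
config = bisect_precision(field, target_rate=20)
print(config)
```

Bisection relies on the rate increasing with the error bound. The LAI and ALBEDO entries in the log show that real compressors do not always behave monotonically, which is one reason a search like this can end away from the target rate.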





In [1]:
import scicodec as sc
example_ds = sc.dataset.download_benchmark("wrf")
compressor = sc.compression.Compressor(method="bmshj2018_factorized", level=1)
config = compressor.find_config_with_rate(example_ds, compression_rate=100)
print(config)


Dataset wrf already exists at ./datasets/wrf.nc, reloaded


Finding optimal configs:   0%|          | 0/7 [00:00<?, ?it/s]

Variable: T2, Compression Rate: 565.0106900065142, Level: 3
Variable: T2, Compression Rate: 270.0117737837083, Level: 5


Finding optimal configs:  14%|█▍        | 1/7 [00:36<03:38, 36.40s/it]

Variable: T2, Compression Rate: 188.64814697768492, Level: 6
Variable: U10, Compression Rate: 510.59313505979907, Level: 3
Variable: U10, Compression Rate: 242.95500251382603, Level: 5


Finding optimal configs:  29%|██▊       | 2/7 [01:13<03:03, 36.77s/it]

Variable: U10, Compression Rate: 171.3564752665839, Level: 6
Variable: V10, Compression Rate: 499.60306910659165, Level: 3
Variable: V10, Compression Rate: 245.05417616490226, Level: 5


Finding optimal configs:  43%|████▎     | 3/7 [01:52<02:30, 37.72s/it]

Variable: V10, Compression Rate: 174.92368195153566, Level: 6
Variable: PSFC, Compression Rate: 559.1146880733439, Level: 3
Variable: PSFC, Compression Rate: 287.12538974744933, Level: 5


Finding optimal configs:  57%|█████▋    | 4/7 [02:31<01:54, 38.13s/it]

Variable: PSFC, Compression Rate: 207.8368495106161, Level: 6
Variable: LAI, Compression Rate: 234.630124158979, Level: 3
Variable: LAI, Compression Rate: 120.79715147871941, Level: 5


Finding optimal configs:  71%|███████▏  | 5/7 [03:08<01:15, 37.90s/it]

Variable: LAI, Compression Rate: 88.04744485965577, Level: 6
Variable: ALBEDO, Compression Rate: 259.4431093328561, Level: 3
Variable: ALBEDO, Compression Rate: 131.17704657385505, Level: 5


Finding optimal configs:  86%|████████▌ | 6/7 [03:45<00:37, 37.53s/it]

Variable: ALBEDO, Compression Rate: 95.15581925248017, Level: 6
Variable: RAINC, Compression Rate: 707.2507579189462, Level: 3
Variable: RAINC, Compression Rate: 366.537269604603, Level: 5


Finding optimal configs: 100%|██████████| 7/7 [04:23<00:00, 37.62s/it]

Variable: RAINC, Compression Rate: 265.46718410334086, Level: 6
{'T2': {'level': 6, 'compression_rate': 188.64814697768492}, 'U10': {'level': 6, 'compression_rate': 171.3564752665839}, 'V10': {'level': 6, 'compression_rate': 174.92368195153566}, 'PSFC': {'level': 6, 'compression_rate': 207.8368495106161}, 'LAI': {'level': 6, 'compression_rate': 88.04744485965577}, 'ALBEDO': {'level': 6, 'compression_rate': 95.15581925248017}, 'RAINC': {'level': 6, 'compression_rate': 265.46718410334086}}





In [1]:
import scicodec as sc
example_ds = sc.dataset.download_benchmark("wrf")
compressor = sc.compression.Compressor(method="zfp", bit_precision=15)
config = compressor.find_config_with_rate(example_ds, compression_rate=100)
print(config)



Dataset wrf already exists at ./datasets/wrf.nc, reloaded


Finding optimal configs:   0%|          | 0/7 [00:00<?, ?it/s]

Variable: T2, Compression Rate: 14.590827292874842, Bit Precision: 16
Variable: T2, Compression Rate: 87.81741691170998, Bit Precision: 7


Finding optimal configs:  14%|█▍        | 1/7 [00:05<00:33,  5.51s/it]

Variable: T2, Compression Rate: 134.6533415247108, Bit Precision: 3
Variable: T2, Compression Rate: 106.30528447278883, Bit Precision: 5
Variable: T2, Compression Rate: 96.180885305006, Bit Precision: 6
Variable: U10, Compression Rate: 3.6065673369128, Bit Precision: 16
Variable: U10, Compression Rate: 26.830770023954187, Bit Precision: 7


Finding optimal configs:  29%|██▊       | 2/7 [00:11<00:29,  5.82s/it]

Variable: U10, Compression Rate: 137.53679669520952, Bit Precision: 3
Variable: U10, Compression Rate: 65.97948248823838, Bit Precision: 5
Variable: U10, Compression Rate: 101.14609289345161, Bit Precision: 4
Variable: V10, Compression Rate: 3.592215339769547, Bit Precision: 16
Variable: V10, Compression Rate: 26.57481784128833, Bit Precision: 7
Variable: V10, Compression Rate: 136.6092053825278, Bit Precision: 3
Variable: V10, Compression Rate: 64.5517534206492, Bit Precision: 5


Finding optimal configs:  43%|████▎     | 3/7 [00:17<00:23,  5.93s/it]

Variable: V10, Compression Rate: 99.17620999425348, Bit Precision: 4
Variable: PSFC, Compression Rate: 15.27793669313862, Bit Precision: 16
Variable: PSFC, Compression Rate: 87.81741691170998, Bit Precision: 7
Variable: PSFC, Compression Rate: 134.6533415247108, Bit Precision: 3
Variable: PSFC, Compression Rate: 106.30528447278883, Bit Precision: 5


Finding optimal configs:  57%|█████▋    | 4/7 [00:23<00:17,  5.72s/it]

Variable: PSFC, Compression Rate: 96.180885305006, Bit Precision: 6
Variable: LAI, Compression Rate: 9.90500388763097, Bit Precision: 16
Variable: LAI, Compression Rate: 65.64342342388963, Bit Precision: 7


Finding optimal configs:  71%|███████▏  | 5/7 [00:27<00:10,  5.11s/it]

Variable: LAI, Compression Rate: 198.29738195546315, Bit Precision: 3
Variable: LAI, Compression Rate: 123.00829470632158, Bit Precision: 5
Variable: LAI, Compression Rate: 89.1832838260625, Bit Precision: 6
Variable: ALBEDO, Compression Rate: 12.364888328398582, Bit Precision: 16
Variable: ALBEDO, Compression Rate: 59.013788703858346, Bit Precision: 7


Finding optimal configs:  86%|████████▌ | 6/7 [00:31<00:05,  5.00s/it]

Variable: ALBEDO, Compression Rate: 135.23185201768624, Bit Precision: 3
Variable: ALBEDO, Compression Rate: 96.67604245062829, Bit Precision: 5
Variable: ALBEDO, Compression Rate: 117.12063278458456, Bit Precision: 4
Variable: RAINC, Compression Rate: 2019.7411631239552, Bit Precision: 16
Variable: RAINC, Compression Rate: 2019.7411631239552, Bit Precision: 24
Variable: RAINC, Compression Rate: 2019.7411631239552, Bit Precision: 28
Variable: RAINC, Compression Rate: 2019.7411631239552, Bit Precision: 30


Finding optimal configs: 100%|██████████| 7/7 [00:33<00:00,  4.81s/it]

Variable: RAINC, Compression Rate: 2019.7411631239552, Bit Precision: 31
Variable: RAINC, Compression Rate: 2019.7411631239552, Bit Precision: 32
{'T2': {'bit_precision': 6, 'compression_rate': 96.180885305006}, 'U10': {'bit_precision': 4, 'compression_rate': 101.14609289345161}, 'V10': {'bit_precision': 4, 'compression_rate': 99.17620999425348}, 'PSFC': {'bit_precision': 6, 'compression_rate': 96.180885305006}, 'LAI': {'bit_precision': 6, 'compression_rate': 89.1832838260625}, 'ALBEDO': {'bit_precision': 4, 'compression_rate': 117.12063278458456}, 'RAINC': {'bit_precision': 32, 'compression_rate': 2019.7411631239552}}





## 3. Batch Compression

In [1]:

import scicodec as sc
example_ds = sc.dataset.download_benchmark("wrf")


# Per-variable SZ3 settings, taken from the find_config_with_rate results above.
sz3_dict = {'T2': {'abs_precision': 0.72265625, 'compressor': 'sz3'},
 'U10': {'abs_precision': 1.07421875, 'compressor': 'sz3'}, 
 'V10': {'abs_precision': 1.19140625, 'compressor': 'sz3'}, 
 'PSFC': {'abs_precision': 9.990234375, 'compressor': 'sz3'}, 
 'LAI': {'abs_precision': 0.048828125, 'compressor': 'sz3'}, 
 'ALBEDO': {'abs_precision': 0.009765625, 'compressor': 'sz3'}, 
 'RAINC': {'abs_precision': 0.009765625, 'compressor': 'sz3'}}

    
# Per-variable ZFP settings, taken from the find_config_with_rate results above.
zfp_dict = {'T2': {'bit_precision': 6, 'compressor': 'zfp'},
 'U10': {'bit_precision': 4, 'compressor': 'zfp'}, 
 'V10': {'bit_precision': 4, 'compressor': 'zfp'}, 
 'PSFC': {'bit_precision': 6, 'compressor': 'zfp'}, 
 'LAI': {'bit_precision': 6, 'compressor': 'zfp'}, 
 'ALBEDO': {'bit_precision': 4, 'compressor': 'zfp'}, 
 'RAINC': {'bit_precision': 32, 'compressor': 'zfp'}}



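# Each key names one compression job. Its value is either a per-variable
# encoding dictionary or a single configuration for the whole dataset.
batch_dict = {
    "work1": sz3_dict,
    "work2": {"compressor": "bmshj2018_factorized", "level": 7},
    "work3": zfp_dict,
}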

saved_dir = "./datasets/batch_output"

compressor = sc.compression.BatchCompressor(method=batch_dict)

# Compress the Xarray dataset.
compressor.encode(example_ds, save_dir=saved_dir)

# Decode the Xarray dataset.
decompressed_ds = compressor.decode(saved_dir, output_dir=saved_dir)




Dataset wrf already exists at ./datasets/wrf.nc, reloaded


Batch Decoding: 100%|██████████| 3/3 [04:58<00:00, 99.35s/it] 
