Binary output #37

Open
rikigigi opened this issue Oct 2, 2020 · 1 comment

Comments

@rikigigi
Member

rikigigi commented Oct 2, 2020

@lorisercole
Right now, the default binary output is a pickled blob that I think is difficult for a first-time user to understand. Its content is:

['KAPPA_SCALE',
 'TEMPERATURE',
 'TSKIP',
 'UNITS',
 'VOLUME',
 '__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 'cepstral_log',
 'j_DT_FS',
 'j_Nyquist_f_THz',
 'j_PSD_FILTER_W_THz',
 'j_cospectrum',
 'j_fcospectrum',
 'j_flogpsd',
 'j_fpsd',
 'j_freqs_THz',
 'j_logpsd',
 'j_psd',
 'jf_DT_FS',
 'jf_Nyquist_f_THz',
 'jf_dct_Kmin_corrfactor',
 'jf_dct_aic_Kmin',
 'jf_dct_kappa',
 'jf_dct_kappa_THEORY_std',
 'jf_dct_logpsd',
 'jf_dct_logpsdK',
 'jf_dct_logpsdK_THEORY_std',
 'jf_dct_logtau',
 'jf_dct_logtau_THEORY_std',
 'jf_dct_psd',
 'jf_flogpsd',
 'jf_fpsd',
 'jf_freqs_THz',
 'jf_logpsd',
 'jf_psd',
 'jf_resample_log',
 'kappa_Kmin',
 'kappa_Kmin_std',
 'units',
 'write_old_binary']

Is it used by anyone, or anywhere in the code? Is it safe to change the default binary output to an equivalent of the human-readable output, but stored as numpy arrays?

@lorisercole
Member

The content of the default bin format is simply an object with those attributes.
However, I would also avoid splitting the binary output into many files: it does not make sense.

I think we can simplify this by saving many arrays/variables in a numpy or json file (we need to test this). Like this:

tc_dict = {
    'j': {
        'DT_FS': j.DT_FS,
        'KAPPA_SCALE': j.KAPPA_SCALE,
        'psd': j.psd,
         ...
    },
    'jf': {
        'DT_FS': jf.DT_FS,
        'KAPPA_SCALE': jf.KAPPA_SCALE,
        'psd': jf.psd,
         ...
    },
    ...
}

Or with less-readable code:

tc_dict = {
    'j': {},
    'jf': {},
    ...
}
attrs_to_save = ['DT_FS', 'KAPPA_SCALE', 'psd', ...]
for key in tc_dict.keys():
    for attr in attrs_to_save:
        tc_dict[key][attr] = getattr(locals()[key], attr)

(we should find a smarter solution if the dictionary is more deeply-nested)
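
One possible way to avoid relying on `locals()` is to pass the objects explicitly as a mapping from names to objects. This is only a sketch: the `collect_attrs` helper and the `SimpleNamespace` stand-ins (with made-up values) are hypothetical and are not part of the existing code; only the attribute names come from the snippets above.

```python
from types import SimpleNamespace

import numpy as np


def collect_attrs(objects, attrs_to_save):
    """Build {name: {attr: value}} from a mapping of names to objects.

    Attributes that an object does not have are skipped instead of raising.
    """
    return {
        name: {
            attr: getattr(obj, attr)
            for attr in attrs_to_save
            if hasattr(obj, attr)
        }
        for name, obj in objects.items()
    }


# Hypothetical stand-ins for the real current objects (values are made up).
j = SimpleNamespace(DT_FS=1.0, KAPPA_SCALE=2.0, psd=np.arange(4.0))
jf = SimpleNamespace(DT_FS=5.0, KAPPA_SCALE=2.0, psd=np.arange(2.0))

tc_dict = collect_attrs({'j': j, 'jf': jf}, ['DT_FS', 'KAPPA_SCALE', 'psd'])
```

Passing the objects explicitly keeps the mapping between dictionary keys and objects visible at the call site, and it works the same way inside functions or methods.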

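Then save it using numpy.savez('binary_output.npz', **tc_dict) or json.dump(tc_dict, open('binary_output.json', 'w')). (numpy.save only takes a single array; with savez a nested dictionary would need to be flattened first, and with json the arrays would need to be converted to lists.)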
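
A minimal sketch of the numpy route, assuming the nested dictionary only holds scalars and arrays. The `flatten` and `unflatten` helpers, the `.` key separator, and the sample values are assumptions made for illustration; the file is written to a temporary directory here.

```python
import os
import tempfile

import numpy as np


def flatten(nested, sep='.'):
    """Turn {'j': {'psd': arr}} into {'j.psd': arr}, for any nesting depth."""
    flat = {}
    for key, value in nested.items():
        if isinstance(value, dict):
            for sub_key, sub_value in flatten(value, sep).items():
                flat[f'{key}{sep}{sub_key}'] = sub_value
        else:
            flat[key] = value
    return flat


def unflatten(flat, sep='.'):
    """Inverse of flatten: rebuild the nested dictionary from dotted keys."""
    nested = {}
    for key, value in flat.items():
        *parents, leaf = key.split(sep)
        node = nested
        for parent in parents:
            node = node.setdefault(parent, {})
        node[leaf] = value
    return nested


# Made-up sample data with the same layout as tc_dict above.
tc_dict = {
    'j': {'DT_FS': 1.0, 'psd': np.arange(4.0)},
    'jf': {'DT_FS': 5.0, 'psd': np.arange(2.0)},
}

path = os.path.join(tempfile.mkdtemp(), 'binary_output.npz')
np.savez(path, **flatten(tc_dict))

# Scalars come back as 0-dimensional arrays; use float(...) to unwrap them.
with np.load(path) as data:
    restored = unflatten({key: data[key] for key in data.files})
```

Because every stored value is a plain array, the file can be read back with the default `allow_pickle=False`, which avoids the pickle dependency of the current format.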

We will then need functions to reconstruct the Currents objects, etc, from this binary file...

What do you think?

lorisercole added a commit that referenced this issue Nov 10, 2020
Working on #37.
A first draft of SportranBinaryFile.
The useful storable attributes (input_parameters, settings, current,
current_resampled, output_results) should be further defined/corrected.

TODO:
- Define proper data structures: SportranInput, SportranSettings, SportranOutput, ...
  These can be seen as inputs/outputs of a "Workflow". A workflow is defined, for example, in analysis.py.
- Define functions that collect all the data needed to save the current calculation/namespace and dump it into a SportranBinaryFile.
- Define functions that extract and use the data of a SportranBinaryFile to restore a calculation/data namespace.
@lorisercole lorisercole mentioned this issue Nov 10, 2020