
Using JSON to import comsol results #4064

Merged: 11 commits merged into pybamm-team:develop on May 10, 2024

Conversation

@santacodes (Contributor) commented May 6, 2024

Description

    # Load the COMSOL results exported as JSON
    comsol_variables = json.load(open(comsol_results_path, "rb"))
    # JSON stores arrays as nested lists; convert to a NumPy array for plotting
    variable = np.array(comsol_variables["variable_name"])
    

Data imported from JSON defaults to the list datatype, so a bit of pre-processing is needed before it can be used for plotting: np.array() converts the multi-dimensional list into a multi-dimensional NumPy array.

Addresses part of #4026

Type of change

Please add a line in the relevant section of CHANGELOG.md to document the change (include PR #) - note reverse order of PR #s. If necessary, also add to the list of breaking changes.

  • New feature (non-breaking change which adds functionality)
  • Optimization (back-end change that speeds up the code)
  • Bug fix (non-breaking change which fixes an issue)

Key checklist:

  • No style issues: $ pre-commit run (or $ nox -s pre-commit) (see CONTRIBUTING.md for how to set this up to run automatically when committing locally, in just two lines of code)
  • All tests pass: $ python run-tests.py --all (or $ nox -s tests)
  • The documentation builds: $ python run-tests.py --doctest (or $ nox -s doctests)

You can run integration tests, unit tests, and doctests together at once, using $ python run-tests.py --quick (or $ nox -s quick).

Further checks:

  • Code is commented, particularly in hard-to-understand areas
  • Tests added that prove fix is effective or that feature works

Check out this pull request on ReviewNB to see visual diffs and provide feedback on Jupyter notebooks. (Powered by ReviewNB)

@kratman (Contributor) commented May 6, 2024

If this works, then I am happy with it. It looks like the lychee failure is unrelated.

@kratman (Contributor) commented May 6, 2024

Lychee is fixed in #4065.

codecov bot commented May 6, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 99.58%. Comparing base (fd7d1fd) to head (1b7a7a1).

Additional details and impacted files
@@           Coverage Diff            @@
##           develop    #4064   +/-   ##
========================================
  Coverage    99.58%   99.58%           
========================================
  Files          260      260           
  Lines        21358    21358           
========================================
  Hits         21270    21270           
  Misses          88       88           


@santacodes (Contributor, Author)

import json
import pickle

files = [
    "comsol_01C.json",
    "comsol_05C.json",
    "comsol_1C.json",
    "comsol_1plus1D_3C.json",
    "comsol_2C.json",
    "comsol_3C.json",
]

for file in files:
    # Load the original pickled results
    with open(file.split(".")[0] + ".pickle", "rb") as infile:
        pickleobj = pickle.load(infile)

    # JSON cannot serialize NumPy arrays, so convert each array to a
    # nested list; 'solution_time' holds None rather than an array
    for key in pickleobj:
        if key == "solution_time":
            pickleobj[key] = None
        else:
            pickleobj[key] = pickleobj[key].tolist()

    with open(file, "w", encoding="utf-8") as outfile:
        json.dump(pickleobj, outfile, ensure_ascii=False, indent=4)

For reference, the code above is what I used to convert the pickled data with multi-dimensional arrays to JSON. JSON doesn't allow ndarray serialization, so I had to convert the multi-dimensional arrays to nested lists before writing them to JSON. Only the 'solution_time' key had a NoneType value rather than an array, which is why the extra if statement is there to handle that edge case.

@agriyakhetarpal (Member) left a comment

Thanks, @santacodes! The Windows failure is unrelated; I just triggered a re-run for it. I haven't verified all of the JSON files, but I did check one or two of them on my machine (one can't see the diffs on GitHub – they are quite big!¹), and after the conversion to arrays the elements seem to match one-for-one. This is better than the previous iteration, where I felt there were missing values due to the presence of ... in the strings, which is great.

For the example notebook tests, it looks like there is one more file that you need to fix :) Otherwise, the changes look quite good to me.

Footnotes

  1. GitHub displays the number of lines changed as +878,463 −36. I am assuming this is because 2D data was converted to 1D, and values from keys such as c_n_surf, c_e, c_p_surf, phi_n, phi_p, and so on are being displayed on a single line as multiple []-style JSON arrays.

(Review comments on MANIFEST.in and .gitignore were marked outdated and resolved.)
@valentinsulzer (Member) left a comment

Looks good to me other than Agriya's requested changes. It would be good to keep the COMSOL results in a separate location, e.g. using pooch, as discussed on Slack. That can be a separate PR.

@agriyakhetarpal (Member)

> Would be good to keep the COMSOL results in a separate location e.g. using pooch as discussed on Slack. That can be a separate PR

The only trouble I feel this will cause is that it would require an active internet connection for anyone who wishes to use such data for their experiments. Then again, downloading and installing PyBaMM requires an internet connection anyway... so I agree that this can be done overall. @santacodes, would you be willing to take this further? I just created a new repository for this at https://github.com/pybamm-team/pybamm-data, and named it as such rather than giving it a COMSOL-specific name, because we can move other CSV files for drive cycles, Enertech cells, etc. there as well. I'll copy the files over once this PR gets merged.

@agriyakhetarpal (Member)

In the absence of a network connection, we will have to skip the relevant tests that use these data files. The scikit-image test suite has examples of this behaviour: pytest can be configured with custom markers, i.e., @pytest.mark.network, and the test suite can then be run with pytest -m "not network".
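
A minimal sketch of that marker pattern, assuming the marker is registered in conftest.py (the test name and body are hypothetical placeholders):

    import pytest

    # conftest.py: register the custom marker so pytest does not warn about it
    def pytest_configure(config):
        config.addinivalue_line("markers", "network: test requires internet access")

    # A hypothetical test that downloads the COMSOL JSON files before comparing
    @pytest.mark.network
    def test_compare_against_comsol_data():
        ...

Offline runs can then deselect these tests with pytest -m "not network".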

@santacodes (Contributor, Author)

> @santacodes, would you be willing to take this further? I just created a new repository for this at https://github.com/pybamm-team/pybamm-data [...]

Sure, I am willing to work on it. I'll get started once this PR is merged.

@kratman (Contributor) commented May 9, 2024

> In the absence of a network connection, we will have to skip the relevant tests that use these data files.

Just an aside, but I am not able to use nox for testing without a connection anyway. I think handling the files so that they are stored elsewhere and downloaded only when they don't exist locally would solve most of the issues there.

@agriyakhetarpal (Member)

> Just an aside, but I am not able to use nox for testing without a connection anyway.

This is because nox installs the dependencies for you besides running the tests, so if you have your pip downloads cached and your virtual environments in place in .nox/, it should be able to function without an internet connection. You can also run nox -s unit --no-install to skip the installation step that requires internet access; that will start the test runner without executing any session.install("<...>") statements.

> I think handling the files so that they are stored elsewhere and downloaded only when they don't exist locally would solve most of the issues there.

Yes, pooch will handle this. It will store these files in your OS cache (or in a specified location if needed, though this is not recommended) and will raise an error if the files have been tampered with (by verifying their SHA-256 checksums). It re-downloads the files only if they don't exist.
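
For reference, a minimal sketch of the pooch pattern described here; the base URL and checksum below are placeholders rather than the real pybamm-data values:

    import pooch

    # Registry of known files and their SHA-256 checksums; pooch verifies the
    # hash on every fetch and re-downloads a file only when it is missing
    DATA = pooch.create(
        path=pooch.os_cache("pybamm"),  # per-OS cache directory
        base_url="https://example.com/pybamm-data/",  # placeholder URL
        registry={
            "comsol_1C.json": "sha256:<placeholder-checksum>",  # placeholder
        },
    )

    # Returns the local path, downloading and verifying the file if needed
    comsol_results_path = DATA.fetch("comsol_1C.json")

Code that currently reads the JSON from a hard-coded path could then call DATA.fetch() to obtain the file location instead.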

@santacodes (Contributor, Author)

pouch_cell_model.ipynb was failing the doctests, which seems like an unrelated issue. I left out MANIFEST.in since we will be moving the JSON files to a separate repository for pooch.

@agriyakhetarpal (Member) left a comment

Thanks, @santacodes! All of the notebook outputs look identical. Yes, leaving out the changes to MANIFEST.in is alright because these files will be moved to a separate repository soon anyway.

@arjxn-py (Member) left a comment

Should we add this commit hash to .git-blame-ignore-revs?

@agriyakhetarpal (Member)

> Should we add this commit hash to .git-blame-ignore-revs?

I don't think we should, because these are relevant code changes and they will not spoil the git blame (the revisions are not scattered across the codebase but are rather confined to a specific directory).

@agriyakhetarpal (Member)

I edited the PR description so that it does not close the linked issue; we can track the second half there. I'll edit the issue description.

@agriyakhetarpal merged commit edb139e into pybamm-team:develop on May 10, 2024. 42 checks passed.
@santacodes deleted the comsol branch on May 10, 2024 at 13:27.