Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deprecate _parse_atomic_densities in BaderAnalysis and fix Bader test setup #3656

Merged
merged 10 commits into from
Feb 26, 2024

Conversation

DanielYang59
Copy link
Contributor

@DanielYang59 DanielYang59 commented Feb 26, 2024

Summary

@DanielYang59

This comment was marked as resolved.

@DanielYang59 DanielYang59 changed the title Deprecate _parse_atomic_densities in BaderAnalysis Deprecate _parse_atomic_densities in BaderAnalysis and fix Bader test setup Feb 26, 2024
@DanielYang59
Copy link
Contributor Author

Test setup for Bader seems to have been fixed using https instead of http. Then every test workflow would get the following error:

_____________________ TestBaderAnalysis.test_atom_parsing ______________________

self = <tests.command_line.test_bader_caller.TestBaderAnalysis testMethod=test_atom_parsing>

    def test_atom_parsing(self):
        # test with reference file
        analysis = BaderAnalysis(
            chgcar_filename=f"{TEST_FILES_DIR}/CHGCAR.Fe3O4",
            potcar_filename=f"{TEST_FILES_DIR}/POTCAR.Fe3O4",
            chgref_filename=f"{TEST_FILES_DIR}/CHGCAR.Fe3O4_ref",
            parse_atomic_densities=True,
        )
    
        assert len(analysis.atomic_densities) == len(analysis.chgcar.structure)
    
>       assert np.sum(analysis.chgcar.data["total"]) == approx(
            np.sum([dct["data"] for dct in analysis.atomic_densities])
        )

Maybe I still should zero-pad them in the test (hopefully fix the issue)? @janosh (Or just remove/supress this test if no one is using it anyway?)

@janosh
Copy link
Member

janosh commented Feb 26, 2024

Maybe I still should zero-pad them in the test

yes, we should fix the test instead of removing it while the functionality is still available

@DanielYang59
Copy link
Contributor Author

DanielYang59 commented Feb 26, 2024

Should be good for now, please review @janosh .

Though everything seems to be working, there is still a minor issue here: As Bader itself cannot work directly on compressed files like .gz, BaderAnalysis would need to decompress corresponding files during runtime:

with ScratchDir("."):
if chgcar_filename:
self.is_vasp = True
# decompress the file if compressed
fpath = chgcar_fpath = decompress_file(filepath=chgcar_filename) or chgcar_filename
self.chgcar = Chgcar.from_file(chgcar_fpath)
self.structure = self.chgcar.structure
self.potcar = Potcar.from_file(potcar_filename) if potcar_filename else None
self.natoms = self.chgcar.poscar.natoms
chgref_fpath = decompress_file(filepath=chgref_filename) or chgref_filename
self.reference_used = bool(chgref_filename)
# List of nelects for each atom from potcar
potcar_indices = []
for idx, val in enumerate(self.natoms):
potcar_indices += [idx] * val
self.nelects = (
[self.potcar[potcar_indices[i]].nelectrons for i in range(len(self.structure))]
if self.potcar
else []
)
else:
self.is_vasp = False
fpath = cube_fpath = decompress_file(filepath=cube_filename) or cube_filename
self.cube = VolumetricData.from_cube(cube_fpath)
self.structure = self.cube.structure
self.nelects = []
chgref_fpath = decompress_file(filepath=chgref_filename) or chgref_filename
self.reference_used = bool(chgref_filename)

This leads to some issues:

  • file.gz would be converted to file after BaderAnalysis initialization.
  • Run unit test with test_bader_caller would decompress test files too (As we're working towards compressing test files).

Meanwhile I think it might be better not to alter files in place, both during runtime and test time. As such I think it might make sense to alter the behaviour of BaderAnalysis a bit to decompress the test file in to a duplicate (and remove it afterwards). I would pin myself here to do this after cleaning up test files.

@janosh
Copy link
Member

janosh commented Feb 26, 2024

decompress the test file in to a duplicate (and remove it afterwards). I would pin myself here to do this after cleaning up test files.

that sounds good! by "after cleaning up test files" do you mean in or after #3653?

@DanielYang59
Copy link
Contributor Author

by "after cleaning up test files" do you mean in or after #3653?

I think after #3653, still to make each PR on a dedicated topic 😃

Copy link
Member

@janosh janosh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great work @DanielYang59! love all your contributions 👍

@janosh janosh merged commit d490eff into materialsproject:master Feb 26, 2024
21 of 22 checks passed
@DanielYang59 DanielYang59 deleted the deprecate-bader branch February 26, 2024 08:51
@DanielYang59
Copy link
Contributor Author

Thanks for reviewing and those helpful suggestions @janosh

@janosh janosh added tests Issues with or changes to the pymatgen test suite cli Command line interface charge Electric charge analysis, partitioning, integrations, etc. labels Mar 1, 2024
holds all the data within a precision.
"""
total = np.sum(data)
for idx in range(np.max(data.shape)):
Copy link
Contributor Author

@DanielYang59 DanielYang59 Mar 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another note for myself:

  • This method bader_caller is overall slow, or the bottleneck could be calling bader with subprocess itself, and might need some profiling to identify the bottleneck.

  • Try replacing linear search with bisection method to see if performance could be improved. However it's also important to note this should search for the MINIMUM encompassing volume, pure bisection might not be suitable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
charge Electric charge analysis, partitioning, integrations, etc. cli Command line interface tests Issues with or changes to the pymatgen test suite
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants