Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Breaking: Return True for Element in Composition if Species.symbol matches Element #3184

Merged
merged 7 commits into from Jul 27, 2023

Conversation

janosh
Copy link
Member

@janosh janosh commented Jul 25, 2023

Closes #3185.

Element in Composition is currently False if that Element appears in Composition as a Species with an oxidation state.

This PR changes the __contains__ method to return True for x in Composition if

  1. x exactly matches a Species (both element symbol and oxidation state) in the composition (same as before)
  2. or if x is an Element and a Species with matching element symbol is present, regardless of oxidation state (returned False before).

It does not change behavior for the reverse, i.e. if a Species in Composition is still False if only its Element without oxidation state is in the Composition.

This is strictly speaking a breaking change but an improvement over previous behavior as it prevents unexpected behavior like Element('O/S') in comp returning False which caused MP 2020 correction scheme to not apply oxide/sulfide corrections if comp has oxidation states (see #3185).

I added unit tests for all edge cases incl. DummySpecies:

# Test Species in Composition
comp = Composition({Species("Fe2+"): 2})
assert Species("Fe2+") in comp
assert Species("Fe3+") not in comp
assert "Fe" in comp
assert Element("Fe") in comp

# Test Element in Composition with Species
comp = Composition({Species("Fe2+"): 2})
assert Element("Fe") in comp
assert Element("O") not in comp
assert "Fe" in comp
assert "O" not in comp

# Test str in Composition with Element
comp = Composition({"Fe": 2})
assert "Fe" in comp
assert "O" not in comp
assert Element("Fe") in comp
assert Element("O") not in comp

# Test int (atomic number) in Composition
comp = Composition({Element("Fe"): 2})
assert 26 in comp  # atomic number for Fe
assert 8 not in comp  # atomic number for O

# Test float in Composition
comp = Composition({Element("Fe"): 2})
with pytest.raises(TypeError, match="expected string or bytes-like object, got 'float'"):
    assert 1.5 in comp

# Test DummySpecies in Composition
comp = Composition({DummySpecies("X"): 2})
assert DummySpecies("X") in comp
assert DummySpecies("A") not in comp
assert "X" in comp
assert "Y" not in comp
assert Element("Fe") not in comp
assert Species("Fe2+") not in comp

- Species in Composition
- Element in Composition with Species
- str in Composition with Element
- int (atomic number) in Composition
- float in Composition
- DummySpecies in Composition
@janosh janosh added fix Bug fix PRs core Pymatgen core breaking Breaking change labels Jul 25, 2023
>       assert len(processed_entry.energy_adjustments) == 1
E       AssertionError: assert 2 == 1
expected wrong error msg
@janosh janosh changed the title Fix Composition.__contains__ Breaking: Return True for Element in Composition if Species.symbol matches Element Jul 26, 2023
@rkingsbury
Copy link
Contributor

rkingsbury commented Jul 26, 2023

Yikes, this is a pretty significant bug. So if I understand #3185 correctly, if one applies MP2020Compatibility to an entry that has been pre-processed such that the structure is decorated with oxidation states (as opposed to just carrying oxi_states in the .data attribute, then the oxide (or other) corrections would potentially be missed. Is that right?

I agree this should definitely be fixed as you've shown. Tagging @awvio in case she has any thoughts about impact on the correction scheme. My recollection is that when we developed and fit the new corrections, we always used un-decorated structures, and then either populated oxi_states ourselves or let oxi_state_guesses do it, so I don't believe this bug would have affected the fitted values or anything. I also don't think it should have affected anything in production (unless the build pipeline has recently changed to generate these Species containing structures)

@janosh
Copy link
Member Author

janosh commented Jul 26, 2023

if one applies MP2020Compatibility to an entry that has been pre-processed such that the structure is decorated with oxidation states (as opposed to just carrying oxi_states in the .data attribute, then the oxide (or other) corrections would potentially be missed. Is that right?

@rkingsbury That's correct. This came up in the context of CHGNet which can infer oxidation states on structures from predicted magmoms. Applying the correction scheme to these structures did not include any oxide/sulfide corrections (search for process_entries in this notebook for details).

I also don't think it should have affected anything in production (unless the build pipeline has recently changed to generate these Species containing structures)

I think so too. @esoteric-ephemera kindly checked all MP ComputedStructureEntries and found none containing oxidation states.

@janosh janosh merged commit e1f260c into master Jul 27, 2023
30 checks passed
@janosh janosh deleted the fix-composition-contains branch July 27, 2023 14:37
@awvio
Copy link
Contributor

awvio commented Jul 27, 2023

I believe @rkingsbury is correct, this should not have affected the corrections we fit since the entries should have had the oxidation states added during the fitting.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking Breaking change core Pymatgen core fix Bug fix PRs
Projects
None yet
Development

Successfully merging this pull request may close these issues.

MaterialsProject2020Compatibility doesn't apply oxide/sulfide corrections if O/S have oxidation states
3 participants