Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Chemical Element #49

Open
Tracked by #101
cortner opened this issue Aug 4, 2022 · 12 comments
Open
Tracked by #101

Chemical Element #49

cortner opened this issue Aug 4, 2022 · 12 comments

Comments

@cortner
Copy link
Member

cortner commented Aug 4, 2022

At the moment, the Atom type is defined as follows:

struct Atom{D, L<:Unitful.Length, V<:Unitful.Velocity, M<:Unitful.Mass}
    position::SVector{D, L}
    velocity::SVector{D, V}
    atomic_symbol::Symbol
    atomic_number::Int
    atomic_mass::M
    data::Dict{Symbol, Any}  # Store arbitrary data about the atom.
end

I notice two things: why do we need atomic symbol AND atomic number? Doesn't it define the same thing? Secondly, Symbol is not an isbbits type (which really tripped me up a little while ago).

I therefore propose to implement a (could be named something else of course)

struct ChemicalSpecies
    Z::Int 
end 

which is now isbits, and could be displayed as the symbol for readability.

@rkurchin
Copy link
Collaborator

rkurchin commented Aug 4, 2022

I agree.

The original reason for separating out some of these fields was to allow for a single particle to not necessarily represent an atom, but instead be more general to be e.g. a molecule or some other coarse-grained object. I think that's still possible, and one would then just not make use of the Atom type at all (and, correspondingly, species_type on the system would return something different.

We may still want to hold onto the data field, though, since otherwise ChemicalSpecies would essentially just be a rebrand of Element from PeriodicTable.jl.

@cortner
Copy link
Member Author

cortner commented Aug 4, 2022

Maybe this now goes back to the #43 discussion. Personally, I would really prefer something like this

@cortner
Copy link
Member Author

cortner commented Aug 4, 2022

We may still want to hold onto the data field, though, since otherwise ChemicalSpecies would essentially just be a rebrand of Element from PeriodicTable.jl

I would want the species to be ultra-light-weight for speed. So yes it represents the same thing but it is an entirely different implementation. In fact it could be used to index into the list of elements.

@cortner
Copy link
Member Author

cortner commented Aug 4, 2022

btw, the point about wrapping an integer is simply to distinguish the atomic number from an integer. There is no arithmetic on atomic numbers, one may want to dispatch on indexing, etc. I did this in JuLIP because my codes started getting confused when an integer was an index and when was it an atomic number.

@rkurchin
Copy link
Collaborator

rkurchin commented Aug 4, 2022

I think this also comes back to the question of "Does everyone need to use the same atom type?" Some people might prioritize it being lightweight over everything else, while others might favor ease of accessing associated data. At one point, we were addressing this by having an AbstractAtom type, but we got rid of it to make the overall interface lighter-weight. The main thing I wouldn't want to sacrifice in any case would be interoperability, i.e. my system should always be able to be converted into your system if they're both made of atoms and it otherwise makes sense...

Maybe there's some clever intermediate solution on the backend where the same type can be a minimalist lightweight thing but also be "lazily" instantiated to store additional stuff?

@cortner
Copy link
Member Author

cortner commented Sep 27, 2023

I don't think there has to be a compromise. If people are happy with the following idea, I can transfer this in from JuLIP:

struct ChemicalSymbol    # or ChemicalElement or .... I don't care about the name. 
    z::Int
end 

ChemicalElement(:Si) 
# ChemicalElement(14)
# but display as 
#  <Si> 

It would just need a list of symbols, and a dictionary for reverse lookup.

We really shouldn't be using integers for this, nor symbols, having a separate type gives us dispatch. Integers can be confused with indices (e.g. iteration) and you definitely don't want to do arithmetic on Chemical Elements.

@rkurchin
Copy link
Collaborator

rkurchin commented Oct 2, 2023

I like this...but I see I also liked it a year ago, so would be good for one or two other people to chime in, too... @mfherbst or @jgreener64, any opinions?

@mfherbst
Copy link
Member

mfherbst commented Oct 3, 2023

Personally I don't care how we represent element types. I agree a separate struct would be nice for dispatch and as long it is light weight and flexible it really does not matter how it stores things.

Regarding the old question of the atoms: The issue is that some file formats do allow you to use different symbols (e.g. to identify deuterium versus hydrogen) or change the atomic mass, which was originally the motivation for having these separated entries instead of a lookup in the default (feature-rich and slow) atom implementation. I think it makes a lot of sense to have a slimmer version, that essentially ties all identification of the atom to the atomic number and that's it.

@cortner
Copy link
Member Author

cortner commented Oct 3, 2023

My point is that it does matter. If you store them as symbols then anything wrapping that will not be isbits and that can have real performance implications.

The point about different symbols outside the standard list is important. This would also become relevant, e.g. for coarse-grained models where you don't have the standard chemical symbols.

@mfherbst
Copy link
Member

mfherbst commented Oct 9, 2023

Yes sure, I understand that. But then you could just make your atom type that just stores the number and only implements the functions that return atomic_symbol by looking up the symbol using periodic_tables or something. Would that not solve the isbits issue ?

@cortner
Copy link
Member Author

cortner commented Oct 9, 2023

Yes that’s essentially what I’m proposing.

EDIT : above was a quick response - slightly longer : I suppose we can debate whether there is a point in having ChemicalSymbol type that can be used/reused in multiple ways. or just implement a newAtom type with that kind of functionality. I prefer the former. E.g. it allows us to create categorical variables in ML models.

@cortner
Copy link
Member Author

cortner commented May 19, 2024

How many people use isotopes right now with AtomsBase?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants