# Isopy Tutorial 2 - Key Strings, Key Lists and Flavour

In [1]:
import isopy

## Key Strings

---
Key strings are used as column identifiers in arrays and as keys for isopy dictionaries. There are six different types each with its own flavour which describes the type of data it represents. These are, in order of priority, `mass`, `element`, `isotope`, `molecule`, `ratio` and `general`. 

You can create key strings using the {func}`isopy.keystring()` and {func}`isopy.askeystring` functions. The difference is that the former always returns the key string of the highest priority, even if the input is already a key string of lower priority, whereas the latter preserves the flavour of the input.

The representation, {func}`repr`, of a key string in Jupyter will render as HTML.

### Mass key string

---
The `mass` flavour represents a mass number. It must be a positive integer, but both integers and strings are accepted as input. E.g.

In [2]:
isopy.keystring('102')

MassKeyString('102')

In [3]:
isopy.keystring(102),

(MassKeyString('102'),)

---
Mass key strings have the following attributes:
- `flavour` - Always `"mass"`.
- `mass_number` - Simply a reference to itself. Useful for filtering.

In [4]:
key = isopy.keystring(102)
key.flavour, key.mass_number

(mass, MassKeyString('102'))

---
Mass key strings are compatible with the `>`, `>=`, `<` and `<=` operators.

In [5]:
isopy.keystring(102) > 100, isopy.keystring(102) >= 102.5, isopy.keystring(102) < '100'

(True, False, False)
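
The output above suggests that the operands are compared as numbers rather than as text. A minimal sketch of that behaviour, assuming numeric coercion of both sides; `MassKey` is a hypothetical stand-in and not isopy's own class:

```python
class MassKey(str):
    """Hypothetical stand-in for a mass key string (not isopy's class)."""

    def __new__(cls, value):
        number = int(value)  # accepts 102 as well as '102'
        if number <= 0:
            raise ValueError(f"mass number must be a positive integer, got {value!r}")
        return super().__new__(cls, str(number))

    # Assumption: both operands are coerced to numbers before comparing.
    def __gt__(self, other):
        return float(self) > float(other)

    def __ge__(self, other):
        return float(self) >= float(other)

    def __lt__(self, other):
        return float(self) < float(other)

    def __le__(self, other):
        return float(self) <= float(other)


key = MassKey(102)
print(key > 100, key >= 102.5, key < "100")
```

With string comparison alone, `'102' < '100'` would also be `False`, but `'102' > '99'` would be `False` too, which is why a numeric comparison is assumed here.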

### Element key string

---
The `element` flavour represents an element symbol. It is limited to at most two letters and is always capitalised, although the input can be in any case. It also accepts the full name of the element. E.g.

In [6]:
isopy.keystring('Pd')

ElementKeyString('Pd')

In [7]:
isopy.keystring('PD'), isopy.keystring('pd'), isopy.keystring('palladium'), 

(ElementKeyString('Pd'), ElementKeyString('Pd'), ElementKeyString('Pd'))

---
Element key strings have the following attributes:
- `flavour` - Always `"element"`.
- `element_symbol` - Simply a reference to itself. Useful for filtering.
- `isotopes` - A list of all the naturally occurring isotopes of this element.

In [8]:
key = isopy.keystring('Pd')
key.flavour, key.element_symbol, key.isotopes

(element,
 ElementKeyString('Pd'),
 IsopyKeyList('102Pd', '104Pd', '105Pd', '106Pd', '108Pd', '110Pd', flavour='isotope'))

### Isotope key string

---
The `isotope` flavour represents an isotope, comprised of a mass number followed by an element symbol. When creating these key strings you can swap the mass number and element symbol around, and `-` can be used to separate the two. E.g.

In [9]:
isopy.keystring('102Pd')

IsotopeKeyString('102Pd')

In [10]:
isopy.keystring('102pd'), isopy.keystring('palladium-102')

(IsotopeKeyString('102Pd'), IsotopeKeyString('102Pd'))

---
Isotope key strings have the following attributes:
- `flavour` - Always `"isotope"`
- `mass_number` - The mass number of this isotope.
- `element_symbol` - The element symbol of this isotope.
- `isotopes` - A keylist containing only this isotope.
- `mz` - The mass over charge ratio. Always equal to the mass number.

In [11]:
key = isopy.keystring('102Pd')
key.flavour, key.mass_number, key.element_symbol, key.mz, key.isotopes

(isotope,
 MassKeyString('102'),
 ElementKeyString('Pd'),
 102.0,
 IsopyKeyList('102Pd', flavour='isotope'))

---
You can use the `in` operator to check if an isotope key string contains a given mass number or element symbol, e.g.

In [12]:
key = isopy.keystring('102Pd')
102 in key, 'pd' in key

(True, True)
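
The input rules for isotope strings (either order, optional `-`, any case) can be illustrated with a regular expression. This is only a sketch in plain Python: `parse_isotope` is a hypothetical helper, it does not check that the symbol is a real element, and it does not handle full element names such as `palladium-102`.

```python
import re

# Mass number then symbol, or symbol then mass number, with an optional '-'.
_ISOTOPE = re.compile(r"(?:(\d+)-?([A-Za-z]{1,2})|([A-Za-z]{1,2})-?(\d+))")


def parse_isotope(text):
    """Return (mass_number, element_symbol) for strings like '102pd' or 'Pd-102'."""
    match = _ISOTOPE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"{text!r} does not look like an isotope")
    mass = match.group(1) or match.group(4)
    symbol = match.group(2) or match.group(3)
    return int(mass), symbol.capitalize()


for text in ("102pd", "Pd-102", "PD102"):
    mass, symbol = parse_isotope(text)
    print(f"{mass}{symbol}")
```

Each of the three inputs is normalised to `102Pd` by this sketch.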

### Molecule key string

---
The `molecule` flavour represents a molecule made up of elements and/or isotopes. A molecule consists of components, which can be elements, isotopes or another molecule, a multiplier and a charge. When listing multiple elements they must be capitalised, and isotopes should be enclosed in brackets to differentiate them from multipliers. You can group components using either `()` or `[]`. Positive charges are represented by `+` and negative charges by `-`.

In [13]:
isopy.keystring('H2O')

MoleculeKeyString('[H2O]')

In [14]:
isopy.keystring('H2(16O)'), isopy.keystring('[OH]-'), isopy.keystring('137ba++'), 

(MoleculeKeyString('[H2[16O]]'),
 MoleculeKeyString('[(OH)-]'),
 MoleculeKeyString('137Ba++'))

---
Molecule key strings have the following attributes:
- `flavour` - The molecule flavour contains the flavour of the components within brackets e.g. `"molecule[element]"`. 
- `element_symbol` - Converts the molecule into a molecule containing only element symbols.
- `isotopes` - Expands all element components to create a list of all the isotope variations of this molecule
- `mz` - The mass over charge ratio of the molecule. Negative charges will return negative numbers
- `components` - The components of the molecule.
- `n` - The multiplier of the components.
- `charge` - The charge of the components.

In [15]:
isopy.keystring('H2O').flavour, isopy.keystring('[H2[16O]]').flavour

(molecule[element], molecule[element|isotope])

In [16]:
isopy.keystring('[H2[16O]]').element_symbol

MoleculeKeyString('[H2O]')

In [17]:
isopy.keystring('[H2[16O]]').isotopes

IsopyKeyList('[([1H][1H])[16O]]', '[([2H][1H])[16O]]', '[([1H][2H])[16O]]', '[([2H][2H])[16O]]', flavour='molecule[isotope]')

In [18]:
isopy.keystring('137ba++').mz

68.5
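
The value above is consistent with dividing the mass number by the charge (137 / 2). A small sketch of that arithmetic, assuming an uncharged key reports the mass number itself; `mass_over_charge` is a hypothetical helper, not part of isopy:

```python
def mass_over_charge(mass, charge=0):
    """Hypothetical helper: total mass number divided by the charge."""
    if charge == 0:
        # Assumption: an uncharged key reports the mass number itself.
        return float(mass)
    return mass / charge


print(mass_over_charge(102))      # uncharged isotope
print(mass_over_charge(137, 2))   # doubly charged, as in '137ba++'
print(mass_over_charge(17, -1))   # a made-up singly charged anion of mass 17
```

The last call returns a negative number, in line with the note that negative charges give negative values.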

In [19]:
key = isopy.keystring('[OH]-')
key.components, key.n, key.charge

((ElementKeyString('O'), ElementKeyString('H')), 1, -1)

### Ratio key string

---
The `ratio` flavour represents a ratio between two other key strings. The numerator and denominator key strings should be separated by a `/`. To create nested ratios, use multiple `/` to denote higher order ratios. E.g.

In [20]:
isopy.keystring('108pd/105pd')

RatioKeyString('108Pd/105Pd')

In [21]:
isopy.keystring('rh/ru//pd///cd')

RatioKeyString('Rh/Ru//Pd///Cd')

---
Ratio key strings have the following attributes:
- `flavour` - The ratio flavour contains the flavour of the numerator and denominator within brackets e.g. `"ratio[isotope, isotope]"`.
- `numerator` - The numerator key string.
- `denominator` - The denominator key string.

In [22]:
key = isopy.keystring('108pd/105pd')
key.flavour, key.numerator, key.denominator

(ratio[isotope, isotope], IsotopeKeyString('108Pd'), IsotopeKeyString('105Pd'))

---
The `in` operator can be used to check if a key string is equal to either the numerator or denominator key string. E.g.

In [23]:
'ru' in isopy.keystring('ru/pd')

True

---
The `/` operator can be used to create a ratio from any key string e.g.

In [24]:
isopy.keystring('ru') / 'pd', 'ru' / isopy.keystring('pd/cd')

(RatioKeyString('Ru/Pd'), RatioKeyString('Ru//Pd/Cd'))
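
The nesting convention can be sketched with plain strings: the separator of a new ratio is one `/` longer than the longest run of `/` in either operand, and a ratio is split at its longest run. Both helpers below are hypothetical illustrations of that convention, not isopy functions.

```python
import re


def make_ratio(numerator, denominator):
    """Join two strings with a separator one '/' longer than any run they contain."""
    runs = re.findall(r"/+", numerator + " " + denominator)
    depth = max((len(run) for run in runs), default=0)
    return f"{numerator}{'/' * (depth + 1)}{denominator}"


def split_ratio(text):
    """Split a ratio string at its longest run of '/'."""
    runs = re.findall(r"/+", text)
    if not runs:
        raise ValueError(f"{text!r} is not a ratio")
    separator = max(runs, key=len)
    numerator, denominator = text.split(separator, 1)
    return numerator, denominator


print(make_ratio("Ru", "Pd"))
print(make_ratio("Ru", "Pd/Cd"))
print(split_ratio("Rh/Ru//Pd///Cd"))
```

Under these assumptions `'Ru'` over `'Pd/Cd'` becomes `'Ru//Pd/Cd'`, and `'Rh/Ru//Pd///Cd'` splits into `'Rh/Ru//Pd'` and `'Cd'`.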

### General key string

---
The `general` flavour accepts any string. E.g.

In [25]:
isopy.keystring('anything')

GeneralKeyString('anything')

---
General key strings have the following attributes:
- `flavour` - Always `"general"`

### keystring vs askeystring

---
The `general` flavour has the lowest priority but can store strings that are compatible with higher priority flavours. In order to preserve the flavour of such key strings we need to use {func}`isopy.askeystring` instead of {func}`isopy.keystring`. E.g.

In [26]:
key = isopy.keystring('pd', flavour='general')
key, isopy.keystring(key), isopy.askeystring(key)

(GeneralKeyString('pd'), ElementKeyString('Pd'), GeneralKeyString('pd'))

## Key list

---
A key list is a sequence of one or more key strings. They can be created using the {func}`isopy.keylist` and {func}`isopy.askeylist` functions. The difference between the functions is the same as for {func}`isopy.keystring` and {func}`isopy.askeystring`. The former will always return the highest priority flavour compatible with each item in the list while the latter will preserve the existing flavour of the item if possible.

The representation, {func}`repr`, of key lists in jupyter will render as HTML.

Both functions accept a variety of inputs e.g.

In [27]:
isopy.keylist(['ru', 'pd', 'cd'])

IsopyKeyList('Ru', 'Pd', 'Cd', flavour='element')

In [28]:
isopy.keylist('ru', 'pd', 'cd')

IsopyKeyList('Ru', 'Pd', 'Cd', flavour='element')

In [29]:
isopy.keylist('ru pd cd') # Will split lone strings at whitespace

IsopyKeyList('Ru', 'Pd', 'Cd', flavour='element')

In [30]:
isopy.keylist(['ru pd']), isopy.keylist('ru pd') # Will not split strings in other objects like lists.

(IsopyKeyList('ru pd', flavour='general'),
 IsopyKeyList('Ru', 'Pd', flavour='element'))

In [31]:
isopy.keylist(['ru', 'rh'], 'pd', 'ag cd') # Accepts multiple items

IsopyKeyList('Ru', 'Rh', 'Pd', 'Ag', 'Cd', flavour='element')

In [32]:
a = isopy.ones(1, ['ru', 'pd', 'cd'])
isopy.keylist(a)

IsopyKeyList('Ru', 'Pd', 'Cd', flavour='element')

---
Key lists have the following attributes:
- `flavour` - The flavour of the key list. 
- `flavours` - The flavour of each item in the key list.
- `mass_numbers` - The `mass_number` attribute for each item in the key list*.
- `element_symbols` - The `element_symbol` attribute for each item in the key list*.
- `isotopes` - The combined items of the `isotopes` attribute for each item in the key list*.
- `mz` - The `mz` attribute for each item in the key list*.
- `numerators` - The `numerator` attribute for each item in the key list*.
- `denominators` - The `denominator` attribute for each item in the key list*.
- `common_denominator` - This value is equal to the common denominator of all key strings in the list. If there is no common denominator or not all items in the list are ratio key strings the value of this attribute is `None`.

\* if one or more items in the key list do not have the corresponding attribute the value of this attribute is `None`

In [33]:
keys = isopy.keylist(['ru', '108pd', 'H2(16O)'])
keys.flavour, keys.flavours

(element|isotope|molecule[element|isotope],
 (element, isotope, molecule[element|isotope]))

In [34]:
keys.mass_numbers, keys.element_symbols # element key string does not have a mass number attribute

(None, IsopyKeyList('Ru', 'Pd', '[H2O]', flavour='element|molecule[element]'))

In [35]:
keys.isotopes

IsopyKeyList('96Ru', '98Ru', '99Ru', '100Ru', '101Ru', '102Ru', '104Ru', '108Pd', '[([1H][1H])[16O]]', '[([2H][1H])[16O]]', '[([1H][2H])[16O]]', '[([2H][2H])[16O]]', flavour='isotope|molecule[isotope]')

In [36]:
isopy.keylist(['66zn', '137ba++', '70zn']).mz

(66.0, 68.5, 70.0)

In [37]:
keys = isopy.keylist('ru/pd', 'cd/pd')
keys.numerators, keys.denominators, keys.common_denominator

(IsopyKeyList('Ru', 'Cd', flavour='element'),
 IsopyKeyList('Pd', 'Pd', flavour='element'),
 ElementKeyString('Pd'))
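
The rule for `common_denominator` can be sketched with plain strings. The helper below is hypothetical and only handles simple, non-nested ratios written as `numerator/denominator`:

```python
def common_denominator(ratios):
    """Return the shared denominator of simple 'a/b' strings, or None."""
    denominators = set()
    for text in ratios:
        if "/" not in text:
            return None  # not every item is a ratio
        denominators.add(text.split("/", 1)[1])
    # Exactly one distinct denominator means it is common to all items.
    return denominators.pop() if len(denominators) == 1 else None


print(common_denominator(["Ru/Pd", "Cd/Pd"]))
print(common_denominator(["Ru/Pd", "Cd/Ag"]))
print(common_denominator(["Ru/Pd", "Cd"]))
```

Only the first call returns a value (`'Pd'`); the other two return `None`.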

### Combining key lists

---
You can add and remove items from key lists using the `+` and `-` operators. *Note* that these operations return new lists and do not change the originals.

In [38]:
isopy.keylist('ru pd cd') + 'ag'

IsopyKeyList('Ru', 'Pd', 'Cd', 'Ag', flavour='element')

In [39]:
'ru pd cd' - isopy.keylist('ru ag') # Missing keystrings are ignored.

IsopyKeyList('Pd', 'Cd', flavour='element')

---
You can also combine key lists using the `|`, `&` and `^` operators. Note that these operators will remove any duplicate keys.

In [40]:
keys1 = isopy.keylist('ru pd ag')
keys2 = isopy.keylist('ru rh pd')
keys1 | keys2 # or - Keys in either list

IsopyKeyList('Ru', 'Pd', 'Ag', 'Rh', flavour='element')

In [41]:
keys1 & keys2 # and - Keys in both lists

IsopyKeyList('Ru', 'Pd', flavour='element')

In [42]:
keys1 ^ keys2 # xor - Keys in only one of the lists

IsopyKeyList('Ag', 'Rh', flavour='element')
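
These operators behave like set operations that keep the order of first appearance. A rough equivalent on plain Python lists, written as hypothetical helpers rather than isopy code:

```python
def subtract(a, b):
    """Items of a that are not in b; items of b missing from a are ignored."""
    return [key for key in a if key not in b]


def union(a, b):
    """Keys in either list, duplicates removed, first appearance wins."""
    return list(dict.fromkeys([*a, *b]))


def intersection(a, b):
    """Keys present in both lists."""
    return [key for key in dict.fromkeys(a) if key in b]


def symmetric_difference(a, b):
    """Keys present in exactly one of the lists."""
    return [key for key in dict.fromkeys([*a, *b]) if (key in a) != (key in b)]


keys1 = ["Ru", "Pd", "Ag"]
keys2 = ["Ru", "Rh", "Pd"]
print(union(keys1, keys2))
print(intersection(keys1, keys2))
print(symmetric_difference(keys1, keys2))
print(subtract(["Ru", "Pd", "Cd"], ["Ru", "Ag"]))
```

The results have the same contents and order as the key lists shown in the cells above.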

### Sorting key lists

---
You can sort key lists using the `sort` argument for {func}`isopy.keylist` and {func}`isopy.askeylist` or using the {meth}`isopy.core.IsopyKeyList.sorted` method.

Each flavour has its own sort order:
- Mass key strings are sorted by the mass number.
- Element key strings are sorted first by the atomic number and then alphabetically.
- Isotope key strings are sorted by the mass number and then by the atomic mass of the element symbol.
- Molecule key strings are sorted first by the mass over charge ratio and then by the atomic mass.
- Ratio key strings are sorted first by the numerator and then by the denominator.
- General key strings are sorted alphabetically.

When mixing flavours they will generally be grouped by their priority with the exception of isotope and molecule key strings which will be mixed. 

In [43]:
isopy.keylist('pd cd ru', sort=True) # Sorted by atomic mass

IsopyKeyList('Ru', 'Pd', 'Cd', flavour='element')

In [44]:
isopy.keylist('110pd 102pd 102ru 110cd').sorted() # Sorted by mass number then by the element symbol

IsopyKeyList('102Ru', '102Pd', '110Pd', '110Cd', flavour='isotope')

In [45]:
isopy.keylist('64zn 70zn 137ba++', sort=True) # Sorted by mass number and mass over charge ratio

IsopyKeyList('64Zn', '137Ba++', '70Zn', flavour='isotope|molecule[isotope]')
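
The ordering in the last two cells can be reproduced with a tuple sort key. This sketch uses plain tuples and assumes the atomic number as the tie-breaker (a stand-in for the atomic mass mentioned above); `ATOMIC_NUMBER` and `sort_key` are illustrative names, not isopy objects.

```python
# Atomic numbers for the few elements used in this example.
ATOMIC_NUMBER = {"Zn": 30, "Ru": 44, "Pd": 46, "Cd": 48, "Ba": 56}


def sort_key(item):
    """Sort by mass over charge first, then by the element's atomic number."""
    label, mz, symbol = item
    return (mz, ATOMIC_NUMBER[symbol])


isotopes = [
    ("110Pd", 110.0, "Pd"),
    ("102Pd", 102.0, "Pd"),
    ("102Ru", 102.0, "Ru"),
    ("110Cd", 110.0, "Cd"),
]
mixed = [("64Zn", 64.0, "Zn"), ("70Zn", 70.0, "Zn"), ("137Ba++", 68.5, "Ba")]

sorted_isotopes = [label for label, *_ in sorted(isotopes, key=sort_key)]
sorted_mixed = [label for label, *_ in sorted(mixed, key=sort_key)]
print(sorted_isotopes)
print(sorted_mixed)
```

Because `137Ba++` has a mass over charge ratio of 68.5, it lands between `64Zn` and `70Zn`.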

### Filtering key lists

---
You can filter key lists based on the attributes of items in the list using the {meth}`isopy.core.IsopyKeyList.filter` method. A key filter is passed as a keyword argument where the keyword consists of two parts: first the attribute to be filtered, followed by the comparison to be performed, e.g. `<attribute>_<comparison>`.

The comparisons available are:
- `eq` - For single items the `==` operator is used. If a list or tuple of arguments is supplied the `in` operator is used.
- `neq` - For single items the `!=` operator is used. If a list or tuple of arguments is supplied the `not in` operator is used.
- `lt` - Represents the `<` operator.
- `le` - Represents the `<=` operator.
- `gt` - Represents the `>` operator.
- `ge` - Represents the `>=` operator.

To base a key filter on the key string itself use `key` instead of an attribute. To create a key filter for the numerator or denominator key string in a ratio, prefix the key filter with `numerator_` or `denominator_` respectively.

In [46]:
isopy.keylist('110pd 102pd 102ru 110cd').filter(key_eq = ['102ru', '104rh', '109ag', '110pd'])

IsopyKeyList('110Pd', '102Ru', flavour='isotope')

In [47]:
isopy.keylist('110pd 102pd 102ru 110cd').filter(mass_number_ge = 105)

IsopyKeyList('110Pd', '110Cd', flavour='isotope')

In [48]:
isopy.keylist('110pd 102pd 102ru 110cd').filter(element_symbol_eq = ['ru', 'rh', 'cd'])

IsopyKeyList('102Ru', '110Cd', flavour='isotope')

In [49]:
isopy.keylist('110pd/cd 102pd/ru 102ru/pd 110cd/pd').filter(numerator_mass_number_lt=110)

IsopyKeyList('102Pd/Ru', '102Ru/Pd', flavour='ratio[isotope, element]')

In [50]:
isopy.keylist('110pd/cd 102pd/ru 102ru/pd 110cd/pd').filter(denominator_key_eq='pd')

IsopyKeyList('102Ru/Pd', '110Cd/Pd', flavour='ratio[isotope, element]')
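
To make the `<attribute>_<comparison>` convention concrete, here is a sketch of how such keywords could be parsed and applied to plain dictionaries. It is not isopy's implementation: `parse_filter`, `matches` and `filter_records` are hypothetical names, the values are assumed to be already normalised, and the `numerator_` and `denominator_` prefixes are not handled.

```python
import operator

COMPARISONS = {
    "eq": operator.eq, "neq": operator.ne,
    "lt": operator.lt, "le": operator.le,
    "gt": operator.gt, "ge": operator.ge,
}


def parse_filter(keyword):
    """Split 'mass_number_ge' into ('mass_number', 'ge')."""
    attribute, _, comparison = keyword.rpartition("_")
    if not attribute or comparison not in COMPARISONS:
        raise ValueError(f"invalid key filter {keyword!r}")
    return attribute, comparison


def matches(record, keyword, value):
    attribute, comparison = parse_filter(keyword)
    actual = record[attribute]
    # 'eq' and 'neq' switch to membership tests for lists and tuples.
    if comparison in ("eq", "neq") and isinstance(value, (list, tuple)):
        found = actual in value
        return found if comparison == "eq" else not found
    return COMPARISONS[comparison](actual, value)


def filter_records(records, **filters):
    """Keep the records that satisfy every key filter."""
    return [r for r in records if all(matches(r, k, v) for k, v in filters.items())]


records = [
    {"key": "110Pd", "mass_number": 110, "element_symbol": "Pd"},
    {"key": "102Pd", "mass_number": 102, "element_symbol": "Pd"},
    {"key": "102Ru", "mass_number": 102, "element_symbol": "Ru"},
    {"key": "110Cd", "mass_number": 110, "element_symbol": "Cd"},
]
heavy = [r["key"] for r in filter_records(records, mass_number_ge=105)]
chosen = [r["key"] for r in filter_records(records, element_symbol_eq=["Ru", "Rh", "Cd"])]
print(heavy, chosen)
```

Splitting at the last underscore is what lets attribute names such as `mass_number` contain underscores themselves.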

---
The {meth}`isopy.core.IsopyArray.filter` method will return a view of the array containing only the columns that match the key filters. Many of the `to_<type>` methods also accept key filters to only include certain columns in the returned object.

In [51]:
a = isopy.random(5, keys='110pd 102pd 102ru 110cd'); a

| (row) | ${}^{110}\mathrm{Pd}$ (f8) | ${}^{102}\mathrm{Pd}$ (f8) | ${}^{102}\mathrm{Ru}$ (f8) | ${}^{110}\mathrm{Cd}$ (f8) |
|---|---|---|---|---|
| 0 | 0.13081 | -1.26552 | -0.38709 | 0.93295 |
| 1 | -1.35951 | -1.18374 | -2.39698 | 0.01295 |
| 2 | -0.82233 | 0.42927 | 0.32346 | 1.88376 |
| 3 | -1.38854 | 1.06066 | 0.52635 | 1.76373 |
| 4 | -2.09217 | 1.34563 | 1.6975 | -0.95455 |


In [52]:
a.filter(mass_number_ge = 105)

| (row) | ${}^{110}\mathrm{Pd}$ (f8) | ${}^{110}\mathrm{Cd}$ (f8) |
|---|---|---|
| 0 | 0.13081 | 0.93295 |
| 1 | -1.35951 | 0.01295 |
| 2 | -0.82233 | 1.88376 |
| 3 | -1.38854 | 1.76373 |
| 4 | -2.09217 | -0.95455 |


In [53]:
a.to_dict(mass_number_ge = 105)

{'110Pd': [0.13080930916078037,
  -1.3595079420221965,
  -0.822328200920095,
  -1.388539219532741,
  -2.092174235183127],
 '110Cd': [0.9329542643367572,
  0.012945945779588288,
  1.8837569305353523,
  1.763731151061314,
  -0.9545541254701979]}

---
Many of the {ref}`array functions<Array functions>` included in the isopy namespace, including the numpy functions, accept key filters to limit the operation to certain columns in the input.

In [54]:
isopy.sd(a, mass_number_ge = 105) # Isopy function

| (row) | ${}^{110}\mathrm{Pd}$ (f8) | ${}^{110}\mathrm{Cd}$ (f8) |
|---|---|---|
|  | 0.82565 | 1.20341 |


In [55]:
isopy.sum(a, mass_number_ge = 105) # Enhanced numpy function

| (row) | ${}^{110}\mathrm{Pd}$ (f8) | ${}^{110}\mathrm{Cd}$ (f8) |
|---|---|---|
|  | -5.53174 | 3.63883 |


## Flavour

---
You can specify the desired flavour(s) when creating key strings, key lists and isopy arrays using the `flavour` argument. To specify more than one possible flavour, separate them with `|`, e.g. `element|isotope`. The order of priority is always the same, regardless of the order in which the flavours are specified. By default all possible flavours are used, which can also be written as `any` for simplicity.

The priority order for flavours is `mass`, `element`, `isotope`, `molecule`, `ratio` and last `general`. 

In [56]:
isopy.keystring('pd', flavour='isotope|general')

GeneralKeyString('pd')

In [57]:
isopy.keylist('pd ru', flavour='isotope|general')

IsopyKeyList('pd', 'ru', flavour='general')
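
The way a restricted flavour list changes the result can be sketched as a loop over parsers in priority order. Everything below is a simplified, hypothetical model: only four flavours are covered, `SYMBOLS` is a tiny subset of the periodic table, and `classify` is not an isopy function.

```python
import re

PRIORITY = ("mass", "element", "isotope", "general")  # molecule and ratio omitted
SYMBOLS = {"Ru", "Rh", "Pd", "Ag", "Cd"}  # illustrative subset only


def _as_mass(text):
    return text if re.fullmatch(r"\d+", text) else None


def _as_element(text):
    symbol = text.capitalize()
    return symbol if symbol in SYMBOLS else None


def _as_isotope(text):
    match = re.fullmatch(r"(\d+)([A-Za-z]{1,2})", text)
    if match and match.group(2).capitalize() in SYMBOLS:
        return match.group(1) + match.group(2).capitalize()
    return None


PARSERS = {"mass": _as_mass, "element": _as_element,
           "isotope": _as_isotope, "general": lambda text: text}


def classify(text, flavour="any"):
    """Return (flavour, key) for the first allowed flavour that accepts text."""
    allowed = set(PRIORITY) if flavour == "any" else {f.strip() for f in flavour.split("|")}
    for name in PRIORITY:  # fixed priority, whatever order was requested
        if name in allowed:
            key = PARSERS[name](text)
            if key is not None:
                return name, key
    raise ValueError(f"{text!r} does not match any of {flavour!r}")


print(classify("pd"))
print(classify("pd", "isotope|general"))
print(classify("pd", "general|element"))
```

With `isotope|general`, the string `'pd'` is not a valid isotope, so the sketch falls through to `general` and keeps the input unchanged, as in the cells above.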

In [58]:
isopy.array([1,2], 'pd ru', flavour='isotope|general')

| (row) | $\mathrm{pd}$ (f8) | $\mathrm{ru}$ (f8) |
|---|---|---|
|  | 1.0 | 2.0 |


In [59]:
isopy.random(None, keys='pd ru', flavour='isotope|general')

| (row) | $\mathrm{pd}$ (f8) | $\mathrm{ru}$ (f8) |
|---|---|---|
|  | 0.09845 | -1.96544 |


---
You can compare the flavour of any isopy item using the `==`, `!=`, `in` and `not in` operators e.g.

In [60]:
isopy.keystring('102pd').flavour == 'isotope', isopy.keystring('102pd').flavour != 'isotope'

(True, False)

In [61]:
isopy.keystring('102pd').flavour in ['element', 'isotope']

True

You can also specify the subflavours for molecule and ratio key strings e.g.

In [62]:
isopy.keystring('a/b').flavour == 'ratio', isopy.keystring('a/b').flavour == 'ratio[element, element]'

(True, True)

You can convert a string into a flavour object using the {func}`isopy.asflavour` function which is useful in conjunction with the `in` and `not in` operators e.g.

In [63]:
isopy.keystring('a/b').flavour in isopy.asflavour('ratio[element|isotope, element|isotope]')

True

This is more concise than listing every combination explicitly:

In [64]:
isopy.keystring('a/b').flavour in ['ratio[element, element]', 'ratio[element, isotope]', 
                                   'ratio[isotope, element]', 'ratio[isotope, isotope]']

True
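
The relationship between the compact flavour string and the explicit list can be shown with `itertools.product`. This is a sketch on plain strings for a single, non-nested ratio flavour; `expand_ratio_flavour` is a hypothetical helper and says nothing about how {func}`isopy.asflavour` works internally.

```python
from itertools import product


def expand_ratio_flavour(text):
    """Expand 'ratio[a|b, c|d]' into every 'ratio[x, y]' combination."""
    inner = text[text.index("[") + 1 : text.rindex("]")]
    numerator, denominator = (part.strip() for part in inner.split(","))
    return [
        f"ratio[{n}, {d}]"
        for n, d in product(numerator.split("|"), denominator.split("|"))
    ]


combinations = expand_ratio_flavour("ratio[element|isotope, element|isotope]")
for flavour in combinations:
    print(flavour)
```

The four strings produced are the same four listed explicitly in the cell above.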