Recording form to support dynamic occurrence attributes #222

Open
DavidRoy opened this Issue Jun 13, 2017 · 28 comments

Collaborator

DavidRoy commented Jun 13, 2017

There may already be an issue covering this, but I can't find one (although it is related to #13).
It is also linked to the same requirement for the iRecord App:
NERC-CEH/irecord-app#14

There is a requirement for a multi-taxa species list recording form for iRecord that captures occurrence attribute values dynamically, e.g.

  • the user selects a taxon to record
  • the attributes and attribute values to record for the occurrence are those relevant to the species group of the selected taxon. Attributes are defined by the Recording Scheme

The requirements are:

  • each taxon to be assigned to a Recording Scheme (requires some additional matching)
  • for each Recording Scheme, a set of occurrence attributes needs to be defined

This could do with some discussion as to the best way to implement it.

kazlauskis Jun 13, 2017

Member

As a guess, it might be a good idea to assign each taxon group to a different survey altogether and then create a form that dynamically builds itself depending on the survey's attributes. In general, such a form would be handy, as all I would then need to do is specify the survey ID and the form would build itself.

JimBacon Jun 14, 2017

Collaborator

Karolis suggests multiple surveys with fixed attributes, as opposed to one survey having multiple attributes that are dynamically selected. An advantage of this is that it is what we already have configured. A consequence (disadvantage?) is that it leads to a proliferation of surveys.

Pursuing this option would need a lookup from a species on a website to a survey. Since it is the warehouse that knows the species list, I envisage something like a taxon_meanings_websites table. It would have columns taxon_meaning_id and website_id to perform the lookup and return the survey_id to use.

We could also add a url column to the table to allow redirection to an existing form, rather than constructing a dynamically built form, which is likely to be less well organised.
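
A minimal sketch of the lookup described above, using an in-memory SQLite database as a stand-in for the warehouse. The table name and the taxon_meaning_id, website_id, survey_id and url columns come from the comment; the column types, the sample IDs, the form paths and the `find_survey` helper are assumptions made purely for illustration.

```python
import sqlite3

# Hypothetical schema and data: nothing here reflects an existing warehouse table.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE taxon_meanings_websites (
        taxon_meaning_id INTEGER NOT NULL,
        website_id       INTEGER NOT NULL,
        survey_id        INTEGER NOT NULL,
        url              TEXT,  -- optional path to an existing, hand-built form
        PRIMARY KEY (taxon_meaning_id, website_id)
    )
    """
)
conn.executemany(
    "INSERT INTO taxon_meanings_websites VALUES (?, ?, ?, ?)",
    [
        (1001, 23, 501, "/enter-dragonfly-record"),  # taxon with a dedicated form
        (1002, 23, 502, None),                       # no form, so build one dynamically
    ],
)


def find_survey(taxon_meaning_id, website_id):
    """Return (survey_id, url) for a species on a website, or None if unmapped."""
    return conn.execute(
        "SELECT survey_id, url FROM taxon_meanings_websites "
        "WHERE taxon_meaning_id = ? AND website_id = ?",
        (taxon_meaning_id, website_id),
    ).fetchone()
```

A caller would redirect to `url` when it is set and otherwise build a generic form from the attributes of `survey_id`.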

It would be great to be able to use taxon_groups rather than taxon_meanings as the table would be so much smaller and easier to construct. Could all beetle schemes agree a single set of attributes though?!

Question to David - is the requirement really per scheme rather than per taxonomic group?

DavidRoy Jun 14, 2017

Collaborator

It is definitely recording schemes, which clearly have a relationship to taxonomic groups (and taxon_groups within UKSI).

We are currently linking species on the UKSI to recording_schemes, and are in discussion with Chris Raper about this becoming part of the UKSI. I'm not sure how this might be accommodated within Indicia but you can work on the basis that this will exist.

kazlauskis Jun 14, 2017

Member

As an alternative thought @japonicus has mentioned that it might be easier to have a giant survey that would present itself through a form where attributes are simply enabled or disabled depending on the taxa groups.

One way or another, my guess is that the form would have to request the taxon-specific form model (maybe something like this) on every taxon update, and that it will become very dynamic. Something like Angular or React could help here as these two must have lots of tools for such dynamic forms.

Here is an example of how Tom has built his dynamic recording form using a description of the occurrence model. Note: click on the green bits to open up different parts of the form.

kitenetter Jun 14, 2017

Collaborator

Do we need to consider the split between location, sample and occurrence attributes? I can imagine that some attributes (date, location) will likely be needed for all forms, but others (including some sample attributes) might vary, e.g. method, or source of record or specimen (for those schemes that commonly deal with records from specimen collections). And I imagine that having variable sample attributes might be more problematic than having variable occurrence attributes?

I would also expect a requirement that even if several schemes wanted the same attribute, they might want to populate that attribute from differing termlists (again method is an obvious case).

We do have some consensus on a single data structure for all beetle recording schemes, but the consensus has only been sought from a subset of scheme organisers so far!

We could also usefully revisit previous discussions (e.g. #195) about how we can enable terms to be made active for data entry or not, in order to futureproof ourselves against the inevitable changes in approach over time.

DavidRoy Jun 14, 2017

Collaborator

I think we should focus initially on occurrence attributes but keep sample (and location) attributes in scope. I agree about the need to consider #195.

JimBacon Jun 14, 2017

Collaborator

Responding to David:

Okay, if it is definitely recording schemes then we definitely can't use the taxon_groups.

Let us continue to consider the case of a survey per scheme, as Karolis first suggested. If we assume that, in future, there will be scheme/taxon data in the warehouse, then what we will need is a scheme/survey lookup. That could be a custom attribute on the survey.

The user selects a species from which we know the scheme. The warehouse is queried for surveys belonging to our website with the matching scheme attribute. Knowing the survey we then build a generic form dynamically. Alternatively/additionally, there could be a custom survey attribute containing a path to an existing recording form specifically designed for the scheme.

(Note that, for building the scheme/taxon data, the Aquatic Coleoptera scheme is a 'beetle in the ointment' as it covers species also in other schemes.)
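
The scheme/survey lookup could be sketched roughly as follows. This is only an illustration: the `Survey` class, its `scheme` and `form_path` fields (standing in for the custom survey attributes), the placeholder taxa and schemes, and both helper functions are hypothetical. It returns a list because, as noted above, a taxon may be covered by more than one scheme.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Survey:
    id: int
    website_id: int
    scheme: str                      # hypothetical custom survey attribute
    form_path: Optional[str] = None  # hypothetical custom survey attribute


# Placeholder scheme/taxon data; a taxon may fall under more than one scheme.
TAXON_SCHEMES: Dict[str, List[str]] = {
    "Taxon A": ["Scheme X"],
    "Taxon B": ["Scheme X", "Scheme Y"],
}

SURVEYS = [
    Survey(id=1, website_id=23, scheme="Scheme X", form_path="/record-scheme-x"),
    Survey(id=2, website_id=23, scheme="Scheme Y"),
    Survey(id=3, website_id=99, scheme="Scheme X"),
]


def surveys_for_taxon(taxon: str, website_id: int) -> List[Survey]:
    """Return this website's surveys whose scheme attribute matches the taxon."""
    schemes = TAXON_SCHEMES.get(taxon, [])
    return [
        s for s in SURVEYS
        if s.website_id == website_id and s.scheme in schemes
    ]


def form_for(survey: Survey) -> str:
    """Prefer a purpose-built form; otherwise fall back to a generic dynamic one."""
    return survey.form_path or f"/dynamic-form?survey_id={survey.id}"
```

When more than one survey comes back, the form would need some rule (or a prompt to the user) to choose between them.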

DavidRoy Jun 14, 2017

Collaborator

Re: "Alternatively/additionally, there could be a custom survey attribute containing a path to an existing recording form specifically designed for the scheme"

Note that a potential requirement is for a data grid that allows species to be entered across recording schemes, with the attributes/termlists being made available dynamically. I realise this is potentially complex, so one option is to have a relatively fixed set of attributes (e.g. stage, status, sex) where the terms vary depending on the species selected.

johnvanbreda Jun 15, 2017

Collaborator

Coming in late to the discussion, a few thoughts of my own:

  1. Having a single record form that dynamically loads attributes from an appropriate survey definition might work. I don't see a multi-record grid working though. For example, I might want to enter a plant record plus the bees that visit it, and I would want these to belong to the same survey dataset, not 2 different ones. So I think the idea of a super-survey with a range of attributes to pick from when you choose the species is more flexible in the long run.
  2. Although taxon groups might not be an appropriate way to divide the attributes up, what about using the family (which is included handily in various cache tables so convenient and fast)?
  3. Part of this requirement might be met by having a form where the attributes themselves remain consistent, but the termlists used to populate the lookup are switched according to the species chosen. So, you would have an abundance attribute which uses a different termlist depending on the species chosen. I'm not sure if this helps if dynamic attributes really are required but it might be easier to implement if the requirement is really dynamic drop down terms.

If the requirement is dynamic attributes then I could imagine an implementation being along the lines of:

  1. Populate the groups table with a list of recording schemes (setting group type term appropriately).
  2. Either use the group filter definition to list the appropriate families, or have a new families_groups table with a UI to populate it.
  3. A new table occurrence_attributes_groups to join between groups (recording schemes) and their relevant attributes (plus one for sample attributes if we get that far) and a UI to configure it.
  4. A big survey with lots of occurrence attributes. Dynamic ones are linked to their groups (recording schemes) and therefore to the families.
  5. A configuration to enable dynamic attributes in the species checklist grid.
  6. The species checklist then performs an attribute lookup when you pick a species, based on the family. Attributes with a shared system function could all load into the same column (e.g. abundance) whereas other more specific attributes would presumably need to go in the extra attributes row.
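
The steps above can be sketched with an in-memory SQLite database. The table names groups, families_groups and occurrence_attributes_groups are taken from the list; every column name, the sample rows, the scheme titles, and the `attributes_for_family` and `grid_layout` helpers are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical columns and data throughout.
    CREATE TABLE groups (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE families_groups (family TEXT, group_id INTEGER);
    CREATE TABLE occurrence_attributes (
        id INTEGER PRIMARY KEY, caption TEXT, system_function TEXT
    );
    CREATE TABLE occurrence_attributes_groups (
        occurrence_attribute_id INTEGER, group_id INTEGER
    );

    INSERT INTO groups VALUES (1, 'Dragonfly scheme'), (2, 'Ladybird scheme');
    INSERT INTO families_groups VALUES ('Libellulidae', 1), ('Coccinellidae', 2);
    INSERT INTO occurrence_attributes VALUES
        (10, 'Dragonfly abundance', 'abundance'),
        (11, 'Ladybird abundance', 'abundance'),
        (12, 'Colour form', NULL);
    INSERT INTO occurrence_attributes_groups VALUES (10, 1), (11, 2), (12, 2);
    """
)


def attributes_for_family(family):
    """Return (caption, system_function) for each attribute linked to the family."""
    return conn.execute(
        """
        SELECT a.caption, a.system_function
        FROM families_groups fg
        JOIN occurrence_attributes_groups oag ON oag.group_id = fg.group_id
        JOIN occurrence_attributes a ON a.id = oag.occurrence_attribute_id
        WHERE fg.family = ?
        ORDER BY a.id
        """,
        (family,),
    ).fetchall()


def grid_layout(family):
    """Split attributes into shared grid columns and the extra attributes row."""
    columns, extra_row = {}, []
    for caption, system_function in attributes_for_family(family):
        if system_function:
            columns[system_function] = caption  # e.g. one shared abundance column
        else:
            extra_row.append(caption)
    return columns, extra_row
```

Here attributes that share a system function land in the same grid column, while anything without one is pushed to the extra attributes row.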

DavidRoy Jun 15, 2017

Collaborator

That's a good summary.
Having reflected on the discussion, I think we should do the following in the first instance:

  1. Work at the family level, since the relevant attributes will be fixed at this level of the hierarchy. Some families are split across recording schemes, but this is rarely the case, and even when they are, the attribute terms can be the same.
  2. Have a 'super-survey' with a fixed set of occurrence and sample attributes. We could consider extending this with columns for the 'extra attributes' row.
  3. Termlists change dynamically as a species is selected.

kitenetter Jun 15, 2017

Collaborator

One question on the 'super-survey' approach: will this make it more difficult to download the relevant attributes for each taxon group? We have two routes to download data:

  • the generic 'all surveys' data has a fixed format, and I'm not sure how far the System Function tool would allow relevant data from the super-survey to feed in to the generic download - maybe it would be fine?
  • if downloading the 'super-survey' as a specific survey download the user would presumably get a potentially large table with many columns, some of which would be empty for any given species group - maybe that is acceptable as a price to be paid for the increased flexibility, and it's not too onerous for a user to delete unwanted columns from a download?

I think the first of those two points is the more important, as I suspect most users (including verifiers and LERCs) use the generic download most of the time.

kazlauskis Jun 15, 2017

Member

I think that downloading all the attributes should be the default behaviour. If I click to download all my records, I would like to have all the data associated with each record without having to specify the survey (which I might not even know).

If there are empty attribute columns then simply don't include those in the download report.
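
As a rough sketch of that idea, assuming the download is available as a list of dictionaries with one dictionary per record (the `drop_empty_columns` name and the sample column names are hypothetical):

```python
def drop_empty_columns(rows):
    """Remove every column that is None or an empty string in all rows."""
    if not rows:
        return []
    keep = [
        col for col in rows[0]
        if any(row.get(col) not in (None, "") for row in rows)
    ]
    return [{col: row.get(col) for col in keep} for row in rows]


records = [
    {"taxon": "Taxon A", "wing_length_mm": 45.0, "colour_form": None},
    {"taxon": "Taxon B", "wing_length_mm": None, "colour_form": ""},
]
trimmed = drop_empty_columns(records)
```

In this example `colour_form` is dropped because no record has a value for it, while `wing_length_mm` is kept because at least one record does.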

JimBacon Jun 15, 2017

Collaborator

  1. We could use families to determine which attributes apply, but it seems a shame not to use the scheme data if it is going to be present, and it might lead to a simpler implementation.

  2. Agree with super-survey. The survey per scheme breaks down if you want to record species from different schemes in a single sample. Sample attributes will be fixed. Only occurrence attributes will be dynamic. (Dynamic sample attributes suggests sub-samples and probably an over-complicated user interface to me.)

  3. We could implement dynamic termlists in two ways:
    a. A fixed attribute with a termlist containing all possible terms which are shown dynamically according to species.
    b. A number of attributes which are shown dynamically, each with a fixed termlist.

Option b can extend to support the extra attributes row while a does not. A solution could implement both a and b but it might be simpler to select one only.
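
To make the difference between the two options concrete, here is a small sketch. The family names, terms, captions and function names are all placeholders, not real configuration.

```python
# Option a: one fixed attribute whose termlist holds every possible term.
# Each term records the families it applies to, and the form filters on that.
ABUNDANCE_TERMS = {
    "1": {"Family P", "Family Q"},
    "2-5": {"Family P", "Family Q"},
    "Swarm": {"Family Q"},
}


def option_a_terms(family):
    """Terms of the single shared attribute to show for this family."""
    return [term for term, families in ABUNDANCE_TERMS.items() if family in families]


# Option b: several attributes, each with its own fixed termlist.
# The form shows or hides whole attributes depending on the family.
ATTRIBUTES = [
    {"caption": "Abundance (P)", "families": {"Family P"}, "terms": ["1", "2-5"]},
    {"caption": "Abundance (Q)", "families": {"Family Q"}, "terms": ["1", "2-5", "Swarm"]},
]


def option_b_attributes(family):
    """Captions of the attributes to show for this family."""
    return [a["caption"] for a in ATTRIBUTES if family in a["families"]]
```

Option a keeps a single column of data at the cost of a filter on every term; option b keeps the termlists simple but stores the values in separate attributes.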

Question. To what extent is this a general requirement or is it an iRecord only requirement?

DavidRoy Jun 16, 2017

Collaborator

Re: To what extent is this a general requirement or is it an iRecord only requirement?

I think it's a general requirement (e.g. could be useful to LERCs) but my only current requirement is for iRecord so that should be the focus.

JimBacon Jun 16, 2017

Collaborator

Thanks. That would confirm a solution in the warehouse is preferable to one confined to iRecord.

DavidRoy assigned johnvanbreda and unassigned JimBacon on Jan 30, 2018

DavidRoy Jan 30, 2018

Collaborator

We are at the point where we need to decide on (and implement) an approach for this.

Karolis has implemented dynamic attributes within the iRecord App.
See demo with dynamic attributes for dragonflies and bryophytes (using taxon_group for dynamic element): http://irecord-app.herokuapp.com/#samples

The occurrences will be submitted under a super-survey which includes many attributes. The intention is that users will be able to edit records via the generic editing form on iRecord, e.g.
https://www.brc.ac.uk/irecord/edit-generic-record?occurrence_id=6409756

This form will become unwieldy as we add in lots of dynamic attributes.

@johnvanbreda is it possible for the generic editing form to have a dynamic element, displaying attributes based on the taxon_group of the species? If so, how much work is involved and when could you fit this into your schedule?

johnvanbreda Apr 23, 2018

Collaborator

@DavidRoy I'm now about to tackle this as I have a similar requirement for another project (so can share the costs). The requirements are slightly different in how the attributes are going to be set up, but it will make a more powerful and flexible solution.

@kazlauskis can you let me know if anything I am planning below contradicts the way you've done this in your app demo please?

The summary of the proposed approach is that we will define taxon attributes that are linked to taxa in the species list data (presumably against UKSI for UK data) by inputting values for the attributes against the appropriate taxa. The attributes will act as templates from which we can automatically derive occurrence attributes and sample attributes where relevant. The values recorded against a taxon for an attribute will do two things. Firstly, they declare that this attribute is available for this taxon (and all its descendants). Secondly, the value (or range of values, or multiple selection of terms for lookup attributes) defines the validation rules that can be used when values are input for occurrences of this taxon. We will enhance the attribute values to allow ranges, so the taxon attributes can be used to provide a range of possible values for the input occurrence data. We’ll also have to build a better hierarchical index of taxonomy in the cache tables so that it’s easy to grab all attribute data looking up or down the hierarchy.

A worked example might make this clearer:

  1. Create a taxon attribute called wing length (mm) - float.
  2. Tick a box to set a new flag “applies to occurrences”.
  3. Add a value for this attribute to the Insecta taxon. Either set a new special value “any”, or a range of possible wing lengths for all insects (e.g. 0.5 to 200mm).
  4. Add another value for this attribute to the Odonata (dragonflies) taxon, with a range set to 20mm - 120mm.
  5. Because the “applies to occurrences” box was ticked, an equivalent occurrence attribute will be created automatically by the system when the attribute is saved.
  6. Link the occurrence attribute to the “mega-survey”.

Now, when the user selects an insect species, a query will use the taxonomic hierarchy index to look for any taxon attributes attached to the selected taxon or any of its ancestors that also have an occurrence attribute linked to our survey. If any attribute is found multiple times, we’ll use the one attached at the lowest level. Therefore a non-Odonata insect will pick up the wing length attribute via the taxon attribute value 0.5-200 set on Insecta. The input form can then auto-create the control and use the range 0.5-200 as a validation rule. If the user picks an Odonata species, the validation rule will be derived from the taxon attribute value linked to Odonata, so the validated range is 20 to 120.
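
A minimal sketch of that lookup, assuming a simple child-to-parent map in place of the planned hierarchy index (intermediate ranks are omitted for brevity). The ranges are the ones from the worked example; the dictionary layout and function names are hypothetical.

```python
# Simplified taxonomy as child -> parent; a real index would hold every rank.
PARENT = {
    "Anax imperator": "Odonata",
    "Odonata": "Insecta",
    "Apis mellifera": "Hymenoptera",
    "Hymenoptera": "Insecta",
    "Insecta": None,
}

# Taxon attribute values for "wing length (mm)": taxon -> (min, max).
WING_LENGTH_MM = {"Insecta": (0.5, 200.0), "Odonata": (20.0, 120.0)}


def lineage(taxon):
    """Yield the taxon itself, then each ancestor, lowest level first."""
    while taxon is not None:
        yield taxon
        taxon = PARENT.get(taxon)


def wing_length_range(taxon):
    """Return the range attached at the lowest level, or None if unavailable."""
    for level in lineage(taxon):
        if level in WING_LENGTH_MM:
            return WING_LENGTH_MM[level]
    return None


def is_valid_wing_length(taxon, value_mm):
    """Validate an occurrence value against the range inherited by the taxon."""
    allowed = wing_length_range(taxon)
    if allowed is None:
        return False  # the attribute does not apply to this taxon at all
    low, high = allowed
    return low <= value_mm <= high
```

Because the walk starts at the selected taxon and moves upwards, the first match found is always the one attached at the lowest level, so Odonata overrides Insecta.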

A taxon attribute could also point to a lookup list (e.g. life stages) meaning that the occurrence attribute values associated with this would be picked from the same term list and could also have the range of possibilities enforced by the range of options chosen for the taxon. So you could have a single term list with all life stage terms in it, then link different terms to different nodes in the hierarchy to define their availability. Note that because the taxon attribute value can be set to “any”, you can allow any term from the lookup list to be picked in the occurrence data where appropriate. This might be useful for longer lists of options, e.g. habitats, sampling methods.

Worked example for life stages:

  1. Create a term list for all known insect life stages and an associated taxon attribute (multi-value).
  2. Tick the box to say “applies to occurrences”.
  3. Add a value for this attribute to the Insecta taxon, with the special value “Any”.
  4. Because the “applies to occurrences” box was ticked, an equivalent occurrence attribute will be created automatically by the system when the attribute is saved.
  5. Link the occurrence attribute to the “mega-survey”.
  6. At this point, for any record of any insect, the Insect Life Stage attribute is available.
  7. Add a 2nd value for this attribute to the Odonata taxon, and add multiple values, one per allowed life stage for dragonflies (e.g. Larvae, Adult etc).
  8. Now if a dragonfly species is input, the list of available options will be limited because the choice from lower down the taxonomic hierarchy (dragonfly life stage = egg/larvae etc) overrides the choice of “any” at the insect level.
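
The life stage example can be sketched the same way. The term list contents, the `ANY` marker, the child-to-parent map and the function name are illustrative assumptions rather than the planned implementation.

```python
ANY = "any"  # stand-in for the proposed special value

# Illustrative term list holding all insect life stages, in display order.
ALL_LIFE_STAGES = ["Egg", "Larva", "Pupa", "Adult"]

# Simplified taxonomy as child -> parent.
PARENT = {"Odonata": "Insecta", "Coleoptera": "Insecta", "Insecta": None}

# Taxon attribute values: either ANY or the set of terms allowed for that taxon.
LIFE_STAGE_VALUES = {
    "Insecta": ANY,
    "Odonata": {"Egg", "Larva", "Adult"},
}


def allowed_life_stages(taxon):
    """Return the terms offered for a taxon, using the lowest-level setting."""
    while taxon is not None:
        value = LIFE_STAGE_VALUES.get(taxon)
        if value is not None:
            if value == ANY:
                return list(ALL_LIFE_STAGES)
            # Keep term list order while restricting to the allowed terms.
            return [term for term in ALL_LIFE_STAGES if term in value]
        taxon = PARENT.get(taxon)
    return []  # no value anywhere in the lineage: attribute not available
```

A beetle inherits “any” from Insecta and so sees the full list, whereas a dragonfly sees only the terms set on Odonata.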

Also note that you can set a flag “applies to samples” on the taxon attribute when defining things like the habitat that would be input at the sample level.

The idea above of linking this to recording schemes isn’t really necessary now - as a separate task we could link taxa to recording schemes which would give a list of attributes for each scheme, but it’s a separate requirement I think.

There will need to be some UI updates to make configuring all this easier; fortunately, this is specified as part of the other project.

JimBacon Apr 23, 2018

Collaborator

That sounds pretty reasonable to me.

Could you confirm that

  • any website with its own taxon list could create their own dynamic attributes.
  • any website using the UK Master List (once it has been decorated with taxon attributes which apply to samples/occurrences) can choose which, if any, dynamic attributes to use in their surveys.

Would another website be able to use the UK Master List and augment it with their own dynamic attributes?

A dynamic attribute which applies to sample would presumably create sub-samples in the record.

johnvanbreda Apr 23, 2018

Collaborator

@JimBacon I had envisaged that the attributes would generally always be attached to UKSI (since it has a fairly reliable hierarchy) and that any list which has a TVK in the external key would then be able to match across to find the attributes. Even though the UKSI list might end up with lots of attributes attached from lots of websites, they would only be relevant to the survey datasets where the linked occurrence or sample attributes had been joined to the survey dataset (see point 5 in both examples above). It would be possible to keep attributes in separate lists I think, though the cost of this increase in flexibility might be an increase in the complexity of some of the joins required to find the attributes.
I think I've just confirmed your 2nd bullet point and the question about websites augmenting the UKSI attributes with their own.

The last question about sub-samples would not be necessary for single sample/record forms. For multi-record forms I think that in many cases you could attach the attributes to the same sample. For example, let's say you were recording lichens with specific pollution tolerance and other requirements relating to the chemical conditions. One lichen might ask you to measure the pH, another might ask you to measure the sulphur content of the substrate. If they are in the same sample, then both these measurements can simply be attached to one sample. I suppose there may be other implementations where you want to create sub-samples to keep things separate, but I think that would be up to the implementation.
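As a rough sketch of the single-sample case, a multi-record form could collect the sample attributes requested by every taxon on the list and show each one once. The lichen names, attribute captions and the function `sample_attributes_for` are made up for illustration.

```python
# Hypothetical mapping from a taxon to the sample-level attributes it asks for.
SAMPLE_ATTRS_BY_TAXON = {
    "Lichen A": ["Substrate pH"],
    "Lichen B": ["Substrate sulphur content"],
    "Lichen C": ["Substrate pH"],
}


def sample_attributes_for(taxa):
    """Return each sample attribute needed by the listed taxa once, in first-seen order."""
    seen = []
    for taxon in taxa:
        for attr in SAMPLE_ATTRS_BY_TAXON.get(taxon, []):
            if attr not in seen:
                seen.append(attr)
    return seen
```

Here two lichens that both need the pH measurement still produce a single pH control on the sample, alongside the sulphur content control.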

JimBacon Apr 23, 2018

Collaborator

Glad I asked about how other websites create dynamic attributes. That was not the design I had imagined. It makes the UK Master List a special case of a taxon list. Will that work okay for other warehouses you know?

Sorry about the sub-sample question. I realise (after a lot of muddled thinking) that this just comes down to your experimental design of what a sample is and how fine-grained your measurements are. If you choose a design without sub-samples and you add two different species needing the same dynamic sample attribute then you only need to add it to the form once. Simple.

kazlauskis Apr 23, 2018

Member

Just to double check I understood it correctly. A taxon attribute would be defined in such (simplified here) way:

name: "wing length",
type: "float",
smp: [ ]
occ: [X]
taxa: {
  "Insecta": { value: "any"},
  "Odonata": { value: " 0.5-200"},
}

This would be linked to the occurrence attribute that is then associated to the 'mega-survey' shared between the iRecord website and the app.

 // what is the role of the occ_attr if we have the taxon_attr?
 taxon_attr <- occ_attr <- survey // ?

A scenario: a user has recorded the wrong insect species and, to correct it, now selects Aeshna caerulea in the dynamic form. The form would redraw itself to show the 'wing length (0.5-200)' attribute that the species' ancestor (Odonata) has an association with.

"Animalia" : {
   "Euarthropoda": {
      "Insecta": {
           "Odonata": {
               "Aeshnidae": {
                   "Aeshna caerulea"
                }
            }
       }
   }
}

If this is how it is, then I like it, though it is different and poses some challenges for integrating it with the mobile apps. At the moment, the dynamic attributes in the iRecord App are flat - there is no hierarchy, as the definitions of the attributes are directly associated with informal taxon groups and nothing else. If we moved to your proposed approach then, from the user's perspective, all the submitted records that belong to the current app-survey would still be valid (?) and the user shouldn't notice much difference. The app, on the other hand, would have to move to using a hierarchical UKSI list within the mobile device, which is possible, but requires a bit more thinking and rewriting some highly optimised data structures. I will look into this and get back to you soon. Otherwise, as far as I understand it, it sounds powerful and a good idea.
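To make the scenario concrete, here is a minimal sketch of the lookup as I understand it, using the simplified definition and hierarchy above. The names `resolve` and `validator` are hypothetical, and the "low-high" range format is just the shorthand from this comment, not the warehouse's storage format.

```python
# Root-first path for the selected species, taken from the hierarchy above.
HIERARCHY = ["Animalia", "Euarthropoda", "Insecta", "Odonata", "Aeshnidae", "Aeshna caerulea"]

# The simplified taxon attribute definition from this comment.
WING_LENGTH = {
    "name": "wing length",
    "type": "float",
    "taxa": {"Insecta": "any", "Odonata": "0.5-200"},
}


def resolve(attr, path):
    """Return (taxon, value) for the lowest taxon in the path that has a value, else None."""
    for taxon in reversed(path):
        if taxon in attr["taxa"]:
            return taxon, attr["taxa"][taxon]
    return None


def validator(value):
    """Build a check for an input number from "any" or a non-negative "low-high" range."""
    if value.strip() == "any":
        return lambda x: True
    low, high = (float(part) for part in value.strip().split("-"))
    return lambda x: low <= x <= high
```

Selecting Aeshna caerulea resolves to the value attached at Odonata, so the form would show the control with the 0.5-200 check; a taxon that only sits under Insecta would resolve to "any" and accept every number.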

DavidRoy Apr 24, 2018

Collaborator

If I understand it correctly, John's proposal has the advantage of being flexible to enable dynamic attributes to be set at any level of the taxon hierarchy. What Karolis has implemented in the App sets the dynamic attributes at the taxon_group level on the expectation that most attributes are defined by National Recording Schemes.

My question is therefore whether the extra complexity is needed. I assume John's other work requires this.

johnvanbreda Aug 20, 2018

Collaborator

@kazlauskis & @DavidRoy, I think your question is basically the same - is there actually a need to attach dynamic attributes at any level in the hierarchy, or would limiting the attachment at group level be sufficient? The solution currently implemented was funded by the DGfM (German Mycologists) and they do have the requirement to link attributes in a much more fine-grained way than at taxon group level. Typically attributes will be associated at the genus level and simply associating all attributes to "fungi" would defeat the purpose of the development. In fact they are also able to limit the associations to certain life stages, e.g. measurements of cap width for fruiting fungi only. This extra stage linking functionality can simply be ignored if not required.
I suspect that a solution limited to linking attributes to taxon groups would meet the requirements in the UK only so far and that sooner or later we will find cases where additional control is required. Just as an example, recording "insect - hymenopteran" might need a prey attribute for wasps but attributes relating to flower visit/pollen collection for bees.
Karolis - V2 (the develop branch, which this feature is in) has a cache_taxon_paths table with the hierarchical information required to make querying for all attributes linked to a taxon or any of its parents efficient.
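A rough sketch of why a path cache helps, using the hymenopteran example. The data shape below is an assumption made for illustration and is not the actual cache_taxon_paths schema; the attribute links, including the “Life stage” one, and the function `attributes_for` are hypothetical too.

```python
# Hypothetical stand-in for a path cache: each taxon maps to its root-first
# list of ancestors, with the taxon itself last.
TAXON_PATHS = {
    "Vespula vulgaris": ["Insecta", "Hymenoptera", "Vespidae", "Vespula vulgaris"],
    "Bombus terrestris": ["Insecta", "Hymenoptera", "Apidae", "Bombus terrestris"],
}

# Illustrative links between the taxon an attribute is attached to and the attribute.
ATTRIBUTE_LINKS = [
    ("Hymenoptera", "Life stage"),
    ("Vespidae", "Prey"),
    ("Apidae", "Flower visited"),
]


def attributes_for(taxon):
    """Return every attribute linked to the taxon or any ancestor in its cached path."""
    path = set(TAXON_PATHS.get(taxon, []))
    return [attr for linked_taxon, attr in ATTRIBUTE_LINKS if linked_taxon in path]
```

Because the whole ancestor list is available in one lookup, finding the attributes is a single membership test per link rather than a walk up the parent chain: the wasp gets the prey attribute and the bee gets the flower visit one.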

kazlauskis Sep 9, 2018

Member

I am happy for this to be done either way. I will very soon be blocked by this, so it would be good to have the edit form on the website asap. The app is now going to support even more taxon-specific attributes, and if we don't have a form ready for users to edit their records, editing of the app's records on the website will be limited at best. Otherwise, it will be very confusing because the forms will have to be bloated to accept any possible argument, which our super-survey is holding at the moment.

If I understand this correctly, we can assign the same set of attributes to multiple taxa that would essentially constitute an informal taxon group? If so, then I am all good. Later syncing such survey config with the app might be a challenge, but at least it will get me going.

Thanks!

kazlauskis Sep 16, 2018

Member

I am happy to create a highly dynamic edit form myself - based on front-end JavaScript rather than the PHP iform module. This doesn't follow how other forms are done right now, but it might be a temporary solution to get us going.

johnvanbreda Sep 17, 2018

Collaborator

Hi Karolis - I plan to have the dev server up and running with the latest code this morning - will that help? You can then try the new dynamic forms stuff out.

kazlauskis Sep 17, 2018

Member

Thanks John, yes, this would be great. I would be able to start working on it this weekend, so any time this week is good.

johnvanbreda Sep 21, 2018

Collaborator

The dev warehouse is now running the v2 code.
