Add type=xml-internal and generate instance XML from choices sheet #176

lognaturel · 2018-02-26T23:32:02Z

As a form designer, I want to be able to specify static data sets that can be queried from the form. Currently this can be hacked with selects and choice-filter but it's not elegant (e.g. https://docs.google.com/spreadsheets/d/1mQGVWaJA4QQp3jtddMdcRSDjc1O-j_bB79GUnlj5k10/edit#gid=0 where we want to specify a lookup table without needing an external file).

#107 added type xml-external to specify an external XML instance that can be queried from within the primary instance.

xml-internal would work the same way but create an internal secondary instance. This is useful for specifying additional settings as described in #66 or including small, static data sets that are useful to do XPath queries on.

type	name
xml-internal	mydata

The internal secondary instance named mydata would be generated from the mydata list on the choices sheet.

Alternate ideas for instance generation

For this to be useful, pyxform should also generate the instance. This could be done from the existing choices sheet which has very similar behavior currently. If going this route, it might be necessary to change the rules that govern whether or not a column's values are included in an instance. Currently columns other than name/value and label are only included if there's a choice-filter that uses it. For more generic data sets `choice-filter` doesn't make sense. For example, the rule could be changed to something like if any row in this list specifies a value in the column, include it.

Alternately, another sheet with name something like datasets could be introduced. The benefit there is that name and label wouldn't need to be required and a different rule for including columns could easily be introduced.

Instance generation should be possible for xml-external as well (much like the itemsets.csv file is generated).

The text was updated successfully, but these errors were encountered:

MartijnR · 2018-02-27T18:18:12Z

Thanks for this. I think the use cases (vs. external data) are situations where the data is not used across forms, or where the user prefers an easier-to-understand (all-in-one xls file) method.

Do you think the XML structure should be the same <root><item>...</item></root> as is currently used for selects with choice_filters, and equal to what's proposed here? (excluding the proposed lang attribute addition I guess, as that's only solving a problem for external data) That makes sense to me and puts us right on the path towards explicitly/properly adding external CSV data in the future.

I don't have a strong preference for using the existing choices sheet vs a new datasets sheet. I agree the latter is a little more flexible because of bypassing the name/label column requirement, but it's mostly about what would be easier for users I think. At first glance, there seems to be a semantic difference between choices and the data added with new feature. However, it is possible to use the data in the new feature to generate a dynamic choice list for select questions as well and vice versa choices can currently be used for calculations (which might be confusing?).

lognaturel · 2018-03-01T01:45:28Z

Yes, I think using the <root><item>...</item></root> structure makes sense.

It's also not totally clear to me whether adding a datasets sheet is helpful or confusing. I think for most users choices and datasets would be conceptually different.

However, it is possible to use the data in the new feature to generate a dynamic choice list for select questions as well and vice versa choices can currently be used for calculations (which might be confusing?).

I think if someone understands that choices can be used for arbitrary calculations having two sheets won't be confusing. But for someone who doesn't know how they're related conflating the two seems more confusing to me.

Hoping we get some other opinions on this!

nribeka · 2019-10-02T18:50:02Z

Interested with this one.

nribeka · 2019-10-02T21:03:01Z

@lognaturel,

So if I'm getting this right, the idea is basically:

When the type is xml-internal, you create choices based on the choices tab
Add them to the survey as a separate instance.

Update 05.03PM:

The approach will be either using existing "choices" json element, or create a separate "datasets" element in the json object, correct?

lognaturel · 2019-10-03T03:27:47Z

When the type is xml-internal, you create choices based on the choices tab
Add them to the survey as a separate instance.

That's right. Please also note that the same list could also be used as a select. In that case, the instance should only be created once. Additionally, there may be more data columns specified beyond name and label and those should always be included for lists that are used as xml-internal.

I'm leaning towards keeping all data list definitions on the choices sheet. As @MartijnR says, any data list can be used both for populating selects and for arbitrary querying so separating them feels odd.

nribeka · 2019-10-03T15:06:43Z

@lognaturel,

After reading and poking around the code, I find an odd behavior. If the parser encounter choice_filter it will add every choices as an instance.

So if you have form like this:

type	name	label	choice_filter
select_one states	state	The State	state = 'VA'

list_name	name	label
states	VA	Virginia
states	IN	Indiana
fruits	apple	Apple
fruits	banana	Banana

You will ended up with form like the following. It could be an expected behavior, or maybe this is a bug?

<?xml version="1.0"?>
<h:html xmlns="http://www.w3.org/2002/xforms" xmlns:ev="http://www.w3.org/2001/xml-events" xmlns:h="http://www.w3.org/1999/xhtml" xmlns:jr="http://openrosa.org/javarosa" xmlns:odk="http://www.opendatakit.org/xforms" xmlns:orx="http://openrosa.org/xforms" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <h:head>
    <h:title>data</h:title>
    <model>
      <instance>
        <data id="data">
          <state/>
          <meta>
            <instanceID/>
          </meta>
        </data>
      </instance>
      <instance id="states">
        <root>
          <item>
            <label>Virginia</label>
            <name>VA</name>
          </item>
          <item>
            <label>Indiana</label>
            <name>IN</name>
          </item>
        </root>
      </instance>
      <instance id="fruits">
        <root>
          <item>
            <label>Apple</label>
            <name>apple</name>
          </item>
          <item>
            <label>Banana</label>
            <name>banana</name>
          </item>
        </root>
      </instance>
      <bind nodeset="/data/state" type="select1"/>
      <bind jr:preload="uid" nodeset="/data/meta/instanceID" readonly="true()" type="string"/>
    </model>
  </h:head>
  <h:body>
    <select1 ref="/data/state">
      <label>The State</label>
      <itemset nodeset="instance('states')/root/item[state = 'VA']">
        <value ref="name"/>
        <label ref="label"/>
      </itemset>
    </select1>
  </h:body>
</h:html>

lognaturel · 2021-10-18T16:59:05Z

#383 looks like it was close. Closing that so it can be picked up again.

lognaturel · 2021-10-18T17:24:26Z

Looking at this again, I'm not very enthusiastic about the xml-internal name. It makes perfect sense by analogy to xml-external but it has very little meaning to a user who doesn't know about the underlying ODK XForms spec. Also, internally, there's only XML. Maybe something like data-internal? That's not great either. I think the default should be to stick with xml-internal but I did want to mention this thought in case it made anyone else creative.

The issue described at #176 (comment) is tracked at #387

lognaturel · 2022-05-19T22:35:34Z

Alternative: introduce a new type dataset that only has a name.

if name has .csv, .xml or .geojson suffix: an external secondary instance declaration is added. The name column without the extension is used as its id. The name with jr://file/ prefixed is its src.
if name has no .csv, .xml or .geojson suffix: an internal secondary instance is added. The name column is used as its id and to look up the list in the choice_list (same as select_one)

This is in conflict with the first point I brought up at #107 (comment) I don't find it as compelling anymore.

This would overlap with xml-external which is not great. But it would mean we don't have to introduce a new type for csv-external and it feels more user friendly than xml-internal.

A related option is to use dataset for what we described here as xml-internal and also introduce csv-external. I currently prefer dataset for both because it's less to remember.

MartijnR · 2022-05-20T14:34:58Z

Seems an elegant solution! 👍

The name with jr://file/ prefixed is its src.

or jr://file-csv? (and not sure what is used for geojson)

lindsay-stevens · 2023-12-06T14:00:34Z

#603 (Always generate secondary instance for selects) is completed so I think this use case can be achieved with selects. If a new question type is still preferred then please re-open.

lognaturel added the needs discussion label Feb 26, 2018

lognaturel mentioned this issue May 1, 2018

Build some kind of mental model for external data getodk/docs#73

Open

lognaturel removed the needs discussion label May 10, 2018

lognaturel mentioned this issue May 22, 2018

Why aren't instances generated for all selects? #203

Closed

lognaturel mentioned this issue Jul 11, 2018

Accessing Choice metadata in a Forumla XLSForm/xlsform.github.io#131

Closed

ukanga added this to To do in Release 1.0 via automation Mar 19, 2019

nribeka mentioned this issue Oct 3, 2019

Allowing xml-internal question type. #383

Closed

lognaturel mentioned this issue Oct 16, 2019

Choice lists for all selects are generated as instances if any list uses a choice filter #387

Closed

lognaturel moved this from To do to In progress in Release 1.0 Nov 6, 2019

lognaturel mentioned this issue Mar 1, 2022

Always generate an instance for a list with columns other than name and label #590

Closed

lognaturel added the needs discussion label Mar 1, 2022

lindsay-stevens closed this as completed Dec 6, 2023

Release 1.0 automation moved this from In progress to Done Dec 6, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add type=xml-internal and generate instance XML from choices sheet #176

Add type=xml-internal and generate instance XML from choices sheet #176

lognaturel commented Feb 26, 2018 •

edited

MartijnR commented Feb 27, 2018 •

edited

lognaturel commented Mar 1, 2018

nribeka commented Oct 2, 2019

nribeka commented Oct 2, 2019 •

edited

lognaturel commented Oct 3, 2019

nribeka commented Oct 3, 2019

lognaturel commented Oct 18, 2021

lognaturel commented Oct 18, 2021

lognaturel commented May 19, 2022

MartijnR commented May 20, 2022

lindsay-stevens commented Dec 6, 2023

Add type=xml-internal and generate instance XML from choices sheet #176

Add type=xml-internal and generate instance XML from choices sheet #176

Comments

lognaturel commented Feb 26, 2018 • edited

MartijnR commented Feb 27, 2018 • edited

lognaturel commented Mar 1, 2018

nribeka commented Oct 2, 2019

nribeka commented Oct 2, 2019 • edited

lognaturel commented Oct 3, 2019

nribeka commented Oct 3, 2019

lognaturel commented Oct 18, 2021

lognaturel commented Oct 18, 2021

lognaturel commented May 19, 2022

MartijnR commented May 20, 2022

lindsay-stevens commented Dec 6, 2023

lognaturel commented Feb 26, 2018 •

edited

MartijnR commented Feb 27, 2018 •

edited

nribeka commented Oct 2, 2019 •

edited