Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add type=xml-internal and generate instance XML from choices sheet #176

Closed
lognaturel opened this issue Feb 26, 2018 · 11 comments
Closed

Add type=xml-internal and generate instance XML from choices sheet #176

lognaturel opened this issue Feb 26, 2018 · 11 comments

Comments

@lognaturel
Copy link
Contributor

lognaturel commented Feb 26, 2018

As a form designer, I want to be able to specify static data sets that can be queried from the form. Currently this can be hacked with selects and choice-filter but it's not elegant (e.g. https://docs.google.com/spreadsheets/d/1mQGVWaJA4QQp3jtddMdcRSDjc1O-j_bB79GUnlj5k10/edit#gid=0 where we want to specify a lookup table without needing an external file).

#107 added type xml-external to specify an external XML instance that can be queried from within the primary instance.

xml-internal would work the same way but create an internal secondary instance. This is useful for specifying additional settings as described in #66 or including small, static data sets that are useful to do XPath queries on.

type name
xml-internal mydata

The internal secondary instance named mydata would be generated from the mydata list on the choices sheet.

Alternate ideas for instance generation For this to be useful, pyxform should also generate the instance. This could be done from the existing choices sheet which has very similar behavior currently. If going this route, it might be necessary to change the rules that govern whether or not a column's values are included in an instance. Currently columns other than name/value and label are only included if there's a choice-filter that uses it. For more generic data sets `choice-filter` doesn't make sense. For example, the rule could be changed to something like if any row in this list specifies a value in the column, include it.

Alternately, another sheet with name something like datasets could be introduced. The benefit there is that name and label wouldn't need to be required and a different rule for including columns could easily be introduced.

Instance generation should be possible for xml-external as well (much like the itemsets.csv file is generated).

@MartijnR
Copy link
Contributor

MartijnR commented Feb 27, 2018

Thanks for this. I think the use cases (vs. external data) are situations where the data is not used across forms, or where the user prefers an easier-to-understand (all-in-one xls file) method.

Do you think the XML structure should be the same <root><item>...</item></root> as is currently used for selects with choice_filters, and equal to what's proposed here? (excluding the proposed lang attribute addition I guess, as that's only solving a problem for external data) That makes sense to me and puts us right on the path towards explicitly/properly adding external CSV data in the future.

I don't have a strong preference for using the existing choices sheet vs a new datasets sheet. I agree the latter is a little more flexible because of bypassing the name/label column requirement, but it's mostly about what would be easier for users I think. At first glance, there seems to be a semantic difference between choices and the data added with new feature. However, it is possible to use the data in the new feature to generate a dynamic choice list for select questions as well and vice versa choices can currently be used for calculations (which might be confusing?).

@lognaturel
Copy link
Contributor Author

Yes, I think using the <root><item>...</item></root> structure makes sense.

It's also not totally clear to me whether adding a datasets sheet is helpful or confusing. I think for most users choices and datasets would be conceptually different.

However, it is possible to use the data in the new feature to generate a dynamic choice list for select questions as well and vice versa choices can currently be used for calculations (which might be confusing?).

I think if someone understands that choices can be used for arbitrary calculations having two sheets won't be confusing. But for someone who doesn't know how they're related conflating the two seems more confusing to me.

Hoping we get some other opinions on this!

@nribeka
Copy link
Contributor

nribeka commented Oct 2, 2019

Interested with this one.

@nribeka
Copy link
Contributor

nribeka commented Oct 2, 2019

@lognaturel,

So if I'm getting this right, the idea is basically:

  • When the type is xml-internal, you create choices based on the choices tab
  • Add them to the survey as a separate instance.

Update 05.03PM:

  • The approach will be either using existing "choices" json element, or create a separate "datasets" element in the json object, correct?

@lognaturel
Copy link
Contributor Author

When the type is xml-internal, you create choices based on the choices tab
Add them to the survey as a separate instance.

That's right. Please also note that the same list could also be used as a select. In that case, the instance should only be created once. Additionally, there may be more data columns specified beyond name and label and those should always be included for lists that are used as xml-internal.

I'm leaning towards keeping all data list definitions on the choices sheet. As @MartijnR says, any data list can be used both for populating selects and for arbitrary querying so separating them feels odd.

@nribeka
Copy link
Contributor

nribeka commented Oct 3, 2019

@lognaturel,

After reading and poking around the code, I find an odd behavior. If the parser encounter choice_filter it will add every choices as an instance.

So if you have form like this:

type name label choice_filter
select_one states state The State state = 'VA'
list_name name label
states VA Virginia
states IN Indiana
fruits apple Apple
fruits banana Banana

You will ended up with form like the following. It could be an expected behavior, or maybe this is a bug?

<?xml version="1.0"?>
<h:html xmlns="http://www.w3.org/2002/xforms" xmlns:ev="http://www.w3.org/2001/xml-events" xmlns:h="http://www.w3.org/1999/xhtml" xmlns:jr="http://openrosa.org/javarosa" xmlns:odk="http://www.opendatakit.org/xforms" xmlns:orx="http://openrosa.org/xforms" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <h:head>
    <h:title>data</h:title>
    <model>
      <instance>
        <data id="data">
          <state/>
          <meta>
            <instanceID/>
          </meta>
        </data>
      </instance>
      <instance id="states">
        <root>
          <item>
            <label>Virginia</label>
            <name>VA</name>
          </item>
          <item>
            <label>Indiana</label>
            <name>IN</name>
          </item>
        </root>
      </instance>
      <instance id="fruits">
        <root>
          <item>
            <label>Apple</label>
            <name>apple</name>
          </item>
          <item>
            <label>Banana</label>
            <name>banana</name>
          </item>
        </root>
      </instance>
      <bind nodeset="/data/state" type="select1"/>
      <bind jr:preload="uid" nodeset="/data/meta/instanceID" readonly="true()" type="string"/>
    </model>
  </h:head>
  <h:body>
    <select1 ref="/data/state">
      <label>The State</label>
      <itemset nodeset="instance('states')/root/item[state = 'VA']">
        <value ref="name"/>
        <label ref="label"/>
      </itemset>
    </select1>
  </h:body>
</h:html>

@lognaturel
Copy link
Contributor Author

#383 looks like it was close. Closing that so it can be picked up again.

@lognaturel
Copy link
Contributor Author

Looking at this again, I'm not very enthusiastic about the xml-internal name. It makes perfect sense by analogy to xml-external but it has very little meaning to a user who doesn't know about the underlying ODK XForms spec. Also, internally, there's only XML. Maybe something like data-internal? That's not great either. I think the default should be to stick with xml-internal but I did want to mention this thought in case it made anyone else creative.

The issue described at #176 (comment) is tracked at #387

@lognaturel
Copy link
Contributor Author

Alternative: introduce a new type dataset that only has a name.

  • if name has .csv, .xml or .geojson suffix: an external secondary instance declaration is added. The name column without the extension is used as its id. The name with jr://file/ prefixed is its src.
  • if name has no .csv, .xml or .geojson suffix: an internal secondary instance is added. The name column is used as its id and to look up the list in the choice_list (same as select_one)

This is in conflict with the first point I brought up at #107 (comment) I don't find it as compelling anymore.

This would overlap with xml-external which is not great. But it would mean we don't have to introduce a new type for csv-external and it feels more user friendly than xml-internal.

A related option is to use dataset for what we described here as xml-internal and also introduce csv-external. I currently prefer dataset for both because it's less to remember.

@MartijnR
Copy link
Contributor

Seems an elegant solution! 👍

The name with jr://file/ prefixed is its src.

or jr://file-csv? (and not sure what is used for geojson)

@lindsay-stevens
Copy link
Contributor

#603 (Always generate secondary instance for selects) is completed so I think this use case can be achieved with selects. If a new question type is still preferred then please re-open.

Release 1.0 automation moved this from In progress to Done Dec 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Release 1.0
  
Done
Development

Successfully merging a pull request may close this issue.

4 participants