Skip to content
This repository has been archived by the owner on Nov 2, 2023. It is now read-only.

Filename of XLSForm export should be XML compliant #44

Closed
yanokwa opened this issue Dec 5, 2016 · 15 comments
Closed

Filename of XLSForm export should be XML compliant #44

yanokwa opened this issue Dec 5, 2016 · 15 comments

Comments

@yanokwa
Copy link
Member

yanokwa commented Dec 5, 2016

When pyxform converts XLS to XML, it uses the XLS file name as the root node name. And because XML has special character requirements (e.g., names must begin with a letter, colon, or underscore, subsequent characters can include numbers, dashes, and periods), we have to be careful about the name of the export.

One way to make this name compliant is to use the form ID in the export, but drop "build_" and the timestamp to make it more human-friendly. So for example, if your form id is build_Favorite-Color_1480937828, the exported file name should be Favorite-Color.
edit by @issa-tseng; this section is confusing as it refers to things Build does not actually do

With this strategy, it might still be possible to have an invalid root node name, so the export should also handle names that start with invalid characters.

@issa-tseng
Copy link
Member

I think exactly those restrictions are why I generate the name as described:

I prefix with build_ so that the leading character cannot be a number.
I postfix with the timestamp because aggregate chokes if multiple forms have the same ID and people were having problems with that.

If you'd prefer risking the user's own input I'm happy to go back to that behaviour.

@yanokwa
Copy link
Member Author

yanokwa commented Dec 21, 2016

I think my reference to the form ID confused things!

If you have a form name 1 Form and you export it, you get 1-Form-export. This XLSForm, when converted with pyxform will fail because 1-Form-export is used as the name of an XML node and nodes cannot start with a number.

Is it possible to add validation to form names?

@issa-tseng issa-tseng added this to the 0.3.3 milestone Jun 12, 2017
@issa-tseng
Copy link
Member

I'm just going to prefix a _ if I see a leading number. Cool @yanokwa ?

@yanokwa
Copy link
Member Author

yanokwa commented Jun 12, 2017

Hmm. Why not use validation on the names?

@issa-tseng
Copy link
Member

  1. Completely different part of the code.
  2. A 100% valid Build=>XML=>Aggregate=>Collect filename would be rejected simply because Build=>XLSForm could theoretically fail.
  3. Already-saved forms will load up and immediately flag red.
  4. It's a really confusing limitation to explain; see 2.

@yanokwa
Copy link
Member Author

yanokwa commented Jun 12, 2017

@lognaturel had an alternate idea. Given that the form is valid, maybe we should file this as an issue against pyxform and it should handle numbers more gracefully.

@issa-tseng
Copy link
Member

I'm okay with either! This is a pretty simple change on this side, though. Just catch XLSForm files on their way out the door and subst the attachment meta name.

@lognaturel
Copy link
Member

I'm happy with leading numbers being prepended by _.

@yanokwa
Copy link
Member Author

yanokwa commented Jun 14, 2017

I've filed this upstream: XLSForm/pyxform#130. I propose we hold off on a fix until the pyxform team has a chance to respond.

@issa-tseng
Copy link
Member

This is still a trivial one-line change on our side. I'm happy to do it or to do nothing; please advise! :)

@yanokwa
Copy link
Member Author

yanokwa commented Jul 6, 2017

I say do nothing for now. I'd rather fix this upstream.

@issa-tseng
Copy link
Member

Okay, removing from the milestone.

@issa-tseng issa-tseng removed this from the 0.3.3 milestone Jul 6, 2017
@florianm florianm added this to To do in Roadmap Dec 9, 2021
@florianm florianm modified the milestone: 0.5.0 Dec 13, 2021
@florianm
Copy link
Contributor

florianm commented Dec 13, 2021

Just to document current behaviour, a form "Test040" will be exported by build2xlsform v1.6 as "Test040-export.

Leading with the (sanitised) form title seems OK now.

I could even live (better) without the "-export" postfix because 1. the file extension clarifies the form standard and 2. including the exported form in R packages throws an error and I have to manually remove the -export from each file. The latter is an edge case that probably not many users suffer from.

Update: PR #266 changes this to "Test040.xslx".
Update 2: PR #271 could include #267 which might resolve this issue here.

@yanokwa is this issue addressed?

@florianm florianm moved this from To do to In progress in Roadmap Jan 27, 2022
@florianm florianm added this to the 0.4.1 hotfixes milestone Jan 27, 2022
@florianm
Copy link
Contributor

To document current behaviour, a form named "290test4" exports to "290test4.xslx" (no errors) with an internal form title and form id of "290test4". Uploading that to ODK Central (1.3) throws no errors.
Converting the XLSForm back to XLS also seems to work, no errors are raised:

<?xml version="1.0"?>
<h:html xmlns="http://www.w3.org/2002/xforms" xmlns:ev="http://www.w3.org/2001/xml-events" xmlns:h="http://www.w3.org/1999/xhtml" xmlns:jr="http://openrosa.org/javarosa" xmlns:odk="http://www.opendatakit.org/xforms" xmlns:orx="http://openrosa.org/xforms" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <h:head>
        <h:title>290test4</h:title>
        <model odk:xforms-version="1.0.0">
  [...]
            </itext>
            <instance>
                <data id="290test4" version="1643329150">
...

Does that mean that form titles beginning with numbers are now handled well on the XLSForm side, and this issue here can be closed?

@florianm florianm moved this from In progress to To do in Roadmap Jan 28, 2022
@lognaturel
Copy link
Member

Ah yes indeed! The file name used to be used as the node name for the child of the instance. Now it's always data.

Roadmap automation moved this from To do to Done Jan 28, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
Development

No branches or pull requests

4 participants