Admin page: export form definitions and data with attachments #779

Open
ebruchez opened this issue Jan 27, 2013 · 32 comments

Comments

@ebruchez
Collaborator

ebruchez commented Jan 27, 2013

Feature

The form definition, its dependencies (images, attachments), and/or form data and dependencies, are exported into a zip archive, which can be imported back into a Form Builder/Form Runner installation.

Use cases

  • Forms are developed on an instance of Form Builder distinct from the one used for staging and/or production and not reachable via direct internet connection.
  • Archival/backup of forms
  • Exchange of forms
  • Migrating from one persistence implementation to another (e.g. eXist to relational)
    • We won't support exporting from eXist at this point as that implementation has been removed.
  • Exporting from database to Orbeon Forms resources

Location of functionality

  • Form Runner Admin page
  • Form Runner Summary page
  • P2: Form Builder Summary/Detail pages for non-published form definitions

Steps

  • retrieve form definition, related dependencies, form data
  • use the zip processor (or, more likely, Scala code) to produce a downloadable zip file
  • multiple forms can be selected for export when possible
  • specific data can be selected for export when possible (from Summary page only)

Service

It might be good to implement most of the functionality as a service first. This would also enable a scripted mode which would not require the Orbeon Forms user interface.

NOTE: This however would still require a running Form Builder instance.

  • This is implemented as a service that returns a zip archive or an error.
  • You can pass the service the id of a form, and/or search values as shown in the Form Builder Summary page. This selects 0 or more forms.
  • Each form selected is included in the archive.
  • If possible, calling the service from the command line could be done with a simple wrapper around a tool like wget or curl. If that is not possible or desirable, another solution can be found.
@ebruchez
Collaborator Author

ebruchez commented Jun 6, 2013

See also #1055.

@ebruchez
Collaborator Author

ebruchez commented Feb 18, 2015

Exporting data with dependencies should also be an option.

@ebruchez
Collaborator Author

Also requested: ability to export selected subset of data only. See customer thread.

@ebruchez
Collaborator Author

+1 from customer

@ebruchez
Collaborator Author

ebruchez commented May 2, 2018

+1 from customer

@ebruchez
Collaborator Author

For this customer, handling form definitions only is enough; there is no requirement to handle data as well. This might simplify a first implementation a little bit.

@ebruchez
Collaborator Author

ebruchez commented May 18, 2018

Versioning should be handled as well:

  • upon exporting, the user must be able to select the version(s) to export
  • version numbers must be included in the zip archive

See also #1451 and #3448 about relevant improvements to versioning on the Home page.

@ebruchez
Collaborator Author

The requirement also calls for:

  • doing this manually via a Form Runner UI
  • doing this via an API

@avernet
Collaborator

avernet commented Jun 27, 2018

+1 from customer

@ebruchez
Collaborator Author

ebruchez commented Jul 10, 2018

+1 from customer as an option to migrate between providers (although another, explicit way of migrating between providers might be better).

@ebruchez
Collaborator Author

+1 from customer

@ebruchez
Collaborator Author

ebruchez commented Nov 9, 2021

+1 from customer for archival of data.

ZIP64 came out with the ZIP 4.5 specification in 2001 and Java 7 (2011) supports ZIP64. So file size for large exports should not be a problem.

@ebruchez
Collaborator Author

ebruchez commented May 29, 2023

First step: write "archive export API". Parameters:

  • date range

    • all
    • last-modified-time-gt/last-modified-time-ge
    • last-modified-time-lt/last-modified-time-le (optional)
  • data vs. form vs. both (vs. attachments?)

    • content=form-definition|form-data|attachments ?
  • app/form/form-version

    • app: part of the path, optional
    • form: part of the path, optional
    • form-version: URL parameter or header?
      • Search API uses Orbeon-Form-Definition-Version header for latest/specific/all
        • but that might not have been the best choice vs. a URL parameter
    • document-id: part of the path, optional (requires app/form)
      • scenario: export a single instance of form data with its attachments
  • P2: "dry run" to just return the count(s) of operations

New entry point:

/fr/service/persistence/export/$app/$form/$document-id?form-version=$form-version

$app/$form/$document-id/form-version are all optional, but in constrained ways:

  • no parameter: export everything
    • we can imagine allowing form-version=latest|all, but specific doesn't make much sense here
  • $app only: export everything under the app
    • we can imagine allowing form-version=latest|all, but specific doesn't make much sense here
  • $app and $form
    • form-version=latest|specific|all
  • $app and $form and document-id
    • form-version: can be omitted or specific, but if specified it must match
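
To make the constraints above concrete, here is a minimal client-side sketch that builds the export path and rejects combinations the list rules out. The function name `build_export_path` is hypothetical, and treating a "specific" version as a string of digits is an assumption. The "must match" check for a document can only be done by the server, so it is not attempted here.

```python
from typing import Optional
from urllib.parse import quote, urlencode

BASE_PATH = "/fr/service/persistence/export"


def build_export_path(
    app: Optional[str] = None,
    form: Optional[str] = None,
    document_id: Optional[str] = None,
    form_version: Optional[str] = None,
) -> str:
    """Build the export path, enforcing the constraints listed above."""
    if form is not None and app is None:
        raise ValueError("form requires app")
    if document_id is not None and form is None:
        raise ValueError("document-id requires app and form")

    # Assumption: a "specific" version is written as a plain number
    is_specific = form_version is not None and form_version.isdigit()
    if form_version is not None and not is_specific and form_version not in ("latest", "all"):
        raise ValueError("form-version must be 'latest', 'all', or a number")
    # A specific version only makes sense once a form is identified
    if is_specific and form is None:
        raise ValueError("a specific form-version requires app and form")
    # For a single document, the version is either omitted or specific
    if document_id is not None and form_version is not None and not is_specific:
        raise ValueError("with document-id, form-version must be omitted or specific")

    segments = [quote(s, safe="") for s in (app, form, document_id) if s is not None]
    path = "/".join([BASE_PATH] + segments)
    if form_version is not None:
        path += "?" + urlencode({"form-version": form_version})
    return path
```

For example, `build_export_path("acme", "sales", form_version="2")` gives a path for one form version, while `build_export_path("acme", form_version="3")` raises because a specific version without a form does not make sense.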

@ebruchez
Collaborator Author

We are implementing this mostly at the level of the persistence proxy, not directly in the providers.

  • a request that needs to search for form definitions and their versions uses the Form Metadata API
  • a request that needs to search for data uses the Search API
  • GETting data and attachments uses the CRUD API
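
As a rough illustration of that layering, the sketch below chains three injected callables that stand in for the Form Metadata API, the Search API, and CRUD GETs. All names (`plan_export`, `list_forms`, `search_data`, `get_resource`) are hypothetical, and the paths are simplified: attachments and timestamp directories are left out.

```python
from typing import Callable, Iterable, Iterator, Tuple

FormKey = Tuple[str, str, int]  # (app, form, version)


def plan_export(
    list_forms: Callable[[], Iterable[FormKey]],
    search_data: Callable[[FormKey], Iterable[str]],
    get_resource: Callable[[str], bytes],
) -> Iterator[Tuple[str, bytes]]:
    """Yield (archive path, content) pairs by chaining the three kinds of calls."""
    # list_forms stands in for the Form Metadata API
    for app, form, version in list_forms():
        form_path = f"{app}/{form}/{version}/form/latest/form.xhtml"
        # get_resource stands in for a CRUD GET
        yield form_path, get_resource(form_path)
        # search_data stands in for the Search API and returns document ids
        for document_id in search_data((app, form, version)):
            data_path = f"{app}/{form}/{version}/data/{document_id}/data.xml"
            yield data_path, get_resource(data_path)
```

Because the three dependencies are passed in, the same loop can be exercised with in-memory fakes, without any provider.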

@ebruchez
Collaborator Author

ebruchez commented Jun 8, 2023

  • export form definitions with attachments only
    • initial implementation
    • support app/form/version filters
  • export form data with attachments only
    • retrieve form data and attachments
    • call Search API
  • consider passing list of app/form/versions to export
  • option to export revision history
    • call /history API
  • date ranges
    • constraint is only on data
  • error handling options: fail fast vs. accumulate errors
  • attachments
    • determine where to place them in hierarchy; timestamp?
  • should we use latest, current, or a timestamp for the latest data?
  • timestamps should have a defined resolution: second, millisecond, or microsecond
    • right now latest data doesn't have milliseconds, while search API does
  • review behavior with permissions
    • do we need to check permissions beyond what the APIs return?
    • what about list permissions?
    • see Purge API: similar question, where current user permissions might not allow her to even see some of the data; should admin be able to bypass permissions and see/export all? probably yes, but that requires changes to the Search API
  • generate a meaningful filename
    • 2023-06-22-orbeon-forms-export.zip
    • and/or when possible add details, like form-data or acme-sales
  • handle deleted data
  • investigate: Search API returns only the create operation; why?
  • easy: rename and create separate issue for import
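
For the "generate a meaningful filename" item above, a minimal sketch could look like this. The function name `export_filename` is hypothetical, and appending the details after the fixed part is an assumption, since the list does not say where details such as form-data or acme-sales should go.

```python
from datetime import date
from typing import Optional


def export_filename(day: date, details: Optional[str] = None) -> str:
    """Return a name such as 2023-06-22-orbeon-forms-export.zip."""
    parts = [day.isoformat(), "orbeon-forms-export"]
    if details:
        # Assumption: details are appended after the fixed part of the name
        parts.append(details)
    return "-".join(parts) + ".zip"
```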

@ebruchez
Collaborator Author

ebruchez commented Jun 9, 2023

Clarification: for the date range filtering, we probably want this to apply to the data, not the form definition. (If we want some filtering to apply to form definitions as well, we need to be explicit and add other parameters.)

For app/form/version triplet selection, we have:

  • all forms
  • filtering by range
  • specific triplets

If we pass lots of triplets, we need to watch out for:

  • how to encode them
  • URL length

However, internally, we don't have a URL length limit, and the use case is the Admin page passing a large number of triplets to the API. We are ok passing this as URL parameters for now, and later, if needed, we can add a POST option.
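
One possible encoding, sketched under assumptions: each triplet becomes one repeated URL parameter whose value is `app/form/version`. The parameter name `match` is hypothetical, and the sketch assumes app and form names never contain a slash.

```python
from typing import List, Tuple
from urllib.parse import parse_qsl, urlencode

Triplet = Tuple[str, str, int]  # (app, form, version)


def encode_triplets(triplets: List[Triplet], param: str = "match") -> str:
    """Encode each triplet as one repeated, percent-encoded URL parameter."""
    return urlencode([(param, f"{app}/{form}/{version}") for app, form, version in triplets])


def decode_triplets(query: str, param: str = "match") -> List[Triplet]:
    """Inverse of encode_triplets; ignores unrelated parameters."""
    result: List[Triplet] = []
    for name, value in parse_qsl(query):
        if name == param:
            app, form, version = value.split("/")
            result.append((app, form, int(version)))
    return result
```

Checking `len(encode_triplets(...))` before sending gives an easy way to decide when a POST option becomes necessary.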

@ebruchez
Collaborator Author

For date/times, we need clarity. Suggesting (updated above):

  • last-modified-time-gt/last-modified-time-ge
  • last-modified-time-lt/last-modified-time-le (optional)

@ebruchez
Collaborator Author

Realizing that there is a conflict between:

  • the ability to specify app/form/document-id and form version by path and URL parameter (for the form version)
  • the ability to pass a series of triplets via URL parameter

Should we support both? The Admin page will support the triplets, since that is the result of the selection in the UI.

@ebruchez
Collaborator Author

For how attachments are currently handled by Form Runner, see this comment. This said, at the API/service level/database, we do support updates to attachments.

In theory, we could call the Data History on attachments as well.

Right now, we don't. Scenario:

  • save form data with attachments
  • don't change attachments but save data again

Result:

issue/779/1/form/latest/form.xhtml
issue/779/1/data/9e217967755f608bd8e90b194a88b64482dc74b9/2023-06-14T20:48:02Z/data.xml
issue/779/1/data/9e217967755f608bd8e90b194a88b64482dc74b9/2023-06-14T20:48:02Z/edd1cbc2f97dd5240303de095e30cda779d1e8ef.bin
issue/779/1/data/9e217967755f608bd8e90b194a88b64482dc74b9/2023-06-14T20:48:02Z/c0d22317677b5ea07885d710e0a0a6318a94a779.bin
issue/779/1/data/9e217967755f608bd8e90b194a88b64482dc74b9/2023-06-14T20:47:11.941Z/data.xml
issue/779/1/data/9e217967755f608bd8e90b194a88b64482dc74b9/2023-06-14T20:47:11.941Z/edd1cbc2f97dd5240303de095e30cda779d1e8ef.bin
issue/779/1/data/9e217967755f608bd8e90b194a88b64482dc74b9/2023-06-14T20:47:11.941Z/c0d22317677b5ea07885d710e0a0a6318a94a779.bin

This is not good because the attachments are duplicated.

@ebruchez
Collaborator Author

ebruchez commented Jun 15, 2023

Solution:

  • for a given document id, keep track of attachment filenames across historical data as well
  • write the first one as latest
  • duplicate attachment references are ignored
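
A minimal sketch of that bookkeeping, assuming revisions for one document id are processed newest first, so an attachment is written only under the first revision that references it. The function name `attachments_to_write` is hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple

# (timestamp, attachment filenames referenced by that revision), newest first
Revision = Tuple[str, List[str]]


def attachments_to_write(revisions: Iterable[Revision]) -> Dict[str, List[str]]:
    """Map each revision timestamp to the attachments not yet written."""
    seen: Set[str] = set()
    plan: Dict[str, List[str]] = {}
    for timestamp, filenames in revisions:
        fresh: List[str] = []
        for name in filenames:
            if name not in seen:  # duplicate references are ignored
                seen.add(name)
                fresh.append(name)
        plan[timestamp] = fresh
    return plan
```

With the two revisions from the scenario above, both `.bin` files are written once under `2023-06-14T20:48:02Z`, and nothing is written again under `2023-06-14T20:47:11.941Z`.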

@ebruchez
Collaborator Author

ebruchez commented Jun 15, 2023

For data date ranges:

  • include-data-revision-history=false
    • simply check the current data's date is within range
  • include-data-revision-history=true
    • include all the entries within the range
    • this means current data might not be included
    • if data is deleted, nothing will be exported (see above: we need a change to the Search API to support including historical data for deleted data)
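
The two modes can be sketched as follows for a single document. This is only an illustration: the function names are hypothetical, the bounds mirror the `gt`/`ge`/`lt`/`le` suffixes of the proposed parameters, and deleted data is not modeled.

```python
from datetime import datetime
from typing import List, Optional


def in_range(
    t: datetime,
    gt: Optional[datetime] = None,
    ge: Optional[datetime] = None,
    lt: Optional[datetime] = None,
    le: Optional[datetime] = None,
) -> bool:
    """True if t satisfies every bound that is set."""
    return (
        (gt is None or t > gt)
        and (ge is None or t >= ge)
        and (lt is None or t < lt)
        and (le is None or t <= le)
    )


def select_revisions(revisions: List[datetime], include_history: bool, **bounds) -> List[datetime]:
    """`revisions` holds last-modified times for one document, newest first."""
    if not revisions:
        return []
    if include_history:
        # All entries within the range; the current data may be excluded
        return [t for t in revisions if in_range(t, **bounds)]
    # Without history, only the current data's date is checked
    current = revisions[0]
    return [current] if in_range(current, **bounds) else []
```

With an upper bound that falls before the current revision, the history mode returns only older entries, while the non-history mode returns nothing.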

@ebruchez
Collaborator Author

ebruchez commented Jun 15, 2023

The UI must allow controlling the following axes:

  • form vs. data
    • form definitions only
    • form data only
    • both
  • current data vs. revision history
    • current data only
    • also include revision history
    • no need to allow for "revision history only"
  • app/form
    • everything
    • everything for a given app
    • everything for a given app/form
    • from selection (checkboxes)
  • form versions
    • latest only
    • all
    • from selection (checkboxes), same checkbox/option as above
  • date ranges
    • only data older than x
    • only data newer than x

All of these options require a dialog. We must have a button or an entry in the Operations menu to open that "Export" dialog. Once the options are selected, an "Export" (vs. "Cancel") button will start the export and download the zip.
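
As a sketch of how the dialog's choices could translate into API parameters: the class `ExportOptions` is hypothetical, the `content` parameter is still marked with a question mark in the API notes, and repeating it to mean "both" is an assumption. The app/form and checkbox selections are left out here.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ExportOptions:
    include_form_definitions: bool = True
    include_form_data: bool = True
    include_revision_history: bool = False
    form_version: str = "latest"      # "latest" or "all"
    older_than: Optional[str] = None  # ISO date/time
    newer_than: Optional[str] = None  # ISO date/time

    def to_params(self) -> List[Tuple[str, str]]:
        """Translate the selected options into (name, value) URL parameters."""
        if not (self.include_form_definitions or self.include_form_data):
            raise ValueError("select form definitions, form data, or both")
        params: List[Tuple[str, str]] = []
        if self.include_form_definitions:
            params.append(("content", "form-definition"))
        if self.include_form_data:
            params.append(("content", "form-data"))
            history = "true" if self.include_revision_history else "false"
            params.append(("include-data-revision-history", history))
        params.append(("form-version", self.form_version))
        if self.newer_than:
            params.append(("last-modified-time-gt", self.newer_than))
        if self.older_than:
            params.append(("last-modified-time-lt", self.older_than))
        return params
```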

@ebruchez
Collaborator Author

For date/time constraints, a use case might typically be:

So the above must be supported. Also:

  • Use date/time fields. The user will set the time at 00:00.
  • For now, should conditions mean only before/after a date/time? Ideally:
    • "is"
    • "before"
    • "after"
    • "on or before" (for a day)/"at or before" (for a time)
    • "on or after" (for a day)/"at or after" (for a time)
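
For whole days, the conditions above could map onto the `last-modified-time-*` parameters roughly as sketched below. The function name `day_condition_to_params` is hypothetical, and both the UTC interpretation of a day and the half-open interval used for "is" are assumptions.

```python
from datetime import date, datetime, time, timedelta
from typing import Dict


def day_condition_to_params(condition: str, day: date) -> Dict[str, str]:
    """Map a day-level condition to last-modified-time-* parameters (UTC assumed)."""
    start = datetime.combine(day, time.min)
    end = start + timedelta(days=1)  # start of the next day

    def fmt(t: datetime) -> str:
        return t.isoformat() + "Z"

    mapping = {
        "is": {"last-modified-time-ge": fmt(start), "last-modified-time-lt": fmt(end)},
        "before": {"last-modified-time-lt": fmt(start)},
        "after": {"last-modified-time-ge": fmt(end)},
        "on or before": {"last-modified-time-lt": fmt(end)},
        "on or after": {"last-modified-time-ge": fmt(start)},
    }
    if condition not in mapping:
        raise ValueError(f"unknown condition: {condition}")
    return mapping[condition]
```

Expressing "on or before" as "strictly before the start of the next day" avoids having to pick a last instant of the day, which would depend on timestamp resolution.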

@ebruchez
Collaborator Author

For timestamps, we get them through the following APIs:

  • CRUD
  • Search
  • History

In the first case, we use the standard Created/Last-Modified headers, and those only have a resolution of one second. In the other cases, we return our own ISO dates, which have millisecond resolution. Suggesting adding millisecond ISO dates to the GET/HEAD calls as well. If present, those will be used for the purpose of setting the last modification dates in the Zip archive entry metadata and paths.
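
A small sketch of writing one entry with its timestamp in both the path and the entry metadata, using Python's standard `zipfile` module; `add_entry` and `parse_iso` are hypothetical helper names. Note that the classic ZIP entry timestamp is a DOS date/time with two-second resolution, so only the path can carry the full millisecond value.

```python
import io
import zipfile
from datetime import datetime, timezone


def parse_iso(timestamp: str) -> datetime:
    """Parse an ISO date/time with a trailing Z into a UTC datetime."""
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00")).astimezone(timezone.utc)


def add_entry(archive: zipfile.ZipFile, path: str, timestamp: str, content: bytes) -> None:
    """Write content at path, with the entry's modification time set from timestamp."""
    t = parse_iso(timestamp)
    info = zipfile.ZipInfo(path, date_time=(t.year, t.month, t.day, t.hour, t.minute, t.second))
    archive.writestr(info, content)


buffer = io.BytesIO()
# allowZip64=True (the default) lets the archive grow past the 4 GiB limits
with zipfile.ZipFile(buffer, "w", allowZip64=True) as archive:
    stamp = "2023-06-14T20:47:11.941Z"
    document_id = "9e217967755f608bd8e90b194a88b64482dc74b9"
    add_entry(archive, f"issue/779/1/data/{document_id}/{stamp}/data.xml", stamp, b"<form/>")
```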

@ebruchez
Collaborator Author

ebruchez commented Aug 4, 2023

  • the autocomplete fields in the dialog do not allow entering a search text

@ebruchez
Collaborator Author

ebruchez commented Aug 4, 2023

  • forward-port changes to master branch
  • forward-port changes to custom 2022.1-pe branch @ebruchez @obruchez
    • 2023-08-28: Resolved not to do this for now; it will be in 2023.1 as well as the customer branch

@avernet avernet added this to To do in Orbeon Forms 2023.1 Aug 10, 2023
@avernet avernet moved this from To do to In progress in Orbeon Forms 2023.1 Aug 10, 2023
ebruchez added a commit that referenced this issue Aug 10, 2023
ebruchez added a commit that referenced this issue Aug 10, 2023
ebruchez added a commit that referenced this issue Aug 10, 2023
ebruchez added a commit that referenced this issue Aug 10, 2023
@ebruchez ebruchez changed the title Import/export forms with dependencies Export forms and data with dependencies Aug 11, 2023
@ebruchez
Collaborator Author

Created separate issue #5924 for import.

@ebruchez
Collaborator Author

ebruchez commented Aug 11, 2023

  • @ebruchez check that form-version=all works (didn't work today for me)

@ebruchez ebruchez changed the title Export forms and data with dependencies Export form definitions and data with attachments Aug 11, 2023
ebruchez added a commit that referenced this issue Aug 11, 2023
avernet added a commit that referenced this issue Aug 25, 2023
- This after we've `container.js` change done for #779
@ebruchez
Collaborator Author

ebruchez commented Oct 16, 2023

  • consider optimizing export/purge of historical data by running a single SQL query instead of issuing individual DELETE or GET (either implement, or enter new RFE)
    • possible because we always want to return or delete rows within a given date range

@ebruchez ebruchez changed the title Export form definitions and data with attachments Admin page: export form definitions and data with attachments Oct 17, 2023
@ebruchez
Collaborator Author

ebruchez commented Oct 19, 2023

Entered #6023

@ebruchez
Collaborator Author

ebruchez commented Nov 29, 2023

@ebruchez
Collaborator Author

ebruchez commented Dec 8, 2023

Permissions: I think we had decided that this was an admin function, which would export all data regardless of permissions.

  • check whether this is indeed the current behavior
  • document that aspect


3 participants