Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve asset and log CSV exports #783

Merged
merged 21 commits into from Feb 3, 2024
Merged

Conversation

mstenta
Copy link
Member

@mstenta mstenta commented Jan 26, 2024

Problem(s):

There are a number of issues with the current CSV exports that farmOS produces:

  1. CSV exports do not include all data, and that isn't made clear to users, who may think that they are getting everything when they aren't. We have an open issue to add a warning: https://www.drupal.org/project/farm/issues/3328886
  2. The "Export CSV" button doesn't appear everywhere (eg: it's not on pages like /asset/%/children), and we need to manually add it to every View we want it to appear on.
  3. Adding it to Views the way we have been is imperfect too because we have one CSV "Rest export" display that we want to be linked to from multiple other displays, and they may not have the same fields/filters (Views currently shows a warning about this whenever you're editing on of the farmOS Views that includes this display).
  4. Right now, the CSV "Rest export" displays have pager disabled so they can "include all items", which results in OOM errors on larger sites. It's simply not feasible to allow exporting ALL records in a single page request (see "Next steps" below for a way we can achieve this).
  5. The CSV exports are not compatible with the CSV importers, and we basically decided they never will be (see https://www.drupal.org/project/farm/issues/2900239). This will still be the case after this PR, but it brings us a step closer to supporting this.

These are all somewhat non-trivial problems, and while all of them have potential solutions/workarounds individually, none of them feel very "elegant".

Solution:

This PR addresses many of the issues above by refactoring how farmOS produces CSV exports. Not all problems are solved 100%, but some of them are, and all of them are improved significantly.

It provides a new "Export CSV" bulk action, similar to the existing "Export KML" action, and removes the "Export CSV" link from asset and log Views. So instead of clicking an "Export CSV" link at the bottom of the page, the user selects the records they want to export (or clicks "select all"), and then selects "Export CSV" from the actions dropdown.

The action uses Drupal core's Serialization API, with the default content entity normalizer and the CSV encoder from the CSV Serialization module. Using the default content entity normalizer (instead of only passing in the columns included in the Views) means that a lot more data can be included. It is possible to include ALL entity data, but the way this gets encoded by default is weird, so this PR has some logic to limit what is included and how it gets normalized. Regardless it is a huge improvement over the current exports, which only include the columns visible in the View. It also includes a confirmation step with a warning about the limitations of CSV-exported data (issue (1) described above).

Bulk actions automatically appear in ALL asset and log Views so the option will be available in a lot more places (issue (2) above). By removing the "Export CSV" link, Views no longer complains about column filter mismatches between displays (issue (3)). By requiring the user to select the entities that they want to export, and not trying to include all of them, we avoid potential OOM errors (issue (4)). See "Next steps" below for a potential follow-up that would allow exporting all items.

Regarding CSV export/import compatibility (issue (5) above), this is step in the right direction. It changes the CSV columns headers to be field IDs instead of field labels, which match what the CSV importers expect, and it includes more potential data. It may be worth reopening https://www.drupal.org/project/farm/issues/2900239 and documenting what all the differences are, so we can consider whether or not we want to continue the effort of bringing them together.

Notably, this also creates a new farm_export module, and moves the KML export actions to a farm_export_kml submodule, alongside the new farm_export_csv module that is based on the same logic. This mimics the existing farm_import module which has CSV and KML sub-modules. It also provides a place to expand and centralize our export-related logic in the future. See next steps below...

Changes:

There are a couple of notable changes between the CSV files that were produced by the old code and this new code:

  1. CSV headers are field IDs instead of field labels.
  2. More columns are included.
  3. Quantity columns are NOT included (see "Next steps" below).
  4. Timestamp fields are returned in RFC3339 format.
  5. Multi-value cells use a pipe (|) separator instead of a comma (,).

There may be others... those were the main ones I noticed/worked on personally. We get most of this output "for free" from the Drupal content entity normalizer, as opposed to specifying each one explicitly in Views, so we may find that there are bugs/issues that went unnoticed during my testing which we can address in follow-ups.

Next steps:

  1. Quantity and Data Stream Views

This PR only changes the asset and log CSV exports. It does not change the Quantity or Basic Data Stream exports, which are the only other two places where "CSV (Rest export)" Views displays exist. We should investigate implementing a similar solution for those two Views as a next step. I did not have time to dig into them for this PR, so I only focused on the asset and log Views.

The challenges with the Quantity View are:

a. It does not currently include any bulk action options, so there are some considerations to enable that as a first step.
b. The default content entity normalizers only include data from the entity itself, but the Quantity Views also join in Log entity data to add some columns. We will need to make a custom quantity entity normalizer to do enable that.

The challenges with the Basic Data Stream View are essentially the same. That View does not currently provide bulk actions, and the rows that it queries are not content entities, so we will need a custom normalizer for them.

  1. Allow user to select columns to include

Right now the CSVs include a default set of columns. It would be nice if we could offer the use the ability to select which columns they want to include/exclude in the UI. This may be possible to do in the new confirmation form step. I didn't have time to explore it in this round.

  1. Quantity entity reference normalizer

The log exports do not contain quantity data. This will require a custom normalizer. I didn't have time to do that in this round, but should be a pretty easy followup once we have some resources for it.

I think we can justify omitting this in this phase because quantity data was already limited to 1 quantity in the old exporters, and we still provide full quantity export capabilities in the Quantity Views.

  1. Inventory normalizer

Current asset inventory is not included in the asset exports. We need a custom normalizer for this.

  1. ID tags normalizer

Asset ID tags are not included in the asset exports. We need a custom normalizer for this.


If anyone would like to help with some of those normalizers (eg: quantity, inventory, id tags) as part of this first pass please feel free! Create a branch off of this one in your own fork and propose commit(s) in this PR's comments for consideration/inclusion!

mstenta added a commit to mstenta/farmOS that referenced this pull request Jan 26, 2024
@mstenta mstenta marked this pull request as ready for review January 26, 2024 14:33
mstenta added a commit to mstenta/farmOS that referenced this pull request Jan 26, 2024
@mstenta
Copy link
Member Author

mstenta commented Jan 26, 2024

Force pushed my branch to add some documentation updates to https://farmos.org/guide/export/

@mstenta
Copy link
Member Author

mstenta commented Jan 26, 2024

Regarding CSV export/import compatibility (issue (5) above), this is step in the right direction. It changes the CSV columns headers to be field IDs instead of field labels, which match what the CSV importers expect, and it includes more potential data. It may be worth reopening https://www.drupal.org/project/farm/issues/2900239 and documenting what all the differences are, so we can consider whether or not we want to continue the effort of bringing them together.

I reopened https://www.drupal.org/project/farm/issues/2900239 (postponed on this PR) and updated the issue summary. If anyone wants to help document differences, that would be much appreciated!

@mstenta mstenta added this to the v3.1.0 milestone Jan 30, 2024
mstenta added a commit to mstenta/farmOS that referenced this pull request Jan 31, 2024
@paul121 paul121 self-requested a review February 1, 2024 22:24
@mstenta
Copy link
Member Author

mstenta commented Feb 2, 2024

Thanks @paul121! Looks great!

mstenta added a commit to mstenta/farmOS that referenced this pull request Feb 2, 2024
mstenta added a commit to mstenta/farmOS that referenced this pull request Feb 2, 2024
@mstenta mstenta merged commit 3e891a8 into farmOS:3.x Feb 3, 2024
1 of 2 checks passed
@mstenta mstenta deleted the 3.x-csv-export branch February 3, 2024 00:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

None yet

2 participants