Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

export docs and related docs #7

Merged
merged 30 commits into from
Aug 24, 2023
Merged

export docs and related docs #7

merged 30 commits into from
Aug 24, 2023

Conversation

ETLaurent
Copy link
Contributor

@ETLaurent ETLaurent commented Aug 15, 2023

Summary

Big choucroute to batch-export documents, export a single document and page.

Export their actual data that is in the database (we don't use the manager that is populating fields and adding data).
We export the docs as they are in the DB.

We only use the docs manager to get the related docs.

Then we compress the docs in a aposDocs.json that's inside a ZIP or GZIP archive, and we return the URL to the client that's opened by the browser for download.

What are the specific steps to test this change?

single doc (you can also test with a page):

image image image

multiple docs:

image image image image image

What kind of change does this PR introduce?

(Check at least one)

  • Bug fix
  • New feature
  • Refactor
  • Documentation
  • Build-related changes
  • Other

Make sure the PR fulfills these requirements:

  • It includes a) the existing issue ID being resolved, b) a convincing reason for adding this feature, or c) a clear description of the bug it resolves
  • The changelog is updated
  • Related documentation has been updated
  • Related tests have been updated

If adding a new feature without an already open issue, it's best to open a feature request issue first and wait for approval before working on it.

Other information:

@linear
Copy link

linear bot commented Aug 15, 2023

PRO-4261 As an editor, I can download a ZIP file of my exported documents

When clicking on the download button of the export modal, after I selected the related docs that I want to include in the export, the process must be triggered.

image.png

Two new routes export must exist in the modules piece-type and page. Both routes will call a method export from the module doc-type .

This method should take the IDs of the documents the user want to export, as well as the related types that he wants to get. Only the types, not the IDs, since we decided that it would be too heavy for the export modal to get the amount of related docs when we batch a large amount of documents.

Generate files

This method will need to use the relate method from the module schema, this method fills documents with related ones.

Since we need to store documents in a flat EJSON array, it might be a good idea to add an option to this method in order to get a flat array of related documents instead of filling requested documents with their related docs, see how it could potentially be used with query builders.

We want one EJSON file for each db collection, for example aposDocs.json and aposAttachment.json, containing arrays of documents..

We want, for each document to export the draft and live versions (only draft if there is no live). The draft version must appear first to avoid situation where a live exists without draft which is forbidden.

Same thing for related documents, they must be written first in the EJSON file, otherwise the doc manager will yell when we'll import it.

We will create an attachment folder when actual attachments will be stored (images, files…).

Generate Zip

To generate the ZIP file we would prefer using the native node library.

Progress bar notification

This progress bar notification is triggered by wrapping your export function with the run method of the job module. Here is an example from the piece-type-importer module:

return self.apos.modules['apostrophecms/job'].run(
  req,
  (req, reporting, { notificationId }) => self.importRun(req, {
    progressNotifId: notificationId,
    filePath: file.path,
    reporting,
    totalPieces
  }),
  {}
);

The reporting object passed from the second argument function allows to set the total of the documents you want to import with the method setTotal, as well as to call the methods success or failure for every document that we tried to export or import.

Finally the ZIP file will be sent back to the client and the download triggered from the user's browser.

Permissions

For exporting documents, a user that has at least one per-type permission should be able to export anything. He won’t be able to see adminOnly documents, therefore should not be able to export these documents. It matches the way it works currently because in this case the user can see all drafts. In core, admin should be able to export adminOnly content, except for users.

For Reference

https://github.com/apostrophecms/tech-designs/blob/main/3.x/share-documents-across-sites/design.md

Acceptance Criteria

  • I'm an editor or admin on with the Download modal open, and my documents selected, and I hit "Download" and when I do, a ZIP file begins downloading.
  • The zip file downloads and I open it and I see the the draft and live versions (only draft if there is no live) in my unarchived folder.

@ETLaurent ETLaurent changed the title call export route export docs and attachments in zip file Aug 17, 2023
}
if (self.options.import !== false) {
self.apos.asset.iconMap['apos-import-export-upload-icon'] = 'Upload';
}
Copy link
Contributor Author

@ETLaurent ETLaurent Aug 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not related to this PR, but improves this code

lib/methods.js Outdated Show resolved Hide resolved
@ETLaurent ETLaurent changed the title export docs and attachments in zip file export docs and related docs Aug 17, 2023
const data = {
'aposDocs.json': docsData,
'aposAttachments.json': attachmentsData
// attachments: 'attachments/' // TODO: add attachment into an "/attachments" folder
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

next step


const data = {
'aposDocs.json': docsData,
'aposAttachments.json': attachmentsData
Copy link
Contributor Author

@ETLaurent ETLaurent Aug 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

aposAttachment.json has mocked data, for now (see TODOs)

Comment on lines 74 to 80
setTimeout(() => {
self.apos.attachment.uploadfs.remove(downloadPath, error => {
if (error) {
self.apos.util.error(error);
}
});
}, expiration || 1000 * 60 * 60);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

setTimeout is bad... if the process gets killed, that won't be clearing the downloads

Copy link
Contributor

@ValJed ValJed Aug 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we should find a better way to clean files.. is this cleaning files every hours?
Couldn't we clean the file when the client downloaded it by performing a new request with the file ID? 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Every hour yes.

The client is requesting the file directly, from the public folder, and I don't think apostrophe handles that.
So I fear we cannot know if the file has been downloaded.

I may be wrong though...
It might be possible, I'm quickly looking for a possible solution, but I don't think we can.

Besides, we might want the archive to reside in public to be available for some time in the case the first download fails, IDK...

And if we're implementing a solution that's piping the whole thing (without writing the archvive on the server but directly streaming it to the client), we won't have this problem

Comment on lines 82 to 84
return {
url: downloadUrl
};
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return url, we're not forced to do this because the url is given and opened by the export-download event, but let's return something in case of...

Comment on lines 3 to 4
// TODO: remove:
const attachmentsMock = [ { foo: 'bar' } ];
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this TODO

const relatedTypes = self.apos.launder.strings(req.body.relatedTypes);
const extension = self.apos.launder.string(req.body.extension, 'zip');

// TODO: add batchSize?
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

??

if (!relatedTypes.length) {
const docs = await self.fetchActualDocs(req, allIds, reporting);

// TODO: change undefined to attachments
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and this TODO


const docs = await self.fetchActualDocs(req, [ ...allIds, ...allRelatedIds ], reporting);

// TODO: change undefined to attachments
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and that TODO

.map(relatedDoc => relatedDoc._id);
},

// TODO: factorize with the one from AposI18nLocalize.vue
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ValJed this TODO can be removed when we'll merge our work

},

// TODO: factorize with the one from AposI18nLocalize.vue
// TODO: limit recursion to 10 as we do when retrieving related types?
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

??

return related.filter(doc => self.apos.modules[doc.type].relatedDocument !== false);
},

// TODO: use actual attachments when calling this method:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and that TODO

Comment on lines +74 to +75
// FIXME: the progress notification is not always dismissed.
// Probably a fix that needs to be done in job core module.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah...

Comment on lines +5 to +10
window.apos.bus && window.apos.bus.$on('export-download', event => {
if (!event.url) {
return;
}
window.open(event.url, '_blank');
});
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the event I was referring to above

Comment on lines +31 to +32
// TODO: do not include types that have the `relatedDocument = false`
// option in their module configuration
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should do that, let's keep the TODO

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be done in my PR

Comment on lines 87 to 89
// FIXME: seems like reporting is not working very well,
// when calling failure on purpose, we have the progress bar
// at 200%... 🤷‍♂️
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to verify when making final tests

self.exportFormats = {
zip,
gzip,
...(self.options.exportFormats || {})
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

projects can add custom formats

@@ -36,7 +36,10 @@ module.exports = {
export: {
label: 'aposImportExport:export',
messages: {
progress: 'aposImportExport:exporting'
progress: 'aposImportExport:exporting',
completed: 'aposImportExport:exported',
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the "completed" message makes the job handle the display of the final notification (eg. "Exported 2 Articles"

progress: 'aposImportExport:exporting'
progress: 'aposImportExport:exporting',
completed: 'aposImportExport:exported',
icon: 'database-export-icon',
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

already by default in the core, which is weird and too close to the piece-type-exporter module.

still set it here because it's this module responsibility to set that database export icon

progress: 'aposImportExport:exporting',
completed: 'aposImportExport:exported',
icon: 'database-export-icon',
resultsEventName: 'export-download'
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that is necessary in order for the job to include the results into a 'export-download' which will trigger the url opening in the frontend (see ui/apos/apps/index.js).

the core has been changed so that resultsEventName is taken into account (see apostrophecms/apostrophe#4270)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will need to document properly all these changes before to release.

<AposSelect
:choices="extensions"
:selected="extension"
@change="onExtensionChange"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, we said only ZIP...
but since gzip was added for the sake of testing another library, let's keep these 2 formats

Comment on lines +205 to +213
const result = await window.apos.http.post(`${action}/${this.action}`, {
busy: true,
body: {
_ids: docsId,
relatedTypes,
messages: this.messages,
extension: this.extension
}
});
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this actual API call to export docs

@ETLaurent ETLaurent requested a review from ValJed August 23, 2023 16:45
@ETLaurent ETLaurent marked this pull request as ready for review August 23, 2023 16:45
archive.append(content, { name: filename });
}

archive.finalize();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this triggers the event finish?

@ETLaurent ETLaurent merged commit 552a832 into main Aug 24, 2023
9 checks passed
@ETLaurent ETLaurent deleted the pro-4261-export-docs branch August 24, 2023 09:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants