
Import from wallabag asynchronously #1611

Closed
nicosomb opened this issue Jan 21, 2016 · 7 comments

@nicosomb
Member

As we can see in #1598, importing massive files is not possible (we set the limit to 20M on v2.wallabag.org, but that's not a good solution).
We need to implement RabbitMQ as in #1581.
We need to refactor the JSON export in wallabag v1 so that it only exports article URLs, not content.
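
For illustration, the asynchronous part amounts to publishing one small message per article during the upload request and letting a background worker fetch and store the content. Below is a minimal sketch of the producer side using php-amqplib; the queue name, message shape, and the `$urlsFromUpload` / `$userId` variables are invented for the example and are not wallabag's actual implementation.

```php
<?php
// Producer side of an asynchronous import (sketch only).
require __DIR__ . '/vendor/autoload.php';

use PhpAmqpLib\Connection\AMQPStreamConnection;
use PhpAmqpLib\Message\AMQPMessage;

$connection = new AMQPStreamConnection('localhost', 5672, 'guest', 'guest');
$channel = $connection->channel();

// Durable queue so pending import jobs survive a broker restart.
$channel->queue_declare('wallabag.import.v1', false, true, false, false);

// One message per article URL: the web request stays fast, and a worker
// consuming this queue fetches and stores the content later.
foreach ($urlsFromUpload as $url) {
    $message = new AMQPMessage(
        json_encode(['userId' => $userId, 'url' => $url]),
        ['delivery_mode' => AMQPMessage::DELIVERY_MODE_PERSISTENT]
    );
    $channel->basic_publish($message, '', 'wallabag.import.v1');
}

$channel->close();
$connection->close();
```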

@nicosomb nicosomb added this to the 2.0.0 milestone Jan 21, 2016
@jcharaoui
Contributor

I'd like to suggest that Wallabag keep import/export of content, for a simple reason. With really large collections, some articles could be years old, and the URLs to access them may no longer work down the road. The website could be completely offline, or the CMS may have changed and the old permalinks not migrated to the new platform.

Also, to solve the problem of loading big export files into memory, another solution would be to use a streaming JSON parser such as https://github.com/salsify/jsonstreamingparser. This would allow loading only one entry into memory at a time for processing, making the process much faster and less resource-intensive.
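
As a rough sketch of that idea (namespace and listener method names follow the library's README and may differ between versions; `importEntry()` stands in for the real import logic):

```php
<?php
// Streaming import sketch with salsify/jsonstreamingparser: the listener is
// fed parse events, so only the entry currently being built is kept in memory.
require __DIR__ . '/vendor/autoload.php';

use JsonStreamingParser\Listener\ListenerInterface;
use JsonStreamingParser\Parser;

class EntryListener implements ListenerInterface
{
    private int $depth = 0;
    private array $entry = [];
    private ?string $currentKey = null;

    public function startDocument(): void {}
    public function endDocument(): void {}
    public function whitespace(string $whitespace): void {}
    public function startArray(): void { ++$this->depth; }
    public function endArray(): void { --$this->depth; }

    public function startObject(): void
    {
        ++$this->depth;
        if (2 === $this->depth) {   // top-level array of entries => each entry object sits at depth 2
            $this->entry = [];
        }
    }

    public function endObject(): void
    {
        if (2 === $this->depth) {
            $this->importEntry($this->entry);   // persist one entry, then forget it
        }
        --$this->depth;
    }

    public function key(string $key): void { $this->currentKey = $key; }

    public function value($value): void
    {
        // Only scalar fields at the entry level are captured; nested structures
        // (e.g. tags) are skipped here for brevity.
        if (2 === $this->depth && null !== $this->currentKey) {
            $this->entry[$this->currentKey] = $value;
        }
    }

    private function importEntry(array $entry): void
    {
        // Placeholder: insert into the database or publish to the import queue.
    }
}

$stream = fopen('wallabag-v1-export.json', 'rb');
$parser = new Parser($stream, new EntryListener());
$parser->parse();
fclose($stream);
```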

@tcitworld
Member

> I'd like to suggest that Wallabag keep import/export of content, for a simple reason.

We can't retrieve content from Pocket; however, the solution you suggest may help us keep content from JSON files.

> We need to refactor the JSON export in wallabag v1 so that it only exports article URLs, not content.

We should make this a choice, then. If the export with content fails, then export just URLs and metadata.

@jcharaoui
Contributor

> We should make this a choice, then. If the export with content fails, then export just URLs and metadata.

That's one option, but it would also be feasible for the export process to keep track of which entries have been written out to disk and resume in case of a PHP timeout error. Regardless, unless I'm mistaken, the export process (database to file) is by nature much faster than the import (file to database), because the import has to run multiple database queries per entry. So making the v1 -> v2 JSON import more efficient seems to me like a bigger priority than refactoring the v1 export code.
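
To make the resume idea concrete, here is a toy sketch of a checkpointed export. The table/column names and the checkpoint file are invented, and it writes one JSON object per line precisely so that a re-run after a timeout can simply append where the previous run stopped.

```php
<?php
// Resumable export sketch: remember the id of the last entry written,
// so a re-run continues from there instead of starting over.
$pdo = new PDO('sqlite:wallabag-v1.sqlite');

$checkpointFile = 'export.checkpoint';
$lastId = is_file($checkpointFile) ? (int) file_get_contents($checkpointFile) : 0;

$out = fopen('export.jsonl', 'ab');   // one JSON object per line, append mode

$stmt = $pdo->prepare('SELECT id, url, title, content FROM entries WHERE id > :lastId ORDER BY id');
$stmt->execute(['lastId' => $lastId]);

while ($row = $stmt->fetch(PDO::FETCH_ASSOC)) {
    fwrite($out, json_encode($row) . "\n");
    // Checkpoint after each entry so a timeout loses at most one row of work.
    file_put_contents($checkpointFile, (string) $row['id']);
}

fclose($out);
```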

@tcitworld
Member

> So making the v1 -> v2 JSON import more efficient seems to me like a bigger priority than refactoring the v1 export code.

A number of users have also encountered timeout/memory issues while exporting JSON from v1, so it remains a concern too.

@jcharaoui
Contributor

> > So making the v1 -> v2 JSON import more efficient seems to me like a bigger priority than refactoring the v1 export code.
>
> A number of users have also encountered timeout/memory issues while exporting JSON from v1, so it remains a concern too.

Agreed!

@nicosomb nicosomb modified the milestones: 2.0.0, 2.1.0 Mar 29, 2016
@nicosomb nicosomb changed the title [v2] Import from wallabag asynchronously Import from wallabag asynchronously Aug 28, 2016
@j0k3r j0k3r closed this as completed Sep 19, 2016
@HLFH

HLFH commented Sep 20, 2016

Hi @j0k3r @nicosomb. Is asynchronous export easier to implement now that you have implemented the asynchronous import feature? But I guess it won't be part of the 2.1.0 milestone.

@j0k3r
Member

j0k3r commented Sep 20, 2016

Easier, maybe.
But we still need to figure out how to ping the user once the export is done.
