Migration from third services - Framabag #426

Closed
nicosomb opened this issue Feb 3, 2014 · 14 comments

@nicosomb
Member

nicosomb commented Feb 3, 2014

Currently, Framabag can't import data from other services.

We have to change the import system so that a file is uploaded and its items are imported 10 at a time.

@nicosomb
Member Author

nicosomb commented Feb 3, 2014

See #365.

@nicosomb nicosomb modified the milestones: 1.6.0, 1.5.0 Feb 12, 2014
@nicosomb
Member Author

@mariroz: are you interested in this issue?

@mariroz
Contributor

mariroz commented Feb 28, 2014

Yes, this is, of course, an important task, and I can work on it, but only if it's not urgent.
I will be totally unable to work on wallabag during the next 5-6 days, as I am going skiing. I hope I will still be able to read comments, however.
Also, here is a small list of tasks I plan to do, if no one does them in the meantime:

Both are simple and can be done quickly.
Then, if you think it would be useful, I would like to do some refactoring related to check_setup.php: it may be better to avoid running it each time the script starts. I think this can be done more efficiently with a try/catch construction.
Then the current issue #426, and then, maybe, I will try to work on the handling of txt, pdf, video, etc. (#508, #444,...), as this is an interesting task.
But all this comes after 5-6 days of silence.

@nicosomb
Member Author

No problem, thank you so much for your help, and enjoy your holidays ;-)

@nicosomb
Member Author

I started this feature on the upload-file branch (https://github.com/wallabag/wallabag/tree/upload-file).

  • an upload form to upload a file into the cache folder
  • when we click on "Import from Pocket / etc.", new entries are added to the database without title or content, to avoid hitting the max execution time with big files.
  • a cron system (or a link) to fetch content for X entries.
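
As a rough illustration of the two steps listed above (insert every link immediately with no content, then fetch content for a limited number of entries per run), here is a minimal sketch. It is written in Python with SQLite purely for illustration; wallabag itself is PHP. The `entries` table layout, the `fetched` flag, and the function names are assumptions, and `fetch_page` is a hypothetical stand-in for whatever retrieves an article (getPageContent() in the thread).

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical `entries` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE entries ("
        "id INTEGER PRIMARY KEY, url TEXT NOT NULL, "
        "title TEXT, content TEXT, fetched INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def import_links(conn, urls):
    """Step 1: store every URL right away, without title or content."""
    conn.executemany("INSERT INTO entries (url) VALUES (?)", [(u,) for u in urls])
    conn.commit()


def fetch_pending(conn, fetch_page, limit=10):
    """Step 2: fill in at most `limit` entries that have not been fetched yet.

    `fetch_page(url)` must return a (title, content) tuple.
    Returns the number of entries processed in this run.
    """
    rows = conn.execute(
        "SELECT id, url FROM entries WHERE fetched = 0 ORDER BY id LIMIT ?",
        (limit,),
    ).fetchall()
    for entry_id, url in rows:
        title, content = fetch_page(url)
        conn.execute(
            "UPDATE entries SET title = ?, content = ?, fetched = 1 WHERE id = ?",
            (title, content, entry_id),
        )
    conn.commit()
    return len(rows)
```

Each call to `fetch_pending` corresponds to one cron run or one click on the link, so a single request only ever handles `limit` articles.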

Everything seems to work, but the cron job crashes because of getPageContent() (I moved it into Tools.class.php so that it is accessible from cron.php): the content of the first fetched article is displayed. I think it's due to ob_start.

@mariroz: do you have an idea?

@nicosomb nicosomb reopened this Feb 28, 2014
@mariroz
Contributor

mariroz commented Mar 1, 2014

I will try to investigate this evening.

mariroz added a commit to mariroz/wallabag that referenced this issue Mar 7, 2014
nicosomb added a commit that referenced this issue Mar 7, 2014
getPageContent moved to Tools, fix of #426
@nicosomb
Member Author

nicosomb commented Mar 7, 2014

@mariroz I think we have to delete the Pocket/Readability/Instapaper export file once the import is over. What do you think?

@mariroz
Contributor

mariroz commented Mar 7, 2014

@nicosomb, of course, all parsed import files should be deleted.
Also, maybe there is no point in storing the file at all: every import file can be parsed without moving it to the cache folder. Please also note that every outside file inside the application web root is a potential security problem.
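
To make the suggestion concrete, here is a minimal sketch of parsing an uploaded export from a temporary location outside the web root and removing it as soon as it has been read. It is written in Python for illustration only (wallabag is PHP). It assumes the export is an HTML file whose links are plain `<a href="...">` tags; the function and class names are hypothetical.

```python
import os
import tempfile
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.urls.append(href)


def save_upload(data: bytes) -> str:
    """Write uploaded bytes to the system temp directory, not the web root."""
    fd, path = tempfile.mkstemp(suffix=".html")
    with os.fdopen(fd, "wb") as fh:
        fh.write(data)
    return path


def parse_and_delete(path: str) -> list:
    """Extract all link URLs, then delete the file even if parsing fails."""
    parser = LinkCollector()
    try:
        with open(path, encoding="utf-8", errors="replace") as fh:
            parser.feed(fh.read())
        parser.close()
    finally:
        os.remove(path)
    return parser.urls
```

Because the URLs are returned in memory, they can be inserted into the database straight away and nothing from the upload stays on disk.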

@nicosomb
Member Author

nicosomb commented Mar 7, 2014

I store this file in the cache to separate the import and fetch-content actions.
This way, we can fetch content for 10 articles, then 10 more, etc.

When you have 1000 links to import, you avoid a timeout.

@mariroz
Contributor

mariroz commented Mar 7, 2014

Yes, you fetch content for 10 articles at once.
BUT, as far as I can see in the code, you insert all entries into the DB at once with empty content. So after this step you don't need the file any more. Please correct me if I'm wrong.

@nicosomb
Member Author

nicosomb commented Mar 7, 2014

You're right. We can delete this file.

@mariroz
Contributor

mariroz commented Mar 7, 2014

Also, sorry to be so critical, but I don't like the idea of using cron: it makes the application considerably more complicated for ordinary users. Please don't forget that PHP CLI is often not available on common hosting plans (the same goes for cron: it may be unavailable, and it is complicated for an ordinary user to configure, use, and debug when problems occur).
I think it would be much better to run the import as a series of reloads (or ajax requests) of the same import page, with a progress message. For example, if the user presses "import", he is redirected to the import page, where he sees something like "importing articles 1-10", then is redirected after 5 seconds to the same page, now with the message "importing articles 10-20", etc. Alternatively, don't redirect, but use an ajax request to do the same. I hope you understand what I mean.
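
The reload-based flow described here could be sketched roughly as follows. This is an illustrative Python sketch, not wallabag code: `import_step` stands for what a single page load (or ajax call) would compute, and `run_import` simulates the browser following each redirect. All names, and the choice to carry an `offset` from one request to the next, are assumptions.

```python
def import_step(total, offset=0, batch_size=10):
    """Describe one page load: which entries to handle and where to go next."""
    start = offset
    end = min(offset + batch_size, total)
    if start >= total:
        message = "import finished"
    else:
        message = "importing articles %d-%d" % (start + 1, end)
    return {
        "message": message,
        "range": (start, end),
        # None means there is nothing left, so no further redirect is needed.
        "next_offset": None if end >= total else end,
    }


def run_import(total, process, batch_size=10):
    """Simulate the series of reloads; each loop pass is one request."""
    messages = []
    offset = 0
    while offset is not None:
        step = import_step(total, offset, batch_size)
        start, end = step["range"]
        for index in range(start, end):
            process(index)  # e.g. fetch the content of one article
        messages.append(step["message"])
        offset = step["next_offset"]
    return messages
```

With 25 entries and the default batch size, the simulated user would see three progress messages, and no single request would process more than 10 articles.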

@nicosomb
Member Author

The import file is now deleted when the import is over (see 8d7cd2c).

it makes the application considerably more complicated for ordinary users. Please don't forget that PHP CLI is often not available on common hosting plans

I know, and that's why we can click on a link to fetch content for 10 articles.
We can improve this system with ajax in the next version.

What do you think about that, @mariroz?

@nicosomb
Member Author

nicosomb commented Apr 6, 2014

This feature was added in 1.6. I'm closing this issue.

@nicosomb nicosomb closed this as completed Apr 6, 2014