ToDo list #78

Closed
21 of 30 tasks
jarun opened this issue Oct 22, 2016 · 35 comments

Comments

@jarun
Owner

jarun commented Oct 22, 2016

Continued from #39.

Notes

The list below is a growing one. While suggesting new features please consider contributing to Buku. The code is intentionally kept simple and easy to understand with comments. We'll be happy to help out any new contributor.

Some of the just-completed features may not have been released yet. Grab the master branch for those.

Identified tasks

  • Need a PyPI maintainer
  • Make refreshdb faster using threads (record updates should be synchronized).
  • API documentation
  • Add more tests
  • Android app (using the same database)
  • REST API for webapps
  • Show usage count in tag list
  • Proxy support
  • Continuous search at prompt
  • Add prompt help
  • Specify custom DB file to class BukuDb (library usage, no exposed option)
  • Move to urllib3
  • Handle redirects using referrer masking. Example URL. Fixed with urllib3.
  • Support URL shortening. This helps to share URLs. (see Support adding shortened URLs to the database #92 for limitations)
  • Make a bookmark title immutable via refreshdb()
  • Markdown import/export
  • Regex search
  • Ubuntu PPA
  • Export specific tags to HTML
  • Exact word match using REGEX. Make substring match optional.
  • Delete all records based on a search result
  • Delete multiple items, support combination of indices and ranges
  • Append tags
  • Travis CI integration
  • Ubuntu deb package generation on new tag
  • Merge bookmark database files (for users who work on multiple systems)
  • Export bookmarks in Firefox or Chrome HTML format.
  • Option to add folder names as tags while importing HTML (see Import folders as tags while importing bookmarks HTML. #80)
  • Implement self-upgrade (see Support self-upgrade #83)
  • Anything else which would add value (please discuss in this thread)
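
The threaded refreshdb item in the list above could be approached roughly as follows. This is a minimal sketch, not Buku's implementation: the `bookmarks` table layout, `fetch_title` and `refresh_all` are hypothetical names, and the network fetch is replaced by a stub. Worker threads do only the slow fetching, and every write goes through a single lock so record updates stay synchronized.

```python
import sqlite3
import threading
from concurrent.futures import ThreadPoolExecutor


def fetch_title(url):
    """Stand-in for a network fetch; a real version would download the page."""
    return "Title of " + url


def refresh_all(conn, fetch=fetch_title, workers=4):
    """Refresh every title: fetch in parallel, write under one lock."""
    lock = threading.Lock()
    rows = conn.execute("SELECT id, url FROM bookmarks").fetchall()

    def refresh_one(row):
        index, url = row
        title = fetch(url)  # slow part, runs concurrently
        with lock:  # serialized part, one writer at a time
            conn.execute("UPDATE bookmarks SET title = ? WHERE id = ?", (title, index))
        return index

    with ThreadPoolExecutor(max_workers=workers) as pool:
        updated = sorted(pool.map(refresh_one, rows))
    conn.commit()
    return updated


# check_same_thread=False lets worker threads share the connection;
# the lock above is what keeps the writes from interleaving.
conn = sqlite3.connect(":memory:", check_same_thread=False)
conn.execute("CREATE TABLE bookmarks (id INTEGER PRIMARY KEY, url TEXT, title TEXT)")
conn.executemany(
    "INSERT INTO bookmarks (url, title) VALUES (?, '')",
    [("http://example.com/a",), ("http://example.com/b",)],
)
updated = refresh_all(conn)
titles = [row[0] for row in conn.execute("SELECT title FROM bookmarks ORDER BY id")]
```

Keeping the writes behind one lock trades some throughput for simplicity; the fetches dominate the run time anyway, so most of the speedup should survive.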
This was referenced Oct 22, 2016
Closed
@sporgj

sporgj commented Oct 26, 2016

How about a "backup" mode. Users could provide backends such as git, S3 where their bookmark database gets pushed to

@jarun
Owner Author

jarun commented Oct 26, 2016

Buku doesn't intend to be a cloud solution by itself. If a solution uses Buku as a library they are free to do so.

For personal usage, one could, however, store the database file in a mapped drive which is synced and create a softlink to it in the regular Buku db path. I store the actual file in my Dropbox directory, for example.

I think I should document this though. Thanks for reminding!
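
The softlink setup described above might look like this when scripted. A minimal sketch under stated assumptions: `link_db` is an illustrative helper, and the default location `~/.local/share/buku/bookmarks.db` mentioned in the comments is an assumption about the local install that should be checked before use. The demo runs against a temporary directory rather than real files.

```python
import tempfile
from pathlib import Path


def link_db(synced_db: Path, buku_db: Path) -> Path:
    """Point buku_db at the copy kept in the synced folder via a symlink."""
    if buku_db.is_symlink():
        buku_db.unlink()  # replace a stale link
    elif buku_db.exists():
        raise FileExistsError(f"{buku_db} is a real file; move it into the synced folder first")
    buku_db.parent.mkdir(parents=True, exist_ok=True)
    buku_db.symlink_to(synced_db)
    return buku_db


# Demo in a throwaway directory; real paths would be something like
# ~/Dropbox/bookmarks.db and ~/.local/share/buku/bookmarks.db (both assumed).
root = Path(tempfile.mkdtemp())
synced = root / "Dropbox" / "bookmarks.db"
synced.parent.mkdir(parents=True)
synced.write_bytes(b"")
link = link_db(synced, root / "share" / "buku" / "bookmarks.db")
```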

@polo2ro

polo2ro commented Nov 6, 2016

Hi,
Thank you for this useful tool!
I use it a lot and, as with other solutions, found it difficult to detect invalid URLs. Since this is a command-line tool, I can do that myself: https://github.com/polo2ro/buku-scripts
This could be included in buku to be friendlier to new users and more cross-platform, or kept external and perhaps mentioned in the documentation.

@jarun
Owner Author

jarun commented Nov 6, 2016

@polo2ro buku -u for full DB refresh also shows the error info on failure. Recently I have added a new param for verbosity to some APIs. I think I can make changes to show the info only when there's a failure.

@polo2ro

polo2ro commented Nov 6, 2016

Yes, I think that would do the trick. When using -u on the whole database I ran into other problems: many of my bookmarks were on the same site, which probably has a policy limiting the number of page requests per second, so I used the sleep command. On my first try I almost deleted all the links from that site because of this.

@jarun
Owner Author

jarun commented Nov 6, 2016

I get it now.

I noticed that you check only for status 200. Does curl handle HTTP redirections transparently?

There are also some status codes which indicate temporary failures. I think it would be great to make the script more verbose with the status code if it is not 200 and a verbose description. Note that buku -u already does that.

Instead of an additional shell script, I think it would be great if you can add an API in Buku to do this.

Of course it will need more checks. For example, it would need a better check for malformed URLs (refer to jarun/googler@b53b638 for a hint). Then the additional delay, verbose status codes and so on... I can add it as a task item if you wanna pick it up.
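
The checks listed above (malformed URLs, verbose status codes, an added delay) could be sketched along these lines. This is an illustration under assumptions rather than a Buku API: `is_wellformed`, `describe_status` and `check_all` are hypothetical names, the set of "temporary" codes is one reasonable choice, and the HTTP request itself is passed in as a function so the sketch stays independent of any particular client.

```python
import time
from http import HTTPStatus
from urllib.parse import urlsplit

# Codes commonly treated as temporary failures (an assumption, tune as needed).
TEMPORARY = {408, 429, 500, 502, 503, 504}


def is_wellformed(url):
    """Basic sanity check: an http(s) scheme and a non-empty host."""
    parts = urlsplit(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def describe_status(code):
    """Return e.g. '[404] Not Found', flagging temporary failures."""
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        phrase = "Unknown"
    note = " (temporary, retry later)" if code in TEMPORARY else ""
    return f"[{code}] {phrase}{note}"


def check_all(urls, get_status, delay=0.0):
    """Check each URL, pausing between requests to avoid rate limits."""
    report = {}
    for url in urls:
        if not is_wellformed(url):
            report[url] = "malformed URL"
            continue
        report[url] = describe_status(get_status(url))
        time.sleep(delay)
    return report


# Stubbed status lookup so the example runs without network access.
fake = {"http://example.com/ok": 200, "http://example.com/gone": 404,
        "http://example.com/busy": 503}
report = check_all(list(fake) + ["example.com/no-scheme"], fake.get, delay=0.0)
```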

@jarun
Owner Author

jarun commented Nov 7, 2016

@polo2ro We'll soon move over to urllib3 which has retry in-built. I believe the above issue will be over with it.

@polo2ro

polo2ro commented Nov 7, 2016

Here are some example URLs from my use case:

http://www.seloger.com/annonces/achat/maison/estang-32/107048765.htm
http://www.seloger.com/annonces/achat/maison/saint-fort-sur-gironde-17/112444387.htm

The first URL gives me a 200 OK, all good.
The second gives me a 301 Moved Permanently.

The problem is that the site gives me a 301 instead of a 404 for a page that no longer exists, which is why I check for 200 OK only.

I am all for a pull request for that, but for now I am not sure what to do. Maybe simply printing the HTTP code would be sufficient to detect 404 or other unwanted codes.

For example, the ability to filter by HTTP code would make it possible to remove 404 links like this:
buku --http-code 404 -d

or maybe using -u to set a tag:
buku -u --http-code 404 --tag expired
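
The filtering idea above can be illustrated outside the CLI. Note that `--http-code` is only a proposal in this comment, not an existing buku option, and the record layout and `tag_by_status` helper below are hypothetical. The sketch treats a permanent redirect as suspect too, since this site answers 301 for pages that no longer exist.

```python
def tag_by_status(records, statuses, codes, tag="expired"):
    """Append `tag` to every record whose last-seen status is in `codes`.

    records:  list of dicts with 'index', 'uri' and comma-separated 'tags'
    statuses: mapping of uri -> HTTP status code from an earlier check
    """
    tagged = []
    for rec in records:
        if statuses.get(rec["uri"]) in codes:
            tags = [t for t in rec["tags"].split(",") if t]
            if tag not in tags:
                tags.append(tag)
            rec["tags"] = ",".join(tags)
            tagged.append(rec["index"])
    return tagged


records = [
    {"index": 1, "uri": "http://example.com/alive", "tags": "house"},
    {"index": 2, "uri": "http://example.com/moved", "tags": "house"},
    {"index": 3, "uri": "http://example.com/missing", "tags": ""},
]
statuses = {
    "http://example.com/alive": 200,
    "http://example.com/moved": 301,
    "http://example.com/missing": 404,
}
tagged = tag_by_status(records, statuses, codes={301, 404})
```

Tagging rather than deleting keeps the operation reversible, which matters when a site's status codes cannot be trusted.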

@jarun
Owner Author

jarun commented Nov 7, 2016

I'll test the scenario with urllib3 and update.

@jarun
Owner Author

jarun commented Nov 8, 2016

I think this works fine with urllib3. Results with latest master (at the time of writing):

$ buku -p
1. http://www.seloger.com
   > Petites annonces immobilières | 1er site immobilier français | Portail immo

2. http://www.seloger.com/
   > Petites annonces immobilières | 1er site immobilier français | Portail immo

3. http://www.seloger.com/annonces/achat/maison/saint-fort-sur-gironde-17/112444387.htm
   > Vente maison   Charente-Maritime (17) | Achat maisons   en  Charente-Maritime

4. http://www.seloger.com/annonces/achat/maison/estang-32/107048765.htm
   > Vente maison 7 pièces Estang  - maison Maison ancienne F7/T7/7 pièces 211m² 105000€

$ buku -u
Title: [Petites annonces immobilières | 1er site immobilier français | Portail immo]
Index 1 updated

Title: [Petites annonces immobilières | 1er site immobilier français | Portail immo]
Index 2 updated

Title: [Vente maison   Charente-Maritime (17) | Achat maisons   en  Charente-Maritime]
Index 3 updated

Title: [Vente maison 7 pièces Estang  - maison Maison ancienne F7/T7/7 pièces 211m² 105000€]
Index 4 updated

@jarun
Owner Author

jarun commented Nov 8, 2016

In addition, now you can view only failed and skipped (due to mime) using:

$ buku -u --tacit

@polo2ro

polo2ro commented Nov 8, 2016

This is great! I will check this out with an update on my database and get back to you. Thank you.

@polo2ro

polo2ro commented Nov 8, 2016

The retry functionality is not working in my case because seloger.com gives me a 200 OK with an error page. I don't know if this is common behavior. I am lucky that the error page does not have a title.

I uploaded my database: https://1fichier.com/?9urj0c0p14

I get the error message after 45 updates. I think nothing can be done about it because HTTP codes are not used properly here.

I am pretty sure this will work with apache mod_ratelimit
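
Since the error page comes back as 200 OK but has no title, one possible heuristic is to flag responses whose title is empty. The sketch below uses Python's standard `html.parser`; `TitleParser` and `looks_like_soft_error` are hypothetical names, and the heuristic is only as reliable as the site's error page (an error page with a title would slip through).

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text inside the first <title> element."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.done = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self.done:
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self.in_title:
            self.in_title = False
            self.done = True

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def looks_like_soft_error(status, html):
    """True when a 200 response carries no usable title."""
    parser = TitleParser()
    parser.feed(html)
    return status == 200 and not parser.title.strip()


good = "<html><head><title>Vente maison</title></head><body></body></html>"
bad = "<html><head></head><body>Too many requests</body></html>"
```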

@jarun
Owner Author

jarun commented Nov 8, 2016

"The retry functionality is not working in my case because seloger.com give me a 200OK with a error page"

This doesn't seem standard behavior. Here's an example of correct behaviour:

$ ./buku.py -a http://tuxdiary.com/kdfgdfg
[ERROR] [404] Not Found
Title: []

12. http://tuxdiary.com/kdfgdfg

Don't think we can do much about it.

@drzraf

drzraf commented Nov 19, 2016

Suggestion:
Integrate Buku with most modern browsers (it would gain traction).
This means writing extensions for Firefox, Midori, ... that use Buku as a backend (storage/search/tags/...).
That would greatly help people switching from one browser to another.

@jarun
Owner Author

jarun commented Nov 19, 2016

Thank you for the suggestion. Yes, Buku is available as a library and I know terminal based projects which are interested in using it. Personally, I do not have the time to spend in writing browser extensions though. There are several projects I maintain already.

@The-Wayvy
Contributor

Hi Jarun
I am new to open-source, and found your project on up-for-grabs.
I want to make refreshdb faster using threads.

@jarun
Owner Author

jarun commented Nov 24, 2016

Hi @DamianSiniakowicz thank you for your interest in the project. Please let me know any questions you have. I am available on Gitter.

@naaaargle

Hi! Are there any tests in particular you would like to see? I am new to open source and think that would be a good way to get started. If you don't have anything specific in mind, I can jump in and try to find something not yet covered. Thanks!

@AndreiUlmeyda

AndreiUlmeyda commented Nov 25, 2016

Herroo!
A few TODOs might possibly be solved by marrying buku and peco as preliminarily tested thereabout.

  • Continuous search at prompt
  • Delete all records based on a search result

and maybe, since peco supports sticky selections,

  • batch tagging.

I am thinking pucu? beco? I dunno! I bet those are naughty words in one language or another.

It would take the form of a little script mashing buku output into a line that looks acceptably pretty in peco and contains all or most of the searchable content (which might fail if preferred search targets contain more than one or two of the longer fields, url/description/tags, but we'll see) and parsing those same lines afterwards for everything from opening the url(s) to batch tagging/deleting (these might require the index).

Cheerio!

UPDATE: try buku -p -j | jq '.[] | .title + " | " + .uri + " | " + .tags' | peco (depends on jq) for a glimpse of what it might look like. But do not worry! Things need to get ugly before they can start to get better. Then again, grandma has lied before.
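
For readers without jq, the same line formatting can be sketched in Python. This assumes the JSON emitted by `buku -p -j` is a list of objects with `title`, `uri` and `tags` keys, as the jq filter above implies; `to_lines` is a hypothetical helper and the sample data is made up.

```python
import json


def to_lines(json_text):
    """Turn buku's JSON output into 'title | uri | tags' lines for a picker."""
    return [
        " | ".join((rec.get("title", ""), rec.get("uri", ""), rec.get("tags", "")))
        for rec in json.loads(json_text)
    ]


sample = json.dumps([
    {"title": "Example", "uri": "http://example.com", "tags": "demo,test"},
    {"title": "", "uri": "http://example.org", "tags": ""},
])
lines = to_lines(sample)
```

A script built around `to_lines` could read the JSON from standard input and print one line per bookmark for peco to consume.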

@jarun
Owner Author

jarun commented Nov 26, 2016

@naaaargle thanks for your offer! There is a lot of stuff not covered in tests yet. I can come up with a list for you.

@jarun
Owner Author

jarun commented Nov 26, 2016

@AndreiUlmeyda, I am open to the options as long as they remain simple. While the terminal is awesome, we'll have to keep the marriage less complicated for the benefit of healthy mortal children.

@jarun
Owner Author

jarun commented Nov 26, 2016

@naaaargle here are some test cases which popped up:

  • test with custom database file (this is not exposed, you can init BukuDb with a custom file)
  • encrypt/decrypt buku with custom number of passes
  • test cases with empty database (print, delete, search)
  • some test cases for update_bm()
  • tests for regex search
  • tests for delete_resultset
  • tests for replacetag
  • if you are here you can find more test cases

Also, please let me know if you'd like to maintain the PyPI branch. You may need to spend some time on PyPI. The project structure is ready for it. But I'm quite blank about PyPI.

@AndreiUlmeyda

Alrightythen, I've thrown the <buku|peco> idea inside a little script and I am pretty sure it classifies as simple, thanks to the inherent neatness of buku. I will try to get the suboptimal line structuring sorted out over the next few days and would greatly appreciate a bit of input from people who can imagine themselves using it. I will open an issue there later laying out where the problems lie and what information from users is needed to get it straight. If that isn't there yet, just throw whatever thoughts you have at it in as many issues as you like.

Cheers

@jarun
Owner Author

jarun commented Nov 26, 2016

@AndreiUlmeyda 👍 Thank you! I'll check it out!

@Qu4tro

Qu4tro commented Nov 30, 2016

Any thoughts on a REST API, so that webapps could have a ready bookmark backend?

@jarun
Owner Author

jarun commented Nov 30, 2016

👍 I am open to PRs. :) Added it in the task list.

@jarun
Owner Author

jarun commented Dec 5, 2016

@AndreiUlmeyda I was trying it. Works well. I have one request though... can you please change the name to something else?

@AndreiUlmeyda

Aw man, that was the best part. But ok, I will do that. Thanks for checking it out

@jarun
Owner Author

jarun commented Dec 5, 2016

"But ok, I will do that."

Thanks!

@AndreiUlmeyda

@jarun Okay, it is done. And I know climate change is an issue but please don't make me change it again. A little question now that the thing is sufficient for my own needs: Would you rather have it existing as a separate project, have it incorporated into buku, or none of the above? Either thing is fine with me. I was happy to be able to improve my bash a bit but in case some people find use for it and request more than one or two additional features I would just rewrite it in python anyway and have it require half the amount of code (or a third when incorporated into buku).

Cheers

@jarun
Owner Author

jarun commented Dec 6, 2016

@AndreiUlmeyda I would like it to be a separate project that would be linked to Buku. As you would have seen, I linked Rasmus' Rofi frontend for Buku from my project. But with your new addition, that wouldn't suffice, so I'd have a distinct section called Projects using Buku. That would add more emphasis and visibility.

What do you say?

@AndreiUlmeyda

@jarun Sparkles with me! Let's do that.

@jarun
Owner Author

jarun commented Dec 6, 2016

Here you go!

@AndreiUlmeyda

Neato. A pleasure doing business with you.

@jarun jarun mentioned this issue Dec 6, 2016
33 tasks
@jarun jarun closed this as completed Dec 6, 2016
Repository owner locked and limited conversation to collaborators Dec 6, 2016