MTNI-199 ⁃ Checklist for finishing up releasable product. #1

blackforestboi · 2017-04-03T12:38:37Z

blackforestboi · 2017-10-23T07:17:59Z

So, now that @poltak hustled through and replaced the search TWICE, we are at the finish line!

The last things to do are:
(@everyone: Any other ideas?)

Features

PRIO 1:

Prefiltering URLs in import: MTNI-141 ⁃ Url filter to reduce number of indexed documents #112 (@arpit, do you think you can tackle this by the end of the week? Otherwise @poltak may be quicker.)
Copywriting on Acknowledgements page & settings (@oliversauter)
Import from Old Extension (@poltak)
Menu bar icon in smaller size (@oliversauter)
Change the text in the blacklist dialogue to include the URL/domain that is about to be deleted, e.g. "Do you want to delete all visits to asana.com?"
Change wording from "recording" to "indexing" pages in the drop-down (@oliversauter)
Adding uninstall survey link (see old extension: https://github.com/WorldBrain/Research-Engine/blob/master/src/js/background.js#L366)
Adding new onboarding screen for first time users.
Making a feature switch for all the console logs. Most of the things are not important to a user, and sometimes can even be cluttering for them. We should discuss what to show and what is only important to devs.
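A minimal sketch of how the blacklist dialogue text could include the domain that is about to be deleted. `blacklistConfirmText` is a hypothetical helper name, and stripping a leading `www.` is an assumption made so the example matches the "asana.com" wording above.

```typescript
// Hypothetical helper: builds the confirmation text for the blacklist dialogue.
// Assumes the input is a full URL including its protocol.
function blacklistConfirmText(url: string): string {
    // The URL constructor is available in browsers and in Node.
    const hostname = new URL(url).hostname.replace(/^www\./, '')
    return `Do you want to delete all visits to ${hostname}?`
}
```

Called with the URL of the page being blacklisted, this yields one sentence that can be dropped into the dialogue as-is.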

PRIO 2:

Filtering by bookmarks MTNI-210 ⁃ Filter by bookmarks #26 (taken on by @mukeshkharita)
Putting previously failed URLs at the back of the import queue. Currently, after each pause or cancel followed by a restart, all failed URLs are fetched again.
First loading bar in import process needs a bit of explanatory text: "Please wait while we analyze & prepare your browsing history & bookmarks"
Handling of Cyrillic, Chinese and Japanese characters?
When first using extension with empty list, show text "You have not made any history yet, either visit some pages and come back, or import your existing history" Image of moonlanding drawing.
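One way the "failed URLs at the back of the queue" item could look, as a rough sketch. The `ImportItem` shape and the `orderImportQueue` name are assumptions for illustration, not the importer's actual data model.

```typescript
// Hypothetical queue item: a URL plus whether a previous attempt failed.
interface ImportItem {
    url: string
    failed: boolean
}

// Keeps the relative order within each group, but moves previously failed
// items behind the ones that have not been tried yet.
function orderImportQueue(items: ImportItem[]): ImportItem[] {
    const fresh = items.filter(item => !item.failed)
    const failed = items.filter(item => item.failed)
    return [...fresh, ...failed]
}
```

Running this once when the import is resumed would mean fresh URLs are processed first and the failed ones are only retried at the end.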

Bugs

Sometimes the screenshot and favicons are not properly saved. To reproduce, close some tabs early, shortly after the indexing is kicked off.
PDFs are not indexed in imports
The term search is an "OR" search; it should be "AND".
Reducing size of screenshots (currently 1.2 MB per page) (potential library?) Target: 50 KB?
When searching after searching something else, the loading bar is not shown anymore. Therefore there is no indication for a user that an actual search happens. It just updates the results.
Quick Blacklist does not work -> it deletes everything, if you choose to, but does not add the page to the blacklist.
If no favicon is available, don't show anything; right now it shows a missing-image placeholder.
Some words are weirdly indexed (see comment)
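To make the OR vs. AND point concrete, here is a small sketch of AND semantics over a term index. The `TermIndex` shape (term mapped to a list of page IDs) and the function name are assumptions for illustration; the real index may be structured differently.

```typescript
// Hypothetical index shape: each term maps to the IDs of pages containing it.
type TermIndex = Map<string, string[]>

// Returns only the pages that contain ALL of the given terms (AND search).
function searchAllTerms(index: TermIndex, terms: string[]): string[] {
    if (terms.length === 0) {
        return []
    }
    // Start with the rarest term so the candidate list stays small.
    const postings = terms
        .map(term => index.get(term) ?? [])
        .sort((a, b) => a.length - b.length)

    let result = postings[0]
    for (const pages of postings.slice(1)) {
        const lookup = new Set(pages)
        result = result.filter(id => lookup.has(id))
        if (result.length === 0) {
            break // no page can match all terms anymore
        }
    }
    return result
}
```

An OR search would instead take the union of the lists, which is why a query with several terms currently returns more results than expected.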

blackforestboi · 2017-10-23T07:51:46Z

@poltak Weird search times:
I assume, though, that in this case the time it takes to load all of the results (almost 500) could be the dominating factor in the "total search" time. For the same query in the address bar it's about half that time. The bug with the screenshot size might be related?

blackforestboi · 2017-10-24T00:12:47Z

Testrun on Firefox:

When trying to run the importer, it is stuck in the preparation mode and throws this error:

Also when opening pages, after 10 seconds Firefox freezes. I assume because of the same error.

poltak · 2017-10-24T04:59:09Z

@oliversauter Need a bit more details about the first one. Is it reproducible at all and with different types of search?
The search branch is working great in FF for me as well. Is this FF problem on a build of master or search branch?

If on the search branch, make sure to reinstall the ext from scratch and wipe your data whenever you update as well.

RE listed bugs:

Sometimes the screenshot and favicons are not properly saved. To reproduce, close some tabs early, shortly after the indexing is kicked off.

This is the expected behaviour of the WebExt API. If the user closes a tab while it's extracting tab data, it will crash as the tab won't exist anymore in memory.

Reducing size of screenshots

We currently store the full res PNG screenshot. We could change to JPEG format and specify to the WebExt API to capture them at lower quality. Relevant API. We only use these as thumbnails for the Overview UI right now, so we can easily get them down to ~50Kb with lower quality JPEGs, but depends if you have any future plans (like displaying them larger somewhere).
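A rough sketch of the JPEG idea, assuming the capture goes through `tabs.captureVisibleTab`, whose options object accepts `format` and `quality`. The helper names and the default quality of 50 are assumptions, and the quality value that actually lands near 50 KB would need to be found by trying a few settings.

```typescript
// Subset of the options accepted by tabs.captureVisibleTab.
interface ImageDetails {
    format: 'jpeg' | 'png'
    quality?: number // 0-100, only meaningful for JPEG
}

// Hypothetical helper: builds capture options, clamping quality to 0-100.
function thumbnailCaptureOptions(quality: number = 50): ImageDetails {
    const clamped = Math.min(100, Math.max(0, Math.round(quality)))
    return { format: 'jpeg', quality: clamped }
}

// Decoded size of a base64 data URL, to compare against the ~50 KB target.
function dataUrlSizeInBytes(dataUrl: string): number {
    const base64 = dataUrl.slice(dataUrl.indexOf(',') + 1)
    const padding = base64.endsWith('==') ? 2 : base64.endsWith('=') ? 1 : 0
    return Math.floor((base64.length * 3) / 4) - padding
}

// In the background script this might be used roughly as:
//   const dataUrl = await browser.tabs.captureVisibleTab(windowId, thumbnailCaptureOptions(40))
//   console.log(dataUrlSizeInBytes(dataUrl))
```

Measuring the data URL size this way makes it easy to compare a few quality values before settling on one.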

when searching after searching something else, the loading bar is not shown anymore

It should still be there, it's just below the results. Either we can stop rendering the results list view immediately as a new query is detected (just leaving the spinner), or we can move the spinner somewhere, like overlaying on top of the results list.

Quick Blacklist does not work

This one is already fixed in my new search branch. Will come later.

If no favicon is available, don't show anything; right now it shows a missing-image placeholder.

In the results list? For pages that don't have favicon data, it should not render any image element there. Can you give me a URL of a page where I could reproduce this behaviour and see if I can fix it?

blackforestboi · 2017-10-24T06:12:20Z

Need a bit more details about the first one. Is it reproducible at all and with different types of search?

What I wondered at first was why the total search time was so high even though the term search time was rather low. My assumption is that the gap of about 3000ms came from loading all those documents from pouch. It was especially noticeable for terms that appear in many pages.
An idea to solve this: if the page docs related to a term-key are stored in order of their initial saving, we could use the same pagination approach of only loading 20 at a time?
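A small sketch of that pagination idea, assuming the IDs for a term are already available as an ordered list. `pageOfResults` and `ResultPage` are hypothetical names; only the page size of 20 comes from the comment above.

```typescript
const PAGE_SIZE = 20

interface ResultPage {
    ids: string[] // IDs of the page docs to load for this page
    hasMore: boolean // whether another page exists after this one
}

// Picks the slice of IDs for one page, so only those docs need to be loaded.
function pageOfResults(
    allIds: string[],
    pageIndex: number,
    pageSize: number = PAGE_SIZE,
): ResultPage {
    const start = pageIndex * pageSize
    return {
        ids: allIds.slice(start, start + pageSize),
        hasMore: start + pageSize < allIds.length,
    }
}

// The selected IDs could then be fetched in one call, e.g. with PouchDB:
//   const res = await db.allDocs({ keys: page.ids, include_docs: true })
```

With almost 500 matches, this would load 20 docs for the first page instead of all of them.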

so we can easily get them down to ~50Kb with lower quality JPEGs, but depends if you have any future plans (like displaying them larger somewhere).

What storage-size would be reasonable so we could display them a bit larger, maybe 3-4 times the size of right now?

or we can move the spinner somewhere, like overlaying on top of the results list.

Why not remove the previous results when a new search is made, so the regular spinner is on top again?
If a person changes the query, they don't want to see the old results anymore anyway.
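Sketched as a plain reducer, clearing the old results on a new query could look like this. The state shape and action names are assumptions for illustration, not the extension's actual store.

```typescript
// Hypothetical search state for the results view.
interface SearchState {
    query: string
    results: string[]
    isLoading: boolean
}

type SearchAction =
    | { type: 'queryChanged'; query: string }
    | { type: 'resultsLoaded'; query: string; results: string[] }

function searchReducer(state: SearchState, action: SearchAction): SearchState {
    switch (action.type) {
        case 'queryChanged':
            // Drop the old results right away so only the spinner is visible.
            return { query: action.query, results: [], isLoading: true }
        case 'resultsLoaded':
            // Ignore results that belong to a query the user already replaced.
            if (action.query !== state.query) {
                return state
            }
            return { ...state, results: action.results, isLoading: false }
        default:
            return state
    }
}
```

Because the list is empty while `isLoading` is true, the spinner ends up at the top of the view without having to be moved or overlaid.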

Can you give me a URL of a page where I could reproduce this behaviour and see if I can fix it?

I saw it a couple of times with twitter pages.

This is the expected behaviour of the WebExt API. If the user closes a tab while it's extracting tab data, it will crash as the tab won't exist anymore in memory.

Should we maybe then delete the page completely?

poltak · 2017-10-24T07:40:11Z

With the first one, need to be able to reproduce it first or else everything is just speculation. Will see if I can try get it reproducible with larger data set later today.

What storage-size would be reasonable so we could display them a bit larger, maybe 3-4 times the size of right now?

All depends on the amount of visual artefacts you want visible. That would be the main result of changing to JPEG and lowering quality. I'll try a few quality values later and send you some screenshots of what it would look like.

I saw it a couple of times with twitter pages.

Really need specific examples of where it's happening for you so I can reproduce.

Should we maybe then delete the page completely?

Yeah, it would be nice, but as the entire page visit scenario is a bunch of async stuff, there's no guarantee of what stage it got up to before the user cancelled the tab (the state of the DBs, both for index and pouch, is unknown). This one would be better as its own issue to work towards and see how we can handle different scenarios.

blackforestboi · 2017-10-24T08:59:09Z

Will see if I can try get it reproducible with larger data set later today.

Hack: looked for terms in DB that have 300+ urls in the map.

Really need specific examples of where it's happening for you so I can reproduce.

I figured out that this problem appears for Twitter pages that I import.

This one would be better as its own issue to work towards and see how we can handle different scenarios.

Ok defo something for later.

poltak · 2017-11-02T05:53:15Z

When first using extension with empty list, show text "You have not made any history yet, either visit some pages and come back, or import your existing history" Image of moonlanding drawing.

This one would be a bit messy to implement. We'd either need to do:

Have an initial query on both DBs at the start of every search, seeing if there's anything in there.
Have a one-time flag stored somewhere that says "no data yet", but this would need to be checked every time we index a file (and swapped if not set).

Open to other suggestions.

Making a feature switch for all the console logs. Most of the things are not important to a user, and sometimes can even be cluttering for them. We should discuss what to show and what is only important to devs.

Anything logged to console should only really be a concern to devs; shouldn't really need to concern ourselves with standard users opening the console - they can if they want. At the moment it should be mostly timers for various events and various errors that are caught. None are needed at all, but the errors are nicer than hiding them, IMO. Any ideas with what you want/don't want in there?

blackforestboi · 2017-11-02T07:16:04Z

This one would be a bit messy to implement.

Ok, was just an idea. Let's do it later. I think it is a good onboarding feature. But defo for later improvement

Any ideas with what you want/don't want in there?

In the old extension I often got the feedback that in the console of a page (not the extension console) it was cluttered with (error) messages, thus being annoying for developers who use the console to investigate other things unrelated to the extension.

So the priority would be to remove all logs that appear in the page console. I currently only see one though:
html-pipeline: 1101.947021484375ms
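If a timer like this is ever wanted for debugging, one option is to gate it behind a switch so nothing reaches the page console by default. This is only a sketch: `createTimer` and the `enabled` flag are hypothetical and not part of the codebase.

```typescript
type Logger = (message: string) => void

// Returns a timing wrapper that only logs when the switch is on.
// Assumes the timed function is synchronous.
function createTimer(enabled: boolean, log: Logger = console.log) {
    return function timed<T>(label: string, fn: () => T): T {
        if (!enabled) {
            return fn() // switch off: run the work, log nothing
        }
        const start = Date.now()
        try {
            return fn()
        } finally {
            log(`${label}: ${Date.now() - start}ms`)
        }
    }
}
```

A content script could create the timer once with the switch off for regular users, so pages stay clean while devs can still turn it on.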


blackforestboi · 2017-11-08T13:20:28Z

THANK YOU all for your awesome work in the past months. You made this possible.
Yesterday we pushed the new version of the tool into the Chrome and Firefox Addon Stores.
worldbrain.io/download_extensions.

Closing this as all things for the release have been done.


blackforestboi added the Prio 1 label May 29, 2017

blackforestboi mentioned this issue Oct 24, 2017

MTNI-154 ⁃ Simple terms indexed search #125

Merged

17 tasks

poltak mentioned this issue Oct 30, 2017

MTNI-162 ⁃ Some Firefox versions not granting ext. access to IndexedDB #133

Closed

blackforestboi mentioned this issue Nov 1, 2017

MTNI-173 ⁃ Counter for bookmarks / History imports is still off #144

Closed

WorldBrain-syncboy added the Prio 1 label Nov 4, 2017

poltak added a commit that referenced this issue Nov 6, 2017

Remove console timer from HTML transform module

148c0ac

- this stuff gets run in content script, hence pollutes every page's console that it runs on - #1 (comment)

blackforestboi closed this as completed Nov 8, 2017

poltak mentioned this issue Nov 13, 2017

Show initial search message when user has no data #192

Merged

blackforestboi changed the title ~~Checklist for finishing up releasable product.~~ MTNI-199 ⁃ Checklist for finishing up releasable product. Apr 19, 2018

kellective mentioned this issue Oct 10, 2019

Onboarding tidy up #886

Merged

poltak pushed a commit that referenced this issue Jun 2, 2020

Merge pull request #1 from WorldBrain/develop

a0e6125

Pull in changes from upstream
