sorting results #4
Supported locales: dz, bo |
iPhone 7 is 10.3.3 beta 1 |
Ok, in addition to following these bug reports (plus this Debian one), I'll find a workaround so that we can sort Tibetan (we'll need it in other contexts too anyway) |
Why do we need to sort results for this app? Is this the equivalence of sort alpha? For large search results on weaker phones, this will slow performance. If the indexes were pre-sorted, it may be that the results would be sorted. Another way to do this is to assign a presort number across all indexes so that I can do a simple numeric sort at the database level. It would increase the index sizes slightly, but grant a nice level of performance and let the sort code be based on the server (a more controlled environment) rather than the client. |
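The presort-number idea above could be sketched roughly as follows. This is only an illustration under assumptions: `assignSortRanks` and `placeholderCompare` are hypothetical names, and the placeholder comparator uses plain code-unit order, which is not correct Tibetan collation. A real build step would plug in a proper Tibetan comparator.

```javascript
// Hypothetical build-time helper: give every distinct title an integer rank
// so that later queries can order results with a cheap numeric sort.
function assignSortRanks(titles, compare) {
  const sorted = Array.from(new Set(titles)).sort(compare);
  const ranks = new Map();
  sorted.forEach((title, index) => ranks.set(title, index));
  return ranks;
}

// Placeholder comparator (assumption): code-unit order, NOT real Tibetan collation.
function placeholderCompare(a, b) {
  return a < b ? -1 : a > b ? 1 : 0;
}

const indexTitles = ['གསུང་འབུམ', 'ཀུན་མཁྱེན', 'ཆོས་ཀྱི'];
const ranks = assignSortRanks(indexTitles, placeholderCompare);

// At query time each result carries its stored rank and is ordered numerically.
const rankedResults = indexTitles.map((title) => ({ title, sortRank: ranks.get(title) }));
rankedResults.sort((a, b) => a.sortRank - b.sortRank);
```

The expensive collation runs once when the indexes are generated; the client only compares integers, at the cost of one extra number per index entry.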
What does "sort alpha" mean? If it means alphabetical sorting then yes, and that's necessary... We can imagine doing that on the server but that's going to take some time (Tibetan sorting is not obvious at all), so let's first give it a try on the client. lasca is some sort of reimplementation of the UCA in JS, and it has some rules for Tibetan so we don't even need to write them down... I've forked it so that we can package it, but I admit I'm not completely sure how to do that properly, so I could use a little help... What would be needed to use it in a simple way in the app? |
To sort after the results have been returned from the database, the library would be best as a module that I can import (ES6 syntax). If it is possible for you to get it working on a JavaScript page by including a script, I can take care of the rest. Packaging can be as easy as exporting the object/class etc. that does the work. I have no experience with UCA or lasca, so I am glad you do! |
I've added a demo.html on the repo. I'll update the |
(done) |
BTW, don't hesitate to modify the lasca.js file to make it export something nice and modern. I suspect it's doing something not very intelligent right now, but modules in JS are not really my comfort zone... (looks like something always evolving in many different directions) |
Great, thanks, Élie! |
There appears to be a problem with sort. Maybe you can help me, @eroux ? Here is a list of 20 titles, ཀུན་མཁྱེན་རིག་པ་འཛིན་པ་ཆེན་པོ་ཆོས་ཀྱི་གྲགས་པའི་གསུང་འབུམ As a list of objects:
This throws an exception. |
and more basically:
|
should be ok with latest lasca commit |
It appears that the search now works!! But.. it is not fast enough, even on a laptop. Searching for 'W2' returns 5869 results, which take 15 seconds to sort in Chrome on a MacBook Pro (2.9 GHz Intel Core i7). I believe we need a different method for sorting, perhaps one applied before the indices are provided to the app. We could assign an integer value that is valid across all types of index; this way, I could order results during the initial query. An alternative is to pre-sort in the app and assign this sort integer there. Either way, this has now been relegated to the bottom of the Release 1 list. |
Wow, these numbers are clearly too big. I've spotted a few easy optimizations that can be made in lasca, I'll take some time tomorrow to implement them... |
I've made some optimizations and I'm trying to test them; I'll open a separate issue for that |
Tried again with the latest lasca.js optimisations. Unfortunately, the exception reared its ugly head with the largest dataset. I did get numbers for the smaller sets, and they show no change in execution time. I am pushing sort to Release 2. |
hmm ok, I'll test it further tomorrow with a larger dataset (I just realized I could use the .json files from the repo, I'll do that) |
(if you have a simple way to test it with the whole dataset, I'd be happy to fix the thing)
When I test, I use phonegap serve -- I also work on a reduced dataset, where I include all of the index files, but only a few selected detail json files. |
I got annoyed by lasca so I wrote a small lib that performs much better: tibetan-sort-js. It is packaged in a quite modern way so it should be easy to use... Example:
var tibetansort = require('tibetan-sort-js');
// load one index file and sort its keys (the Tibetan strings)
var big = require('path/to/workIndex-0.json');
var bigA = Object.keys(big);
// time the sort itself
var before = new Date();
bigA.sort(tibetansort.compare);
var after = new Date();
console.log("sorted "+bigA.length+" strings in "+(after.getTime() - before.getTime())+"ms");
output:
which is reasonable I think... almost 100 times faster than lasca. |
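Since search results in the app are objects rather than bare strings, a string comparator such as `tibetansort.compare` would need a thin wrapper. A minimal sketch, assuming each result has a `title` field; `byField`, `standInCompare`, and the sample `id` values are hypothetical, and the stand-in comparator is only code-unit order so that the snippet runs without the library installed:

```javascript
// Hypothetical helper: lift a string comparator so it compares objects by one field.
function byField(field, compareStrings) {
  return (a, b) => compareStrings(a[field], b[field]);
}

// Stand-in comparator (assumption): code-unit order, NOT real Tibetan collation.
// In the app this argument would be tibetansort.compare instead.
function standInCompare(a, b) {
  return a < b ? -1 : a > b ? 1 : 0;
}

// Illustrative search hits; ids and titles are made up for the example.
const hits = [
  { id: 1, title: 'ཆོས་ཀྱི' },
  { id: 2, title: 'ཀུན་མཁྱེན' },
  { id: 3, title: 'གསུང་འབུམ' },
];

hits.sort(byField('title', standInCompare));
```

Swapping `standInCompare` for the library's comparator should be the only change needed, provided every hit really has a string in the chosen field.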
Results should be sorted (see google doc). A solution that seems reasonable is to use Intl.Collator with the bo locale, or maybe falling back to the dz locale for very old phones (the bo locale should work for 2 years or so). As a side note, there is a request to improve Tibetan collation in CLDR with the work I did in this repo.
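The locale fallback described above might look roughly like this. It is a sketch under assumptions: `pickCollator` is a hypothetical helper, and the `Intl.Collator.supportedLocalesOf` check only reports what the runtime claims to support. Whether the resulting order is actually correct for Tibetan on a given device still has to be verified separately.

```javascript
// Hypothetical helper: return a collator for the first locale in `preferred`
// that the runtime reports as supported, or null if none is (or Intl is missing).
function pickCollator(preferred) {
  if (typeof Intl === 'undefined' || typeof Intl.Collator !== 'function') {
    return null;
  }
  const supported = Intl.Collator.supportedLocalesOf(preferred);
  return supported.length > 0 ? new Intl.Collator(supported[0]) : null;
}

// Try Tibetan first, then Dzongkha, as suggested in the issue.
const collator = pickCollator(['bo', 'dz']);
const sampleTitles = ['གསུང་འབུམ', 'ཀུན་མཁྱེན', 'ཆོས་ཀྱི'];

if (collator) {
  sampleTitles.sort(collator.compare);
  console.log('sorted with locale ' + collator.resolvedOptions().locale);
} else {
  // No native support reported: a pure-JS comparator would be needed here.
  console.log('no bo/dz collation reported by this runtime');
}
```

Returning null instead of silently using the default locale keeps the unsupported case visible, so the app can decide to fall back to a JavaScript library.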
I'm running some experiments on http://eroux.fr/locales.html. It doesn't seem to work, but I cannot understand why. I'll file some bugs, ask questions, etc. and update the issue. In the meantime I'm interested in the result of the page in your browser and on your phones!