Recoll WebUI returns 0 results when search folder other than <all> is chosen #23

Closed
hagfelsh opened this issue Jan 19, 2014 · 17 comments

@hagfelsh

Hi! Marvelous work on this project, I just discovered recoll & webui and am absolutely delighted at its power.

I discovered something odd today, though, while using the webUI: when constraining the search scope to a subdirectory (rather than <all>), the search returns 0 results, in every case, in every directory. The Recoll search GUI itself returns the expected results.

Searching for the same term from <all> properly finds it, even within the directories that return 0 results when searched exclusively.

I've reinstalled both, just in case there was a problem there somewhere, but the problem is easily duplicated.

I'm using Fedora 20 x64, Recoll 1.19.19 + Xapian 1.2.15. WebUI version was whatever was up on January 18th, about 12 GMT. The web browser is Firefox 25.0. I also tried it in chrome 32.0.1700.76 m with the same results.

The searched material is on a CIFS share mounted on the Fedora machine. WebUI is started from the same account that owns the index (non-root).

Here is a comparison of a search for "balance" in a folder called ENG Doc Control/Docs. The base of the searched directories is mw-ksb.

recoll query:
(((balance:(wqf=11) OR balancing OR balanced OR balancer OR balances OR balancers) AND (XP PHRASE 4 XPmw-ksb PHRASE 4 XPENG Doc Control PHRASE 4 XPDOCs)))

webui query:
"GET /results?query=balance&dir=mw-ksbENG+Doc+ControlDOCs&after=&before=&sort=relevancyrating&ascending=0&page=1 HTTP/1.1" 200 9665

The web query in this example returned 0 results, while the Recoll UI, which was constrained to the same subfolder, returned 239.

Here's what the webUI returns when set to <all> for the same search term:

"GET /results?query=balance&dir=%3Call%3E&after=&before=&sort=relevancyrating&ascending=0&page=1 HTTP/1.1" 200 70209
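
To see what each of the two requests above actually carries in the `dir` parameter, the logged request targets can be decoded with Python's standard library. This is only a debugging sketch; the helper name `dir_param` is made up for illustration and is not part of webui.py.

```python
from urllib.parse import urlsplit, parse_qs

def dir_param(request_target):
    """Return the decoded 'dir' value from a logged request target."""
    query = urlsplit(request_target).query
    # parse_qs turns '+' into a space and %XX escapes into characters
    return parse_qs(query)["dir"][0]

# Request targets copied from the two log lines above
failing = ("/results?query=balance&dir=mw-ksbENG+Doc+ControlDOCs"
           "&after=&before=&sort=relevancyrating&ascending=0&page=1")
working = ("/results?query=balance&dir=%3Call%3E"
           "&after=&before=&sort=relevancyrating&ascending=0&page=1")

print(dir_param(failing))  # mw-ksbENG Doc ControlDOCs
print(dir_param(working))  # <all>
```

The decoded value of the failing request contains no path separators at all, so it cannot match any indexed directory.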

Please let me know if I can provide more information. Your help is greatly appreciated!

@ghost

ghost commented Jan 20, 2014

Hi,

I'll assume that you are using Recoll 1.19.9 as .19 is not yet there :)

The query encoding seems wrong on the failing query: there should be some "%2F" pieces for the / separators. Is this a GitHub effect, or are they really missing when printed on the terminal?

An initial try at reproducing this failed (I do get a correctly encoded query and results), but I can try harder once I know that the / characters are really missing.

"GET /results?query=sac&dir=d%2Fdir+with+blanks&after=&before=&sort=relevancyrating&ascending=0&page=1 HTTP/1.1"

@hagfelsh
Author

Oops... version from the future!

I did my best to read the webui code that generates the query, but my python isn't very strong. I ran diff against the webui.py that's in the zip & what is running on my machine and the files are identical.

Is there a debug flag I can set to create more verbose logging for you? The query string I posted for the webui was from stdout, which seems to be where the app sends its logging info.

@ghost

ghost commented Jan 20, 2014

This is weird. In the query, there should be slash characters encoded as %2F. Instead, they seem to be suppressed. In other terms, "dir=mw-ksbENG+Doc+ControlDOCs" should look like "dir=mw-ksb%2FENG+Doc+Control%2FDOCs" instead (and the capitalization of "DOCs" is weird too by the way).
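
As a quick check of the expected encoding, here is a minimal sketch using Python's standard `urllib.parse.urlencode`, which applies the same form-style encoding (space as `+`, `/` as `%2F`). The parameter values are taken from this thread; the variable names are only illustrative.

```python
from urllib.parse import urlencode

# Folder value as it should appear in the <option value="..."> attribute
params = {"query": "balance", "dir": "mw-ksb/ENG Doc Control/DOCs"}

encoded = urlencode(params)
print(encoded)  # query=balance&dir=mw-ksb%2FENG+Doc+Control%2FDOCs
```

If the slashes are already missing from the form value, the encoded query cannot contain any `%2F`, which is what the failing request shows.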

I think that it would be interesting to have a look at the generated HTML by using "show page source". I am especially interested by the "folders" section, the part which looks like:

 <b>Folder</b><br>
        <select id="folders" name="dir">
                <option style="margin-left: 0em" value="&lt;all&gt;">&lt;all&gt;</option>
                <option style="margin-left: 0em" selected value="d">d</option>
                <option style="margin-left: 2em" value="d/d1">d1</option>
                <option style="margin-left: 4em" value="d/d1/d3">d3</option>
                <option style="margin-left: 2em" value="d/d2">d2</option>
                <option style="margin-left: 2em" value="d/dir with blanks">dir with blanks</option>
                <option style="margin-left: 2em" selected value="d/ENG Doc Control">ENG Doc Control</option>
                <option style="margin-left: 4em" selected value="d/ENG Doc Control/Docs">Docs</option>
        </select><br>

(Put the data between 2 lines with 4 backquotes to prevent interpretation by GitHub, have a look at the "Markdown" link above).

@hagfelsh
Author

Oo! The capitalization in "DOCs" is correct--that's how the directory is named, for whatever reason.

Looking at your example HTML, I wonder if this has anything to do with this index being of a CIFS share mounted on a Windows file server...?

Here's the source of the page after searching for "balance" with 0 results returned, from the same directory as in my original post. I obfuscated the folder names, but kept any spaces or special characters that were present in the originals. The only non-letter characters present were ( ) - . and _


<body>
<div id="fade"></div>
<div id="searchbox">
<form action="results" method="get">
<table id="form">
<tr>
    <td width="50%">
        <b>Query</b>
        <input tabindex="0" type="search" name="query" value="balance" autofocus><br><br>
        <input type="submit" value="Search">&nbsp;
        <a href="./" tabindex="-1"><input type="button" value="Reset"></a>&nbsp;
        <a href="settings" tabindex="-1"><input type="button" value="Settings"></a>
    </td>
    <td width="30%">
        <b>Folder</b><br>
        <select id="folders" name="dir">
                <option style="margin-left: 0em" value="&lt;all&gt;">&lt;all&gt;</option>
                <option style="margin-left: 0em" selected value="mw-ksb">mw-ksb</option>
                <option style="margin-left: 0em" value="mw-ksbDocumentation">mw-ksbDocumentation</option>
                <option style="margin-left: 0em" value="mw-ksbDocumentationXXXX">mw-ksbDocumentationXXXX</option>
                <option style="margin-left: 0em" value="mw-ksbDocumentationXXXX">mw-ksbDocumentationXXXX</option>
                <option style="margin-left: 0em" value="mw-ksbDocumentationXxxxxxxxx">mw-ksbDocumentationXxxxxxxxx</option>
                <option style="margin-left: 0em" value="mw-ksbDocumentationXxxxRelease">mw-ksbDocumentationXxxxRelease</option>
                <option style="margin-left: 0em" selected value="mw-ksbENG Doc Control">mw-ksbENG Doc Control</option>
                <option style="margin-left: 0em" selected value="mw-ksbENG Doc ControlDOCs">mw-ksbENG Doc ControlDOCs</option>
                <option style="margin-left: 0em" value="mw-ksbStuff">mw-ksbStuff</option>
                <option style="margin-left: 0em" value="mw-ksbStuffOneNote_RecycleBin">mw-ksbStuffOneNote_RecycleBin</option>
                <option style="margin-left: 0em" value="mw-ksbHTTP_Save">mw-ksbHTTP_Save</option>
                <option style="margin-left: 0em" value="mw-ksbMisc">mw-ksbMisc</option>
                <option style="margin-left: 0em" value="mw-ksbPt Ds">mw-ksbPt Ds</option>
                <option style="margin-left: 0em" value="mw-ksbPt Dscore">mw-ksbPt Dscore</option>
                <option style="margin-left: 0em" value="mw-ksbPt DsEHV">mw-ksbPt DsEHV</option>
                <option style="margin-left: 0em" value="mw-ksbPt Dsmirror">mw-ksbPt Dsmirror</option>
                <option style="margin-left: 0em" value="mw-ksbPt Dsplatform">mw-ksbPt Dsplatform</option>
                <option style="margin-left: 0em" value="mw-ksbPt Dstest">mw-ksbPt Dstest</option>
                <option style="margin-left: 0em" value="mw-ksbPt Pub">mw-ksbPt Pub</option>
                <option style="margin-left: 0em" value="mw-ksbRecent page">mw-ksbRecent page</option>
                <option style="margin-left: 0em" value="mw-ksbRecent pageA_HZ">mw-ksbRecent pageA_HZ</option>
                <option style="margin-left: 0em" value="mw-ksbRecent pageRemote_Mfg">mw-ksbRecent pageRemote_Mfg</option>
                <option style="margin-left: 0em" value="mw-ksbRecent pageD_Hs">mw-ksbRecent pageD_Hs</option>
                <option style="margin-left: 0em" value="mw-ksbRecent paged_ws">mw-ksbRecent paged_ws</option>
                <option style="margin-left: 0em" value="mw-ksbRecent pageH_Lo">mw-ksbRecent pageH_Lo</option>
                <option style="margin-left: 0em" value="mw-ksbRecent pageJ_Jo">mw-ksbRecent pageJ_Jo</option>
                <option style="margin-left: 0em" value="mw-ksbRecent pagejrs">mw-ksbRecent pagejrs</option>
                <option style="margin-left: 0em" value="mw-ksbRecent pageM_Mds">mw-ksbRecent pageM_Mds</option>
                <option style="margin-left: 0em" value="mw-ksbRecent pageM_Km">mw-ksbRecent pageM_Km</option>
                <option style="margin-left: 0em" value="mw-ksbRecent pageR_En">mw-ksbRecent pageR_En</option>
                <option style="margin-left: 0em" value="mw-ksbRecent pageRn_Se">mw-ksbRecent pageRn_Se</option>
                <option style="margin-left: 0em" value="mw-ksbRecent pageR_Tp">mw-ksbRecent pageR_Tp</option>
                <option style="margin-left: 0em" value="mw-ksbRecent pageRTp">mw-ksbRecent pageRTp</option>
                <option style="margin-left: 0em" value="mw-ksbRecent pageS_Pgl">mw-ksbRecent pageS_Pgl</option>
                <option style="margin-left: 0em" value="mw-ksbRecent pagetfl">mw-ksbRecent pagetfl</option>
                <option style="margin-left: 0em" value="mw-ksbRecent pageTime">mw-ksbRecent pageTime</option>
                <option style="margin-left: 0em" value="mw-ksbRecent pageW_He">mw-ksbRecent pageW_He</option>
                <option style="margin-left: 0em" value="mw-ksbStandards">mw-ksbStandards</option>
                <option style="margin-left: 0em" value="mw-ksbStandardsServices">mw-ksbStandardsServices</option>
                <option style="margin-left: 0em" value="mw-ksbStandardsStandards (from MXP)">mw-ksbStandardsStandards (from MXP)</option>
                <option style="margin-left: 0em" value="mw-ksbStandardsSAF-TY">mw-ksbStandardsSAF-TY</option>
                <option style="margin-left: 0em" value="mw-ksbSPL">mw-ksbSPL</option>
                <option style="margin-left: 0em" value="mw-ksbSPLR010035003000300">mw-ksbSPLR010035003000300</option>
                <option style="margin-left: 0em" value="mw-ksbSPLR0500350033.114">mw-ksbSPLR0500350033.114</option>
                <option style="margin-left: 0em" value="mw-ksbSPLR050035003600318">mw-ksbSPLR050035003600318</option>
                <option style="margin-left: 0em" value="mw-ksbSPLR050035.1000328">mw-ksbSPLR050035.1000328</option>
                <option style="margin-left: 0em" value="mw-ksbSPLR050035.2100303">mw-ksbSPLR050035.2100303</option>
                <option style="margin-left: 0em" value="mw-ksbSPLR0500360031.413">mw-ksbSPLR0500360031.413</option>
                <option style="margin-left: 0em" value="mw-ksbSPLR060030003500343">mw-ksbSPLR060030003500343</option>
                <option style="margin-left: 0em" value="mw-ksbSPLR060031003300304">mw-ksbSPLR060031003300304</option>
                <option style="margin-left: 0em" value="mw-ksbSPLR060032003200315">mw-ksbSPLR060032003200315</option>
                <option style="margin-left: 0em" value="mw-ksbSPLR060033.10.106">mw-ksbSPLR060033.10.106</option>
                <option style="margin-left: 0em" value="mw-ksbSPLR060034003100374">mw-ksbSPLR060034003100374</option>
                <option style="margin-left: 0em" value="mw-ksbSPLR060034003200312">mw-ksbSPLR060034003200312</option>
                <option style="margin-left: 0em" value="mw-ksbSPLR0600350031.159">mw-ksbSPLR0600350031.159</option>
                <option style="margin-left: 0em" value="mw-ksbSPLR0600350031.195">mw-ksbSPLR0600350031.195</option>
                <option style="margin-left: 0em" value="mw-ksbWsmall">mw-ksbWsmall</option>
                <option style="margin-left: 0em" value="mw-ksbWsmallqw">mw-ksbWsmallqw</option>
                <option style="margin-left: 0em" value="mw-ksbWsmallWsmall">mw-ksbWsmallWsmall</option>
                <option style="margin-left: 0em" value="mw-ksbWsmallwifxt">mw-ksbWsmallwifxt</option>
        </select><br>
        <b>Dates</b> <small class="gray">YYYY[-MM][-DD]</small><br>
        <input name="after" value="" autocomplete="off"> &mdash; <input name="before" value="" autocomplete="off">
    </td>
    <td>
        <b>Sort by</b>
        <select name="sort">
                <option selected value="relevancyrating">Relevancy</option>
                <option value="mtime">Date</option>
                <option value="url">Path</option>
                <option value="filename">Filename</option>
                <option value="fbytes">Size</option>
                <option value="author">Author</option>
        </select><br>
        <b>Order</b>
        <select name="ascending">
                <option value="0" selected>Descending</option>
                <option value="1">Ascending</option>
        </select>
    </td>
</tr>
</table>
<input type="hidden" name="page" value="1" />
</form>
</div>

@hagfelsh
Author

On a whim, I partially tested the CIFS theory by indexing a local directory. The resulting folder list has a distinctly different visual appearance than that of my CIFS-share directories. Each separate dir has its own line with only its name listed, rather than the concatenation of its name as well as its parents back to root.

Tomorrow, I'll map up a LUN to the recoll machine and copy my index target over so it's stored locally in ext4. I suspect that everything will work as expected...

@ghost

ghost commented Jan 21, 2014

The / characters are definitely missing. I would like to try to reproduce this, as it would make a resolution easier. Could you please tell me precisely how the CIFS share is mounted? I tried with a vanilla autofs mount and things look normal...

@hagfelsh
Author

I'm not sure I understand what sort of information you'd like, so let me know if I'm missing something.

The CIFS mount options are just _netdev,ro, and the share is mounted on a mountpoint directly under /. On the serving side, it's a Windows 2008 R2 file server sharing with full network permissions and full NTFS permissions for the user I'm using to mount it.

@ghost

ghost commented Jan 23, 2014

Thanks, I was wondering if you could have been using a fuse-based mount. I'll try to reproduce the issue, but I currently have trouble getting Fedora 20 to behave as a Virtualbox guest.

@ghost

ghost commented Jan 23, 2014

Ok, I can reproduce the problem. I can try to see what happens and look for a possible fix now.

@ghost

ghost commented Jan 23, 2014

Ok, I think it's fixed. This had nothing to do with the kind of system actually, just the fact that the top dir was directly under root (/). There is a fixed file here: https://github.com/medoc92/recoll-webui/blob/master/webui.py
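
For readers wondering how a top directory directly under / can make every slash vanish, here is one plausible mechanism, sketched under assumptions. Both functions and their names are hypothetical and are not taken from the actual webui.py; the sketch only shows that stripping the top directory's parent with a plain string replace reproduces the symptom seen in this thread, while a real relative-path computation does not.

```python
import posixpath

def naive_relative(path, top):
    # Hypothetical approach: strip the parent of the top dir by string replace.
    parent = top.rsplit("/", 1)[0]  # "" when top sits directly under "/"
    # With parent == "", this is path.replace("/", ""), which drops every slash.
    return path.replace(parent + "/", "")

def safer_relative(path, top):
    # Compute the path relative to the top dir's parent instead.
    return posixpath.relpath(path, posixpath.dirname(top))

# Top dir a few levels down: both approaches agree.
print(naive_relative("/home/user/d/d1", "/home/user/d"))              # d/d1
print(safer_relative("/home/user/d/d1", "/home/user/d"))              # d/d1

# Top dir directly under "/": the naive version loses the separators.
print(naive_relative("/mw-ksb/ENG Doc Control/DOCs", "/mw-ksb"))      # mw-ksbENG Doc ControlDOCs
print(safer_relative("/mw-ksb/ENG Doc Control/DOCs", "/mw-ksb"))      # mw-ksb/ENG Doc Control/DOCs
```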

Please let me know how this works for you and I'll put up a pull request.

Cheers,

jf

@hagfelsh
Author

Sorry for the delay--github seems to have ceased notifying me of updates to this thread...

Fantastic! It's fixed!

Now, a related question: are the subdirectories supposed to be listed out one after another, or is there supposed to be some sort of visual or treed organization to show the parent/child relationship?

What I'm seeing in the dropdown is just a pure list of each directory name, sorted like this:
parentA
parentB
parentC
childA
childB
childC
childAchildA
childBchildA

If the above is confusing, it lists all the level 1 directories first, then the level 2 afterward, then 3 & so on.

@ghost

ghost commented Jan 30, 2014

Ok, this is weird. Here is what my folder menu looks like, it's a representation of the tree:

[Image: recoll-webui-folder-menu]

This is with a recent Firefox. Do you get the same thing with Firefox and Chrome?

Cheers,

jf

@hagfelsh
Author

hagfelsh commented Feb 3, 2014

Looks to be a browser specific problem!

I get just a straight, non-indented pile of words from the following versions:
Chrome (running on Windows 7) 32.0.1700.102 m
IE (running on Windows 7) 9.0.8112.16421

However, it does display correctly in Firefox 22.0 (running on Win 7) and 25.0 (running on Fedora), and also in FF ESR 10.0.5 (running on CentOS).

Writing for multiple browsers must be miserable!

@ghost

ghost commented Feb 3, 2014

Yes, it must be awful. Happily enough, I'm more of a desktop programmer...

Anyway, while koniu seems to be away, I have changed the indentation method to something ugly but which should work on all browsers (hopefully). The modified file is here:
https://github.com/medoc92/recoll-webui/blob/master/views/search.tpl
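
The thread does not show what the new indentation method is. One common browser-independent approach, given here purely as an assumption, is to prefix each option label with non-breaking spaces instead of relying on a `margin-left` style on `<option>`, which Chrome and IE ignored in the reports above. The function name `option_html` and the indent width are made up for this sketch.

```python
from html import escape

NBSP = "\u00a0"  # non-breaking space, not collapsed by the browser

def option_html(path, indent=4):
    """Render one folder <option>, indenting the label by its depth."""
    parts = path.split("/")
    depth = len(parts) - 1              # "d" -> 0, "d/d1" -> 1, ...
    label = NBSP * (indent * depth) + parts[-1]
    # The value keeps the full relative path; only the visible label is indented.
    return '<option value="{}">{}</option>'.format(
        escape(path, quote=True), escape(label))

for folder in ["d", "d/d1", "d/d1/d3", "d/dir with blanks"]:
    print(option_html(folder))
```

The `value` attribute still carries the slash-separated path, so the search request is unaffected by how the label is drawn.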

I'll create a pull request, but I really hope that a nicer solution can be found...

@koniu
Owner

koniu commented Feb 6, 2014

Assuming this is fixed. Thanks medoc

@koniu koniu closed this as completed Feb 6, 2014
@hagfelsh
Author

hagfelsh commented Feb 6, 2014

Sorry for my delay in replying, guys--I'll apply this patch and report back.

@hagfelsh
Author

hagfelsh commented Feb 6, 2014

Fixed!

Thank you guys for your help. This thing is a masterpiece!
