Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for searching in external files (fixing #89) #100

Closed
wants to merge 4 commits into from

Conversation

R0Wi
Copy link
Member

@R0Wi R0Wi commented Apr 12, 2020

This fixes #89. It is implemented like proposed here: nextcloud/fulltextsearch#546 (comment)

@TheR00st3r
Copy link

Are there plans when this PR will be considered and released?

@R0Wi
Copy link
Member Author

R0Wi commented Jun 15, 2020

@4ipalino does this fix the issue for you?

@TheR00st3r
Copy link

@R0Wi I tested it again and it worked on my system. As long as the external Drive is not used as Rootdirectory with foldername "/".

It´s important to set an explicit foldername for the external share:
image

@R0Wi
Copy link
Member Author

R0Wi commented Jun 16, 2020

Thanks for the feedback. Thats a case i didn't test so i'll have a look at this. Maybe i can implement a solution for that, too.

@R0Wi
Copy link
Member Author

R0Wi commented Jun 21, 2020

@4ipalino i modified the patch so now a external share mounted as root-directory should also work. If that's the case, the check of the filepath is now omitted because logically seen every "external file" is allowed to appear in the search result.

Maybe you would like to check that out if you've some time left and give me feedback on this 😄

@trendzetter
Copy link

I can confirm the fix is working. The code looks fine but probably my opinion is is not enough as a review since I am not a nextcloud developer.

@R0Wi
Copy link
Member Author

R0Wi commented Jun 24, 2020

@daita @MorrisJobke or maybe @skjnldsv any chance to get some feedback on this, please?

@skjnldsv
Copy link
Member

@daita @MorrisJobke or maybe @skjnldsv any chance to get some feedback on this, please?

Sorry, this is out of my expertise! :/

@MorrisJobke MorrisJobke removed their request for review July 2, 2020 22:00
@trendzetter
Copy link

trendzetter commented Jul 6, 2020

WTF!? This is a fix for a major blocking bug in one of the most important features to make this enterprise ready. Why is no-one of the nextcloud willing to review your code??

@ArtificialOwl
Copy link
Member

Hello, main dev on fts here.

I was supposed to work on the app last week but got overwhelmed by other more important thing.

@R0Wi
Copy link
Member Author

R0Wi commented Jul 6, 2020

Hello, main dev on fts here.

I was supposed to work on the app last week but got overwhelmed by other more important thing.

No problem for me. I already did a few minor improvements for some customers who use the FTS so maybe it would be interesting for you if i can support you somehow. Just let me know :-)

@trendzetter
Copy link

Sorry for my emotional reaction. I am looking into the contribution guidelines, see if I can make more of a contribution instead of pointing out issues.

@trendzetter
Copy link

I am still figuring out why this happens, not sure if it has anything to do with this fix but I am not seeing all of the results found. I often see only one or 2 or even no results listed. I just know results are found because it displays on top of the page:"de zoekopdracht in Files naar 'bunny' leverde 8 resultaten op in 51 ms" (8 results found in Dutch)

@R0Wi
Copy link
Member Author

R0Wi commented Jul 8, 2020

I am still figuring out why this happens, not sure if it has anything to do with this fix but I am not seeing all of the results found. I often see only one or 2 or even no results listed. I just know results are found because it displays on top of the page:"de zoekopdracht in Files naar 'bunny' leverde 8 resultaten op in 51 ms" (8 results found in Dutch)

I think this has nothing to do with this change i should rather be a frontend problem (this fix just extends the backend search mechanism). Maybe you could take a look at the dev-console network tab and see the JSON response returned by the server? If the result rows are missing it could be something on the backend otherwise i'd say there's something going on in the frontend javascript code or something. If you like you can create a issue here and i'll see if i can help you somehow (just to keep this discussion clean here)

@Jake-Etherflow
Copy link

Works perfectly! Thank you very much. I was able to mount nfs and then map nextcloud local storage to it with no further issues displaying results for the index.

@ArtificialOwl
Copy link
Member

Hello,

There is no reason for this app to search/access remote files. Can you try to migrate your work onto files_fulltextsearch ?

@R0Wi
Copy link
Member Author

R0Wi commented Aug 3, 2020

Well currently the issue is that external files are properly indexed (with the help of files_fulltextsearch). But the results (indexed files) are then filtered by ownership when creating a search request inside this app. Because external files do not have a special owner they're filtered.

My current approach is to get all external mounts accessible to the current user when he uses the search functionallity. This info can then be integrated into the filter part of the ES query.

I know that this somehow mixes up the separation between files_fulltextsearch and fulltextsearch_elasticsearch but i did not find another way to implement the mentioned feature. Maybe you could give a draft on how to provide this inside files_fulltextsearch? I'd then really be looking forward implementing this :-)

@omidmeh
Copy link

omidmeh commented Aug 19, 2020

I'm also very interested in this.

In the meantime, I tried just downloading the files that you've changed and replacing the ones from the original plugin installation with them. However, it looks like when I do that the search no longer works!

Is there any other steps I need to take other than replacing the old files with these in the installation folder?

@Jake-Etherflow
Copy link

Jake-Etherflow commented Aug 19, 2020

I don't remember where the other patch was, but I had to combine the two in the same file to get search working and external files as well. If you'd like a copy of it, shoot me a message, and I can send it to you.

EDIT: Pretty sure #107 was it.

@R0Wi
Copy link
Member Author

R0Wi commented Aug 20, 2020

@omidmeh i think what you could do is cloning this repo, applying the patch from this PR and installing the apps dependencies manually (note you need to install git as well as composer). So basically you could do something like this:

cd <NEXTCLOUD_ROOT>/apps
rm -rf fulltextsearch_elasticsearch
mkdir fulltextsearch_elasticsearch && cd fulltextsearch_elasticsearch
git clone https://github.com/nextcloud/fulltextsearch_elasticsearch.git .
git checkout v1.5.2
curl https://patch-diff.githubusercontent.com/raw/nextcloud/fulltextsearch_elasticsearch/pull/100.patch | git apply -v
composer install
cd .. && chown -R www-data:www-data fulltextsearch_elasticsearch

Let me know if this helps.

EDIT: note that v1.5.2 should already contain the fix regarding ES7.7 support mentioned by @Jake-Etherflow so the above should work

@omidmeh
Copy link

omidmeh commented Aug 28, 2020

This fixed my issue as well. Thank you!!

@omidmeh
Copy link

omidmeh commented Aug 28, 2020

The only thing that I noticed is that when I search for something that belongs to a different user, although I don't actually see the result, it does show "the search in Files for 'search-term' returned 1 results in 56 ms".

That's still a huge improvement since the owner user sees the file but the non-owners don't. Would be nice if we could adjust that number as well though!

@R0Wi
Copy link
Member Author

R0Wi commented Aug 28, 2020

Could be related to R0Wi#1

@ravermeister
Copy link

sigh, I really hope this fix find the way to the master branch (and the next Version). otherwise the whole thing is nearly useless, even worse, It gives the User the wrong feeling about the content of his own data!.

@lhurt
Copy link

lhurt commented Nov 20, 2020

Why is it not merged to the main branch?

@Apfelwurm
Copy link

Hi, can please someone of the staff review this PR? It works perfectly and its a pain to implement this fix for every release manually.
Thanks!

@trendzetter
Copy link

I think someone is waiting for your good consultancy fees.

@R0Wi
Copy link
Member Author

R0Wi commented Jan 8, 2021

Like mentioned in #100 (comment) this PR does not respect the separation of concerns between the FTS apps (even it works...). Unfortunately currently i don't have much time to port my work onto files_fulltextsearch because i think this would be a bit more heavy lifting.

@tauceti82
Copy link

@omidmeh i think what you could do is cloning this repo, applying the patch from this PR and installing the apps dependencies manually (note you need to install git as well as composer). So basically you could do something like this:

cd <NEXTCLOUD_ROOT>/apps
rm -rf fulltextsearch_elasticsearch
mkdir fulltextsearch_elasticsearch && cd fulltextsearch_elasticsearch
git clone https://github.com/nextcloud/fulltextsearch_elasticsearch.git .
git checkout v1.5.2
curl https://patch-diff.githubusercontent.com/raw/nextcloud/fulltextsearch_elasticsearch/pull/100.patch | git apply -v
composer install
cd .. && chown -R www-data:www-data fulltextsearch_elasticsearch

Let me know if this helps.

EDIT: note that v1.5.2 should already contain the fix regarding ES7.7 support mentioned by @Jake-Etherflow so the above should work

the 100.patch was adapted so it couldn't patch the Makefile as the version+=1.5.2 was changed to 20.0.0 so it could not find the correct positions to patch...what helped was to manually change version+=1.5.2 to version+=20.0.0 before applying the patch...

And then do in the end occ upgrade otherwise it will disable the plugin.

it cost me hours to get the search working on external files! This has to be fixed asap!

@R0Wi R0Wi force-pushed the master branch 2 times, most recently from f7ed532 to 65430df Compare March 18, 2021 20:26
@R0Wi
Copy link
Member Author

R0Wi commented Mar 18, 2021

@tauceti82 sorry for the circumstances i rebased this PR onto the current v21.0.0 so you should now be able to apply the patch again by git checkout v21.0.0 and then following the commands from above. Maybe i manage to build a regularly updated custom .tar file via Github Actions and link it here these days.

@trendzetter
Copy link

@ArtificialOwl @MorrisJobke Why is this not getting any review or any kind of feedback? Is nextcloud sectarian? This makes me think twice before proposing Nextcloud to anyone.

@MorrisJobke
Copy link
Member

I left the company and only have very little time in my spare time. Sorry

@Apfelwurm
Copy link

Any chance this problem can get revisited?

@R0Wi
Copy link
Member Author

R0Wi commented Mar 9, 2022

ping @ArtificialOwl

@trendzetter
Copy link

@adsworth ?

@adsworth
Copy link

@adsworth ?

I've only worked on the app store and that was three years ago or so. I'm not an active Nextcloud member/ceveloepr.

@ArtificialOwl
Copy link
Member

it is possible that nextcloud/files_fulltextsearch#170 fixes this issue. this needs confirmation

@R0Wi
Copy link
Member Author

R0Wi commented Jun 8, 2022

@ArtificialOwl thank's but if i understand the code correctly, this most likely doesn't solve the problem for external files. It's already a long time ago i dealt with the code but if i remember correctly, the main problem was the search filter condition in https://github.com/nextcloud/fulltextsearch_elasticsearch/pull/100/files#diff-b55ac6e506ccacc79909f6e0bb3d312f92a4d33e089c0d952fe032fdfe60fb71R378 which currently does not allow search results to have an empty owner when searching inside the index.

Nevertheless i'm going to test the patch these days and give you some feedback on this 👍

@lhurt
Copy link

lhurt commented Sep 10, 2022

@ArtificialOwl thank's but if i understand the code correctly, this most likely doesn't solve the problem for external files. It's already a long time ago i dealt with the code but if i remember correctly, the main problem was the search filter condition in https://github.com/nextcloud/fulltextsearch_elasticsearch/pull/100/files#diff-b55ac6e506ccacc79909f6e0bb3d312f92a4d33e089c0d952fe032fdfe60fb71R378 which currently does not allow search results to have an empty owner when searching inside the index.

Nevertheless i'm going to test the patch these days and give you some feedback on this 👍

Hi, is there any noteworthy progress on this issue?

@trendzetter
Copy link

trendzetter commented Sep 10, 2022

Hi, is there any noteworthy progress on this issue?

It looks promising and it's merged. I haven't tested myself but I suppose the best thing to do is try install a vanilla build and see if searching works now on external files

@trendzetter
Copy link

trendzetter commented Oct 21, 2022

I managed to test this scenario again and I can confirm searching using the search box works for external files now.
Unless someone else thinks this needs to be reopened I think it's probably OK to close.
@lhurt @ArtificialOwl

@R0Wi
Copy link
Member Author

R0Wi commented Oct 23, 2022

Also tested it on a fresh NC25 instance with one and two users and a samba share. Seems to work now because there was a change how files are indexed. Don't know when this changes came into the app but with app version 24 searching in external files seems to be fixed. Thank's @ArtificialOwl or who else fixed this 👍

@lhurt
Copy link

lhurt commented Nov 5, 2022

Unfortunately it doesn't work for me.

I have some Samba shares, where each user has to provide its password in the configuration.

My test included the following steps

  1. Recreate index for all users from scratch
  2. Show live indexing (occ fulltextsearch:live)
  3. Copy a file on an indexed directory using linux bash
  4. Copy a file via Windows Explorer in the very same directory
  5. No indexing is displayed in the live search
  6. All previously existing files can be found using nextcloud search, but not the copied ones

Any idea what could be wrong?

@R0Wi
Copy link
Member Author

R0Wi commented Nov 5, 2022

@lhurt since the fulltextsearch app heavy relies on the file-system events of NC (like for example when a new file is detected by Nextcloud), I think you're hitting the problem, that files which are directly added to the samba share do not trigger the NC events. See official docs. If indexing works when uploading the file directly to NC (for example via WebUI), then this is most likely not a problem with the fulltextsearch app itself.

@lhurt
Copy link

lhurt commented Jan 30, 2023

Thanks for the explanation. Wouldn't https://www.php.net/manual/en/book.inotify.php help to improve the situation. I'm not sure if this is the correct place to ask. Please forgive me if I'm completely off track. I've no PHP development skills.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Full Text Search Finds No Results For External Storage