Intermittent Multi-threading bug with small folders #3

Closed
clb92 opened this issue Nov 16, 2019 · 10 comments
Assignees: simon987
Labels: bug (Something isn't working), scan (Scan module)

Comments

@clb92

clb92 commented Nov 16, 2019

I have created the following two directories:

Files I want indexed:
/mnt/user/projects/test/

Location of indexes:
/mnt/user/temp/sist2_indexes/

The sist2_indexes/ directory is initially empty.
Here's an ls of the test/ directory with the files to be indexed:

> ls -l /mnt/user/projects/test/

total 20
-rw-rw-rw- 1 me users    7 Nov 15 15:31 File.txt
-rw-rw-rw- 1 me users    9 Nov 15 15:31 File2.txt
-rw-rw-rw- 1 me users    9 Nov 15 15:31 File3.txt
-rw-rw-rw- 1 me users 3290 Nov 15 15:31 image1.png
-rw-rw-rw- 1 me users 3290 Nov 15 15:32 image2.png

I run the following Docker commands:

Scan

> docker run -it -v /mnt/user/projects/test/:/files -v /mnt/user/temp/sist2_indexes/:/indexes simon987/sist2 scan -t 16 /files -o /indexes/test_index

sist2 V1.1.5
---------------------
threads         16
tn_qscale       5.0/31.0
tn_size         500px
output          /indexes/test_index/

Index

> docker run -it --network host -v /mnt/user/temp/sist2_indexes/:/indexes simon987/sist2 index /indexes/test_index

Delete index <0>
Create index <0>
Close index <0>
Update settings <0>
Update mappings <0>
Open index <0>
Indexed   0 documents (0kB) <0>

Something already seems to have gone wrong at this point, since it reports "Indexed 0 documents".

Web

> docker run --rm --network host -d --name sist2 -v /mnt/user/temp/sist2_indexes/:/indexes -v /mnt/user/projects/test/:/files simon987/sist2 web --bind 0.0.0.0 --port 8888 /indexes/test_index

f275f598e9b39564cd8e4ac06bcb1915a066a6bf3b566ea9cd1ff64c321c13f1

The web interface comes up as expected on port 8888, and the index "test_index" shows up in the "Search in indices" list, but no files show up and searching doesn't do anything.

What am I doing wrong here?

@simon987
Owner

It's having trouble connecting to Elasticsearch. Make sure you have an instance running.
For example:

docker run -d -p 9201:9200 \
	-e "discovery.type=single-node" \
	docker.elastic.co/elasticsearch/elasticsearch:7.4.2

and run with --es-url http://localhost:9201

I'll add a proper error message in the next release. Sorry about that.
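Until that error message exists, a quick reachability check before running the index command can save a confusing run. The following is a minimal sketch, not part of sist2: `es_reachable` is a hypothetical helper name, the 5-second timeout is an arbitrary choice, and the URL assumes the 9201 port mapping from the example above.

```shell
# Hypothetical helper (not part of sist2): succeeds only if an HTTP server
# answers at the given URL with a non-error status.
es_reachable() {
  # -f: treat HTTP errors as failure, -sS: quiet but keep error text,
  # --max-time 5: give up after 5 seconds (arbitrary choice)
  curl -fsS --max-time 5 "$1" >/dev/null 2>&1
}

ES_URL="http://localhost:9201"   # assumes the -p 9201:9200 mapping shown above

if es_reachable "$ES_URL"; then
  echo "Elasticsearch answers at $ES_URL"
else
  echo "Nothing answers at $ES_URL; start Elasticsearch before indexing" >&2
fi
```

If the check fails, the index step would presumably report the same `<0>` results seen in the first run.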

@clb92
Author

clb92 commented Nov 16, 2019

With --es-url http://localhost:9201 in all of the three sist2 commands?

@simon987
Owner

Quoting clb92: "With --es-url http://localhost:9201 in all of the three sist2 commands?"

Sorry, only the web and index commands.

@clb92
Author

clb92 commented Nov 17, 2019

Gotcha.
I still get "Indexed 0 documents" from the index command:

> docker run -it --network host -v /mnt/user/temp/sist2_indexes/:/indexes simon987/sist2:latest index /indexes/test_index --force-reset --es-url http://localhost:9201

Delete index <200>
Create index <200>
Close index <200>
Update settings <200>
Update mappings <200>
Open index <200>
Indexed   0 documents (0kB) <400>

The scan itself seems to have problems. I've run it maybe 20 times by now, trying different options and combinations. The best result, seemingly at random, was a run that found 2 of the 3 .txt files (and none of the .png files), indexed them successfully, and showed them in the web UI.
I have not been able to reproduce that partial result.
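The bracketed numbers in the output above appear to be HTTP status codes returned by Elasticsearch (`<0>` when nothing answered, `<200>` on success); that reading is an assumption based on the two runs in this thread, not on sist2 documentation. Under that assumption, a small filter can pick out the steps that did not succeed. `flag_bad_status` is a hypothetical helper name.

```shell
# Hypothetical helper: print every input line whose <code> is not in the 2xx range.
# Assumes each line of interest carries one status in angle brackets, e.g. "<400>".
flag_bad_status() {
  awk 'match($0, /<[0-9]+>/) {
         # Extract the digits between "<" and ">" and force a numeric value.
         code = substr($0, RSTART + 1, RLENGTH - 2) + 0
         if (code < 200 || code > 299) print
       }'
}

# Example with two lines copied from the output above.
printf '%s\n' 'Open index <200>' 'Indexed   0 documents (0kB) <400>' | flag_bad_status
```

Here only the final `Indexed   0 documents (0kB) <400>` line is printed, which matches the observation that every step succeeds except the upload of documents.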

@simon987
Owner

simon987 commented Nov 17, 2019

That's very weird. If you don't mind, can you show me the contents of test_index?

EDIT: Also, can you try with only one thread (-t 1)?

@clb92
Author

clb92 commented Nov 17, 2019

It seems to work perfectly every time with 1 thread!

@simon987
Owner

That's odd. I'll try to find out why that is when I have more time this week.
Thanks a lot for the help :)

@clb92
Author

clb92 commented Nov 17, 2019

No, I thank you for your help!

I'm so hyped for this tool. Lately, I've been using Everything Search running under Wine in an Ubuntu VM with network shares mounted, so that I could search all my files in one central place. It was not a good, stable, fast or pretty solution... This tool is going to change that :-)

I do have some feature requests for the future:

  • Login for the web interface. Even HTTP Basic Auth would do. Scratch that, I'd forgotten how insecure that is...
  • Different views for search results. I would like a simple list as a display option.

simon987 added the "bug" (Something isn't working) label on Nov 17, 2019
simon987 changed the title from "Indexed 0 documents - Anything obvious I'm doing wrong?" to "Intermittent Multi-threading bug with small folders" on Nov 17, 2019
simon987 self-assigned this on Nov 17, 2019
simon987 added the "scan" (Scan module) label on Nov 17, 2019
simon987 added a commit that referenced this issue on Nov 19, 2019
@simon987
Owner

@clb92 Please let me know if the v1.1.8 release fixed that issue

(As an aside, you should not set too many threads, especially if you're working with hard drives)
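One way to follow that advice is to cap the value passed to `-t` rather than hard-coding a large number. This is only a sketch: `pick_threads` is a hypothetical helper, and the cap of 4 is an arbitrary assumption, not a value recommended by sist2.

```shell
# Hypothetical helper: return the smaller of the available core count and a cap.
pick_threads() {
  cores=$1
  cap=$2
  if [ "$cores" -lt "$cap" ]; then
    echo "$cores"
  else
    echo "$cap"
  fi
}

# getconf reports the number of online processors on Linux and macOS.
# The cap of 4 is an arbitrary assumption; lower it for spinning hard drives.
THREADS=$(pick_threads "$(getconf _NPROCESSORS_ONLN)" 4)
echo "scan -t $THREADS"
```

The resulting value would replace the `-t 16` used in the original scan command.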

@clb92
Author

clb92 commented Nov 19, 2019

I don't seem to encounter the problem any more. Haven't tested it super thoroughly though. Thanks!
