# Huge long-term disk usage (performance improvement) #152
Please post this issue to https://github.com/manticoresoftware/manticoresearch
@DEgITx Hello, I have submitted an issue at Manticore as seen/linked above. The developer suggested 1) upgrading Manticore, since Rats Search uses a very old version, and 2) mlocking attributes / attributes + doclists + hitlists, but I do not know how. Do you know what exactly to do, or could you provide a modified file so I can try it? Or is this beyond the scope of this project?
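For reference, in Manticore 3.x the locking the Manticore developer refers to is controlled by the access_* options, which can be placed in the searchd section of sphinx.conf as instance-wide defaults. This is a sketch based on the Manticore 3.x docs only; the old Sphinx/Manticore 2.x bundled with rats-search may not support these option names at all:

```
searchd
{
    # keep attribute data, doclists and hitlists locked in RAM (mlock)
    access_plain_attrs = mlock
    access_blob_attrs  = mlock
    access_doclists    = mlock
    access_hitlists    = mlock
}
```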
No reply from @DEgITx. Can you try building with the latest Manticoresearch? Its developer is not willing to debug this serious issue on the old Manticoresearch 3.5 that you seem to be using in rats-search. If you do not reply or are not willing to update the software, I may have to stop using rats-search, since it is causing disk overload for me as described.
@slrslr are you sure you are using the latest rats build? There is a Manticore 3.6.0 version for 64-bit hosts.
You will see the version in rats.log; it must be 3.6.0.
Another option to decrease disk usage and Sphinx load overall is the torrent-scanner settings section, https://github.com/DEgITx/rats-search/blob/master/docs/USAGE.md#torrent-scanner-settings — it will decrease Sphinx usage, but also slow down torrent collection. The disk usage by the search process is related to Manticore issues.
@slrslr see rats-search/src/background/sphinx.js, line 35 (commit 750dbfd):
You can add Sphinx/Manticore options there according to their docs and run it.
I do not know where that log is or its file name; I was unable to find it in the git-cloned rats-search/ directory, and no "daemon version" string is found in its files. But I have found "Manticore 2.6.1 a7fa71e@180126 release" in searchd.log.
I do not know what this is or how I should set it; I am not a developer. The developer in the issue I linked earlier says:
Outputs: I have tried to update from https://github.com/DEgITx/rats-search.git; it said: Updating 69df9ec..750dbfd. After some time I found https://github.com/DEgITx/rats-search/tree/manticore3, but I do not know its .git URL or if/how it is compatible with my current Manticore.
For the server version you must use the latest master version of rats. rats.log is located in the same directory where you ran npm run server. Ensure that you see this message; in the case of Linux, your directory must be imports/linux/x64/searchd. Remove searchd.log before starting; if that does not help, post the info that you see.
I am sorry; unfortunately this information does not make me understand how to proceed. Please kindly let me know the steps. My commands were:
I have been searching for this file in my rats-search directory clone (from git).
Sphinx path: */rats-search/imports/linux/x64/searchd
$ md5sum imports/linux/x64/searchd
Verify that's the correct line; if not, check your git repository: something was not updated in your repo and it is old.
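The checksum comparison itself works like this; a self-contained sketch with a throwaway file (the demo file and hash have nothing to do with the real searchd binary):

```shell
# compare a file's md5 against a known-good value; a mismatch means a stale/old file
printf 'hello\n' > /tmp/md5-demo.bin
expected="b1946ac92492d2347c6235b4d2611184"            # md5 of "hello\n"
actual=$(md5sum /tmp/md5-demo.bin | awk '{print $1}')
if [ "$actual" = "$expected" ]; then echo "checksum OK"; else echo "checksum MISMATCH"; fi
```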
My md5sum is different.
I do not understand git; do you have an idea what is wrong with that command? I know that one can "reset" the local repository: cd rats-search; git reset --hard
git reset --hard doesn't touch files that are not under git control.
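That behavior can be verified in a throwaway repository; a minimal sketch (paths and file names are arbitrary):

```shell
# git reset --hard reverts tracked files but leaves untracked files alone
tmp=$(mktemp -d) && cd "$tmp"
git init -q
git config user.email demo@example.com
git config user.name demo
echo v1 > tracked.txt
git add tracked.txt && git commit -qm init
echo v2 > tracked.txt            # modify a tracked file
echo stray > untracked.txt       # create an untracked file
git reset --hard -q              # tracked.txt is back to "v1" ...
ls untracked.txt                 # ... but untracked.txt is still there
```

To also delete untracked files (e.g. a stale imports/ directory), git clean -fd is the companion command, but note that it removes them irreversibly.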
Thanks, yes, it updated the files to:
It must start the migration process. What messages do you see below it?
I think it printed some issue at the first run, but that run was unfortunately in the background and I do not know more detail (no .log file in rats-search/ contains any more detail). But on every run now it shows only the things I posted under the link in my previous comment; nothing more, it stops like that. I do not see any significant CPU or disk activity from the server.js process or the "npm run server" one. When I interrupt it with Ctrl+C, it says:
So it mentions rats-search/imports/linux/x64/index_converter.
Update from master and check again.
I just did, and this is the output of "npm run":
Try to restart it. If that does not help, can you pack your database and send it to me? The end of the log is pretty strange.
I do not think I can restart it (interrupting it on the command line does not seem to end it either), so for now I am killing it. After killing and starting, I think the result is the same as described earlier. Please kindly download the database using
It is not downloading via this magnet.
Did you also embed the trackers? It is a private torrent with several trackers. It is working on my side, seeded non-stop with an active peer, so I am not sure why it did not work. I am uploading to a server now; it can take a while.
@DEgITx upload finished, please download the database here and try to reproduce the issue #152 (comment)
@sanikolaev, thanks for the advice, I will enable them as stored only.
@DEgITx, would you consider reopening this issue?
Sorry, I edited the answer above; you probably need only '--logdebug'.
Thanks, that worked. I have been running rats with Sphinx --logdebug for a few hours and the issue reappeared (hundreds of MB/s of reads on the system drive). Hopefully this will help you debug the issue (maybe it should be reopened).
Can you let me know, so I can try to apply your modification and test it on my end?
@sanikolaev, I noticed a problem executing such a query with stored-only fields:
Including any of them makes the request fail (for example, from the files table).
Please provide more details about the failure. I can't reproduce it:
mysql> drop table if exists t; create table t(f text, path text stored); desc t; insert into t values(0,'abc','path'),(0,'def','another path'); select * from t; select max(id) as maxid from t;
--------------
drop table if exists t
--------------
Query OK, 0 rows affected (0.05 sec)
--------------
create table t(f text, path text stored)
--------------
Query OK, 0 rows affected (0.03 sec)
--------------
desc t
--------------
+-------+--------+----------------+
| Field | Type | Properties |
+-------+--------+----------------+
| id | bigint | |
| f | text | indexed stored |
| path | text | stored |
+-------+--------+----------------+
3 rows in set (0.00 sec)
--------------
insert into t values(0,'abc','path'),(0,'def','another path')
--------------
Query OK, 2 rows affected (0.00 sec)
--------------
select * from t
--------------
+---------------------+------+--------------+
| id | f | path |
+---------------------+------+--------------+
| 1514445464932450316 | abc | path |
| 1514445464932450317 | def | another path |
+---------------------+------+--------------+
2 rows in set (0.00 sec)
--------------
select max(id) as maxid from t
--------------
+---------------------+
| maxid |
+---------------------+
| 1514445464932450317 |
+---------------------+
1 row in set (0.00 sec)
I just created the db from zero with this config:
It gives: unknown local index(es) 'files' in search request
I've corrected it:
@sanikolaev, some questions about stored_only_fields. First of all, will switching rt_attr_string to stored fields affect the performance and memory usage of the table in memory?
The hash attribute cannot be converted; that causes an error on request:
So before using them, I am interested in whether switching will serve any purpose, because it will cause some limitations on the requests I can use in the future. For example, do you think it will help in @slrslr's situation? His table has a million records, so maybe the effect will not be that significant. I can commit to a separate branch to let him test it; is that good or not? I also recommend updating the Manticore docs about stored_only_fields: there was no info about the difference and limitations between stored_only_fields and attributes, so it is hard to compare the pros and cons of their usage.
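For context, stored_only_fields is the Manticore table option listing fields that are kept in the document storage for retrieval but are not full-text indexed and not held in RAM as attributes. A hedged sphinx.conf sketch (the index name files matches the thread, but the field names are illustrative, not rats-search's actual schema):

```
index files
{
    type     = rt
    path     = /path/to/files
    rt_field = name                  # full-text searchable
    rt_field = path
    stored_only_fields = path        # "path" is stored only, not searchable
    rt_attr_uint = size              # attributes stay queryable/sortable as before
}
```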
I am reopening this issue because of a potential change in rats to increase performance.
Yes, that's correct.
It's all simple:
I don't know what queries you have, so I can't say what can be converted to stored fields. Obviously, if you have to do
Yes, it will; please read manticoresoftware/manticoresearch#602 (comment). The string attributes take almost 7 GB of RAM, which on a 4 GB server makes Manticore read them from disk on probably every query.
@sanikolaev, another interesting question: what about converting between those two after a config change? Does an already-created db need conversion, and how is it done in both directions, to attr strings from fields and to fields from attr strings?
You can't switch from a string attr to a stored field or vice versa online; you need to re-create the index and repopulate it with data.
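Following the SphinxQL session earlier in the thread, that re-creation amounts to something like this sketch (table and column names are illustrative; the reload has to come from the original data source, since there is no in-place conversion):

```sql
-- new table keeps "path" as a stored-only field instead of a string attribute
DROP TABLE IF EXISTS files_new;
CREATE TABLE files_new (name text, path text stored);

-- repopulate from the original data (application-side re-insert; 0 = auto id)
INSERT INTO files_new VALUES (0, 'some name', 'some/path');

-- finally, point the application at files_new and drop the old table
```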
# [1.8.0](v1.7.1...v1.8.0) (2021-09-12)

### Bug Fixes

* **db:** converting db to version 8 ([c19a95d](c19a95d))
* **db:** moving content type to uint values ([f4b7a8d](f4b7a8d))
* **docker:** moved to version 16 ([1089fa3](1089fa3))
* **linux:** add execute right to searchd.v2 [#152](#152) ([0bc35c5](0bc35c5))
* **linux:** fix conversion of db under Linux [#152](#152) ([ea01858](ea01858))

### Features

* **log:** using tagslog ([750dbfd](750dbfd))
* **server:** missing rats.log functionality restored [#145](#145) ([d5243ff](d5243ff))

### Performance Improvements

* **db:** optimize some tables to stored_only_fields to reduce memory usage of big databases [#152](#152) ([762b0d1](762b0d1))
I git-cloned and ran "npm start server" 3 hours ago, and the issue still exists; I am unsure why this issue is closed.
@slrslr, this issue is closed because it was reopened only to resolve/optimize some suggestions from the Manticore team members; that by itself does not guarantee that your problem is resolved on such a big database. As I said before, the problem is related to the Manticore engine and not to rats-search (that is why this issue is closed). You need to ask @sanikolaev and the other Manticore team members whether any more optimization is possible in your case to make such a big database work with satisfactory speed and performance, because the decrease in performance is related to Manticore. If they say that no further optimization of the database configuration and structure is possible, your only choice will be to delete some data from the tables to make your work comfortable.
Hello,
On Linux, there is more than 100 MB/s of disk usage reported by the iotop utility for the process:
searchd --config /home/me/rats-search/sphinx.conf --nodetach
searchd = rats-search/imports/linux/x64/searchd
When I killed rats (there is no restart/shutdown button) and started it again, I saw after some minutes that the issue was back.
Any idea on commands that can shed some light on this, please?
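For the record, these are commands commonly used to trace this kind of load; a sketch assuming a Linux host and root for iotop (the pgrep pattern is an assumption based on the command line shown above):

```shell
# per-process I/O, batch mode, 3 samples, only processes actually doing I/O
sudo iotop -o -b -n 3

# locate the searchd process and inspect its open files and I/O counters
pid=$(pgrep -f 'searchd --config' | head -n 1)
sudo ls -l "/proc/$pid/fd"      # which index files are open
sudo cat "/proc/$pid/io"        # cumulative read_bytes / write_bytes
```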
Here are some more details (password: r)
I am at 1.7 million torrents and the database folder is already 7.5 G; the program shows something under 20 million torrents possible.