Memory usage/leak #121
Hi John, we are actively looking into this and will update progress here. If you could share some details about the file system being indexed, that would be great. We expect there are directories with very high numbers of sub-directories, causing the queues to grow quickly. If you'd like to discuss via email, feel free to reach out to dmanno@lanl.gov |
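The suspected queue growth can be illustrated with a toy breadth-first walker (a hypothetical sketch, not GUFI's actual traversal code): a breadth-first walk must queue every subdirectory of a directory before descending, so a single directory with millions of immediate subdirectories drives the in-memory work queue to the full fan-out.

```python
from collections import deque

def bfs_peak(tree, root):
    """Walk a directory tree (dict of dir -> list of subdirs)
    breadth-first and report the peak size of the work queue."""
    queue, peak = deque([root]), 1
    while queue:
        node = queue.popleft()
        # Every subdirectory is enqueued before any of them is visited,
        # so a wide directory inflates the queue all at once.
        queue.extend(tree.get(node, ()))
        peak = max(peak, len(queue))
    return peak

# One directory with 1,000 immediate subdirectories: the queue peaks
# at the full fan-out, which is the growth mode suspected above.
wide = {"/": [f"/d{i}" for i in range(1000)]}
print(bfs_peak(wide, "/"))  # 1000
```

Scale the fan-out to tens of millions, as in the directories discussed below, and the queue alone accounts for a large memory footprint.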
Hello Dominic! Thanks for the reply! Just for public visibility, the file system I'm trying to index is BeeGFS. It is 8.1 PB usable and currently has 1,034,075,745 inodes in use. I'll email you directly with additional details. John DeSantis |
Hi. Please try out these two branches: insitu deferred-queue |
Hello, Thank you! I am running the deferred-queue build now. I'll update this issue with the results once it is done. John DeSantis |
Hello, I'm pleased to state that I've noticed stability in the memory footprint so far! I'll update again once the indexing finishes. Feel free to review the screen shot I've included, which shows that the deferred-queue branch has had a positive effect, including a small decrease in the amount of memory being used. And for what it's worth, the directory in question has 38,574,219 sub-directories. |
Hello, I wanted to provide an update. Unfortunately, indexing of two specific directories still fails due to memory usage with either branch and various combinations of arguments, so we've reached out to the users in question to tidy up their directories. Once we hear back from them, I'll get a count of their current directory structures and verify that items have been culled. I'll then re-run the indexing process (both methods) and report back. Thanks again! |
Please give the reduce-work-size branch a try. |
Hello, The reduce-work-size branch is extremely promising! So far, not only has the indexing process run longer (4x) for this specific directory, its memory footprint has been reduced by ~83%! Please see the screen shot below to see the fruits of your labor. I'll update again once the process completes or is terminated. |
Hello, Both indexes completed without any logged errors. CPU and memory statistics for both processes are listed below.
56,591,355 directories & 80% reduction in memory footprint:
23962.18user 9703.66system 35:46:25elapsed 26%CPU (0avgtext+0avgdata 41947132maxresident)k 104162440inputs+3706836056outputs (24major+44473160minor)pagefaults 0swaps
38,574,219 directories & 91.5% reduction in memory footprint:
9857.33user 3419.66system 14:54:09elapsed 24%CPU (0avgtext+0avgdata 18088904maxresident)k 30993064inputs+1370928848outputs (2major+25345753minor)pagefaults 0swaps
We will need to provide more storage for the out-of-tree index though, as these two directories consumed all 3.7 TB allocated for the indexes. Thank you! |
Thanks so much for the challenging use case to help make the software better.
With so many directories, you might be a good candidate for a couple of features that can provide high value to organizations with lots of directories.
The treesummary feature provides a way to make some queries over large subtrees much faster for some limited but often useful information, and it is tailorable for a site.
The rollup feature provides a way to securely roll up all gufi index information into fewer databases, thereby speeding up queries by sometimes orders of magnitude.
The more directories, the more databases that must be consulted, but treesummaries and rollups can decrease the number of databases consulted by orders of magnitude in a user-access-secure way.
Potentially these things could help you, and of course it would be nice to know whether these tools can handle as wide a directory structure as your site has, something we could help ensure works.
Further, gufi does not include subdir info in a standard gufi db file (for one directory), to help deal with the exact situation you faced (a dir with millions of one-level-down subdirs). Treesummaries and rollups are the exception: those tables do include subdir info.
This strict rule that standard gufi db files don't have any subdir info makes it possible to break up the tree across multiple machines, and it allows for subtree updating: if you know that a part of the tree has not changed, you only need to delete and recreate the parts of the tree that have changed. If your scalable file system has ways of letting you know which subtrees have changed, you can cut re-indexing down by a lot. Lustre is getting a feature that logs just which subtrees changed instead of logging every tiny change, making the logs tiny and re-indexing pretty cheap.
Not keeping subdir info in a directory's database file is not intuitive from the way POSIX works, but it makes composing a gufi index tree from subtrees trivial, which is handy.
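The per-directory layout described above can be sketched roughly like this (a simplified illustration, not GUFI's actual code; the db.db filename and entries table come from this thread, everything else is an assumption): because no directory's database mentions its subdirectories, each db.db can be queried or rebuilt independently, and results are composed by walking the tree.

```python
import os
import sqlite3

def query_index(index_root, sql):
    """Run `sql` against the db.db in every directory under index_root.

    Each directory's database holds rows only for that directory's own
    entries (no subdir info), so per-directory results can simply be
    concatenated -- the composition property described above.
    """
    results = []
    for dirpath, _dirnames, _filenames in os.walk(index_root):
        db_path = os.path.join(dirpath, "db.db")
        if not os.path.exists(db_path):
            continue
        con = sqlite3.connect(db_path)
        try:
            # Tag each row with its directory, since the database
            # itself stores no path information.
            results.extend((dirpath, *row) for row in con.execute(sql))
        finally:
            con.close()
    return results
```

This is also why a subtree can be deleted and re-indexed without touching its parent's database: the parent never referenced it.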
Thanks so much for your help
Gary Grider
|
Hello,
Certainly! I thoroughly appreciate the time and fixes the development team produced.
Agreed. Currently, I've only been using gufi_find for querying the tree. If there are any other features the development team would like to test against large, production file systems, let me know. Thank you again! |
Glad it is helping you.
Yes, gufi_find is about 1/50th of the capability gufi has. There is full-blown SQL support, including temporary output tables for sorting/joining.
Tree summaries allow you to summarize the tree below; rollups allow you to consolidate in a secure way; there is xattr support, extra fields for site use, and the schema is completely extensible with site tables. Very complex things are pretty easy, from "make me a sorted histogram of all the file types in the entire system" to simple things like "find me the top 3 largest directories". Hope you find use for some of it…
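For a concrete flavour of the SQL, here is the "sorted histogram of all the file types" query run against a mock entries table in plain sqlite3 (the type column with 'f'/'l' codes appears in queries elsewhere in this thread; the rest of the mock schema is an assumption, not GUFI's real one):

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Mock of a small slice of a per-directory entries table; GUFI's real
# schema has many more columns -- inspect it with sqlite3 as suggested
# later in the thread.
con.execute("CREATE TABLE entries (name TEXT, type TEXT, size INT)")
con.executemany("INSERT INTO entries VALUES (?, ?, ?)", [
    ("a.txt", "f", 10), ("b.txt", "f", 20), ("link", "l", 0), ("big", "f", 99),
])

# "Sorted histogram of all the file types":
hist = con.execute(
    "SELECT type, count(*) AS n FROM entries GROUP BY type ORDER BY n DESC"
).fetchall()
print(hist)  # [('f', 3), ('l', 1)]
```

In GUFI itself the same SQL text would be handed to gufi_query, which runs it against every directory's database rather than a single in-memory one.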
Thanks
Gary
|
The changes made in the 3 branches have been merged into one branch and then merged into master via #125. Compression with zlib has also been changed. Please let us know if the combination of all of the changes broke something. |
Hello,
Excellent!
Will do! I am running an index now. Also, I've deliberately skipped indexing the problematic directories for now, since users are actively removing content; one of the logs had 1k+ lines stating that directories couldn't be opened due to the active file/directory removals. Thanks again for such amazing responses from the development team! John DeSantis |
Gary,
Where's that "mind blown" gif???
Thanks for these suggestions, a seed has been planted... John DeSantis |
Hello, Marking this as resolved! I am no longer seeing memory issues that I had seen previously. Thanks again for all of the development team's hard work. John DeSantis |
Gary,
I've been reading the documentation regarding using gufi_query, but whenever I try to use the path() function I get errors, even with the examples in the GUFI.docx file:
# ./gufi_query -n 1 -E "select path,* from entries;" /gufi_index/lo_scemo/|head -10
Error: no such column: path: /gufi_index/lo_scemo/db.db: "select path,* from entries;"
Error: no such column: path: /gufi_index/lo_scemo/R-4.2.1/db.db: "select path,* from entries;"
Error: no such column: path: /gufi_index/lo_scemo/R-4.2.1/src/db.db: "select path,* from entries;"
Error: no such column: path: /gufi_index/lo_scemo/R-4.2.1/m4/db.db: "select path,* from entries;"
Error: no such column: path: /gufi_index/lo_scemo/R-4.2.1/doc/db.db: "select path,* from entries;"
I also tried using the "oldbigfiles" example (https://github.com/mar-file-system/GUFI/blob/master/examples/oldbigfiles) and received the same error:
# ./test.sh /gufi_index/ 1
query to create db insert recs from entry query into out db old big files
Error: wrong number of arguments to function path(): /gufi_index/db.db: "insert into sument select uidtouser(uid, 0), path(), name, size, mtime from entries where type='f'and size>4;"
Error: wrong number of arguments to function path(): /gufi_index/lo_scemo/db.db: "insert into sument select uidtouser(uid, 0), path(), name, size, mtime from entries where type='f'and size>4;"
Error: wrong number of arguments to function path(): /gufi_index/another_user/db.db: "insert into sument select uidtouser(uid, 0), path(), name, size, mtime from entries where type='f'and size>4;"
Error: wrong number of arguments to function path(): /gufi_index/lo_scemo/R-4.2.1/db.db: "insert into sument select uidtouser(uid, 0), path(), name, size, mtime from entries where type='f'and size>4;"
Surely I'm causing the issue?
Thanks,
John DeSantis
|
There is no path field (on purpose, because a stored path would not allow easily moving a directory somewhere else, etc.).
There is a path() function, but it seems to have changed in the last year. We either need to provide a new path() function or restore the old one.
For now, path() seems to want 2 inputs, though I have no idea why; I think it was reused for rollups.
Try
./gufi_query -n 1 -E "select path(uid,uid),* from entries;" /gufi_index/lo_scemo/ | head -10
And of course
./gufi_query -n 1 -E 'select path(uid,uid) || "/" || name, size, uid, uidtouser(uid,20) from entries;' /gufi_index/lo_scemo/ | head -10
There are a bunch of functions we wrote to deal with formatting date/time output, uid to user, gid to group, path, etc. Those are in dbutils.c; look for addquery.
We need to produce an "SQL guide to GUFI"; I imagine we will, because we have a tutorial coming up at MSST in May.
You can look at the entire schema in the code, or just go to a dir and run sqlite3 dbname (where dbname is the name of the database in that dir), then something like .schema.
There is a LOT there, and I'm happy to describe it if that helps.
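The sqlite3 shell's .schema command suggested above boils down to reading the sqlite_master table, which can also be done from a script (a generic SQLite sketch, nothing GUFI-specific):

```python
import sqlite3

def dump_schema(db_path):
    """Rough equivalent of sqlite3's .schema: print the CREATE
    statements stored in sqlite_master for every table."""
    con = sqlite3.connect(db_path)
    try:
        for (sql,) in con.execute(
            "SELECT sql FROM sqlite_master "
            "WHERE type = 'table' AND sql IS NOT NULL"
        ):
            print(sql + ";")
    finally:
        con.close()
```

Running dump_schema("db.db") inside any index directory prints the table definitions that database holds, e.g. the entries table seen in the queries above.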
|
Gary,
These work perfectly, thank you for the insight!
A guide would be most beneficial, especially since (as you already said) there are a lot more data points available via gufi_query vs. the wrapper tools gufi_find, gufi_ls, etc. I'd love to attend MSST, but it's too short notice and I'm in the middle of a cross-country move :(
Agreed, that's what I did since we all love to abuse and exploit SQL!
You may want to rescind that offer, hahaha! Thanks again, |
I pushed some commits that restore the original path(), epath(), and fpath() sqlite functions. path() no longer takes in arguments. I have renamed the function with 2 arguments (summary.name and summary.rollupscore) to rpath(), which is meant for handling path names in rolled-up indices. Its intended use in the -E query is:
SELECT rpath(summary.name, summary.rollupscore) || "/" || pentries.name
FROM summary, pentries
WHERE summary.inodes == pentries.pinode;
However, because we are not in the business of forcing user queries to be in a certain format, it is easy to misuse rpath().
uidtouser, gidtogroup, and several other functions had an extra unexpected argument; it has been removed. The extra argument was to help with alignment of output, but it was always unused and set to 0.
|
Thanks!
|
Looks like the newest version puts path() back like it once was, as well as the other query funcs.
I am producing a draft gufi SQL guide.
Some of it exists in the gufi.doc in docs, but that is a very old doc, so beware.
|
@tacc-desantis I just pushed GUFI-SQL.docx, Gary's guide on using GUFI's SQL schema and functions, to the docs directory. Please pull the latest version of GUFI to get all of the features mentioned in the guide. |
Hello,
First and foremost, thank you for this software! It works incredibly well, and we're seeing significant differences querying the indexed file system (multiple PBs) vs. using the traditional find command.
Unfortunately, I've run into an issue with memory usage during a re-index: the gufi_dir2index process consumes too much memory, leading to system instability. The test system in question has 256 GB of RAM available, and the re-index has caused many OOM events.
Are there any controls and/or configuration options I can set on the GUFI side to avoid consuming all available memory and triggering OOM events? For what it's worth, there doesn't seem to be a correlation with the number of files or the space consumed on the file system we're indexing. I built GUFI from the latest commit at the time, which was a8d9328.
Thank you!
John DeSantis