[#110][#111] List operation handles large collections #114

korydraughn · 2021-01-22T00:06:43Z

This needs to be tested.

trel · 2021-01-22T14:39:27Z

seems good - ready to # if we're tested/confirmed. what changed in the force push?

korydraughn · 2021-01-22T15:13:57Z

Working on and running the test right now. The force push involved a change to the pom.xml file. I needed to revert part of the file so that the build succeeds.

trel · 2021-01-22T15:19:42Z

got it - thanks.

korydraughn · 2021-01-23T01:16:14Z

I've added a BATS test that verifies that the list operation produces the correct number of entries for a large collection (in this case 3000 files). The test passed.

We'll have to look into whether this PR resolves #110. We may need to use apache httpd to simulate that issue.

trel · 2021-01-23T02:16:03Z

great - yes, let's see if sanger may want to test this branch themselves, too. if you're not seeing hitches in the listings anymore, i'm fine to get it #'d and merged. can always leave the issue open until they confirm as fixed.

trel · 2021-01-25T18:48:32Z

@kript @bh9 please eyeball this when you get a chance

michael-conway · 2021-01-25T20:22:02Z

Sounds great, let me know how it's going. I'm interested to see how the removed ObjStat will improve listing. There remain some interior changes to the list all method I can do under the covers too, mostly in how it's paging. We can talk about that as a separate issue once we get the listings working properly.

kript · 2021-01-25T21:43:33Z

This sounds exciting! However, I shall defer to my esteemed colleague @ac55-sanger, who is leading the charge here....

ac55-sanger · 2021-01-26T09:49:07Z

Sure, sounds good, happy to test.
Will update once done. Thanks.

ac55-sanger · 2021-01-26T14:16:18Z

Hi,

docker build is failing on korydraughn:110 with sleepycat dependency issue:

[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO] 
[INFO] nfs4j-irodsvfs ..................................... SUCCESS [  1.144 s]
[INFO] nfsrods ............................................ FAILURE [03:58 min]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 04:03 min
[INFO] Finished at: 2021-01-26T14:08:52+00:00
[INFO] Final Memory: 20M/714M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal on project nfsrods: Could not resolve dependencies for project org.irods.jargon:nfsrods:jar:1.0.1: Could not find artifact com.sleepycat:je:jar:7.3.7 in dcache-snapshots (https://download.dcache.org/nexus/content/repositories/releases) -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn <goals> -rf :nfsrods
The command '/bin/sh -c cd irods_client_nfsrods &&     git checkout ${_sha} &&     mvn clean install -Dmaven.test.skip=true' returned a non-zero code: 1

I tried adding it to pom.xml and rebuilding, but that didn't work.
Could you please have a look and let me know?
Thanks!

korydraughn · 2021-01-26T14:26:18Z

Will look into this.

korydraughn · 2021-01-29T15:23:02Z

@ac55-sanger Mike and I have resolved the build issue. Please give this another shot when you get a chance.

ac55-sanger · 2021-01-29T15:24:11Z

Sure, thanks.
Will update.

ac55-sanger · 2021-01-29T15:45:44Z

@korydraughn
This is still failing to build with the same dependency issue:

Could not resolve dependencies for project org.irods.jargon:nfsrods:jar:1.0.1: Could not find artifact com.sleepycat:je:jar:7.3.7 in dcache-snapshots

Steps followed:
$ git clone https://github.com/korydraughn/irods_client_nfsrods.git
$ cd irods_client_nfsrods/
$ git checkout 110
$ docker build -t nfsrods .

Let me know if I'm doing something wrong here.

korydraughn · 2021-01-29T17:45:18Z

Your docker build command is actually trying to build the master branch. You need to point that command at this branch. To do that, run the following:

$ docker build -t nfsrods --build-arg='_github_account=korydraughn' --build-arg='_sha=110' .
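
For scripted rebuilds, the same invocation can be assembled programmatically. This is a minimal sketch, not part of the PR: the `_github_account` and `_sha` build arguments come from the command above, while the helper name `docker_build_argv` and the `.` build context (taken from the earlier `docker build -t nfsrods .` step) are assumptions.

```python
import shlex


def docker_build_argv(account, ref, tag="nfsrods", context="."):
    """Return the argv for building NFSRODS from a given fork and ref.

    `docker_build_argv` is a hypothetical helper; only the two build
    argument names are taken from the thread.
    """
    return [
        "docker", "build", "-t", tag,
        "--build-arg", f"_github_account={account}",
        "--build-arg", f"_sha={ref}",
        context,
    ]


argv = docker_build_argv("korydraughn", "110")
# Print a copy-pasteable command line; nothing is executed here.
print(shlex.join(argv))
```

Passing the ref explicitly is what keeps the image from silently building the default branch.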

ac55-sanger · 2021-01-29T18:36:11Z

Ah, my bad. Thanks, it built successfully.
I will test it next week and let you guys know.

ac55-sanger · 2021-02-03T12:49:29Z

@korydraughn
I tested this build and here is my report:

1. "ls" on the directory containing 65K+ files took more than 24 hours and still returned the wrong data and count.

$ ls -l | wc -l
.
ls: cannot access 'DDD_MAIN5249029.cram': No such file or directory
ls: cannot access 'DDD_MAIN5249029.cram.crai': No such file or directory
ls: cannot access 'DDD_MAIN5249030.cram': No such file or directory
ls: cannot access 'DDD_MAIN5249030.cram.crai': No such file or directory
1631

All of these files exist in iRODS, and I can list them with the "ils" command.

2. This build also broke data mapping: random data is being mapped to a collection.

$ ls -l <mounted_path>/20140918/
ls: cannot access '<mounted_path>/20140918/cram': No such file or directory
total 0
d????????? ? ? ? ?            ? cram

$ ls -l <mounted_path>/20140918/ | wc -l
ls: cannot access '/mnt/humgen/projects/ddd/20140918/cram': No such file or directory
2

=====
$ ils <irods_path>/20140918/
<irods_path>/20140918:
  10:1-135534747.vcf.gz
  10:1-135534747.vcf.gz.tbi
  11:1-135006516.vcf.gz
  11:1-135006516.vcf.gz.tbi
.
.
.
$ ils <irods_path>/20140918/ | wc -l
49

The NFSRODS-mounted directory 1) doesn't list the data properly and 2) lists some random data (e.g. the "cram" directory in this case) that doesn't exist in iRODS.
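
A comparison like the one above can be automated. The following is a rough sketch, assuming the usual plain `ils` layout (a header line ending in `:`, indented entries, sub-collections prefixed with `C- `). The function names and the sample collection path are hypothetical. Note that `ils | wc -l` counts the header line and `ls -l | wc -l` counts the `total` line, so raw line counts are each one higher than the number of entries.

```python
def parse_ils(output):
    """Extract entry names from plain `ils` output (assumed layout, see above)."""
    names = set()
    for line in output.splitlines()[1:]:  # skip the "<collection>:" header
        entry = line.strip()
        if not entry:
            continue
        if entry.startswith("C- "):
            # Sub-collections are printed as full paths; keep the last component.
            entry = entry[3:].rstrip("/").rsplit("/", 1)[-1]
        names.add(entry)
    return names


def compare_listing(ils_output, mounted_names):
    """Return (missing_from_mount, unexpected_in_mount) as sorted lists."""
    expected = parse_ils(ils_output)
    actual = set(mounted_names)
    return sorted(expected - actual), sorted(actual - expected)


# Hypothetical sample; the collection path is a placeholder.
sample = """/zone/home/project/20140918:
  10:1-135534747.vcf.gz
  10:1-135534747.vcf.gz.tbi
"""
missing, unexpected = compare_listing(sample, ["cram", "10:1-135534747.vcf.gz"])
print("missing from mount:", missing)
print("unexpected in mount:", unexpected)
```

In practice `mounted_names` would come from something like `os.listdir()` on the mount point.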

This mapping issue is not seen in the previous release build (tested it again).

Thanks

michael-conway · 2021-02-03T13:42:49Z

Kory let's get together on this.

korydraughn · 2021-02-03T14:41:55Z

We'll continue to look into this.

korydraughn · 2021-02-05T20:09:47Z

@ac55-sanger When you did your testing, did you make sure to unmount nfsrods before remounting it?

I know that weird things happen when the nfsrods server is bounced without remounting it.

michael-conway · 2021-02-05T20:21:26Z

fwiw Kory, Deep and I are setting up a test platform with 65K+ files for our own testing and for profiling/optimization

ac55-sanger · 2021-02-08T09:30:54Z

> @ac55-sanger When you did your testing, did you make sure to unmount nfsrods before remounting it?
>
> I know that weird things happen when the nfsrods server is bounced without remounting it.

Yes @korydraughn I remember unmounting the old one before mounting a server with the new build.
But I can re-check and let you know by the end of today.

korydraughn · 2021-04-15T12:09:44Z

That is surprising. Can you verify that the SHA for your build matches the one for this PR?

ac55-sanger · 2021-04-16T08:13:55Z

Sure,

~# docker run --rm ac55/nfsrods-patch:3.0 sha
Build Time    => 2021-04-14T11:29:39+0000
Build Version => 1.0.1
Build SHA     => 35e278cd3b89f4809f8859fe7e85197baccf24f9

ac55-sanger · 2021-04-20T08:17:11Z

It just gave up after ~38 hours. :(

ls: reading directory '<path_to_mounted_directory>': Remote I/O error
total 0

real    2233m20.581s
user    0m0.000s
sys     0m0.003s

trel · 2021-05-10T20:42:45Z

We narrowed the scope of @ac55-sanger's slow listings to an iRODS specific query that is not valid syntax for Oracle. We are moving ahead with the rest of these edits and will tackle the Oracle syntax issue separately.

korydraughn · 2021-05-13T15:50:06Z

@michael-conway Can you make a new snapshot of jargon (tip of master) available?

korydraughn · 2021-05-13T20:41:01Z

@ac55-sanger Please try using NFSRODS again. You'll need to rebuild the docker image.

You'll need to make some changes before running NFSRODS.

- Add "using_oracle_database": true to the "nfs_server" section of the NFSRODS config file.
- Replace the following specific queries in iRODS with ones that work with Oracle:
  - ilsLACollections
  - ilsLADataObjects
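
The config change can be sketched as follows. This is an illustration only: the `using_oracle_database` key and the `nfs_server` section come from the instructions above, while the helper name and the trimmed example config are assumptions (a real config file has many more keys).

```python
import json


def enable_oracle(config_text):
    """Return the config JSON with nfs_server.using_oracle_database set to true."""
    config = json.loads(config_text)
    # Create the section if it is missing, then set the flag.
    config.setdefault("nfs_server", {})["using_oracle_database"] = True
    return json.dumps(config, indent=4)


# Hypothetical, heavily trimmed config used only for demonstration.
example = '{"nfs_server": {"port": 2049}}'
updated = enable_oracle(example)
print(updated)
```

Existing keys in the section are left untouched; only the new flag is added.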

Below are the new specific queries. Please have Simon take a look at these. You're free to adjust these.

ilsLACollections

SELECT * FROM (
  SELECT c.parent_coll_name, c.coll_name, c.create_ts, c.modify_ts,
         c.coll_id, c.coll_owner_name, c.coll_owner_zone, c.coll_type, u.user_name, u.zone_name,
         a.access_type_id, u.user_id, rownum as limit_rn
  FROM R_COLL_MAIN c
  JOIN R_OBJT_ACCESS a ON c.coll_id = a.object_id
  JOIN R_USER_MAIN u ON a.user_id = u.user_id
  WHERE c.parent_coll_name = ?
  ORDER BY c.coll_name, u.user_name, a.access_type_id, c.parent_coll_name, c.create_ts, c.modify_ts,
           c.coll_id, c.coll_owner_name, c.coll_owner_zone, c.coll_type, u.zone_name, u.user_id DESC
) WHERE limit_rn > ? AND limit_rn <= ?

ilsLADataObjects

SELECT * FROM (
  SELECT s.coll_name, s.data_name, s.create_ts, s.modify_ts, s.data_id,
         s.data_size, s.data_repl_num, s.data_owner_name, s.data_owner_zone, u.user_name,
         u.user_id, a.access_type_id, u.user_type_name, u.zone_name, rownum as limit_rn
  FROM (
      SELECT c.coll_name, d.data_name, d.create_ts, d.modify_ts, d.data_id, d.data_repl_num,
             d.data_size, d.data_owner_name, d.data_owner_zone
      FROM R_COLL_MAIN c
      JOIN R_DATA_MAIN d ON c.coll_id = d.coll_id
      WHERE c.coll_name = ?
  ) s
  JOIN R_OBJT_ACCESS a ON s.data_id = a.object_id
  JOIN R_USER_MAIN u ON a.user_id = u.user_id
  ORDER BY s.coll_name, s.data_name, u.user_name, a.access_type_id, s.create_ts, s.modify_ts,
           s.data_id, s.data_size, s.data_repl_num, s.data_owner_name, s.data_owner_zone,
           u.user_id, u.user_type_name, u.zone_name DESC
) WHERE limit_rn > ? AND limit_rn <= ?
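
Both queries end with `limit_rn > ? AND limit_rn <= ?`, i.e. an exclusive lower bound and an inclusive upper bound on the row number, after the first placeholder (the collection name). A minimal sketch of how consecutive pages could map to those two bind values follows. How Jargon actually binds them is not shown in this thread, and the function name is hypothetical.

```python
def rownum_windows(total_rows, page_size):
    """Yield (lower, upper) bind pairs for `limit_rn > ? AND limit_rn <= ?`.

    `lower` is exclusive and `upper` is inclusive, matching the predicate.
    """
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    lower = 0
    while lower < total_rows:
        upper = min(lower + page_size, total_rows)
        yield lower, upper
        lower = upper  # the next page starts right after the last row returned


# Twelve rows in pages of five: rows 1-5, 6-10, 11-12.
print(list(rownum_windows(12, 5)))  # [(0, 5), (5, 10), (10, 12)]
```

When the total is not known in advance, a caller would instead keep advancing the window until a page comes back with fewer rows than requested.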

korydraughn · 2021-05-17T13:23:47Z

@ac55-sanger Please hold on replacing the existing specific queries.

ac55-sanger · 2021-05-17T13:36:00Z

Sure. Thankfully I haven't looked into these yet.

korydraughn · 2021-05-17T13:40:38Z

@ac55-sanger How long does it take ils -A to run against a fairly large collection on your system?

I'm trying to determine if anything breaks by changing those queries. I feel anything attempting to invoke the existing ones will result in the poor performance we see in NFSRODS.

ac55-sanger · 2021-05-17T13:54:45Z

Here is the time taken to list a directory containing 65K files with the ils command:

real	3m31.221s
user	0m15.197s
sys	0m4.812s

korydraughn · 2021-05-17T14:25:08Z

@ac55-sanger You can proceed with replacing those specific queries. They were added to improve Jargon's performance around large data sets.

See https://github.com/irods/irods-legacy/blob/ff4eaa47a34f1bb5990d5560f825975c26bab118/iRODS/server/icat/patches/patch3.2to3.3.sh

ac55-sanger · 2021-05-17T15:33:15Z

Sure.
Should I go ahead with the queries shared last week or the ones mentioned in
https://github.com/irods/irods-legacy/blob/ff4eaa47a34f1bb5990d5560f825975c26bab118/iRODS/server/icat/patches/patch3.2to3.3.sh?

korydraughn · 2021-05-17T15:40:33Z

Don't use the ones from irods-legacy. Use the replacements mentioned here: #114 (comment)
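
Swapping a specific query usually means removing the old alias and adding the new SQL under the same alias. The sketch below only builds the command lines and executes nothing. It assumes the stock `iadmin rsq` (remove specific query) and `iadmin asq` (add specific query) subcommands; the helper name is hypothetical and the SQL in the example is a placeholder, not one of the real queries.

```python
import shlex


def replace_specific_query_cmds(alias, sql):
    """Return the two iadmin argv lists that swap out one specific query."""
    # Collapse whitespace so multi-line SQL is passed as a single argument.
    one_line = " ".join(sql.split())
    return [
        ["iadmin", "rsq", alias],
        ["iadmin", "asq", one_line, alias],
    ]


# Placeholder SQL for illustration; substitute the full replacement query.
cmds = replace_specific_query_cmds("ilsLACollections", "SELECT 1\n  FROM dual")
for cmd in cmds:
    print(shlex.join(cmd))
```

Reviewing the printed commands before running them makes it easier to confirm the alias and SQL pairing for each of the two queries.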

ac55-sanger · 2021-05-17T15:57:09Z

Sure, thanks.
Will update on how it goes.

- iRODS permissions are now cached
- iRODS user type information is now cached
- Fixed list operation result truncation
- Replaced parallelStream() w/ stream()
- Use counter instead of inode number as cookie for directory entries
- Server caches query results for list operation
- List operation jumps over previously handled entries instead of looping/skipping them
- Experimenting with connection cache
- Exposed cache eviction time options
- Added new configuration option: using_oracle_database
- Bumped Jargon version for Oracle support
- Updated the README.
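
One way to read the cookie and caching items above is sketched below. This is an illustration in Python, not the PR's code (NFSRODS itself is Java), and every name in it is hypothetical. The idea shown: each directory entry gets its position in a cached listing as its cookie, so a continuation request can slice straight to the next unread entry instead of scanning for a matching inode number.

```python
class ListingCache:
    """Caches one snapshot per directory and serves it in cookie-addressed chunks."""

    def __init__(self, fetch_entries):
        self._fetch_entries = fetch_entries  # callable: path -> iterable of names
        self._snapshots = {}

    def read_dir(self, path, cookie=0, max_entries=100):
        """Return up to max_entries (cookie, name) pairs that follow `cookie`."""
        if cookie == 0 or path not in self._snapshots:
            # A fresh listing (cookie 0) re-queries; continuations reuse the snapshot.
            self._snapshots[path] = list(self._fetch_entries(path))
        entries = self._snapshots[path]
        # Cookies are 1-based positions, so the cookie doubles as the slice start.
        chunk = entries[cookie:cookie + max_entries]
        return [(cookie + i + 1, name) for i, name in enumerate(chunk)]


cache = ListingCache(lambda path: [f"file{i}" for i in range(5)])
first = cache.read_dir("/coll", cookie=0, max_entries=2)
rest = cache.read_dir("/coll", cookie=first[-1][0], max_entries=10)
print(first)
print(rest)
```

A real server would also need to evict snapshots after some time, which is what the cache eviction options in the list refer to.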

trel mentioned this pull request Jan 22, 2021

Directory listing shows wrong count #111

Closed

korydraughn force-pushed the 110 branch from 0bfcc2b to ef8778b on January 22, 2021 13:14

korydraughn force-pushed the 110 branch from 057a441 to d2dbbe9 on January 23, 2021 01:12

korydraughn force-pushed the 110 branch from d2dbbe9 to f08b187 on January 23, 2021 01:17

korydraughn force-pushed the 110 branch 2 times, most recently from 7e4b080 to 689d36e on January 29, 2021 15:15

korydraughn force-pushed the 110 branch 2 times, most recently from 83b4807 to 2928653 on March 10, 2021 17:22

korydraughn force-pushed the 110 branch from ad3aa93 to a9b14d7 on May 11, 2021 17:13

korydraughn force-pushed the 110 branch 4 times, most recently from 8f2c58e to 5304ba5 on May 26, 2021 16:29

trel reviewed May 26, 2021

korydraughn and others added 7 commits May 26, 2021 13:04

[irods#117] Fixed build dependency.

c725b68

[irods#118] Bumped nfs4j version.

8f0c12c

[irods#122] Added version property to mockito maven dependency.

4eb50c0

[irods#121] Made shutdown process more deterministic.

aecc98e

[irods#48] Resolved portmap service warning.

6eab370

[irods#123] Prints only the SHA information on request.

31d5c21

korydraughn force-pushed the 110 branch from 5304ba5 to 31d5c21 on May 26, 2021 17:08

trel merged commit f4a5ce2 into irods:master May 26, 2021

korydraughn deleted the 110 branch May 26, 2021 17:54
