
feat: Implement listing version 3.0 #12605

Merged: 28 commits into minio:master from listing-v3-refactor, Jul 5, 2021

Conversation

@klauspost (Contributor) commented Jun 30, 2021

Description

design document

Motivation and Context

See design doc above

How to test this PR?

Start the server and perform listing operations.

minio/mint#327 required for mint.
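
For a quick manual check, something like the following can be used. This is a minimal sketch, not part of the PR: it assumes a local server on 127.0.0.1:9000 with the minio/minio123 credentials used elsewhere in this thread, a pre-existing bucket named testbucket, and the minio-go v7 SDK.

    package main

    import (
    	"context"
    	"log"

    	"github.com/minio/minio-go/v7"
    	"github.com/minio/minio-go/v7/pkg/credentials"
    )

    func main() {
    	// Local test server; endpoint, credentials and bucket name are placeholders.
    	client, err := minio.New("127.0.0.1:9000", &minio.Options{
    		Creds:  credentials.NewStaticV4("minio", "minio123", ""),
    		Secure: false,
    	})
    	if err != nil {
    		log.Fatal(err)
    	}

    	// Recursively list the bucket and count entries; listing errors are
    	// reported on ObjectInfo.Err.
    	count := 0
    	for obj := range client.ListObjects(context.Background(), "testbucket",
    		minio.ListObjectsOptions{Recursive: true}) {
    		if obj.Err != nil {
    			log.Fatal(obj.Err)
    		}
    		count++
    	}
    	log.Println("listed", count, "objects")
    }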

Types of changes

  • New feature (non-breaking change which adds functionality)
  • Optimization (provides speedup with no functional changes)

Checklist:

  • Unit tests added/updated

# Conflicts:
#	cmd/metacache-set.go
@klauspost marked this pull request as ready for review July 2, 2021 11:18
klauspost added a commit to klauspost/minio that referenced this pull request Jul 2, 2021
Fixes brought forward from minio#12605

Fixes resolution when an object is a prefix of another and one zone returns the directory while another returns the object.

Fixes resolution on single entries that arrive first, so resolution doesn't depend on order.
@harshavardhana (Member)

I ended up squashing all the commits and pushing to this branch, @klauspost, since this is a simple enough change and it also fixes the local erasure-coded setup behavior.

@harshavardhana changed the title from "Listing v3 refactor" to "feat: Implement listing version 3.0" on Jul 5, 2021
@harshavardhana self-requested a review July 5, 2021 06:21
@harshavardhana (Member) left a comment

This works great in initial tests, ready to take it in.

@harshavardhana (Member)

Although this PR is good, there is perhaps a lot more tuning we need to do; there are subtle bugs like this:

warp list --warp-client 192.168.103.{11...16} --host 192.168.103.{5...10}:9000 --access-key minio --secret-key minio123 --storage-class REDUCED_REDUNDANCY --noprefix
Throughput, split into 103 x 1s:
 * Fastest: 607.7KiB/s, 622.38 obj/s (1s, starting 00:30:11 MDT)
 * 50% Median: 570.2KiB/s, 584.04 obj/s (1s, starting 00:31:08 MDT)
 * Slowest: 420.8KiB/s, 430.95 obj/s (1s, starting 00:31:53 MDT)

----------------------------------------
Operation: LIST (409). Concurrency: 0.
Errors: 409
First Errors:
 * http://192.168.103.6:9000, 2021-07-05 00:34:08 -0600 MDT: Unexpected object count, want 10000, got 60000
 * http://192.168.103.8:9000, 2021-07-05 00:34:11 -0600 MDT: Unexpected object count, want 10000, got 60000
 * http://192.168.103.10:9000, 2021-07-05 00:33:40 -0600 MDT: Unexpected object count, want 10000, got 60000
 * http://192.168.103.5:9000, 2021-07-05 00:33:34 -0600 MDT: Unexpected object count, want 10000, got 60000
 * http://192.168.103.7:9000, 2021-07-05 00:33:34 -0600 MDT: Unexpected object count, want 10000, got 60000
 * http://192.168.103.9:9000, 2021-07-05 00:33:53 -0600 MDT: Unexpected object count, want 10000, got 60000
 * http://192.168.103.6:9000, 2021-07-05 00:35:19 -0600 MDT: Unexpected object count, want 10000, got 60000
 * http://192.168.103.8:9000, 2021-07-05 00:34:16 -0600 MDT: Unexpected object count, want 10000, got 60000
 * http://192.168.103.5:9000, 2021-07-05 00:33:55 -0600 MDT: Unexpected object count, want 10000, got 60000
 * http://192.168.103.10:9000, 2021-07-05 00:33:38 -0600 MDT: Unexpected object count, want 10000, got 60000

Skipping LIST too few samples. Longer benchmark run required for reliable results.

Without --noprefix things run fine, of course, because each prefix only has a limited number of entries.

@harshavardhana (Member)

warp list --warp-client 192.168.103.{11...16} --host 192.168.103.{5...10}:9000 --access-key minio --secret-key minio123 --storage-class REDUCED_REDUNDANCY --noprefix --concurrent 1
----------------------------------------
Operation: LIST (154). Concurrency: 0.
Errors: 154
First Errors:
 * http://192.168.103.10:9000, 2021-07-05 00:53:41 -0600 MDT: Unexpected object count, want 10000, got 60000
 * http://192.168.103.10:9000, 2021-07-05 00:53:40 -0600 MDT: Unexpected object count, want 10000, got 60000
 * http://192.168.103.10:9000, 2021-07-05 00:53:40 -0600 MDT: Unexpected object count, want 10000, got 60000
 * http://192.168.103.10:9000, 2021-07-05 00:53:41 -0600 MDT: Unexpected object count, want 10000, got 60000
 * http://192.168.103.10:9000, 2021-07-05 00:53:40 -0600 MDT: Unexpected object count, want 10000, got 60000
 * http://192.168.103.10:9000, 2021-07-05 00:53:40 -0600 MDT: Unexpected object count, want 10000, got 60000
 * http://192.168.103.5:9000, 2021-07-05 00:53:47 -0600 MDT: Unexpected object count, want 10000, got 60000
 * http://192.168.103.5:9000, 2021-07-05 00:53:47 -0600 MDT: Unexpected object count, want 10000, got 60000
 * http://192.168.103.5:9000, 2021-07-05 00:55:46 -0600 MDT: Unexpected object count, want 10000, got 60000
 * http://192.168.103.5:9000, 2021-07-05 00:55:46 -0600 MDT: Unexpected object count, want 10000, got 60000

Skipping LIST too few samples. Longer benchmark run required for reliable results.

This also reproduces the situation, @klauspost.

@harshavardhana (Member) left a comment

👎🏽 until we investigate the warp --noprefix issue

@klauspost (Contributor, Author)

Sure 👌

@klauspost (Contributor, Author)

Interesting. Getting fine results with SET WARP_HOST=127.0.0.1:{9001...9002}. Using 1 ss, 2 servers, 4 disks each, or 4 ss, 1 set each, 4 disks each.

Using warp list -noprefix --concurrent 1 -duration=30s. @harshavardhana What server setup are you using?

@harshavardhana (Member)

> Interesting. Getting fine results with SET WARP_HOST=127.0.0.1:{9001...9002}. Using 1 ss, 2 servers, 4 disks each, or 4 ss, 1 set each, 4 disks each.
>
> Using warp list -noprefix --concurrent 1 -duration=30s. @harshavardhana What server setup are you using?

It's a 6-node setup with 20 drives each and 6 clients, @klauspost.

… listing-v3-refactor

# Conflicts:
#	cmd/metacache-server-pool.go
#	cmd/metacache-set.go
@klauspost (Contributor, Author)

@harshavardhana Interesting. Are you cleaning between runs? Without prefixes all objects will be in the root, so without cleanup there will be too many objects.

I got a want 50000, got 50001 when it failed to clean up a single file from the last run. Testing on 6 servers in one pool, 4 disks each ATM, but I guess I can add more disks.

@harshavardhana (Member)

> @harshavardhana Interesting. Are you cleaning between runs? Without prefixes all objects will be in the root, so without cleanup there will be too many objects.
>
> I got a want 50000, got 50001 when it failed to clean up a single file from the last run. Testing on 6 servers in one pool, 4 disks each ATM, but I guess I can add more disks.

Yes, I just let warp clean up after itself; I don't specify --noclear.

@harshavardhana (Member)

I suspect this issue is some kind of timeout problem, where the listing times out and then ends up listing all the entries without honoring max-keys, @klauspost.
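
For illustration, the expected behavior is that the merge step stops once max-keys entries have been collected and reports the result as truncated, instead of draining everything after a timeout. The sketch below uses simplified, assumed types and is not the server's actual code:

    // Simplified sketch of a merge loop that honors max-keys; entry and
    // listResult are assumed types, not MinIO's internal ones.
    type entry struct{ name string }

    type listResult struct {
    	entries     []entry
    	isTruncated bool
    	nextMarker  string
    }

    func mergeEntries(in <-chan entry, maxKeys int) listResult {
    	var res listResult
    	for e := range in {
    		if len(res.entries) >= maxKeys {
    			// Stop at max-keys and signal truncation so the client can
    			// continue from nextMarker, even if the channel keeps
    			// producing entries (e.g. after a timeout).
    			res.isTruncated = true
    			res.nextMarker = res.entries[len(res.entries)-1].name
    			break
    		}
    		res.entries = append(res.entries, e)
    	}
    	return res
    }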

@harshavardhana (Member)

@klauspost another thing I found with the master branch:

warp list --warp-client 192.168.103.{11...16} --host 192.168.103.{5...10}:9000 --access-key minio --secret-key minio123 --storage-class REDUCED_REDUNDANCY --noprefix

It works fine on master, whereas with this PR it does not; in fact the PR seems to increase the load on the system significantly. Is that expected?

@harshavardhana (Member)

> @klauspost another thing I found with the master branch:
>
> warp list --warp-client 192.168.103.{11...16} --host 192.168.103.{5...10}:9000 --access-key minio --secret-key minio123 --storage-class REDUCED_REDUNDANCY --noprefix
>
> It works fine on master, whereas with this PR it does not; in fact the PR seems to increase the load on the system significantly. Is that expected?

Actually, this is not correct; it looks like even the master branch has this issue:

----------------------------------------
Operation: LIST (2276). Concurrency: 0.
Errors: 2276
First Errors:
 * http://192.168.103.5:9000, 2021-07-05 12:16:36 -0600 MDT: Unexpected object count, want 10000, got 59999
 * http://192.168.103.6:9000, 2021-07-05 12:16:38 -0600 MDT: Unexpected object count, want 10000, got 59999
 * http://192.168.103.8:9000, 2021-07-05 12:16:39 -0600 MDT: Unexpected object count, want 10000, got 59999
 * http://192.168.103.9:9000, 2021-07-05 12:16:39 -0600 MDT: Unexpected object count, want 10000, got 59999
 * http://192.168.103.7:9000, 2021-07-05 12:16:38 -0600 MDT: Unexpected object count, want 10000, got 59999
 * http://192.168.103.10:9000, 2021-07-05 12:16:36 -0600 MDT: Unexpected object count, want 10000, got 59999
 * http://192.168.103.5:9000, 2021-07-05 12:16:36 -0600 MDT: Unexpected object count, want 10000, got 59999
 * http://192.168.103.6:9000, 2021-07-05 12:16:38 -0600 MDT: Unexpected object count, want 10000, got 59999
 * http://192.168.103.8:9000, 2021-07-05 12:16:40 -0600 MDT: Unexpected object count, want 10000, got 59999
 * http://192.168.103.9:9000, 2021-07-05 12:16:38 -0600 MDT: Unexpected object count, want 10000, got 59999

Skipping LIST too few samples. Longer benchmark run required for reliable results.

@klauspost (Contributor, Author) commented Jul 5, 2021

@harshavardhana I can only reproduce it if it is not cleaned up. I did find a bug where disks would be duplicated; just sent a fix for that. That is in this PR only, though.

I added a check for results in warp:

    // Wait for errCh to close.
    last := ""
    for oi := range listCh {
    	if oi.Err != nil {
    		d.Error(oi.Err)
    		op.Err = oi.Err.Error()
    	}
    	// Keys should arrive in strictly increasing order; anything else is a
    	// duplicate or out-of-order entry.
    	if last >= oi.Key {
    		console.Errorln("duplicate entry", oi.Key, "last was", last)
    	}
    	// Verify the returned key matches an object we actually uploaded.
    	found := false
    	for _, o := range d.objects {
    		for _, obj := range o {
    			if obj.Name == oi.Key {
    				found = true
    				break
    			}
    		}
    	}
    	if !found {
    		console.Errorln("Unknown key:", oi.Key)
    	}
    	last = oi.Key
    }
This should detect unknown or duplicated entries being returned. Insert it after the // Wait for errCh to close. comment.

Oh, you are running multiple clients. Probably a bug with that.

@klauspost (Contributor, Author)

@harshavardhana Yes, multiple clients will do that: since objects from all clients are dumped into the same folder, the clients cannot separate them when listing.
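
That would also explain the counts above: with 6 warp clients each expecting their own 10,000 objects in a shared root, a full listing returns roughly 6 × 10,000 = 60,000 entries, which matches the want 10000, got 60000 errors.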

Review comment on cmd/metacache-server-pool.go: outdated, resolved.
@minio-trusted (Contributor)

Mint Automation

Test Result
mint-large-bucket.sh ✔️
mint-fs.sh ✔️
mint-gateway-s3.sh ✔️
mint-erasure.sh ✔️
mint-dist-erasure.sh ✔️
mint-zoned.sh ✔️
mint-gateway-nas.sh ✔️
mint-compress-encrypt-dist-erasure.sh ✔️
Deleting image on docker hub
Deleting image locally

@harshavardhana (Member)

The PR looks good and actually also fixes some existing problems in the server implementation. Taking this in to tweak further on the master branch.

@harshavardhana merged commit 05aebc5 into minio:master Jul 5, 2021
@klauspost deleted the listing-v3-refactor branch July 6, 2021 08:07