Write Performance between 'MinIO Client SDK' and 'mc' #816

Closed
adrian-tarau opened this issue Nov 11, 2019 · 25 comments

@adrian-tarau

I was asked to open an issue after I posted a question on slack: https://minio.slack.com/archives/C3NDUB8UA/p1573484464253800

I'll provide additional information once I set up a performance test in my local environment.

@adrian-tarau
Author

Initial numbers ... I have one script which pipes bytes to mc and then one test for each client, Minio and AWS.

I used a local Minio Server instance to avoid network cost started with:

minio --config-dir ${MINIO_HOME}/config server --address ":9001" ${MINIO_DATA}/d{1...16}

Using dd && mc, I get on average 120MB/s

1048576000 bytes (1.0 GB) copied, 6.89415 s, 152 MB/s
1048576000 bytes (1.0 GB) copied, 7.61475 s, 138 MB/s
1048576000 bytes (1.0 GB) copied, 9.36801 s, 112 MB/s
1048576000 bytes (1.0 GB) copied, 11.2864 s, 92.9 MB/s
1048576000 bytes (1.0 GB) copied, 7.22005 s, 145 MB/s
1048576000 bytes (1.0 GB) copied, 11.3493 s, 92.4 MB/s
1048576000 bytes (1.0 GB) copied, 8.70096 s, 121 MB/s
1048576000 bytes (1.0 GB) copied, 6.90379 s, 152 MB/s

Using the Minio client, throughput is flat at around 19MB/sec. The "good" news: AWS is not "much" faster ... twice as fast as Minio at around 42MB/sec, but still way below mc.

@adrian-tarau
Author

On the Java side, the test-bed is pretty simple:

  • generate one 1GB file
  • create the client, default settings
  • in a loop, call putObject for both clients and measure number of bytes and total duration (and print out throughput, duration, etc)

In the case of Minio client, the hot areas are:

  • MinioClient.createRequest, which uses Digest.sha256Hash, which consumes 99% of the time
  • and inside MinioClient.executeRequest, the part where the HTTP call is done (this.httpClient.newCall(request).execute())

In the case of AWS (just to have a baseline for another Java client), the hot areas are:

  • AmazonS3Client.getInputStream, which uses Md5Utils.computeMD5Hash, which uses 99% of the time
  • and AmazonS3Client.PutObjectStrategy.invokeServiceCall, which does a lot of HTTP but spends most of the time in AWS4Signer.calculateContentHash

In both cases, the clients spend significant time calculating hashes (MD5 and SHA-256) ...

@adrian-tarau
Author

I don't think native code compiled with Go is faster (or slower) than Java, and I presume mc is using the Minio Go client, which should do about the same thing that the Minio Java client does ... so why is it much faster (relative to both Java clients)?

I also presume that sha256 performance in Java and Go is pretty much the same (hopefully?) ...

@adrian-tarau
Author

I ran another test, this time for io.minio.Digest.sha256Hash. It is protected, but a little bit of reflection code let me call it. Hashing a 1MB array takes 9ms on average, so I get ~112MB/sec ... seems pretty low? Any idea how long it takes in Go?

By the way, I'm using Oracle JDK 1.8.0 build 202 ...
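A measurement like the one described above can be approximated without reflection by timing the JDK's own `MessageDigest` directly. This is only a sketch: the class name, the warm-up count and the number of rounds are made up for illustration, and it does not call `io.minio.Digest.sha256Hash` itself, so the numbers will not match that method exactly.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class Sha256Throughput {
    /** Hex-encoded SHA-256 of the given bytes, using the JDK's built-in provider. */
    static String sha256Hex(byte[] data) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(data)) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Throughput in MB/s (1 MB = 10^6 bytes) for `bytes` processed in `nanos`. */
    static double mbPerSecond(long bytes, long nanos) {
        return (bytes / 1e6) / (nanos / 1e9);
    }

    public static void main(String[] args) throws Exception {
        byte[] block = new byte[1024 * 1024]; // 1 MiB of zeros
        int rounds = 200;                      // hypothetical; raise for steadier numbers
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        // Warm up so the JIT has compiled the hashing loop before timing starts.
        for (int i = 0; i < 50; i++) {
            digest.digest(block);
        }
        long start = System.nanoTime();
        for (int i = 0; i < rounds; i++) {
            digest.digest(block);
        }
        long elapsed = System.nanoTime() - start;
        System.out.printf("SHA-256: %.1f MB/s%n",
                mbPerSecond((long) rounds * block.length, elapsed));
    }
}
```

The result depends heavily on the JVM version and CPU, so it is best compared only against other runs on the same machine.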

@adrian-tarau
Author

Sorry, the AWS client did not have the default options: I had disabled chunked encoding. With chunked encoding turned on, I get 62MB/sec.

This is how the client is created (path style access is required to be able to access Minio/non-AWS S3 stores):

AWSCredentials credentials = new BasicAWSCredentials("minio", "minio123");
ClientConfiguration clientConfiguration = new ClientConfiguration()
        .withProtocol(Protocol.HTTP)
        .withTcpKeepAlive(true);
client = new AmazonS3Client(credentials, clientConfiguration);
client.setS3ClientOptions(S3ClientOptions.builder()
        .setPathStyleAccess(true)
        .build());
client.setEndpoint(DEFAULT_URI.toASCIIString());

@adrian-tarau
Author

... and the secret to why it got faster? Skipping the SHA-256 calculation. With chunked encoding, the AWS client skips the SHA-256 calculation (basically skipping the x-amz-content-sha256 header), even if it's not HTTPS (for HTTPS, it skips the calculation even if chunked encoding is disabled).

@adrian-tarau
Author

They even have a warning in S3ClientOptions...so I guess it is expensive ... so ... is the Go client taking a shortcut and not calculating all these hashes like Java implementation does?

/**
     * <p>
     * Returns whether the client is configured to sign payloads in all situations.
     * </p>
     * <p>
     * Payload signing is optional when chunked encoding is not used and requests are made
     * against an HTTPS endpoint.  Under these conditions the client will by default
     * opt to not sign payloads to optimize performance.  If this flag is set to true the
     * client will instead always sign payloads.
     * </p>
     * <p>
     * <b>Note:</b> Payload signing can be expensive, particularly if transferring
     * large payloads in a single chunk.  Enabling this option will result in a performance
     * penalty.
     * </p>
     *
     * @return True if body signing is explicitly enabled for all requests
     */
    public boolean isPayloadSigningEnabled() {
        return payloadSigningEnabled;
    }

@adrian-tarau
Author

It looks like AWS client also has a few more secrets...with the eTag validation off (we trust the content, we are in a private network), I got to 82MB/sec. Still behind mc, but good enough.

/**
     * System property to disable MD5 validation for GetObject. Any value set for this property will
     * disable validation.
     */
    public static final String DISABLE_GET_OBJECT_MD5_VALIDATION_PROPERTY = "com.amazonaws.services.s3.disableGetObjectMD5Validation";

    /**
     * System property to disable MD5 validation for both PutObject and UploadPart. Any value set
     * for this property will disable validation.
     */
    public static final String DISABLE_PUT_OBJECT_MD5_VALIDATION_PROPERTY = "com.amazonaws.services.s3.disablePutObjectMD5Validation";

@adrian-tarau
Author

Ok, the Go client is cheating ;) I found where the MD5/SHA256 hashers are created ... it looks like MD5 comes from the standard library (crypto/md5), but sha256 comes from github.com/minio/sha256-simd, which is a Minio project: https://github.com/minio/sha256-simd

Also found this blog entry: https://blog.minio.io/highwayhash-fast-hashing-at-over-10-gb-s-per-core-in-golang-fee938b5218a

If SHA256 is that expensive (and it looks like it is)...no wonder why mc is faster ...

The problem that I have is that I did some benchmarks in the past with Minio Client and AWS and Minio was slower than AWS even back then...However, it got even slower with the latest Minio Java SDK? 1.5-2 times slower? I do not have the right numbers right now, I could try to revert to an older version and compare.

But now, with a tweaked AWS client, I can get 80MB/sec, which is good enough. I'm wondering if I can do the same thing with Minio Client?

I think I have posted enough for a day :) I'll leave you guys some time to go over all my posts ...

@sinhaashish
Contributor

@adrian-tarau Adding to the chain of comments: in minio-go, the minimum part size is 128 MiB.
https://github.com/minio/minio-go/blob/437215bf4b6f14ae8344ed60ade08296b0d0a753/constants.go#L28

// minPartSize - minimum part size 128MiB per object after which
// putObject behaves internally as multipart.
const minPartSize = 1024 * 1024 * 128

while in minio-java it is 5 MiB:

  // minimum allowed multipart size is 5MiB
  private static final int MIN_MULTIPART_SIZE = 5 * 1024 * 1024;

Can you share the stats after setting MIN_MULTIPART_SIZE to 128 MiB in minio-java?
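To get a feel for what the two constants mean for the 1 GB test file, the number of parts can be worked out with a ceiling division. This is plain arithmetic under the assumption that an object is split into fixed-size parts; it is not taken from either SDK, which may pick part sizes differently. The class and method names are made up for illustration.

```java
final class PartCount {
    static final long MIB = 1024L * 1024L;

    /** Number of fixed-size parts needed to cover objectSize (the last part may be shorter). */
    static long partCount(long objectSize, long partSize) {
        if (objectSize < 0 || partSize <= 0) {
            throw new IllegalArgumentException("objectSize must be >= 0 and partSize > 0");
        }
        return (objectSize + partSize - 1) / partSize; // ceiling division
    }

    public static void main(String[] args) {
        long oneGiB = 1024 * MIB;
        System.out.println(partCount(oneGiB, 5 * MIB));   // 205
        System.out.println(partCount(oneGiB, 128 * MIB)); // 8
    }
}
```

Under that assumption, each part is a separate HTTP request with its own hashing and signing, so a smaller part size means many more round trips for the same object.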

@adrian-tarau
Author

Probably not the reason why ... Amazon Client has 128K chunk size and a 256k read buffer. I'll take the source code and change things (buffers) here and there and see how it goes.

@adrian-tarau
Author

Actually, that's the limit to switch to chunk upload...the real chunk size is 64k

@adrian-tarau
Author

What seems to kill the performance (and that was a surprise to me too) is that sha256 in Java is not that fast. On my machine, I got a maximum of 115MB/sec. And Minio does a full sha256 on the whole file and then a sha256 on each chunk.
When the Amazon client switches to chunked uploads, it does sha256 only on the chunks, and they warn against enabling the full sha256.

@adrian-tarau
Author

My bad, Minio is not doing sha256 once per file and once per chunk, only per chunk.

I went down a few levels, to the OS.

  • strace shows (for 10 seconds) 24220 reads with Minio vs 3355 with Amazon.
  • dstat shows ~5MB/sec writes with Minio vs 800k/sec writes with Amazon, with some large (> 100MB/sec) writes once in a while (whereas Minio writes constantly)

This shows me the following:

  • Minio buffers for reading from the file system are too small. Even if in this test I have basically almost no reads 'cause the test file is in OS cache, the fact that it goes so often to do a syscall, that's expensive. The Digest is using a 16k buffer, compared with 256k buffer for Amazon (with a 128K chunk size).
  • Minio chunks?? are forcing Minio Server to perform less than optimal I/O requests.
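The effect of the read buffer size on the number of read calls can be illustrated with a small sketch. It hashes a stream with a configurable buffer and counts how many `read` calls return data; on a `FileInputStream` each such call would normally be one syscall. The class and method names are hypothetical and this is not the code of either SDK, only an illustration of the 16k vs 256k comparison above.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class BufferedDigest {
    /** Hashes the stream with SHA-256 and returns how many read calls returned data. */
    static int readsToHash(InputStream in, int bufferSize) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] buffer = new byte[bufferSize];
            int reads = 0;
            int n;
            while ((n = in.read(buffer)) != -1) {
                digest.update(buffer, 0, n); // feed only the bytes actually read
                reads++;
            }
            return reads;
        } catch (IOException | NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = new byte[1024 * 1024]; // 1 MiB stand-in for a file
        System.out.println(readsToHash(new ByteArrayInputStream(data), 16 * 1024));  // 64
        System.out.println(readsToHash(new ByteArrayInputStream(data), 256 * 1024)); // 4
    }
}
```

A real file or socket stream may return fewer bytes per call than requested, so the counts above are a lower bound for those sources.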

It would be interesting to know what these limits are in the Minio Go client.

@adrian-tarau
Author

By the way, Minio Client vs Amazon Client for reads ... ~690MB/sec vs ~780MB/sec (reading the same file, so everything is cached by the OS)
There are some differences also for reads but at this throughput, it's not really a huge impact for clients ... if you can read with more than 500MB/sec (which most of the time is not possible since the I/O or Network will not be able to give you that), it's more than enough.

@balamurugana
Member

balamurugana commented Nov 13, 2019

Comparing an SDK with a tool is incorrect. I did a little test by running minio with a single endpoint locally. Below is the result.

For writing 1 GiB data

dd:         2.92624s, ~ 366.935 MB/s
minio-go:   11.6195s, ~ 92.408 MB/s
minio-java: 16.1639s, ~ 66.428 MB/s
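These summary figures appear to be the mean of the four wall-clock times reported for each tool in the detailed output of this comment, converted to MB/s with 1 GiB = 1,073,741,824 bytes and 1 MB = 10^6 bytes. A short sketch that recomputes them under that assumption (class and method names are made up for illustration):

```java
final class RunSummary {
    static final long ONE_GIB = 1_073_741_824L;

    /** Arithmetic mean of the given run times in seconds. */
    static double mean(double... seconds) {
        double sum = 0;
        for (double s : seconds) {
            sum += s;
        }
        return sum / seconds.length;
    }

    /** Throughput in MB/s (1 MB = 10^6 bytes). */
    static double mbPerSecond(long bytes, double seconds) {
        return bytes / 1e6 / seconds;
    }

    public static void main(String[] args) {
        // Run times copied from the detailed output (dd totals and "real" times).
        double dd = mean(2.96655, 2.93666, 2.8554, 2.94635);
        double goSdk = mean(8.378, 11.286, 14.595, 12.219);
        double javaSdk = mean(16.610, 16.246, 16.273, 15.527);
        System.out.printf("dd:         %.5fs, %.3f MB/s%n", dd, mbPerSecond(ONE_GIB, dd));
        System.out.printf("minio-go:   %.4fs, %.3f MB/s%n", goSdk, mbPerSecond(ONE_GIB, goSdk));
        System.out.printf("minio-java: %.4fs, %.3f MB/s%n", javaSdk, mbPerSecond(ONE_GIB, javaSdk));
    }
}
```

Note that the minio-go runs vary a lot (8.4s to 14.6s), so the mean hides a wide spread compared with the very stable minio-java runs.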

Detailed output

[bala@localhost tmp]$ dd oflag=direct if=/dev/zero of=1gb bs=1M count=1024
724566016 bytes (725 MB, 691 MiB) copied, 2 s, 362 MB/s
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 2.96655 s, 362 MB/s

[bala@localhost tmp]$ dd oflag=direct if=/dev/zero of=1gb bs=1M count=1024
732954624 bytes (733 MB, 699 MiB) copied, 2 s, 366 MB/s
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 2.93666 s, 366 MB/s

[bala@localhost tmp]$ dd oflag=direct if=/dev/zero of=1gb bs=1M count=1024
756023296 bytes (756 MB, 721 MiB) copied, 2 s, 378 MB/s
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 2.8554 s, 376 MB/s

[bala@localhost tmp]$ dd oflag=direct if=/dev/zero of=1gb bs=1M count=1024
734003200 bytes (734 MB, 700 MiB) copied, 2 s, 367 MB/s
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 2.94635 s, 364 MB/s

[bala@localhost checkspeed]$ time ./checkspeed 
2019/11/13 11:31:48 Uploaded successfully

real    0m8.378s
user    0m6.826s
sys     0m1.264s

[bala@localhost checkspeed]$ time ./checkspeed 
2019/11/13 11:32:07 Uploaded successfully

real    0m11.286s
user    0m6.008s
sys     0m0.640s

[bala@localhost checkspeed]$ time ./checkspeed 
2019/11/13 11:32:24 Uploaded successfully

real    0m14.595s
user    0m6.356s
sys     0m0.456s

[bala@localhost checkspeed]$ time ./checkspeed 
2019/11/13 11:32:41 Uploaded successfully

real    0m12.219s
user    0m6.008s
sys     0m0.524s


[bala@localhost checkspeed]$ time java -cp minio-6.0.12-DEV-all.jar:. PutObject
uploaded successfully

real    0m16.610s
user    0m12.000s
sys     0m1.244s

[bala@localhost checkspeed]$ time java -cp minio-6.0.12-DEV-all.jar:. PutObject
uploaded successfully

real    0m16.246s
user    0m11.759s
sys     0m1.232s

[bala@localhost checkspeed]$ time java -cp minio-6.0.12-DEV-all.jar:. PutObject
uploaded successfully

real    0m16.273s
user    0m11.784s
sys     0m1.199s

[bala@localhost checkspeed]$ time java -cp minio-6.0.12-DEV-all.jar:. PutObject
uploaded successfully

real    0m15.527s
user    0m11.632s
sys     0m1.267s

Test sources

package main

import (
	"log"
	"os"

	minio "github.com/minio/minio-go/v6"
)

func main() {
	s3Client, err := minio.New("localhost:9000", "minio", "minio123", false)
	if err != nil {
		log.Fatalln(err)
	}

	object, err := os.Open("/home/bala/tmp/1gb")
	if err != nil {
		log.Fatalln(err)
	}
	defer object.Close()
	objectStat, err := object.Stat()
	if err != nil {
		log.Fatalln(err)
	}

	_, err = s3Client.PutObject("mybucket", "myobject", object, objectStat.Size(), minio.PutObjectOptions{ContentType: "application/octet-stream"})
	if err != nil {
		log.Fatalln(err)
	}
	log.Println("Uploaded successfully")
}

import io.minio.MinioClient;

public class PutObject {
  public static void main(String[] args) throws Exception {
      MinioClient minioClient = new MinioClient("http://localhost:9000", "minio", "minio123");
      minioClient.putObject("mybucket", "myobject", "/home/bala/tmp/1gb", 1073741824L, null, null, null);
      System.out.println("uploaded successfully");
  }
}

Language versions

[bala@localhost checkspeed]$ go version
go version go1.13.4 linux/amd64

[bala@localhost checkspeed]$ java -version
openjdk version "1.8.0_232"
OpenJDK Runtime Environment (build 1.8.0_232-b09)
OpenJDK 64-Bit Server VM (build 25.232-b09, mixed mode)

@adrian-tarau
Author

Do you mean comparing mc with the Java SDK? It should be the same, I would think? mc uses Minio Go, and comparing write throughput between mc and Minio Java should be the same as comparing Minio Go with Minio Java, right? I understand now that Minio Go is using an optimized sha256, which might/will be faster even in the absence of the CPU instructions designed to speed up sha256.

Anyway, your throughput, 92 MB/s vs 66 MB/s is great and I'm wondering why I have such a big difference between pure disk performance (dd), mc and Java.

Don't get me wrong, I'm not criticizing Minio or Minio Java, you guys did a great job building these amazing tools....

@balamurugana
Member

Do you mean comparing mc with Java SDK? It should be the same, I would think? mc uses Minio Go and comparing write throughput between mc and Minio Java should be the same as comparing Minio Go with Minio Java, right?

No. mc does better/optimal/parallel uploads/downloads on top of minio-go, whereas the minio SDKs provide APIs for application development, i.e. the logic for parallel uploads/downloads has to be built on top of those APIs (as mc does).
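As a rough illustration of what "logic on top of the APIs" can look like, the sketch below splits an object into byte ranges and hands each range to a thread pool. Everything here is hypothetical: `PartUploader` is a made-up callback standing in for whatever multipart call an application uses, and this is neither mc's implementation nor a minio-java API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ParallelUploadSketch {
    /** Hypothetical callback: upload bytes [offset, offset + length) as the given part. */
    interface PartUploader {
        void upload(int partNumber, long offset, long length) throws Exception;
    }

    /** Splits objectSize into parts, uploads them on a fixed pool, returns the part count. */
    static int uploadInParallel(long objectSize, long partSize, int threads, PartUploader uploader) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> pending = new ArrayList<>();
            int partNumber = 1; // S3 part numbers start at 1
            for (long offset = 0; offset < objectSize; offset += partSize) {
                final int number = partNumber++;
                final long start = offset;
                final long length = Math.min(partSize, objectSize - offset);
                pending.add(pool.submit(() -> {
                    uploader.upload(number, start, length);
                    return null; // Callable, so checked exceptions can propagate
                }));
            }
            for (Future<?> f : pending) {
                f.get(); // wait for every part; rethrows the first failure
            }
            return pending.size();
        } catch (Exception e) {
            throw new IllegalStateException("parallel upload failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        long oneGiB = 1024L * 1024 * 1024;
        long partSize = 128L * 1024 * 1024;
        int parts = uploadInParallel(oneGiB, partSize, 4, (n, off, len) ->
                System.out.println("part " + n + ": offset=" + off + " length=" + len));
        System.out.println("uploaded " + parts + " parts");
    }
}
```

In a real client the callback would read the given range from the file and send it as one multipart part, and the parts would be completed in order once all futures have finished.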

@adrian-tarau
Author

This is what my disk can give me:

ady@zeus2 ~/tools/minio $ dd oflag=direct if=/dev/zero of=1gb bs=1M count=1024
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 6.16744 s, 174 MB/s

Minio Java : 26.11 MB/s
Amazon SDK: 97.64 MB/s

@adrian-tarau
Author

I was wondering if you could run a test for Amazon SDK? This is the code:

public void setup() throws Exception {
        super.setup();

        System.setProperty(SkipMd5CheckStrategy.DISABLE_GET_OBJECT_MD5_VALIDATION_PROPERTY, "true");
        System.setProperty(SkipMd5CheckStrategy.DISABLE_PUT_OBJECT_MD5_VALIDATION_PROPERTY, "true");

        ClientConfiguration clientConfiguration = new ClientConfiguration()
                .withProtocol(Protocol.HTTP)
                .withTcpKeepAlive(true);

        AWSCredentials credentials = new BasicAWSCredentials("minio", "minio123");
        client = AmazonS3ClientBuilder.standard()
                .withCredentials(new AWSStaticCredentialsProvider(credentials))
                .withClientConfiguration(clientConfiguration)
                .withPathStyleAccessEnabled(true)
                .withEndpointConfiguration(new AwsClientBuilder.EndpointConfiguration(getDefaultUri().toASCIIString(), null))
                .build();
    }

    @Test
    public void testWriteFile() {
        testWrite(getFile(), file -> {
            try {
                client.putObject("tmp", "big_file_aws_client", file);
            } catch (Exception e) {
                LOGGER.warning("Failed to upload file: " + file);
            }
        });
    }

The method testWrite comes from a base class ... it only loops over one file, calls putObject and times it.

@adrian-tarau
Author

Hmmm...so how can I do parallel uploads with Minio Java? :) I do not expect amazing write throughput, most applications care about reading throughput from S3 ... however, when the write throughput is 10MB/s-20MB/s, that would become a problem. I'm trying to understand why do I get 10MB/sec.

Would the number of Minio nodes affect the write throughput in a considerable way? Should I expect a significant difference between 5 nodes, 10 nodes or 30 nodes? I tried looking in the Minio documentation but I could not find this aspect clearly spelled out.

@adrian-tarau
Author

To be clear I get 10MB/s from Java in a large cluster when mc flies. 10MB/s is way too low, 30MB/s-40MB/s would be what I would find reasonable.

@balamurugana
Member

I did a little check with aws-sdk-java, which takes 12s.

package examples;

import com.amazonaws.services.s3.internal.SkipMd5CheckStrategy;
import com.amazonaws.ClientConfiguration;
import com.amazonaws.Protocol;
import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.auth.AWSCredentials;
import com.amazonaws.auth.AWSStaticCredentialsProvider;
import com.amazonaws.client.builder.AwsClientBuilder;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;

import java.io.File;

public class PutObject {
  public static void main(String[] args) throws Exception {
    System.setProperty(SkipMd5CheckStrategy.DISABLE_GET_OBJECT_MD5_VALIDATION_PROPERTY, "true");
    System.setProperty(SkipMd5CheckStrategy.DISABLE_PUT_OBJECT_MD5_VALIDATION_PROPERTY, "true");

    ClientConfiguration clientConfiguration = new ClientConfiguration()
      .withProtocol(Protocol.HTTP)
      .withTcpKeepAlive(true);

    AmazonS3 client = AmazonS3ClientBuilder.standard()
      .withCredentials(new AWSStaticCredentialsProvider(new BasicAWSCredentials("minio", "minio123")))
      .withClientConfiguration(clientConfiguration)
      .withPathStyleAccessEnabled(true)
      .withEndpointConfiguration(new AwsClientBuilder.EndpointConfiguration("http://localhost:9000", null))
      .build();

    client.putObject("mybucket", "myobject", new File("/home/bala/tmp/1gb"));
    System.out.println("uploaded successfully");
  }
}

I modified the minio-java source to increase the part size and got the results below.

minio-java-with-part-size-1gb:      12.333s
minio-java-with-part-size-512mb: 14.324s
minio-java-with-part-size-128mb: 16.942s

minio-java (with no code change) is 28% slower than minio-go as per my testing. If yours is much slower, there is some other problem in your testing. It's better to do the testing in a controlled environment with the bare minimum of code.

Solution to 28% slower problem is to increase part size reasonably.
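The 28% figure seems to correspond to the drop in throughput between the averaged minio-go and minio-java runs reported earlier in the thread (11.6195s vs 16.1639s for the same 1 GiB). A small sketch of that calculation, assuming this is how the number was derived (class and method names are made up for illustration):

```java
final class Slowdown {
    /**
     * Fraction by which throughput drops when the same amount of data
     * takes slowSeconds instead of fastSeconds.
     */
    static double throughputLoss(double fastSeconds, double slowSeconds) {
        return 1.0 - fastSeconds / slowSeconds;
    }

    public static void main(String[] args) {
        double goSeconds = 11.6195; // minio-go average from the earlier comment
        // minio-java with no code change, then with the modified part sizes above.
        double[] javaSeconds = {16.1639, 16.942, 14.324, 12.333};
        String[] labels = {"default", "128mb parts", "512mb parts", "1gb parts"};
        for (int i = 0; i < javaSeconds.length; i++) {
            System.out.printf("minio-java (%s): %.1f%% lower throughput than minio-go%n",
                    labels[i], 100 * throughputLoss(goSeconds, javaSeconds[i]));
        }
    }
}
```

Measured by elapsed time instead of throughput, the same pair of averages gives a larger percentage, so it helps to state which of the two a "slower" figure refers to.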

@adrian-tarau
Author

That's 12s vs 16s, right? So Amazon AWS is faster for you too...about 25% faster, basically close to Go SDK. I'll play with different settings of Minio too, see how it goes.

@harshavardhana
Member

Solution to 28% slower problem is to increase part size reasonably.

This is already fixed with PutObjectOptions support in 7.0.0 - closing as fixed.
