New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed inefficient auths check #410

Merged
merged 2 commits into from Mar 29, 2018

Conversation

Projects
None yet
3 participants
@keith-turner
Contributor

keith-turner commented Mar 28, 2018

On the mailing list a performance problem with authorizations was identified. For the case when a user has a large number of authorizations scans can be really slow. For example if a user has 100,000 auths and does a scan with 90,000 auths, its very slow. This is caused by a subset check on the server side that uses two lists. This PR changes the subset check to use an existing hashset.

The following is a performance test that was written to explore this problem. This code creates a user with 100,000 auths and then times scans with 0, 10, 100, 1000, 10000, and 100000 auths.

  public void testManyAuths(Connector conn) throws Exception {
    conn.securityOperations().createLocalUser("bob", new PasswordToken("bob"));

    List<byte[]> al = new ArrayList<>();

    for (int i = 0; i < 100000; i++) {
      al.add(String.format("%06x", i).getBytes(StandardCharsets.UTF_8));
    }

    Authorizations auths = genAuths(100_000, 1);

    conn.securityOperations().changeUserAuthorizations("bob", auths);

    conn.tableOperations().create("bobsSpecialTable");
    conn.securityOperations().grantTablePermission("bob", "bobsSpecialTable", TablePermission.READ);

    conn = conn.getInstance().getConnector("bob", "bob");

    runTest(conn, Authorizations.EMPTY);
    runTest(conn, genAuths(100_000, 10_000));
    runTest(conn, genAuths(100_000, 1_000));
    runTest(conn, genAuths(100_000, 100));
    runTest(conn, genAuths(100_000, 10));
    runTest(conn, auths);
  }

  void runTest(Connector conn, Authorizations auths) throws Exception {

    try (Scanner scanner = conn.createScanner("bobsSpecialTable", auths)) {
      long start = System.currentTimeMillis();
      // do a few warm up scans
      for (int i = 0; i < 50 && System.currentTimeMillis() - start < 60000; i++) {
        int count = 0;
        for (Entry<Key,Value> entry : scanner) {
          count++;
        }

      }

      start = System.currentTimeMillis();
      SummaryStatistics stats = new SummaryStatistics();
      for (int i = 0; i < 100 && System.currentTimeMillis() - start < 120000; i++) {
        long t1 = System.currentTimeMillis();
        for (Entry<Key,Value> entry : scanner) {

        }
        long t2 = System.currentTimeMillis();
        stats.addValue(t2 - t1);

      }
      System.out.printf("auths.size:%,7d  mean:%.2f  stddev:%.2f  min:%.2f  max:%.2f samples:%d\n",
          auths.size(), stats.getGeometricMean(), stats.getStandardDeviation(), stats.getMin(), 
          stats.getMax(), stats.getN());
    }
  }

  Authorizations genAuths(int max, int step) {
    List<byte[]> al = new ArrayList<>();

    for (int i = 0; i < max; i += step) {
      al.add(String.format("%06x", i).getBytes(StandardCharsets.UTF_8));
    }

    return new Authorizations(al);
  }

Before this PR, this test output :

auths.size:      0  mean:134.39  stddev:29.88  min:98.00  max:289.00 samples:100
auths.size:     10  mean:147.82  stddev:33.17  min:104.00  max:242.00 samples:100
auths.size:    100  mean:219.07  stddev:44.38  min:162.00  max:351.00 samples:100
auths.size:  1,000  mean:475.67  stddev:44.17  min:419.00  max:620.00 samples:100
auths.size: 10,000  mean:3516.51  stddev:105.13  min:3319.00  max:3785.00 samples:35
auths.size:100,000  mean:34615.27  stddev:400.12  min:34167.00  max:35109.00 samples:4

This shows that a scan with 0 auths took 134 milliseconds on average. A scan with 100,000 auths took 34 seconds on average. After this PR the test outputs :

auths.size:      0  mean:1.52  stddev:0.73  min:1.00  max:5.00 samples:100
auths.size:     10  mean:137.23  stddev:26.14  min:102.00  max:228.00 samples:100
auths.size:    100  mean:136.70  stddev:28.12  min:104.00  max:247.00 samples:100
auths.size:  1,000  mean:137.59  stddev:26.18  min:101.00  max:243.00 samples:100
auths.size: 10,000  mean:154.95  stddev:24.81  min:115.00  max:232.00 samples:100
auths.size:100,000  mean:347.62  stddev:56.12  min:250.00  max:549.00 samples:100

@keith-turner keith-turner self-assigned this Mar 28, 2018

@keith-turner keith-turner changed the base branch from master to 1.8 Mar 28, 2018

@ctubbsii

Except for isEmpty() use, looks good to me.

Collection<ByteBuffer> userauths = getCachedUserAuthorizations(user).getAuthorizationsBB();
for (ByteBuffer auth : auths)
if (!userauths.contains(auth))
if (auths.size() == 0) {

This comment has been minimized.

@ctubbsii

ctubbsii Mar 28, 2018

Member

You should call .isEmpty()

@joshelser

This comment has been minimized.

Member

joshelser commented Mar 29, 2018

Beautiful write-up, @keith-turner. A pleasure to read.

@keith-turner keith-turner merged commit 226288b into apache:1.8 Mar 29, 2018

2 checks passed

Jenkins This pull request looks good
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details

@keith-turner keith-turner deleted the keith-turner:fix-auth-check branch Mar 29, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment