
Taxonomy facets: can we change massive int[] for parent/child/sibling tree to paged/block int[] to reduce RAM pressure? #12989

Closed
mikemccand opened this issue Jan 2, 2024 · 8 comments · Fixed by #12995

Comments

@mikemccand
Member

Description

At Amazon product search we use taxonomy facets for the facet filtering we show customers, but they cause high RAM pressure on every refresh: the current implementation allocates a new massive int[] each time, requiring ~2X the transient RAM until the old taxonomy reader is fully closed / dereferenced. We end up over-sizing our heaps just to absorb this short RAM surge.

Yet on each refresh, all that really happens is that a few new ints are appended to the end of the old int[] -- it is inherently a write-once, append-only data structure. Reallocating the full massive int[] every time is silly.

I think we could switch to a paged int[] structure? That way the new reader could share nearly all of the old int[] pages and only allocate a new last page to hold the few newly appended ints.
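
A minimal sketch of the paged layout described above (the PagedInts class, PAGE_SIZE, and field names are illustrative, not the actual Lucene API): lookups resolve a page index plus an offset within the page, and pages that are already full never change, so the old and new readers can share them.

/** Illustrative sketch: a paged int[] whose full pages can be shared between the old and new reader. */
final class PagedInts {
  private static final int PAGE_SIZE = 8192; // page size chosen only for illustration

  private final int[][] pages; // full pages are never rewritten, so they can be shared across refreshes
  private final int size;      // number of valid ints across all pages

  PagedInts(int[][] pages, int size) {
    this.pages = pages;
    this.size = size;
  }

  /** Behaves like a plain array look-up: page index plus offset within the page. */
  int get(int index) {
    return pages[index / PAGE_SIZE][index % PAGE_SIZE];
  }

  int size() {
    return size;
  }
}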

@msfroh
Contributor

msfroh commented Jan 5, 2024

If nobody else is working on this, I think I'd like to take it!

@stefanvodita
Contributor

@msfroh - I was looking into this as well and had some thoughts about how to do it.

We could replace ParallelTaxonomyArrays with a new interface that offers three operations for each of the arrays:

interface ChunkedParallelTaxonomyArrays {

  /* Record new entry. */
  public void appendParent(int parent);

  /* Retrieve this ordinal's parent. From the user's perspective, this is like an array look-up. */
  public int getParent(int ord);

  /* There are some places where we need to know how many parents exist in total. */
  public int sizeParents();

  // Same for children and siblings
  ...
}

To implement this interface, we could use an IntBlockPool. We would allocate new int buffers in the block pool as needed and preserve the block pool across DirectoryTaxonomyReader refreshes.
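
A minimal sketch of the parent portion of such an interface, backed by a growable list of fixed-size int buffers standing in for the block pool (the ChunkedParentArray class and its constants are illustrative, not actual Lucene or IntBlockPool code):

import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch of the parent operations over fixed-size int buffers. */
final class ChunkedParentArray {
  private static final int BLOCK_SHIFT = 13; // 8192 ints per buffer
  private static final int BLOCK_SIZE = 1 << BLOCK_SHIFT;
  private static final int BLOCK_MASK = BLOCK_SIZE - 1;

  private final List<int[]> buffers = new ArrayList<>();
  private int size; // number of parents recorded so far

  /** Record a new entry; a fresh buffer is allocated only when the current one is full. */
  public void appendParent(int parent) {
    if ((size & BLOCK_MASK) == 0) {
      buffers.add(new int[BLOCK_SIZE]);
    }
    buffers.get(size >> BLOCK_SHIFT)[size & BLOCK_MASK] = parent;
    size++;
  }

  /** Retrieve this ordinal's parent; from the caller's perspective this is an array look-up. */
  public int getParent(int ord) {
    return buffers.get(ord >> BLOCK_SHIFT)[ord & BLOCK_MASK];
  }

  /** Total number of parents recorded. */
  public int sizeParents() {
    return size;
  }
}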

There are definitely some disadvantages with the block pool idea:

  1. We're preserving a mutable data structure across taxonomy refreshes. There is precedent for this, though, with the caches in DirectoryTaxonomyReader.
  2. We would slightly over-allocate by leaving the last buffer in the pool not completely used, but I think this is a good trade-off for the increased efficiency and simplicity.

What do you think? Did you have something else in mind?

@msfroh
Contributor

msfroh commented Jan 5, 2024

What do you think? Did you have something else in mind?

Oh -- I didn't have anything in mind. I just saw the issue and thought, "Hey, I could figure out how to do that!" Sounds like you've got it in hand, though!

@stefanvodita
Contributor

I'd be happy to work together on it! If we go the route I was proposing, there's a non-trivial amount of work to do:

  1. Create the new interface for taxonomy arrays and use it with taxonomy facets and in the taxo reader.
  2. Augment the IntBlockPool with convenience methods that support this new use case and implement the new taxo array interface using the IntBlockPool.

1 and 2 can be done independently, so we could each take one of those work streams. I'll start on it in the next few days, but feel free to jump in if you get the chance.

@msfroh
Contributor

msfroh commented Jan 5, 2024

I took a look and I think we might be able to do it a little more simply:

public abstract class ParallelTaxonomyArrays {
  /** A view over fixed-size int[] chunks that behaves like one large int[]. */
  public static class ChunkedArray {
    private final int chunkSize;
    private final int[][] chunks;

    public ChunkedArray(int chunkSize, int[][] chunks) {
      this.chunkSize = chunkSize;
      this.chunks = chunks;
    }

    public int get(int i) {
      int chunkNum = i / chunkSize;
      return chunks[chunkNum][i - (chunkNum * chunkSize)];
    }

    public int length() {
      // Capacity of the allocated chunks; the last chunk may not be fully used.
      return chunkSize * chunks.length;
    }
  }

  /** Sole constructor. */
  public ParallelTaxonomyArrays() {}

  public abstract ChunkedArray parents();
  public abstract ChunkedArray children();
  public abstract ChunkedArray siblings();
}

Then within TaxonomyIndexArrays, we can focus on building int[][] instances that get wrapped in ChunkedArray. I feel like IntBlockPool might be overkill?
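
A minimal sketch of what that could look like (the ChunkedTaxonomyArrays class, its constructor, and CHUNK_SIZE are made up for illustration; only ChunkedArray comes from the sketch above): the refresh path only has to produce the int[][] chunks, and the wrapper handles indexing.

/** Illustrative sketch: a TaxonomyIndexArrays-style subclass that only needs to build int[][] chunks. */
class ChunkedTaxonomyArrays extends ParallelTaxonomyArrays {
  private static final int CHUNK_SIZE = 8192;

  private final ChunkedArray parents;
  private final ChunkedArray children;
  private final ChunkedArray siblings;

  ChunkedTaxonomyArrays(int[][] parentChunks, int[][] childChunks, int[][] siblingChunks) {
    this.parents = new ChunkedArray(CHUNK_SIZE, parentChunks);
    this.children = new ChunkedArray(CHUNK_SIZE, childChunks);
    this.siblings = new ChunkedArray(CHUNK_SIZE, siblingChunks);
  }

  @Override
  public ChunkedArray parents() {
    return parents;
  }

  @Override
  public ChunkedArray children() {
    return children;
  }

  @Override
  public ChunkedArray siblings() {
    return siblings;
  }
}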

@msfroh
Contributor

msfroh commented Jan 6, 2024

I ended up running with that idea (sort of) and implemented this: #12995

The unit tests pass, but I don't think any of them allocate more than 8192 ordinals (the chunk size that I set).
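
A minimal sketch of the kind of check that would cross a chunk boundary, using the ChunkedArray sketch above and assuming it is a public static nested class (a plain main method rather than the project's test framework; sizes are arbitrary):

/** Illustrative check that reads stay correct across chunk boundaries. */
public class ChunkedArrayBoundaryCheck {
  public static void main(String[] args) {
    int chunkSize = 8192;
    int numChunks = 4; // more than one chunk, so chunk boundaries are actually crossed
    int[][] chunks = new int[numChunks][chunkSize];
    for (int ord = 0; ord < numChunks * chunkSize; ord++) {
      chunks[ord / chunkSize][ord % chunkSize] = ord * 7; // arbitrary deterministic values
    }
    ParallelTaxonomyArrays.ChunkedArray array =
        new ParallelTaxonomyArrays.ChunkedArray(chunkSize, chunks);
    for (int ord = 0; ord < array.length(); ord++) {
      if (array.get(ord) != ord * 7) {
        throw new AssertionError("mismatch at ordinal " + ord);
      }
    }
    System.out.println("all " + array.length() + " ordinals read back correctly");
  }
}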

@stefanvodita
Contributor

Thanks @msfroh! The PR looks neat, and you might be right that, while IntBlockPool basically maintains an int[][] like our ChunkedArray, it is a bit inconvenient to work with.
I left more detailed comments on the PR, but the high-level question is whether we've actually reduced the memory footprint during taxonomy refreshes. It's very possible I'm missing something, but right now it looks to me like we haven't improved on that front. Doing shallow copies of the old array without allocating new memory would solve it though.

@msfroh
Contributor

msfroh commented Jan 7, 2024

It's very possible I'm missing something, but right now it looks to me like we haven't improved on that front. Doing shallow copies of the old array without allocating new memory would solve it though.

What you've missed is that I'm a big dum-dum 😁

Thanks for catching that! I refactored some code into a shared method (between the "reuse old arrays" case and the "start fresh with a TaxonomyReader" case) and foolishly applied the "start fresh" logic every time. I've fixed it in a subsequent commit (allocating chunks only starting from the index of the last chunk of the old array).
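
A minimal sketch of that grow-on-refresh logic (the ChunkGrowth helper and its names are illustrative, not the code in the PR): every completely filled chunk of the old array is reused by reference, and only the old, partially filled last chunk onward is copied or newly allocated.

import java.util.Arrays;

/** Illustrative helper for growing one chunked array on refresh while reusing old chunks. */
final class ChunkGrowth {
  static final int CHUNK_SIZE = 8192;

  /** Returns chunks sized for newSize entries, reusing the old, full chunks by reference. */
  static int[][] grow(int[][] oldChunks, int oldSize, int newSize) {
    int newChunkCount = (newSize + CHUNK_SIZE - 1) / CHUNK_SIZE;
    int[][] newChunks = Arrays.copyOf(oldChunks, newChunkCount);
    int firstMutableChunk = oldSize / CHUNK_SIZE; // index of the old array's partially filled last chunk
    for (int i = firstMutableChunk; i < newChunkCount; i++) {
      if (i < oldChunks.length) {
        newChunks[i] = Arrays.copyOf(oldChunks[i], CHUNK_SIZE); // copy, so the old reader is never mutated
      } else {
        newChunks[i] = new int[CHUNK_SIZE]; // brand-new chunk for appended ordinals
      }
    }
    return newChunks;
  }
}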

I also incorporated several of the other changes that you suggested. Thanks a lot!
