DRILL-5270: Improve loading of profiles listing in the WebUI #1250

Closed
wants to merge 5 commits

Conversation

@kkhatua (Contributor) commented May 4, 2018

Note: Closed the old PR #755 and opened this one instead.

When Drill displays profiles stored on the file system (local or distributed), it does so by loading the entire list of .sys.drill files in the profile directory, then sorting and deserializing them. This can get expensive, since only a single CPU thread does this.
As an example, for a directory of 120K profiles, just fetching the list of files takes over 6 seconds. After that, the time varies with the number of profiles being rendered. An average of 30ms is needed to deserialize a standard profile, which translates to an additional 3 seconds for rendering the default 100 profiles.

A user-reported issue confirms just that:
DRILL-5028 Opening profiles page from web ui gets very slow when a lot of history files have been stored in HDFS or Local FS

Additional JIRAs have been filed asking for management of these profiles:
DRILL-2362 Drill should manage Query Profiling archiving
DRILL-2861 enhance drill profile file management

This PR brings the following enhancements to achieve that:

  1. Mimic the in-memory persistence of profiles (DRILL-5481) by keeping only a predefined max-capacity number of profiles in the directory and moving the oldest to an 'archived' sub-directory.
  2. Improve loading times by pinning the deserialized list in memory (a TreeSet, which maintains a memory-efficient sorted view of the profiles). That way, if no new profiles are detected in the profileStore (i.e. the profile directory) since the last web request for rendering the profiles, we can re-serve the same listing and skip the trip to the filesystem to re-fetch all the profiles.

Reload & reconstruction of the profiles in the Tree happens whenever any of the following states change:
i. Modification time of the profile dir
ii. Number of profiles in the profile dir
iii. Number of profiles requested exceeds the currently available list

  3. When 2 or more web requests for rendering arrive, the WebServer code already processes them sequentially. As a result, the earliest request triggers the reconstruction of the in-memory profile set, and the last-modified timestamp of the profileStore is tracked. The remaining blocked requests can then re-use the freshly reconstructed profile set for rendering if the underlying profileStore has not been modified. The assumption is that profiles are not being added to the profileStore so fast that every queued-up request triggers a reconstruction.
  4. To prevent frequent archiving, there is a threshold (max-capacity) for triggering an archive. However, the number of profiles archived is chosen so that the number retained is 90% of the threshold.
  5. To keep the archiving process from taking too long, an archival rate (drill.exec.profiles.store.archive.rate) is defined so that at most that many profiles are archived in one go before rendering resumes (see the sketch after this list).
  6. On a distributed filesystem (e.g. HDFS), multiple Drillbits might attempt to archive. To mitigate that, if a Drillbit detects that it is unable to archive a profile, it assumes that another Drillbit is also archiving and stops archiving any more.
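
For illustration, here is a minimal sketch of the archive-trigger math described in items 4 and 5. The method and parameter names, and the threshold value used in the example, are assumptions for illustration only, not the PR's actual code or configuration:

```java
// Illustrative sketch only; names and values are assumed, not taken from the PR.
public class ArchiveTriggerSketch {

  /** Returns how many profiles a single archiving pass should move. */
  static int profilesToArchive(int profilesInStore, int maxCapacity, int archivalRate) {
    if (profilesInStore <= maxCapacity) {
      return 0;                                     // below the threshold: do nothing
    }
    int retainTarget = (int) (0.9 * maxCapacity);   // retain 90% of the threshold
    int excess = profilesInStore - retainTarget;    // profiles eligible for archiving
    return Math.min(excess, archivalRate);          // cap a single pass at the archival rate
  }

  public static void main(String[] args) {
    // Consistent with the logs further below *if* the threshold were 100,000 (an assumption):
    // 122935 profiles -> 32935 excess -> archive 10000 in this pass.
    System.out.println(profilesToArchive(122935, 100_000, 10_000));
  }
}
```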

@kkhatua (Contributor, Author) commented May 4, 2018

[Current Apache Master] User latency when 8 web clients (wget) request /profiles against a profile store of 123K profiles (max scale range = 2 min).
Notice how each response end time is staggered ~13 seconds after the previous one, because the profiles are re-read from disk despite there being no change.

[screenshot: response-latency chart]

@kkhatua (Contributor, Author) commented May 4, 2018

[DRILL-5270] User latency when 8 web clients (wget) request /profiles against a profile store of 123K profiles (max scale range = 2 min).
Note: Only caching is enabled, and no new profiles were written to the store during the 2-minute window.
Notice how all the subsequent responses complete quickly the moment the first response finishes, because of the profile cache.

[screenshot: response-latency chart]

@kkhatua (Contributor, Author) commented May 4, 2018

[DRILL-5270] User latency when 8 web clients (wget) request /profiles against a profile store of 123K profiles (max scale range = 2 min). The requests are made in 2 waves.
Note: Both caching and archiving are enabled, and no new profiles were written to the store during the 2-minute window.
Notice how all the subsequent responses complete quickly the moment the third response finishes. The first 3 clients triggered archiving of profiles from 123K down to about 92K, each time trying to build the cache. By the time the fourth request arrives, there is no more archiving, so the requests are served from the cache (and, hence, they are barely 2-3 seconds apart). The second wave of requests from the 8 clients is served entirely from the cache.

[screenshot: response-latency chart]

Backend logging reveals the archiving process:

2018-05-01 22:47:37,870 kk127.qa.lab [qtp132047013-85] INFO  o.a.d.e.s.s.s.LocalPersistentStore - Requesting thread: qtp132047013-85-85
2018-05-01 22:47:45,131 kk127.qa.lab [qtp132047013-85] INFO  o.a.d.e.s.s.s.LocalPersistentStore - Found 32935 excess profiles. For now, will attempt archiving 10000 profiles to maprfs:/drillbit/profiles/archived
2018-05-01 22:48:04,771 kk127.qa.lab [qtp132047013-85] INFO  o.a.d.e.s.s.s.LocalPersistentStore - Archived 10000 profiles to maprfs:/drillbit/profiles/archived in 19635 ms
2018-05-01 22:48:04,774 kk127.qa.lab [qtp132047013-85] WARN  o.a.d.e.s.s.s.LocalPersistentStore - Took 26902 ms to list & map 300 profiles (out of 122935 profiles in store)
2018-05-01 22:48:12,310 kk127.qa.lab [qtp132047013-85] INFO  o.a.d.e.s.s.s.LocalPersistentStore - Requesting thread: qtp132047013-85-85
2018-05-01 22:48:18,439 kk127.qa.lab [qtp132047013-85] INFO  o.a.d.e.s.s.s.LocalPersistentStore - Found 22935 excess profiles. For now, will attempt archiving 10000 profiles to maprfs:/drillbit/profiles/archived
2018-05-01 22:48:38,234 kk127.qa.lab [qtp132047013-85] INFO  o.a.d.e.s.s.s.LocalPersistentStore - Archived 10000 profiles to maprfs:/drillbit/profiles/archived in 19791 ms
2018-05-01 22:48:38,236 kk127.qa.lab [qtp132047013-85] WARN  o.a.d.e.s.s.s.LocalPersistentStore - Took 25924 ms to list & map 300 profiles (out of 112935 profiles in store)
2018-05-01 22:48:43,275 kk127.qa.lab [qtp132047013-85] INFO  o.a.d.e.s.s.s.LocalPersistentStore - Requesting thread: qtp132047013-85-85
2018-05-01 22:48:48,911 kk127.qa.lab [qtp132047013-85] INFO  o.a.d.e.s.s.s.LocalPersistentStore - Found 12935 excess profiles. For now, will attempt archiving 10000 profiles to maprfs:/drillbit/profiles/archived
2018-05-01 22:49:09,757 kk127.qa.lab [qtp132047013-85] INFO  o.a.d.e.s.s.s.LocalPersistentStore - Archived 10000 profiles to maprfs:/drillbit/profiles/archived in 20842 ms
2018-05-01 22:49:09,759 kk127.qa.lab [qtp132047013-85] WARN  o.a.d.e.s.s.s.LocalPersistentStore - Took 26482 ms to list & map 300 profiles (out of 102935 profiles in store)
2018-05-01 22:49:14,119 kk127.qa.lab [qtp132047013-85] INFO  o.a.d.e.s.s.s.LocalPersistentStore - Requesting thread: qtp132047013-85-85
2018-05-01 22:49:19,339 kk127.qa.lab [qtp132047013-85] WARN  o.a.d.e.s.s.s.LocalPersistentStore - Took 5217 ms to list & map 300 profiles (out of 92935 profiles in store)
2018-05-01 22:49:23,656 kk127.qa.lab [qtp132047013-85] INFO  o.a.d.e.s.s.s.LocalPersistentStore - Requesting thread: qtp132047013-85-85
2018-05-01 22:49:24,214 kk127.qa.lab [qtp132047013-85] INFO  o.a.d.e.s.s.s.LocalPersistentStore - Requesting thread: qtp132047013-85-85
2018-05-01 22:49:24,798 kk127.qa.lab [qtp132047013-85] INFO  o.a.d.e.s.s.s.LocalPersistentStore - Requesting thread: qtp132047013-85-85
2018-05-01 22:49:25,365 kk127.qa.lab [qtp132047013-85] INFO  o.a.d.e.s.s.s.LocalPersistentStore - Requesting thread: qtp132047013-85-85
2018-05-01 22:55:12,247 kk127.qa.lab [qtp132047013-92] INFO  o.a.d.e.s.s.s.LocalPersistentStore - Requesting thread: qtp132047013-92-92
2018-05-01 22:55:12,791 kk127.qa.lab [qtp132047013-92] INFO  o.a.d.e.s.s.s.LocalPersistentStore - Requesting thread: qtp132047013-92-92
2018-05-01 22:55:13,276 kk127.qa.lab [qtp132047013-92] INFO  o.a.d.e.s.s.s.LocalPersistentStore - Requesting thread: qtp132047013-92-92
2018-05-01 22:55:13,770 kk127.qa.lab [qtp132047013-92] INFO  o.a.d.e.s.s.s.LocalPersistentStore - Requesting thread: qtp132047013-92-92
2018-05-01 22:55:30,477 kk127.qa.lab [qtp132047013-92] INFO  o.a.d.e.s.s.s.LocalPersistentStore - Requesting thread: qtp132047013-92-92
2018-05-01 22:55:31,018 kk127.qa.lab [qtp132047013-92] INFO  o.a.d.e.s.s.s.LocalPersistentStore - Requesting thread: qtp132047013-92-92
2018-05-01 22:55:31,578 kk127.qa.lab [qtp132047013-92] INFO  o.a.d.e.s.s.s.LocalPersistentStore - Requesting thread: qtp132047013-92-92
2018-05-01 22:55:32,140 kk127.qa.lab [qtp132047013-92] INFO  o.a.d.e.s.s.s.LocalPersistentStore - Requesting thread: qtp132047013-92-92

@kkhatua (Contributor, Author) commented May 4, 2018

@arina-ielchiieva / @parthchandra could you review this?

@ilooner (Contributor) commented May 8, 2018

@kkhatua Why not use the Guava Cache? http://www.baeldung.com/guava-cache. I think it would simplify the implementation.

@kkhatua (Contributor, Author) commented May 8, 2018

I did consider using the Guava Cache initially, but I could not figure out how to specify the eviction policy based on the profile name. Guava provides a mechanism to limit the cache size and evict the oldest entry, but I wanted to override the mechanism that defines 'oldest'. Lastly, the TreeSet allows us to access the elements in sorted order, which seemed to be missing in Guava.

Do you think it would make the code cleaner if I were to extract the mechanism into a separate implementation of this 'cache'?

@ilooner (Contributor) commented May 8, 2018

@kkhatua I'm still not sure why you want to override the definition of 'oldest'. Why is the default LRU eviction policy not sufficient?

If you need an ordered list of keys for the cache, you can accomplish this with the Guava cache by adding a key to a TreeSet when the Loader is called, and removing the key from the TreeSet when the Removal Listener is called (see the sketch below).

My main concern is that implementing our own cache creates complexity and opens up the possibility for bugs, whereas a pre-existing cache is already debugged and tested for us.
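
A minimal sketch of that suggestion: Guava's CacheBuilder, CacheLoader and RemovalListener are the real APIs, while Profile and loadProfileFromStore are illustrative placeholders, not Drill classes:

```java
import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;
import com.google.common.cache.RemovalListener;

import java.util.NavigableSet;
import java.util.concurrent.ConcurrentSkipListSet;

// Sketch only: Profile and loadProfileFromStore() are stand-ins, not the PR's code.
public class GuavaProfileCacheSketch {
  static class Profile {}

  // Sorted view of cached keys, maintained alongside the Guava cache.
  private final NavigableSet<String> sortedKeys = new ConcurrentSkipListSet<>();

  private final LoadingCache<String, Profile> cache = CacheBuilder.newBuilder()
      .maximumSize(100)                                   // size-bound eviction (roughly LRU)
      .removalListener((RemovalListener<String, Profile>) notification ->
          sortedKeys.remove(notification.getKey()))       // drop evicted keys from the index
      .build(new CacheLoader<String, Profile>() {
        @Override
        public Profile load(String queryId) {
          sortedKeys.add(queryId);                        // record the key in sorted order on load
          return loadProfileFromStore(queryId);           // hypothetical deserialization call
        }
      });

  private Profile loadProfileFromStore(String queryId) {
    return new Profile();                                 // stand-in for reading a .sys.drill file
  }
}
```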

@kkhatua (Contributor, Author) commented May 9, 2018

The cache is constructed by first listing all the profile files and sorting them (the profile ID is generated as a monotonically decreasing value to ensure sortedness in stores like HBase). This customized TreeSet is used to inject profiles (since the FileSystem is not guaranteed to return the list in order), so the TreeSet provides the ordering. We retain only the first N (which are, implicitly, the latest profiles). If more profiles than the max capacity are added, the TreeSet is pruned at the rightmost end (sketched below).
With Guava, the eviction policy provides the option of limiting the size, but it would evict the least-recently used/accessed profile, which is not the notion of 'oldest' we want.
Also, this is currently not a true cache, because the moment we detect changes in the underlying store, we reconstruct this 'cache'. Ideally, we'd want to identify only the newest profiles returned from the FileSystem (using filename filters), but the Hadoop API performance is the same irrespective of the filter.
Primarily, we save the time spent fetching the file list from the FS and deserializing.
I can move the implementation of the TreeSet to a separate class to clean up the code. That would make debugging simpler too. With Guava, I don't see the value add beyond a lower risk of bugs, which should be minimal with the TreeSet too.
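
A minimal sketch of the bounded TreeSet described here; the class and method names are simplified stand-ins, not the PR's actual ProfileSet code:

```java
import java.util.TreeSet;

// Sketch of a bounded, sorted profile set; names are illustrative only.
public class BoundedProfileSetSketch {
  private final TreeSet<String> profiles = new TreeSet<>(); // natural (lexicographic) order
  private final int maxCapacity;

  public BoundedProfileSetSketch(int maxCapacity) {
    this.maxCapacity = maxCapacity;
  }

  /**
   * Adds a profile name. Because Drill query IDs are generated as monotonically
   * decreasing values, ascending order puts the newest profiles first; when the
   * capacity is exceeded, the rightmost (oldest) entry is pruned and returned
   * so it can be archived.
   */
  public String add(String profileName) {
    profiles.add(profileName);
    if (profiles.size() > maxCapacity) {
      return profiles.pollLast();   // evict the oldest profile
    }
    return null;                    // nothing evicted
  }
}
```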

@ilooner (Contributor) commented May 9, 2018

@kkhatua

I think I understand the difference between our two perspectives. You want a cache that always contains only the N most recently created profiles. If you happen to access the (N+1)th youngest profile, the cache will not contain it and never will; the cache will only hold the N most recently created profiles.

I still prefer the approach with the Guava cache because you can still effectively achieve the same result. As new profiles are created, they can be added to the cache. If you access a very old profile, a more recently created profile will be evicted from the cache and the old profile added, since a user just requested it. I would argue this behavior is not only easier to implement, since we are leveraging a library, but actually more desirable, since it caches a profile based on when it is used, not when it was created.

If you still disagree with using the Guava cache, I agree with your proposal of moving your cache into a separate class. I think you should also add some unit tests for the cache to verify that it works as expected. The unit tests will also make maintaining and enhancing the class easier for future developers.

@kkhatua (Contributor, Author) commented May 10, 2018

I actually like the Guava cache approach for its elegance and capabilities, but it expands the scope significantly without a huge benefit over what we currently have. The cache you are envisioning holds the complete profile, whereas this one is only for listing the profiles. When an individual profile is accessed, Drill ends up fetching a fresh copy from the PStore and deserializing its contents to visualize it.
I'll move the class and add some unit tests as well.

@ilooner (Contributor) commented May 10, 2018

@kkhatua Sounds good. Thanks for the explanations and thanks for improving the performance so much :) !

@kkhatua (Contributor, Author) commented May 12, 2018

All the changes are done. I also found an unused import in an unrelated file, so I fixed that to make sure the code builds after rebasing to the latest master.
@arina-ielchiieva / @parthchandra / @ilooner
Can any (or all) of you do a review?

@arina-ielchiieva (Member) left a comment

@kkhatua, before we proceed with code review, please clean up the code a little bit since LocalPersistentStore looks too overloaded. I suggest we move the archiving logic to some helper class as mentioned in my comments... Actually, this would allow us to unit test it much better...

assertEquals(null, poppedProfile);
}

assert(testSet.size() == initCapacity);

arina-ielchiieva (Member): Please use junit assertions in tests.

*/
public String add(String profile, boolean retainOldest) {
store.add(profile);
if ( size.incrementAndGet() > maxCapacity ) {

arina-ielchiieva (Member): Please remove spaces.

logger.error("Unable to archive {} profiles to {}", pendingArchivalSet.size(), archivePath.toString());
}
} catch (IOException e) {
e.printStackTrace();

arina-ielchiieva (Member): Please replace with proper exception...


private ProfileSet profilesSet;
private PathFilter sysFileSuffixFilter;
// private String mostRecentProfile;

arina-ielchiieva (Member): Please remove.

kkhatua (Contributor, Author): @arina-ielchiieva I want this as a placeholder till the Hadoop Filter API improves in performance. There are TODOs marked for that purpose.

private Iterable<Entry<String, V>> iterableProfileSet;
private Stopwatch archiveWatch;

public LocalPersistentStore(DrillFileSystem fs, Path base, PersistentStoreConfig<V> config, DrillConfig drillConfig) {

arina-ielchiieva (Member): I don't think that LocalPersistentStore should be responsible for archiving. Instead we can consider implementing some "Archivers" (dummy and not dummy, for now) and initializing them based on the PROFILES_STORE_ARCHIVE_ENABLED property.
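
A rough sketch of the suggested Archiver abstraction, selected by the archive-enabled flag; the interface and class names are illustrative, not what the PR ultimately implements:

```java
// Illustrative sketch only; names are assumptions, not the PR's final code.
interface ProfileArchiver {
  int archiveExcess(int limit);          // returns the number of profiles actually archived
}

// No-op variant, used when archiving is disabled.
class NoOpArchiver implements ProfileArchiver {
  @Override
  public int archiveExcess(int limit) {
    return 0;
  }
}

// Chosen once at store initialization, based on PROFILES_STORE_ARCHIVE_ENABLED.
final class Archivers {
  static ProfileArchiver create(boolean archiveEnabled, ProfileArchiver fsBackedArchiver) {
    return archiveEnabled ? fsBackedArchiver : new NoOpArchiver();
  }
}
```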

throw new RuntimeException(e);
}

//No need to acquire lock since incoming requests are synchronized

arina-ielchiieva (Member): Could you please point to the code which proves this statement?


kkhatua (Contributor, Author): The method services HTTP requests, which is reflected in the tests (and graphs above). Hence, a lock seemed unnecessary.

if (fs.isDirectory(archivePath)) {
int archivedCount = 0;
archiveWatch.reset().start(); //Clocking
//while (archivedCount < archivalRate) {

arina-ielchiieva (Member): Please remove...

private Stopwatch archiveWatch;

public LocalPersistentStore(DrillFileSystem fs, Path base, PersistentStoreConfig<V> config, DrillConfig drillConfig) {
super();

arina-ielchiieva (Member): Please remove...

@kkhatua (Contributor, Author) commented May 23, 2018

@arina-ielchiieva I've made the following changes:

  1. Refactored to introduce an Archiver
  2. Allowed the cache to apply only to the WebServer
  3. For non-WebServer requests, like SysTables, added support for recursive listing (see the sketch after this list). This is because, while archiving speeds up performance for the WebServer, SysTables would need access to archived profiles for analytics.
  4. Added tests for the ProfileSet cache
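
A minimal sketch of recursive listing via the Hadoop FileSystem API (listFiles with recursive=true), so that profiles in the 'archived' sub-directory remain visible to SysTables-style consumers; the class and filtering here are illustrative, not the PR's exact code:

```java
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.LocatedFileStatus;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.RemoteIterator;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Sketch only: lists all .sys.drill profiles under the profile directory, recursively.
public class RecursiveProfileListingSketch {
  static List<Path> listProfiles(FileSystem fs, Path profileDir) throws IOException {
    List<Path> profiles = new ArrayList<>();
    RemoteIterator<LocatedFileStatus> it = fs.listFiles(profileDir, true); // recursive
    while (it.hasNext()) {
      LocatedFileStatus status = it.next();
      if (status.isFile() && status.getPath().getName().endsWith(".sys.drill")) {
        profiles.add(status.getPath());
      }
    }
    return profiles;
  }
}
```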

@priteshm commented

@arina-ielchiieva @kkhatua would this make it into the 1.14 release?

@kkhatua (Contributor, Author) commented Jun 21, 2018

@arina-ielchiieva I've made some additional changes, please review.

@kkhatua (Contributor, Author) commented Oct 13, 2018

@priteshm this required some more rework, which I hope I've now addressed. We can review and try to get this in as part of 1.15.0.
I've rebased this on top of the latest master, accounting for conflicts due to DRILL-6053 (locking of PStore), DRILL-6422 (shaded Guava imports) and DRILL-6492 (schema/workspace insensitivity).
@arina-ielchiieva / @parthchandra / @ilooner .... anyone up for reviewing this?

Commit messages:
- Separated the 'cache' implementation into a separate class; added test cases for it; rebased on top of master (DRILL-6386).
- Refactored to introduce an Archiver; allowed the cache to apply only to the WebServer; added recursive-listing support for non-WebServer requests (like SysTables), which need access to archived profiles for analytics; added tests for the ProfileSet cache.
- Managed thrown exceptions; sorted profiles via the Java 8 streams API (see the sketch below).
- Rebased to account for the impact of fixes for DRILL-6053 (locking of PStore), DRILL-6422 (shaded Guava imports) and DRILL-6492 (schema/workspace insensitivity).
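
As an illustration of the streams-based sorting mentioned in the commit list above (the surrounding class and method are placeholders, not the PR's code):

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Illustrative only: sort profile file names and keep the newest maxCapacity entries.
// Because Drill query IDs are generated as monotonically decreasing values, a natural
// sort puts the newest profiles first.
public class StreamSortSketch {
  static List<String> newestProfiles(Stream<String> profileFileNames, int maxCapacity) {
    return profileFileNames
        .sorted()                 // natural (lexicographic) order == newest first
        .limit(maxCapacity)       // keep only the newest maxCapacity profiles
        .collect(Collectors.toList());
  }
}
```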
@kkhatua (Contributor, Author) commented Feb 27, 2019

Closing this in favor of PR #1654
