Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Active page cache warmup #10957

Merged
merged 14 commits into from
Feb 23, 2018
Merged

Active page cache warmup #10957

merged 14 commits into from
Feb 23, 2018

Conversation

chrisvest
Copy link
Contributor

@chrisvest chrisvest commented Feb 2, 2018

[cl] Introduces a new active page cache warmup feature in the Enterprise Edition. The memory contents of the Neo4j page cache are periodically profiled, and these profiles will then be used to quickly reheat the cache on startup. This decreases the time-to-performance for restarts or crashes.

Copy link
Member

@klaren klaren left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks really good. An awesome feature to have!

I'm not sure about the default profile interval of 1m, do we have any feeling of how long the profiling will take on a larger database?

@@ -197,6 +197,8 @@ public String threadName( Map<String, String> metadata )
*/
public static Group cypherWorker = new Group( "CypherWorker" );

public static Group pageCacheIOHelper = new Group( "PageCacheIOHelper" );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

final

@Internal
@Description( "The profiling frequency for the page cache. Accurate profiles allow the page cache to do active " +
"warmup after a restart, reducing the mean time to performance." )
public static Setting<Duration> pagecache_warmup_profiling_interval =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

final

@@ -159,6 +159,13 @@ public static String duration( long durationMillis, TimeUnit highestGranularity,
}
}

if ( builder.length() == 0 )
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch!


try ( OutputStream outputStream = compressedOutputStream( outputNext );
PageCursor cursor = file.io( 0, PF_SHARED_READ_LOCK | PF_NO_FAULT ) )
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While correct, all this shifting makes me dizzy, might I suggest:

        {
            int stepper = 0;
            int b = 0;
            for ( ; ; )
            {
                if ( stopped )
                {
                    return pagesInMemory;
                }
                if ( !cursor.next() )
                {
                    break; // Exit the loop if there are no more pages.
                }
                if ( cursor.getCurrentPageId() != PageCursor.UNBOUND_PAGE_ID )
                {
                    pagesInMemory++;
                    b |= (1 << stepper);
                }
                stepper++;
                if ( stepper == 8 )
                {
                    outputStream.write( b );
                    b = 0;
                    stepper = 0;
                }
            }
            outputStream.write( b );
            outputStream.flush();
        }

instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks simpler 👍


private CompressorStreamFactory compressors()
{
return new CompressorStreamFactory( true, 1024 );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we reuse the factory?

private File profileOutputFileFinal( PagedFile file )
{
File mappedFile = file.file();
String profileOutputName = "." + mappedFile.getName() + SUFFIX_CACHEPROF;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want the files to be hidden?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think people want to see these by default since they're not super important. I hope hiding them means we'll get fewer support cases that are just about asking what these files are.

{
if ( profilingStarted.compareAndSet( false, true ) )
{
// At this point, we are currently executing inside a *scheduled* executor thread. There are, at this time
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This relies on too much undocumented implementation details of the Neo4jJobScheduler. If the problem is that we only have two threads, we can increase that.

If the purpose of the scheduled executor is to just spawn new jobs at a delay, refactor Neo4jJobScheduler to do this instead of implementing it here. If you had this problem others will surely follow, and this hidden relationship with the causal clustering is very easy to miss.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's do that in a different PR.

import java.io.Closeable;
import java.io.IOException;

abstract class PageLoader implements Closeable
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't this be an interface?

import static org.hamcrest.Matchers.is;
import static org.junit.Assert.assertThat;

public class PageCacheWarmupCcIT extends PageCacheWarmupTestSupport
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It really nice we get warmup for cluster members as well! 🎉

for ( int i = 0; i < 1000; i++ )
{
Node node = db.createNode( label );
node.setProperty( "Niels", "Borh" );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🇩🇰

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚛️

@klaren
Copy link
Member

klaren commented Feb 22, 2018

Also, shouldn't this be tagged as changelog?

@chrisvest
Copy link
Contributor Author

@klaren We don't have very much feeling about how good a default 1m is. I observed in our soak testing that a profile takes about 50ms to 100ms to complete, so I saw no reason to increase the interval. We do want the profiles to be fairly recent, since we at this level don't have any idea about how the workload changes over time.

@chrisvest
Copy link
Contributor Author

@klaren I believe I've addressed all of your comments.

@chrisvest chrisvest merged commit 499efdd into neo4j:3.4 Feb 23, 2018
@chrisvest chrisvest deleted the 3.4-pc-cache-warmer branch February 23, 2018 12:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants