Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add import of v8 heap allocation profile #170

Merged
merged 6 commits into from Oct 8, 2018
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
4 changes: 4 additions & 0 deletions src/import/index.ts
Expand Up @@ -12,6 +12,7 @@ import {importFromLinuxPerf} from './linux-tools-perf'
import {ProfileDataSource, TextProfileDataSource, MaybeCompressedDataReader} from './utils'
import {importAsPprofProfile} from './pprof'
import {decodeBase64} from '../lib/utils'
import {importFromChromeHeapProfile} from './v8heapalloc'

export async function importProfileGroupFromText(
fileName: string,
Expand Down Expand Up @@ -103,6 +104,9 @@ async function _importProfileGroup(dataSource: ProfileDataSource): Promise<Profi
} else if (fileName.endsWith('.v8log.json')) {
console.log('Importing as --prof-process v8 log')
return toGroup(importFromV8ProfLog(JSON.parse(contents)))
} else if (fileName.endsWith('.heapprofile')) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I try to make every kind of profile import even if it has the incorrect extension by using heuristics. Can you please add a sensible heuristic to the list below for the second pass?

Maybe something like:

'head' in parsed && 'callFrame' in parsed['head']

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went for 'head' in parsed && 'selfSize' in parsed['head'] because it's more specific to the heap profile format

console.log('Important as Chrome Heap Profile')
return toGroup(importFromChromeHeapProfile(JSON.parse(contents)))
}

// Second pass: Try to guess what file format it is based on structure
Expand Down
97 changes: 97 additions & 0 deletions src/import/v8heapalloc.ts
@@ -0,0 +1,97 @@
import {Profile, FrameInfo, StackListProfileBuilder} from '../lib/profile'
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a URL you can add to a comment here that provides documentation on this file format? Or at least a link that explains how to record this kind of profile?

import {getOrInsert} from '../lib/utils'
import {ByteFormatter} from '../lib/value-formatters'

interface HeapProfileCallFrame {
columnNumber: number
functionName: string
lineNumber: number
scriptId: string
url: string
}

interface HeapProfileNode {
callFrame: HeapProfileCallFrame
selfSize: number
children: HeapProfileNode[]
id: number
parent?: number
totalSize: number
}

interface HeapProfile {
head: HeapProfileNode
}

const callFrameToFrameInfo = new Map<HeapProfileCallFrame, FrameInfo>()
function frameInfoForCallFrame(callFrame: HeapProfileCallFrame) {
return getOrInsert(callFrameToFrameInfo, callFrame, callFrame => {
const name = callFrame.functionName || '(anonymous)'
const file = callFrame.url
const line = callFrame.lineNumber
const col = callFrame.columnNumber
return {
key: `${name}:${file}:${line}:${col}`,
name,
file,
line,
col,
}
})
}

export function importFromChromeHeapProfile(chromeProfile: HeapProfile): Profile {
const nodeById = new Map<number, HeapProfileNode>()
let currentId = 0
const computeId = (node: HeapProfileNode, parent?: HeapProfileNode) => {
node.id = currentId++
nodeById.set(node.id, node)
if (parent) {
node.parent = parent.id
}

node.children.forEach(children => computeId(children, node))
}
computeId(chromeProfile.head)

// compute the total size
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A small stylistic thing: I try to start all my comments with a capitalized letter

const computeTotalSize = (node: HeapProfileNode): number => {
if (node.children.length === 0) return node.selfSize || 0
const totalChild = node.children.reduce((total: number, children) => {
total += computeTotalSize(children)
return total
}, node.selfSize)
node.totalSize = totalChild
return totalChild
}
const total = computeTotalSize(chromeProfile.head)

// compute all stacks by taking each last node and going upward
const stacks: HeapProfileNode[][] = Array.from(nodeById.values()).map(currentNode => {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a pretty small thing, but I generally use for loops rather than map for a slight performance boost. (https://jsperf.com/map-vs-for-loop-performance/6)

So something like:

const stacks: HeapProfileNode[][] = []
for (let currentNode of nodeById.values()) {
  ...
}

If I'm misunderstanding the perf implications here, please let me know!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally prefer to use native methods of Array for readability but i switched the logic to use for as requested

let stack: HeapProfileNode[] = []
stack.push(currentNode)
// while we found a parent
while (true) {
if (typeof currentNode.parent !== 'number') break
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is correct, but I found it a bit misleading to read. I think what this is checking is if the parent is undefined or not, but it shouldn't be possible or parent to be any type other than number or undefined. I think the idiomatic way of doing that would be:

if (currentParent.parent == null) break

(intentionally using == so that it captures both undefined and null, but still not 0, because JavaScript is a craaazy language)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

true, i switched to a null check but with a strict comparaison (see my comment bellow)

const parent = nodeById.get(currentNode.parent)
if (parent === undefined) break
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case I would just do if (!parent) break, since any node returned by nodeById.get(...) will not be falsy. if (parent != null) would also be fine.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Performance wise, it's better to check against the type you need than relying on V8 adding the check for 0/null/undefined for you
Since any Map return undefined if no entry is found, i believe we can check only for undefined

// push the parent at the begining of the stack
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: "beginning"

stack.unshift(parent)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Weird! I would've expected doing repeated stack.push(parent) followed by a stack.reverse() at the end to be more efficient than doing repeated unshift calls, but it looks like that's wrong? https://jsperf.com/array-unshift-vs-array-push-and-array-reverse. I would've thought that arrays would be implemented as vectors, making unshift an O(n) operation, but it looks like it's optimized really well somehow.

currentNode = parent
}
return stack
})

const profile = new StackListProfileBuilder(total)

for (let i = 0; i < stacks.length; i++) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can just do for (let stack of stacks) { ... } here. TypeScript will compile it down to roughly the same thing, and it doesn't look like you need the index i anywhere else in the loop.

const stack = stacks[i]
const lastFrame = stack[stack.length - 1]
profile.appendSampleWithWeight(
stack.map(frame => frameInfoForCallFrame(frame.callFrame)),
lastFrame.selfSize,
)
}
profile.setValueFormatter(new ByteFormatter())
return profile.build()
}