Skip to content

Main grouping algorithm needs improvement #358

@yangmillstheory

Description

@yangmillstheory

For reference, I mean s:GroupingAlgorithmSCTree

function! s:GroupingAlgorithmSCTree(targets, keys) "{{{
" Prepare variables for working
let targets_len = len(a:targets)
let keys_len = len(a:keys)
let groups = {}
let keys = reverse(copy(a:keys))
" Semi-recursively count targets {{{
" We need to know exactly how many child nodes (targets) this branch will have
" in order to pass the correct amount of targets to the recursive function.
" Prepare sorted target count list {{{
" This is horrible, I know. But dicts aren't sorted in vim, so we need to
" work around that. That is done by having one sorted list with key counts,
" and a dict which connects the key with the keys_count list.
let keys_count = []
let keys_count_keys = {}
let i = 0
for key in keys
call add(keys_count, 0)
let keys_count_keys[key] = i
let i += 1
endfor
" }}}
let targets_left = targets_len
let level = 0
let i = 0
while targets_left > 0
" Calculate the amount of child nodes based on the current level
let childs_len = (level == 0 ? 1 : (keys_len - 1) )
for key in keys
" Add child node count to the keys_count array
let keys_count[keys_count_keys[key]] += childs_len
" Subtract the child node count
let targets_left -= childs_len
if targets_left <= 0
" Subtract the targets left if we added too many too
" many child nodes to the key count
let keys_count[keys_count_keys[key]] += targets_left
break
endif
let i += 1
endfor
let level += 1
endwhile
" }}}
" Create group tree {{{
let i = 0
let key = 0
call reverse(keys_count)
for key_count in keys_count
if key_count > 1
" We need to create a subgroup
" Recurse one level deeper
let groups[a:keys[key]] = s:GroupingAlgorithmSCTree(a:targets[i : i + key_count - 1], a:keys)
elseif key_count == 1
" Assign single target key
let groups[a:keys[key]] = a:targets[i]
else
" No target
continue
endif
let key += 1
let i += key_count
endfor
" }}}
" Finally!
return groups
endfunction "}}}
" }}}

The algorithm is correct, but should be cleaned up and refactored, because:

  • there are unused variables, causing confusion
  • this call to reverse is unnecessary
  • building this index (which happens at every recursive call but is always the same) is unnecessary
  • it has too many responsibilities (you can tell by the existence of variables scoped to the entire function, but used only in one or two places):
    • creating a list of child node counts per key
    • creating an index of keys into that list
    • recursively building the tree

I would propose splitting s:GroupingAlgorithmSCTree out into two functions, the first is a helper function

function! s:get_children_counts(hits_rem)
  " returns a list corresponding to s:jump_tokens; each
  " count represents how many hits are in the subtree
  " rooted at the corresponding jump token
  let n_jump_tokens = len(s:jump_tokens)
  let n_hits = repeat([0], n_jump_tokens)
  let hits_rem = a:hits_rem

  let is_first_lvl = 1
  while hits_rem > 0
    " if we can't fit all the hits in the first lvl,
    " fit the remainder starting from the last jump token
    let n_children = is_first_lvl
          \ ? 1
          \ : n_jump_tokens - 1
    for j in range(n_jump_tokens)
      let n_hits[j] += n_children
      let hits_rem -= n_children
      if hits_rem <= 0
        let n_hits[j] += hits_rem
        break
      endif
    endfor
    let is_first_lvl = 0
  endwhile

  return reverse(n_hits)
endfunction

and the second below is the grouping algorithm.

function! s:GetJumpTree(hits)
  " returns a tree where non-leaf nodes are jump tokens and leaves are coordinates
  " (tuples) of hits.
  "
  " each level of the tree is filled such that the average path depth of the tree
  " is minimized and the closest hits come first first.
  let tree = {}

  " i: index into hits
  " j: index into jump tokens
  let i = 0
  let j = 0
  for n_children in s:get_children_counts(len(a:hits))
    let node = s:jump_tokens[j]
    if n_children == 1
      let tree[node] = a:hits[i]
    elseif n_children > 1
      let tree[node] = s:GetJumpTree(a:hits[i:i + n_children - 1])
    else
      continue
    endif
    let j += 1
    let i += n_children
  endfor
  return tree
endfunction

The algorithm and results are the same; I've tested it on various inputs and can include in the PR.

I'm open to preserving the existing naming (I use "hits" for "targets" and "jump tokens" for "keys") and but I think this implementation is cleaner and makes it easier for others to understand and extend, and cuts the amount of grouping algorithm code almost by half.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions