
Sort Data in-place and consume Node on the fly to use less memory #317

Open
wants to merge 1 commit into base: master
Conversation

@Krinkle commented Oct 18, 2023

See https://phabricator.wikimedia.org/T315056

Use in-place sort for @Data and consume $Node on the fly to use less memory.

This helped stave off out-of-memory errors in cron scripts that build SVGs from large log files.

Re-creates PR #298 by @AaronSchulz.
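The idea behind the patch, as a minimal Python sketch (the patch itself is to the Perl script, where @Data is sorted in place and each $Node is released after use; the function and names below are hypothetical stand-ins):

```python
# Sketch of "sort in place, consume on the fly" (illustrative only, not
# the actual Perl patch).
def render(frames):
    # sorted(frames) would allocate a full second copy of the list;
    # list.sort() reorders the existing list in place instead.
    frames.sort()
    out = []
    for i, frame in enumerate(frames):
        out.append(frame.upper())  # stand-in for emitting one SVG element
        frames[i] = None           # drop the reference so it can be freed
    return out

print(render(["main;foo 3", "main;bar 1"]))
```

The two halves of the change are independent: the in-place sort avoids a duplicate of the whole dataset, and nulling each entry after it is rendered lets memory be reclaimed while output is still being produced.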

@HarrySidhuz0008

In-place merge sort

Sorting data in place while consuming each element on the fly is a common way to limit memory usage with large datasets. It is often used for external sorting (sorting data that doesn't fit entirely in memory) and for processing large data streams. Below is a simplified Python illustration. (Note: this merge sort reorders arr in place, but it is not strictly in-place, since it allocates temporary half-lists at each level of recursion.)

```python
def merge_sort(arr):
    if len(arr) > 1:
        mid = len(arr) // 2
        left_half = arr[:mid]
        right_half = arr[mid:]

        merge_sort(left_half)
        merge_sort(right_half)

        # Merge the two sorted halves back into arr
        i = j = k = 0
        while i < len(left_half) and j < len(right_half):
            if left_half[i] < right_half[j]:
                arr[k] = left_half[i]
                i += 1
            else:
                arr[k] = right_half[j]
                j += 1
            k += 1

        while i < len(left_half):
            arr[k] = left_half[i]
            i += 1
            k += 1

        while j < len(right_half):
            arr[k] = right_half[j]
            j += 1
            k += 1
```

Example usage:

```python
data = [4, 2, 9, 6, 1, 5, 8, 7, 3]
merge_sort(data)

for item in data:
    # Process and consume each sorted element one by one
    print(item)
```
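For the external-sorting case mentioned above, a hedged sketch using Python's heapq.merge (illustrative, not part of the PR): sort fixed-size chunks, spill each sorted run to a temporary file, then merge the runs lazily so only one line per run needs to be in memory at a time.

```python
import heapq
import tempfile

def external_sort(lines, chunk_size=2):
    runs = []
    for start in range(0, len(lines), chunk_size):
        # Sort one chunk in memory and spill it to disk as a sorted run
        run = tempfile.TemporaryFile(mode="w+")
        run.writelines(s + "\n" for s in sorted(lines[start:start + chunk_size]))
        run.seek(0)
        runs.append(run)
    # heapq.merge consumes each run as an iterator, yielding lines in
    # globally sorted order without loading any run fully into memory.
    return [s.rstrip("\n") for s in heapq.merge(*runs)]

print(external_sort(["d", "b", "a", "c"]))
```

In a real pipeline the final list comprehension would itself be a streaming consumer rather than materializing the result.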

@matthew-olson-intel
Contributor

I measured both the performance and maximum RSS (as reported by GNU time) of this patch against a synthetic dataset, and it seems to reduce maximum RSS by about 2% while increasing runtime by about 5.4%:

config: def
  avg: 5.544 ± 0.043
  max rss: 227.585 ± 0.271
  minor page faults: 36723.600 ± 62893.440
  major page faults: 45.000 ± 0.000
config: free
  avg: 5.848 ± 0.092
  max rss: 223.104 ± 0.310
  minor page faults: 35996.200 ± 175294.560
  major page faults: 45.000 ± 0.000

It may be that these ratios could change significantly with different input; I see in your Phabricator posts that you're using significantly larger input. Is there a way that I can download 2022-08-24.excimer.all.log (or a newer version of it) to try this out?
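Peak RSS can also be read from inside a process, which is one way to cross-check a GNU time measurement. A minimal sketch (my addition, not the methodology used above; note resource is Unix-only):

```python
import resource

# On Linux, ru_maxrss is reported in kilobytes; on macOS it is in bytes.
def peak_rss_kb():
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

buf = bytearray(10_000_000)  # allocate ~10 MB so the peak is measurable
print(peak_rss_kb())
```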

@Krinkle (Author) commented Nov 9, 2023

@matthew-olson-intel Yes! The raw data behind the visualisations at https://performance.wikimedia.org/php-profiling/ is published daily at https://performance.wikimedia.org/arclamp/logs/daily/.

The .excimer.all.log suffixed files are the largest ones. I recommend using the ones from yesterday or earlier, as the current day's file is incomplete.

@matthew-olson-intel
Contributor

@Krinkle OK, thanks -- I ran another round of experiments. These are all 10 iterations per config, reporting the mean (runtime in seconds, RSS in MB).

config: def
  avg: 152.025 ± 1.992
  max rss: 5974.135 ± 0.377
  minor page faults: 1502767.100 ± 138277.290
  major page faults: 45.000 ± 0.000
config: free
  avg: 163.438 ± 15.326
  max rss: 3134.559 ± 0.167
  minor page faults: 783530.200 ± 1217912.160
  major page faults: 45.000 ± 0.000

Seems to have changed things quite a lot: roughly a 48% reduction in max RSS (5974 MB down to 3135 MB) for a 7.5% hit to runtime. A clear win for this input, IMO.


4 participants