Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce mem usage #43

Closed
wants to merge 5 commits into from
Closed

Reduce mem usage #43

wants to merge 5 commits into from

Conversation

hawson
Copy link
Contributor

@hawson hawson commented Sep 12, 2015

This pull aims to reduce the RAM usage needed by pihole when parsing new/updated block lists.

The files are downloaded locally and then operated on, instead of being stored as (very large) variables in the shell script. Some simplistic tests (query /proc//status for RSS size) show that the code in master uses ~345,000KB when running (that's the high water mark). When processing locally, RAM usage in this case is somewhere around 8,600KB

It's important to note that the files are downloaded only if the upstream copies are newer, which should address some of the concerns mentioned in #37. That said, if SD card IO is really that much of a concern, there should be full dependency processing to eliminate unnecessary writes.

Storing the output from 'curl' commands directly as shell variables is
very inefficent, and requires much more RAM gravity.sh any time there is
an update to the block lists (and especially on the first run).  Store
the raw blocklists in a temporary file on disk, and process those.
Remove extraneous calls to several programs (cat, uniq).
@jacobsalmela
Copy link
Contributor

Initially, we switched to putting them in a variable (data) to prevent writing to the SD card. Later, the feature was added to only download the lists if there were changes. I have personally never had a failed SD card, but I have read that it happens to people often enough.

What do you mean by dependency processing?

@hawson
Copy link
Contributor Author

hawson commented Sep 17, 2015

By "dependency processing", I mean something like a makefile: you have outputs that require various actions and other dependencies.

Storing the data all in RAM (in bash variables), thus forcing a host to use additional swapfiles, in order to save writes to an SD card seems backwards.

@korhadris
Copy link
Contributor

I think some of your memory usage stats may be low. The sorting step alone of supernova uses 134168 kB on my system. That said, I agree we should have a goal to get away from storing data in memory.

Here are my stats on sorting using time (Not the bash built-in time, the GNU command time, which may need to be installed:

command time -v sort -u pihole.2.supernova.txt > /dev/null
        Command being timed: "sort -u pihole.2.supernova.txt"
        User time (seconds): 17.21
        System time (seconds): 0.47
        Percent of CPU this job got: 308%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:05.72
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 134168
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 0
        Minor (reclaiming a frame) page faults: 33727
        Voluntary context switches: 1854
        Involuntary context switches: 555
        Swaps: 0
        File system inputs: 62520
        File system outputs: 0
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0

@jacobsalmela
Copy link
Contributor

I haven't forgotten about this. Just waiting for some more time that I have to test it. It looks pretty good though from my initial scan.

@jacobsalmela
Copy link
Contributor

Your commits are on the right track, but there are some issues:

  1. Lines 47-70 can be removed since we won't be storing the variables in memory any more
  2. On line 103, I would prefer to have the variables named to fit the theme of the script (sci-fi, Star Trek-esque). I think patternBuffer would be the ideal variable name.
  3. On line 104, again, a themed variable name such as temporalDistortion or tachyonEmissions
  4. On lines 108-109, you display the entire command that is running. I prefer to keep the echos simple and clean to reduce on-screen clutter
  5. Lines 143-144 store the filename in the variable, so it displays as ** 1679450 /etc/pihole/pihole.1.andLight.txt domains being pulled in by gravity... instead of just ** 1679450 domains being pulled in by gravity...
  6. Same thing for 152-153
  7. I'm far from a sed expert, so could you please explain how line 119 works?

@hawson
Copy link
Contributor Author

hawson commented Sep 26, 2015

  1. The patch aims to have minimal changes, and removing swap support is outside the scope of this.
  2. I can update the names, but I will note that the...unusual names (even for someone versed in Star Trek Technobabble)...made coding slower than usual.
  3. ditto
  4. Helps with debugging (short of running with -x, which is entirely too verbose. Easy enough to comment out.
  5. Okay
  6. ditto
  7. Sure:
  • -n means "don't print by default" (Perl's -n option is similar).
  • -r means use extedned regexes, instead of the very limited set normally supported
  • Then there are two different statements, each following the two -e options.
    -- The first statement finds 2 or more literal "." characters, and replaces them with just one. I found several cases where there are multiples periods in a row, and this fixes that problem.
    -- The second looks for lines with a literal "." character, and prints it (this is needed because we "-n" earlier on to suppress printing, and also handles blank lines.

@jacobsalmela
Copy link
Contributor

Did you want to make the changes?

@jacobsalmela
Copy link
Contributor

I have some time this weekend if you wanted to make the changes.

@dschaper
Copy link
Member

dschaper commented Nov 6, 2015

I'd be happy to make a PR with the changes you requested if this issue is still open for consideration.

@jacobsalmela
Copy link
Contributor

Yes, it is. Thanks!

@dschaper
Copy link
Member

dschaper commented Nov 6, 2015

Okay, PR #68 opened with @jacobsalmela changes for @hawson memory reductions.

@jacobsalmela
Copy link
Contributor

Cool. I'll close this one then. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants