Media in /content kills performance #1181

Closed
bep opened this Issue May 29, 2015 · 10 comments


@bep bep added the Bug label May 31, 2015

tatsushid (Contributor) commented Aug 12, 2015

I tested this, and it really does cause performance degradation, on my Gentoo Linux machine (AMD Athlon II Neo N36L 1.3 GHz CPU, 8 GB RAM, HDD). It has less CPU power and no SSD, so I expected the difference to show more clearly than on my MacBook.

I made 500 1 MB dummy media files for the test. First I put them into static, then into content. To eliminate file system cache effects, I removed the public directory and cleared the system cache with the https://github.com/cubicdaiya/cachectl tool before each run.

Here are the results.

500 1MB files in static directory

./hugo --stepAnalysis
initialize & template prep:
    6.945616901s (6.947149254s)    19.64 MB     72689 Allocs
load data:
    164.162µs (6.947406869s)       0.00 MB     109 Allocs
import pages:
    43.793285ms (6.992960892s)      1.92 MB     10760 Allocs
build taxonomies:
    1.386118ms (6.99454559s)        0.03 MB     1082 Allocs
render and write aliases:
    1.54µs (6.994614864s)      0.00 MB     0 Allocs
render and write taxonomies:
    30.822638ms (7.025600572s)      2.67 MB     32406 Allocs
render & write taxonomy lists:
    3.227697ms (7.028946561s)       0.11 MB     2146 Allocs
render and write lists:
    23.976487ms (7.053053313s)      1.53 MB     13694 Allocs
render and write pages:
    80.769532ms (7.134023515s)      5.47 MB     103425 Allocs
render and write homepage:
    17.745852ms (7.152004746s)      1.52 MB     9524 Allocs
render and write Sitemap:
    6.71283ms (7.158859787s)        0.19 MB     3814 Allocs
0 draft content
0 future content
16 pages created
0 paginator pages created
13 tags created
in 252 ms

500 1MB files in content directory

./hugo --stepAnalysis
initialize & template prep:
    104.830714ms (106.413322ms)     1.62 MB     20581 Allocs
load data:
    237.894µs (106.743614ms)       0.00 MB     108 Allocs
import pages:
    5.629365353s (5.736311465s)   531.67 MB     41423 Allocs
build taxonomies:
    1.32118ms (5.737762362s)        0.03 MB     1080 Allocs
render and write aliases:
    1.304µs (5.737852728s)     0.00 MB     0 Allocs
render and write taxonomies:
    25.595834ms (5.763709003s)      2.36 MB     32154 Allocs
render & write taxonomy lists:
    2.965507ms (5.766832536s)       0.11 MB     2146 Allocs
render and write lists:
    21.140266ms (5.78818752s)       1.60 MB     13665 Allocs
render and write pages:
    78.866389ms (5.867315995s)      5.25 MB     103297 Allocs
render and write homepage:
    15.704637ms (5.883270764s)      1.48 MB     9498 Allocs
render and write Sitemap:
    4.709143ms (5.888165617s)       0.18 MB     3794 Allocs
0 draft content
0 future content
16 pages created
0 paginator pages created
13 tags created
in 5818 ms

These results show the following:

  1. In the static case, most of the time is spent in the "initialize & template prep" step.
  2. In the content case, most of the time is spent in the "import pages" step.
  3. In the static case, the reported total time (252 ms) differs significantly from the cumulative time shown for the last step (7.158859787s).

In case 1, the step appears to include the time spent at https://github.com/spf13/hugo/blob/master/commands/hugo.go#L297, copying the dummy files into the public directory.

In case 2, on the other hand, most of the time goes to reading the dummy files into memory at https://github.com/spf13/hugo/blob/master/source/filesystem.go#L59.

Judging from the code referenced in case 1 and the lines around it, the discrepancy in case 3 comes from where the total-time measurement starts. It does not include the time spent copying static files, so it shows not the actual total time but only the time spent reading, parsing, and rendering the content. I don't think that is wrong, but for this investigation the cumulative time of the last step is more useful.

So the content case is actually a little faster than the static case. I ran the test several times, and also with 100 dummy files, and the trend was the same.

However, in the forum thread, @bep said the hugo command must have been killed because of the slowness in the content case. I suspect that was caused by memory thrashing: in the content case, as the results above show, Hugo allocates roughly the total size of the dummy files during the "import pages" step, which may have caused swapping.

I don't know how media files read into memory are used later in Hugo, but if they are only used for writing to the public directory, skipping the in-memory read should improve both performance and memory consumption.


bep (Member) commented Aug 12, 2015

> I don't know how media files read into memory are used in Hugo later but if these are just used for writing to public directory, ignoring reading them into memory should improve performance and memory consumption.

They are not used later, so that would be a good fix.


@bep bep added this to the v0.15 milestone Aug 13, 2015

bep (Member) commented Aug 13, 2015

@tatsushid could you take a stab at fixing this? If we could get this one solved, we could maybe think about asking @spf13 to do a release soon ... I'm not really sure what it contains, but there are some fairly important bug fixes in the mix.


bep (Member) commented Aug 13, 2015

BTW: when I looked at this a while ago, my first thought was to do a lazy init of the File content (delay the read until it is actually needed).


tatsushid (Contributor) commented Aug 14, 2015

@bep, I explored the code. I think the easiest way is to add a new lazy reader and use it instead of bytes.Buffer, but that would also make a mess of the code: Hugo has many different kinds of file access and reading scattered around.

So in my opinion, just adding that functionality isn't very difficult, but if we want code that stays easy to maintain, we should clean those up first. Hugo already has source.Filesystem, source.File, and Handlers, which centralize file-related actions.

However, that refactoring would take a lot of time. Is it better to add a lazy reader first and improve the file access afterwards?


bep (Member) commented Aug 14, 2015

@tatsushid I think the short term fix is the smallest change to fix this particular problem -- then we can create another issue to fix it in a more clean fashion.


tatsushid (Contributor) commented Aug 15, 2015

I wrote a rough lazy reader and tested it. Here is the result with the new code, under the same conditions as before.

500 1MB files in content directory

./hugo --stepAnalysis
initialize & template prep:
    134.075851ms (135.605452ms)     1.62 MB     20578 Allocs
load data:
    182.786µs (135.882579ms)       0.00 MB     108 Allocs
import pages:
    4.622476111s (4.758481227s)    19.78 MB     72680 Allocs
build taxonomies:
    1.41736ms (4.760003729s)        0.03 MB     1081 Allocs
render and write aliases:
    1.148µs (4.76007248s)      0.00 MB     0 Allocs
render and write taxonomies:
    31.452447ms (4.793882702s)      2.62 MB     32296 Allocs
render & write taxonomy lists:
    3.419654ms (4.797531228s)       0.14 MB     2167 Allocs
render and write lists:
    22.059476ms (4.819737499s)      1.57 MB     13690 Allocs
render and write pages:
    81.77901ms (4.901673414s)       5.39 MB     103421 Allocs
render and write homepage:
    18.295218ms (4.920205546s)      1.49 MB     9527 Allocs
render and write Sitemap:
    4.507433ms (4.924826506s)       0.18 MB     3794 Allocs
0 draft content
0 future content
16 pages created
0 paginator pages created
13 tags created
in 4835 ms

It improves both the memory consumption of the "import pages" step and the total execution time. Hugo no longer allocates memory equal to the total size of the files in the content directory. Reading file contents is now also parallelized (moved from https://github.com/spf13/hugo/blob/master/source/filesystem.go#L117 to https://github.com/spf13/hugo/blob/master/hugolib/handler_file.go#L42), so it is a little faster than before.

I will clean up the code and write tests.


bep (Member) commented Aug 15, 2015

But it is still 252 ms static vs 4835 ms content, right? Which doesn't sound like it's right there yet.


tatsushid (Contributor) commented Aug 15, 2015

No, those numbers don't reflect the actual total execution time. As I wrote in my first post, the process-time measurement starts after the static directory files are copied, so the displayed total doesn't show the real execution time. In these cases, the figures to compare are 7.158859787s (static), 5.888165617s (content, original) and 4.924826506s (content, improved). You can also see the real execution time by running the hugo command under the time command.

The misleading displayed execution time should be filed as a separate issue.


bep (Member) commented Aug 15, 2015

Ah, OK -- now I understand. That makes sense. Thanks for looking into this; I can test it myself once you create a pull request. I think the parallel file read could be a big win for really big sites.

And I will create a couple of follow-up issues, one of them being getting the "execution time" more accurate.


tatsushid added a commit to tatsushid/hugo that referenced this issue Aug 16, 2015

Use LazyFileReader for reading file contents
Plain ioutil.ReadFile was used to read file contents: it reads the
whole file into memory and runs in a single goroutine. This caused
heavy memory consumption when copying media files from the content
directory to the publish directory, and hurt performance.

This fixes both issues by replacing ReadFile with LazyFileReader,
which postpones reading the file contents until they are actually
needed. As a result, the actual file reads run in parallel
goroutines, which improves performance, especially on really big
sites.

In addition, when this reader is used via io.Copy, it does not copy
the file contents into memory but streams them straight to the
destination file. This greatly reduces memory consumption when the
site has many media files.

Fix gohugoio#1181

@bep bep closed this in 0c98d8e Aug 22, 2015

tychoish added a commit to tychoish/hugo that referenced this issue Aug 13, 2017
