Data Files eat memory #1065

Open
bep opened this Issue Apr 21, 2015 · 7 comments

@bep bep added the Bug label Apr 21, 2015

bep Aug 22, 2015

Member

Note to self: Revisit this with Go 1.5.

bep Dec 24, 2015

Member

I have done some fresh testing on this with 700 MB worth of JSON files, which needs 4.5 GB of memory.

I also tested unmarshalling into a json.RawMessage, and that cuts the memory requirement in half. But the raw message is useless as is ...
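A minimal Go sketch of the json.RawMessage idea, assuming a data file whose top level is a JSON object. The function name decodeKey is hypothetical and not part of Hugo. It parses only the top-level structure, keeps every value as raw bytes, and fully decodes just the one key that is asked for:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decodeKey (hypothetical helper) splits a JSON object into its top-level
// keys without decoding the values, then fully decodes only the requested key.
func decodeKey(data []byte, key string) (interface{}, error) {
	// Values stay as undecoded bytes in the map.
	var raw map[string]json.RawMessage
	if err := json.Unmarshal(data, &raw); err != nil {
		return nil, err
	}
	msg, ok := raw[key]
	if !ok {
		return nil, fmt.Errorf("key %q not found", key)
	}
	// Only this value is turned into Go maps, slices and scalars.
	var v interface{}
	if err := json.Unmarshal(msg, &v); err != nil {
		return nil, err
	}
	return v, nil
}

func main() {
	data := []byte(`{"key1": {"a": 1}, "key2": [1, 2, 3]}`)
	v, err := decodeKey(data, "key1")
	if err != nil {
		panic(err)
	}
	fmt.Println(v)
}
```

The raw bytes of every key still sit in memory here, so this only avoids the cost of the decoded Go values, not the cost of reading the file.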

So I think the solution is to load these objects lazily, but I have no idea how to do that without breaking the API.

So given

/data/file1.json 
/data/file2.json 
/data/file3.json 
{{ .Site.Data.file2.key1 }}

This would only load file2, and preferably also only unmarshal key1 in that file. If Site.Data were a method and file2 and key1 were arguments to that method, I kind of know how to implement it ...

Any tips? @spf13 @tatsushid
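A rough sketch of the "Site.Data as a method" variant mentioned above, under stated assumptions: lazyData, newLazyData and Get are hypothetical names, not Hugo's API, and the read callback stands in for reading /data/<name>.json from disk. A file is read and split into raw keys only the first time one of its keys is requested:

```go
package main

import (
	"encoding/json"
	"fmt"
	"sync"
)

// lazyData is a hypothetical stand-in for Site.Data. It reads and parses a
// data file only when one of its keys is first requested.
type lazyData struct {
	read  func(name string) ([]byte, error) // e.g. a wrapper around os.ReadFile
	mu    sync.Mutex
	files map[string]map[string]json.RawMessage // raw top-level keys per loaded file
}

func newLazyData(read func(string) ([]byte, error)) *lazyData {
	return &lazyData{read: read, files: make(map[string]map[string]json.RawMessage)}
}

// Get loads file on first use, then decodes only the requested key.
func (d *lazyData) Get(file, key string) (interface{}, error) {
	d.mu.Lock()
	defer d.mu.Unlock()

	keys, ok := d.files[file]
	if !ok {
		b, err := d.read(file)
		if err != nil {
			return nil, err
		}
		if err := json.Unmarshal(b, &keys); err != nil {
			return nil, err
		}
		d.files[file] = keys
	}
	raw, ok := keys[key]
	if !ok {
		return nil, fmt.Errorf("%s: key %q not found", file, key)
	}
	var v interface{}
	if err := json.Unmarshal(raw, &v); err != nil {
		return nil, err
	}
	return v, nil
}

func main() {
	// In-memory stand-in for /data/file1.json ... /data/file3.json.
	disk := map[string]string{
		"file1": `{"key1": "unused"}`,
		"file2": `{"key1": "hello", "key2": [1, 2, 3]}`,
		"file3": `{"key1": "unused"}`,
	}
	reads := 0
	data := newLazyData(func(name string) ([]byte, error) {
		reads++
		return []byte(disk[name]), nil
	})

	v, err := data.Get("file2", "key1")
	if err != nil {
		panic(err)
	}
	fmt.Println(v, reads) // only file2 has been read at this point
}
```

Because text/template can call methods with arguments, a template could use such a type as `{{ .Site.Data.Get "file2" "key1" }}`, which is exactly the syntax change that would break the existing API. The sketch does not cache decoded values, so repeated lookups of the same key decode it again.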

tatsushid Dec 26, 2015

Contributor

If we want both to keep the current template syntax and to change the behavior, I think we need to modify the template AST directly. A parsed template provides a Tree.Root member that keeps all the parsed template nodes, so we can modify them. Please see the sample code

While we can do the above, I personally would prefer changing the API with an announcement.
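The sample code referred to above is not reproduced in this thread. As a hedged illustration of the starting point for the AST approach, the sketch below walks Tree.Root with the standard text/template/parse package and collects which data files a template references through .Site.Data.<file>. It only inspects nodes and does not rewrite them, and referencedDataFiles and collect are hypothetical names:

```go
package main

import (
	"fmt"
	"text/template"
	"text/template/parse"
)

// referencedDataFiles (hypothetical) parses src and returns the set of
// names that appear directly after .Site.Data in field chains.
func referencedDataFiles(src string) (map[string]bool, error) {
	t, err := template.New("page").Parse(src)
	if err != nil {
		return nil, err
	}
	files := make(map[string]bool)
	collect(t.Tree.Root, files)
	return files, nil
}

// collect walks the parsed template nodes recursively.
func collect(n parse.Node, files map[string]bool) {
	switch n := n.(type) {
	case *parse.ListNode:
		if n == nil { // e.g. a missing else branch
			return
		}
		for _, c := range n.Nodes {
			collect(c, files)
		}
	case *parse.ActionNode:
		collect(n.Pipe, files)
	case *parse.IfNode:
		collectBranch(&n.BranchNode, files)
	case *parse.RangeNode:
		collectBranch(&n.BranchNode, files)
	case *parse.WithNode:
		collectBranch(&n.BranchNode, files)
	case *parse.TemplateNode:
		collect(n.Pipe, files)
	case *parse.PipeNode:
		if n == nil { // {{ template "x" }} has no pipeline
			return
		}
		for _, c := range n.Cmds {
			collect(c, files)
		}
	case *parse.CommandNode:
		for _, a := range n.Args {
			collect(a, files)
		}
	case *parse.FieldNode:
		// .Site.Data.file2.key1 has Ident ["Site" "Data" "file2" "key1"].
		if len(n.Ident) >= 3 && n.Ident[0] == "Site" && n.Ident[1] == "Data" {
			files[n.Ident[2]] = true
		}
	}
}

func collectBranch(b *parse.BranchNode, files map[string]bool) {
	collect(b.Pipe, files)
	collect(b.List, files)
	collect(b.ElseList, files)
}

func main() {
	src := `{{ .Site.Data.file2.key1 }}{{ if .Title }}{{ .Site.Data.file3 }}{{ end }}`
	files, err := referencedDataFiles(src)
	if err != nil {
		panic(err)
	}
	fmt.Println(files)
}
```

A walk like this misses references made through variables (for example `$.Site.Data.file2`) or through `index` with a dynamic key, so it would not be a complete solution on its own.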

mattpaz May 11, 2016

Any updates with Go 1.6? I haven't been monitoring advancements closely, but was thinking about returning to it this summer. If it requires an API update, any feel for where it might land on a roadmap? I expect this is an edge case, so I'm not getting my hopes up, but thought I'd check in nonetheless.

bep May 12, 2016

Member

Just about the same with Go 1.6.

@bep bep added the Stale label Feb 28, 2017

bep Mar 1, 2017

Member

Note/Update: This issue is marked as stale, and I may have said something earlier about "opening a thread on the discussion forum". Please don't.

If this is a bug and you can still reproduce this error on the latest release or the master branch, please reply with all of the information you have about it in order to keep the issue open.

If this is a feature request, and you feel that it is still relevant and valuable, please tell us why.

@bep bep added Keep and removed Stale labels Mar 27, 2017

@bep bep modified the milestone: v0.23 Jun 8, 2017

@bep bep modified the milestones: v0.23, v0.24, v0.25 Jun 16, 2017

bep Jun 28, 2017

Member

I just tested this with 670 MB worth of JSON data, and it seems to have improved a little. That may be down to the type of data files I use:

▶  du -h data && time hugo
671M	data
Started building sites ...
Built site for language en:
0 draft content
0 future content
0 expired content
4 regular pages created
16 other pages created
0 non-page files copied
6 paginator pages created
2 tags created
0 categories created
total in 3172 ms
hugo   0.74s  user 1.49s system 65% cpu 3.395 total
avg shared (code):         0 KB
avg unshared (data/stack): 0 KB
total (sum):               0 KB
max memory:                788740 MB
page faults from disk:     19
other page faults:         409721

Same site, no data:

▶ du -h data && time hugo
  0B	data
Started building sites ...
Built site for language en:
0 draft content
0 future content
0 expired content
4 regular pages created
16 other pages created
0 non-page files copied
6 paginator pages created
2 tags created
0 categories created
total in 17 ms
hugo   0.05s  user 0.02s system 112% cpu 0.059 total
avg shared (code):         0 KB
avg unshared (data/stack): 0 KB
total (sum):               0 KB
max memory:                14708 MB
page faults from disk:     9
other page faults:         4048

@bep bep modified the milestones: v0.25, v0.26 Jul 5, 2017

@bep bep modified the milestones: v0.26, v0.27 Aug 6, 2017

@bep bep modified the milestones: v0.27, v0.28 Sep 7, 2017

@bep bep modified the milestones: v0.28, v0.29, v0.30 Sep 21, 2017

@bep bep removed this from the v0.30 milestone Oct 13, 2017

@bep bep added this to the v0.31 milestone Oct 13, 2017

@bep bep modified the milestones: v0.31, v0.32 Oct 30, 2017

@bep bep modified the milestones: v0.32, v0.33 Dec 16, 2017

@bep bep modified the milestones: v0.33, v0.34 Jan 11, 2018

@bep bep modified the milestones: v0.34, v0.35, v0.36 Jan 22, 2018

@bep bep modified the milestones: v0.36, v0.37 Feb 3, 2018

@bep bep modified the milestones: v0.37, v0.38 Feb 11, 2018

@bep bep modified the milestones: v0.38, v0.39 Feb 21, 2018

@bep bep modified the milestones: v0.39, v0.40 Apr 9, 2018

@bep bep modified the milestones: v0.40, v0.41 Apr 20, 2018

@bep bep modified the milestones: v0.41, v0.42 May 4, 2018

@bep bep modified the milestones: v0.42, v0.43 Jun 5, 2018

@bep bep modified the milestones: v0.43, v0.44 Jun 30, 2018

@bep bep modified the milestones: v0.44, v0.45, v0.46 Jul 10, 2018
