Memory leak on loading and playing clips #1214

Closed
saltomodules opened this issue Jun 4, 2019 · 11 comments

@saltomodules
Contributor

saltomodules commented Jun 4, 2019

Expected behaviour

When a clip is removed (when it's done playing, or its channel or layer is cleared, etc.), all the memory the clip was using is released back to the system.

Current behaviour

On the Ubuntu build of CasparCG 2.2.2 (and earlier) every clip you load takes up memory which never gets released again until CasparCG is restarted.

Executing a CLEAR on the respective layer or channel has no (or little) effect.

Often, when you load or play a file that has been used before memory usage does not significantly increase again. However, sometimes it does but it isn't exactly clear why.

It looks a bit as if files are being cached in memory somehow until restart (or crash when system memory ultimately is exhausted).

The Windows version of 2.2.2 does not exhibit this behavior. On CLEAR of a channel or layer, the memory a clip was using is clearly (mostly) freed.


Steps to reproduce

  1. LOADBG, LOAD and/or PLAY a set of clips.
  2. Use a system monitor like htop to monitor memory usage of the CasparCG process.
  3. Memory usage goes up with each first usage of a clip. CLEAR on the channel or layer releases no memory (or far too little).
  4. Often, reloading a clip that has been played before does not result in significantly more memory being used (but sometimes it does).
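
As an alternative to watching htop by hand in step 2, the resident memory of the process can be sampled from a script. This is a minimal sketch, assuming a Linux host where `/proc/<pid>/status` contains a `VmRSS` line; the function names and the sampling interval are illustrative and not part of CasparCG.

```python
import re
import time
from pathlib import Path


def parse_vm_rss_kib(status_text: str) -> int:
    """Return the VmRSS value (in KiB) from the text of /proc/<pid>/status."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB$", status_text, re.MULTILINE)
    if match is None:
        raise ValueError("no VmRSS line found")
    return int(match.group(1))


def read_rss_kib(pid: int) -> int:
    """Read the current resident set size of a process (Linux only)."""
    return parse_vm_rss_kib(Path(f"/proc/{pid}/status").read_text())


def watch(pid: int, interval_s: float = 5.0, samples: int = 12) -> None:
    """Print the RSS and its change since the first sample at a fixed interval."""
    baseline = read_rss_kib(pid)
    for _ in range(samples):
        rss = read_rss_kib(pid)
        stamp = time.strftime("%H:%M:%S")
        print(f"{stamp} rss={rss} KiB delta={rss - baseline:+d} KiB")
        time.sleep(interval_s)
```

Calling `watch(pid)` with the CasparCG process ID while issuing PLAY and CLEAR commands gives a timestamped log that can be attached to a report.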

Environment

CasparCG Server version:
2.2.2 85066b9a4 Dev
video-mode 1080i5000    

Operating system:
Ubuntu 18.04

Graphics driver:
Nvidia 390.116  (for GTX1050)

Decklink drivers:
Blackmagic Design BlackmagicIO driver 11.1
@dotarmin
Contributor

dotarmin commented Jun 6, 2019

Thanks, @saltomodules, for reporting this.

Armin

@dotarmin dotarmin changed the title from "Memory leak on loading and playing clips in 2.2.2 85066b9 on Ubuntu 18.04" to "Memory leak on loading and playing clips" Jun 6, 2019
@saltomodules
Contributor Author

On Linux, pmap shows the total memory usage of a process and a list of the separate memory blocks the process is using.

On a freshly started casparcg instance this block list is nicely clean.

However when you start playing clips a long and ever growing list of nvidiactl and [anon] blocks shows up. It looks like this:

00007f7d94830000   2048K rw-s- nvidiactl
00007f7d94a30000  20500K rw---   [ anon ]
00007f7d95e35000      4K -----   [ anon ]
00007f7d95e36000   8192K rw---   [ anon ]
00007f7d96835000   2048K rw-s- nvidiactl
00007f7d96a35000  12300K rw---   [ anon ]
00007f7d97638000      4K -----   [ anon ]
00007f7d97639000   8192K rw---   [ anon ]
00007f7d97e39000      4K -----   [ anon ]
00007f7d97e3a000   8192K rw---   [ anon ]
00007f7d9863a000      4K -----   [ anon ]
00007f7d9863b000   8192K rw---   [ anon ]
00007f7d98e3b000   2048K rw-s- nvidiactl
00007f7d9903b000   2048K rw-s- nvidiactl
00007f7d9923b000   4100K rw---   [ anon ]
00007f7d9983b000   2048K rw-s- nvidiactl
00007f7d99a3b000  16400K rw---   [ anon ]
00007f7d9aa3f000   2048K rw-s- nvidiactl
00007f7d9ac3f000   4100K rw---   [ anon ]
00007f7d9b040000   2048K rw-s- nvidiactl
00007f7d9b240000   2048K rw-s- nvidiactl
00007f7d9b440000  12300K rw---   [ anon ]

This goes on for many hundreds and eventually thousands of lines. The more clips you play that have not been played before, the longer the list gets.

Maybe this helps in pinpointing the source of the memory leak.
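
To compare two pmap snapshots without reading thousands of lines, the listing can be reduced to a count and total size per mapping name. This is a rough sketch that assumes the four-column layout shown above (address, size in K, permissions, name); `summarize_pmap` is a hypothetical helper, not an existing tool.

```python
from collections import defaultdict


def summarize_pmap(pmap_text: str) -> dict[str, tuple[int, int]]:
    """Map each mapping name to (number of blocks, total size in KiB)."""
    totals: dict[str, list[int]] = defaultdict(lambda: [0, 0])
    for line in pmap_text.splitlines():
        fields = line.split(None, 3)
        # Expect: address, size such as "2048K", permissions, mapping name.
        # Header and "total" lines have fewer fields and are skipped.
        if len(fields) < 4 or not fields[1].endswith("K"):
            continue
        size_text = fields[1][:-1]
        if not size_text.isdigit():
            continue
        name = " ".join(fields[3].split())  # normalise inner whitespace
        totals[name][0] += 1
        totals[name][1] += int(size_text)
    return {name: (count, kib) for name, (count, kib) in totals.items()}
```

Running it on the output of `pmap <pid>` before and after playing a batch of new clips shows directly how many nvidiactl and [ anon ] blocks were added and how much memory they account for.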

@toontoet

Here you can see and hear what happens when we keep this going for some time: https://www.youtube.com/watch?v=f0H6BtmHFuM

@MarecP01

Hi.
I am seeing this problem on Windows as well, with version 2.2.0.
The only option is to restart the application.
That was unacceptable, so I have gone back to 2.0.7 :(
Regards
Peter

@vimlesh1975

vimlesh1975 commented Oct 19, 2019

Server 2.2.0 66a9e3e Stable, windows 10 version 1903

[Screenshot: Untitled]

And then server crashed.
`[2019-10-16 23:50:18.909] [error] Exception: C:\Program Files (x86)\Jenkins\workspace\casparcg-server-dep\2.2.x\src\modules\ffmpeg\producer\av_producer.cpp(492): Throw in function bool _cdecl caspar::ffmpeg::Filter::operator ()(int)
[2019-10-16 23:50:18.909] [error] Dynamic exception type: class boost::exception_detail::clone_impl
[2019-10-16 23:50:18.909] [error] [struct boost::errinfo_api_function
* _ptr64] = av_buffersink_get_frame
[2019-10-16 23:50:18.909] [error] [struct boost::errinfo_errno
* __ptr64] = 12, "Not enough space"
[2019-10-16 23:50:18.909] [error] [struct caspar::tag_stacktrace_info * __ptr64] = 0# 0x00007FF79F6598AE
[2019-10-16 23:50:18.909] [error] 1# 0x00007FF79F688C50
[2019-10-16 23:50:18.909] [error] 2# 0x00007FF79F859D9B
[2019-10-16 23:50:18.909] [error] 3# 0x00007FF79F85D11D
[2019-10-16 23:50:18.909] [error] 4# tbb::interface7::internal::task_arena_base::internal_current_slot in tbb
[2019-10-16 23:50:18.909] [error] 5# tbb::task_scheduler_init::terminate in tbb
[2019-10-16 23:50:18.909] [error] 6# tbb::task_scheduler_init::terminate in tbb
[2019-10-16 23:50:18.909] [error] 7# tbb::internal::thread_sleep_v3 in tbb
[2019-10-16 23:50:18.909] [error] 8# tbb::internal::thread_sleep_v3 in tbb
[2019-10-16 23:50:18.909] [error] 9# 0x00007FF80C8F0E72 in ucrtbase
[2019-10-16 23:50:18.909] [error] 10# 0x00007FF80DAD7BD4 in KERNEL32
[2019-10-16 23:50:18.909] [error] 11# 0x00007FF80F8CCED1 in ntdll
[2019-10-16 23:50:18.909] [error]
[2019-10-16 23:50:18.909] [error]

[2019-10-16 23:50:18.909] [error] 0# 0x00007FF79F6598AE in casparcg
[2019-10-16 23:50:18.909] [error] 1# 0x00007FF79F6591DF in casparcg
[2019-10-16 23:50:18.909] [error] 2# 0x00007FF79FB7C7D7 in casparcg
[2019-10-16 23:50:18.909] [error] 3# 0x00007FF805421030 in VCRUNTIME140
[2019-10-16 23:50:18.909] [error] 4# is_exception_typeof in VCRUNTIME140
[2019-10-16 23:50:18.909] [error] 5# 0x00007FF80F900646 in ntdll
[2019-10-16 23:50:18.909] [error] 6# 0x00007FF79F858D82 in casparcg
[2019-10-16 23:50:18.909] [error] 7# 0x00007FF79F9D95E3 in casparcg
[2019-10-16 23:50:18.909] [error] 8# 0x00007FF80C8F0E72 in ucrtbase
[2019-10-16 23:50:18.909] [error] 9# BaseThreadInitThunk in KERNEL32
[2019-10-16 23:50:18.909] [error] 10# 0x00007FF80F8CCED1 in ntdll
[2019-10-16 23:50:18.909] [error] `

I will now test 2.3.

@dimitry-ishenko
Contributor

dimitry-ishenko commented Oct 20, 2019

Does this affect everybody or just select people? I was thinking of upgrading, but now I am not sure if that's a good idea. (We are running 2.1 on Ubuntu 18.04.)

@dimitry-ishenko
Contributor

ping...

@dotarmin
Contributor

dotarmin commented Nov 7, 2019

Hi

We need more information such as producers and consumers you're using when this issue occurs. It would be beneficial if you also could post your configuration (format it as code please).

@dimitry-ishenko, this affects more people. Can you see a pattern, such as specific files or something you do? As much information as possible is helpful so @Julusian can try to find the issue.

Thanks

@saltomodules
Contributor Author

saltomodules commented Nov 20, 2019

In our case, on Ubuntu 18.04:

We've tested without any consumers at all and with a decklink consumer enabled, with no obvious difference in the build-up of memory load. The simplest config possible, with one 1080i50 channel and no extra producers or consumers, showed the same issue.

To re-emphasise, in our case memory only goes up when a file is played for the first time.

For instance, we've tested with a script that plays an entire folder of five hundred h264 files. During the first run of the script memory load shoots up to 10 gigs in no time.

However during the second run almost no extra memory load occurs.

Maybe this is a hint in which direction to look? File pointers that stay open or perhaps an internal data structure with keys based on filepath that is not properly cleared after a clip ends?
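
A test of this kind (playing every file in a folder once) could look roughly like the following. This is a sketch, not the script used above: it assumes the server accepts AMCP commands over TCP on port 5250, that clips are addressed by file name without extension relative to the media folder, and that the files end in `.mp4`; the channel, layer and hold time are arbitrary choices.

```python
import socket
import time
from pathlib import Path


def play_command(clip: str, channel: int = 1, layer: int = 10) -> bytes:
    """Build an AMCP PLAY command for a clip name (no file extension)."""
    return f'PLAY {channel}-{layer} "{clip}"\r\n'.encode("utf-8")


def clip_names(media_dir: Path, suffix: str = ".mp4") -> list[str]:
    """List clip names in a folder, sorted, with the extension removed."""
    return sorted(p.stem for p in media_dir.iterdir() if p.suffix.lower() == suffix)


def play_all(host: str, media_dir: Path, port: int = 5250, hold_s: float = 2.0) -> None:
    """Send one PLAY per clip, print the server reply, then wait hold_s seconds."""
    with socket.create_connection((host, port), timeout=5) as conn:
        for clip in clip_names(media_dir):
            conn.sendall(play_command(clip))
            reply = conn.recv(4096).decode("utf-8", "replace").strip()
            print(clip, "->", reply)
            time.sleep(hold_s)
```

Running `play_all` twice against the same folder while logging process memory would show whether growth is limited to the first pass, as described above.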

@saltomodules
Contributor Author

Also, memory load and utilization on the GPU go up fast after each new clip.

nvidia-smi output on Ubuntu 18.04 after playing a set of clips:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.40       Driver Version: 430.40       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1050    Off  | 00000000:02:00.0  On |                  N/A |
| 45%   57C    P0    N/A /  75W |   1998MiB /  1999MiB |    100%      Default |
+-------------------------------+----------------------+----------------------+

When GPU memory is full and/or utilization becomes too high, the output from CasparCG starts stuttering, as mentioned earlier in this thread.

Perhaps this helps with isolating the issue.
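
To track these GPU numbers over time instead of reading the table, nvidia-smi can be asked for machine-readable output. This is a minimal sketch, assuming the installed nvidia-smi supports `--query-gpu` with CSV output and prints one line per GPU such as `1998, 1999, 100`; the helper names are made up for illustration.

```python
import subprocess


def parse_gpu_csv(line: str) -> dict[str, int]:
    """Parse one 'used, total, utilization' CSV line into integers."""
    used, total, util = (int(field.strip()) for field in line.split(","))
    return {"used_mib": used, "total_mib": total, "util_pct": util}


def query_gpu(index: int = 0) -> dict[str, int]:
    """Ask nvidia-smi for memory use and utilization of a single GPU."""
    out = subprocess.run(
        [
            "nvidia-smi",
            f"--id={index}",
            "--query-gpu=memory.used,memory.total,utilization.gpu",
            "--format=csv,noheader,nounits",
        ],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return parse_gpu_csv(out.strip().splitlines()[0])
```

Logging `query_gpu()` after each newly played clip would show whether GPU memory grows in step with the process memory.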

@dotarmin dotarmin added this to the v2.3.0 LTS milestone Apr 7, 2020
@ronag ronag removed this from the v2.3.0 LTS milestone May 28, 2020
@Julusian
Member

Julusian commented Mar 6, 2023

Duplicate of #1356?

@Julusian Julusian closed this as completed Mar 6, 2023