Docker stress test with Synology DS220+ (v112.1) #1808

Closed
3 tasks done
Tracked by #2176
tesshucom opened this issue Nov 2, 2022 · 0 comments · Fixed by #2203
Assignees
Labels
for : ported-from-airsonic Known issue resolved after airsonic is closed in: data-scan Issues especially related to scan. in: docker Issues in the test module. in: test Issues in the test module or test package type: investigation Investigation required. If it's not a bug it will be closed.

tesshucom commented Nov 2, 2022

Docker considerations. Related airsonic/airsonic#1473, #1747

If we cannot reproduce the problems reported with Airsonic, pinning down their cause amounts to a devil's proof (proving a negative). Therefore, we will run an equivalent stress test, and if no problems appear, this issue will be closed.

Overview
  • Memory issues with Docker have been reported for Airsonic (probably a configuration issue). Jpsonic should prepare an answer, a policy, or at least a clue for this. This test therefore uses the Docker image instead of standalone Jpsonic.
  • Previously, standalone Jpsonic scanned 400,000 songs and showed no noticeable memory leaks or slowdowns. However, that test was not intended as an accurate measurement of memory usage that would be useful for server operation. A somewhat more practical investigation is needed.
  • The scan design changed significantly in v112.0.0, and so did the assumptions. A similar test is therefore necessary again.
    • Unlike Airsonic, Jpsonic now has differential updates, so care is needed. When small amounts of data keep being added to a large library, differential updates make effective countermeasures easier (and are probably very necessary). On the other hand, speed may easily suffer when a large number of registrations happen at once, as in a first scan. This test should capture that trend.

This may not always be a problem with an immediate answer, but it can be dealt with through continuous observation and improvement. Since memory measurement by logging has been added to Jpsonic, it is easier than before to measure and track changes for each process. These considerations and cumulative improvements are now a little easier than they used to be.

Goal

A configuration that has been confirmed to work on a Synology DS220+ should not be difficult to run in a general Linux environment. Only the usual work, such as rewriting directory paths, is required.

Non-Goal
  • 400,000 songs is the stress value. It does not guarantee smooth operation.
    • Empirically, I think the number of songs owned by users breaks down as follows.
      • 5,000 : Light users. Perhaps unexpectedly, I expect this group to contain the largest number of general users. It is a number at which managing physical media such as CDs becomes a little difficult.
      • 10,000 - 50,000 : I think many of the "music lovers" I see on the web have this many songs. Perhaps they are mostly the ones we see on the forums...? I have absolutely no basis for it, but I imagine that 90% of users fall within this range or below.
      • 100,000 : The number of songs that "music lovers with a lot of songs" have. They are often found on media library forums, and they often say they have a lot of old MP3s!
      • 300,000 : The number of songs a "music lover with a large library" has. You will see them on the forums, but I don't think there are many chances to meet one in the real world.
      • 400,000 : The maximum value Jpsonic currently assumes for verification. Scanning is possible, but some functions are expected to be quite inconvenient. The main purpose is to detect points for improvement. If Jpsonic is modified to work reasonably at this level, it will work quite comfortably with fewer songs. Of course, if resources were no concern we could add more, but there would be more and more improvements to make. (It is not just about scanning. For example, displaying the Artist index should be slow. Managing more songs than this on a legacy server requires a certain amount of machine power, so I think users who did so were very special users.)
  • The goal is to get it to work correctly within the target resource range and at a reasonable speed. Algorithm improvements aimed at speed are not included in this version. However, it may include removing significant barriers that interfere with the Goal, or fixing bugs.

Dummy data

10 songs with random titles per album.

10 such albums per artist.

100 such artists in your music folder. i.e. 10,000 songs per music folder.

Make a lot of such music folders.
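
The layout described above can be sketched in Python as follows. This is only an illustration of the directory structure, not the tool actually used for this test: the function name, the `artistNNN`/`albumNN` naming, and the empty `.flac` placeholder files are all assumptions. Real test data needs valid tags written by a tagging tool, which is outside this sketch.

```python
import random
import string
from pathlib import Path

# Counts taken from the description above: 10 x 10 x 100 = 10,000 songs per folder.
SONGS_PER_ALBUM = 10
ALBUMS_PER_ARTIST = 10
ARTISTS_PER_FOLDER = 100


def random_title(rng, length=8):
    """Return a random lowercase title (hypothetical naming scheme)."""
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def build_music_folder(root, artists=ARTISTS_PER_FOLDER,
                       albums=ALBUMS_PER_ARTIST, songs=SONGS_PER_ALBUM, seed=0):
    """Create artist/album/song placeholders under root and return the song count."""
    rng = random.Random(seed)
    root = Path(root)
    count = 0
    for a in range(artists):
        for b in range(albums):
            album_dir = root / f"artist{a:03d}" / f"album{b:02d}"
            album_dir.mkdir(parents=True, exist_ok=True)
            for s in range(songs):
                # The index prefix keeps names unique even if two titles collide.
                (album_dir / f"{s:02d}-{random_title(rng)}.flac").touch()
                count += 1
    return count
```

Calling `build_music_folder` once per target directory (for example `TestMusic/00`, `TestMusic/01`, and so on) would give one 10,000-song folder each.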

    volumes:
     - '/volume1/docker/jpsonic-data-sandbox:/jpsonic/data'
     - '/volume1/Music:/jpsonic/music'
     - '/volume1/Podcasts:/jpsonic/podcasts'
     - '/volume1/Playlists:/jpsonic/playlists'
     - '/volume1/TestMusic/00:/jpsonic/TestMusic00'
     - '/volume1/TestMusic/01:/jpsonic/TestMusic01'
     - '/volume1/TestMusic/02:/jpsonic/TestMusic02'
     - '/volume1/TestMusic/03:/jpsonic/TestMusic03'
     - '/volume1/TestMusic/04:/jpsonic/TestMusic04'
     - '/volume1/TestMusic/05:/jpsonic/TestMusic05'
     - '/volume1/TestMusic/06:/jpsonic/TestMusic06'
     - '/volume1/TestMusic/07:/jpsonic/TestMusic07'
     - '/volume1/TestMusic/08:/jpsonic/TestMusic08'
     - '/volume1/TestMusic/09:/jpsonic/TestMusic09'
     - '/volume1/TestMusic/10:/jpsonic/TestMusic10'
     - '/volume1/TestMusic/11:/jpsonic/TestMusic11'
     - '/volume1/TestMusic/12:/jpsonic/TestMusic12'
     - '/volume1/TestMusic/13:/jpsonic/TestMusic13'
     - '/volume1/TestMusic/14:/jpsonic/TestMusic14'
     - '/volume1/TestMusic/15:/jpsonic/TestMusic15'
     - '/volume1/TestMusic/16:/jpsonic/TestMusic16'
     - '/volume1/TestMusic/17:/jpsonic/TestMusic17'
     - '/volume1/TestMusic/18:/jpsonic/TestMusic18'
     - '/volume1/TestMusic/19:/jpsonic/TestMusic19'
     - '/volume1/TestMusic/20:/jpsonic/TestMusic20'
・
・
・

image

When using 100,000 songs.

image

image

Procedure

Repeat the following steps 5 times without restarting Jpsonic. Simply put, register everything and delete everything, 5 times over. Each cycle runs 3 scans, so 15 scans are run in total.

  • Register directory
  • First scan
  • 2nd scan
  • Delete all directories, then run a scan (i.e. empty the data!)
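
The steps above can be sketched as a small driver loop. The three callbacks are hypothetical stand-ins for whatever actually registers folders, triggers a scan, and removes folders (in this test they were presumably performed through the Jpsonic UI); only the ordering and the scan count come from the procedure.

```python
def run_stress_cycles(register, scan, unregister_all, cycles=5):
    """Run register -> scan -> scan -> delete all -> scan, `cycles` times.

    Returns the total number of scans performed.
    """
    scans = 0
    for _ in range(cycles):
        register()        # register the music directories
        scan()            # first scan: everything is new
        scans += 1
        scan()            # second scan: nothing has changed
        scans += 1
        unregister_all()  # delete every registered directory
        scan()            # scan again to empty the data
        scans += 1
    return scans
```

With the default of 5 cycles this performs 15 scans, matching the count stated above.
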
Result

With song part size of 0

In a test using dummy data with a song part size of 0, no memory overflow was observed with the following settings.

Songs      Java heap     Docker memory
100,000    -Xmx512m      1g
200,000    -Xmx1024m     1.5g
300,000    -Xmx1280m     1.75g
400,000    -Xmx1536m     2g

When the data has well-formed tags, as in this verification, the first scan of 100,000 songs completes in about 5 minutes and the second scan in about 1 minute. However, efficiency decreases somewhat as the number of songs increases.

The above parameters are not meant to be optimal values; they are simply values that do not cause a memory overflow. It will therefore take some time before we can officially announce the so-called "recommended values" that users want. Fixes that are expected to improve things are coming, and this verification will be repeated periodically in the future.

However, the recommended value for a library of 100,000 songs, which is the main target, will probably not change significantly. It is a nice round number, after all!

If the song part size is not 0

Create 100,000 dummy songs by writing tags, in the same way, to copies of a 34.7 MB FLAC file.

34.7 MB (36,418,181 bytes, playing time 5:09) * 100,000 ≒ 3.64 TB (about 3.31 TiB)

Ah. It takes about 5 hours to create the data, and the 4 TB hard disk purchased for testing screamed (because the NAS OS area also lives on it).
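
The size estimate is easy to check from the exact byte count. The sketch below assumes the "4TB" disk is a decimal 4 × 10^12 bytes, as drive capacities are usually quoted; the variable names are illustrative.

```python
SONG_BYTES = 36_418_181     # size of one dummy FLAC file, from the text above
SONG_COUNT = 100_000
DISK_BYTES = 4 * 10**12     # assumption: "4TB" in decimal units

total_bytes = SONG_BYTES * SONG_COUNT
total_tb = total_bytes / 10**12     # decimal terabytes
total_tib = total_bytes / 2**40     # binary tebibytes
disk_share = total_bytes / DISK_BYTES

print(f"{total_tb:.2f} TB = {total_tib:.2f} TiB, {disk_share:.0%} of the disk")
```

The dummy data alone takes up most of the disk before the OS area is counted, which is consistent with the disk running out of room.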

With this in mind, the "Procedure" was repeated 5 times each for dummy data with song parts and for dummy data without song parts.

  • Even with the same 100,000 songs, a scan takes 7 minutes when the song data part has size 0, and 26 minutes otherwise.
  • There is no impact on memory usage. Only the speed differs.

Good harvest.

Summary

Original purpose achieved

  • At least with Jpsonic, as long as Java and Docker are each given appropriate settings, there should be no memory overflow during scanning. Therefore, we will provide a production.yml with settings such as Java: 512m and Docker: 1g, which can run in a standard DS220+ environment. With a 4 TB NAS, a library of 100,000 FLAC songs of about 5 minutes each is a manageable size.
  • Even with 400,000 songs, an overflow will not happen if enough memory is available. However, it is difficult to say more at this stage. No dedicated strategy for extremely large libraries has been implemented so far. This will be improved in the future.

Issue extraction

An ad hoc verification procedure was set up for this purpose. It produced three main suggestions for improvement.

    1. IO speed improvement. If anything, this is a JDK topic. Although the CPU and memory load are the same, processing time varies greatly depending on whether song parts are present, which suggests a problem in IO processing. RandomAccessFile is currently used when reading tags, and alternatives such as SeekableByteChannel should be tried. Reducing a 26 minute scan to something close to 7 minutes would be a huge advantage. It would also have a very wide impact, since slower hard disks are affected more. However, this lies within the parser library's implementation, not Jpsonic's.
    2. SQL improvements. I am trying not to overcomplicate the SQL at the moment, but if speed is the goal, there may be room for improvement. As the number of songs increases, the joins can become enormous.
    3. Improve indexing speed and fix web pages. Even if a scan by itself does not cause a memory overflow, index generation can combine with a running scan and cause a catastrophe. In that respect, index generation is a weak feature, and improving it is unavoidable. It is also known to be one of the causes of high load at login.

These are in order of priority. (Resolving item 1 first lets us erase the large dummy data, and we can then add data with a song part size of 0 in the freed space.)

We should not parallelize scans until these are resolved. Doing so would only drive up resource consumption pointlessly. If anything, it is better not to parallelize on the DS220+, because the memory required to avoid battling the GC would jump.

However, the verifications that should be done in v112.1.0 have been completed, so this matter is closed. The topic of speed improvement will be after v112.2.0.

@tesshucom tesshucom added in: test Issues in the test module or test package in: docker Issues in the test module. for : ported-from-airsonic Known issue resolved after airsonic is closed labels Nov 2, 2022
@tesshucom tesshucom added this to the Near future milestone Nov 2, 2022
@tesshucom tesshucom self-assigned this Nov 2, 2022
@tesshucom tesshucom added the type: task A general task label Nov 4, 2022
@tesshucom tesshucom modified the milestones: Near future, jpsonic 112.1.0 May 5, 2023
@tesshucom tesshucom mentioned this issue May 7, 2023
16 tasks
@tesshucom tesshucom changed the title Need a Docker stress test Docker stress test with Synology DS220+ May 15, 2023
@tesshucom tesshucom linked a pull request May 18, 2023 that will close this issue
@tesshucom tesshucom mentioned this issue Jul 1, 2023
4 tasks
@tesshucom tesshucom mentioned this issue Jul 20, 2023
12 tasks
@tesshucom tesshucom changed the title Docker stress test with Synology DS220+ Docker stress test with Synology DS220+ (v112.1) Aug 15, 2023
@tesshucom tesshucom added type: investigation Investigation required. If it's not a bug it will be closed. in: data-scan Issues especially related to scan. and removed type: task A general task labels Aug 16, 2023