
Problem: "cleanup names" produces too much output #1118

Closed
jrwdunham opened this issue Jun 4, 2018 · 7 comments
Labels
Columbia University Library · CUL: phase 1 · Status: invalid (a reported issue that isn't reproducible or was caused by user error/improper installation/etc.) · Type: bug (a flaw in the code that causes the software to produce an incorrect or unexpected result)

Comments

@jrwdunham
Contributor

Problem

The "cleanup names" micro-service runs once per package and will print at least one line to stdout and stderr for each file in the package. For very large transfers/SIPs, this can mean many MBs of output that must be stored in the Tasks table. This leads to a bloated Tasks table and, worse, can lead to dropped connections to the MySQL server, which Archivematica's MCP Server does not handle gracefully: the entire daemon process exits. This causes all processing to halt and the user must restart the Archivematica services and restart the transfer.
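The bloat can be measured directly in the Tasks table. A query along these lines (the stdOut/stdError columns are the ones queried later in this thread; the jobuuid grouping column is an assumption and may need adjusting to the actual schema) shows which jobs have accumulated the most captured output:

```sql
SELECT jobuuid,
       SUM(LENGTH(stdOut) + LENGTH(stdError)) AS output_bytes
FROM Tasks
GROUP BY jobuuid
ORDER BY output_bytes DESC
LIMIT 10;
```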

Workaround

The max_allowed_packet setting in MySQL defaults to 1MB. Setting a higher value for the max_allowed_packet variable in the MySQL config file provides an immediate workaround for this issue. Based on tests with a transfer containing 3,174 files, the largest packet created was just under 6MB. For a desired transfer size of 21,000 files, it is conceivable that the current version of Archivematica could attempt to construct a packet up to 7 times larger (roughly 35MB). Tests ran successfully with the value set to 32MB. It is probably safe to set it as high as 128MB, but it may be better to stick with a value like 32MB or 64MB and also consider the longer-term solutions listed below.
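The sizing estimate above is back-of-envelope arithmetic; a quick sketch, assuming packet size scales linearly with file count (the 3,174-file transfer and ~6MB packet are the measurements reported above):

```python
# Back-of-envelope estimate of max_allowed_packet headroom, scaling the
# measured packet size linearly with file count (an assumption).
MEASURED_FILES = 3174      # files in the test transfer
MEASURED_PACKET_MB = 6     # largest packet observed, just under 6 MB
TARGET_FILES = 21000       # desired transfer size

bytes_per_file_mb = MEASURED_PACKET_MB / MEASURED_FILES
estimated_packet_mb = bytes_per_file_mb * TARGET_FILES
print(f"Estimated packet: {estimated_packet_mb:.0f} MB")  # roughly 40 MB
```

Linear scaling lands around 40MB, the same order of magnitude as the ~35MB estimate above, which is why 32MB was treated as a floor rather than a generous ceiling.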

The steps we followed to implement this workaround were:

  1. Stop services.

  2. Open (as root) the MySQL config file at /etc/my.cnf.d/server.conf for editing (we used vi).

  3. Add the following entries to the section labelled [mariadb]:

    [mariadb]
    innodb_use_sys_malloc = 0
    innodb_log_file_size = 32M
    innodb_log_files_in_group = 4
    innodb_buffer_pool_size = 512M
    innodb_log_buffer_size = 2M
    innodb_file_per_table = 1
    join_buffer_size = 256K
    thread_cache_size = 8
    query_cache_size = 0
    query_cache_type = 0
    query_cache_limit = 2M
    max_allowed_packet = 32M
    

    Note that only the last variable (max_allowed_packet) has to be added. The others are MySQL tuning parameters recommended by mysqltuner.pl (https://github.com/major/MySQLTuner-perl/); these additional changes did not significantly affect our test results.

  4. Restart services.
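Once services are back up, the new value can be confirmed from the MySQL client (SHOW VARIABLES reports the value in bytes; 32M is 33554432):

```sql
SHOW VARIABLES LIKE 'max_allowed_packet';
```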

Longer term solutions

  1. Upgrade to 1.7.1 and disable output capture.
  2. New development work to reduce the output generated by the most verbose preservation tasks, such as "clean up names" and "characterize and extract metadata".
@jrwdunham jrwdunham added the Type: bug A flaw in the code that causes the software to produce an incorrect or unexpected result. label Jun 4, 2018
@jrwdunham jrwdunham self-assigned this Jun 4, 2018
@ross-spencer
Contributor

@jrwdunham I have seen this too. In case it's useful: for a single object nested inside two folders, we get 15 lines of logging in the Transfer and Ingest records, respectively. Working with another client at the moment, we are considering using the filenamecleanup.log as a method of correcting the METS record where required, so it's a useful tool to have. Less useful, perhaps, is the additional logging that says no name sanitization happened.

@J4bbi

J4bbi commented Oct 10, 2018

A related issue was causing "characterize and extract metadata" job to hang indefinitely.

The extracted FITS metadata was 2.1MB for a particular transfer, and the MCPServer log would record an OperationalError: (2006, 'MySQL server has gone away') without failing.

After setting max_allowed_packet = 32M and restarting services, the transfer went through without a hitch.
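The "MySQL server has gone away" failure mode described above is the kind of error a worker could retry rather than die on. A minimal, library-agnostic sketch of that idea (the OperationalError class, error code 2006, and the reconnect callback are stand-ins for whatever MySQL driver is in use, e.g. MySQLdb or pymysql):

```python
import time

class OperationalError(Exception):
    """Stand-in for the DB driver's OperationalError exception."""
    def __init__(self, code, msg):
        super().__init__(code, msg)
        self.code = code

def run_with_retry(operation, reconnect, retries=3, delay=0.1):
    """Run a DB operation, reconnecting on 'server has gone away' (2006)."""
    for attempt in range(retries):
        try:
            return operation()
        except OperationalError as err:
            if err.code != 2006 or attempt == retries - 1:
                raise  # not a dropped connection, or out of retries
            reconnect()        # re-establish the connection and retry
            time.sleep(delay)

# Demo with a flaky operation that fails once, then succeeds.
state = {"calls": 0}
def flaky_insert():
    state["calls"] += 1
    if state["calls"] == 1:
        raise OperationalError(2006, "MySQL server has gone away")
    return "ok"

print(run_with_retry(flaky_insert, reconnect=lambda: None))  # prints: ok
```

This only papers over the symptom, of course; raising max_allowed_packet and reducing the captured output remain the actual fixes discussed in this thread.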

@ross-spencer
Contributor

Related to archivematica/Issues#956

@ross-spencer
Contributor

I recently analysed the output changes made for 1.10 as part of an analysis ticket for UoW and we can see the following results:

1.9 amount of data for 10,116 files (all files renamed):

mysql> select LENGTH(stdOut) from Tasks 
       where taskUUID 
       like "c3fd658f-0ae8-469a-8e40-c20c1ad5af1f";
+----------------+
| LENGTH(stdOut) |
+----------------+
|        1782732 |
+----------------+
1 row in set (0.00 sec)

Total: 1.7MB

And stderr:

mysql> select LENGTH(stdError) from Tasks 
       where taskUUID 
       like "c3fd658f-0ae8-469a-8e40-c20c1ad5af1f";
+------------------+
| LENGTH(stdError) |
+------------------+
|         11372184 |
+------------------+
1 row in set (0.00 sec)

Total: 11.3MB

1.10 same data-set:

mysql> select LENGTH(stdOut) from Tasks 
       where taskUUID 
       like "d1195897-2230-4242-a0e1-35584eb7b607";
+----------------+
| LENGTH(stdOut) |
+----------------+
|        3779475 |
+----------------+
1 row in set (0.00 sec)

Total: 3.7MB

Stderr:

NB. Stderr was removed from this microservice in 1.10.

@sromkey @sevein given the reduction in verbosity with 1.10 noted above I'd like to propose we close this ticket in favor of related tickets (e.g. increasing SQL limits, or disabling FITS by default) and then creating more specific issues around high-volume output microservices as those issues arise. This ticket could sit open for a while unless there's a specific target we'd like to reach with it.

Does that sound okay?

@sromkey
Contributor

sromkey commented Nov 6, 2019

That sounds like a good plan!

@sromkey
Contributor

sromkey commented Nov 6, 2019

@sevein you?

@sevein
Member

sevein commented Nov 6, 2019

Sounds good.

@sromkey sromkey added the Status: invalid A reported issue that isn't reproduceable or was caused by user error/improper installation/etc. label Nov 6, 2019
@sromkey sromkey closed this as completed Nov 6, 2019