Problem: currentlyProcessing is not empty after transfer is done #1113

Closed
5 tasks
jorikvankemenade opened this issue Feb 24, 2020 · 9 comments
Labels
Picturae Severity: low An inconvenient situation where the software is usable but could be better. Status: review The issue's code has been merged and is ready for testing/review. Type: bug A flaw in the code that causes the software to produce an incorrect or unexpected result.

@jorikvankemenade

Expected behaviour
When a transfer has completed successfully, all traces are removed from the currentlyProcessing directory.

Current behaviour
When a transfer has completed successfully, a folder containing some of the by-products remains in the currentlyProcessing directory. An example:

[root@archivematica-afs ~]# tree  /volume/sharedDirectory/currentlyProcessing/Images-aee6018d-ba65-4291-8aab-231309876148/
/volume/sharedDirectory/currentlyProcessing/Images-aee6018d-ba65-4291-8aab-231309876148/
├── logs
│   ├── bulk-0b9aa156-6e03-468a-95db-44e307e79306
│   │   ├── domain_histogram.txt
│   │   ├── domain.txt
│   │   ├── report.xml
│   │   ├── url_histogram.txt
│   │   ├── url_services.txt
│   │   └── url.txt
│   ├── bulk-3df51291-6e2d-4f3e-8294-4ec3b5a74ced
│   │   └── report.xml
│   ├── bulk-59f01e88-9d00-4e94-b4ce-73516bd65b01
│   │   ├── domain_histogram.txt
│   │   ├── domain.txt
│   │   └── report.xml
│   ├── bulk-5da098b7-616d-4bfe-a375-7bc4f82f94f0
│   │   ├── report.xml
│   │   ├── telephone_histogram.txt
│   │   └── telephone.txt
│   ├── bulk-6071c87a-186d-4c09-b881-3fe25b6e4d9f
│   │   ├── domain_histogram.txt
│   │   ├── domain.txt
│   │   └── report.xml
│   ├── bulk-82ca71d6-bbc5-45d9-96ae-f1041e85f297
│   │   ├── exif.txt
│   │   └── report.xml
│   ├── bulk-ca32e7ca-28e8-4230-8097-25165eacf5ac
│   │   ├── exif.txt
│   │   ├── report.xml
│   │   ├── telephone_histogram.txt
│   │   └── telephone.txt
│   ├── bulk-cf6b996e-7ee4-4885-8525-6fe281040658
│   │   └── report.xml
│   ├── bulk-e0d4f869-e190-4169-8fe1-f7a852ce2820
│   │   ├── domain_histogram.txt
│   │   ├── domain.txt
│   │   ├── report.xml
│   │   ├── url_histogram.txt
│   │   ├── url_services.txt
│   │   └── url.txt
│   ├── bulk-e816a377-360a-481d-b717-f994e738f06f
│   │   ├── exif.txt
│   │   └── report.xml
│   ├── fileFormatIdentification.log
│   ├── fileMeta
│   └── filenameCleanup.log
├── metadata
│   ├── directory_tree.txt
│   └── submissionDocumentation
│       └── METS.xml
├── objects
└── processingMCP.xml

Comparing the workflow of qa/1.x with that of 10.1, I see a difference. Is it possible that a step that should clean up this folder has accidentally been removed from the workflow?

qa: [workflow screenshot]

10.1: [workflow screenshot]

Steps to reproduce
Run any sample transfer and check the contents of the currentlyProcessing directory.
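
The check in this step can be scripted. The following is a minimal sketch, assuming leftover transfer directories are named `<name>-<UUID>` as in the example above. The function name `find_leftover_transfers` is hypothetical and not part of Archivematica.

```python
import re
from pathlib import Path

# Matches a trailing UUID such as "-aee6018d-ba65-4291-8aab-231309876148".
UUID_SUFFIX = re.compile(
    r"-[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$"
)


def find_leftover_transfers(currently_processing):
    """Return the sorted names of subdirectories whose name ends in a UUID.

    Hypothetical helper: the naming convention is an assumption taken
    from the directory listing in this issue.
    """
    root = Path(currently_processing)
    return sorted(
        entry.name
        for entry in root.iterdir()
        if entry.is_dir() and UUID_SUFFIX.search(entry.name)
    )
```

For the environment described here, the call would be `find_leftover_transfers("/volume/sharedDirectory/currentlyProcessing")`. A non-empty result after all transfers have finished suggests the problem is present.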

Your environment (version of Archivematica, operating system, other relevant details)
Latest qa/1.x branch


For Artefactual use:

Before you close this issue, you must check off the following:

  • All pull requests related to this issue are properly linked
  • All pull requests related to this issue have been merged
  • A testing plan for this issue has been implemented and passed (testing plan information should be included in the issue body or comments)
  • Documentation regarding this issue has been written and merged (if applicable)
  • Details about this issue have been added to the release notes (if applicable)
@sromkey sromkey added Severity: low An inconvenient situation where the software is usable but could be better. Type: bug A flaw in the code that causes the software to produce an incorrect or unexpected result. labels Feb 24, 2020
@sromkey
Contributor

sromkey commented Feb 24, 2020

@jorikvankemenade we've marked this as Low severity because, while we recognize it's an issue, it has never really come up as a major problem or a hit on performance. Please let us know if you have found otherwise.

@sromkey sromkey added the Status: refining The issue needs additional details to ensure that requirements are clear. label Feb 24, 2020
@jorikvankemenade
Author

@sromkey I understand the priority. At about 0.5 MB per transfer, it is not a big problem; even when running a lot of transfers it is manageable. I would suggest marking it as a regression, since it was introduced on the QA branch.
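
To put the per-transfer figure in perspective, the leftover size can be measured and extrapolated. This is a rough sketch only: the function names are hypothetical, and the 0.5 MB default is simply the estimate quoted above, not a measured constant.

```python
import os


def directory_size_bytes(path):
    """Sum the sizes of all regular files below path (symlinks are skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def estimated_leftover_mb(transfers, mb_per_transfer=0.5):
    """Extrapolate the accumulated leftovers for a number of transfers."""
    return transfers * mb_per_transfer
```

Under that estimate, `estimated_leftover_mb(10000)` gives 5000 MB, so the leftovers only become noticeable after many thousands of transfers.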

@jorikvankemenade
Author

I just did some digging. I am sure that the problem is the last step, "Move to SIP creation directory for completed transfers". This step moves the folder structure I just described to sharedDirectory/watchedDirectories/SIPCreation/completedTransfers/. I think this step was lost in the queuing work that was merged around September. So maybe we can fix this with the work we are doing for #1055 and #1108.

@jorikvankemenade
Author

jorikvankemenade commented Feb 25, 2020

The step I am talking about was removed in this PR. We might have to move this step back into the workflow, but I am not sure whether this has any consequences. The PR description also briefly mentions this problem, but does not really offer a solution to it.

Before fixing this, it might be good to consider whether the whole completed transfer folder is wanted or used. If it is not, then we need to make sure that this "extra" folder is deleted rather than moved.
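
The two options (move or delete) can be sketched as follows. This is an illustration only, not the Archivematica workflow code; `finish_transfer` and its parameters are hypothetical names.

```python
import shutil
from pathlib import Path


def finish_transfer(transfer_dir, completed_dir=None):
    """Move transfer_dir into completed_dir, or delete it if none is given.

    Hypothetical helper. Returns the new location, or None after deletion.
    """
    src = Path(transfer_dir)
    if completed_dir is None:
        # The folder is not wanted afterwards: remove it entirely.
        shutil.rmtree(src)
        return None
    # The folder is still wanted: relocate it, keeping its name.
    target_root = Path(completed_dir)
    target_root.mkdir(parents=True, exist_ok=True)
    dest = target_root / src.name
    shutil.move(str(src), str(dest))
    return dest
```

Either branch leaves nothing behind in the source directory; the difference is only whether the by-products stay available elsewhere.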

@FransPicturae

In our local environment we reverted the PR @jorikvankemenade mentioned, but the contents of currentlyProcessing are still not deleted. This bug is preventing new transfers to be started in our environment. For us this is high impact.

@jorikvankemenade
Author

jorikvankemenade commented Apr 28, 2020

This bug is preventing new transfers to be started in our environment.

Could you maybe clarify this a bit? How is not deleting the currentlyProcessing contents preventing new transfers from being started?

Or is the problem of transfers not being started new, and caused by reverting the PR? If so, I wouldn't be surprised. Reverting that PR changes the workflow, which can cause problems if the workflow then never hits a terminal link. See #1055 and this PR for some extra information on the termination of workflows.

@FransPicturae

By reverting the PR we investigated whether the currentlyProcessing directory would be empty after a successful transfer/ingest. When a transfer is done, the SIP is copied and gets a new UUID; the old one remains in the folder. We discovered that manually removing the files made it possible to start a new transfer, which was not possible using some custom MicroServices.

@ross-spencer
Contributor

There is also a comment on the forum about the correctness of the database at this stage in the transfer. It would be a good idea to look into it with some deeper analysis: https://groups.google.com/g/archivematica/c/f3C_2gdgY0U/m/PUBZbF3dAAAJ

@arthytrip

I can confirm the same behaviour observed by Jorik and FransPicturae.
I can also confirm that in the database the packages (Transfers table) all remain with currentLocation set to "%sharedPath%currentlyProcessing/...".
N.B. Please also fix the database. It is not true that this data is only transitory; it is essential for current management and the related reporting!
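
The database observation can be checked with a query along these lines. This is a sketch under stated assumptions: it uses `sqlite3` so that it is self-contained, whereas a real deployment would need its own database driver; the column name `transferUUID` is assumed, and only the `Transfers` table and `currentLocation` column come from the comment above. Because `%` is a LIKE wildcard, the literal `%sharedPath%` placeholder is escaped before matching.

```python
import sqlite3

# Literal prefix stored in currentLocation, as reported above.
PREFIX = "%sharedPath%currentlyProcessing/"


def like_escape(text, escape="!"):
    """Escape LIKE wildcards so that text is matched literally."""
    for ch in (escape, "%", "_"):
        text = text.replace(ch, escape + ch)
    return text


def stale_transfers(conn):
    """Return the UUIDs of transfers still located in currentlyProcessing.

    Assumes a Transfers table with transferUUID and currentLocation columns.
    """
    pattern = like_escape(PREFIX) + "%"  # trailing % is a real wildcard
    rows = conn.execute(
        "SELECT transferUUID FROM Transfers "
        "WHERE currentLocation LIKE ? ESCAPE '!' "
        "ORDER BY transferUUID",
        (pattern,),
    )
    return [row[0] for row in rows]
```

Without the escaping step, the pattern would also match any location that merely contains `sharedPath` and `currentlyProcessing/` somewhere in the string.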

@sromkey sromkey added this to the 1.13.0 milestone Feb 16, 2021
@sromkey sromkey added Status: ready The issue is sufficiently described/scoped to be picked up by a developer. and removed Status: refining The issue needs additional details to ensure that requirements are clear. labels Feb 23, 2021
sevein added a commit to artefactual/archivematica that referenced this issue Feb 26, 2021
This commit removes the original transfer directory that Archivematica
leaves under `sharedDirectory/currentlyProcessing`. It does not truncate
related database entries, that's being tackled as part of issue 1239.

Connects to archivematica/Issues#1239.
Connects to archivematica/Issues#1113.
@sevein sevein added Status: in progress Issue that is currently being worked on. and removed Status: ready The issue is sufficiently described/scoped to be picked up by a developer. labels Feb 26, 2021
@sevein sevein self-assigned this Feb 26, 2021
@sevein sevein added Status: review The issue's code has been merged and is ready for testing/review. and removed Status: in progress Issue that is currently being worked on. labels Mar 3, 2021
@sevein sevein removed their assignment Mar 3, 2021