Singularity
reduce S3 uploader mem usage #1639
Originally, we'd recurse through the directory tree to find files to upload. Based on the heap dump, the recursion depth could get large when many subdirectories existed and were checked. The biggest piece of memory seemed to be the lists of paths themselves.
The order in which we check the files shouldn't matter here; we just want to check every file within a dir. So this updates the logic to first gather all the file paths we need to check into a single list, then loop through each path to check the file. This removes the overhead of the recursive method calls and the copies of path references held at each level of the recursion.
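The iterative approach described above can be sketched roughly like this (a minimal standalone sketch, not the actual Singularity code; the class and method names here are made up for illustration). An explicit work queue replaces the call stack, so deeply nested directories don't grow recursion depth or duplicate path lists per level:

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class IterativeFileLister {
  // Collect all regular files under rootDir without recursion:
  // directories still to visit sit in an explicit deque, and each
  // directory's entries are streamed rather than copied into
  // per-level lists.
  public static List<Path> listFiles(Path rootDir) throws IOException {
    List<Path> files = new ArrayList<>();
    Deque<Path> toCheck = new ArrayDeque<>();
    toCheck.push(rootDir);
    while (!toCheck.isEmpty()) {
      Path dir = toCheck.pop();
      try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
        for (Path entry : stream) {
          if (Files.isDirectory(entry)) {
            toCheck.push(entry); // visit subdirectory later
          } else {
            files.add(entry);
          }
        }
      }
    }
    return files;
  }
}
```

Using `DirectoryStream` keeps only one open directory listing at a time, so peak memory is roughly the deque of pending directories plus the final flat list of files.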
However, it's possible that the bigger problem is the number of Paths we have in the toUpload list, unrelated to how it's handled within the recursion. One solution to that would be to upload in limited batch sizes (e.g. 100 items at a time). Let me know your thoughts.
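If batching turns out to be the right fix, a simple way to cap how many items are handled at once is to partition the list of pending uploads into fixed-size slices. This is just a generic sketch of that idea (the name `BatchUploader` and the batch size are illustrative, not from the PR):

```java
import java.util.ArrayList;
import java.util.List;

public class BatchUploader {
  // Split items into consecutive batches of at most batchSize elements,
  // so the uploader only ever works on one bounded slice at a time.
  // Note: subList returns views into the original list, not copies.
  public static <T> List<List<T>> partition(List<T> items, int batchSize) {
    List<List<T>> batches = new ArrayList<>();
    for (int i = 0; i < items.size(); i += batchSize) {
      batches.add(items.subList(i, Math.min(i + batchSize, items.size())));
    }
    return batches;
  }
}
```

Because `subList` produces views rather than copies, partitioning itself adds almost no memory overhead; the win would come from processing (and then releasing) one batch of uploads before moving to the next.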