more efficient use of disk space on exports #3

Closed
gonzoearth opened this issue May 26, 2014 · 10 comments

@gonzoearth

No description provided.

@gonzoearth gonzoearth added the bug label May 26, 2014
@jywarren
Member

Fixed, thanks. It was disk space.
On May 26, 2014 4:03 PM, "stewart long" notifications@github.com wrote:

[image: screen shot 2014-05-26 at 1 01 03 pm]https://cloud.githubusercontent.com/assets/1051020/3085409/d73706d8-e510-11e3-884e-c51a6752bf06.png
[image: screen shot 2014-05-26 at 1 00 41 pm]https://cloud.githubusercontent.com/assets/1051020/3085410/d7476942-e510-11e3-94fd-f5766f64361a.png


@gonzoearth gonzoearth reopened this May 26, 2014
@jywarren jywarren changed the title from "MapKnitter.org brought down with large export" to "more efficient use of disk space on exports" May 29, 2014
@jywarren
Member

Stu, I'm not sure how there could have been image placement work lost; perhaps you had been working on the map for 2.5 hours while the server was already down, but you hadn't refreshed your local browser window? Have you seen this happen again? If you do, could you create a new issue for it?

I'm sorry you lost 2.5 hours of work, that sucks. I've been inundated with feature requests, bug reports, etc on Infragram and PublicLab.org, while also trying to get this big tag migration going and a new test server. I've blocked out some time right now to make MapKnitter more efficient on disk space usage, but it is so huge right now that we're going to have to move it onto a larger disk very soon.

@jywarren
Member

taking notes here; going through https://github.com/publiclab/mapknitter/blob/master/app/models/warpable.rb#L119 to find temp files from the export we can delete:

  • believe everything in -working/ can be deleted; this is just so we can use the files locally outside of s3
  • believe everything -masked.png can be deleted
  • believe everything -mask.png can be deleted
  • everything -geo WITH AN ID could be deleted, but there is a feature request to preserve these by Don, though I can't find it and it might've been solved a different way
  • actually, everything ending in .png can be deleted -- this is just masks and local warped & masked files

This should win us a lot of space.

  • first i'm going to try manually deleting them on a test map
  • then will attempt to hardcode auto-cleanup in
  • then run it on the test map
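
As a sketch of the "try it manually on a test map first" step, the shell function below only lists the candidates from the notes above and deletes nothing. The name list_export_temp_files is made up for illustration. The layout it assumes (per-map folders under a warps directory, with <name>-working directories alongside them) is inferred from the commands discussed later in this thread, not taken from the MapKnitter code.

```shell
#!/bin/sh
# Dry run: print export temp files without deleting anything.
# ASSUMPTION: list_export_temp_files is a hypothetical helper, and the
# directory layout is inferred from the thread, not from warpable.rb.
list_export_temp_files() {
  warps_dir="$1"
  # <name>-working directories: local copies of files that already live in S3
  find "$warps_dir" -mindepth 1 -maxdepth 1 -type d -name '*-working'
  # PNGs directly inside a map folder: masks and local warped/masked images
  find "$warps_dir" -mindepth 2 -maxdepth 2 -type f -name '*.png'
}
```

Running list_export_temp_files public/warps and reading the output before any rm is one way to check that no GeoTIFFs or source images would be caught.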

@btbonval
Member

Would it be worth trying to offload the temporary files to an S3 bucket?
storage on S3 is cheap. 5 GiB ~= 2 cents.

We could create a new bucket for a new task, name it based on the task, and
then delete the bucket when the temporary utility has concluded.

We'll never run out of disk space, but you might get charged a few quarters
or something for each export.
-Bryan

@jywarren
Member

The source files are already stored in S3; in fact, the first operation of an export is to copy them to a local folder so they can be manipulated. I think we can be very aggressive here without losing anything. Something like rm warps/*/*.png should be fine, and also rm -r warps/*-working/, since there are no maps whose names end in -working.

@jywarren
Member

I'm going to start by deleting all -working directories and PNGs, and leave the -geo.tif files, since they're used later in the export process. We can take this in stages.
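
A minimal sketch of this first stage, assuming the same warps layout as the rm commands above. The function name cleanup_stage_one is hypothetical, and this is not the cleanup that was added to the export code. It removes the -working directories and the PNGs and leaves every .tif in place.

```shell
#!/bin/sh
# Stage one: remove -working directories and PNGs, keep the GeoTIFFs.
# ASSUMPTION: cleanup_stage_one is a hypothetical name; pass the warps path in.
cleanup_stage_one() {
  warps_dir="$1"
  # Fail early if the directory does not exist.
  [ -d "$warps_dir" ] || { echo "no such directory: $warps_dir" >&2; return 1; }
  # Same targets as: rm -r warps/*-working/
  find "$warps_dir" -mindepth 1 -maxdepth 1 -type d -name '*-working' -exec rm -r {} +
  # Same targets as: rm warps/*/*.png
  find "$warps_dir" -mindepth 2 -maxdepth 2 -type f -name '*.png' -exec rm {} +
}
```

Using find with -exec ... + instead of a bare glob avoids "argument list too long" errors when a warps directory holds a very large number of files.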

@jywarren
Member

tested and running... doing bulk deletions now

148gb available after deleting pngs

@jywarren
Member

warren@tycho:/sites/mapknitter.org$ du --max-depth 1 -h public/warps/mestia/
2.7G public/warps/mestia/
warren@tycho:/sites/mapknitter.org$ du --max-depth 1 -h public/warps/space-test/
7.0M public/warps/space-test/

So huge maps like http://mapknitter.org/map/view/mestia still have 2.7gb of geotiffs, but we've won a lot of space overall.
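
To see which maps still hold the most space after a cleanup like this, the per-map du checks above can be rolled into one ranking. This is only a sketch: largest_maps is a hypothetical helper, and it reports sizes in KiB (du -sk) so that a plain numeric sort works.

```shell
#!/bin/sh
# Print the N largest map folders under a warps directory, biggest first.
# ASSUMPTION: largest_maps is a hypothetical helper, not part of MapKnitter.
largest_maps() {
  warps_dir="$1"
  count="${2:-10}"   # how many folders to show; defaults to 10
  # One size line per map folder, in KiB, sorted numerically in descending order
  du -sk "$warps_dir"/*/ 2>/dev/null | sort -rn | head -n "$count"
}
```

For example, largest_maps public/warps 5 would print the five largest map folders.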

@jywarren
Member

after the working directories were deleted, we now have 162gb of space. And the exports should now clean up after themselves.

@jywarren
Member

If we need more later, we can delete all /public/warps/<mapname>/<warpable_id>-geo.tif

otherwise closing for now!
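
If that later stage is ever needed, one cautious way to do it is sketched below. It assumes warpable ids are purely numeric, so any other *-geo.tif in a map folder is left alone. The name prune_warpable_geotiffs and the --delete flag are made up for this example, and the function only prints what it would remove unless --delete is passed.

```shell
#!/bin/sh
# Remove per-warpable GeoTIFFs (<warpable_id>-geo.tif) under a warps directory.
# ASSUMPTIONS: warpable ids are purely numeric; the function name and the
# --delete flag are hypothetical. Without --delete this is a dry run.
prune_warpable_geotiffs() {
  warps_dir="$1"
  mode="${2:-}"
  for f in "$warps_dir"/*/*-geo.tif; do
    [ -f "$f" ] || continue            # glob matched nothing
    id=$(basename "$f" -geo.tif)       # strip the suffix to get the id part
    case "$id" in
      ''|*[!0-9]*) continue ;;         # not purely numeric: keep the file
    esac
    if [ "$mode" = "--delete" ]; then
      rm -- "$f"
    else
      printf 'would delete %s\n' "$f"
    fi
  done
}
```

Calling prune_warpable_geotiffs public/warps lists the candidates; adding --delete as a second argument removes them.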
