
Delete copies of CONUS hourly data on object storage #417

Open
rsignell opened this issue Dec 1, 2023 · 4 comments

Comments

@rsignell
Contributor

rsignell commented Dec 1, 2023

@amsnyder, @alaws-USGS and hytest crew:

As described in more detail in this notebook, there are currently 3 copies of the CONUS404 hourly data on object storage:

  • AWS S3 nhgf-development
  • RENCI Pod rsignellbucket2
  • USGS Pod hytest

Since this data also exists on caldera, I suggest we delete the copy on AWS S3 nhgf-development (saving USGS about $1600/month) and the copy on the RENCI pod rsignellbucket2, an allocation that was meant to be used until the USGS pod was acquired.

Any objections?
If not, I will delete the dataset from the RENCI pod rsignellbucket2, freeing up the space for other use.

Someone with permissions for nhgf-development would need to delete the copy there.

@rsignell
Contributor Author

rsignell commented Dec 2, 2023

I'm currently deleting the version on rsignellbucket2 with this command:

rclone delete osn-rsignellbucket2:rsignellbucket2/hytest/conus404/conus404_hourly_202302.zarr --disable ListR --rmdirs --checkers 34

It's deleting at a rate of about 6 TB per hour.

@rsignell
Contributor Author

rsignell commented Dec 2, 2023

I forgot, there is one more copy! Before we had the hytest bucket on USGS pod 1, we had the usgspod-testbucket to see if the pod was working okay. And that copy is still there:

(rclone)[rsignell@pn124 ~]$ rclone size osn-usgspod-testbucket:usgspod-testbucket/hytest/conus404/conus404_hourly_202302.zarr --checkers 34 --disable ListR
Total objects: 10.915M (10914729)
Total size: 70.266 TiB (77257934739130 Byte)
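
For a rough sense of scale, the figures above can be turned into an average object size and an estimated deletion time. This is only a sketch: it assumes the roughly 6 TB per hour rate reported earlier in the thread for rsignellbucket2 also holds for this bucket, and that "TB" there means decimal terabytes. The function name `summarize` is hypothetical.

```python
# Figures copied from the `rclone size` output above.
TOTAL_OBJECTS = 10_914_729
TOTAL_BYTES = 77_257_934_739_130

# Assumption: the ~6 TB/hour rate seen on rsignellbucket2 also applies here,
# with "TB" read as decimal terabytes (10**12 bytes).
ASSUMED_RATE_TB_PER_HOUR = 6.0


def summarize(total_bytes, total_objects, rate_tb_per_hour):
    """Return (size in TiB, size in TB, mean object size in MB, hours to delete)."""
    tib = total_bytes / 2**40           # binary units, as rclone reports
    tb = total_bytes / 10**12           # decimal units, to match the assumed rate
    mean_object_mb = total_bytes / total_objects / 10**6
    hours = tb / rate_tb_per_hour
    return tib, tb, mean_object_mb, hours


tib, tb, mean_object_mb, hours = summarize(
    TOTAL_BYTES, TOTAL_OBJECTS, ASSUMED_RATE_TB_PER_HOUR
)
print(f"size: {tib:.3f} TiB ({tb:.2f} TB)")
print(f"mean object size: {mean_object_mb:.2f} MB")
print(f"estimated deletion time: {hours:.1f} hours")
```

Under those assumptions the deletion would take on the order of 13 hours, with the store made up of objects averaging about 7 MB each. The actual rate may differ, since throughput on a delete is likely driven more by object count than by bytes.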

I have defined the rclone remote for this bucket in ~/.config/rclone/rclone.conf as:

[osn-usgspod-testbucket]
type = s3
provider = Ceph
access_key_id = <osn key for this bucket here>
secret_access_key = <osn secret key for this bucket here>
endpoint = https://usgs.osn.mghpcc.org
no_check_bucket = true
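
Before starting a long-running job against a remote like this, it can help to confirm that the remote is actually defined and complete. The following is a minimal sketch, assuming the rclone config is stored unencrypted in its usual INI layout so that Python's standard `configparser` can read it. The helper `missing_remote_keys` and the `REQUIRED_KEYS` list are hypothetical, chosen to mirror the fields shown above; they are not part of rclone.

```python
import configparser

# Assumption: these are the keys we want present, based on the remote shown above.
REQUIRED_KEYS = ("type", "provider", "access_key_id", "secret_access_key", "endpoint")


def missing_remote_keys(config_text, remote):
    """Return the required keys that are absent or empty for the given remote."""
    # Disable interpolation so a '%' in a secret is not treated specially.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    if not parser.has_section(remote):
        return list(REQUIRED_KEYS)
    return [
        key for key in REQUIRED_KEYS
        if not parser.get(remote, key, fallback="").strip()
    ]


# Sample text with placeholder credentials, mirroring the config above.
sample = """\
[osn-usgspod-testbucket]
type = s3
provider = Ceph
access_key_id = PLACEHOLDER_KEY
secret_access_key = PLACEHOLDER_SECRET
endpoint = https://usgs.osn.mghpcc.org
no_check_bucket = true
"""

print(missing_remote_keys(sample, "osn-usgspod-testbucket"))
```

To check a real setup, the text could be read from `~/.config/rclone/rclone.conf` with `pathlib.Path.expanduser()` and `read_text()` and passed in place of `sample`.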

@amsnyder, shall I delete this as well, or would you like someone else to have the experience?

@amsnyder
Contributor

amsnyder commented Dec 8, 2023

You can go ahead and delete the copy in usgspod-testbucket.

@rsignell
Contributor Author

rsignell commented Dec 8, 2023

Okay, I'll fire off the batch job to remove it.


2 participants