Two-way sync between Google Cloud Storage (the S3 equivalent on GCP) and a local machine. There are probably other tools that do similar things; I'm doing this for fun.
This is an infinite loop that 1) syncs data between the local machine and GCS, then 2) sleeps 30 seconds.
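A minimal sketch of that loop; `syncOnce` here is a hypothetical stand-in for the real bidirectional pass, not the actual code:

```go
package main

import (
	"context"
	"log"
	"time"
)

// syncOnce is a stand-in for one bidirectional sync pass
// (list local files, list GCS objects, reconcile the two).
func syncOnce(ctx context.Context) error {
	return nil
}

func main() {
	ctx := context.Background()
	ticker := time.NewTicker(30 * time.Second)
	defer ticker.Stop()
	for {
		if err := syncOnce(ctx); err != nil {
			log.Printf("sync pass failed, retrying next tick: %v", err)
		}
		<-ticker.C // step 2: sleep ~30 seconds between passes
	}
}
```

This works more or less, but there are at least a few things to improve: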
- While making GCP API calls once every 30 seconds is probably fine, polling like this feels wrong. I don't have a good explanation for why yet.
- No pagination. I think the GCP SDK I use takes care of that, but if the directory is large I will hold the entire listing in memory. Is this okay? (See the listing sketch after this list.)
- When the sync process starts fresh, it plays it safe, so files removed while it was down get added back. I can save the last scan state on disk to avoid this (see the state-snapshot sketch below).
- No unit or integration tests. The only testing I did was to sync this repo itself to GCS using the code here:
  go run *.go -remote=gs://<my gcp bucket>/cloudsync -local=$PWD
- Support gitignore. E.g., in my testing I would have liked to skip the .git and .idea directories (see the directory-walk sketch below).
- Support trash, i.e. move deleted files aside instead of deleting them outright.
- Support recovering from an earlier state. Is it possible to give a simple experience? Something like "give me the state of things as of <time> from the cloud". Maybe that's too much.
- Should we do blobstore operations in parallel? Should we do local file operations in parallel? (See the errgroup sketch after this list.)
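On pagination: the cloud.google.com/go/storage client exposes listing as an iterator that fetches pages lazily, so objects can be processed one at a time instead of materializing the whole listing. A sketch; the bucket name and prefix are placeholders, not my real config:

```go
package main

import (
	"context"
	"fmt"
	"log"

	"cloud.google.com/go/storage"
	"google.golang.org/api/iterator"
)

func listObjects(ctx context.Context, bucket string) error {
	client, err := storage.NewClient(ctx)
	if err != nil {
		return err
	}
	defer client.Close()

	// Objects returns an iterator; Next fetches pages under the hood,
	// so memory use stays bounded even for large buckets.
	it := client.Bucket(bucket).Objects(ctx, &storage.Query{Prefix: "cloudsync/"})
	for {
		attrs, err := it.Next()
		if err == iterator.Done {
			return nil
		}
		if err != nil {
			return err
		}
		fmt.Println(attrs.Name, attrs.Size)
	}
}

func main() {
	// "my-bucket" is a placeholder bucket name.
	if err := listObjects(context.Background(), "my-bucket"); err != nil {
		log.Fatal(err)
	}
}
```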
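On remembering deletions: one option is to snapshot the last observed state to disk as JSON, so after a restart a file that exists remotely but is absent both locally and from the snapshot can be treated as locally deleted rather than re-downloaded. The type shape and file layout below are assumptions for illustration:

```go
package cloudsync

import (
	"encoding/json"
	"os"
)

// ScanState records what the last completed sync pass saw.
// Paths map to modification times (Unix seconds); the shape is illustrative.
type ScanState struct {
	Files map[string]int64 `json:"files"`
}

func saveState(path string, s *ScanState) error {
	data, err := json.MarshalIndent(s, "", "  ")
	if err != nil {
		return err
	}
	// Write to a temp file and rename, so a crash never leaves a torn state file.
	tmp := path + ".tmp"
	if err := os.WriteFile(tmp, data, 0o644); err != nil {
		return err
	}
	return os.Rename(tmp, path)
}

func loadState(path string) (*ScanState, error) {
	data, err := os.ReadFile(path)
	if os.IsNotExist(err) {
		return &ScanState{Files: map[string]int64{}}, nil // first run: empty state
	}
	if err != nil {
		return nil, err
	}
	var s ScanState
	if err := json.Unmarshal(data, &s); err != nil {
		return nil, err
	}
	return &s, nil
}
```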
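On skipping .git and .idea: full gitignore semantics would need a matcher, but a hard-coded skip list during the local walk covers my test case. A sketch; skipDirs is a stand-in for real gitignore matching:

```go
package main

import (
	"fmt"
	"io/fs"
	"log"
	"path/filepath"
)

// skipDirs is a stand-in for real gitignore matching.
var skipDirs = map[string]bool{".git": true, ".idea": true}

func walkLocal(root string) error {
	return filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.IsDir() && skipDirs[d.Name()] {
			return filepath.SkipDir // prune the whole subtree
		}
		if !d.IsDir() {
			fmt.Println(path)
		}
		return nil
	})
}

func main() {
	if err := walkLocal("."); err != nil {
		log.Fatal(err)
	}
}
```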
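On parallelism: golang.org/x/sync/errgroup with a concurrency limit is a common pattern for fanning out uploads or downloads without unbounded goroutines. A sketch; uploadOne and the limit of 8 are assumptions, not measured choices:

```go
package main

import (
	"context"
	"fmt"
	"log"

	"golang.org/x/sync/errgroup"
)

// uploadOne is a placeholder for a single blobstore upload.
func uploadOne(ctx context.Context, path string) error {
	fmt.Println("uploading", path)
	return nil
}

func uploadAll(ctx context.Context, paths []string) error {
	g, ctx := errgroup.WithContext(ctx)
	g.SetLimit(8) // at most 8 in-flight operations
	for _, p := range paths {
		p := p // capture loop variable (needed before Go 1.22)
		g.Go(func() error { return uploadOne(ctx, p) })
	}
	return g.Wait() // returns the first error, if any
}

func main() {
	if err := uploadAll(context.Background(), []string{"a.txt", "b.txt"}); err != nil {
		log.Fatal(err)
	}
}
```

The same pattern would work for local file operations, though disks tend to benefit from a much smaller limit than network calls.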
There might be more things to do.