s3 nippy example doesn't work #6

Closed
AdamClements opened this issue Apr 8, 2018 · 9 comments

Comments

@AdamClements

Nippy has low-level and high-level APIs, and data saved by freeze can't actually be read by thaw-from-in!. freeze/thaw is the high-level API: it supports encryption and compression and writes the nippy file header. thaw-from-in! and freeze-to-out! are low-level and support none of those. So when thaw-from-in! tries to read the type id and instead finds the first letter of the NPY header, it complains that it doesn't recognise the file.
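
To make the mismatch concrete, here is a minimal sketch that keeps each API paired with its own counterpart. It assumes com.taoensso/nippy is on the classpath and uses nippy's freeze-to-out! as the low-level writer; the namespace and var names are illustrative only.

```clojure
(ns example.nippy-apis
  ;; Assumption: com.taoensso/nippy is available on the classpath.
  (:require [taoensso.nippy :as nippy])
  (:import (java.io ByteArrayInputStream ByteArrayOutputStream
                    DataInputStream DataOutputStream)))

(def data {:a 1 :b [2 3]})

;; High-level pair: freeze writes the header, thaw expects to find it.
(def framed (nippy/freeze data))
(def high-level-roundtrip (nippy/thaw framed))

;; Low-level pair: writes and reads the bare encoding on a
;; DataOutput/DataInput, without the header that freeze adds.
(def raw
  (let [baos (ByteArrayOutputStream.)]
    (with-open [dos (DataOutputStream. baos)]
      (nippy/freeze-to-out! dos data))
    (.toByteArray baos)))

(def low-level-roundtrip
  (with-open [dis (DataInputStream. (ByteArrayInputStream. raw))]
    (nippy/thaw-from-in! dis)))
```

Both round trips should give back the original map. Feeding `framed` to thaw-from-in! is the combination that goes wrong, because the reader meets header bytes where it expects a type id.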

I also have an issue in that I get a 404 from s3 if the key doesn't exist yet and I don't know what to put in it to manually create a blank nippy file as a starting point.

@jimpil
Owner

jimpil commented Apr 10, 2018

Again, thanks for taking the time to report this, and for pointing out the low- vs high-level API of nippy. Have you tried something along these lines as the :read fn?

#(with-open [dis (DataInputStream. %)
             bos (ByteArrayOutputStream. (.available dis))]
   (io/copy dis bos)
   (nippy/thaw (.toByteArray bos)))

If that (or something similar) works for you let me know, and I will update the example asap. As to your other 404 problem, I'm not sure I can help here as duratom simply delegates to amazonica for creating/deleting the bucket. I will admit that the S3 code has not actually been tested because I didn't have a way of testing it, and so it's not inconceivable that I'm doing something wrong, but I can't see what at the moment.
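
A self-contained variant of the :read fn suggested above might look like the sketch below. The helper names (stream->bytes, bytes-reader, edn-bytes->value) are hypothetical, and an EDN decoder stands in for nippy/thaw so the code runs without extra dependencies. It does not size the buffer from .available, since that value is only an estimate of what can be read without blocking.

```clojure
(ns example.read-fn
  (:require [clojure.edn :as edn]
            [clojure.java.io :as io])
  (:import (java.io ByteArrayInputStream ByteArrayOutputStream InputStream)))

(defn stream->bytes
  "Hypothetical helper: drains `in` to EOF, closes it, and returns its bytes."
  ^bytes [^InputStream in]
  (with-open [in  in
              out (ByteArrayOutputStream.)]
    (io/copy in out)
    (.toByteArray out)))

(defn bytes-reader
  "Returns a fn of an InputStream that fully reads the stream
  and hands the resulting byte array to `decode`."
  [decode]
  (comp decode stream->bytes))

;; Stand-in decoder (assumption for this sketch only).
;; With nippy on the classpath you would pass nippy/thaw instead.
(defn edn-bytes->value [^bytes bs]
  (edn/read-string (String. bs "UTF-8")))

(def read-edn (bytes-reader edn-bytes->value))

(def example-value
  (read-edn (ByteArrayInputStream. (.getBytes "{:a 1, :b 2}" "UTF-8"))))
```

The point of the split is that the stream handling stays the same whatever the encoding is; only the `decode` argument changes.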

Kind regards

@jimpil
Owner

jimpil commented Apr 11, 2018

duratom 0.3.8 has been released and it comes with a helper fn in utils.clj (s3-bucket-bytes). Compose that function with nippy/thaw and you should be golden ;). I've updated the README to reflect this. Let me know if you're still having issues.

@jimpil
Owner

jimpil commented Apr 11, 2018

OK, so it turns out that I wasn't calling aws/put-object correctly. According to the amazonica README the :content-length should go under a :metadata key:

{:metadata {:content-length (alength val-bytes)}})))

That could be the source of your issue, if duratom wasn't able to initialise the bucket. Can you try with duratom 0.3.9 which I've just cut fixing this? Thanks in advance...
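
For reference, a call shaped the way the amazonica README describes could look roughly like this. It is a sketch, not duratom's actual code: put-bytes! is a hypothetical helper, amazonica is assumed to be on the classpath, and running it needs real credentials and an existing bucket.

```clojure
(ns example.put-object
  ;; Assumption: amazonica is available on the classpath.
  (:require [amazonica.aws.s3 :as s3])
  (:import (java.io ByteArrayInputStream)))

(defn put-bytes!
  "Hypothetical helper: uploads `val-bytes` under key `k` in `bucket`.
  The content length is declared inside the :metadata map,
  not as a top-level argument."
  [creds bucket k ^bytes val-bytes]
  (s3/put-object creds
                 :bucket-name  bucket
                 :key          k
                 :input-stream (ByteArrayInputStream. val-bytes)
                 :metadata     {:content-length (alength val-bytes)}))

(comment
  ;; Placeholder credentials and bucket name; replace with your own.
  (put-bytes! {:access-key "..." :secret-key "..." :endpoint "eu-west-1"}
              "my-bucket"
              "dummy.edn"
              (.getBytes "{:a 1}" "UTF-8")))
```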

@jimpil
Owner

jimpil commented Apr 22, 2018

I've just cut 0.4.0 which improves streaming the bytes from the S3 bucket (via ut/s3-bucket-bytes). If you're doing any sort of testing using 0.3.9, I'd appreciate it if you could swap to 0.4.0, as you could see major performance improvements (e.g. in the case of nippy-encoded bytes when using the suggested custom reader).

@AdamClements
Author

AdamClements commented Jun 13, 2018

Hey, I tried it out, and it still errors out with a 404 Not Found instead of creating the file initially (using S3 and nippy, following the instructions in the README).

@jimpil
Owner

jimpil commented Jun 14, 2018

Are you able to create buckets manually (e.g. using aws/create-bucket) outside of duratom? I'm not sure if you've looked in utils.clj, but the four S3-relevant functions are all wrappers around amazonica (i.e. aws/create-bucket, aws/does-bucket-exist, aws/delete-object & aws/put-object). I suggest that you try these functions at your REPL without involving duratom. Once you establish that they work as expected, I'd appreciate some code snippets showcasing each of those calls, and hopefully I will be able to figure out what I'm doing wrong. Thanks in advance...

@AdamClements
Author

AdamClements commented Jun 14, 2018 via email

@jimpil
Owner

jimpil commented Jun 14, 2018

The S3 backend is the only backend that is completely untested - I believe I mentioned that in a previous comment. Moreover, you seem to be the first person even remotely trying it out! Now, in all fairness, the only real difference between the various backends is the underlying helpers, so if something works on one backend but not on another, it can really only be something relating to these backend-specific helpers.

Ok, so you say the bucket exists. Can you confirm that s3/does-bucket-exist returns true? Also, can you point me to the action that returns the 404? I assume it's s3/put-object, but it could be s3/does-bucket-exist.

In your code snippet, I don't see any credentials anywhere, so you must be using some of that defcredential/with-credential magic. That won't work with duratom I'm afraid, as I'm using the arities that expect the credential as the first argument (e.g. https://github.com/mcohen01/amazonica/blob/d1720a3985496b22aba87cc9d44160362ff1995d/test/amazonica/test/s3.clj#L153). May I ask, what sort of :credentials key are you passing to duratom upon construction? According to the link above it should be the raw credentials map.
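
One way to narrow down which call produces the 404 is to wrap each suspect call and test the thrown exception for an S3 "not found" status. The sketch below assumes the AWS Java SDK v1 (which amazonica builds on) is on the classpath; s3-not-found? and get-or-default are hypothetical helpers, and nothing in this thread says duratom handles a missing key this way.

```clojure
(ns example.s3-404
  ;; Assumption: the AWS Java SDK v1 S3 classes are on the classpath.
  (:import (com.amazonaws.services.s3.model AmazonS3Exception)))

(defn s3-not-found?
  "True if `t`, or any exception in its cause chain,
  is an AmazonS3Exception with HTTP status 404."
  [^Throwable t]
  (boolean
    (some (fn [e]
            (and (instance? AmazonS3Exception e)
                 (= 404 (.getStatusCode ^AmazonS3Exception e))))
          ;; Walk t, its cause, the cause's cause, ... until nil.
          (take-while some? (iterate #(.getCause ^Throwable %) t)))))

(defn get-or-default
  "Calls the zero-arg fn `fetch`; returns `default` if it fails
  with an S3 404 and rethrows anything else."
  [fetch default]
  (try
    (fetch)
    (catch Exception e
      (if (s3-not-found? e)
        default
        (throw e)))))
```

Here `fetch` would be something like `#(s3/get-object creds :bucket-name bucket :key k)`, with the credentials map passed explicitly as the first argument.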

@jimpil
Owner

jimpil commented Apr 7, 2023

Revisiting this ticket because I was finally able to create an AWS account and test this myself. The good news is that the 5 S3-related helpers work as expected - observe the following:

(def creds {:access-key "..."         ;; <= replace with yours
            :secret-key "..."         ;; <= replace with yours
            :endpoint   "eu-west-1"}) ;; <= replace with yours
(def dummy-value (pr-str {:a 1 :b 2}))
;; check a bucket I know exists
(bucket-exists? creds "jimpil-test") ;; => true
;; create a brand new one (with public access!)
(create-s3-bucket creds "jimpil-test-delete-me") ;; => {:name "jimpil-test-delete-me"}
(store-value-to-s3 creds "jimpil-test" "dummy.edn" {} dummy-value) ;; => a big response
(get-value-from-s3 creds "jimpil-test" "dummy.edn" {} (partial read-edn-object {})) ;; => {:a 1, :b 2}
(delete-object-from-s3 creds "jimpil-test" "dummy.edn") ;; => nil (but succeeded)

The credentials I used were for a user belonging to a user-group with full S3 access, and as you can see, that user can still access a bucket which has full BlockPublicAccess enabled. In any case, that's how these functions are supposed to be called - with the credentials (which you pass when you create the duratom) as the first argument.

Now, the nippy part of the issue is somewhat separate, and can be addressed/tweaked using the :read fn. The point is that you get raw bytes from S3, and so what they 'mean' is entirely up to you.

In any case, I'm (finally) closing this 😌

jimpil closed this as completed Apr 7, 2023