-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Replace s3 mock with minio #43
Conversation
@@ -14,7 +14,7 @@ object Generators { | |||
) | |||
|
|||
lazy val bucketAllCharGen: Gen[Char] = Gen.frequency( | |||
(6, Gen.alphaLowerChar), | |||
(10, Gen.alphaLowerChar), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Due to the shrinking mechanism of scalacheck and the filter done by checkInvalidDuplicateChars
, the test was sometimes hitting too many cases where the checkInvalidDuplicateChars
returned false. Increasing the proportion of alpha lower chars (which are never filtered) made this issue go away
70caeee
to
5d7ff57
Compare
implicit val s3Attrs: Attributes = S3Attributes.settings(s3Settings) | ||
|
||
val delay = | ||
(THROTTLE_AMOUNT * (kafkaDataWithTimePeriod.data.size / THROTTLE_ELEMENTS) * 1.2) + (10 millis) match { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Incase you are wondering, yes this is the only way to convert a Duration
to a FiniteDuration
....
).asJava | ||
) | ||
lazy val s3Settings = S3Settings() | ||
.withEndpointUrl(s"http://${container.getHostAddress}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure how important it is, but S3 will be HTTPS very likely, I think you could also configure TLS with Minio and be as close to production as possible. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's worth exploring
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My view on this is the point of this test specifically is to see that the backup flow for guardian works on S3 as much as possible without needing an S3 account. Testing https in this context is really only testing that the Alpakka library (which we use for S3) can properly load the S3 client configuration for https which is a test that is likely already done in the underlying library (see https://github.com/akka/alpakka/blob/master/s3/src/test/scala/akka/stream/alpakka/s3), so in other words we would just be duplicating a test.
An integration test for this could also work, but it would be closer to a unit style test where we just make sure that the S3 client can do a trivial operation over https, i.e. creating a single bucket (in contrast to this test which is actually making sure everything in the backup flow is being written to S3 correctly).
If you feel strongly about it and its not too difficult (ergo needing to setup fake certificates) I am not against changing this test to use https in a future PR however note that it is intended in the future to be able to do a full end2end test in guardian on an actual S3 account (which would include https).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have some minor comments and a question
backup-s3/src/test/scala/io/aiven/guardian/kafka/backup/s3/BackupClientSpec.scala
Outdated
Show resolved
Hide resolved
backup-s3/src/test/scala/io/aiven/guardian/kafka/backup/s3/BackupClientSpec.scala
Outdated
Show resolved
Hide resolved
core/src/test/scala/io/aiven/guardian/kafka/MockedKafkaClientInterface.scala
Outdated
Show resolved
Hide resolved
core/src/test/scala/io/aiven/guardian/kafka/MockedKafkaClientInterface.scala
Show resolved
Hide resolved
5d7ff57
to
e3cde7c
Compare
@jlprat I have just rebased the PR with the only change being making the constants follow Scala style guide. I didn't make this an additional commit because this PR has multiple commits and then I would have had to squish things out of order (which is annoying/difficult) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
9001b5e
to
085b29f
Compare
085b29f
to
b494c7d
Compare
About this change - What it does
This change replaces the adobe S3 mock with minio running in a docker container using scala test-containers as well as throttling the source input in the Kafka mock so that the entire source isn't processed at once
References: #38
Why this way
The tests for the S3 backup client have randomly started failing somewhat recently (see linked issue). After some investigation I believe the main case is due to the fact that the S3 mock couldn't handle receiving the data so fast. Hence as a solution I did the following things
sourceTransform
function to the Kafka mock which allows you to apply a function on theSource
. We use this to throttle theSource
(which is just a hardcoded list of events) so that events occur naturally over time rather than all at onceNote that since this PR gets test-containers up and running, its the exact same type of work that is required for GCS.