Add list-schemas for the embedded repo (close #212) #214
Thanks for your pull request. Is this your first contribution to a Snowplow open source project? Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). 📝 Please visit https://docs.snowplowanalytics.com/docs/contributing/contributor-license-agreement/ to learn more and sign. Once you've signed, please reply here (e.g. I signed it!) and we'll verify. Thanks.
121c7cf to f221c0a
This looks good. If it's helpful for making tests pass in other places, then we should merge it because it's helpful.
I spent a while on this review, because I was interested in the limits of where it works and where it doesn't. Until now, the embedded repo implementation has allowed two slightly different types of schema location:
- A standalone file on the filesystem, located in a sub-directory of the class path.
- A schema packaged into a fat jar file, which is on the class path.
Both of the above work for a single schema lookup, because getClass.getResource(path) works for both. But your implementation of listing schemas only works on type 1, not on type 2.
I wondered how difficult it would be to get it working with schemas in fat jar files. This gist I found shows roughly how to handle the two different cases.
Anyway, I'm not saying to change anything. I just thought it was interesting to consider both cases.
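For reference, the two cases can be told apart by the protocol of the URL that getClass.getResource returns. The following is only a rough sketch of that idea, not the code from the gist or from this PR: the object name ResourceListing and the method listResourceNames are made up for illustration, and it assumes Scala 2.13 (for scala.jdk.CollectionConverters) and a jar that contains the directory entry being looked up.

```scala
import java.io.File
import java.net.JarURLConnection
import scala.jdk.CollectionConverters._

// Hypothetical helper, not part of the PR: lists the names of the files that sit
// directly under a class path directory, for both "file" and "jar" resources.
object ResourceListing {
  def listResourceNames(path: String): List[String] =
    Option(getClass.getResource(path)) match {
      case None => Nil // the directory is not on the class path at all
      case Some(url) =>
        url.getProtocol match {
          case "file" =>
            // Type 1: a plain directory on the filesystem
            Option(new File(url.toURI).listFiles).toList.flatten
              .filter(_.isFile)
              .map(_.getName)
          case "jar" =>
            // Type 2: a directory entry inside a (fat) jar
            val conn = url.openConnection().asInstanceOf[JarURLConnection]
            val prefix =
              Option(conn.getEntryName).map(_.stripSuffix("/") + "/").getOrElse("")
            // The JarFile may be shared through the URL connection cache, so it is not closed here
            conn.getJarFile.entries().asScala
              .map(_.getName)
              .filter(n => n.startsWith(prefix) && !n.endsWith("/"))
              .map(_.stripPrefix(prefix))
              .filter(!_.contains("/")) // keep direct children only
              .toList
          case _ => Nil // other protocols are out of scope for this sketch
        }
    }
}
```

Using new File(url.toURI) rather than url.getPath also avoids trouble with URL-encoded characters such as spaces in the path.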
case _ => F.pure(RegistryError.NotFound.asLeft)
case Registry.Embedded(_, base) =>
  val path = toSubpath(base, vendor, name)
  Utils.unsafeEmbeddedList(path, model).pure[F]
Probably strictly speaking should be suspended instead of pure:
Sync[F].delay(Utils.unsafeEmbeddedList(path, model))
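To illustrate the difference, here is a small sketch using cats.effect.IO as the concrete F. It assumes cats-effect is on the class path; SuspendDemo and unsafeList are made-up names standing in for the real call. Nothing is run here, the effects are only constructed.

```scala
import cats.effect.{IO, Sync}

// Hypothetical demo, not part of the PR: counts how often the side-effecting
// function is invoked while the effects are merely being constructed.
object SuspendDemo {
  private var calls = 0
  private def unsafeList(): List[String] = { calls += 1; List("1-0-0") }

  // pure takes an already computed value, so unsafeList() runs right here
  val eager: IO[List[String]] = IO.pure(unsafeList())
  val callsAfterPure: Int = calls

  // delay takes its argument by name, so nothing runs until the IO is executed
  val suspended: IO[List[String]] = Sync[IO].delay(unsafeList())
  val callsAfterDelay: Int = calls
}
```

With pure, the filesystem access happens when the effect is built and again never; with delay, it happens each time the effect is actually run.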
def unsafeEmbeddedList(path: String, modelMatch: Int): Either[RegistryError, SchemaList] =
  try {
    val d = new File(getClass.getResource(path).getPath)
    val schemaFileRegex: Regex = (".*?" + // path to file
Could you make this line just a little bit more strict?
val schemaFileRegex: Regex = (".*?/schemas/" + // path to file
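As a sketch of what the stricter pattern could look like in full: the rest of the regex is not shown in this diff, so the character classes below are assumptions, and only the group names (vendor, name, format, model, revision, addition) are taken from the match in the PR. SchemaPath and parse are hypothetical names, and forward slashes are assumed as the path separator.

```scala
import scala.util.matching.Regex

// Hypothetical reconstruction for illustration; the real pattern in the PR may differ.
object SchemaPath {
  val schemaFileRegex: Regex = (
    ".*?/schemas/" +         // path to file, which must pass through a schemas directory
      "([a-zA-Z0-9_.-]+)/" + // vendor
      "([a-zA-Z0-9_-]+)/" +  // name
      "([a-zA-Z0-9_-]+)/" +  // format
      "([1-9][0-9]*)-" +     // model
      "(0|[1-9][0-9]*)-" +   // revision
      "(0|[1-9][0-9]*)"      // addition
  ).r

  // A Regex used as an extractor must match the whole input, so no explicit anchors are needed
  def parse(path: String): Option[(String, String, String, String, String, String)] =
    path match {
      case schemaFileRegex(vendor, name, format, model, revision, addition) =>
        Some((vendor, name, format, model, revision, addition))
      case _ => None
    }
}
```

With the /schemas/ anchor, a file that merely happens to end in a vendor/name/format/version shape somewhere else on disk is rejected.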
@@ -158,4 +196,5 @@ private[registries] object Utils {
  private[resolver] def repoFailure(failure: Throwable): RegistryError =
    RegistryError.RepoFailure(failure.getMessage)

  implicit val orderingSchemaKey: Ordering[SchemaKey] = SchemaKey.ordering
You have defined this ordering but you are not using it in your implementation.
I proved this by making it private, and then the compiler told me: private val orderingSchemaKey in object Utils is never used
d.listFiles
  .filter(_.isFile)
  .toList
  .filter(_.getName.startsWith(modelMatch.toString))
This does not filter properly if modelMatch is 1 but the name of the file is 100-0-0.
I would remove this filter line, and instead put a check after the regex matcher a couple of lines below.
Good catch! It is a bit awkward to put the check there, because there isn't a neutral SchemaKey element, so it would have to go inside another filter. The 100- case could be avoided with s"${modelMatch.toString}-", see my new commit.
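The behaviours discussed in this thread can be compared side by side with a small standard-library sketch. ModelFilter and its three methods are made-up names for illustration, and the version pattern is an assumption about the file naming.

```scala
// Hypothetical comparison, not the PR's code: three ways to test a file name against a model.
object ModelFilter {
  // Original filter: a bare prefix check, which also accepts model 100 when asked for 1
  def naive(fileName: String, model: Int): Boolean =
    fileName.startsWith(model.toString)

  // Prefix check including the dash, so "1-" no longer matches "100-0-0"
  def withDash(fileName: String, model: Int): Boolean =
    fileName.startsWith(s"$model-")

  // Parse the version first, then compare the model component
  private val Version = "([1-9][0-9]*)-(0|[1-9][0-9]*)-(0|[1-9][0-9]*)".r
  def parsed(fileName: String, model: Int): Boolean =
    fileName match {
      case Version(m, _, _) => m == model.toString
      case _                => false
    }
}
```

For the file name 100-0-0 and model 1, naive returns true while withDash and parsed both return false.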
I think it is too much effort to make it work for a very uncommon use case (I have not seen it being used ever).
edbc33c to d1bd077
5f09cf2 to 3ecc8b8
ready for merge
Looks good!
content
  .traverse {
    case schemaFileRegex(vendor, name, format, model, revision, addition)
        if model == modelMatch.toString =>
I'm a big fan of cats triple equals === for some extra type safety.
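A minimal sketch of what that buys here, assuming cats-core is on the class path; EqDemo and its values are made up for illustration.

```scala
import cats.implicits._

// Hypothetical demo, not part of the PR: model is a String captured by the regex,
// modelMatch is the Int passed in by the caller.
object EqDemo {
  val model: String = "1"
  val modelMatch: Int = 1

  // Universal equality: forgetting .toString would still compile and always be false
  val loose: Boolean = model == modelMatch.toString

  // cats Eq: both sides must have the same type, so `model === modelMatch` would not compile
  val strict: Boolean = model === modelMatch.toString
}
```

Both comparisons give the same result when the types line up; the difference is that the === version turns a forgotten conversion into a compile error instead of a silently empty list.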