Add list-schemas for the embedded repo (close #212) #214

voropaevp · 2022-11-12T20:00:25Z

No description provided.

snowplowcla · 2022-11-12T20:00:27Z

Thanks for your pull request. Is this your first contribution to a Snowplow open source project? Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://docs.snowplowanalytics.com/docs/contributing/contributor-license-agreement/ to learn more and sign.

Once you've signed, please reply here (e.g. I signed it!) and we'll verify. Thanks.

istreeter

This looks good. If it helps tests pass in other places, then we should merge it.

I spent a while on this review, because I was interested in the limits of where it works and where it doesn't. Until now, the embedded repo implementation has allowed two slightly different types of schema location:

1. A standalone file on the filesystem, in a sub-directory of the class path.
2. A schema packaged into a fat jar file, which is on the class path.

Both of the above work for a single schema lookup, because getClass.getResource(path) works for both. But your implementation of listing schemas only works on type 1, not on type 2.

I wondered how difficult it would be to get it working with schemas in fat jar files. This gist I found shows roughly how to handle the two different cases.

Anyway, I'm not saying to change anything. I just thought it was interesting to consider both cases.
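For readers who want to see what handling both cases might involve, here is a minimal sketch. It is neither the gist mentioned above nor code from this PR. It assumes the resource URL uses either the `file` or the `jar` protocol, and that the jar contains an explicit directory entry for the path (not every build tool writes these). The names `EmbeddedListing` and `listChildren` are hypothetical.

```scala
import java.io.File
import java.net.{JarURLConnection, URL}
import scala.jdk.CollectionConverters._

// Hypothetical helper, not part of iglu-scala-client.
object EmbeddedListing {

  /** Names of the plain files directly under a classpath directory URL. */
  def listChildren(url: URL): List[String] =
    url.getProtocol match {
      case "file" =>
        // Type 1: an ordinary directory on the filesystem.
        // listFiles returns null if the path is not a directory.
        Option(new File(url.toURI).listFiles)
          .map(_.toList)
          .getOrElse(Nil)
          .filter(_.isFile)
          .map(_.getName)
          .sorted
      case "jar" =>
        // Type 2: a directory entry inside a jar file.
        val conn = url.openConnection().asInstanceOf[JarURLConnection]
        val prefix =
          Option(conn.getEntryName).map(_.stripSuffix("/") + "/").getOrElse("")
        // The JarFile may be shared through the URL cache, so it is not closed here.
        conn.getJarFile
          .entries()
          .asScala
          .map(_.getName)
          .filter(n => n.startsWith(prefix) && !n.endsWith("/"))
          .map(_.stripPrefix(prefix))
          .filter(!_.contains("/")) // keep direct children only
          .toList
          .sorted
      case _ =>
        // Any other protocol is out of scope for this sketch.
        Nil
    }
}
```

In the embedded registry this might be called as `EmbeddedListing.listChildren(getClass.getResource(path))`, after checking that `getResource` did not return null.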

istreeter · 2022-11-14T09:56:01Z

...re/src/main/scala/com.snowplowanalytics.iglu/client/resolver/registries/RegistryLookup.scala

-          case _                            => F.pure(RegistryError.NotFound.asLeft)
+          case Registry.Embedded(_, base) =>
+            val path = toSubpath(base, vendor, name)
+            Utils.unsafeEmbeddedList(path, model).pure[F]


Strictly speaking, this should probably be suspended instead of pure:

Sync[F].delay(Utils.unsafeEmbeddedList(path, model))
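To illustrate the difference, here is a toy sketch that does not use cats-effect. `Effect` is a hypothetical stand-in for `F`, with a strict `pure` and a by-name `delay`, and `listFiles` is a made-up side-effecting function that counts its own calls.

```scala
// Toy model only: Effect is a hypothetical stand-in, NOT a cats-effect type.
object SuspendDemo {
  final class Effect[A](val run: () => A)

  object Effect {
    // Strict argument: the expression is evaluated before pure is even called.
    def pure[A](a: A): Effect[A] = new Effect(() => a)
    // By-name argument: the expression is re-evaluated each time run() is called.
    def delay[A](a: => A): Effect[A] = new Effect(() => a)
  }

  // Counts how many times the "filesystem" was touched.
  var calls = 0
  def listFiles(): List[String] = { calls += 1; List("1-0-0") }
}
```

With `pure`, the listing happens once, at the point where the effect value is built, and running it later never touches the filesystem again. With `delay`, building the value does nothing and every run performs the listing, which is the behaviour a caller would normally expect from an `F[_]`.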

istreeter · 2022-11-14T09:59:31Z

modules/core/src/main/scala/com.snowplowanalytics.iglu/client/resolver/registries/Utils.scala

+  def unsafeEmbeddedList(path: String, modelMatch: Int): Either[RegistryError, SchemaList] =
+    try {
+      val d = new File(getClass.getResource(path).getPath)
+      val schemaFileRegex: Regex = (".*?" + // path to file


Could you make this line just a little bit more strict?

val schemaFileRegex: Regex = (".*?/schemas/" + // path to file

istreeter · 2022-11-14T12:42:41Z

modules/core/src/main/scala/com.snowplowanalytics.iglu/client/resolver/registries/Utils.scala

@@ -158,4 +196,5 @@ private[registries] object Utils {
  private[resolver] def repoFailure(failure: Throwable): RegistryError =
    RegistryError.RepoFailure(failure.getMessage)

+  implicit val orderingSchemaKey: Ordering[SchemaKey] = SchemaKey.ordering


You have defined this ordering but you are not using it in your implementation.

I proved this by making it private, and then the compiler told me: private val orderingSchemaKey in object Utils is never used

istreeter · 2022-11-14T12:47:56Z

modules/core/src/main/scala/com.snowplowanalytics.iglu/client/resolver/registries/Utils.scala

+        d.listFiles
+          .filter(_.isFile)
+          .toList
+          .filter(_.getName.startsWith(modelMatch.toString))


This does not filter properly if modelMatch is 1 but the name of the file is 100-0-0.

I would remove this filter line, and instead put a check after the regex matcher a couple of lines below.

Good catch! It is a bit awkward to put the check there because there isn't a neutral SchemaKey element. It would have to be inside another filter. The 100- case could be avoided with s"${modelMatch.toString}-", see my new commit.
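The three options discussed in this thread can be compared in a small standalone sketch. `ModelFilter` and its method names are hypothetical, and the `SchemaVer` regex here is a simplified assumption, not the `schemaFileRegex` from the PR.

```scala
// Hypothetical helper for comparing the filtering strategies; not PR code.
object ModelFilter {
  // Simplified MODEL-REVISION-ADDITION pattern (an assumption for this sketch).
  private val SchemaVer = """(\d+)-(\d+)-(\d+)""".r

  // Original approach: a bare prefix check, which also matches 100-0-0 for model 1.
  def byPrefix(names: List[String], model: Int): List[String] =
    names.filter(_.startsWith(model.toString))

  // Prefix check including the separator, as in the follow-up commit.
  def byPrefixWithDash(names: List[String], model: Int): List[String] =
    names.filter(_.startsWith(s"$model-"))

  // Check on the parsed model group, as suggested in the review.
  def byParsedModel(names: List[String], model: Int): List[String] =
    names.filter {
      case SchemaVer(m, _, _) => m == model.toString
      case _                  => false
    }
}
```

For the names `1-0-0`, `1-0-1`, `100-0-0` and `2-0-0` with model 1, `byPrefix` keeps `100-0-0` while the other two drop it.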

voropaevp · 2022-11-15T23:05:01Z

getResource does not work with jar files; it has to be getResourceAsStream. It also has an issue getting a subfolder: the argument has to be a file. That makes the implementation a lot more awkward than the gist referenced above.

I think it is too much effort to make it work for a very uncommon use case (I have not seen it being used ever).

voropaevp · 2022-11-17T22:03:36Z

ready for merge

istreeter

Looks good!

istreeter · 2022-11-18T10:39:04Z

modules/core/src/main/scala/com.snowplowanalytics.iglu/client/resolver/registries/Utils.scala

+      content
+        .traverse {
+          case schemaFileRegex(vendor, name, format, model, revision, addition)
+              if model == modelMatch.toString =>


I'm a big fan of cats triple equals === for some extra type safety.

Prepare for 2.1.0 release

6e68bb6

snowplowcla added the cla:no label Nov 12, 2022

voropaevp changed the base branch from master to develop November 12, 2022 20:00

Add list-schemas for the embedded repo (close #212)

f221c0a

voropaevp force-pushed the feature/embedded-list branch from 121c7cf to f221c0a Compare November 12, 2022 23:04

ordering

17376f8

istreeter reviewed Nov 14, 2022

review feedback

d1bd077

voropaevp force-pushed the feature/embedded-list branch from edbc33c to d1bd077 Compare November 16, 2022 00:36

voropaevp added 2 commits November 16, 2022 00:52

NPE

1d818a7

NPE

3ecc8b8

voropaevp force-pushed the feature/embedded-list branch from 5f09cf2 to 3ecc8b8 Compare November 17, 2022 22:03

voropaevp requested a review from pondzix November 17, 2022 22:03

istreeter approved these changes Nov 18, 2022

voropaevp merged commit 3ea1755 into develop Nov 21, 2022
