-
Notifications
You must be signed in to change notification settings - Fork 68
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rework of the concept of collection in the Schema abstraction #290
Conversation
modules/core/src/smithy4s/internals/SchematicDocumentDecoder.scala
Outdated
Show resolved
Hide resolved
@daddykotex please run the jsoniter benchmarks to see between the current version and this branch to whether this impacts the performance when decoding/encoding lists. |
It will impact performance - please, see Do we have in this new design an option to have specialized codecs for some collection tags? |
@plokhotnyuk if we keep the abstraction sealed, and expose the sealed members to recover the abstracted type, it'll allow for specialisation. I'd be fine with this. |
Bellow are results of benchmarks for different sizes. Before:
After:
The command used for that:
|
Move `A` as a method param Add name: String on the interface
Last update, with Before:
After:
|
@daddykotex, in the Then, in the collection method, you'd pattern-match against the |
|
Got it, I'm sorry, I did not think we wanted to ditch Updating the code and re-running the benchmarks |
Before:
After:
After specialization:
|
package smithy4s | ||
package schema | ||
|
||
sealed trait CollectionTag[C[_]] { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
def struct[S](shapeId: ShapeId, hints: Hints, fields: Vector[SchemaField[S, _]], make: IndexedSeq[Any] => S): MaybeCT[S] = None | ||
def union[U](shapeId: ShapeId, hints: Hints, alternatives: Vector[SchemaAlt[U, _]], dispatch: U => Alt.SchemaAndValue[U, _]): MaybeCT[U] = None | ||
def biject[A, B](schema: Schema[A], to: A => B, from: B => A): MaybeCT[B] = { | ||
if (to.isInstanceOf[Newtype.Make[A, B]]) apply(schema).asInstanceOf[MaybeCT[B]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a hacky but when we have newtypes of primitives, we should attempt to retain compaction
The changes seem to behave as expected. Using the Spark utility, we can estimate the size of instance on the heap Using the following smithy schema :
and the following script : //> using scala 2.13
import $ivy.`org.openjdk.jol:jol-core:0.16`
import $ivy.`org.apache.spark::spark-core:3.3.0`
import $ivy.`com.disneystreaming.smithy4s::smithy4s-json:dev`
import scala.reflect.ClassTag
import org.openjdk.jol.info.ClassLayout;
import scala.collection.immutable.ArraySeq
import org.apache.spark.util.SizeEstimator
import foo.Foo
import smithy4s.http.json._
import com.github.plokhotnyuk.jsoniter_scala.core._
import smithy4s.schema._
object Main {
implicit val indexedSeqFooCodec = smithy4s.http.json.JCodec.fromSchema(Schema.indexedSeq(Foo.schema))
implicit val indexedSeqIntCodec = smithy4s.http.json.JCodec.fromSchema(Schema.indexedSeq(Schema.int))
def main(args: Array[String]): Unit = {
implicit val ct: ClassTag[Foo] = implicitly[ClassTag[Int]].asInstanceOf[ClassTag[Foo]]
val naive = IndexedSeq.fill(1000)(1)
val ints = ArraySeq.fill(1000)(1)
val foos = ArraySeq.fill(1000)(Foo(1))
val json = writeToArray[IndexedSeq[Int]](ints)
val resultInt = readFromArray[IndexedSeq[Int]](json)
val resultFoo = readFromArray[IndexedSeq[Foo]](json)
println(SizeEstimator.estimate(naive)) // Ints are stored as objects in standard IndexedSeq construction, we get 4696
println(SizeEstimator.estimate(ints)) // 4032
println(SizeEstimator.estimate(foos)) // 4032
println(SizeEstimator.estimate(resultInt)) // 4032
println(SizeEstimator.estimate(resultFoo)) // 4032
}
} |
that's interesting, thanks for scala-cli script, TIL about the SizeEstimator |
modules/core/src-2/Newtype.scala
Outdated
@inline final def apply(a: A): Type = a.asInstanceOf[Type] | ||
final val make: Newtype.Make[A, Type] = apply(_: A) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With this change, is there ever a case where someone should use the apply method rather than make? If not, should the apply method be private?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Users should very much keep using the apply method to construct instances of newtypes. This is very much a kind of hack to ensure we have a way of detecting whether a bijection is actually coming from a newtype. I'll try to hide it further and make the Make
construct package private.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
263a7a2
to
e2d5e2e
Compare
I'm gonna go forward with this. We can refine later if need be. |
[Oli's comment]
This replaces list/set in Schema by a generic collection construct, that takes a
CollectionTag
as a parameter. TheCollectionTag
carries methods to (de)construct the collections, and can be used to generically implement SchemaVisitors without worrying about what collection we're dealing with. In addition, theCollectionTag
is sealed, and can be inspected to provide specialised version of the encoding/decoding logic when relevant.