Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support Schema defn syntax #125

Closed
ikitommi opened this issue Nov 28, 2019 · 30 comments
Closed

Support Schema defn syntax #125

ikitommi opened this issue Nov 28, 2019 · 30 comments
Labels
enhancement New feature or request for discussion Discussion is main focus of PR
Milestone

Comments

@ikitommi
Copy link
Member

(m/defn plus :- int?
   "adds two numbers together"
  [x :- int?, y :- int?]
  (+ x y))

Links

Similar

@ikitommi ikitommi added the enhancement New feature or request label Nov 28, 2019
@ikitommi
Copy link
Member Author

Should this be a separate ns, like malli.fn?

@ikitommi ikitommi added the for discussion Discussion is main focus of PR label Nov 28, 2019
@rschmukler
Copy link
Contributor

rschmukler commented Nov 28, 2019

I'm very interested in this type of tool. I've written a proof of concept that will generate the code based off of a map passed in. ie. Parsing the defn, whether ghostwheel or schema, can be independent of the resulting code generation. I will likely support both syntaxes.

I'm personally a fan of the ghostwheel style syntax, and I think supporting both is pretty easy.

(>defn add
  [x y]
  [pos-int? pos-int? => pos-int?]
  (+ x y))

Features that I think are valuable in this kind of functionality:

  1. Instrumentation (ie. validating input and output)
  2. Stubbing pure functions (ie. if you define with an empty body, it just calls (m/generate pos-int?) in addition to validating the input.
  3. Validating that functions are pure (ie. what ghosthweel does)
  4. Configuring a default schema used as well as allowing overrides as metadata on the function
  5. Optionally compiling away entirely (for production)
  6. Integration with clojure.test to easily assert that all functions adhere to their definitions (ie. (assert-schema 'sample.ns) or something similar.
  7. Nrepl middleware to integrate it into editors.
  8. Integration with criterion via metadata (ie. adding a ::benchmark true metadata).
  9. Strong repl oreitned API to introspect and invoke pieces of the above functionality.

I realize that my goals might be different from what malli is intending to do so I'm also happy to do it as a library. If we were keeping it within malli I would definitely do it in a different namespace / package so that it's easier to do compile time removal (ghostwheel does this with stubs) and extra dependencies for analysis (eg. a graph library, nrepl, etc).

@esuomi
Copy link

esuomi commented Nov 29, 2019

For deeper integration purposes I'd like to see metadata and type hints supported as well - in fact while I'm normally against implicit magic, I wouldn't mind if the schemas themselves could introduce type hints making these two snippets sort of same:

(m/defn ^:silly magic :- string?
  [i :- int?]
  (-> i inc dec str))
(defn ^:silly magic ^String 
  [^Integer i]
  (-> i inc dec str))

(yes, the latter sample lacks all Malli calls for functional equality...imagine they're there)

Other than that, having private variants available (defn-, potentially def- although that doesn't really exist) is also what I'd consider a low hanging fruit. Scopes and restrictions matter :)

And yes, separate ns - or even a module - would make the most sense, as these macros would mainly be just bridges/expansions of core functionality. In fact that would probably encourage multiple implementations, one for us Plumatic fans, another for spec people and a lot more for those with even wilder ideas.

@rschmukler
Copy link
Contributor

Just released my first draft of the project mentioned above. Feel free to give it a go at teknql/aave. Happy to potentially merge some of it back under malli if the goals are in-line with what I'm trying to do. Otherwise, plenty of room in open-source!

@ikitommi ikitommi added this to the alpha1 milestone Feb 1, 2020
@kwrooijen
Copy link
Contributor

Having a way to define schemas for functions would be extremely useful. Personally I'm more a fan of not overruling defn though. For a couple of reasons. (Please note that I'm just sharing my opinion. I'm not saying this is the better choice)

  1. People that are not familiar with Malli, who are reading your code, don't have to forcefully learn what m/defn is. If Malli has something like s/fdef the user can ignore it.
  2. You can add schemas to defns in another file, granting more flexibility
  3. If a schema definition is not embedded in the arguments, IMO the code is harder to read. If the schema definition is isolated, you can read the definition and have a good idea of what it does, without reading the code at all.
  4. You don't interfere with text parsers.

@ikitommi
Copy link
Member Author

draft (from ClojureD) from the plumatic syntax + clj-kondo -integration

clj-kondo

@rschmukler
Copy link
Contributor

@ikitommi Did you do that with the new clj-kondo hooks API? That's brilliant...

@sundbp
Copy link

sundbp commented Jul 29, 2020

This looks awesome! I take the points from @kwrooijen about providing an s/fdef equivalent way to hint a plain defn separately - but I also very much like the m/defn way of working in my own code most of the time as it (for me) makes it easier to keep everything in my head + remember to add and update schema info when I update a function. I'd be very open to having both ways available out of the box in a standard way - rather than have the clojure.spec outcome of the lib providing s/fdef only and several other libraries providing their own different versions of the m/defn type functionality and hence making code harder to read as there is more fragmentation.

@ikitommi ikitommi added the blocked Can't be done yet, waiting for some other issue to complete label Oct 5, 2020
@ingesolvoll
Copy link
Contributor

ingesolvoll commented Oct 30, 2020

EDIT: I misunderstood the example in the picture above, I mixed args with return value. Spec of args is exactly like I'm suggesting below, sorry. So my question then is basically: Will there be an API for anonymous functions?


Re-frame users like myself would love to be able to spec fn forms like this, without the hassle of creating a named defn outside of, say, reg-sub.

Some more detail:

It is a very common source of bugs that re-frame event or subscription args get out of sync. They are (mostly) defined as anonymous functions, and the interesting bits need to be destructured from a vector. Ideally, if specing such functions is possible, it should be as concise as possible. IMO the suggested syntax above would be a bit verbose and I would prefer inline specs next to each val.

Something like this would be ideal:

(f/reg-sub ::plus
  (fn-spec [db [_ x :- :int y :- :int]]
    (+ x y))

@eneroth
Copy link

eneroth commented Nov 10, 2020

I think both syntax versions (Ghostwheel/Plumatic) would benefit from being able to throw in more complex schemas, such as when one parameter depends on another. For example, ensuring that (> x y). The Ghostwheel syntax has some support for this (perhaps the Plumatic syntax allows for this as well, I'm unfamiliar).

The downside of the Ghostwheel version currently is that you have to specify the schema in-line, which sometimes becomes verbose as well as not being reusable in other places.

Also, for me, there's something a little bit odd with specifying the return before the parameters in the Plumatic version. If you think about the asymmetry: functions specify their parameters first and their return value last. The specs specify their return value first and their parameters last.

Borrowing the syntax from Ghostwheel would be preferable in my opinion.

[x :- int?, y :- int? => [:tuple int? int?]]

For complex specs, dependent parameters, perhaps something like this would be possible?

[x :- int?, y :- int?, x y :- #(> %1 %2) => [:tuple int? int?]]

(Technically, the predicate would be enough above; the anonymous function would be redundant)

@ikitommi
Copy link
Member Author

ikitommi commented Nov 15, 2020

@ingesolvoll Plumatic Schemas has support for schematized defn, fn and defrecord. those would come... free.

(ms/fn :- [:tuple int? pos-int?] [x :- int?, y :- int?] [x (* x x)])

about the syntax in which to describe the functions. I think it's ok to have multiple syntaxes as separate libraries.

the core malli would:

  • define needed protocols/records for describing the function schemas with malli
  • provide a single registry to hold the function schemas attached to existing vars
  • utilities to configure appying function schemas (inputs, outputs, input+outputs, never, ...)
  • an utility ns to map malli fn-schemas to clj-kondo

optional malli ns would:

  • offer plumatic syntax as-is. It's well known and GOOD :)
  • could be malli.schema.

optional libs (aave &) can:

  • offer new syntax and features on top of the core thing.

future

if/when Rich discovers a new and good syntax for inline spec-definitions, malli could follow that.

proposal

given an example:

(require '[malli.schema :as ms])

(ms/defn ^:always-validate fun :- [:tuple int? pos-int?]
  "returns a tuple of a number and it's value squared"
  ([x :- int?]
   (fun x x))
  ([x :- int?, y :- int?]
   [x (* x x)]))

options to describe that in malli are (at least):

1) schema-style

[:=>
 [:tuple int? pos-int?] ;; return (same for all arities?)
 [:tuple int?]          ;; first defined arity
 [:tuple int? int?]]    ;; second defined arity
  • good: compact
  • bad: not easy to extend? (e.g. add input+output correlation)

2) same but with vector of input arities:

[:=>
 [:tuple int? pos-int?] ;; return (same for all arities?)
 [[:tuple int?]         ;; defined arities
  [:tuple int? int?]]] 
  • good: compact
  • bad: not easy to extend? (e.g. add input+output correlation)

3) as properties

[:=> {:output [:tuple int? pos-int?]
      :input [[:tuple int?]
              [:tuple int? int?]]}]
  • good: easy to extend
  • bad: more top-level reserved keys in malli

... given the last example, to attach function schemas into existing vars, would be quite streightforward:

(defn fun
  "returns a tuple of a number and it's value squared"
  ([x] (fun x x))
  ([x y] [x (* x x)]))

;; a function to register a new function schema into existing var
(m/=> #'fun {:output [:tuple int? pos-int?]
             :input [[:tuple int?]
                     [:tuple int? int?]]})

;; .. or same as a macro:
(m/=> fun {:output [:tuple int? pos-int?]
           :input [[:tuple int?]
                   [:tuple int? int?]]})

... and would enable growth as it's easy to add also more docs, input+output correlation etc.

@ikitommi
Copy link
Member Author

ikitommi commented Nov 15, 2020

good points from Slack, by @borkdude:

  • return schemas should be per arity, not per Var

enchansed malli.schema syntax:

(defn fun
  ([x :- int?] :- int?
   x)
  ([x :- int? y :- string?] :- [:tuple string? int?]
   [y (* x x)]))

:=> would always be for single arity:

(defn fun1 [x] (* x x))

(m/=> fun1 [:=> int? [:tuple pos-int?]])

(defn fun
  ([x] (fun x x))
  ([x y] [x (* x x)]))

(m/=> fun [:or
           [:=> int? [:tuple int?]]
           [:=> [:tuple int? pos-int?] [:tuple int?]]])

... could also be a just/additional map-syntax for these:

(m/=> fun1 {:arities {1 {:input int?
                         :output [:tuple pos-int?]}}})

(m/=> fun {:arities {1 {:output int?
                        :input [:tuple int?]}
                     2 {:output [:tuple int? pos-int?]
                        :input [:tuple int? int?]}}})

the latter is basically what clj-kondo has.

@ikitommi
Copy link
Member Author

changed the schema-style impl to support per arity return schemas:

(require '[malli.schema :as ms])

(ms/defn fun :- [:tuple int? pos-int?]
  "returns a tuple of a number and it's value squared"
  ([x :- int?] :- any? ;; arity-level override
   (fun x x))
  ([x :- int?, y :- int?] ;; uses the default return
   [x (* x x)]))

(clojure.repl/doc fun)
; -------------------------
; demo/fun
; ([x] [x y])
;   
;   [:-> [:tuple int?] any?]
;   [:-> [:tuple int? int?] [:tuple int? pos-int?]]
; 
;   returns a tuple of a number and it's value squared

@ingesolvoll
Copy link
Contributor

ingesolvoll commented Nov 15, 2020

@ikitommi Looks good!

What I had in mind for my re-frame code was the ability to spec subscription and event vectors in a very compact way. From what I can see, schema way would be able to check the event vector as a whole, but not by annotating inline. So this works with Plumatic:

(def Ref*
  [(s/one s/Keyword "k")
   (s/one s/Uuid "u")])

(f/reg-sub ::my-sub
  (s/fn [[_ ref some-key] :- [(s/one s/Keyword "k") (s/one Ref* "ref") s/Any]]
    ...))

But not this:

(f/reg-sub ::loading-state
  (s/fn [[_ ref :- Ref* some-key]]
    ...))

The Ref* is the only thing I'm interested in specing in this case, and the only thing likely to fail, so having to spec the entire vector is painful enough for this to not be used. I can see that supporting vector and map destructuring in this way might be out of scope for this API though...

EDIT: Did a bit of research in plumatic docs, and found that the problem I'm outlining above is indeed documented as a gotcha for the plumatic schema defn:

https://plumatic.github.io/schema/schema.core.html#var-defn
"Schema metadata is only processed on top-level arguments. I.e., you can use destructuring, but you must put schema metadata on the top-level arguments, not the destructured variables."

@sunnygunnar
Copy link
Contributor

sunnygunnar commented Jan 8, 2021

I would propose instead the syntax in https://github.com/alexandergunnarson/gradual:

(gs/defn abc
  [a pos-int?
   b (s/and double? #(> % 3))
   c _
   | (> b a)
   > map?]
  ...)

where specs go inline with the params, there's a "such that" spec (notated by the |), and there's an output spec notated by >.

Also, this works with destructuring:

(gs/defn abc 
  [{:as m
    :keys [a keyword?
           b string?]
    [:c number?]}
   ::some-map-spec]
  ...)

@vemv
Copy link

vemv commented Jan 10, 2021

Hi!

I'd like to contribute a proposal which I believe would be benefitial to Malli, and the Clojure community at large.

For a little context, like many people I have used Schema, spec1 and spec2 in various jobs. Also at work sometimes the topic of Malli pops up - it's a well regarded solution :)

An observation which one can make is that, to put it informally, life is too short for syntax. Plumatic Schema is syntax-y. So are a few alternatives.

The specific problems with syntax are:

  • they deviate from clojure.core
    • more stuff to learn, more tools to be adapted
  • they fragment the community
  • one cannot trivially toggle/rewrite defns from one form (clojure.core/defn) to the other (schema/defn)
  • other conceptual concerns that might be apparent to many people convinced of the value of Lisp and Simplicity.

Some problems, one can argue, shouldn't be solved over and over again by different people, because they can be surprisingly intrincate. Some of those problems/features are:

  • type annotation inference/integration out of specs/schemas
    • for better performance, and guaranteed up-to-date type hints
  • clojure/script compatibility
  • destructuring support
  • validation modes (e.g. assert-based, unconditional on a per-case basis, etc)
  • support for the full defn syntax, and other forms like, fn, let, defprotocol, letfn

Out of those beliefs, I've created https://github.com/nedap/speced.def . Interestingly, speced.def is bit of a misnomer - it basically revolves around assertions, not Spec. Spec is mostly an implementation detail.

I have already POCed Plumatic Schema support for speced.def, and could verify that I had to change very little (in fact most of the changes went to https://github.com/nedap/utils.spec, which is a very tiny lib).

So, I found out that one can have a simple, generalized 'protocol' (not necessarily a literal runtime protocol) that spec1, spec2, Schema and Malli all can perfectly support. After all, all of those can be seen as validation libraries, doing essentially the same.

With that protocol, and with impl-agnostic macros (for defn etc), one can create a '100%' solution which could be really great for the community.

Specific proposal

  • I'd like to implement a namespace, e.g. meta.malli which offers metadata-based, Malli-powered versions of defn, defprotocol, etc.
  • I think I can commit to deliver it by a very reasonable timeframe
  • Feedback from the Malli team would be taken into account as-is, for all things Malli (e.g. how is validation performed)
  • The result would be the official Malli offering

Unknowns

  • We don't know what syntax/etc spec2 will introduce
    • OTOH, we know that spec rejects output validation, which often surprises people and most libs in this space dismiss (notably Schema)
    • Also, speced.def syntax is basically clojure.core/* syntax with metadata. So it remains interchangeable with clojure.core/* syntax which obviously is not going away, ever.
  • such-that is not implemented in speced.def
    • generally it favors :pre/:post
    • this is not written in stone, and could be added as simply another metadata-based directive
  • generative testing of defn isn't automatic
    • i.e., out of a speced/defn no automatic test will be generated
    • this characteristic is shared with Plumatic Schema et al
    • again this limitation is not written in stone - it originates from my habit of favoring data generation over function call generation
    • I don't think there's any fundamental limitation preventing this feature.
  • clj-kondo type checking integration
    • until very recently I didn't know this was a thing, it surely is feasible.

Other

  • Right now speced.def master emits clojure.core/assert/:pre/post, which not everyone likes.
    • The branch that addresses that is finally ready. It's precisely that work that prompted me to do this proposal.

@vemv
Copy link

vemv commented Jan 10, 2021

...I'm aware that this might be seen as bit of a rant or self-promotion coming out of the blue, but I see the work I'm offering as something very carefully designed and implemented - it comes from no other place that from my love for Clojure, its community and ever safer, pragmatic programming.

If you check out the linked project you might notice the amount of work that has gone into it, the lack of significant bugs it has sustained over two years and the quite limited scope it has self-imposed.

Please let me know if this sounds attractive and feel free to ping me over Slack as well!

@borkdude
Copy link
Contributor

@vemv

FYI: I've proposed a similar idea to Alex once in the spec channel, to use metadata for types. This was something they considered as one of the alternatives for spec2, but probably won't make it as the solution.

I'm looking forward to defn that will be embraced by the Clojure community at large, so we can finally leverage this as a standard (and clj-kondo can automatically pick up on this).

@ikitommi
Copy link
Member Author

ikitommi commented Jan 10, 2021

Once #317 gets merged, the function schema definitions look like ~this (inspired by clj-condo):

(defn fun
  ([x] (* x x))
  ([x y] [x (* y y)])
  ([x y & zs] (into (fun x y) zs)))

(fun 1)         ; => 1
(fun 1 -2)      ; => [1 4]
(fun 1 -2 3 -4) ; => [1 4 3 -4]

(m/=> fun {:arities {1 {:input [:cat int?]
                        :output pos-int?}
                     2 {:input [:cat int? int?]
                        :output [:cat int? pos-int?]}
                     :varargs {:input [:cat int? int? [:+ int?]]
                               :output [:cat int? pos-int? [:+ int?]]}}})

There will be tools to instrument all functions (at dev-time) based on the definitions + tools for inferring the function schemas from instrumented functions based on sample inputs & outputs (e.g. tests).

All custom defn syntaxes can be built on top of the core functionality and they will not be part of the core library (namespaces). Each existing format (ghostwheel/guardrails/aave, gradual, plumatic, speced.def) has its fan base and in the domain of user applications (not libraries), for now I think it's ok to just pick the one that works best for you.

I agree that having one core-team & community designed and approved type-tagging specification would be great. I guess we'll have to wait what the core-team is doing for this. Having just one spec/schema library would also be great, but not expecting spec to become a tool for the runtime any time soon.

@vemv @borkdude the meta-data defn-syntax looks nice. Plumatic has var meta-data keys for scoping when the checks are made: :never-validate (just the docs) and :always-validate (enforce always at runtime). Have used the latter a lot in real projects.

@vemv
Copy link

vemv commented Jan 10, 2021

Thanks for sharing!

Seems a classy and healthily conservative choice, as no syntax is being introduced.

@ikitommi ikitommi removed the blocked Can't be done yet, waiting for some other issue to complete label Jan 26, 2021
@ikitommi
Copy link
Member Author

No blockers.

@miikka miikka modified the milestones: alpha1, 0.3.0 Jan 27, 2021
@j-cr
Copy link

j-cr commented Feb 23, 2021

It seems no one mentioned that defn wrappers don't compose, so there's that to consider too.

@vemv
Copy link

vemv commented Feb 23, 2021

That seems worth expanding with facts, since I can literally (comp f g) where f was defined with Schema and g with Orchestra

In the end there's a IFn protocol which all solutions have to speak to be useful. How a given library decides to perform its validations is not necessarily relevant to its consumers.

It's a problem similar to logging - there won't be ever (at least in JVM land) a one-true solution, so libraries have to pick something and trust that consumers won't care or will have it reasonably easy to tune things.

@j-cr
Copy link

j-cr commented Feb 23, 2021

@vemv I was referring to defn-like macros themselves, not the resulting fns (see the link in the comment above for details)

@ikitommi
Copy link
Member Author

ikitommi commented Mar 14, 2021

Learned that Plumatic also supports metadata on argument symbols:

(ms/defn fun :- [:tuple :int :int]
  "return number and the square"
  [^:int x]
  [x (* x x)])

is equivalent to:

(ms/defn fun :- [:tuple :int :int]
  "return number and the square"
  [x :- :int]
  [x (* x x)])

... sadly, Clojure doesn't allow vectors with this:

(ms/defn fun
  [^[:tuple :int] x]
  x)
; =throws=> Metadata must be Symbol,Keyword,String or Map

so instead:

(ms/defn fun
  [^{:tag [:tuple :int]} x]
  x)

which is worse IMO than:

(ms/defn fun
  [x :- [:tuple :int]]
  x)

@vemv
Copy link

vemv commented Mar 14, 2021

which is worse IMO than: [...]

one can also consider that non-reusable inline specs are not supposed to be something to be used all the time. The https://clojure.org/about/spec document emphasizes names and reusability (a point that IIRC is expanded in RH's talks). So, making inline specs easy but not too easy is kind of a feature :)

In the wild it's common to see Clojure people repeating an Int annotation while that knowledge could be named instead. That seems pretty wrong and if libraries can have some influence in correcting that, so much the better?

Other objections also apply. The most practical one being that IDEs, formatters and linters would have to deal with extra syntax.

@marciol
Copy link

marciol commented Apr 8, 2021

One interesting advantage of providing this syntax, even as a separated library, is the possibility to make easier the migration of old Schema projects to Malli.

@vemv
Copy link

vemv commented Apr 8, 2021

One interesting advantage of providing this syntax, even as a separated library, is the possibility to make easier the migration of old Schema projects to Malli.

While it's true that a separate library is fairly harmless, there's also a substantial 'risk' that this library could actually become the main thing. People can always end up using an arbitrary choice if there are no other appealing options.

Coincidentally, as a (legacy) user of Plumatic Schema I've sporadically worked on a tool that rewrites Schema syntax to a non-syntax (i.e. the metadata-based defn forms I tend to advocate).

Here it is: https://github.com/threatgrid/clj-experiments/tree/master/refactor-defn

And here are some examples (inputs -> outputs): https://github.com/threatgrid/clj-experiments/blob/21314bd62c0fdef4035c2166538b737efc869be0/refactor-defn/test/unit/cisco/refactor_defn/impl/rewriting.clj#L21-L34

This tool actually works - said tests pass. I wrote this thing some months before https://github.com/clj-commons/rewrite-clj/ was revamped. Now that it's distributed as an alpha I could actually go ahead and publish my POC :)

I could further work on my projects (speced.def, refactor-defn) for Malli compat, if there's sufficient interest from Malli side.

@ikitommi
Copy link
Member Author

ikitommi commented Jan 9, 2022

@vemv I totally missed or forgot your last comment. Just merged the plumatic syntax into malli, under malli.experimental. While doing that, I did a Malli Schema for both Defn and Destructuring syntax, which allows easy translation from plumatic to meta-data based solution. Does not rewrite, but understands and parses the syntaxes IMO correctly. Should work with Plumatic too.

Anyway, malli now has support for plumatic syntax: just reads the schema from args, emits normal defn and m/=>:

(require '[malli.experimental :as mx])

(macroexpand-1
 `(mx/defn kikka :- :int
    "this is a kikka"
    [x :- :int]
    (inc x)))
;(do
;  (clojure.core/let
;   [defn__21996__auto__
;    (clojure.core/defn
;      user/kikka
;      "this is a kikka"
;      {:raw-arglists (quote ([user/x :- :int])), :schema [:=> [:cat :int] :int]}
;      ([user/x] nil (clojure.core/inc user/x)))]
;    (malli.core/=> user/kikka [:=> [:cat :int] :int])
;    defn__21996__auto__))

@ikitommi ikitommi closed this as completed Jan 9, 2022
@vemv
Copy link

vemv commented Jan 9, 2022

No issue, feel free to reach out at some point if interested in this metadata-based syntax (particularly outside of defn: let, defprotocol, etc)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request for discussion Discussion is main focus of PR
Projects
None yet
Development

Successfully merging a pull request may close this issue.