Prevent normalization of records #18
Conversation
- Records behave both like maps and collections, but should probably be treated as opaque by Pyramid.
```clojure
;; Don't normalize records
(record? v) v
```
I'm not so sure this is right yet. I'm not sure it's wrong yet either.

The root cause of #17 is that in `main` this takes the `(map? v)` path, which calls `(empty v)`. In Clojure this is an error; it seems that in ClojureScript this returns `nil`.
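The divergence is easy to reproduce at a REPL. A minimal sketch, using a hypothetical `Point` record (not from the codebase); the platform-specific results are shown as comments:

```clojure
(defrecord Point [x y])

;; Clojure (JVM): calling `empty` on a record throws
;; (empty (->Point 1 2))
;; => UnsupportedOperationException: Can't create empty: user.Point

;; ClojureScript: the same call reportedly returns nil,
;; so the (map? v) branch proceeds with nil instead of failing fast
;; (empty (->Point 1 2))
;; => nil
```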
Another potential fix would be to appropriately consume and reconstruct the record here. I've been reading the source for `clojure.walk` a lot recently, and the way they do it is by using the record `v` as the initial arg, e.g.
```diff
-;; Don't normalize records
-(record? v) v
+;; Normalize records while preserving type
+;; TODO could we move this into the `map?` branch and just change what we pass in to `into`
+(record? v) (into v (map (juxt first (comp #(replace-all-nested-entities identify %) second))) v)
```
I haven't tested this yet, but something akin to that should preserve the type of `v` while also allowing us to descend into it and continue normalizing. In the case of the lambdaisland/uri data, I expect it wouldn't change anything about it unless you configured your `identify` function to do something special with the shape of URI records. What do you think?
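For what it's worth, `into` with a record as the initial value does return a record of the same type, which is what makes the suggestion above plausible. A quick sketch with a hypothetical record:

```clojure
(defrecord Point [x y])

(def p (->Point 1 2))

;; conj-ing entries onto a record via `into` keeps the record type
(into p {:x 10})
;; => #user.Point{:x 10, :y 2}

(record? (into p {:x 10}))
;; => true
```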
Interesting! But aren't the semantics of normalizing something from inside a record rather dubious?

I.e., if Pyramid finds something inside the record that looks like an entity, is it correct to modify the record and leave a reference in its place? I'd guess that it would very, very rarely be correct. And if it's not correct, what's the point of descending into them? Unlike `clojure.walk`, Pyramid has pretty clear semantics, and doesn't really need to do general-purpose processing over arbitrary structures.
I agree it's a bit nebulous at this point. I am not a heavy user of records, and so it's hard for me to anticipate how people might want to use them with pyramid.
Since I have to choose, I think I am leaning towards treating records like maps and normalizing them and their contents, since the point of records is that you have something that acts like clojure data but can be extended with additional protocols.
I think it's unlikely most records would contain information that should be normalized, but if they do, it would be unfortunate as a user of the library to be unable to pierce that veil.
I think the opposite is true as well: if the expectation is that the record subscribes to a certain shape, it would be unfortunate to have that shape changed in an unexpected manner.
Given that records behave like a type (but without a special mini-DSL around them), enumerate their fields, and can have those fields type-hinted, then if a record does take a map as one of its fields, I think it's unlikely that any protocol extending it would also expect to find an ident in that same field.

If you do have a record that incidentally has something that could be normalized in one of its fields, and you want to extract it, intact, after it's been through Pyramid, how would you do it? Will `pull` reconstruct the record in the correct shape?
I think an expectation of records is that you can create something that is named, doesn't require a special DSL, but also has integrity. I think pushing the normalization straight through them will break the last part of that promise.
`pull` could certainly recursively reconstruct descendants of records, although the type of the result of pulling a record would depend on the user asking for all of the requisite fields of the record in the query; otherwise it would have to return a map, just as `dissoc`ing a field from a record returns a map.
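That `dissoc` behavior is easy to confirm. A minimal sketch with a hypothetical record:

```clojure
(defrecord Point [x y])

;; dissoc-ing a declared field demotes the record to a plain map
(def m (dissoc (->Point 1 2) :x))

(record? m) ;; => false
(map? m)    ;; => true

;; dissoc-ing a key that is not a declared field keeps the record type
(record? (dissoc (->Point 1 2) :z)) ;; => true
```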
Perhaps there's a third way: could we create an API that would allow a user to opt in or out of the normalization process per type? I could see this done with a protocol that a record can implement if its author chooses to opt in, e.g.
```clojure
(ns pyramid.core)

(defprotocol EntityRecord)
```

```clojure
(ns foo.bar
  (:require
   [pyramid.core :as p]))

;; opt in to normalization
(defrecord User [id name friends]
  p/EntityRecord
  ,,,)

;; this would not be normalized, nor would any of its fields be normalized,
;; because it does not implement p/EntityRecord
(defrecord SystemOperation [id status])
```
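Not part of the proposal above, but one way a normalization walk could consult such a marker protocol is via `satisfies?`. A hypothetical sketch (all names are mine, not Pyramid's):

```clojure
(defprotocol EntityRecord)

(defn normalizable?
  "Hypothetical helper: true when the walk should descend into v.
  Records stay opaque unless they opt in via EntityRecord."
  [v]
  (if (record? v)
    (satisfies? EntityRecord v)
    (map? v)))

(defrecord Opaque [a])

;; listing a protocol with no methods is enough to opt in
(defrecord OptedIn [a]
  EntityRecord)

(normalizable? (->Opaque 1))  ;; => false
(normalizable? (->OptedIn 1)) ;; => true
(normalizable? {:a 1})        ;; => true
```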
Opting in to normalization is an intriguing idea. To me, that would be the unsurprising option, that still allows for the functionality.
From a purely conceptual perspective, treating records as anything other than maps seems unidiomatic. If there's an interface that describes the data, that might be reasonable, but an interface that is very specific to pyramid and that just makes data less open and extensible seems like the wrong direction. Maybe listing some specific use cases that might want this would be helpful.
That would depend on what you mean by "unidiomatic". From Pyramid's perspective, @lilactown would ultimately have to decide what's idiomatic or not. From Clojure's perspective, I'd argue that it's idiomatic. Records support the interface of maps because we don't want to write a new API every time we create a record. This doesn't mean that they are semantically interchangeable with maps in every circumstance. Can we safely assume that every record we encounter will not be modelling some type or unit, where there are assumptions as to how the internals of the record look?

Here's an example. Let's say I'm making a CRDT, and I'm modelling a range:

```clojure
(defrecord Range [type begin end])

(Range. :bold {:id 0 :target/id 100} {:id 1 :target/id 102})
```

When normalizing this, Pyramid would effectively convert the record into a new record that looks like this:

```clojure
(Range. :bold [:id 0] [:id 1])
```

Is this a good idea? I'm not sure, it depends on the expectations. One thing is for sure: the

Regardless, the two scenarios are:

In scenario 2, I have both options available to me. In scenario 1, I have no power over what happens. There's potentially option 3 as well, as described by @lilactown here.
@eneroth just to be clear, there is already a way to avoid normalizing of any data: you can change the "identify" function that the db is constructed with. Example:

```clojure
(require '[pyramid.core :as p])
(require '[pyramid.ident :as ident])

(def db (p/db [] (ident/by-keys :person/id :product/id)))

(defrecord Range [type begin end])

(def db1 (p/add db {:person/id 1 :person/friend {:person/id 2}}))

db1
;; => #:person{:id {2 #:person{:id 2}, 1 #:person{:id 1, :friend [:person/id 2]}}}

(def db2 (p/add db1 {:current-selection (->Range :select {:id 0 :target/id 100} {:id 1 :target/id 102})}))

db2
;; => {:person/id {2 #:person{:id 2}, 1 #:person{:id 1, :friend [:person/id 2]}},
;;     :current-selection
;;     {:type :select, :begin {:id 0, :target/id 100}, :end {:id 1, :target/id 102}}}
```

You can see here how this will only normalize information about

With the latest change in master, I would expect that records should have their type preserved (sort of by accident) due to the refactor I did to use

I'll think more about opting out on a per-type or per-object basis - checking
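One way to use this lever could be to wrap an identify function so records are never assigned idents at all. A sketch under stated assumptions: I'm assuming an identify fn takes an entity map and returns an ident vector or `nil`, as `ident/by-keys` appears to above; `skip-records` is a hypothetical helper, not part of Pyramid's API.

```clojure
;; hypothetical wrapper: records never get idents,
;; so they are never normalized into the db
(defn skip-records [identify]
  (fn [entity]
    (when-not (record? entity)
      (identify entity))))

(def db (p/db [] (skip-records (ident/by-keys :person/id :product/id))))
```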
Nothing is being "converted". Whether or not you receive

This exact same argument applies to maps. The main difference between normalizing and not normalizing is that if you don't normalize then 1) updating the entity with

I agree that figuring out how to have queries that allow shallow denormalization or deep denormalization is a tough question. I don't see how normalization depends on records when all the same arguments also apply to maps.

There are other options where whether or not a fully denormalized result is returned has nothing to do with whether or not a record was added to the db. Just to show how other dbs address this problem, asami has a
I don't think that is correct; normalization happens when you're constructing or adding to the DB, and denormalization when you query it.
True, but only relevant if the basic assumption is that maps and records are equal in all aspects and use cases. I understand that your perspective is that they are, but please understand that my perspective is that they are not.

```clojure
(def entity
  (Range. :bold {:id 0 :target/id 100} {:id 1 :target/id 102}))

(= entity (into {} entity))
;; => false
```

99.99997% of the time we use maps (for good reason), and the rest of the time there's some use case where

I think the main point of view here boils down to a question of record duck-typing: if it has the interface of a map, should it in all cases be treated like a map?

Consider the original example where this started: I switched from using

And yes, not being normalized, I couldn't expect the power that comes from normalization to apply here. Do I want all URIs to change database-wide, because I updated one of them? To me that's equivalent of asking if I want all instances of the number
Records aren't just map-like. Records are maps.

```clojure
(defrecord Foo [])

(map? (->Foo)) ;; true
```

This isn't even duck-typing. Duck typing means that you can implicitly or incidentally conform to an interface if you implement methods with the same name in a shared, global namespace. Records explicitly and intentionally implement the map interface.
I do not assume this, because it is not true. The different map implementations (e.g. hash, array, record, etc.) do differ. Some differences include performance and, in some cases, behavior. However, they all adhere to the map interface.
Records are "object-like" in the sense that they directly support polymorphism, but they are different from objects in most OO languages since they are immutable. It's true that a Record can't always be replaced with a map of the same keys and values, but that's also true for maps! Since maps can implement protocols via metadata, a map also can't be substituted for another map with different metadata in all cases. Generally, Clojure is written in terms of abstractions.
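The metadata point can be made concrete. Since Clojure 1.10, a protocol declared with `:extend-via-metadata` can be implemented by a plain map's metadata, so two equal maps need not be interchangeable. A sketch (the `Describe` protocol and names are mine):

```clojure
(defprotocol Describe
  :extend-via-metadata true
  (describe [this]))

(def plain {:a 1})

;; metadata keys are fully qualified symbols naming protocol methods
(def fancy (with-meta {:a 1}
             {`describe (fn [_] "implemented via metadata")}))

;; equal as values...
(= plain fancy) ;; => true

;; ...but not substitutable where the protocol matters
(describe fancy) ;; => "implemented via metadata"
;; (describe plain) would throw: no implementation of method
```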
Why would that happen? If they refer to the same URI entity, then yes, they should change. If they refer to different URI entities, then only the entities that are the same should update.
Why not? There are multiple options for achieving this.
For example, Pyramid already has a convention for determining identity. For any record, you could simply

Having said that, I think there might be good reasons for not treating records the same as maps. Since records are relatively rare and one of the goals for Pyramid is read performance, it's totally reasonable to treat maps and records differently to meet those goals. It also might make sense to treat records opaquely just to simplify the codebase. However, treating maps differently from records because "records aren't maps" doesn't make sense, because records are, in fact, maps. I think it's also totally plausible that it might actually simplify the codebase and provide more leverage to utilize the fact that records are maps.
Hello! Sorry for the long bout of silence; I finally got that covid I've heard so much about and have been out of commission for a bit. In the meantime, there seem to have been some changes in main that have unblocked the main problem we had.

I think we've reached a point where the pros and cons have been clearly spelled out, and I don't think we need to delve any further into this subject. Thanks for the discussion, and feel free to reopen if I'm wrong.
Records behave both like maps and collections, but should probably be treated as opaque by Pyramid.
This prevents,
from turning into,
And instead returns,
This is in reference to #17.