Enable no-redeclare rule #94
Comments
Since it's also time to move away from Immutable, I will do this in this issue as well. This issue will thus also remove all references to the …
I stumbled into a problem moving away from Immutable. In …
I also don't think @rubensworks will let me write my own equals within the Array prototype :p.
This is indeed some difficult code, as we basically define a DSL for function overloading in TypeScript in a type-safe way. Lemme check! In the meanwhile, what's the reason for moving away from Immutable (I'm not up to date with the ecosystem)? Context: a couple of years ago I think Immutable was heavily used in Comunica, and @rubensworks was definitely a fan of using it in Sparqlee as well. I needed a List object that was a valid entry for a HashMap key (which a standard JS array is not). I'll have a look at the code, I have some ideas.
I don't really know the reason. Maybe @rubensworks isn't that good of a friend with them anymore. I love that you have some ideas, because I am fresh out of them! Btw, what does DSL stand for? (sorry)
So the general gist behind OverloadMap is that SPARQL functions are overloaded (the same function can take a different number of arguments, or arguments of different types). OverloadMap keeps an implementation of the function for each overload. Overloads are uniquely defined by the list of arguments to the function and their types (which is runtime information). So when we know at runtime which concrete types are passed to our function, we can look up the corresponding definition in the OverloadMap.

It becomes more difficult due to subclassing/subtypes. When a function takes a generic RDF Term, it can be a Literal, a blank node, or an IRI. So we can't just look up the concrete types, as those might not be in the overload map (only a definition for Term would be there). This is why we have the monomorphization function: it looks implementations up in the overload map in order of increasing generality. It first looks up the most concrete types (these are specific literal types, such as int and string), then more general terms (literal, blank, IRI), and only then does it consider the arguments as general Terms.

THIS IS VERY LIMITED; it's basically sufficient only because there are no overloads that mix different levels of granularity. E.g. if a function existed that takes 2 arguments, with 3 overloads: (int, term), (str, term), and a fallback (term, term), we would always use the fallback (as the second argument wouldn't match in the OverloadMap until we come to the last monomorphization step). More soon.
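As a rough illustration of that lookup order (the comma-joined key scheme and all names here are my own sketch, not Sparqlee's actual API), the generalize-all-arguments-at-once behaviour could look like:

```typescript
// Illustrative sketch of monomorphization by increasing generality.
type Impl = (args: string[]) => string;

// Keys are comma-joined lists of argument type names.
const overloadMap = new Map<string, Impl>([
  ['integer,integer', () => 'int impl'],
  ['literal,literal', () => 'literal impl'],
  ['term,term', () => 'term fallback'],
]);

// Each type generalizes one level at a time: integer -> literal -> term.
const generalize: Record<string, string> = { integer: 'literal', literal: 'term' };

function monomorphize(types: string[]): Impl | undefined {
  let current = types;
  for (;;) {
    const impl = overloadMap.get(current.join(','));
    if (impl) return impl;
    // Stop once no argument can be generalized any further.
    if (current.every(t => !(t in generalize))) return undefined;
    // NOTE: ALL arguments are generalized together, which is exactly the
    // limitation described above: mixed granularities are never tried.
    current = current.map(t => generalize[t] ?? t);
  }
}
```

Note how a call with types `['integer', 'term']` skips any hypothetical `(int, term)` overload and lands on the `(term, term)` fallback, matching the limitation described above.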
DSL is Domain Specific Language. |
Note: when I say types in the above, these are types of the RDF terms expressed in strings, not TS types! |
Option 1 (the straightforward approach): you create a class that wraps plain JS arrays, but defines a sound equality operator.

Option 2 (fix the problem in the meanwhile): now we have a single Map per function, but what we actually want is a tree (of maps). Bear with me. The root-level map has as keys the allowed types of the first argument of the function, and as values it has maps. These maps have as keys the allowed types of the second argument, and as values again maps. When I say "the values are maps", these are the intermediate nodes of the tree, but the values can also be concrete implementations; these are the leaves of the tree, and are reached when the function can take no more arguments. The advantage is that this does implementation resolving correctly (and more flexibly), monomorphization is still constant time, and the keys (which are the type of an argument) are just a single string, so we don't even need an ES6 Map, a JavaScript object will do. Type signature: …
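A minimal sketch of what such a tree of maps could look like (names like `OverloadNode` and `lookup` are illustrative, not the actual Sparqlee code):

```typescript
// One tree per function: each level keys on ONE argument's type name,
// leaves hold implementations and are reached when no arguments remain.
type Impl = (...args: string[]) => string;

interface OverloadNode {
  [argType: string]: OverloadNode | Impl;
}

// A hypothetical CONCAT with a single (string, string) overload.
const concatOverloads: OverloadNode = {
  string: {
    string: (a, b) => `${a}${b}`, // leaf
  },
};

function lookup(node: OverloadNode, types: string[]): Impl | undefined {
  let current: OverloadNode | Impl = node;
  for (const t of types) {
    if (typeof current === 'function') return undefined; // too many args
    const next: OverloadNode | Impl | undefined = current[t];
    if (next === undefined) return undefined; // no matching overload
    current = next;
  }
  return typeof current === 'function' ? current : undefined;
}
```

Because each key is a single string, a plain object index signature suffices here, as the comment above suggests.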
I do like your second approach. I think I'll try that. I have one question tho (I might be able to just look it up too, but you know): is there a function that tests if a certain type is inherited? I know it's not clear what I mean, an example should do it. You said: …

I can imagine this also goes for types like double and int? Do we have a function that would test, provided an int, that we know it's also a double? Or is this not necessary? My question just boils down to the fact that I don't understand how the sub-classing problem would be solved using the type signature you mentioned.
+1 on the raw hash-based approach. Constant-time lookups are definitely something we want here, as performance is key.
So you prefer the first one? Alright, I guess I'll be checking that one out then :D.
Both are constant time, since the number of arguments is constant (with the exception of varargs, but there you can't escape it). I think the performance overhead will be negligible, or potentially even better in the tree-based approach, as no string manipulation needs to be done. It's just (1/2/3 key lookups) times (superclassing checks) vs (string concat) times (superclassing checks). The more structure is encoded in the data, the less needs to be done at runtime.
There is not, cause all functions are 'inherited'. You have 3 options: (abstract) Literals, Blank Nodes, and IRIs, all 3 of which are a Term. The literals are divided again into concrete types: int, float, string, ... There are also some automatic casting rules IIRC (which are currently not implemented), which might be solvable with the tree approach.

So I don't think this is necessary. We know the exact inheritance tree. It's static. So just "hardcode" it.

The subclassing isn't really solved by the type signature, but by the way its structure allows us to select concrete types. Now we can incrementally interpret each argument more generally, on a per-argument basis. This is just a way to check all possible options/combinations. You could also do this with the plain map, but then you would have A LOT of options, increasing combinatorially for each extra argument and allowed type. I'd be happy to have a call about this.
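Hardcoding the static inheritance tree, as suggested above, could be as simple as a parent-pointer table plus a walk (type names are illustrative):

```typescript
// Hardcoded, static RDF term hierarchy: each type points to its direct
// supertype; `term` is the root.
const superTypeOf: Record<string, string | undefined> = {
  integer: 'literal',
  float: 'literal',
  string: 'literal',
  literal: 'term',
  blankNode: 'term',
  namedNode: 'term',
  term: undefined,
};

// Check whether `sub` is (transitively) a subtype of `sup`,
// counting a type as a subtype of itself.
function isSubTypeOf(sub: string, sup: string): boolean {
  let current: string | undefined = sub;
  while (current !== undefined) {
    if (current === sup) return true;
    current = superTypeOf[current];
  }
  return false;
}
```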
It might not be a bad idea to start with the hash-based approach in any case, just to get a feel for the code. It's not gonna be an easy ride. If there's time left, the tree approach can still be done (which would be very cool!).
Ah no, I was actually referring to the second approach (I was referring to "so we don't even need an ES6 map, a javascript object will do"). For reference, this issue sounds a bit similar to an implementation of the …
I think I understand what you mean, Wout. Having a call to make sure we're on the same page on this would not be that bad, since it is pretty complicated. And a vital piece of code, considering how many tests broke :p.
After calling we decided to also fix the autocast issue mentioned earlier (hardcoded and starting only with xsd). I will briefly describe what we'll try. Now, the overload map itself will have a type like … Please know this is a first draft of what we'll try to implement and it can still be changed significantly.
Or in pseudo-code: …
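A minimal sketch of the extensionTable idea described above, assuming a hardcoded table of xsd derivations (all names and the table contents are illustrative, not the agreed design):

```typescript
// Two-tier resolution: overload keys are known type IRIs, and the
// extensionTable maps a derived xsd type to its chain of supertypes,
// most specific first (hardcoded, xsd only for now).
const XSD = 'http://www.w3.org/2001/XMLSchema#';

const extensionTable: Record<string, string[]> = {
  [`${XSD}byte`]: [`${XSD}short`, `${XSD}int`, `${XSD}integer`],
  [`${XSD}short`]: [`${XSD}int`, `${XSD}integer`],
};

type Impl = () => string;

function resolveArg(argType: string, overloads: Record<string, Impl>): Impl | undefined {
  // Exact match first, then walk the extension chain toward more
  // general xsd types.
  for (const t of [argType, ...(extensionTable[argType] ?? [])]) {
    const impl = overloads[t];
    if (impl) return impl;
  }
  return undefined;
}
```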
Just to be absolutely clear, and because I've made this mistake: the …
Will you handle being a subtype of Literal and Term with this table as well? (But I'm guessing not, since this approach only makes sense for concrete types.) Anyway, this is very much the right direction! The key part to focus on is rewriting the … Anyway, you will hit another difficulty with the extensionTable, specifically: how do you find the …
So from a map like that you need to extract xsd:int and xsd:string (and filter out literal and term), and check in the extensionTable for which of these the argument is valid. I think I should also warn against moving to a more object-oriented approach (OverloadNode & Leaf) at this stage (vs the functional style it is now), although I give this warning with low confidence. I don't think the surrounding code is going to work well with it (especially the function definitions DSL), and rewriting everything at once might be a mess.
I just noticed that our tree structure will need to be searched depth-first. Example: …
Sniped!
To find the … This way we can make some kind of stack and have the order of the stack defined (first the first argument by 'correctness', then the second by correctness).
I probably wasn't clear enough in my pseudo-code, but I would suggest pre-calculating all possible extension types of each given type during the init phase (…), so that … In that case, lookups can be done in constant time, instead of …
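That precomputation could be sketched as follows: from direct parent links, build the full supertype list for every type once at init, so nothing needs to be walked at lookup time (type names are illustrative):

```typescript
// Direct parent links for a few xsd types (illustrative).
const parentOf: Record<string, string | undefined> = {
  byte: 'short',
  short: 'int',
  int: 'integer',
  integer: undefined,
};

// Precomputed at init: type -> ALL its supertypes, most specific first.
const superClosure: Record<string, string[]> = {};
for (const type of Object.keys(parentOf)) {
  const supers: string[] = [];
  for (let p = parentOf[type]; p !== undefined; p = parentOf[p]) {
    supers.push(p);
  }
  superClosure[type] = supers;
}
```

After init, `superClosure` answers "which types does this argument also count as?" with a single key lookup.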
I know. I would also just calculate these max distances during the …
Not sure what the value of knowing these max distances is then.
I would only use this system for literals with a non-default … So the two-tier system you talk about makes sense to keep.
Yes, in a way. But take …
Suggestion: what if we turn the extension map on its head? For every type, we have an (ordered) list of supertypes. Then what you do is just check, in order, whether that supertype is present as a key in the overloads. I think this also matches more closely with how sub/super method resolving works in OO programming. This also extends nicely to Literal and Term, which should just always be at the end of the list (which can of course be handled specially), and is just a generalization of the way it currently works. E.g. …
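A sketch of this supertype-list resolution (the lists and names are illustrative; the generic literal/term levels sit at the end of each list, as described):

```typescript
type Impl = () => string;

// Every type maps to an ORDERED list of supertypes, ending in the
// generic levels, so resolution is just a walk over that list.
const superTypes: Record<string, string[]> = {
  'xsd:int': ['xsd:integer', 'xsd:decimal', 'literal', 'term'],
  'xsd:string': ['literal', 'term'],
};

function resolveOverload(argType: string, overloads: Record<string, Impl>): Impl | undefined {
  // Try the type itself, then each supertype in order of specificity.
  for (const t of [argType, ...(superTypes[argType] ?? ['term'])]) {
    const impl = overloads[t];
    if (impl) return impl;
  }
  return undefined;
}
```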
That could work, but would require linear time in the number of supertypes (which could be pretty big?). I think linear time in the number of overloads would be better? So for a certain node we would check all types it can accept (this is max like 3 or something?) and then take the one with the lowest distance I already spoke of. This would be n log(n) (we need to sort the distances so we can add them to the stack in the right order), with n = overloaded arguments.
If I'm not mistaken, your approach would be n*m (n = overloaded arguments, m = number of supertypes)? For every overloaded type you would need to check if it is contained within the list of supertypes. Although writing this, I realize you could also make this the same n log(n) I talked about in my approach.
I think you can ignore the overloads of types that are adjacent (not in a sub/super relation) with the approach I suggested.
… and an argument of type …
But my approach is not really relevant, since I do like the idea of the original extension types: it's less "clean" (not the usual way to do it), but it will likely be faster because there will generally be fewer overloads than supertypes. But what you need is not a distance, but an ordering. If you have …
Your types and their sub/super relations form a tree. Each of the types is thus at a specific level (or depth). What you can do is walk over the overloads from deepest to highest. If the argument's type is deeper (or at the same level) and part of that subtree (in the extension map), it fits. If the extension map is a set, this is (constant time) * n_overloads.
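That walk could be sketched like this, assuming each overload entry precomputes its depth and its subtree as a set (all names are illustrative):

```typescript
type Impl = () => string;

interface OverloadEntry {
  type: string;
  depth: number;          // depth of `type` in the type tree
  subtree: Set<string>;   // every type at or below `type`, incl. itself
  impl: Impl;
}

function resolveByDepth(argType: string, overloads: OverloadEntry[]): Impl | undefined {
  // Sorting here keeps the sketch short; in practice the overload list
  // can be stored pre-sorted at init time.
  const sorted = [...overloads].sort((a, b) => b.depth - a.depth);
  for (const o of sorted) {
    // Constant-time set membership check per overload.
    if (o.subtree.has(argType)) return o.impl;
  }
  return undefined;
}
```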
* Fix non-'Immutable' no-redeclare rule
* Quicksave before boss battle (Converting Helpers)
* Why do we not have unit tests? Debugging this is gonna hurt
* Compiling & tested version after enabling no-redeclare rule #94
* Fix backwards compatibility issue
* Resolve changes requested by @rubensworks
* Resolve changes requested by @wschella
Question for @wschella regarding implementation. Online I found an overview of all xsd types and their relations. Within Sparqlee I found an object with type URLs. I was wondering whether there's a reason not all xsd types are in the object?
Because I think these are the only ones the SPARQL spec talks about (in all function specifications). I also see no mention of http://www.w3.org/2001/XMLSchema#dayTimeDuration there, which is definitely required for SPARQL …
It would be very nice to create such a graph ourselves and add it to the repo. E.g. SPARQL also uses the terminology "numeric", and I don't see any reference to langString in this schema either.
+1, this would be highly valuable! (might even want to add it to the Comunica website)
So you're saying this is still incomplete?
@jitsedesmet Yes and no. How this organization maps to the types mentioned in the SPARQL spec is not clear to me, so for our purpose it's incomplete. There needs to be explicit mention of at least 2 extra concrete types: …
Also, many other types are largely irrelevant, but could be included for completeness I guess.
So what are you suggesting? Support all types (the ones I found yesterday + the object with type URLs)? Or support only the object? I wouldn't see a reason to support something we don't need. It also wouldn't be consistent; we need to draw a clear line on what we support and what not.
I suggest you do what you think is best! And I agree there's no need to support things we don't need, and a clear line is better.
I'll try with just the types already within Sparqlee.
For now, I'll be working with the types from https://www.w3.org/TR/xmlschema/#built-in-datatypes. There seems to be no reason to take http://www.w3.org/1999/02/22-rdf-syntax-ns#langString into account, as its definition is:

rdf:langString a rdfs:Datatype ;
    rdfs:subClassOf rdfs:Literal ;
    rdfs:isDefinedBy <http://www.w3.org/1999/02/22-rdf-syntax-ns#> ;
    rdfs:seeAlso <http://www.w3.org/TR/rdf11-concepts/#section-Graph-Literal> ;
    rdfs:label "langString" ;
    rdfs:comment "The datatype of language-tagged string values" .

In other words, it seems to be just a literal? It doesn't say it's an xsd string?
@wschella, since I'm editing the type links, I wonder if I can make the links used in these objects just point to …
If you can, that would be nice! When I wrote this, TS couldn't handle this cleanly yet.
There definitely is reason to take rdf:langString into account, as many SPARQL functions work specifically on that datatype! For these things, I always worked from the types as mentioned here https://www.w3.org/TR/sparql11-query/#OperatorMapping and here https://www.w3.org/TR/sparql11-query/#SparqlOps
You'll, for example, also see mention of 'numeric' there. We want to represent this in the graph as well, I think, as the graph should map cleanly to the SPARQL spec.
Apparently it still can't. I should have tested this before asking the question. Sorry!
After having a call with @wschella and considering his comment #101 (comment), we probably want to rewrite …
While trying to fix #93 it became clear I needed to disable the no-redeclare rule, because an implementation of Set is used in the file lib/util/Consts.ts. I couldn't just use the default implementation of Set because a union function is used. This issue, tho probably small, aims to enable the no-redeclare rule.