Issues with Max/Avg Per Property measure #65
Comments
LorenzBuehmann added the bug and RDF Spark labels on Jul 3, 2018
LorenzBuehmann added this to the 0.5 milestone on Jul 3, 2018
LorenzBuehmann assigned LorenzBuehmann and GezimSejdiu on Jul 3, 2018
added a commit that referenced this issue on Jul 6, 2018
patrickwestphal added the RDF label on Aug 16, 2018
GezimSejdiu closed this on Dec 6, 2018
LorenzBuehmann commented on Jul 3, 2018 (edited)
SANSA-RDF/sansa-rdf-spark/src/main/scala/net/sansa_stack/rdf/spark/stats/RDFStatistics.scala, lines 309 to 316 in 4b1e3ab, does not do what it should:
- `filter` keeps triples whose object is a URI such as `xsd:int`, etc. - this is clearly wrong; it has to filter by the datatype of objects that are literals.
- `RDD` anymore?
- `takeOrdered(1)` returns exactly 1 element from the RDD w.r.t. the ordering; thus, you'll get only one pair, the one with the highest value as second element among all pairs, independently of the property.
- `Int` only; thus, only `xsd:int` would be covered - what about `xsd:float` and `xsd:dateTime` values?
- `(Triple, Int)` is returned, but it should be `(Node, Scalatype_of_Literal)` - how do you want to do this generically? I guess we should return `RDD[(Node, Node)]`.

The same holds for the Avg Per Property measure.
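
To make the intended behaviour concrete, here is a minimal sketch of a per-property maximum. It uses plain Scala collections rather than Spark. The `Node`, `Uri`, `Literal` and `Triple` types are hypothetical stand-ins for the Jena classes, and the set of numeric datatypes is an assumption, not what RDFStatistics.scala does. It only considers objects that are literals with a numeric datatype, and it keeps one maximum per property instead of one global element.

```scala
object MaxPerPropertySketch {
  // Hypothetical stand-ins for Jena's Node/Triple, for illustration only.
  sealed trait Node
  final case class Uri(value: String) extends Node
  final case class Literal(lexical: String, datatype: String) extends Node
  final case class Triple(s: Node, p: Node, o: Node)

  val XSD = "http://www.w3.org/2001/XMLSchema#"

  // Assumed set of numeric datatypes; extend as needed.
  val numericTypes: Set[String] =
    Set("int", "integer", "long", "float", "double", "decimal").map(XSD + _)

  // Numeric value of an object, but only if it is a literal whose
  // datatype is numeric. URIs (even xsd:int itself) yield None.
  def numericValue(o: Node): Option[BigDecimal] = o match {
    case Literal(lex, dt) if numericTypes(dt) =>
      scala.util.Try(BigDecimal(lex.trim)).toOption // skips INF/NaN/garbage
    case _ => None
  }

  // One maximum per property, returned as the original literal node,
  // i.e. the local analogue of an RDD[(Node, Node)].
  def maxPerProperty(triples: Seq[Triple]): Map[Node, Node] =
    triples
      .flatMap(t => numericValue(t.o).map(v => (t.p, (v, t.o))))
      .groupBy(_._1)
      .map { case (p, vs) => p -> vs.map(_._2).maxBy(_._1)._2 }
}
```

On an RDD, the same per-key reduction could be expressed with `reduceByKey`, which keeps the result distributed; `takeOrdered` returns a local `Array` instead.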
In addition, what would be the average of some `xsd:dateTime` values?
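
One possible reading, sketched here purely as an assumption and not as the behaviour of RDFStatistics.scala: numeric values are averaged per property, while `xsd:dateTime` values are mapped to epoch milliseconds and the mean instant is returned. Whether that is a meaningful statistic is exactly the open question. The helper names are hypothetical, properties are plain strings for brevity, and only lexical forms with an explicit time zone are handled.

```scala
import java.time.{Instant, OffsetDateTime}

object AvgPerPropertySketch {
  // Average of already-extracted numeric values, grouped by property IRI.
  def avgPerProperty(pairs: Seq[(String, BigDecimal)]): Map[String, BigDecimal] =
    pairs
      .groupBy(_._1)
      .map { case (p, vs) => p -> vs.map(_._2).sum / vs.size }

  // Mean of dateTime lexical forms, computed on epoch milliseconds.
  // Assumes every value carries a time zone (e.g. a trailing "Z");
  // OffsetDateTime.parse throws on lexical forms without one.
  def meanInstant(lexicals: Seq[String]): Option[Instant] = {
    val millis = lexicals.map(l => OffsetDateTime.parse(l).toInstant.toEpochMilli)
    if (millis.isEmpty) None
    else {
      // BigInt avoids Long overflow when summing many timestamps.
      val mean = (millis.map(BigInt(_)).sum / millis.size).toLong
      Some(Instant.ofEpochMilli(mean))
    }
  }
}
```

For example, `meanInstant(Seq("2018-07-01T00:00:00Z", "2018-07-03T00:00:00Z"))` gives the instant halfway between the two dates.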