Refactor vandalism detection package #12
Feature extraction involves no Spark or Flink, just string manipulation, so I moved it up to the common layer, where a Spark or Flink module can use it in the future.
Hello @GezimSejdiu |
Hey @GezimSejdiu
Thank you very much for your contribution! Indeed, having a common layer for feature extractions is really an interesting factorization.
I approve these changes for a merge. 👍
Hi @dgraux , Best, |
Hi @GezimSejdiu
Thank you very much for the unit-test addition.
These are being deprecated since the same parsers/readers already exist in the SANSA io. See https://github.com/SANSA-Stack/SANSA-RDF/tree/develop/sansa-rdf-spark/src/main/scala/net/sansa_stack/rdf/spark/io for more details.
The TRIX RDF serialization format will be supported by the SANSA-RDF io.
It has been disabled for now since modeling the data (based on the input dataset) takes too long. We are going to provide a smaller dataset that takes less time to run.
This PR cleans up and refactors the vandalism detection package. It introduces a common layer for feature extraction, since extraction relies only on string manipulation and can later be reused by the Spark and Flink modules.
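To illustrate the idea of an engine-independent common layer, here is a minimal sketch in plain Scala. The object name `RevisionFeatures`, its methods, and the chosen features are hypothetical and are not taken from this PR; the only point is that nothing here imports Spark or Flink.

```scala
// Hypothetical common-layer module: plain Scala, no Spark or Flink imports.
// Names and features are illustrative assumptions, not the PR's actual code.
object RevisionFeatures {

  /** Fraction of letters that are upper case; 0.0 when the text has no letters. */
  def upperCaseRatio(text: String): Double = {
    val letters = text.filter(_.isLetter)
    if (letters.isEmpty) 0.0
    else letters.count(_.isUpper).toDouble / letters.length
  }

  /** Length of the longest run of one repeated character; 0 for empty text. */
  def longestCharRun(text: String): Int = {
    var best = 0
    var current = 0
    var previous: Option[Char] = None
    for (c <- text) {
      // Extend the run if the character repeats, otherwise start a new run.
      current = if (previous.contains(c)) current + 1 else 1
      best = math.max(best, current)
      previous = Some(c)
    }
    best
  }

  /** Collects all features for one revision text into a name -> value map. */
  def extract(text: String): Map[String, Double] = Map(
    "upperCaseRatio" -> upperCaseRatio(text),
    "longestCharRun" -> longestCharRun(text).toDouble,
    "length"         -> text.length.toDouble
  )
}
```

Because `extract` is a pure function on `String`, an engine-specific module could presumably call it from inside its own `map` over a distributed collection without the common layer depending on either engine.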
Best regards,