Move ActOfDataTransformation up to the Event Ontology #211

Open
APCox opened this issue Jan 10, 2024 · 2 comments

@APCox
Contributor

APCox commented Jan 10, 2024

cco:ActOfDataTransformation currently resides in a CCO extension ontology, but is generic enough that it more appropriately belongs in the CCO. Propose moving the term into the Event Ontology along with its parent class cco:ActOfInformationProcessing.

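# Prefix declarations assumed by the snippets below
@prefix : <http://www.ontologyrepository.com/CommonCoreOntologies/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

# http://www.ontologyrepository.com/CommonCoreOntologies/ActOfDataTransformation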
:ActOfDataTransformation a owl:Class;
  rdfs:subClassOf :ActOfInformationProcessing;
  :definition "An Act of Information Processing in which an algorithm is executed to transform one or more input Information Content Entities into one or more output Information Content Entities."@en;
  :elucidation "It is not a requirement that the output Information Content Entity(ies) be qualitatively distinct from the input(s) as a result of an Act of Data Transformation, though doing so is typically the goal of performing this Act. Consider, for example, selecting a column in an Excel spreadsheet then executing the \"Remove Duplicates\" Algorithm on it. The intent is to remove rows in that column containing duplicate content. If no duplicate values are present, the information in the column remains unchanged but an Act of Data Transformation was nonetheless performed."@en;
  :is_curated_in_ontology "http://www.ontologyrepository.com/CommonCoreOntologies/Mid/EventOntology"^^xsd:anyURI;
  rdfs:label "Act of Data Transformation"@en .

# http://www.ontologyrepository.com/CommonCoreOntologies/ActOfInformationProcessing
:ActOfInformationProcessing a owl:Class;
  rdfs:subClassOf :IntentionalAct;
  :definition "A Planned Act in which one or more input Information Content Entities are received, manipulated, transferred, or stored by an Agent."@en;
  :is_curated_in_ontology "http://www.ontologyrepository.com/CommonCoreOntologies/Mid/EventOntology"^^xsd:anyURI;
  rdfs:label "Act of Information Processing"@en .

@mark-jensen
Contributor

I agree these probably deserve a home in CCO-mid. I can see a case for scoping a domain ontology for information processing, but until then, let's keep them here.

The definition for ActOfDataTransformation is circular. What it means to transform something needs to be articulated. Presumably 'manipulated' in the parent class includes transformation.

First hit on Google:
"Data transformation is the process of converting, cleansing, and structuring data into a usable format ...". It goes on to suggest four types:

  • Constructive, where data is added, copied or replicated
  • Destructive, where records and fields are deleted
  • Aesthetic, where certain values are standardized, or
  • Structural, which includes columns being renamed, moved, and combined
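To make the four categories concrete, here is a minimal Python sketch that applies one operation of each kind to a small list of records. The sample records and the function names are hypothetical, chosen only for illustration; they do not come from the quoted source or from CCO.

```python
from copy import deepcopy

# Hypothetical sample records; not taken from the thread.
records = [
    {"name": "Ada", "country": "uk"},
    {"name": "Grace", "country": "US"},
]

def constructive(rows):
    """Add data: append a copy of the first record."""
    return rows + [deepcopy(rows[0])]

def destructive(rows):
    """Delete data: drop the 'country' field from every record."""
    return [{k: v for k, v in row.items() if k != "country"} for row in rows]

def aesthetic(rows):
    """Standardize values: upper-case every country code."""
    return [{**row, "country": row["country"].upper()} for row in rows]

def structural(rows):
    """Rename a column: 'name' becomes 'full_name'."""
    return [
        {("full_name" if k == "name" else k): v for k, v in row.items()}
        for row in rows
    ]
```

Each function takes input records and returns new output records without mutating the input, which loosely mirrors the input/output Information Content Entity framing in the proposed definition.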

Wikipedia says "In computing, data transformation is the process of converting data from one format or structure into another format or structure."

The question is: do we cast a wide net and allow transformation to include generating new content, e.g., when a table is "transformed" into a graph with added content provided by the semantic model, or do we limit it to formatting and structural changes? If the former, then how do we reconcile transformation with statistical and ML processes?
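The two readings can be contrasted with a rough Python sketch. Everything in it is an assumption made for illustration: the row, the `semantic_model` dictionary, and the predicate names are hypothetical, and triples are modelled as plain tuples rather than with an RDF library.

```python
# Hypothetical table row and semantic model; names are illustrative only.
row = {"id": "p1", "name": "Ada"}
semantic_model = {"row_class": "Person", "columns": {"name": "hasName"}}

def restructure(row, model):
    """Narrow reading: re-express each mapped cell as a
    (subject, predicate, object) triple."""
    return [(row["id"], pred, row[col]) for col, pred in model["columns"].items()]

def restructure_and_enrich(row, model):
    """Wide reading: also emit a type assertion that appears nowhere in the table."""
    return restructure(row, model) + [(row["id"], "type", model["row_class"])]

narrow = restructure(row, semantic_model)
wide = restructure_and_enrich(row, semantic_model)

# Triples whose content came from the semantic model rather than the table.
new_content = [t for t in wide if t not in narrow]
```

Under the narrow reading only `narrow` would count as the output of a transformation; under the wide reading the triples in `new_content` would count as well.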

@cameronmore
Contributor

Act of Data Transformation = An Act of Information Processing in which an algorithm is executed to act upon one or more input Information Content Entities, yielding one or more output Information Content Entities.

Saying 'act upon' avoids the problem of enumerating the possibilities of transformation (conversion, restructuring, etc.), and also (per the elucidation) allows for the possibility that the data is not changed, just acted upon. I may have a function that removes references to a certain word in a body of text, but if the text never contained that word, then the text data that was acted upon never actually changed.
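The word-removal case can be sketched in a few lines of Python. The function name `remove_word` and the sample strings are hypothetical; this is one possible illustration of the point, not a proposed implementation.

```python
import re

def remove_word(text: str, word: str) -> str:
    """Remove each whole-word occurrence of `word`, together with any
    whitespace that immediately follows it."""
    return re.sub(rf"\b{re.escape(word)}\b\s*", "", text)

original = "the cat sat"
changed = remove_word(original, "cat")    # the word is present
unchanged = remove_word(original, "dog")  # the word is absent
```

In the second call the algorithm is still executed on the input, yet the output is equal to the input, which is the situation the 'act upon' wording is meant to cover.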
