Docs revamp (#176)
* new viz

* docs revamp

* update image

* update docs
nikokaoja authored Nov 13, 2023
1 parent 9340e33 commit eb02c34
Showing 15 changed files with 225 additions and 42 deletions.
12 changes: 4 additions & 8 deletions docs/api/rules/exporters.md
@@ -1,13 +1,9 @@
-::: cognite.neat.rules.exporter.rules2dms.DataModel
+::: cognite.neat.rules.exporter.rules2excel.ExcelExporter
 
-::: cognite.neat.rules.exporter.rules2graphql.GraphQLSchema
+::: cognite.neat.rules.exporter.rules2dms.DataModel
 
 ::: cognite.neat.rules.exporter.rules2ontology.Ontology
 
-::: cognite.neat.rules.exporter.rules2pydantic_models.rules_to_pydantic_models
-
-::: cognite.neat.rules.exporter.rules2rules.subset_rules
-
-::: cognite.neat.rules.exporter.rules2triples.get_instances_as_triples
+::: cognite.neat.rules.exporter.rules2graphql.GraphQLSchema
 
-::: cognite.neat.rules.exporter.core.rules2labels.get_labels
+::: cognite.neat.rules.exporter.rules2pydantic_models.rules_to_pydantic_models
3 changes: 3 additions & 0 deletions docs/componenets-lifecycle-policy.md
@@ -0,0 +1,3 @@
# NEAT Components Lifecycle

The NEAT Components Lifecycle policy is a set of guidelines that govern the lifecycle of application components, from testing to formal implementation in the underlying component library. The policy outlines the various stages that a component must go through before it can be considered for inclusion in the library, including design, development, testing, and documentation. The policy also includes guidelines for maintaining and updating components over time, ensuring that they remain compatible with the latest technologies and standards. By following this policy, NEAT ensures that its component library is of high quality and meets the needs of its users.
105 changes: 105 additions & 0 deletions docs/data-modeling-flow.md
@@ -0,0 +1,105 @@
# Data Modeling Flow

The data modeling flow, depicted in the figure below, consists of three main steps, with three corresponding subpackages in NEAT:

- [Importer](./api/rules/importers.md)
- [Validator](./api/rules/models.md)
- [Exporter](./api/rules/exporters.md)

The importer step is responsible for importing data models from various sources and converting them into `RawRules`, an internal unvalidated NEAT data model representation that can be used in the rest of the flow. The imported `RawRules` are run through the validator, which checks for errors and inconsistencies, ensuring that they meet the required standards and specifications of the `Rules` object, NEAT's internal validated data model representation.

Finally, the exporter step takes the validated `Rules` and exports them to a format that can be used by other systems or applications, such as Cognite Data Fusion.

All three subpackages are modular, with well-defined interfaces that allow NEAT to be extended further.

![NEAT High Level](./figs/data-modeling-flow.png)
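
A minimal sketch makes the three steps concrete. The importer class and the `to_raw_rules`/`to_rules` method names below are assumptions for illustration, not the verbatim NEAT API; consult the [importer reference](./api/rules/importers.md) for the actual entry points.

```python
# Sketch of the import -> validate -> export flow (names assumed).
from cognite.neat.rules.importer import ExcelImporter  # assumed, mirroring the documented importers

raw_rules = ExcelImporter("data-model.xlsx").to_raw_rules()  # step 1: import into unvalidated RawRules
rules = raw_rules.to_rules()  # step 2: validate into a Rules object

# Step 3: hand the validated Rules to any of the exporters described
# below (Excel, DMS, ontology, SHACL, GraphQL).
```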

## Importers

### Excel Importer

The Excel importer imports a `RawRules`/`Rules` object from its Excel/spreadsheet representation.
The Excel representation is the main `Rules` representation: it provides the easiest, most inclusive, and most collaborative way to create and edit data models.
More about `Rules`, and especially their Excel representation, can be found in [this dedicated section of the documentation](./rules.md).

### Google Sheet Importer

The Google Sheet importer is the same as the Excel importer, the main difference being that it fetches the sheet directly from Google Drive.

### YAML Importer

The YAML importer imports a `Rules` serialization (produced by `pydantic.model_dump`) into a `Rules` object.

### JSON Importer

The JSON importer, like the YAML importer, imports a `Rules` serialization (`model_dump`) into a `Rules` object.
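
As a sketch, such a round trip could look as follows, assuming `Rules` is a standard pydantic v2 model; the `Rules.model_validate` re-import call is an assumption, not the documented importer entry point.

```python
# Round-trip Rules through a YAML dump (assumes the dump is YAML-serializable).
import yaml

with open("rules.yaml", "w") as f:
    yaml.safe_dump(rules.model_dump(), f)  # serialize the validated Rules

with open("rules.yaml") as f:
    data = yaml.safe_load(f)  # plain dict, as the YAML importer would consume it

# restored = Rules.model_validate(data)  # assumed re-validation entry point
```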

### DMS Importer

The Data Model Storage (DMS) importer imports a DMS data model, composed of views, into the `RawRules`/`Rules` representation. [Consult the reference library documentation for the importer implementation](./api/rules/importers.md#cognite.neat.rules.importer.DMSImporter).

### Graph Importer

The Graph importer analyzes an RDF graph (the graph technology NEAT is built on) and infers a data model from it, resulting in an inferred `RawRules`/`Rules` object. [Consult the reference library documentation for the importer implementation](./api/rules/importers.md#cognite.neat.rules.importer.GraphImporter).

### OWL Ontology Importer

The OWL importer imports an OWL (Web Ontology Language) ontology into the `RawRules` representation. It is important to know that ontologies, due to their varying degrees of completeness and because they are often used for information modeling rather than data modeling, frequently cannot be resolved into validated `Rules`. It is strongly suggested to export the derived `RawRules` to the Excel representation for further editing toward a complete data model. [Consult the reference library documentation for the importer implementation](./api/rules/importers.md#cognite.neat.rules.importer.OWLImporter).
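
A sketch of that suggested workflow, using the documented `OWLImporter` and `ExcelExporter` classes with assumed method names:

```python
# Import an ontology, then write the (possibly incomplete) RawRules to a
# spreadsheet so a domain expert can complete the data model.
from cognite.neat.rules.importer import OWLImporter
from cognite.neat.rules.exporter.rules2excel import ExcelExporter

raw_rules = OWLImporter("ontology.ttl").to_raw_rules()  # assumed method name
ExcelExporter(raw_rules).export_to_file("ontology-rules.xlsx")  # assumed method name
```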

### XML Importer

The XML importer imports `RawRules`/`Rules` from an XML schema. [Consult the reference library documentation for the importer implementation](./api/rules/importers.md#cognite.neat.rules.importer.XMLImporter).

!!! note annotate "WIP"

This importer is work in progress!

### OpenAPI Importer

The OpenAPI importer imports a schema from an OpenAPI specification into a `RawRules`/`Rules` object. Currently, this importer is available only as the [OpenApiToRules](./steps-library.html#openapitorules) step in the [library of steps](./steps-library.md), so only users of the Docker container distribution of NEAT have access to it.

!!! note annotate "Alpha feature"

This importer is tagged as alpha, meaning it is not yet a stable component of the standard NEAT library. Read more about our policy in [Lifecycle of NEAT Components](./componenets-lifecycle-policy.md).

## Raw Rules

`Rules` are typically first imported in their raw representation, `RawRules`: essentially a `Rules` object that has not yet been checked against the suite of validators.

## Validator

NEAT contains a large number of validators, which ensure that validated `RawRules` can be used by downstream applications, such as configuring a data model in Cognite Data Fusion. When resolving `RawRules` to `Rules`, users can configure NEAT to generate a validation report that lists all the errors and warnings the validation step raised. By resolving the errors, users are able to create a valid `Rules` object. To help with this, NEAT contains a dedicated [Exceptions module](./api/exceptions.md) with human-readable definitions of NEAT exceptions, aimed at educating users. Each exception contains a resolvable identifier (essentially a URL) pointing to an additional description of when the exception occurs and how to fix it.
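
As a sketch, resolving `RawRules` with a report could look like this; the `return_report` keyword and the report shape are assumptions for illustration:

```python
# Resolve RawRules into Rules while collecting a validation report
# (signature assumed, not the verbatim NEAT API).
rules, report = raw_rules.to_rules(return_report=True)

if rules is None:  # validation failed; inspect the report
    for issue in report:
        # Each NEAT exception carries a resolvable identifier (URL)
        # describing when it occurs and how to fix it.
        print(issue)
```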

## Rules

Once `RawRules` pass through the validator module/step, they are turned into `Rules`, an internal validated NEAT data model representation.

## Exporters

Similar to the [Importers](./data-modeling-flow.md#importers), NEAT comprises a suite of `Rules` exporters to formats that can be used by downstream applications, such as Cognite Data Fusion or [Protégé](https://protege.stanford.edu/).

### Excel Exporter

The Excel exporter exports a `RawRules` or `Rules` object to its Excel representation. [Consult the reference library documentation for the exporter implementation](./api/rules/exporters.md#cognite.neat.rules.exporter.rules2excel.ExcelExporter).

### DMS Exporter

The DMS exporter exports `Rules` as a DMS data model comprised of a set of DMS containers and DMS views. The DMS exporter will be continuously developed and extended, as it represents the most critical NEAT exporter. [Consult the reference library documentation for the exporter implementation](./api/rules/exporters.md#cognite.neat.rules.exporter.rules2dms.DataModel).

!!! note annotate "Current Limitations"

Currently, `Rules` are exported as a set of views and containers with the same name and in the same space.
An ongoing "Phase 2" project, which ends in March '24, will enable advanced data modeling concepts,
such as mapping between views and containers across various spaces. Contact the NEAT developers for more details.
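
A sketch of deploying the exported data model to CDF; `from_rules` and `to_cdf` are assumed method names on the documented `DataModel` class, and client configuration is elided:

```python
# Export validated Rules as a DMS data model and deploy it to CDF
# (method names assumed for illustration).
from cognite.client import CogniteClient
from cognite.neat.rules.exporter.rules2dms import DataModel

client = CogniteClient()  # assumes CDF credentials are already configured
data_model = DataModel.from_rules(rules)  # views and containers in a single space
data_model.to_cdf(client)  # assumed deployment call
```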

### Ontology Exporter

The Ontology exporter exports a `Rules` object to an [OWL](https://www.w3.org/OWL/) ontology, which can be used, for example, in [Protégé](https://protege.stanford.edu/), or deployed and published on the web via [an RDF store](https://en.wikipedia.org/wiki/Triplestore). [Consult the reference library documentation for the exporter implementation](./api/rules/exporters.md#cognite.neat.rules.exporter.rules2ontology.Ontology.as_owl).
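
A sketch of the export; `as_owl` (and `as_shacl` for the next exporter) match the reference links in this section, while the `from_rules` constructor is an assumption:

```python
# Export Rules as an OWL ontology for use in Protégé or an RDF store.
from cognite.neat.rules.exporter.rules2ontology import Ontology

ontology = Ontology.from_rules(rules)  # assumed constructor
with open("data-model.ttl", "w") as f:
    f.write(ontology.as_owl())  # OWL serialization of the data model
```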

### SHACL Exporter

Similar to the Ontology exporter, the [SHACL](https://www.w3.org/TR/shacl/) (Shapes Constraint Language) exporter exports `Rules` to the underlying SHACL shapes, provided as triples, which can be used to validate RDF graphs. [Consult the reference library documentation for the exporter implementation](./api/rules/exporters.md#cognite.neat.rules.exporter.rules2ontology.Ontology.as_shacl).

### GraphQL Exporter

The GraphQL exporter exports `Rules` to a `GraphQL` schema, consisting of `types` and `fields`. [Consult the reference library documentation for the exporter implementation](./api/rules/exporters.md#cognite.neat.rules.exporter.rules2graphql.GraphQLSchema).
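
For illustration, generating the schema could look like this; the constructor and the `schema` attribute are assumptions:

```python
# Generate a GraphQL schema definition from validated Rules (names assumed).
from cognite.neat.rules.exporter.rules2graphql import GraphQLSchema

graphql = GraphQLSchema.from_rules(rules)
print(graphql.schema)  # GraphQL SDL: one type per class, one field per property
```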
32 changes: 25 additions & 7 deletions docs/feature-overview.md
@@ -1,23 +1,41 @@
 # Features Overview
 
 This section provides a high-level overview of the NEAT features. Where possible, links to more detailed documentation are provided.
 
-## Subject Matter Expert Centric
+![NEAT High Level](./figs/features.png)
+
+## Domain Expert Centric
 
+NEAT is a domain expert-centric application, meaning that it is designed to be used by subject matter experts rather than just developers. Overall, NEAT's domain expert-centric approach helps to bridge the gap between business needs and technical implementation, resulting in more effective and efficient data modeling.
+
 ## Low/No Code
 
+NEAT provides a low/no-code environment for creating data modeling and knowledge graph workflows. This approach enables subject matter experts to easily create and manage data models without needing extensive programming knowledge, resulting in more accurate and relevant models that better reflect the needs of the business. The underlying methods are wrapped in steps, which are exposed through the UI for building so-called knowledge onboarding workflows.
+
 ## Batteries Included
 
-## Data QC/QA
+NEAT provides a batteries-included environment for creating data modeling and knowledge graph workflows. This means that the application includes pre-built components that enable users to quickly and easily create desired solutions.
 
-## Meaningful Errors
+## Modular Design
 
-## Linear Learning Curve
+NEAT's well-defined interfaces and separation of concerns make it simple to add missing features. The application's modular architecture allows for easy integration of new components, and the well-defined interfaces ensure that new components can be added without disrupting the existing underlying library. This approach enables developers to quickly add missing features and extend the functionality of NEAT without needing to rewrite large portions of the codebase. Overall, NEAT's architecture provides a flexible and scalable platform for data modeling and knowledge graph workflows.
 
-## Anything to Graph
+## Data QC/QA
 
-## Data Lineage
+NEAT provides extensive QA/QC (Quality Assurance/Quality Control) features that safeguard Cognite Data Fusion from 'rubbish' data models and graphs. These features ensure that the data models and knowledge graphs created in NEAT are accurate and reliable before landing in Cognite Data Fusion, and that they meet the required standards and specifications. The QA/QC features include data validation, error checking, and data lineage tracking, which help to identify and correct errors in the data models.
 
-## Rapid Data Onboarding
+## Meaningful Errors
+
+NEAT provides meaningful errors that help users rapidly fix mistakes in their data models and knowledge graphs. The application's error messages are designed to be clear and concise, making it easy for users to understand what went wrong and how to fix it. This approach helps to reduce the time and effort required to correct errors, enabling users to quickly iterate and improve their models.
+
+## Linear Learning Curve
+
+NEAT enables a linear learning curve due to its thorough design. The application's user-friendly interface and pre-built components make it easy for users to quickly learn and understand the data modeling process.
 
 ## Run Anywhere
 
+NEAT has the ability to run anywhere, whether locally or on cloud infrastructure. This means that users can choose to run NEAT on their own local machine or on a cloud-based infrastructure, depending on their specific needs and requirements. This approach provides users with the flexibility to choose the deployment option that best suits their needs, while still being able to take advantage of NEAT's powerful data modeling and knowledge graph capabilities. Overall, NEAT's ability to run anywhere makes it a versatile and scalable solution for data modeling and knowledge graph workflows.
+
+## Rapid Knowledge Onboarding
+
+NEAT's features, including its low/no-code environment, batteries-included approach, modular design, extensive QA/QC features, meaningful errors, linear learning curve, and ability to run anywhere, enable rapid knowledge (i.e., data models and graphs) onboarding.
Binary file modified docs/figs/data-modeling-flow.png
Binary file added docs/figs/features.png
Binary file added docs/figs/graph-etl-flow.png
Binary file added docs/figs/logo.png
Binary file modified docs/figs/neat-high-level.png
Binary file modified docs/figs/neat-two_flows.png
54 changes: 54 additions & 0 deletions docs/graph-etl-flow.md
@@ -0,0 +1,54 @@
# Graph ETL Flow
The Graph ETL (Extract, Transform, Load) flow in NEAT is a systematic flow that involves extracting a graph from a source system or format, optionally transforming the graph by reducing its complexity or enriching it with additional information, and then loading the transformed or original graph into Cognite Data Fusion. This process facilitates efficient manipulation and loading of graph data, enabling users to form a knowledge base in Cognite Data Fusion. The ETL flow is divided into three main stages: the Extractor, the Transformer, and the Loader.

To use the Graph ETL flow, one needs compliant `Rules` extended with a graph ETL configuration expressed through a combination of `rdfpath`, `sparql`, and/or `rawlookup` directives (also known as transformation rules).

![NEAT High Level](./figs/graph-etl-flow.png)

## Extractor
All extractors share a single aim: extracting RDF triples from a source into the [NeatGraphStore](./api/graph/stores.md#neatgraph-store). In some cases, a conversion of the source data to triples is performed along the way.

### RDF Extractor
The RDF Extractor is a component of the Graph ETL flow in NEAT, responsible for extracting triples from RDF (Resource Description Framework) sources. RDF is a standard model for data interchange on the Web. The RDF Extractor reads RDF data, which can be in various formats such as XML, Turtle, or JSON-LD, and loads it into the `NeatGraphStore`, which is then used in the rest of the Graph ETL flow. The simplest use case is to extract triples from an RDF file dump.
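
Because `NeatGraphStore` builds on RDF tooling, the simplest extraction can be sketched with an rdflib-style parse; the import path and the `graph` attribute are assumptions for illustration:

```python
# Parse an RDF file dump into the store's underlying graph
# (store construction and attribute name assumed).
from cognite.neat.graph.stores import NeatGraphStore  # assumed import path

store = NeatGraphStore()  # assumed to default to the in-memory configuration
store.graph.parse("dump.ttl", format="turtle")  # rdflib-style parse of the dump
print(len(store.graph), "triples extracted")
```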

### Graph Capturing Sheet Extractor
This extractor extracts triples from a tailored spreadsheet template generated from the data model described in `Rules`. More details about this extractor can be found in [the reference documentation](./api/graph/extractors.md#cognite.neat.graph.extractors.graph_sheet_to_graph.extract_graph_from_sheet).

### DMS Extractor
This extractor extracts triples from nodes and edges stored in Cognite's Data Model Storage. In effect, it also converts the nodes and edges to a set of triples that can be loaded into the `NeatGraphStore` for downstream processing and transformation.

!!! note annotate "WIP"

This extractor is work in progress and is not generally available!


## Source Graph
The source graph is stored in the [NeatGraphStore](./api/graph/stores.md#neatgraph-store), which can be configured as (see the sketch after this list):

- internal `in-memory` or `on-disk` RDF triple store
- remote RDF triple store (requires connection to the remote [SPARQL endpoint](https://medium.com/virtuoso-blog/what-is-a-sparql-endpoint-and-why-is-it-important-b3c9e6a20a8b))
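
A sketch of these configurations; the keyword arguments are assumptions, not the exact `NeatGraphStore` signature:

```python
# Three ways a NeatGraphStore might be configured (argument names assumed).
from cognite.neat.graph.stores import NeatGraphStore

in_memory = NeatGraphStore()  # internal, in-memory triple store
on_disk = NeatGraphStore(storage_path="./source-graph")  # internal, persisted on disk
remote = NeatGraphStore(query_url="https://example.org/sparql")  # remote SPARQL endpoint
```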


## Transformer
NEAT contains its own transformation engine which, as mentioned earlier, is configured through `Rules` via [transformation rules](./rule-types.md). Predominantly, the transformation engine leverages graph traversal via `SPARQL` queries against the source graph. These queries are either stated explicitly through `sparql` directives or constructed implicitly using `rdfpath` ([see more details](./rule-types.md#rdfpath-rule-singleproperty)). The library module for this step in the graph ETL flow consists of a single method, which is described in more detail in [the reference library](./api/graph/transformers.md).
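
For illustration, an explicit `sparql` directive boils down to a CONSTRUCT-style query of this kind run against the source graph; the prefix and vocabulary below are invented for the example:

```python
# The kind of query the transformation engine runs against the source graph
# (prefix and terms invented for illustration).
query = """
PREFIX ex: <http://example.org/neat#>
CONSTRUCT { ?equipment ex:name ?label . }
WHERE { ?equipment a ex:Equipment ; ex:label ?label . }
"""
transformed_triples = store.graph.query(query)  # rdflib-style query on the source graph
```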


## Transformed Graph
The derived transformed graph is likewise stored in a `NeatGraphStore`.

## Loader
In contrast to the extractors, loaders resolve RDF triples stored in the `NeatGraphStore` into downstream Cognite Data Fusion representations.

### Asset Hierarchy Loader
The asset hierarchy loader turns RDF triples into a CDF asset hierarchy and relationships among the assets. This downstream representation is also known as classic CDF, as it was the first data model representation in Cognite Data Fusion.
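
A sketch of the end result using the Cognite SDK's classic asset resource; the hard-coded asset stands in for the real triple-to-asset mapping, which the loader derives from the `Rules`:

```python
# What the loader ultimately produces: classic-CDF assets with parent links.
from cognite.client import CogniteClient
from cognite.client.data_classes import Asset

client = CogniteClient()  # assumes CDF credentials are already configured
client.assets.create(
    [Asset(external_id="pump-42", name="Pump 42", parent_external_id="station-1")]
)
```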


### DMS Loader
The DMS loader turns RDF triples into a set of nodes and edges, which are stored in DMS.

### RDF Loader
Optionally, the RDF triples stored in the `NeatGraphStore` can be exported as an RDF dump for later use.

!!! note annotate "WIP"

This loader is work in progress and is not generally available!
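
Such a dump is a one-liner on the store's underlying graph (the same assumed `graph` attribute as in the extractor sketch above):

```python
# Serialize the store's content as a Turtle dump for later reuse.
store.graph.serialize(destination="graph-dump.ttl", format="turtle")
```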