This notebook outlines the kind of information that is available in kiara and that could be included in the codeview and/or the rendered notebook.

In [7]:
from kiara import Kiara

kiara = Kiara.instance()

# Module-related information

Kiara modules are Python classes that inherit from the [KiaraModule](https://dharpa.org/kiara/api_reference/kiara.module/#kiara.module.KiaraModule) base class.

A module can only be used once it has been instantiated with some configuration (which can be empty). This means there are two ways to look at a module:

- the module class
- an instance of the module class, that was created with module configuration

## The module class

Module class information is static. It describes the module's purpose (its documentation), who created it, where it lives, and so on.

Here is a list of attributes that can be queried for a module class:

### Module id

Technically this is not really an attribute, but it makes sense to mention it explicitly. This is the name that is used to refer to a module class. In most circumstances it's a namespaced string (using '.' as separator), like:

- 'network.graph.find_shortest_path'
- 'table.import.from_local_file'
- 'language.tokens.remove_stopwords'

There is no hard rule about how modules should be named. In general, it is advisable to make the purpose of the module (and perhaps also the data type it operates on) clear to users without them having to read further documentation.
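To illustrate the namespacing convention, here is a minimal sketch that splits a module id into its namespace and its short name. The helper `split_module_id` is hypothetical and not part of kiara; it only assumes that '.' is the separator, as described above.

```python
from typing import Tuple


def split_module_id(module_id: str) -> Tuple[str, str]:
    """Split a namespaced module id into (namespace, short name).

    Hypothetical helper for illustration only, not a kiara function.
    """
    # rpartition splits on the last '.', so everything before it is the namespace.
    namespace, _, name = module_id.rpartition(".")
    return namespace, name


for module_id in [
    "network.graph.find_shortest_path",
    "table.import.from_local_file",
    "language.tokens.remove_stopwords",
]:
    namespace, name = split_module_id(module_id)
    print(f"{namespace!r} -> {name!r}")
```

An id without any '.' yields an empty namespace, which is one reason why namespaced names are easier to group and filter.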

### Module type documentation

This covers two different things, even though the first is included in the second.

#### Module description

A single sentence that describes the module's purpose.

In [8]:
import_file_module = kiara.get_module_class("table.import.from_local_file")

import_file_module_type_metadta = import_file_module.get_type_metadata()
import_file_module_type_metadta.documentation.description

'Import a supported file and create a table from it.'

#### Module documentation

One or several paragraphs of markdown text that describe in detail (hopefully) what the module does, and how it does those things. The 'Module description' is included in this text, as the first paragraph.

In [9]:
import_file_module_type_metadta.documentation.full_doc

'Import a supported file and create a table from it.\n\nCurrently, only csv files are supported.'
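The relationship between the two values can be sketched with plain string handling. This assumes that paragraphs are separated by a blank line, as in the output above; how kiara itself derives the description is not shown in this notebook and may differ.

```python
# The full documentation string, copied from the output above.
full_doc = (
    "Import a supported file and create a table from it."
    "\n\n"
    "Currently, only csv files are supported."
)

# Everything before the first blank line is treated as the description,
# the remainder as the detailed part of the documentation.
description, _, details = full_doc.partition("\n\n")

print(description)
print(details)
```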

#### Origin

Mostly, this is authorship information, but it could also include things like copyright.

In [11]:
import_file_module_type_metadta.origin.authors

[AuthorModel(name='Markus Binsteiner', email='markus@frkl.io')]

#### Context

This describes the context the module lives in (like Python package, git repo, etc.), and some of the properties that help distinguish/filter it from the other modules that live in the same place.

##### Tags

Tags here are keywords that are associated with some of the module's properties.

In [12]:
module_type_context = import_file_module_type_metadta.context
module_type_context.tags

{'core', 'import', 'onboarding', 'pipeline'}

##### Labels

Labels are similar to tags, but take the form of key/value pairs. This makes it easier to query for certain aspects that modules share (like whether or not they are a pipeline-type module).

In [13]:
module_type_context.labels

{'package': 'kiara_modules.core', 'pipeline': 'yes'}
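As a rough sketch of why key/value labels are convenient to query, the snippet below filters a plain dictionary of labels. It does not call kiara at all: `module_labels` and `find_by_label` are hypothetical names, the first entry is copied from the output above, and the second entry (including the 'no' value) is made up for contrast.

```python
from typing import Dict, List

# Labels keyed by module id. The first entry is taken from the output above;
# the second one is a made-up, hypothetical entry for contrast.
module_labels: Dict[str, Dict[str, str]] = {
    "table.import.from_local_file": {
        "package": "kiara_modules.core",
        "pipeline": "yes",
    },
    "example.hypothetical_module": {
        "package": "kiara_modules.core",
        "pipeline": "no",
    },
}


def find_by_label(
    labels_by_module: Dict[str, Dict[str, str]], key: str, value: str
) -> List[str]:
    """Return the sorted ids of all modules whose label `key` equals `value`."""
    return sorted(
        module_id
        for module_id, labels in labels_by_module.items()
        if labels.get(key) == value
    )


pipeline_modules = find_by_label(module_labels, "pipeline", "yes")
print(pipeline_modules)
```

The same lookup with tags would only tell you that a keyword is present, not which value an aspect has.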

##### References

The 'references' value in the context is a free-form key/value dictionary (similar to labels) that contains links to URLs that are relevant to or important for the module.

In [16]:
module_type_context.references

{'source_repo': LinkModel(url=AnyUrl('https://github.com/DHARPA-Project/kiara_modules.core', scheme='https', host='github.com', tld='com', host_type='domain', path='/DHARPA-Project/kiara_modules.core'), desc='The module package git repository.'),
 'documentation': LinkModel(url=AnyUrl('https://dharpa.org/kiara_modules.core/', scheme='https', host='dharpa.org', tld='org', host_type='domain', path='/kiara_modules.core/'), desc='The url for the module package documentation.'),
 'module_doc': LinkModel(url=AnyUrl('https://dharpa.org/kiara_modules.core/pipelines_list.html#tableimportfrom_local_file', scheme='https', host='dharpa.org', tld='org', host_type='domain', path='/kiara_modules.core/pipelines_list.html', fragment='tableimportfrom_local_file'), desc='A link to the published, auto-generated module documentation.')}

#### Python class

This section contains information about the underlying Python class of the module. If the module is assembled from a pipeline description, *kiara* dynamically creates a Python class for it, so you may or may not find the Python class in the source files. In the following example you would not, because the module is a pipeline.

In [17]:
import_file_module_type_metadta.python_class

#### Pipeline config / Source code

This section differs depending on whether or not the module is a pipeline module.

##### Pipeline config

In the case of a pipeline, you'll see the configuration that was used to produce it.


In [19]:
import_file_module_type_metadta.pipeline_config.dict()

{'constants': {},
 'defaults': {},
 'steps': [{'module_type': 'import.local_file',
   'module_config': {},
   'step_id': 'read_file',
   'input_links': {}},
  {'module_type': 'table.create.from_file',
   'module_config': {},
   'step_id': 'create_table_from_file',
   'input_links': {'file': [{'step_id': 'read_file',
      'value_name': 'file',
      'sub_value': None}]}},
  {'module_type': 'value.save',
   'module_config': {'value_type': 'table'},
   'step_id': 'save_table',
   'input_links': {'value_item': [{'step_id': 'create_table_from_file',
      'value_name': 'table',
      'sub_value': None}]}}],
 'input_aliases': {'read_file__path': 'path',
  'read_file__aliases': 'file_aliases',
  'save_table__aliases': 'aliases'},
 'output_aliases': {'create_table_from_file__table': 'table',
  'save_table__value_id': 'value_id'},
 'documentation': 'Import a supported file and create a table from it.\n\nCurrently, only csv files are supported.\n',
 'context': {},
 'module_type_name': 'from_loc
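The 'input_links' entries in such a configuration describe which step consumes the output of which other step. The following sketch derives a possible step order from them, using only the standard library (`graphlib`, Python 3.9+). The `steps` list is copied from the output above; `step_dependencies` is a hypothetical helper, and kiara's actual execution logic is not shown in this notebook and may differ.

```python
from graphlib import TopologicalSorter
from typing import Any, Dict, List, Set

# The 'steps' part of the pipeline config, copied from the output above.
steps: List[Dict[str, Any]] = [
    {"module_type": "import.local_file", "module_config": {},
     "step_id": "read_file", "input_links": {}},
    {"module_type": "table.create.from_file", "module_config": {},
     "step_id": "create_table_from_file",
     "input_links": {"file": [
         {"step_id": "read_file", "value_name": "file", "sub_value": None}]}},
    {"module_type": "value.save", "module_config": {"value_type": "table"},
     "step_id": "save_table",
     "input_links": {"value_item": [
         {"step_id": "create_table_from_file", "value_name": "table",
          "sub_value": None}]}},
]


def step_dependencies(steps: List[Dict[str, Any]]) -> Dict[str, Set[str]]:
    """Map each step id to the ids of the steps it takes inputs from."""
    deps: Dict[str, Set[str]] = {}
    for step in steps:
        upstream: Set[str] = set()
        for links in step["input_links"].values():
            for link in links:
                upstream.add(link["step_id"])
        deps[step["step_id"]] = upstream
    return deps


deps = step_dependencies(steps)
# TopologicalSorter expects a mapping of node -> predecessors,
# and yields predecessors before the nodes that depend on them.
order = list(TopologicalSorter(deps).static_order())
print(order)
```

For this linear pipeline there is only one valid order; with branching pipelines, several orders can be valid.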

##### Source code

For Python modules, you can display the source code of the ``process`` method, which is where the actual processing happens.

In [21]:
sql_query_model = kiara.get_module_class("table.query.sql")
print(sql_query_model.get_type_metadata().process_src)

def process(self, inputs: ValueSet, outputs: ValueSet) -> None:

    import duckdb

    _relation_name: str = inputs.get_value_data("relation_name")
    if _relation_name.upper() in RESERVED_SQL_KEYWORDS:
        raise KiaraProcessingException(
            f"Invalid relation name '{_relation_name}': this is a reserved sql keyword, please select a different name."
        )

    _table = inputs.get_value_data("table")
    _query = inputs.get_value_data("query")

    relation: duckdb.DuckDBPyRelation = duckdb.arrow(_table)
    result: duckdb.DuckDBPyResult = relation.query(_relation_name, _query)

    outputs.set_value("query_result", result.arrow())

