[enhancement] Hybrid archetype instantiation for Machine Learning: combine instances from external resource with sampling of qualities already modeled in k.LAB #14

Open
diegomvd opened this issue Sep 27, 2022 · 0 comments

diegomvd commented Sep 27, 2022

Context

At the moment, learners can be defined in two ways:

  1. With an implicit archetype in a distributed context:
   learn geography:Elevation
	observing 
	@predictor earth:AtmosphericTemperature in Celsius
	@predictor geography:Slope in degree_angle
	using im.weka.bayesnet( learned.elevation )
  2. With an explicit archetype describing the learning instances:
// archetype definition from an external resource, most likely a shapefile containing geolocated data for the observables involved in the
// learning process.
 model each "elevation:URN" 
	as earth:Site with im:High geography:Elevation,
	elev as geography:Elevation,
	temp as earth:AtmosphericTemperature in Celsius,
	slope as geography:Slope in degree_angle;

// note the explicit reference to the archetype
learn geography:Elevation within earth:Site
	observing 
	@archetype earth:Site with im:High geography:Elevation
	@predictor earth:AtmosphericTemperature in Celsius
	@predictor geography:Slope in degree_angle
	using im.weka.bayesnet( learned.elevation )

Limitation

In some situations, part of the training data is imported as a shapefile resource containing all the instances, while the rest is already present and semantically modelled in k.LAB (e.g. atmospheric temperature). This is typical of spatially localized experimental measurements (e.g. tree height measurements) or event data (e.g. the start of a fire). The current workaround is to manually add the data already present in k.LAB to the shapefile by matching the coordinates of the shapefile instances. For example: look up the coordinates where fires started, pick the temperature values at those coordinates, merge the data, and use the result to build an explicit archetype. This is tedious, impractical, and error-prone, and it limits data interoperability.

Proposed feature

When the archetype is instantiated from an imported shapefile resource, allow observing other qualities already modeled within k.LAB, and automate the selection of these observables' values at the coordinates of the shapefile instances, so that an archetype combining both data sources is built automatically.
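
To make the intended behaviour concrete, here is a rough sketch in plain Python (not k.LAB or k.IM code) of the lookup the feature would automate: for each point instance from the external resource, sample every already-modeled quality at the instance's coordinates and attach the value to the instance. The `Grid` class, the `build_archetype` function, the field names, and the data are all hypothetical and exist only for illustration. A regular north-up raster with nearest-cell sampling is assumed.

```python
from dataclasses import dataclass


@dataclass
class Grid:
    """Hypothetical regular raster: values[row][col], with row 0 at y_max."""
    x_min: float
    y_max: float
    cell_size: float
    values: list  # list of rows, each a list of cell values

    def sample(self, x, y):
        """Return the value of the cell containing (x, y), or None if outside."""
        col = int((x - self.x_min) // self.cell_size)
        row = int((self.y_max - y) // self.cell_size)
        if 0 <= row < len(self.values) and 0 <= col < len(self.values[row]):
            return self.values[row][col]
        return None


def build_archetype(instances, layers):
    """Attach to each point instance the value of every layer at its coordinates."""
    rows = []
    for inst in instances:
        row = dict(inst)  # keep the attributes that came with the resource
        for name, grid in layers.items():
            row[name] = grid.sample(inst["x"], inst["y"])
        rows.append(row)
    return rows


# Made-up data: a 2x2 temperature raster and two elevation measurements.
temperature = Grid(x_min=0.0, y_max=2.0, cell_size=1.0,
                   values=[[10.0, 11.0], [12.0, 13.0]])
instances = [
    {"x": 0.5, "y": 1.5, "elev": 820.0},
    {"x": 1.5, "y": 0.5, "elev": 410.0},
]
archetype = build_archetype(instances, {"temp": temperature})
```

Each row of `archetype` then carries both the attribute from the resource (`elev`) and the sampled quality (`temp`), which is the merged table that currently has to be assembled by hand.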

Possible syntax, with a minimal example:

// assuming elevation is not already modeled in k.LAB
model each "elevation.data:URN"
  as earth:Site with occurrence of im:High geography:Elevation,
  elev as geography:Elevation, // get elevation data from the resource
  earth:AtmosphericTemperature in Celsius
  observing 
  earth:AtmosphericTemperature in Celsius
  using gis.points.extract(select = [expression to select only the coordinates of the elevation instances ]);