# cjio API design


In [1]:
from datetime import datetime
i = datetime.now()
print("Version {}-{}-{}".format(i.year, i.month, i.day))


Version 2019-5-20


## Why?

Let's take a look at the problems that I'm trying to solve with the API:

1. *It is cumbersome to work with 3D city models in a Python script or program, partially because the data model is complex and libraries for parsing it don't really exist.* Why Python? Because in my opinion 3D city models should be as easy to work with as any other data model or format in the analysis pipeline. Python is one of the most common scripting languages for data analysis, including GIS. It is also the language that we teach to our students in the Geomatics MSc.

  + Luckily we have CityJSON already, which simplifies the data model part considerably. Also, cjio works well on the command line. But in my experience, and according to a few other people, it would be handy to have a Python library that makes it easy to operate on a 3D city model. I'm heavily influenced by the *tidyverse* ecosystem in R, and also by the *pandas* and *scikit-learn* libraries in Python. Thus in my head, cjio would make it as easy to work with the data as these packages do, and would also integrate with them.
  
## What has been done already?

When I started, cjio already had a well-developed CLI. The idea is to expose the same functionality through an API, and also to provide an object-based interface to CityJSON.

There is also [citygson](https://github.com/citygml4j/citygson), a Gson-based library for parsing and serializing CityJSON. This is a Java library.

## The API

I approach the API design from the workflow that I aim for. First comes **data preparation**, which entails reading a city model into Python in such a way that it becomes easy to work with each CityObject individually. Then comes **feature generation**, which means computing measures and statistics from the CityObjects that can be fed into an analysis process.

### 1. Data preparation

#### CityObject types

A cityobject with children.


In [2]:
co_1 = {
"CityObjects": {
  "id-1": {
    "type": "Building",
    "geographicalExtent": [ 84710.1, 446846.0, -5.3, 84757.1, 446944.0, 40.9 ], 
    "attributes": { 
      "measuredHeight": 22.3,
      "roofType": "gable",
      "owner": "Elvis Presley"
    },
    "children": ["id-2"],
    "geometry": [{...}]
  },
  "id-2": {
    "type": "BuildingPart", 
    "parents": ["id-1"],
    "children": ["id-3"]
  },
  "id-3": {
    "type": "BuildingInstallation", 
    "parents": ["id-2"]
  },
  "id-4": {
    "type": "LandUse"
  }
}
}

Use a single getter function and pass the type as an argument. It should get both 1st-level and 2nd-level CityObjects. But in the case of 2nd-level objects, how do we keep the reference to the parents?


In [None]:
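class CityModel:
    def __init__(self, cityobjects):
        # assumption: a dictionary of {id: CityObject}
        self.cityobjects = cityobjects

    def get_cityobjects(self, type=None):
        """Return a generator over the CityObjects of the given type. Type can be 1st-level or 2nd-level CityObject. If type is None, all CityObjects are returned."""
        for co in self.cityobjects.values():
            # case-insensitive, so that "building" matches "Building"
            if type is None or co.type.lower() == type.lower():
                yield co

class CityObject:
    def __init__(self, type, children=None, parents=None):
        self.type = type
        # assumption: lists of CityObjects, empty if there are none
        self.children = children or []
        self.parents = parents or []

    def get_children(self):
        """Return the list of children, which is empty if the CityObject has none."""
        return list(self.children)

    def get_parents(self):
        """Return the list of parents, which is empty if the CityObject has none."""
        return list(self.parents)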

cm = cjio.load("some_model.json")
buildings = cm.get_cityobjects("building")
for building in buildings:
    children = building.get_children()

buildingparts = cm.get_cityobjects("buildingpart")
for part in buildingparts:
    part.get_children()
    part.get_parents()
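
One possible answer to the parent-reference question, as a minimal sketch on a plain dictionary: CityJSON already stores `parents` and `children` as lists of ids, so a lookup by id may be enough. The helper names `get_parents` and `get_ancestors` and the trimmed-down sample are hypothetical, they are not part of cjio.

```python
# Hypothetical sample, trimmed from the "CityObjects" example above.
cityobjects = {
    "id-1": {"type": "Building", "children": ["id-2"]},
    "id-2": {"type": "BuildingPart", "parents": ["id-1"], "children": ["id-3"]},
    "id-3": {"type": "BuildingInstallation", "parents": ["id-2"]},
    "id-4": {"type": "LandUse"},
}


def get_parents(cityobjects, co_id):
    """Return the ids of the direct parents (an empty list if there are none)."""
    return list(cityobjects[co_id].get("parents", []))


def get_ancestors(cityobjects, co_id):
    """Yield the ids of all ancestors, nearest first."""
    for parent_id in get_parents(cityobjects, co_id):
        yield parent_id
        yield from get_ancestors(cityobjects, parent_id)


print(list(get_ancestors(cityobjects, "id-3")))  # ['id-2', 'id-1']
```

Working with ids instead of object references keeps the 2nd-level objects cheap to copy, at the cost of one dictionary lookup per step.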


Or, for instance, something like this should get the roof geometry of a building, provided that the surfaces have semantics.


In [None]:
cm = cjio.load("some_model.json")
building_1 = cm.building.get(1)
roof_geom = building_1.roofsurface.geometry
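
A minimal sketch of how the roof surfaces could be looked up on the raw CityJSON geometry. It assumes that the `semantics` member follows the CityJSON layout: `surfaces` holds the semantic objects and `values` holds one index per surface, nested per shell for a `Solid`. The function name `get_surface_boundaries` and the sample geometry are hypothetical, not part of cjio.

```python
def get_surface_boundaries(geom, surface_type):
    """Return the boundaries of the surfaces that have the given semantic type."""
    semantics = geom.get("semantics")
    if not semantics:
        return []
    # indices of the semantic objects of the requested type (case-insensitive)
    wanted = {
        i for i, sem in enumerate(semantics["surfaces"])
        if sem["type"].lower() == surface_type.lower()
    }
    if geom["type"] in ("MultiSurface", "CompositeSurface"):
        shells = [(geom["boundaries"], semantics["values"])]
    elif geom["type"] == "Solid":
        shells = list(zip(geom["boundaries"], semantics["values"]))
    else:
        raise ValueError("Unsupported geometry type: {}".format(geom["type"]))
    return [
        surface
        for boundaries, values in shells
        for surface, value in zip(boundaries, values or [])
        if value in wanted  # surfaces without semantics have a null (None) value
    ]


# Hypothetical LoD2 geometry with three surfaces
geom = {
    "type": "MultiSurface",
    "lod": 2,
    "boundaries": [[[0, 1, 2, 3]], [[4, 5, 6, 7]], [[0, 3, 7, 4]]],
    "semantics": {
        "surfaces": [{"type": "GroundSurface"}, {"type": "RoofSurface"}, {"type": "WallSurface"}],
        "values": [0, 1, 2],
    },
}
roofs = get_surface_boundaries(geom, "roofsurface")
```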


Get footprints, walls and roofs from both LoD1 and LoD2.

How to work with a 3D model and its point cloud?

#### Working with semantics


In [None]:
class Geometry(object):
    def __init__(self, geom, vertices):
        # `geom` is one item of co['geometry'], which is a list in CityJSON,
        # and `vertices` is the vertex list of the city model
        self.lod = geom['lod']
        self.type = geom['type']
        self.boundaries = Boundary(geom, vertices)
        # also need to handle surface semantics here somewhere
        self.semantics = self._get_semantics(geom)

    def _get_semantics(self, geom):
        """Return the set of semantic surface types that the geometry has"""
        surfaces = geom.get('semantics', {}).get('surfaces', [])
        return {sem['type'] for sem in surfaces}

class SemanticSurface(object):
    def __init__(self, type="RoofSurface", boundaries=None):
        self.type = type
        self.children = []   # indices (int) of the child semantic surfaces
        self.parent = None   # index (int) of the parent semantic surface
        self.attributes = dict()
        self.boundaries = boundaries

    def _get_geometry(self):
        """Get the geometry of the surface"""
        # this might duplicate the geometry, because the full geometry already exists, dereferenced, in the parent Geometry object
        # TODO: extract the related parts of the CityObject geometry
        raise NotImplementedError

class Boundary(object):
    def __init__(self, geom, vertices):
        self.vertices = vertices  # a reference only, no need to duplicate these
        self.geom = self._get_geometry(geom['boundaries'])

    def _get_geometry(self, boundary):
        """Dereference the vertex indices and return the geometry simple feature style"""
        if isinstance(boundary, int):
            return tuple(self.vertices[boundary])
        return [self._get_geometry(part) for part in boundary]

cm = cjio.load("some_model.json")
for building in cm.get_cityobjects("building"):
    # so, what exactly does this geometry object contain? For now, we only return the geometry from the JSON as it is, the same as cm['CityObjects']['id-1']['geometry']. Later we can think about converting the JSON to something else.
    geometry = building.get_geometry()
    isinstance(geometry, Geometry)
    geometry.lod
    geometry.type
    geometry.boundaries # I think we should return the boundaries simple feature style, vertices included. It makes it much easier to operate on them.
    # or just dump all the vertices of the geometry as [(x,y,z),(x,y,z),...]
    vertices = building.get_vertices()
    
    roofs = building.get_surfaces('roofsurface')
    walls = building.get_surfaces('wallsurface')
    grounds = building.get_surfaces('groundsurface')
    for roof in roofs:
        geometry = roof.geometry
        geometry.boundaries
        roof.type
        roof.attributes
        children = roof.get_children()
        
# Alternatively, the semantic surfaces could be accessed through the Geometry object
cm = cjio.load("some_model.json")
for building in cm.get_cityobjects("building"):
    geometry = building.get_geometry()
    geometry.semantics
    roofs = geometry.get_surface('roofsurface')
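
The "dump all the vertices" option mentioned in the comments could look roughly like the sketch below. It assumes a raw CityJSON geometry whose `boundaries` are nested lists of indices into the vertex list of the city model. `iter_vertex_indices`, `get_vertices` and the sample data are hypothetical names, not an existing cjio API.

```python
def iter_vertex_indices(boundaries):
    """Recursively yield the vertex indices from arbitrarily nested boundaries."""
    for item in boundaries:
        if isinstance(item, int):
            yield item
        else:
            yield from iter_vertex_indices(item)


def get_vertices(geom, vertices):
    """Return the vertices of the geometry as [(x, y, z), ...], each vertex once."""
    # dict.fromkeys removes the duplicates and keeps the first-use order
    unique_indices = dict.fromkeys(iter_vertex_indices(geom["boundaries"]))
    return [tuple(vertices[i]) for i in unique_indices]


# Hypothetical data: two triangles that share an edge
vertices = [[0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0]]
geom = {"type": "MultiSurface", "lod": 1, "boundaries": [[[0, 1, 2]], [[0, 2, 3]]]}
points = get_vertices(geom, vertices)
```

Because the recursion does not depend on the nesting depth, the same helper works for `MultiSurface` and `Solid` boundaries alike.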



### 2. Feature generation

How do we operate on 3D geometry? Do we cast it to a type from a library that supports 3D geometries? Or do we just provide getters for the vertices?

Python libraries with 3D geometries:

+ [open3d](http://www.open3d.org/docs/python_api/open3d.geometry.Geometry3D.html#open3d.geometry.Geometry3D)

#### Compute the volume of a building.


In [None]:
class Geometry:
    def __init__(self, geom, vertices):
        # `geom` is one item of co['geometry'], which is a list in CityJSON,
        # and `vertices` is the vertex list of the city model
        self.lod = geom['lod']
        self.type = geom['type']
        self.boundaries = self._get_boundary(geom['boundaries'], vertices)

    def _get_boundary(self, boundary, vertices):
        """Replace the vertex indices with the vertex coordinates, and return the geometry simple feature style"""
        if isinstance(boundary, int):
            return tuple(vertices[boundary])
        return [self._get_boundary(part, vertices) for part in boundary]

def compute_volume(geometry):
    if not geometry.boundaries:
        return 0
    if geometry.type != 'Solid':
        raise TypeError("Cannot compute the volume of {} geometry".format(geometry.type))
    if geometry.lod < 2:
        pass  # TODO: figure out what surface is what
    else:
        pass  # TODO: use the surface semantics
    raise NotImplementedError("TODO: compute the volume of the Solid")

def compute_total_volume(cityobject):
    # we need to check if the parent has geometry, but also if the child has geometry, because it is not prescribed how this should be
    # assumption: get_geometry() returns None if the CityObject has no geometry
    geometry = cityobject.get_geometry()
    volume = compute_volume(geometry) if geometry is not None else 0
    # we do this recursively in order to visit the children of children too, because it is
    # not defined how many levels deep we need to go
    for child in cityobject.get_children():
        volume += compute_total_volume(child)
    return volume

cm = cjio.load("some_model.json")
for building in cm.get_cityobjects("building"):
    volume = compute_total_volume(building)
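
One way to fill in the "compute the volume" step, sketched under the assumption that the Solid is valid: closed shells, planar surfaces, an exterior shell with outward-pointing normals, and rings that are already dereferenced to coordinates. It sums the signed volumes of the tetrahedra between the origin and each fan-triangulated ring. This is a standard technique that I swap in here for illustration, the design above does not prescribe it. `ring_signed_volume`, `solid_volume` and the unit cube are hypothetical names and data.

```python
def ring_signed_volume(ring):
    """Six times the signed volume between the origin and a planar ring."""
    x0, y0, z0 = ring[0]
    total = 0.0
    # fan triangulation: (ring[0], ring[i], ring[i + 1])
    for (x1, y1, z1), (x2, y2, z2) in zip(ring[1:], ring[2:]):
        # scalar triple product p0 . (p1 x p2)
        total += (x0 * (y1 * z2 - z1 * y2)
                  - y0 * (x1 * z2 - z1 * x2)
                  + z0 * (x1 * y2 - y1 * x2))
    return total


def solid_volume(solid):
    """Volume of a dereferenced Solid, indexed as solid[shell][surface][ring]."""
    six_volume = 0.0
    for shell in solid:
        for surface in shell:
            for ring in surface:
                # interior rings and interior shells have the opposite
                # orientation, so their contribution is subtracted automatically
                six_volume += ring_signed_volume(ring)
    return abs(six_volume) / 6.0


# Hypothetical unit cube: vertex coordinates and faces with outward normals
v = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0),
     (0, 0, 1), (1, 0, 1), (1, 1, 1), (0, 1, 1)]
faces = [[0, 3, 2, 1], [4, 5, 6, 7], [0, 1, 5, 4],
         [1, 2, 6, 5], [2, 3, 7, 6], [3, 0, 4, 7]]
cube = [[[[v[i] for i in face]] for face in faces]]
volume = solid_volume(cube)
```

With this approach the LoD and the surface semantics are not needed for the volume itself, only a consistent surface orientation.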


Get shape descriptors from the footprints

Compute roof overhang as a distance between footprint and roofprint

Compute roof levels and roof types

(compare model to point cloud)
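
As an illustration of the first item above, a few common shape descriptors can be computed from a footprint with the standard library alone. This is a minimal sketch, assuming that the footprint is a single 2D ring without holes and without a repeated closing vertex. The function name `footprint_descriptors` and the choice of descriptors (area, perimeter, Polsby-Popper compactness) are my assumptions, not a decided part of the API.

```python
import math


def footprint_descriptors(ring):
    """Return the area, perimeter and compactness of a 2D ring [(x, y), ...]."""
    n = len(ring)
    double_area = 0.0
    perimeter = 0.0
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]  # wrap around to close the ring
        double_area += x1 * y2 - x2 * y1  # shoelace formula
        perimeter += math.hypot(x2 - x1, y2 - y1)
    area = abs(double_area) / 2.0
    return {
        "area": area,
        "perimeter": perimeter,
        # Polsby-Popper compactness: 1 for a circle, smaller for elongated shapes
        "compactness": 4 * math.pi * area / perimeter ** 2,
    }


# Hypothetical 4 m x 2 m rectangular footprint
descriptors = footprint_descriptors([(0, 0), (4, 0), (4, 2), (0, 2)])
```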

**!!! the most important software feature here is to allow the users to easily integrate their own cityobject/geometry processing functions with cjio !!!**
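
What such an integration point could look like, as a rough sketch on plain dictionaries: the user passes their own functions and gets one record of features per CityObject. `apply_to_cityobjects` and the two example functions are hypothetical, they only illustrate the idea and do not exist in cjio.

```python
def apply_to_cityobjects(cityobjects, **functions):
    """Call every user function on every CityObject.

    Returns {CityObject id: {feature name: value}}.
    """
    return {
        co_id: {name: func(co) for name, func in functions.items()}
        for co_id, co in cityobjects.items()
    }


# Two user-defined processing functions
def count_children(co):
    return len(co.get("children", []))


def measured_height(co):
    return co.get("attributes", {}).get("measuredHeight")


# Hypothetical sample, trimmed from the "CityObjects" example above
cityobjects = {
    "id-1": {"type": "Building", "attributes": {"measuredHeight": 22.3}, "children": ["id-2"]},
    "id-4": {"type": "LandUse"},
}
features = apply_to_cityobjects(
    cityobjects, n_children=count_children, height=measured_height
)
```

Keeping the user functions as plain callables that take one CityObject means that they can be written and tested without knowing anything about the rest of the library.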

### 3. Export

Save CityObject attributes in a tabular format (e.g. TSV)

Save CityObject attributes in a pandas DataFrame
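
A minimal sketch of the tabular export with the standard `csv` module, assuming that the CityObjects are available as a plain dictionary. The function names `attributes_to_rows` and `write_tsv` and the sample data are hypothetical.

```python
import csv
import io


def attributes_to_rows(cityobjects):
    """Flatten the CityObjects into one dictionary per object."""
    rows = []
    for co_id, co in cityobjects.items():
        # note: an attribute called "id" or "type" would overwrite these columns
        row = {"id": co_id, "type": co["type"]}
        row.update(co.get("attributes", {}))
        rows.append(row)
    return rows


def write_tsv(rows, fileobj):
    """Write the rows as tab-separated values; missing attributes stay empty."""
    # union of all keys, in first-seen order
    fieldnames = list(dict.fromkeys(key for row in rows for key in row))
    writer = csv.DictWriter(
        fileobj, fieldnames=fieldnames, delimiter="\t", restval="", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical sample, trimmed from the "CityObjects" example above
cityobjects = {
    "id-1": {"type": "Building", "attributes": {"measuredHeight": 22.3, "roofType": "gable"}},
    "id-4": {"type": "LandUse"},
}
rows = attributes_to_rows(cityobjects)
buffer = io.StringIO()
write_tsv(rows, buffer)
```

The same list of dictionaries can also be passed to `pandas.DataFrame(rows)`, which would cover the second export target.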

### 4. ML

Use the tabular output as input for scikit-learn (or any library). One can use `feather` to transport the objects to R.