Hierarchical data #410

## Issue

Provide an easy way to set or get "hierarchical data", i.e. data that child assets inherit from their parents. There are two approaches to this problem:

1. **Set downstream:** Whenever data is set, make sure it is also set and updated downstream, so that child assets inherit the changes made on their parents. This makes writes slightly heavier on those occasions but queries far cheaper, as each asset fully encapsulates its own data. In MongoDB design terms this is called ["embedding" the data](https://docs.mongodb.com/manual/tutorial/model-embedded-one-to-many-relationships-between-documents/) as opposed to [linking by reference](https://docs.mongodb.com/manual/tutorial/model-referenced-one-to-many-relationships-between-documents/).

2. **Query upstream:** Whenever the asset lacks a specific piece of data, query upstream for it. This introduces additional queries, which could turn out slow when resolving data for a lot of assets, and makes writing optimized queries slightly more involved. However, that could be optimized.

#### Example for point 1. Setting values downstream

Setting data downstream is trivial. 
```python
from avalon import io

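def set_data_hierarchical(asset_id, data, traversal="data.visualParent"):
    """Update the asset's .data including downstream children

    Args:
        asset_id (io.ObjectId): Database document id.
        data (dict): The key, value pairs to update.
        traversal (str): The database data key that children use
            to reference their parent, used to traverse
            downstream. Defaults to "data.visualParent".

    Returns:
        None

    """

    assert isinstance(data, dict), "data must be dict"

    document = io.find_one({"_id": asset_id, "type": "asset"})
    assert document is not None, "No asset found with id: %s" % asset_id

    # Update the asset's own data and save it back into the database
    document.setdefault("data", {}).update(data)
    io.save(document)

    # Traverse downstream: children reference this asset through `traversal`.
    # Materialize the cursor so it is not kept open during the recursion.
    for child in list(io.find({traversal: asset_id, "type": "asset"})):
        set_data_hierarchical(child["_id"],
                              data=data,
                              traversal=traversal)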
```

However, it's important to know when not to propagate a value, e.g. a child could already have an override of its own that you'd want to keep. I believe this is best addressed with good UI design that makes it easy for users to decide and tweak which children get updated too, as also [noted here](https://github.com/getavalon/core/issues/406#issuecomment-518270054).
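
A minimal sketch of how existing child overrides could be preserved, assuming a hypothetical `overrides` list in each asset's data that records which keys were set explicitly on that asset. Neither the `overrides` key nor the in-memory `ASSETS` dict is part of Avalon; the dict only stands in for the database so the example runs on its own.

```python
# In-memory stand-in for the asset collection, keyed by _id (illustrative only).
ASSETS = {
    "shots": {"_id": "shots", "data": {"visualParent": None, "fps": 25}},
    "sh010": {"_id": "sh010", "data": {"visualParent": "shots", "fps": 25}},
    "sh020": {"_id": "sh020", "data": {"visualParent": "shots", "fps": 24,
                                       "overrides": ["fps"]}},
}


def set_data_downstream(asset_id, data, is_root=True):
    """Apply `data` to an asset and its children, keeping child overrides."""
    document = ASSETS[asset_id]
    overrides = document["data"].get("overrides", [])

    if is_root:
        # The asset the user edited directly always receives every key.
        applied = dict(data)
    else:
        # Children skip the keys they override themselves (assumed convention).
        applied = {k: v for k, v in data.items() if k not in overrides}

    document["data"].update(applied)
    if not applied:
        return

    # Only the keys that were applied here continue further downstream.
    for child in ASSETS.values():
        if child["data"].get("visualParent") == asset_id:
            set_data_downstream(child["_id"], applied, is_root=False)


set_data_downstream("shots", {"fps": 30})
```

In this sketch `sh010` picks up the new FPS while `sh020` keeps its own value, and anything below an overriding asset is left untouched for that key. Whether that is the desired behavior is exactly the kind of decision the UI would need to expose.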


#### Example for point 2. Querying data upstream and optimizing it.

A simple example of a hierarchical upstream query can be found [here](https://github.com/Colorbleed/colorbleed-config/blob/master/colorbleed/lib.py#L274), where the FPS is queried on the asset and, when not found, queried from the project.
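
The same idea could be extended to walk the whole parent chain before falling back to the project. The following is only a sketch under assumptions: `ASSETS` and `PROJECT` are in-memory stand-ins for the database, and `get_data_upstream` is a hypothetical helper name, not an existing Avalon function.

```python
# In-memory stand-ins for the database documents (illustrative only).
PROJECT = {"type": "project", "data": {"fps": 25}}
ASSETS = {
    "seq01": {"data": {"visualParent": None, "fps": 24}},
    "sh010": {"data": {"visualParent": "seq01"}},
    "sh020": {"data": {"visualParent": "seq01", "fps": 30}},
    "char": {"data": {"visualParent": None}},
}


def get_data_upstream(asset_id, key, default=None):
    """Return `key` from the asset, its nearest ancestor or the project."""
    visited = set()  # guards against accidental cycles in the parent links
    while asset_id is not None and asset_id not in visited:
        visited.add(asset_id)
        data = ASSETS[asset_id]["data"]
        if key in data:
            return data[key]
        # Not set here, continue with the parent asset (if any)
        asset_id = data.get("visualParent")

    # No asset in the chain defines the key, fall back to the project
    return PROJECT["data"].get(key, default)


fps_inherited = get_data_upstream("sh010", "fps")  # taken from seq01
fps_own = get_data_upstream("sh020", "fps")        # set on the asset itself
fps_project = get_data_upstream("char", "fps")     # taken from the project
```

Against a real database each step of the loop is one query per asset, which is where the cost of this approach comes from.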

A potential optimization when querying hundreds of assets in one go (including upstream traversal) is to keep a temporary cache of the data of the assets and their parent assets.

```python
# Pseudocode: `io.cached_queries` is not implemented currently
from avalon import io

with io.cached_queries():
    # Here whenever a query is done through `io.find_one` or 
    # `io.find` the result is cached based on *args+**kwargs
    # passed to the functions. If there's a cache hit the
    # previously retrieved value would get returned as opposed
    # to the database getting queried.
    
    # First query in this context, so no cache.
    a = io.find_one({"id": custom_id_a})
    
    # Second query with the same values will hit the stored
    # cache value and will not result in a hit to the database
    a = io.find_one({"id": custom_id_a})   # no database query
    
    b = io.find_one({"id": custom_id_b})   # database query
    
    # As such whenever you query upstream for tons of assets
    # any previous hits would not be database queries.
```
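
One way such a context manager could work is sketched below. It is not an Avalon implementation: `FakeIO` is a made-up stand-in for the `io` module that counts how often the "database" is actually hit, and `cached_queries` takes the object to patch as an argument, which differs from the pseudocode above. Queries are assumed to be plain dicts.

```python
import contextlib
import json


class FakeIO(object):
    """Stand-in for a database module that counts real lookups."""

    def __init__(self, documents):
        self.documents = documents
        self.query_count = 0

    def find_one(self, query):
        self.query_count += 1
        for doc in self.documents:
            if all(doc.get(k) == v for k, v in query.items()):
                return doc
        return None


@contextlib.contextmanager
def cached_queries(io):
    """Cache `io.find_one` results for the duration of the with-block."""
    original = io.find_one
    cache = {}

    def cached_find_one(query):
        # Dicts are unhashable, so serialize the query into a stable key
        key = json.dumps(query, sort_keys=True, default=str)
        if key not in cache:
            cache[key] = original(query)
        return cache[key]

    io.find_one = cached_find_one
    try:
        yield cache
    finally:
        # Always restore the uncached function, even on errors
        io.find_one = original


io = FakeIO([{"id": "a", "name": "hero"}, {"id": "b", "name": "villain"}])
with cached_queries(io):
    io.find_one({"id": "a"})  # database query
    io.find_one({"id": "a"})  # served from the cache
    io.find_one({"id": "b"})  # database query
```

Note that the cache hands out the same document object on every hit, and that it can go stale if the database changes while the block is active, so this fits read-only batch queries best.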

Examples of caching function return values with decorators can be found [here](https://stackoverflow.com/questions/815110/is-there-a-decorator-to-simply-cache-function-return-values) and are also [described here](https://dbader.org/blog/python-memoization).
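
As a small illustration of the decorator approach, the standard library's `functools.lru_cache` (Python 3) can memoize an upstream lookup so that shared parents are only resolved once. The `PARENTS` and `FPS` dicts and the `resolve_fps` function are hypothetical stand-ins for database queries, and `LOOKUPS` only exists to show which calls were not served from the cache.

```python
import functools

# Illustrative stand-ins for the database
PARENTS = {"sh010": "seq01", "sh020": "seq01", "seq01": None}
FPS = {"seq01": 24}
LOOKUPS = []  # records every call that was not answered by the cache


@functools.lru_cache(maxsize=None)
def resolve_fps(asset_id):
    """Return the FPS of the asset or of its nearest ancestor."""
    LOOKUPS.append(asset_id)
    if asset_id in FPS:
        return FPS[asset_id]
    parent = PARENTS[asset_id]
    # The recursive call goes through the cache as well
    return resolve_fps(parent) if parent is not None else None


first = resolve_fps("sh010")   # looks up sh010, then seq01
second = resolve_fps("sh020")  # looks up sh020, seq01 comes from the cache
```

Unlike a context manager, this cache lives as long as the process does, so it would need an explicit `resolve_fps.cache_clear()` whenever the underlying data changes.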

---

Reference:
- This discussion started in #406 and got some input on choosing between approaches 1 and 2 [here](https://github.com/getavalon/core/issues/406#issuecomment-518261311).
