# Query Rubicon Logs with RubiconJSON

To query Rubicon logs in a JSONPath-like manner, we can use the `RubiconJSON` class. `RubiconJSON` takes top-level `rubicon_ml` objects, `projects`, and/or `experiments` and builds a JSON-style `dict` from them. Calling `search` on a `RubiconJSON` object runs a JSONPath-like query against that `dict`, relying on the Python `jsonpath_ng` implementation (https://github.com/h2non/jsonpath-ng) under the hood.


### Let's create some example `rubicon_ml` objects for demonstration

In [1]:
import random

random.seed(24)

In [2]:
import pandas as pd
from rubicon_ml import Rubicon

NUM_EXPERIMENTS = 4

rb = Rubicon(persistence="memory")
pr = rb.get_or_create_project(name="jsonpath")

for _ in range(NUM_EXPERIMENTS):
    tags = [random.choice(["a", "b", "c"])]
    ex = pr.log_experiment(tags=tags)
        
    for feature in ["f", "g", "h", "i"]:
        ex.log_feature(name=feature)
            
    for parameter in [("d", 100), ("e", 1000), ("f", 1000)]:
        name, value = parameter
        ex.log_parameter(name=name, value=value)
        
    for metric in ["j", "k"]:
        value = random.choice([0.0, 1.0])
        tags = [random.choice(["l", "m", "n"])]
        ex.log_metric(name=metric, value=value, tags=tags)
        
    ex.log_artifact(name="o", data_bytes=b"o")
    ex.log_dataframe(pd.DataFrame([[0, 1], [1, 0]]))
    
pr.log_artifact(name="p", data_bytes=b"p")
pr.log_dataframe(pd.DataFrame([[0, 1], [1, 0]]))

pr

<rubicon_ml.client.project.Project at 0x1374d95d0>

### Demonstrating the `RubiconJSON` class

`RubiconJSON` accepts any combination of top-level `rubicon_objects`, `projects`, and `experiments` as keyword arguments, each given either as a single object or as a `list`. It returns a `RubiconJSON` object whose `json` property holds the nested structure printed in the next cell. Here we'll simply demonstrate with the example project we just created.

In [3]:
from rubicon_ml import RubiconJSON

pr_json = RubiconJSON(projects=pr)
pr_json.json

{'project': [{'name': 'jsonpath',
   'id': '14dc8759-8403-459b-876d-4bad6374a0a6',
   'description': None,
   'github_url': None,
   'training_metadata': None,
   'created_at': datetime.datetime(2023, 1, 4, 21, 34, 44, 295975),
   'artifact': [{'name': 'p',
     'id': 'a62d1af0-27da-4a90-8277-b38761d475d7',
     'description': None,
     'created_at': datetime.datetime(2023, 1, 4, 21, 34, 44, 299675),
     'tags': [],
     'parent_id': '14dc8759-8403-459b-876d-4bad6374a0a6'}],
   'dataframe': [{'id': 'ec45458e-3258-432a-be53-91b380abdbd3',
     'name': None,
     'description': None,
     'tags': [],
     'created_at': datetime.datetime(2023, 1, 4, 21, 34, 44, 299832),
     'parent_id': '14dc8759-8403-459b-876d-4bad6374a0a6'}],
   'experiment': [{'project_name': 'jsonpath',
     'id': '6e97f0fb-505c-4460-9e5d-2741111f443f',
     'name': None,
     'description': None,
     'model_name': None,
     'branch_name': None,
     'commit_hash': None,
     'training_metadata': None,
     'tags
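
The dictionary contains `datetime.datetime` values (see `created_at` above), so the standard `json` module cannot serialize it without a fallback. Below is a minimal sketch for pretty-printing or saving such a structure. It uses a small hand-written stand-in: the `sample` dictionary and its values are illustrative assumptions, not real output.

```python
import datetime
import json

# Hypothetical stand-in with the same general shape as `pr_json.json`, trimmed for brevity.
sample = {
    "project": [
        {
            "name": "jsonpath",
            "created_at": datetime.datetime(2023, 1, 4, 21, 34, 44),
            "experiment": [{"name": None, "tags": ["a"]}],
        }
    ]
}

# `default=str` converts values that json cannot encode (here, datetimes) to strings.
pretty = json.dumps(sample, default=str, indent=2)
print(pretty)
```

With the real object, the same call should apply to `pr_json.json`, since the source describes it as a plain `dict`.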

### Demonstrating RubiconJSON search functionality

#### Here we'll get the metrics from each experiment

In [4]:
res = pr_json.search("$..experiment[*].metric")

print(f"{len(res)} experiments")
for match in res:
    print(f"{len(match.value)} metrics")
    print(match.value)

4 experiments
2 metrics
[{'name': 'j', 'value': 1.0, 'id': '89b243a6-6687-44ad-84d2-5e2fd920f99c', 'description': None, 'directionality': 'score', 'created_at': datetime.datetime(2023, 1, 4, 21, 34, 44, 296681), 'tags': ['n']}, {'name': 'k', 'value': 0.0, 'id': '608deb15-bfde-4830-9aee-5324689f2d7c', 'description': None, 'directionality': 'score', 'created_at': datetime.datetime(2023, 1, 4, 21, 34, 44, 296744), 'tags': ['l']}]
2 metrics
[{'name': 'j', 'value': 0.0, 'id': 'ad75fa80-864d-4f08-a144-0e7357464d4a', 'description': None, 'directionality': 'score', 'created_at': datetime.datetime(2023, 1, 4, 21, 34, 44, 297664), 'tags': ['l']}, {'name': 'k', 'value': 0.0, 'id': '896626d1-f3cc-418d-b227-aabce774f83c', 'description': None, 'directionality': 'score', 'created_at': datetime.datetime(2023, 1, 4, 21, 34, 44, 297724), 'tags': ['n']}]
2 metrics
[{'name': 'j', 'value': 1.0, 'id': '3ee54ec9-64f0-47e1-8549-cb41e1049480', 'description': None, 'directionality': 'score', 'created_at': datet
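
To make the `$..experiment` part of the query concrete: `..` is JSONPath's recursive descent, which matches a key at any depth. The plain-Python sketch below imitates that behavior on a hand-written stand-in. `find_key` and `sample` are illustrative names, not part of `rubicon_ml` or `jsonpath_ng`.

```python
def find_key(node, key):
    """Yield every value stored under `key`, at any depth of nested dicts and lists."""
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                yield v
            yield from find_key(v, key)
    elif isinstance(node, list):
        for item in node:
            yield from find_key(item, key)


# Hypothetical data shaped like the project dictionary, with two experiments.
sample = {
    "project": [
        {
            "name": "jsonpath",
            "experiment": [
                {"tags": ["a"], "metric": [{"name": "j", "value": 1.0}, {"name": "k", "value": 0.0}]},
                {"tags": ["b"], "metric": [{"name": "j", "value": 0.0}, {"name": "k", "value": 0.0}]},
            ],
        }
    ]
}

# Rough equivalent of "$..experiment[*].metric": one list of metrics per experiment.
metrics_per_experiment = [
    experiment["metric"]
    for experiments in find_key(sample, "experiment")
    for experiment in experiments
]

print(f"{len(metrics_per_experiment)} experiments")
for metrics in metrics_per_experiment:
    print([m["name"] for m in metrics])
```

This is why each match in the search result above is a list of metrics rather than a single metric: the path stops at the `metric` key of each experiment.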

#### Now let's get all experiments with tag 'b'

In [5]:
res = pr_json.search("$..experiment[?(@.tags[*]=='b')]")

print(f"{len(res)} experiments")
for match in res:
    print(match.value)

1 experiments
{'project_name': 'jsonpath', 'id': '953e1550-29df-4fda-9972-e3a7b65e8f25', 'name': None, 'description': None, 'model_name': None, 'branch_name': None, 'commit_hash': None, 'training_metadata': None, 'tags': ['b'], 'created_at': datetime.datetime(2023, 1, 4, 21, 34, 44, 298836), 'feature': [{'name': 'f', 'id': 'c86a3fa6-a4b8-4538-bc75-a5f588187060', 'description': None, 'importance': None, 'tags': [], 'created_at': datetime.datetime(2023, 1, 4, 21, 34, 44, 298882)}, {'name': 'g', 'id': '0d2c1514-0420-459b-937e-8b4e707c5767', 'description': None, 'importance': None, 'tags': [], 'created_at': datetime.datetime(2023, 1, 4, 21, 34, 44, 298950)}, {'name': 'h', 'id': 'dbc8e53b-e7ca-4357-8278-a96da8b0b4e6', 'description': None, 'importance': None, 'tags': [], 'created_at': datetime.datetime(2023, 1, 4, 21, 34, 44, 298997)}, {'name': 'i', 'id': 'bc81ed11-8230-48b9-ba8a-68023248e055', 'description': None, 'importance': None, 'tags': [], 'created_at': datetime.datetime(2023, 1, 4, 2

#### To narrow our scope, let's get all metrics named 'j' with a value of at least 0.5 from each experiment

In [6]:
res = pr_json.search("$..experiment[*].metric[?(@.name=='j' & @.value>=0.5)]")

print(f"{len(res)} metrics")
for match in res:
    print(match.value)

2 metrics
{'name': 'j', 'value': 1.0, 'id': '89b243a6-6687-44ad-84d2-5e2fd920f99c', 'description': None, 'directionality': 'score', 'created_at': datetime.datetime(2023, 1, 4, 21, 34, 44, 296681), 'tags': ['n']}
{'name': 'j', 'value': 1.0, 'id': '3ee54ec9-64f0-47e1-8549-cb41e1049480', 'description': None, 'directionality': 'score', 'created_at': datetime.datetime(2023, 1, 4, 21, 34, 44, 298445), 'tags': ['n']}
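
For readers new to JSONPath filters, the expression `[?(@.name=='j' & @.value>=0.5)]` keeps the list elements for which both conditions hold. Here is a plain-Python sketch of the same selection over hand-written data. The `experiments` list is an illustrative assumption, not the logged project.

```python
# Hypothetical experiments, each with a list of metric dictionaries.
experiments = [
    {"metric": [{"name": "j", "value": 1.0}, {"name": "k", "value": 0.0}]},
    {"metric": [{"name": "j", "value": 0.0}, {"name": "k", "value": 1.0}]},
    {"metric": [{"name": "j", "value": 1.0}, {"name": "k", "value": 1.0}]},
]

# Keep metrics named "j" whose value is at least 0.5, across all experiments.
selected = [
    metric
    for experiment in experiments
    for metric in experiment["metric"]
    if metric["name"] == "j" and metric["value"] >= 0.5
]

print(f"{len(selected)} metrics")
for metric in selected:
    print(metric)
```

Note that the filter is applied to the elements of each `metric` list, so the matches are individual metrics, not experiments.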


#### Now let's get all experiments that contain a metric named 'j' with a value of at most 0.5

In [7]:
res = pr_json.search("$..experiment[?(@.metric[?(@.name=='j')].value<=0.5)]")

print(f"{len(res)} experiments")
for match in res:
    print(match.value)

2 experiments
{'project_name': 'jsonpath', 'id': 'cc1b8445-b67a-4007-a0bd-dbda2a56f8c5', 'name': None, 'description': None, 'model_name': None, 'branch_name': None, 'commit_hash': None, 'training_metadata': None, 'tags': ['a'], 'created_at': datetime.datetime(2023, 1, 4, 21, 34, 44, 297268), 'feature': [{'name': 'f', 'id': '8edfdcbe-29ff-4cad-afcf-25bb03e0eb3e', 'description': None, 'importance': None, 'tags': [], 'created_at': datetime.datetime(2023, 1, 4, 21, 34, 44, 297317)}, {'name': 'g', 'id': '92890604-d1f2-4883-8957-14984417470b', 'description': None, 'importance': None, 'tags': [], 'created_at': datetime.datetime(2023, 1, 4, 21, 34, 44, 297378)}, {'name': 'h', 'id': '7100bca1-2322-425f-a57e-65b12be7889d', 'description': None, 'importance': None, 'tags': [], 'created_at': datetime.datetime(2023, 1, 4, 21, 34, 44, 297426)}, {'name': 'i', 'id': 'f7b7a878-10ca-4b2e-9ad2-1d19e7fdc12d', 'description': None, 'importance': None, 'tags': [], 'created_at': datetime.datetime(2023, 1, 4, 2
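
In this query the filter sits on the `experiment` list, so whole experiments are returned and the inner filter on `metric` only decides which ones qualify. The plain-Python sketch below illustrates that reading, assuming an experiment matches when at least one of its metrics satisfies the condition. `has_low_j`, the `id` values, and the `experiments` list are illustrative, not part of `rubicon_ml`.

```python
# Hypothetical experiments with made-up ids.
experiments = [
    {"id": "exp-1", "metric": [{"name": "j", "value": 1.0}, {"name": "k", "value": 0.0}]},
    {"id": "exp-2", "metric": [{"name": "j", "value": 0.0}, {"name": "k", "value": 1.0}]},
    {"id": "exp-3", "metric": [{"name": "j", "value": 0.0}, {"name": "k", "value": 0.0}]},
]


def has_low_j(experiment, threshold=0.5):
    """Return True if any metric named "j" has a value at or below `threshold`."""
    return any(
        metric["name"] == "j" and metric["value"] <= threshold
        for metric in experiment.get("metric", [])
    )


# Filter on experiments, not on metrics: each match is a full experiment dictionary.
matching = [experiment for experiment in experiments if has_low_j(experiment)]

print(f"{len(matching)} experiments")
print([experiment["id"] for experiment in matching])
```

Compare this with the previous query, where the same kind of condition was applied directly to the metrics and returned metric dictionaries instead.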

### Returning `rubicon_ml` objects from `RubiconJSON` search

We can set the `return_type` parameter of `RubiconJSON`'s `search` method to return the query results as `rubicon_ml` objects. That is, we can retrieve `artifact`, `dataframe`, `experiment`, `feature`, `metric`, `parameter`, or `project` objects by setting `return_type` to "artifact", "dataframe", "experiment", "feature", "metric", "parameter", or "project", respectively.

In [8]:
res = pr_json.search("$..experiment[*].metric[?(@.name=='j' & @.value>=0.5)]", return_type="metric")
res

[<rubicon_ml.client.metric.Metric at 0x160dbb6d0>,
 <rubicon_ml.client.metric.Metric at 0x160dbb730>]

In [9]:
for m in res:
    print(m.name, m.value)

j 1.0
j 1.0
