In [1]:
import os
import sys
sys.path.append(os.path.dirname(os.path.dirname(os.getcwd())))

# Setup stuff.  The cell just supports the workbook, you can ignore it
EXAMPLE_DO_FOLDER = os.path.join(os.getcwd(), "example_do_folder")
INSERTION_FOLDER = EXAMPLE_DO_FOLDER

def show_loadable(at):
    with open(os.path.join(INSERTION_FOLDER, at)) as file:
        contents = file.read()
    with_bar = "\n  | ".join(contents.split("\n"))
    print(F"\n### SHOWING MODULE {'.../'+at!r}\n  | {with_bar}\n\n")

def run_do(*args, **kwargs):
    parts = [repr(x) for x in args] + [F"{k}={v!r}" for k, v in kwargs.items()]
    print(F"do({', '.join(parts)})")
    result = do(*args, **kwargs)
    print(F"--> {result!r}")
    print()


print(F"### INSERTION FOLDER        = {INSERTION_FOLDER!r}")
print(F"### DO FOLDER (in Jupyter)  = {EXAMPLE_DO_FOLDER}")
sys.path.append(os.path.dirname(os.path.dirname(EXAMPLE_DO_FOLDER)))
from dvc_dat import do, dat_config   # Add all loadables BEFORE loading this module
if not os.path.exists(INSERTION_FOLDER):
    input(f"WARNING: INSERTION_FOLDER {INSERTION_FOLDER!r} not found.")
    os.makedirs(INSERTION_FOLDER)
print("\n\n\n\n")




<style>
.output_scroll {
    height: 1500px !important; /* You can change 1500px to your desired height */
}
</style>

# Mapping dotted-strings to python source code objects.
The "do.load" function provides a simple way to load functions and data directly from
python source modules.


### Registering and accessing python objects

In [2]:
do.mount(at="foo.bar.baz", value=[11, "two"])

In [3]:
do.load("foo.bar.baz")

.
##### And you can see semantically this is just a dict-tree of values:

In [4]:
# Code needs to be refactored for this to work:
# do.load("foo.bar")   
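The dict-tree idea can be pictured with a few lines of plain Python. This is only a sketch under assumptions: `mount_sketch`, `load_sketch`, and `_ROOT` are hypothetical names made up for this illustration, not part of dvc_dat, and the real library may store things differently. In this sketch, loading an interior name such as "foo.bar" simply returns the enclosing sub-dict.

```python
# Illustrative only: a tiny dict-tree registry, NOT dvc_dat's implementation.
# All names here (mount_sketch, load_sketch, _ROOT) are hypothetical.
_ROOT = {}

def mount_sketch(at, value):
    """Store `value` under the dotted path `at`, creating nested dicts as needed."""
    *parents, leaf = at.split(".")
    node = _ROOT
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value

def load_sketch(dotted):
    """Walk the dict-tree one key at a time and return whatever is found there."""
    node = _ROOT
    for key in dotted.split("."):
        node = node[key]
    return node

mount_sketch("foo.bar.baz", [11, "two"])
leaf_value = load_sketch("foo.bar.baz")   # the mounted list
subtree = load_sketch("foo.bar")          # the dict that contains it
print(leaf_value, subtree)
```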

### Registering python modules
Here we register a Python module, then load data and functions from it via dotted.names.

In [5]:
path = f"{EXAMPLE_DO_FOLDER}/some_python_file.py"
print(f"here we are registering 'params' as {path!r}.")
do.mount(at="params", file=path)

In [6]:
show_loadable("some_python_file.py")

In [7]:
do.load("params.a_value")

In [8]:
fn = do.load("params.a_function")
fn()

In [9]:
# The do function is just a wrapper around do.load: it loads the named function and calls it, so this cell does the same thing as the one above.
do("params.a_function")
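One way to picture what happens when a file is mounted and a dotted name is resolved against it is the standard `importlib` machinery. The sketch below is an assumption about the general technique, not dvc_dat's actual code. The helper names (`load_module_from_path`, `resolve`) and the file `demo_params.py` with its contents are hypothetical stand-ins for the notebook's `some_python_file.py`.

```python
import importlib.util
import os
import tempfile

def load_module_from_path(name, path):
    """Import a Python source file by path (hypothetical helper, not dvc_dat API)."""
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module

def resolve(obj, dotted):
    """Follow a dotted name through module attributes and nested dicts."""
    for part in dotted.split("."):
        obj = obj[part] if isinstance(obj, dict) else getattr(obj, part)
    return obj

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file contents, written only for this demonstration.
    path = os.path.join(tmp, "demo_params.py")
    with open(path, "w") as f:
        f.write("a_value = 42\n\ndef a_function():\n    return 'called'\n")
    params = load_module_from_path("demo_params", path)
    loaded_value = resolve(params, "a_value")        # like loading "params.a_value"
    fn_result = resolve(params, "a_function")()      # load the function, then call it

print(loaded_value, fn_result)
```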

### DATCONFIG - Implicitly registered modules
'Do' scans from the CWD to find '.datconfig' and uses the do_folder it specifies.  At load time, this folder tree is scanned and all .py, .json, and .yaml files found are implicitly registered under the basename of each file.
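The scan-and-register step can be sketched with `os.walk`. This is a rough illustration under assumptions, not the library's implementation: `scan_folder` is a hypothetical name, the demo files are created only for this example, and the choice that the first file found wins when two basenames collide is a guess.

```python
import os
import tempfile

LOADABLE_EXTENSIONS = (".py", ".json", ".yaml")

def scan_folder(do_folder):
    """Map each loadable file's basename (without extension) to its full path.

    Hypothetical sketch: on a basename collision the first file found wins,
    which may not match what dvc_dat actually does.
    """
    registry = {}
    for dirpath, _dirnames, filenames in os.walk(do_folder):
        for filename in sorted(filenames):
            stem, ext = os.path.splitext(filename)
            if ext in LOADABLE_EXTENSIONS:
                registry.setdefault(stem, os.path.join(dirpath, filename))
    return registry

with tempfile.TemporaryDirectory() as tmp:
    # Build a small throwaway folder tree with two loadables and one ignored file.
    os.makedirs(os.path.join(tmp, "deep", "deep"))
    for rel in ("my_data.py", "notes.txt", os.path.join("deep", "deep", "deep_hello.py")):
        open(os.path.join(tmp, rel), "w").close()
    registry = scan_folder(tmp)
    names = sorted(registry)

print(names)
```

Note that the nested file is registered by its basename alone, so it would be reachable from the root of the namespace regardless of how deep it sits in the tree.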

In [10]:
show_loadable("../.datconfig")

In [11]:
show_loadable("hello/do_examples/my_data.py")


#### Loading from implicitly defined modules
These values are all loaded from python and yaml files implicitly registered since they are contained under the do_folder.

In [12]:
do.load("my_data.message")    # Returns the global variable 'message' from the my_data.py file.

In [13]:
do.load("my_data.a_tree.b")   # Returns the value of the nested variable 'b' from the 'a_tree' dictionary in my_data.py

In [14]:
#
# But under the covers, this tree of values is really still just data contained in some python module.  This module can be accessed directly if needed:
do.load("my_data")

In [15]:
show_loadable("hello/do_examples/my_yaml_data.yaml")

In [16]:
do.load("my_yaml_data.two")   # loading within a substructure

In [17]:
do.load("my_data.a_tree.a.two")  # Same data included into another structure

# CORE "DO" FUNCTIONALITY -- Dynamically Loaded Function

The do function provides efficient access to dynamically searched and loaded python functions:
1. that are referenced by a naming string
2. that are dynamically loaded from a Python module
3. that accept fixed & keyword arguments and return results as any function does

.
## EXAMPLE -- THE SIMPLEST "DO" CALL
This "do" loads hello_world.py and runs the function hello_world from it.

In [18]:
show_loadable("hello/do_examples/hello_world.py")

In [19]:
do("hello_world")

.
## EXAMPLE -- MULTIPLE DO FUNCTIONS DEFINED IN ONE MODULE
One can put multiple do functions in one file and reference them with dot notation, as shown here.

In [20]:
show_loadable("hello/do_examples/hello_again.py")

In [21]:
do("hello_again.hella")

.
## EXAMPLE -- PASSING ARGS AND RESULTS
Here we see fixed and keyword args being forwarded by do to the underlying function.
Likewise, its result is forwarded back as the result of the do call.

In [22]:
do("hello_again.salutation", "Michael", emphasis=True)
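The forwarding behaviour can be pictured as a lookup followed by an ordinary call. The sketch below is a hypothetical stand-in: `do_sketch`, `REGISTRY`, and this `salutation` body are invented for illustration, and the real `hello_again.salutation` may format its greeting differently.

```python
def salutation(name, emphasis=False):
    # Hypothetical stand-in for a loadable function; the real one may differ.
    text = f"Hello {name}"
    return (text.upper() + "!") if emphasis else text

# A hand-built namespace standing in for the registered loadables.
REGISTRY = {"hello_again": {"salutation": salutation}}

def do_sketch(dotted, *args, **kwargs):
    """Resolve a dotted name, then forward all args and return the result unchanged."""
    target = REGISTRY
    for key in dotted.split("."):
        target = target[key]
    return target(*args, **kwargs)

result = do_sketch("hello_again.salutation", "Michael", emphasis=True)
print(result)
```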

.
## EXAMPLE -- ALL SUB-FOLDERS OF A DO_FOLDER ARE SCANNED AND ADDED TO THE ROOT OF THE DO NAMESPACE
Here we see "deep_hello" is called even when it occurs deeply within the folder tree.

In [23]:
show_loadable("hello/do_examples/deep/deep/deep/deep_hello.py")

In [24]:
do("deep_hello")

.
.
# >>> CALLING A CONFIGURATION <<<
In addition to invoking a simple function, "do" can also invoke a configuration dict.
In this case:
1. The dict is expanded by recursively looking up "main.base" and using its value tree as defaults.
2. The function associated with "main.do" is then called.
3. The expanded dict is passed as the first arg, followed by the args passed to do.
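The three steps above can be sketched as a recursive merge followed by a call. This is a minimal illustration under assumptions, not dvc_dat's code: `merge_defaults`, `expand`, `run_config`, `greet`, and the contents of `LOOKUP` (apart from the config names and the 777 override that appear in this notebook) are hypothetical.

```python
import copy

def merge_defaults(config, defaults):
    """Return `defaults` overlaid with `config`; nested dicts are merged key by key."""
    merged = copy.deepcopy(defaults)
    for key, value in config.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_defaults(value, merged[key])
        else:
            merged[key] = copy.deepcopy(value)
    return merged

def expand(config, lookup):
    """Step 1: recursively follow main.base and use the base as defaults."""
    base_name = config.get("main", {}).get("base")
    if base_name is None:
        return copy.deepcopy(config)
    return merge_defaults(config, expand(lookup[base_name], lookup))

def run_config(name, lookup, *args, **kwargs):
    expanded = expand(lookup[name], lookup)
    fn = lookup[expanded["main"]["do"]]       # Step 2: find the main.do function
    return fn(expanded, *args, **kwargs)      # Step 3: expanded dict goes first

def greet(config, name="world"):
    # Hypothetical action function, invented for this sketch.
    return f"{config['greeting']} {name}, lucky number {config['lucky_number']}"

LOOKUP = {
    "greet": greet,
    "hello_config": {"main": {"do": "greet"}, "greeting": "Hello", "lucky_number": 7},
    "hello_shadowed_config": {"main": {"base": "hello_config"}, "lucky_number": 777},
}

message = run_config("hello_shadowed_config", LOOKUP, "Martin")
print(message)
```

Here the shadowing config supplies only the value it overrides; the function to call and the remaining parameters come from its base.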

.
## EXAMPLE -- CALLING A CONFIG
Here 'hello_config' loads a json file instead of a python function.
In this case the "main.do" value of "hello config action" is loaded and called.

In [25]:
show_loadable("hello/do_examples/configurable_salutation.py")

In [26]:
show_loadable("hello/do_examples/hello_config.json")

In [27]:
do("hello_config", "Martin")

.
## EXAMPLE -- CONFIG INHERITANCE
Here 'hello_shadowed_config' sets lucky_number to 777 and inherits the function to call and its other parameters from 'hello_config'.

In [28]:
show_loadable("hello/do_examples/hello_shadowed_config.json")

In [29]:
do("hello_shadowed_config")

_
## EXAMPLE -- COMBINING CONFIGS AND CODE
Complex tools (including nearly all visualizers/report generators) naturally have simple config info best expressed as a config dict,
and complex config best expressed in Python.  Forcing these to be separate loadables generates a confusing sea of tiny
2-line loadable files.

To address this, "do" allows config data (normally stored in .json) to be stored in a variable in a .py file.  This allows that
config info to be bundled, in the same module, with the functions that the config references.

The example below shows a silly complex tool that applies a sequence of text transformation rules to a sequence of letters.
The first loadable provides a config with the base parameters and the rule engine itself.  The second loadable configures the tool and
provides a couple of small Python rule functions that are used by the configuration, all nicely wrapped up in a single .py file.

In [30]:
show_loadable("hello/do_examples/letterator.py")

In [31]:
show_loadable("hello/do_examples/my_letters.py")

In [32]:
do("my_letters")

.
.
# USE CASE - SELF DOCUMENTING PROCESSES
When possible, we can use simple versioned objects to help us execute coding processes and
track/maintain those processes.

.
## EXAMPLE -- Loadable constant
Here we show that a loadable can be any python constant data value.
In this example we have a set of named lists that are used to track
our supported datasets, metrics, and tools.

This versioned data structure is used as input by the 'naughty_list' script, which scans
supported components to check that each has a doc string, both quick and full regression tests, etc.


In [33]:
show_loadable("hello/do_examples/supported.yaml")

In [34]:
show_loadable("hello/do_examples/team_highlight.py")

In [35]:
show_loadable("hello/do_examples/naughty_list.py")

In [36]:
# Note this code presently does not run


from dvc_dat import Dat, do

reg1_name = do.load("supported")["datasets"][0]   # Gets the name of a mcproc result to use
reg1 = Dat.create(spec={})     # This should be Dat.load(reg1) but that dat does not exist here
score = do("team_highlight.money", reg1)  # computes money metric on reg1

do("naughty_list")   # runs our checking code

.
.
# USING DO FROM THE COMMAND LINE
Do encapsulates execution as a self-describing building block.  The do function is designed to be easily
embedded within larger execution scripts.  In some cases it is convenient for a user to invoke do directly
as a top-level command.  The do command-line interface provides command-line support "for free" for any such
do function. It defines a simple mapping from --arguments and -a arguments onto Python fixed args and kwargs.
This is probably best shown using a series of examples:

.
### EXAMPLE -- Showing default usage command for 

In [37]:
!./do --usage

.
### EXAMPLE -- INVOKING A DO FUNCTION FROM THE COMMAND LINE
Earlier we had the hello_again.salutation function, which took fixed and keyword args.
Without additional configuration we can invoke it from the command line
using UNIX-style args and flags, as shown here:

In [38]:
!./do hello_again.salutation Maxim --emphasis
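The mapping from command-line tokens to a Python call can be pictured roughly as follows. This is a simplified sketch under assumptions, not the real parser: `parse_cli` is a hypothetical name, and the rules used here (bare tokens become fixed args, `--name` becomes `name=True`, `--name=value` becomes a string kwarg) are guesses that cover the call above. The `--usage` output is the authority on the actual rules.

```python
def parse_cli(argv):
    """Split command-line tokens into (loadable name, fixed args, keyword args).

    Hypothetical rules for illustration only:
      bare token      -> positional arg
      --flag          -> flag=True
      --key=value     -> key="value"
    """
    name, rest = argv[0], argv[1:]
    args, kwargs = [], {}
    for token in rest:
        if token.startswith("--"):
            key, sep, value = token[2:].partition("=")
            kwargs[key.replace("-", "_")] = value if sep else True
        else:
            args.append(token)
    return name, args, kwargs

name, args, kwargs = parse_cli(["hello_again.salutation", "Maxim", "--emphasis"])
print(name, args, kwargs)
```

Under these assumed rules the command line corresponds to the Python call `do("hello_again.salutation", "Maxim", emphasis=True)`.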

.
### EXAMPLE -- INVOKING A CONFIGURED TOOL FROM THE COMMAND LINE
In this example we show that one can also invoke a do configuration from the command line.
Here we have the same configurable "letterator" tool that was invoked as a do function above:

In [39]:
show_loadable("hello/do_examples/my_letters.py")

In [40]:
!./do my_letters

### EXAMPLE -- TWEAK CONFIG FROM COMMANDLINE
Often we script and configure a complex test, but then we want to tweak one or two parameters over and over and check our results.
(This becomes especially powerful when intermediate results are cached, so retesting is fast.)

In [41]:
!./do my_letters --set main.title "Re-configured letterator" --json rules '[[2, "my_letters.triple_it"]]'

.
### EXAMPLE -- SETTING MULTIPLE PARAMETERS AT ONCE
The --sets keyword can perform multiple simple assignments at once.

In [42]:
!./do my_letters --sets main.title=Quickie,start=100,end=110
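A comma-separated list of assignments like the one above can be applied to a nested config dict with a small helper. This sketch is an assumption about the general idea, not the actual implementation: `set_dotted`, `coerce`, and `apply_sets` are hypothetical names, the starting config is invented, and parsing each value as JSON when possible (so `100` becomes an int) is a guess about how values are typed.

```python
import json

def set_dotted(config, dotted_key, value):
    """Assign `value` at a dotted path, creating intermediate dicts as needed."""
    *parents, leaf = dotted_key.split(".")
    node = config
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value

def coerce(text):
    """Assumed typing rule: use the JSON value if the text parses, else keep the string."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return text

def apply_sets(config, assignments):
    """Apply 'a.b=1,c=two' style assignments to `config` in place and return it."""
    for pair in assignments.split(","):
        key, _, raw = pair.partition("=")
        set_dotted(config, key.strip(), coerce(raw.strip()))
    return config

# Hypothetical starting config, invented for this demonstration.
config = apply_sets(
    {"main": {"title": "Original"}, "start": 0},
    "main.title=Quickie,start=100,end=110",
)
print(config)
```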

# WRAPPING IT ALL UP - Templated Script Runs
As a special case, if the string passed to 'do' loads a dict instead of a callable then:
1. `dat_from_template` is called to create a fresh Dat based on this dict template.
2. `main.path` is expanded to get the name (path) for this dat.
3. Args and kwargs passed to `do` are added to `main.args` and `main.kwargs` in the
   newly created Dat.
4. Finally, any fn associated with `main.do` is called to perform the actual work of
   this Dat.  Results from this run are stored in _results_.json
