# Functionality
## PyTrack

The `@PyTrack()` decorator is the core of PyTrack.
It converts a class into a DVC stage and enables tracking.
For this to work, the class must implement the methods `__init__`, `__call__` and `run`.

In [1]:
from pytrack import PyTrack

@PyTrack()
class HelloWorld:
    def __init__(self):
        pass
    def __call__(self, *args, **kwargs):
        pass
    def run(self):
        pass

hello_world = HelloWorld()

The decorator adds a `pytrack` attribute to the decorated class, which holds most of the required methods.
It is used internally and usually does not need to be changed.
It is also protected against accidental overwriting.


In [2]:
print(vars(hello_world.pytrack))

try:
    hello_world.pytrack = "PyTrack"
except AttributeError:
    print("AttributeError: can't set attribute")

{'child': <__main__.HelloWorld object at 0x000002B781273190>, 'slurm_config': SlurmConfig(n=1), '_id': 0, '_running': False, '_module': None, 'dvc_file': 'dvc.yaml', 'was_called': False, 'allow_param_change': True, 'allow_result_change': False, 'dvc': DVCParams(multi_use=False, params_file=WindowsPath('config/params.json'), internals_file=WindowsPath('config/pytrack.json'), json_file=None, deps=[], outs=[], outs_path=WindowsPath('outs'), outs_no_cache=[], outs_no_cache_path=WindowsPath('outs'), outs_persistent=[], outs_persistent_path=WindowsPath('outs'), metrics=[], metrics_path=WindowsPath('metrics'), metrics_no_cache=[], metrics_no_cache_path=WindowsPath('metrics'), plots=[], plots_path=WindowsPath('plots'), plots_no_cache=[], plots_no_cache_path=WindowsPath('plots')), 'nb_mode': False}
AttributeError: can't set attribute
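
One common way to obtain this kind of overwrite protection in plain Python is a property without a setter. The following is only a minimal sketch of that idea; the names `Tracker` and `protect` are hypothetical and this is not necessarily how PyTrack implements it.

```python
class Tracker:
    """Hypothetical stand-in for the object stored under `pytrack`."""

    def __init__(self, child):
        self.child = child


def protect(cls):
    """Class decorator that attaches a read-only `pytrack` property."""

    def get_tracker(self):
        # Create the tracker lazily, once per instance.
        if "_tracker" not in self.__dict__:
            self.__dict__["_tracker"] = Tracker(child=self)
        return self.__dict__["_tracker"]

    # A property without a setter raises AttributeError on assignment.
    cls.pytrack = property(get_tracker)
    return cls


@protect
class HelloWorld:
    pass


hello_world = HelloWorld()
try:
    hello_world.pytrack = "PyTrack"
    overwritten = True
except AttributeError:
    overwritten = False

print(overwritten)
```

Because the property lives on the class, the assignment on the instance is intercepted and rejected, while reading `hello_world.pytrack` keeps working.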


Using PyTrack becomes more interesting once parameters, dependencies and outputs are defined.
They are explained in more detail in the upcoming tutorials.
The basic principle is to define them in `__init__` using `from pytrack import DVC`, then set the user input within `__call__` and the outputs within `run`.
There are two different types of outputs: `DVC.result` and file-based outputs such as `DVC.outs`.
The first is a special PyTrack output that is written to a JSON file.
It requires JSON-serializable data and is designed for small pieces of information, such as short lists or dictionaries.
All other supported outputs, such as `DVC.outs`, are plain `pathlib.Path` objects that have to be read and written manually.

In [3]:
from pytrack import DVC

@PyTrack()
class HelloWorld:
    def __init__(self):
        self.result = DVC.result()
        self.output_file = DVC.outs('some_file.txt')

    def __call__(self, *args, **kwargs):
        pass
    def run(self):
        self.result = {'name': "HelloWorld", "ids": [1, 2, 3, 4, 5, 6]}
        self.output_file.write_text("Lorem Ipsum")

hello_world = HelloWorld()
# Let's have a look
print(hello_world.output_file)
print(hello_world.result)

outs\some_file.txt
{'name': 'HelloWorld', 'ids': [1, 2, 3, 4]}


We can see that `output_file` is placed in the directory `outs`, which PyTrack uses to handle stage outputs.
The directory is defined in `outs_path`:

In [4]:
hello_world.pytrack.dvc.outs_path

WindowsPath('outs')
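
The difference between the two output types can be illustrated without PyTrack. The sketch below assumes a helper `resolve_outs` and a file name `results.json`, both hypothetical; it only mimics the behaviour described above (file names prefixed with the output directory, results stored as JSON) inside a temporary directory.

```python
import json
import tempfile
from pathlib import Path


def resolve_outs(names, outs_path):
    """Prefix a file name, or each name in a list, with the output directory."""
    if isinstance(names, (list, tuple)):
        return [Path(outs_path) / name for name in names]
    return Path(outs_path) / names


with tempfile.TemporaryDirectory() as tmp:
    outs_path = Path(tmp) / "outs"
    outs_path.mkdir()

    # File-like output: only a path is handed out, reading and writing is manual.
    output_file = resolve_outs("some_file.txt", outs_path)
    output_file.write_text("Lorem Ipsum")
    file_content = output_file.read_text()

    # Result-like output: small, JSON-serializable data stored in one file.
    result_file = Path(tmp) / "results.json"  # hypothetical file name
    result_file.write_text(json.dumps({"name": "HelloWorld", "ids": [1, 2, 3]}))
    result = json.loads(result_file.read_text())

print(file_content)
print(result)
```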

Note that any code that writes to the results or to other output files must be placed in, or called from, the `run` method.
It is of course possible to have additional methods, e.g., some post-evaluation of the results, but DVC will neither run nor track them!

In [5]:
hello_world.run()

print(hello_world.output_file.read_text())
print(hello_world.result)

Lorem Ipsum
{'name': 'HelloWorld', 'ids': [1, 2, 3, 4, 5, 6]}


In any case, the result cannot be overwritten outside of the `run` method:

In [6]:
hello_world.result = "New Result"
print(hello_world.result)

Result can only be changed within `run` call!


{'name': 'HelloWorld', 'ids': [1, 2, 3, 4, 5, 6]}
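
A guard like this can be sketched with a descriptor that checks a flag set while `run` executes. The class `GuardedResult` is hypothetical and the example only illustrates the behaviour shown above; PyTrack's internals may differ.

```python
class GuardedResult:
    """Descriptor that accepts new values only while `run` is executing."""

    def __set_name__(self, owner, name):
        self.storage = "_" + name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return instance.__dict__.get(self.storage)

    def __set__(self, instance, value):
        if instance.__dict__.get("_running", False):
            instance.__dict__[self.storage] = value
        else:
            # Assignments outside of `run` are ignored with a message.
            print("Result can only be changed within `run` call!")


class HelloWorld:
    result = GuardedResult()

    def run(self):
        self._running = True
        try:
            self.result = {"name": "HelloWorld"}
        finally:
            self._running = False


hello_world = HelloWorld()
hello_world.run()
hello_world.result = "New Result"  # ignored, prints the message
print(hello_world.result)
```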


What is currently still missing from our example is any user input.
We can add user interaction through `DVC.params` and the `__call__` method.

In [7]:
@PyTrack()
class HelloWorld:
    def __init__(self):
        self.result = DVC.result()
        self.output_file = DVC.outs('some_file.txt')
        self.list_length = DVC.params() # you can pass a default here as DVC.params(10)

    def __call__(self, list_length):
        self.list_length = list_length

    def run(self):
        self.result = {'name': "HelloWorld", "ids": list(range(self.list_length))}
        self.output_file.write_text("Lorem Ipsum")

hello_world = HelloWorld()
hello_world(10)
hello_world.run()
hello_world.result

--- Writing new DVC file! ---
Overwriting existing configuration!
ERROR: you are not inside of a DVC repository (checked up to mount point 'C:\')



{'name': 'HelloWorld', 'ids': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]}

As with `DVC.result`, parameters can only be changed within the `__init__` or `__call__` method.

In [8]:
hello_world.list_length = 27
hello_world.list_length

This stage is being loaded. No internals will be changed!


10

It is also possible to pass lists or dictionaries to, e.g., `DVC.params`:

In [9]:
@PyTrack()
class HelloWorld:
    def __init__(self):
        self.result = DVC.result()
        self.output_files = DVC.outs(['first_file.txt', 'second_file.txt'])
        self.list_options = DVC.params() # you can pass a default here, e.g., DVC.params({'start': 0, 'stop': 10})

    def __call__(self, list_options):
        self.list_options = list_options

    def run(self):
        self.result = {
            'name': "HelloWorld",
            "ids": [x for x in range(self.list_options.get('start'), self.list_options.get('stop'))]
        }
        self.output_files[0].write_text("Lorem Ipsum")
        self.output_files[1].write_text("Example Value")

hello_world = HelloWorld()
hello_world({'start': 1, 'stop':5})
hello_world.run()
print(hello_world.result)
print(hello_world.list_options)
print(hello_world.output_files)

Used mutable type list for outs! Always overwrite the outs and don't append to it! It won't work.
Used mutable type list for outs! Always overwrite the outs and don't append to it! It won't work.
Used mutable type dict for params! Always overwrite the params and don't alter it otherwise! It won't work.
--- Writing new DVC file! ---
Overwriting existing configuration!
ERROR: you are not inside of a DVC repository (checked up to mount point 'C:\')



{'name': 'HelloWorld', 'ids': [1, 2, 3, 4]}
{'start': 1, 'stop': 5}
[WindowsPath('outs/first_file.txt'), WindowsPath('outs/second_file.txt')]


However, they cannot be modified in place, e.g., appended to; they must be overwritten!

In [10]:
@PyTrack()
class HelloWorld:
    def __init__(self):
        self.result = DVC.result()
        self.output_files = DVC.outs(['first_file.txt', 'second_file.txt'])

    def __call__(self):
        print(self.output_files)
        # appending has no effect, the list is unchanged afterwards
        self.output_files.append('third_file.txt')
        print(self.output_files)
        # overwriting the attribute with a new list works
        self.output_files = [x.name for x in self.output_files] + ['third_file.txt']
        print(self.output_files)

    def run(self):
        pass

hello_world = HelloWorld()
hello_world()

Used mutable type list for outs! Always overwrite the outs and don't append to it! It won't work.
Used mutable type list for outs! Always overwrite the outs and don't append to it! It won't work.
Used mutable type list for outs! Always overwrite the outs and don't append to it! It won't work.
--- Writing new DVC file! ---
Overwriting existing configuration!


[WindowsPath('outs/first_file.txt'), WindowsPath('outs/second_file.txt')]
[WindowsPath('outs/first_file.txt'), WindowsPath('outs/second_file.txt')]
[WindowsPath('outs/first_file.txt'), WindowsPath('outs/second_file.txt'), WindowsPath('outs/third_file.txt')]


ERROR: you are not inside of a DVC repository (checked up to mount point 'C:\')
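
Why an `append` can get lost while an assignment works may be easier to see in a small stand-alone sketch. It assumes, purely for illustration, that the attribute is rebuilt from stored file names on every access; the class `Stage` is hypothetical and not taken from PyTrack.

```python
from pathlib import Path


class Stage:
    """Hypothetical stage whose `outs` are rebuilt from stored names on access."""

    def __init__(self, names):
        self._names = list(names)

    @property
    def outs(self):
        # Every access builds a new list, so mutating it changes nothing.
        return [Path("outs") / name for name in self._names]

    @outs.setter
    def outs(self, names):
        # Only assignment reaches the stored state.
        self._names = [Path(name).name for name in names]


stage = Stage(["first_file.txt", "second_file.txt"])
stage.outs.append("third_file.txt")  # lost: appended to a temporary list
after_append = len(stage.outs)

stage.outs = [x.name for x in stage.outs] + ["third_file.txt"]  # kept
after_assign = len(stage.outs)

print(after_append, after_assign)
```

Under this assumption the appended entry lands in a temporary list that is discarded right away, whereas the assignment goes through the setter and is stored.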



## PyTrackProject

PyTrack also provides a Python interface to DVC via the `PyTrackProject`.