# Exercise 2 
The goal of the second exercise is to learn more about the `GenericJob` class and in particular how a job is executed inside pyiron when the user calls the `run()` function.

# ToyJob

As a first step, we define a `ToyJob` class. The following steps modify this class to explore how the `run()` function works:

In [1]:
from os.path import join
from pyiron_base import TemplateJob

In [2]:
class ToyJob(TemplateJob):
    def __init__(self, project, job_name):
        super().__init__(project, job_name)
        # The input consists of just a single value 
        self.input['input_energy'] = 100
        # The content of the input file is copied to the output file
        self.executable = "cat input > output"

    def write_input(self):
        # The input is written to the input file
        file = join(self.working_directory, "input") 
        with open(file, "w") as f:
            f.write(str(self.input['input_energy']))

    def collect_output(self):
        # The output file is read
        file = join(self.working_directory, "output") 
        with open(file) as f:
            line = f.readlines()
        # the output is parsed to get the copied energy
        energy = float(line[0])
        # the energy is stored in the HDF5 file 
        with self.project_hdf5.open("output/generic") as h5out: 
            h5out["energy_tot"] = energy

This simple class can be executed with pyiron using the following lines: 

In [3]:
from pyiron_base import Project
pr = Project('test')
pr.remove_jobs(recursive=True, silently=True)  # Delete all jobs in this project 
job = pr.create_job(job_type=ToyJob, job_name="toy")
job.run()
job.status

  0%|          | 0/1 [00:00<?, ?it/s]

The job toy was saved and received the ID: toy


'finished'
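
The write, execute, and collect cycle that `ToyJob` goes through can also be mimicked without pyiron. The following is a minimal sketch under stated assumptions: the helper name `toy_round_trip` is hypothetical, a temporary directory stands in for the job's working directory, and `shutil.copyfile` stands in for the `cat input > output` executable.

```python
import shutil
import tempfile
from os.path import join


def toy_round_trip(input_energy):
    """Mimic ToyJob: write an input file, copy it to an output file, parse it."""
    # Assumption: a temporary directory replaces job.working_directory
    with tempfile.TemporaryDirectory() as working_directory:
        input_file = join(working_directory, "input")
        output_file = join(working_directory, "output")
        # Step 1 (write_input): store the single input value as text
        with open(input_file, "w") as f:
            f.write(str(input_energy))
        # Step 2 (executable): stand-in for "cat input > output"
        shutil.copyfile(input_file, output_file)
        # Step 3 (collect_output): read the first line and convert it to a float
        with open(output_file) as f:
            lines = f.readlines()
        return float(lines[0])


energy = toy_round_trip(100)
print(energy)  # prints 100.0
```

The value is written as text and parsed back with `float()`, which is why the integer input `100` comes back as `100.0`.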

For convenience, we define a helper function that creates and runs a job of a given type; it is reused in the following cells:

In [4]:
def test_job_run(pr, job_type):
    pr.remove_jobs(recursive=True, silently=True) 
    job = pr.create_job(job_type=job_type, job_name="toy")
    job.run()
    return job.status

#### Task 1: 
Test the `test_job_run()` function, implemented above, by setting the arguments for the project `pr` and the `job_type`:

In [5]:
test_job_run(pr=pr, job_type=ToyJob)

  0%|          | 0/1 [00:00<?, ?it/s]

The job toy was saved and received the ID: toy


'finished'

# The run() function
To better understand how the `run()` function works, take another look at the source code in PyCharm. In this example we use `super()` calls to access the definitions of the same functions in the parent class. The job class derives from `TemplateJob`, which in turn derives from `GenericJob`:

In [6]:
class ToyJobVerbose(TemplateJob):
    def __init__(self, project, job_name):
        super().__init__(project, job_name)
        self.input['input_energy'] = 100
        self.executable = "cat input > output"

    def write_input(self):
        file = join(self.working_directory, "input") 
        with open(file, "w") as f:
            f.write(str(self.input['input_energy']))

    def collect_output(self):
        file = join(self.working_directory, "output") 
        with open(file) as f:
            line = f.readlines()
        energy = float(line[0])
        with self.project_hdf5.open("output/generic") as h5out: 
            h5out["energy_tot"] = energy
    
    def _run_if_new(self, debug=False):
        print("status: ", self.status)
        super()._run_if_new(debug=debug)
    
    def _run_if_created(self):
        print("status: ", self.status)
        super()._run_if_created()
        
    def _run_if_collect(self):
        print("status: ", self.status)
        super()._run_if_collect()

#### Task 2: 
Add the line `print("status: ", self.status)` to each of the `_run_if_x()` functions to identify the order in which they are called when `run()` is called:

In [7]:
test_job_run(pr=pr, job_type=ToyJobVerbose)

  0%|          | 0/1 [00:00<?, ?it/s]

status:  initialized
The job toy was saved and received the ID: toy
status:  created
status:  collect


'finished'
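
The order observed above (initialized, created, collect, finished) suggests that `run()` dispatches on the current job status. The following is a heavily simplified sketch of that idea, not pyiron's actual implementation: the classes `MiniJob` and `MiniJobVerbose` are hypothetical, the loop in `run()` is an assumption about the control flow, and only the three statuses printed above are modeled.

```python
class MiniJob:
    """Hypothetical, heavily simplified stand-in for a job class."""

    def __init__(self):
        self.status = "initialized"

    def run(self):
        # Assumption: dispatch on the current status until the job is finished
        handlers = {
            "initialized": self._run_if_new,
            "created": self._run_if_created,
            "collect": self._run_if_collect,
        }
        while self.status != "finished":
            handlers[self.status]()

    def _run_if_new(self):
        self.status = "created"  # e.g. after the job has been saved

    def _run_if_created(self):
        self.status = "collect"  # e.g. after the executable has run

    def _run_if_collect(self):
        self.status = "finished"  # e.g. after the output has been parsed


class MiniJobVerbose(MiniJob):
    """Record the status in each hook, then defer to the parent class."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def _run_if_new(self):
        self.seen.append(self.status)
        super()._run_if_new()

    def _run_if_created(self):
        self.seen.append(self.status)
        super()._run_if_created()

    def _run_if_collect(self):
        self.seen.append(self.status)
        super()._run_if_collect()


job = MiniJobVerbose()
job.run()
print(job.seen, job.status)  # prints ['initialized', 'created', 'collect'] finished
```

Because `run()` looks up the hooks on `self`, the overrides in the subclass are called first, and each `super()` call hands control back to the parent implementation that advances the status.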

# The to_hdf() and from_hdf() functions 
One of the core features of pyiron is that the data is stored automatically without the user calling `to_hdf()` or `from_hdf()`. Because these calls happen implicitly, it is not always obvious to a developer when they are triggered.
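
To illustrate what storing data automatically can look like, here is a minimal sketch under stated assumptions. The class `MiniStoredJob` and its methods `to_store()` and `from_store()` are hypothetical; they only play the roles of `to_hdf()` and `from_hdf()`, and a plain dictionary stands in for the HDF5 file. This is not how pyiron implements storage.

```python
class MiniStoredJob:
    """Hypothetical stand-in for a job that persists itself automatically."""

    def __init__(self, store, job_name):
        self._store = store  # plain dict standing in for the HDF5 file
        self.job_name = job_name
        self.input = {"input_energy": 100}
        self.calls = []  # records which storage hooks were triggered

    def to_store(self):
        # Plays the role of to_hdf(): write the current state to the store
        self.calls.append("to_store")
        self._store[self.job_name] = {"input": dict(self.input)}

    def from_store(self):
        # Plays the role of from_hdf(): restore the state from the store
        self.calls.append("from_store")
        self.input = dict(self._store[self.job_name]["input"])

    def run(self):
        # The save happens inside run(); the user never calls to_store()
        self.to_store()


store = {}
job = MiniStoredJob(store, "toy")
job.input["input_energy"] = 42
job.run()

# A fresh object restores its state from the store instead of the defaults
reloaded = MiniStoredJob(store, "toy")
reloaded.from_store()
print(job.calls, reloaded.input)  # prints ['to_store'] {'input_energy': 42}
```

Recording the calls in a list serves the same purpose as the print statements used in this exercise: it shows which storage hook ran and when.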

In [8]:
class ToyJobStorage(TemplateJob):
    def __init__(self, project, job_name):
        super().__init__(project, job_name)
        self.input['input_energy'] = 100
        self.executable = "cat input > output"

    def write_input(self):
        file = join(self.working_directory, "input") 
        with open(file, "w") as f:
            f.write(str(self.input['input_energy']))

    def collect_output(self):
        file = join(self.working_directory, "output") 
        with open(file) as f:
            line = f.readlines()
        energy = float(line[0])
        with self.project_hdf5.open("output/generic") as h5out: 
            h5out["energy_tot"] = energy
    
    def _run_if_new(self, debug=False):
        print("status: ", self.status)
        super()._run_if_new(debug=debug)
    
    def _run_if_created(self):
        print("status: ", self.status)
        super()._run_if_created()
        
    def _run_if_collect(self):
        print("status: ", self.status)
        super()._run_if_collect()
        
    def to_hdf(self, hdf=None, group_name=None):
        print("to_hdf")
        super().to_hdf(hdf=hdf, group_name=group_name)
        
    def from_hdf(self, hdf=None, group_name=None):
        print("from_hdf")
        super().from_hdf(hdf=hdf, group_name=group_name)

#### Task 3 a: 
Look up the parameters of the `to_hdf()` and `from_hdf()` functions, either with the `?` operator (`job.to_hdf?` and `job.from_hdf?`) or by browsing the relevant code in PyCharm.

#### Task 3 b: 
Add the `to_hdf()` and `from_hdf()` functions to the `ToyJobStorage` class, again with a `super()` call and a print statement, to identify when `to_hdf()` and `from_hdf()` are called during the execution of `run()`.

In [9]:
test_job_run(pr=pr, job_type=ToyJobStorage)

  0%|          | 0/1 [00:00<?, ?it/s]

status:  initialized
to_hdf
The job toy was saved and received the ID: toy
status:  created
status:  collect


'finished'