Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce calculate() function #1477

Merged
merged 74 commits into from
Jun 20, 2024
Merged

Conversation

jan-janssen
Copy link
Member

@jan-janssen jan-janssen commented Jun 11, 2024

In the previous interface a new class could be implemented in pyiron by defining two functions write_input() and collect_output(). The example class would look like this:

from pyiron_base import GenericJob

class NewJobClass(GenericJob):
    def __init__(self, project, job_name):
        super().__init__(project, job_name)

    def write_input(self):
        ...

    def collect_output(self):
        ...

The new interface replaces the write_input() function with an get_input_parameter_dict() function which returns the files_to_write and the files_to_copy as a dictionary. In addition, the collect_output() function is placed outside the job class. Both of these changes are designed to support a more functional approach in pyiron_base. Only the get_input_parameter_dict() and get_output_parameter_dict() functions are now part of the new job class. The get_input_parameter_dict() function takes the input defined by the user on different properties like job.input.structure or functions like job.set_kpoints() and merges everything into a dictionary with input file content. This dictionary of input file content allows expert users to overwrite the pyiron generated input with their own input, if the get_input_parameter_dict() function defined in the new job class supports this. In the same way the get_output_parameter_dict() function allows the developer to provide additional parameters to the collect_output() function:

from pyiron_base import GenericJob

def collect_output(working_directory, output_parameter_1, output_parameter_2):
    ...
    return output_dict

class NewJobClass(GenericJob):
    def __init__(self, project, job_name):
        super().__init__(project, job_name)
        self._job_with_calculate_function = True
        self._collect_output_funct = collect_output

    def get_input_parameter_dict(self):
        ...
        return {
            "files_to_write": {"test.py": "if __name__ == '__main__':\n    print('Hello')\n"}, 
            "files_to_copy": {"my_bashrc": "~/.bashrc"}
        }

    def get_output_parameter_dict(self):
        return {"output_parameter_1": ..., "output_parameter_2": ...}

For those users who only want define the write_input() and collect_output() function, pyiron_base still supports the wrap_executable() function:

def write_input(input_dict, working_directory="."):
    ...

def collect_output(working_directory="."):
    ...
    return output_dict

job = pr.wrap_executable(
    job_name="Cat_Job_energy_1_0",
    write_input_funct=write_input,
    collect_output_funct=collect_output,
    input_dict={"energy": 1.0},
    executable_str="cat input_file > output_file",
    execute_job=True,
)

The limitations of the wrap_executable() function are: MPI parallelism is not supported and the input_dictionary is directly communicated to the write_input() function. So it is not possible to have a generic interface and an expert user interface with this approach. Internally the wrap_executable() already uses the new job class interface. In the same way the collect_output() function only accepts the working_directory as input parameter and it is currently not possible to provide additional parameters for the parsing of the output.

Base automatically changed from execute_subprocess to main June 12, 2024 08:15
@jan-janssen
Copy link
Member Author

I start by switching the ExecutableContainerJob to using the new calculate() function. Once this pull request is merged and the new pyiron_base version is released the same changes can be applied to LAMMPS and VASP in pyiron/pyiron_atomistics#1446 and pyiron/pyiron_atomistics#1447

Copy link
Contributor

@pmrv pmrv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit annoying that this now duplicates a lot of the logic present in GenericJob.run_static and execute_job_with_external_executable. Do you see an option to unify this again (in the future)?

@jan-janssen jan-janssen changed the title ExecutableContainerJob: Switch to run_static() function Introduce calculate() function Jun 20, 2024
@jan-janssen
Copy link
Member Author

It's a bit annoying that this now duplicates a lot of the logic present in GenericJob.run_static and execute_job_with_external_executable. Do you see an option to unify this again (in the future)?

I updated the run_static() function to either call execute_job_with_calculate_function() or execute_job_with_external_executable().

Copy link
Member

@samwaseda samwaseda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Except for my minor comment I'm ok with this.

pyiron_base/jobs/job/runfunction.py Outdated Show resolved Hide resolved
pyiron_base/jobs/job/runfunction.py Outdated Show resolved Hide resolved
@@ -122,5 +122,5 @@ class PythonTemplateJob(TemplateJob):

def __init__(self, project, job_name):
super().__init__(project, job_name)
self._python_only_job = True
self._job_with_calculate_function = True
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't this will actually work for PythonTemplateJob, because the CalculateCaller is now expected to run a script and won't touch run_static (which is what implementors are supposed to overload), right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As run_static() is defined in PythonTemplateJob it overwrites the run_static() in GenericJob.

Copy link
Contributor

@pmrv pmrv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like I wrote, I don't think this will work with the TemplateJob, but it also doesn't need to be unified now. Keeping _python_only_job around and checking for both in _check_if_input_should_be_written should be sufficient.

Copy link
Contributor

@pmrv pmrv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, I missed that it's overloaded anyway in case of TemplateJob.

@jan-janssen jan-janssen merged commit db1f546 into main Jun 20, 2024
23 of 24 checks passed
@jan-janssen jan-janssen deleted the executablecontainer_runstatic branch June 20, 2024 13:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
format_black reformat the code using the black standard
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants