Define output dict for multi species in main.py #735

Merged: 3 commits merged into main from restart_multi_spc on Apr 17, 2024


@JintaoWu98 JintaoWu98 (Member) commented Feb 14, 2024

Launching ARC from a restart.yml file fails with an error about an unexpected keyword argument 'output_multi_spc'. It turns out that we previously defined Scheduler.output_multi_spc, parallel to Scheduler.output, in Scheduler.__init__, but forgot to define it in the ARC class in main.py. This PR adds the corresponding argument and attribute in main.py.
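
For readers unfamiliar with this failure mode, here is a minimal sketch of how such an error can arise, assuming the restart file is parsed into a dictionary and unpacked as keyword arguments into a constructor. `ToyARC`, `FixedToyARC`, and `restart_dict` are hypothetical stand-ins and do not reproduce ARC's actual code.

```python
class ToyARC:
    """Hypothetical stand-in for a class built from a restart dictionary."""

    def __init__(self, project, output=None):
        # 'output_multi_spc' is deliberately missing from this signature.
        self.project = project
        self.output = output


class FixedToyARC:
    """The same toy class, now accepting the extra key."""

    def __init__(self, project, output=None, output_multi_spc=None):
        self.project = project
        self.output = output
        self.output_multi_spc = output_multi_spc


# Hypothetical content of a restart file after parsing (an assumption).
restart_dict = {"project": "demo", "output": {}, "output_multi_spc": {}}

try:
    ToyARC(**restart_dict)
    error_message = ""
except TypeError as e:
    # Python rejects keys that the constructor does not declare.
    error_message = str(e)

# Once the parameter is declared, the same dictionary is accepted.
fixed = FixedToyARC(**restart_dict)
```

The point of the sketch is only that every key written to the restart file needs a matching constructor parameter on the class that reads it back.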


codecov bot commented Feb 14, 2024

Codecov Report

Attention: Patch coverage is 28.57143% with 5 lines in your changes are missing coverage. Please review.

Project coverage is 73.81%. Comparing base (be1f6c8) to head (5ca8ae1).

| Files | Patch % | Lines |
| arc/scheduler.py | 0.00% | 5 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #735      +/-   ##
==========================================
- Coverage   73.82%   73.81%   -0.01%     
==========================================
  Files          99       99              
  Lines       27346    27352       +6     
  Branches     5717     5718       +1     
==========================================
+ Hits        20187    20189       +2     
- Misses       5733     5737       +4     
  Partials     1426     1426              
| Flag | Coverage Δ |
| unittests | 73.81% <28.57%> (-0.01%) ⬇️ |


@alongd alongd (Member) left a comment

Thanks! Can you please see the two comments below?

@@ -180,6 +180,8 @@ class ARC(object):
                                 Job types not defined in adaptive levels will have non-adaptive (regular) levels.
        output (dict): Output dictionary with status and final QM file paths for all species. Only used for restarting,
                       the actual object used is in the Scheduler class.
        output_multi_spc (dict): Output dictionary with status and final QM file paths for the multi species.
Member

Please also add this above, under Args:

@@ -291,6 +294,7 @@ def __init__(self,
        if not os.path.exists(self.project_directory):
            os.makedirs(self.project_directory)
        self.output = output
        self.output_multi_spc = output_multi_spc
Member

We store it as an attribute, but we don't do anything with it. I think we should pass it on to Scheduler (and also add it as an argument in Scheduler, as done here).
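
As a rough illustration of this suggestion, the sketch below forwards a stored attribute to a scheduler-like object instead of leaving it unused. `ToyMain` and `ToyScheduler` are hypothetical classes, and the empty-dict fallback is an assumption for the example rather than ARC's behavior.

```python
class ToyScheduler:
    """Hypothetical scheduler that accepts the dictionary as an argument."""

    def __init__(self, project, output_multi_spc=None):
        self.project = project
        # Assumption: fall back to an empty dict so later lookups never see None.
        self.output_multi_spc = output_multi_spc or dict()


class ToyMain:
    """Hypothetical top-level object that owns the scheduler."""

    def __init__(self, project, output_multi_spc=None):
        self.project = project
        self.output_multi_spc = output_multi_spc
        self.scheduler = None

    def execute(self):
        # Forward the stored attribute rather than only keeping it on self.
        self.scheduler = ToyScheduler(project=self.project,
                                      output_multi_spc=self.output_multi_spc)
        return self.scheduler


scheduler = ToyMain("demo", {"multi_spc1": {"opt": True}}).execute()
default_scheduler = ToyMain("demo").execute()
```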

Member Author

We actually already transmitted it to Scheduler as an attribute in the previous PR (as was done for the output dict). I think we just forgot to define it in main.py.

Member

I could be misunderstanding, but I think @alongd meant here:

ARC/arc/main.py

Line 591 in 688f960

self.scheduler = Scheduler(project=self.project,

Member Author

> I could be misunderstanding but I think @alongd meant here
>
> ARC/arc/main.py
>
> Line 591 in 688f960
>
> self.scheduler = Scheduler(project=self.project,

Thank you for pointing that out. Indeed, we didn't transmit the output dictionary to the Scheduler. I have a question, though. I thought output_multi_spc was the multi-species counterpart to output. However, since we didn't transmit output here, why is there a need to transmit output_multi_spc? Am I missing something?

Member

Ah, maybe you're right. I'm not really familiar with the restart function. Do you know whether, with the changes you've made, restarting a multi-species project works? As in, does ARC pick up that a multi-species job is (or was) running?

Member Author

I'm not that familiar with the restart function either. Currently, I only have restart.yml from completed projects (both single and multi-species), which have already finished running. With these changes, I launched restart.yml for single and multi-species projects individually and compared them. They do yield similar folders, such as log_and_restart_archive, and no errors were reported in the err.txt. However, I'm not sure how this will turn out in the future with an unfinished project.

Member

Are you able to try it with an unfinished project? What I mean is: start the job with the input.yml, and as soon as you see it submit its first job to the server, cancel ARC (qdel <job id>) and then restart it.

@JintaoWu98 JintaoWu98 (Member Author) commented Feb 16, 2024

Now I get the following errors. This is the error when I terminate the job submitted by ARC:

Traceback (most recent call last):
  File "/home/jintaowu/Code/ARC/ARC.py", line 69, in <module>
    main()
  File "/home/jintaowu/Code/ARC/ARC.py", line 65, in main
    arc_object.execute()
  File "/home/jintaowu/Code/ARC/arc/main.py", line 620, in execute
    skip_nmd=self.skip_nmd,
  File "/home/jintaowu/Code/ARC/arc/scheduler.py", line 508, in __init__
    self.schedule_jobs()
  File "/home/jintaowu/Code/ARC/arc/scheduler.py", line 595, in schedule_jobs
    successful_server_termination = self.end_job(job=job, label=label, job_name=job_name)
  File "/home/jintaowu/Code/ARC/arc/scheduler.py", line 993, in end_job
    self._run_a_job(job=job, label=label)
  File "/home/jintaowu/Code/ARC/arc/scheduler.py", line 1060, in _run_a_job
    xyz=job.xyz,
  File "/home/jintaowu/Code/ARC/arc/scheduler.py", line 801, in run_job
    checkfile = self.species_dict[label].checkfile if isinstance(label, str) else None
KeyError: 'multi_spc1'

And this is the error when I restart the job:

Traceback (most recent call last):
  File "/home/jintaowu/Code/ARC/ARC.py", line 69, in <module>
    main()
  File "/home/jintaowu/Code/ARC/ARC.py", line 65, in main
    arc_object.execute()
  File "/home/jintaowu/Code/ARC/arc/main.py", line 620, in execute
    skip_nmd=self.skip_nmd,
  File "/home/jintaowu/Code/ARC/arc/scheduler.py", line 483, in __init__
    self.run_opt_job(species.label, fine=self.fine_only)
  File "/home/jintaowu/Code/ARC/arc/scheduler.py", line 1188, in run_opt_job
    if self.output_multi_spc[self.species_dict[label].multi_species].get(key, False):
KeyError: 'multi_spc1'
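
Both tracebacks end in a KeyError for 'multi_spc1': a dictionary is indexed with the multi-species label, but no entry exists under that key. The sketch below reproduces only the second pattern with plain dictionaries and contrasts it with a tolerant lookup. The helper names are hypothetical, and the tolerant variant is one possible way to avoid the crash, not necessarily the change adopted in this PR.

```python
def job_flag_strict(output_multi_spc, multi_label, key):
    """Mirror the failing pattern: index first, then call .get()."""
    return output_multi_spc[multi_label].get(key, False)


def job_flag_tolerant(output_multi_spc, multi_label, key):
    """Hypothetical alternative: treat a missing label as 'not done yet'."""
    return output_multi_spc.get(multi_label, {}).get(key, False)


# Assumption: after an interrupted run, nothing was stored for this label.
output_multi_spc = {}

try:
    job_flag_strict(output_multi_spc, "multi_spc1", "opt")
    raised_key_error = False
except KeyError:
    raised_key_error = True

tolerant_result = job_flag_tolerant(output_multi_spc, "multi_spc1", "opt")
```

Whether a missing entry should be treated as "job not finished" or as a corrupted restart file is a design decision; the tolerant lookup silently chooses the former.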

arc/main.py (code scanning alert: fixed)
arc/scheduler.py (code scanning alert: fixed)
arc/scheduler.py (code scanning alert: fixed)
@JintaoWu98 JintaoWu98 force-pushed the restart_multi_spc branch 2 times, most recently from d1aadc1 to 5ada2a6 Compare March 3, 2024 20:08
arc/scheduler.py (code scanning alert: fixed, resolved)
arc/scheduler.py (code scanning alert: fixed, resolved)
@JintaoWu98 JintaoWu98 force-pushed the restart_multi_spc branch 5 times, most recently from d561230 to fe99b26 Compare March 4, 2024 12:11
@JintaoWu98 JintaoWu98 removed the request for review from calvinp0 March 4, 2024 12:13
@JintaoWu98 JintaoWu98 marked this pull request as draft March 4, 2024 12:14
@JintaoWu98 JintaoWu98 requested a review from alongd April 15, 2024 17:13
main.py adds `output_multi_spc` as key in ARC.as_dict

make output_multi_spc in restart_dict not None
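
To illustrate what these two commit messages describe, here is a minimal sketch of a restart round trip in which the dictionary is written by an `as_dict`-style method and is never None after reloading. `ToyProject` is a hypothetical class; the key names follow the thread, but the logic is an assumption and not ARC's actual implementation.

```python
class ToyProject:
    """Hypothetical object that can be dumped to, and rebuilt from, a dict."""

    def __init__(self, project, output=None, output_multi_spc=None):
        self.project = project
        # Normalize None to an empty dict so downstream lookups stay safe.
        self.output = output if output is not None else {}
        self.output_multi_spc = output_multi_spc if output_multi_spc is not None else {}

    def as_dict(self):
        """Return a restart dictionary, including the multi-species output."""
        restart_dict = {"project": self.project}
        if self.output:
            restart_dict["output"] = self.output
        if self.output_multi_spc:
            restart_dict["output_multi_spc"] = self.output_multi_spc
        return restart_dict


original = ToyProject("demo", output_multi_spc={"multi_spc1": {"opt": True}})
restored = ToyProject(**original.as_dict())

# A project without multi-species data still reloads with a usable dict.
empty = ToyProject(**ToyProject("demo").as_dict())
```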
@JintaoWu98 JintaoWu98 marked this pull request as ready for review April 15, 2024 17:19
@alongd alongd (Member) commented Apr 16, 2024

I confirm that ARC's restart does not crash on this branch. @calvinp0, do you have any additional comments, or can we merge?

@calvinp0 calvinp0 (Member)

> I confirm that ARC's restart does not crash on this branch. @calvinp0, do you have any additional comments, or can we merge?

I think the code scanning issues need to be resolved.

@JintaoWu98 JintaoWu98 (Member Author) commented Apr 16, 2024

> > I confirm that ARC's restart does not crash on this branch. @calvinp0, do you have any additional comments, or can we merge?
>
> The code scanning issues need to be resolved I think

@calvinp0, thanks for pointing it out.

@alongd, the updated branch is now more straightforward and robust.

If there are any other comments, please feel free to let me know.

@calvinp0 calvinp0 (Member) left a comment

Thanks @JintaoWu98!

@JintaoWu98 JintaoWu98 merged commit dfe93dd into main Apr 17, 2024
5 of 7 checks passed
@JintaoWu98 JintaoWu98 deleted the restart_multi_spc branch April 17, 2024 15:37
3 participants