pickup fab-classic#77 by bumping to 1.19.2 #1443

Merged
merged 2 commits into from Mar 1, 2023

Conversation

timsnyder-siv
Collaborator

fab-classic has logic to react to paramiko SSHException raised when server load is too high. In this case, fab-classic should retry the connection until it gets to env.connection_attempts (see env.connection_attempts = 10) and then raise NetworkError.
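
The intended behavior described above can be sketched as a bounded retry loop. This is only an illustration, not fab-classic's actual code: `connect_with_retries` and its `connect` callable are hypothetical names, and the two exception classes are local stand-ins so the sketch runs without paramiko or fab-classic installed.

```python
import time


class SSHException(Exception):
    """Local stand-in for paramiko.SSHException (assumption, for illustration)."""


class NetworkError(Exception):
    """Local stand-in for fab-classic's NetworkError (assumption, for illustration)."""


def connect_with_retries(connect, connection_attempts=10, delay=0.0):
    """Call connect() until it succeeds or connection_attempts is used up.

    `connect` is a hypothetical zero-argument callable that either returns a
    connection object or raises SSHException when the server is overloaded.
    """
    last_error = None
    for attempt in range(1, connection_attempts + 1):
        try:
            return connect()
        except SSHException as exc:
            last_error = exc
            # Only wait if another attempt will follow.
            if attempt < connection_attempts:
                time.sleep(delay)
    # Every attempt failed: surface a single NetworkError to the caller.
    raise NetworkError(
        "gave up after %d attempts: %s" % (connection_attempts, last_error)
    )
```

The key property is that the loop only gives up once the attempt budget is exhausted, which is the behavior this PR restores by picking up the upstream fix.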

The logic in fab-classic 1.19.1 does not correctly match the paramiko exception message in all the required cases and incorrectly stops retrying the connection before reaching env.connection_attempts (it falls down a path where it aborts because the code thinks it should be querying the user for a password).
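
To illustrate how a message check can miss some cases, here is a minimal sketch comparing an exact-equality test with a prefix test. The message strings and both helper functions are assumptions chosen for illustration; this PR does not show the actual strings or the matching rule used in fab-classic#77.

```python
# Assumed base message for a banner read failure under load (illustrative).
BANNER_ERROR = "Error reading SSH protocol banner"


def is_retryable_exact(message):
    """Hypothetical strict check: only the bare message counts as retryable."""
    return message == BANNER_ERROR


def is_retryable_prefix(message):
    """Hypothetical looser check: any message starting with the base text."""
    return message.startswith(BANNER_ERROR)


# Assumed example of the same failure with extra detail appended.
plain = "Error reading SSH protocol banner"
with_detail = "Error reading SSH protocol banner[Errno 104] Connection reset by peer"

for msg in (plain, with_detail):
    print(repr(msg), is_retryable_exact(msg), is_retryable_prefix(msg))
```

Under these assumptions, the strict check treats the second message as non-retryable, so the caller would fall through to whatever handles non-transient failures (here, the password prompt path that aborts).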

This fix makes the firesim manager more robust in environments where the server networking performance is struggling to keep up.

We may also want to catch NetworkError in the runworkload monitoring loop so that hosts are marked ??? for as long as we are unable to connect to them. Today the monitoring loop dies instead, leaving the user to monitor the workload and collect results manually. I leave that for a future PR.
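
A rough sketch of that follow-up idea, assuming a polling function of our own invention: `poll_hosts` and `check_host` are hypothetical names and do not correspond to the firesim manager's real monitoring code, and `NetworkError` is a local stand-in for the fab-classic exception.

```python
class NetworkError(Exception):
    """Local stand-in for fab-classic's NetworkError (assumption, for illustration)."""


def poll_hosts(hosts, check_host):
    """Collect one status per host without letting one bad host stop the poll.

    `check_host` is a hypothetical callable that returns a status string or
    raises NetworkError when the host cannot be reached.
    """
    statuses = {}
    for host in hosts:
        try:
            statuses[host] = check_host(host)
        except NetworkError:
            # Unreachable for now: mark it and keep monitoring the others.
            statuses[host] = "???"
    return statuses
```

A caller would invoke `poll_hosts` once per monitoring interval, so a host marked ??? in one pass can recover in a later pass once connections succeed again.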

Related PRs / Issues

fab-classic#77 fixing fab-classic#76

UI / API Impact

No impact

Verilog / AGFI Compatibility

No impact.

Contributor Checklist

  • Is this PR's title suitable for inclusion in the changelog and have you added a changelog:<topic> label?
  • Did you add Scaladoc/docstring/doxygen to every public function/method? n/a
  • Did you add at least one test demonstrating the PR? n/a
  • Did you delete any extraneous prints/debugging code?
  • Did you state the UI / API impact?
  • Did you specify the Verilog / AGFI compatibility impact?
  • If applicable, did you regenerate and publicly share default AGFIs?
  • If applicable, did you apply the ci:fpga-deploy label?
  • If applicable, did you apply the Please Backport label?

Reviewer Checklist (only modified by reviewer)

Note: to run CI on PRs from forks, comment @Mergifyio copy main and manage the change from the new PR.

  • Is the title suitable for inclusion in the changelog and does the PR have a changelog:<topic> label?
  • Did you mark the proper release milestone?
  • Did you check whether all relevant Contributor checkboxes have been checked?

@timsnyder-siv timsnyder-siv added the changelog:changed Put PR title in 'Changed' section of changelog label Feb 27, 2023
@@ -1,4 +1,4 @@
-fab-classic==1.19.1
+fab-classic==1.19.2
Collaborator Author

@abejgonzalez why do we have this separate requirements.txt? Is this vestigial and can it be removed now? It would be good for this file to have a comment explaining what requirements are kept here and why. I suspect this is only stuff needed to bootstrap something running machine_launch_script.sh and that it is a very minimal set of things needed by a single script.

@t14916 & @sifive-benjamin-morse 👀

Contributor

Correct. In order to spawn an AWS instance manager, we need to first boot into a GH managed VM, install the Fabric/AWS deps, then launch the manager. This still needs to be kept.

- name: Install Python CI requirements

We can probably dedup this with conda-lock's ability to have multiple conda files. I'll try something out to help this. Stay tuned.

Contributor

@abejgonzalez abejgonzalez left a comment

LGTM

@abejgonzalez abejgonzalez mentioned this pull request Feb 28, 2023
12 tasks
@timsnyder timsnyder merged commit 030c548 into main Mar 1, 2023
@timsnyder timsnyder deleted the bump_fabric branch March 1, 2023 04:38
Labels
changelog:changed Put PR title in 'Changed' section of changelog

3 participants