FORT 1.5.3 Crashing - ERR: Unknown protocol: 114 #83
Comments
Found this quirk while eyeballing #83. I don't think it's going to fix the problem, but it's definitely an improvement.
I uploaded a small patch. I don't think it's going to solve the problem, but you might as well try it. Are you using If you enable it, do you get a slightly different error message? Can you please post your
Hey, sorry for the late reply..... another instance just crashed. Command Line: Config file:
Interesting: the process prints a stack trace if you pass it an unknown option.
I've left the service running with no BGP services using it, and lost 2 instances this weekend. Note: don't update librtr to version 8.
Do you have files in the SLURM directory? (
Mostly quality of life improvements. On the other hand, it looks like the nonfatal hash table API was being used incorrectly. HASH_ADD_KEYPTR can OOM, but `errno` wasn't being caught. Fixing this is nontrivial, however, because strange `reqs_error` functions are in the way, and that's a spaghetti I decided to avoid. Instead, I converted HASH_ADD_KEYPTR usage to the fatal hash table API. That's the future according to #40, anyway. I don't think this has anything to do with #83, though.
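For readers unfamiliar with the distinction in the commit message above, here is a minimal sketch of the two styles. It is not Fort's code and does not use uthash: the toy chained table, `table_add_nonfatal`, `table_add_fatal`, `table_get`, `failing_alloc` and the injectable `alloc` pointer are all hypothetical names made up for illustration. The point is only that a non-fatal insert returns an error the caller must check, while a fatal insert terminates the process, so there is nothing to propagate.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define BUCKETS 16

struct node {
	char *key;
	int value;
	struct node *next;
};

struct table {
	struct node *buckets[BUCKETS];
	/*
	 * Injectable allocator (an assumption of this sketch), so the OOM
	 * path can be exercised. Must return NULL or free()-able memory.
	 */
	void *(*alloc)(size_t);
};

/* Allocator that always fails; simulates memory exhaustion. */
static void *failing_alloc(size_t size)
{
	(void)size;
	return NULL;
}

static unsigned int hash_key(char const *key)
{
	unsigned int h = 5381;
	while (*key != '\0')
		h = h * 33 + (unsigned char)*key++;
	return h % BUCKETS;
}

/*
 * Non-fatal style: returns 0 on success, ENOMEM on allocation failure.
 * The caller is responsible for checking the result.
 */
static int table_add_nonfatal(struct table *t, char const *key, int value)
{
	struct node *n;
	size_t len;
	unsigned int b;

	assert(t != NULL && key != NULL);

	n = t->alloc(sizeof(*n));
	if (n == NULL)
		return ENOMEM;

	len = strlen(key) + 1;
	n->key = t->alloc(len);
	if (n->key == NULL) {
		free(n);
		return ENOMEM;
	}
	memcpy(n->key, key, len);
	n->value = value;

	b = hash_key(key);
	n->next = t->buckets[b];
	t->buckets[b] = n;
	return 0;
}

/* Fatal style: OOM ends the process, so callers have no error to handle. */
static void table_add_fatal(struct table *t, char const *key, int value)
{
	if (table_add_nonfatal(t, key, value) != 0) {
		fprintf(stderr, "Out of memory.\n");
		abort();
	}
}

/* Returns a pointer to the stored value, or NULL if the key is absent. */
static int const *table_get(struct table const *t, char const *key)
{
	struct node const *n;

	for (n = t->buckets[hash_key(key)]; n != NULL; n = n->next)
		if (strcmp(n->key, key) == 0)
			return &n->value;
	return NULL;
}
```

The bug class described in the commit message corresponds to calling something like `table_add_nonfatal` and ignoring its return value; switching callers to the fatal variant removes that failure mode at the cost of dying on OOM.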
Ok, it looks like this is going to be a difficult bug. Is either of you willing to run a custom debug-heavy Fort binary?
I will do that, no problem
This is all I have in that file
Sorry it's taken so long. Debug commit is at branch issue83. I need the first logging line that contains the string "VRP Corrupted!":
It shouldn't crash anymore, but I'm not entirely sure what side effects the bogus VRP might induce.
Ok thank you. Probably not the problem either.
Have you gotten any "VRP corrupted!" messages yet? Just to clarify: The issue83 branch contains a patch that prevents Fort from crashing, but does not, in fact, fix the bug.
1. Revert panic back into the code.
   - Fort SHOULD die as soon as it realizes the VRP table is corrupted, as we should not send garbage to the routers.
   - Also, I'm not entirely sure the code would not crash later anyway, since the table is, in fact, corrupted.
   - Plus, if it doesn't crash, there would be no core dump to further analyze the bug.
2. Point bug output to the currently active bug report.
   Might help us get some output earlier.
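The fail-fast policy in point 1 can be sketched as follows. This is an illustration under stated assumptions, not the code in the issue83 branch: `struct vrp` here is a simplified, hypothetical layout, and `vrp_is_sane` and `vrp_check_or_die` are made-up names. The idea is that a cheap consistency check runs before a VRP is handed to routers, and a failed check calls `abort()`, which raises SIGABRT and leaves a core dump when the system is configured to produce one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/socket.h> /* AF_INET, AF_INET6 */

/* Hypothetical, simplified VRP; Fort's real structure differs. */
struct vrp {
	uint32_t asn;
	uint8_t addr_fam;          /* expected: AF_INET or AF_INET6 */
	uint8_t prefix_length;
	uint8_t max_prefix_length;
};

/*
 * Returns true if the fields are mutually consistent:
 * a known address family, and
 * prefix_length <= max_prefix_length <= address width in bits.
 */
static bool vrp_is_sane(struct vrp const *vrp)
{
	unsigned int max_bits;

	assert(vrp != NULL);

	switch (vrp->addr_fam) {
	case AF_INET:
		max_bits = 32;
		break;
	case AF_INET6:
		max_bits = 128;
		break;
	default:
		return false; /* garbage family: the entry cannot be trusted */
	}

	return vrp->prefix_length <= vrp->max_prefix_length
	    && vrp->max_prefix_length <= max_bits;
}

/* Fail fast: never forward a corrupted entry; die and leave a core dump. */
static void vrp_check_or_die(struct vrp const *vrp)
{
	if (!vrp_is_sane(vrp)) {
		fprintf(stderr,
		    "Corrupted VRP: asn=%lu family=%u length=%u max=%u\n",
		    (unsigned long)vrp->asn,
		    (unsigned int)vrp->addr_fam,
		    (unsigned int)vrp->prefix_length,
		    (unsigned int)vrp->max_prefix_length);
		abort();
	}
}
```

Keeping the predicate (`vrp_is_sane`) separate from the policy (`vrp_check_or_die`) means the same check could back either behavior discussed in this thread: log and continue, as in the debug branch, or die immediately, as this commit restores.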
Didn't mean to close this.
For us, it sometimes crashes after 1 day, sometimes after more than 6 weeks... (We cannot deploy 1.5.4, though, because that would require an RPM package.)
Ok, I managed to apparently successfully generate the RPMs for 1.5.4, and uploaded them here. (I say "apparently" because CentOS 8's death forced me to migrate to Rocky Linux 8, and I'm not sure if packages generated there will be compatible with other RHELs. Please provide feedback.) In other news, I have so far discovered and fixed at least one undefined behavior during the development of 1.5.5, so the bug might already be fixed in the main branch. For your convenience, I packaged this as rpm-1.5.4.1.tar.gz. Please install either 1.5.4 or 1.5.4.1, and provide the crashing output once it happens. If it never happens, I would also like to know.
Do you mind tagging 1.5.4 (and 1.5.4.1?) in the repository? This way I will be able to update the Debian package.
What do you mean? It's been tagged since release.
Never mind: I thought that you had released a new version with the more recent changes. I will wait for the next one, unless you think that I should package a snapshot right now.
RPM 1.5.4-1 package installs fine on RHEL. Thank you.
Well, it looks like it did the trick. No crashes in more than a month. Chapeau and thanks! :)
Hello,
I am seeing that FORT 1.5.3, the latest version, keeps crashing on a regular basis. We have paired FORT with FRRouting, which is also running its latest version, on Oracle Linux V8.
Below is the extract from the log showing the crashed process.