Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mdsip.hosts file does not working in mdsip #2652

Closed
ahaglory opened this issue Nov 10, 2023 · 13 comments · Fixed by #2714
Closed

mdsip.hosts file does not working in mdsip #2652

ahaglory opened this issue Nov 10, 2023 · 13 comments · Fixed by #2714
Assignees
Labels
branch/stable This is present on or relates to the stable branch bug An unexpected problem or unintended behavior core Relates to the core libraries and scripts os/linux This is present on or relates to Linux

Comments

@ahaglory
Copy link

Affiliation
What University / Laboratory / Company are you from, if any?
KFE - KSTAR
Version(s) Affected
The MDSplus Version(s) affected, if any.
All (Test ver : 7_131_6 & 7_132_0)
Platform
The Operating System you're running on.
Rocky Linux 8.5
Describe the bug
A clear and concise description of what the bug is.

It is possible to plot data on the client even if all contents delete of the '/etc/mdsip.hosts' file.
Even if the mdsip error log file says 'Access denied', I can plot the data on the client.
So, I confirmed that there was an error in line 559 'status= SendMdsMsgC' of the 'mdstcpip/mdsipshr/Connections.c' file on github.
After removing 'status' from the "Connections.c" file and testing mdsip, data access was blocked normally.

Could you please check this?
To Reproduce
Steps to reproduce the behavior:

  1. Yum install el8 version on Linux Rocky 8.5
  2. run : rpm - post_install_script & setup.sh
  3. edit /etc/xinetd.d/mdsip (disable yes -> no) and setup tree path
  4. delete and comment out in /etc/mdsip.hosts
  5. Other PC - jScope -> connect and plot data

Expected behavior
Even if the mdsip error log file says 'Access denied', I can plot the data on the client.
After removing 'status' from the "Connections.c" file and testing mdsip, data access was blocked normally.

Screenshots
n/a

Additional context
When source compiling the mdsplus stable version, the version information is displayed as '0.0.0' using mdstcl, and the contents of the ChangeLog file are empty.

@ahaglory ahaglory added the bug An unexpected problem or unintended behavior label Nov 10, 2023
@mwinkel-dev mwinkel-dev self-assigned this Nov 10, 2023
@mwinkel-dev mwinkel-dev added branch/stable This is present on or relates to the stable branch os/linux This is present on or relates to Linux core Relates to the core libraries and scripts labels Nov 10, 2023
@mwinkel-dev
Copy link
Contributor

Hi @ahaglory -- Thanks for submitting this issue. We are busy with other projects during the week of November 13 - 17, but should be able to investigate this issue within a few days.

@mwinkel-dev
Copy link
Contributor

Hi @ahaglory -- Apologies for the delay. (We've been very busy lately.) I am now working on this issue and will see if I can reproduce it today.

@mwinkel-dev
Copy link
Contributor

mwinkel-dev commented Dec 15, 2023

Here is a conjecture about the likely root cause of this issue . . .

Based on an inspection of the source code, the likely issue is that the SendMdsMsgC() call is returning status of SsINTERNAL. (The mdstcpip/mdsipshr/SendMdsMsg.c file defines SendMdsMsgC() which in turn calls send_bytes() -- also defined in the same file -- which can return SsINTERNAL.).

Note that SsINTERNAL = -1 and does not conform to the usual definition of a MDSplus status code. Although the -1 is used to denote a severe internal error, its low order bit is set. And thus the STATUS_OK macro treats SsINTERNAL as a variant of "success".

Proper handling of SsINTERNAL requires logic similar to that seen in the SendMdsMsg() routine (see the SendMdsMsg.c source file).

If the above conjecture is correct, then the Connections.c file should retain the status = SendMdsMsgC(c, msg, 0) line. However, that line should be followed by an if statement to process a status of SsINTERNAL similar to that found in the SendMdsMsg() routine.

@mwinkel-dev
Copy link
Contributor

Next steps in fixing this bug . . .

  1. reproduce the issue, and
  2. investigate the conjecture in the preceding post.

@mwinkel-dev
Copy link
Contributor

Hi @ahaglory,

What version of MDSplus are you running on your PC? Are both computers -- the PC and the mdsip Server -- running the same version of MDSplus?

If they are different versions, then I will have to change my computers accordingly to investigate this issue.

Tonight, I used two computers running the same version of MDSplus alpha, and was unable to reproduce the problem you reported. (Although the conjecture I made two posts above is not the root cause, there is indeed an issue with SsINTERNAL that also needs to be fixed.)

@mwinkel-dev
Copy link
Contributor

mwinkel-dev commented Dec 18, 2023

Hi @ahaglory -- The root cause of the bug you reported is likely caused by having mismatched versions of MDSplus -- that the PCs are running an older version than the version your are running on your mdsip server.

Note that the message header data structure has evolved over time. For example in the stable_6_1_9 version, the MsgHdr structure contained an int fill; field. However in the current alpha, that field is absent from MsgHdr.

Thus the best conjecture is that the line that you flagged as the issue (i.e., the status = SendMdsMsgC(c, msg, 0); at line 559 of Connections.c) is failing -- it repeatedly tries sending a message to the older MDSplus software on the PC, gives up after ten tries, and returns a SsINTERNAL status. Which as described above is not being processed properly -- it is being treated as a variant of success, and then the code goes ahead and erroneously creates the connection.

If this conjecture is correct, then the solution involves changing two things.

  • The SendMdsMsgC() statement will be retained, but will be followed by an if statement to check for the SsINTERNAL status and process it properly.
  • And when installing new versions of MDSplus at your facility, the same version of MDSplus should be installed on server and client computers. For example, when testing MDSplus 7_131_6 on the data archive server, the client PCs should also be running version 7_131_6.

Later this week, I will submit a PR to add the proper processing of the SsINTERNAL status code.

@mwinkel-dev
Copy link
Contributor

mwinkel-dev commented Dec 19, 2023

Injecting faults into the source code still didn't reproduce the bug. Although forcing the condition that returns the SsINTERNAL does indeed cause the code to erroneously create a connection, the failure signature (log files, behavior, etc.) is different than described in the initial bug report. So, perhaps the above conjecture isn't the complete explanation.

Please provide additional information: version of software on PC, and whether any additional error messages appeared on the screen or in the mdsip log files (typically located in /var/log/mdsplus/mdsipd).

Note that my experiments have used the current alpha. I will repeat them with the current stable.

@ahaglory
Copy link
Author

Hi, Sorry for the late reply.
In 'SendMdsMsgC', the return value is always OK as the transmission is successful.
Regardless of the result of the verification of 'authorize client' (1 - line 538) above, the status value is overwritten at line 559.

Please check the source code again.

Uploading mdsplus_mdstcpip_mdsipshr_Connections_c.png…

Thank you for your hard work, it is greatly appreciated.

@mwinkel-dev
Copy link
Contributor

Hi @ahaglory,

Thanks for the clarification! You are correct about the source code (lines 538 versus line 559 of the mdstcpip/mdsipshr/Connections.c file).

The puzzle now is figuring out why this failure did not arise in my experiments. Today, I will repeat my experiments, but with additional test cases.

Please let me know if you are using different versions of MDSplus on your server than on your client workstations. (My testing has only been with the same version of MDSplus on both server and client.)

@mwinkel-dev
Copy link
Contributor

Hi @ahaglory,

On the server side, if a user is not authorized (by the mdsip.hosts file), the server sends a "failure" status to the client.

On the client side, if a message is received that contains the "failure" status, the client kills the connection. Which presumably affects the server side too.

In all client-server software, it is important to have compatible versions of software on both client and server. I will do more experiments to determine if the source code lines you have identified are truly failing when both client and server are running the same version of software. (It is possible that there is indeed a bug.)

I will also experiment with creating the equivalent of mismatched protocols on client and server, to see if that will enable me to reproduce this issue.

@mwinkel-dev
Copy link
Contributor

mwinkel-dev commented Jan 17, 2024

Hi @ahaglory,

The bug you reported has now been reproduced with the current alpha.

The problem with my previous experiments was that the bug was being masked. The client software was disconnecting because the user was unauthorized. So, I incorrectly assumed that the server was not saving the connection.

The addition of some debug print statements in the server code proves that even though the user is unauthorized, the connection is erroneously being saved. Output of the debug print statements follows.

Access denied: markw
XMW *********************
XMW SERVER - status before SendMdsMsgC: 0      # the zero indicates authorization failure
XMW SERVER - message header status field: 0    # client is being notified of the authorization failure        
XMW SERVER - status after SendMdsMsgC: 65545   # client displays error, but server receives MDSplusSUCCESS
XMW SERVER - before add connection             # thus server saves the connection even though it shouldn't

Now that the issue has been reproduced, the fix will be trivial. It will be ready as soon as the maintenance work is finished on the Jenkins build server. (The fix will also include the correct code to handle the SsINTERNAL error code.)

ADDENDUM: Although the server side is erroneously adding the connection, still unable to access information using that connection. (Have two identical Rocky 9.3 VMs running the current version of alpha.) No luck getting jScope to plot data as described in the initial bug report. Also unable to access data using a custom C program. Similarly,mdstcl also is unable to access data. Thus, there is still a possibility that different versions of software on the server and the PC might be a factor in this bug. Regardless, a fix will be made to eliminate the erroneous saving of the unauthorized connection.

@mwinkel-dev
Copy link
Contributor

Was able to force jScope to behave as described in the initial bug report. However, that required crippling the client-side software. Had to disable the code in the client that handles the authorization error returned by the server.

Perhaps similar issues can arise if the PC running jScope has an older version of MDSplus than is running on the mdsip server.

Regardless, there is indeed a bug on the server side that needs to be fixed.

@mwinkel-dev
Copy link
Contributor

Hi @ahaglory,

The fix for this issue is now available in alpha_release-7-139-67. You can obtain that release from the GitHub site.
https://github.com/MDSplus/mdsplus/releases

And after we complete some maintenance work on our build server, you will also be able to obtain it from the package repository on the mdsplus.org web site as per usual.

WhoBrokeTheBuild added a commit that referenced this issue May 21, 2024
* Gm apd java (#2729)

* Improve APD support for Java interface

* Improve APD support for Java - forgotten files

* Commit packages

* When activate debug trace, now compiles without error. (#2735)

This fixes Issue 2734.

* Fix: reduce open files due to dispatcher (#2740)

* Fix: reuse action_server connection id in ServerConnect; avoid duplicates in list

* Fix: set dispatched early; unset if dispatching failed; prevent race on fast actions

* Fix: lock Clients in ServerQAction; cleanup and check before use

* Fix: reconnect dropped connections

* Fix: use correct windows SOCKET print format

* Fix: satisfy rhel7 c standard

* Gm apd thin cpp (#2742)

* Added ADP support in C++ thin client

* Added tdi fun

* Added TDI FUn

* Fix commands

* Gm new marte (#2743)

* more parameters for marte2_simulink_generic

* Proceed with the new implementation

* Proceed

* Proceed

* Proceed

* Proceed

* Proceed

* proceed

* Proceed

* Proceed

* Partially tested version

* Added execution times recording

* Proceed

* Procced with debugging

* Proceed

* Proceed

* Proceed

* Fixes for multisampled acquisition

* Remove quotes from string parameters

* Minor fixes

* Procced debugging

* Debugging

* More channels

* Debug Distributed configuration

* Fix sognal recording for synchronized inputs

* Further debug

* Further debug

* Small fixes

* Close ti final version

* Forgotten fix

* Make port visible, fix parameter name

* unaligned nids

* Increase DiscontinuityFactor

* Discontinuityfactor

* More channels

* Proceed with the new implementation

* Proceed

* Proceed

* Proceed

* Proceed

* Proceed

* proceed

* Proceed

* Proceed

* Partially tested version

* Added execution times recording

* Proceed

* Procced with debugging

* Proceed

* Proceed

* Proceed

* Fixes for multisampled acquisition

* Remove quotes from string parameters

* Minor fixes

* Procced debugging

* Debugging

* More channels

* Debug Distributed configuration

* Fix sognal recording for synchronized inputs

* Further debug

* Further debug

* Small fixes

* Close ti final version

* Forgotten fix

* Make port visible, fix parameter name

* unaligned nids

* Increase DiscontinuityFactor

* Discontinuityfactor

* More channels

* Packages updated

* Remove print

* Remove error messages

---------

Co-authored-by: mdsplus <mdsplus@roactive2.rfx.local>

* Docs: Improve documentation for getSegment* python wrappers (#2732)

Add explanation and rename parameters for:
* getSegmentLimits
* getSegmentList

* Fix: Update JAVASOURCE to 8 to support JDK 17 (#2747)

* Fix: improve mdstcl's error handling and add comments (#2746)

* add comments regarding action service

* send_reply() now does cleanup_client() on bad socket

* explain mdstcl's receiver thread cannot access main thread's connection list

* Improve handling of non-MDSplus error codes

* add comments regarding action dispatch

* add comment explaining receiver thread select loop

* Fix: multiple string escape warnings thrown by python 12 (#2748)

```
mdsplus/pydevices/RfxDevices/FAKECAMERA.py:40: SyntaxWarning: invalid escape sequence '\C'
  {'path': ':EXP_NODE', 'type': 'text', 'value': '\CAMERATEST::FLIR:FRAMES'},

mdsplus/pydevices/RfxDevices/PLFE.py:220: SyntaxWarning: invalid escape sequence '\#'
  '^(\#[0-5][01]([01][0-9][0-9]|2[0-4][0-9]|25[0-5])){6}$', msg)

mdsplus/pydevices/RfxDevices/CYGNET4K.py:361: SyntaxWarning: invalid escape sequence '\E'
  self.serialIO(b'\x55\x99\x66\x11\x50\EB', None)

mdsplus/pydevices/RfxDevices/CYGNET4K.py:461: SyntaxWarning: invalid escape sequence '\8'
  return self.setValue(b'\81\x82', min(0xFFF, value), True)

mdsplus/pydevices/MitDevices/dt100.py:161: SyntaxWarning: invalid escape sequence '\.'
  regstr = '([0-9\.]*) [0-9] ST_(.*)\r\n'
```

The \CAMERATEST became \\CAMERATEST
The regex strings should be python r-strings `r""`, but to maintain backwards compatibility, we're using \\
The broken hex-codes now have x in them

* Build: Resolve linker error after updating the windows builder to Fedora 39 (#2749)

* Build: Resolve linker error after updating the windows builder to Fedora 39

This appeared after updating the mdsplus/builder:windows docker image to Fedora 39, and Wine to 9.0
The newer libxml2 tried to link dynamically unless we explicitly set LIBXML_STATIC

* Hopefully fix the MdsTreeNodeTest

It turns out that this was failing previously, but we weren't properly catching the error

* Fix errors in windows build from newer gcc

* Docs: Update sites.csv (#2615)

add Startorus Fusion in Xi'an, China

* Fix: mdsip now sends proper auth status back to the client (#2752)

Fixes issues #2750 and #2652

* Fix: mdstcl's `show current` no longer segfaults when no tree paths defined (#2754)

* Fix: "show current" no longer segfaults when no tree paths defined

* Fix: corrected typo in error message

* Use original error message so tests pass

* Fix: Add Debian 12 and Ubuntu 24.04 and support GCC 12+ (#2753)

* Build: Add Debian 12 and Ubuntu 24.04

* Add extra flags for GCC 12+ and stub imp for Python 3.12

GCC 12+ triggers a bunch of false positive warnings (which we treat as errors)
This adds AX_C_FLAGS to configure those `-Wno-*` flags for GCC 12+
`cmdExecute.c` now uses snprintf to avoid buffer overflow warnings, also generated by GCC 12+
`compound.py.in` now supports Python 3.12+

* compound.py now supports Python 2.7.. again

---------

Co-authored-by: Stephen Lane-Walsh <slwalsh@psfc.mit.edu>

* Fix: Improve error messaging when calling Setup Device in jTraverser (#2744)

* Improve error messaging when calling Setup Device in jTraverser

e.getMessage() sometimes returned null, but just e will always print something
Add a printStackTrace() for InvocationTargetException exceptions to show the encapsulated error

* Add import for InvocationTargetException

* Build: Fix off-by-one versions produced by Jenkins (#2756)

This fixes the bug where `--os=bootstrap` wasn't receiving the version from `--version=x.y.z`
However, confusingly, this also changes the Jenkinsfile to not use that feature, and instead use `git tag` in order to embed the proper git information as well as the proper version information
The `--os=bootstrap` and `--version` fix is still included just so that it doesn't break if someone else tries to use it

* Build: Increase default test timeout to 1h (#2757)

When the build server(s) are at capacity, it's not unreasonable for a test to take more than 10 seconds, which was the old default timeout
This sets the default to 1h, and removes the overrides in various tests

* Gm fix filter (#2755)

* Allow filtering data from MinMax resampling; remove useless thread in jServer

* Fix compile error

* Remove debug message

* Make Windows Compiler happy

* Build: Fix 'HEAD' in `show version` and tag error (#2758)

Jenkins builds in a detached HEAD state, which caused bootstrap to use HEAD as the branch name
We pass --branch= to the bootstrap call in Jenkins, but $BRANCH wasn't being passed into the bootstrap docker container
Also, attempts to build alpha versions with tags that already existed failed

* Fix: mdstcl show version tag and links (#2760)

Fixes Issue #2759

* Feature: CompileTree will exit with non-zero status code for error messages. (#2446)

And error message should go to stderr.

* Build: Add package override for ubuntu and debian (#2761)

Override sections for Ubuntu 24 and Debian Bookworm were added.

* Fix: Python release version tag (#2764)

* Feature: Add "Date:" to show version output (#2767)

Implements #2766

Example:
```
$ mdstcl sho ver

MDSplus version: 7.140.75
----------------------
  Release:  alpha_release-7-140-75
  Date:     Thu May 16 17:43:14 UTC 2024
  Browse:   https://github.com/MDSplus/mdsplus/tree/alpha_release-7-140-75
  Download: https://github.com/MDSplus/mdsplus/releases/tag/alpha_release-7-140-75
```

* Fix: remove abort flag from RfxDevices DIO2 initialization (#2769)

Fixes issue #2768

* Fix: Missing repo metadata signing (#2770)

This will hopefully fix the lack of signed metadata files that are preventing us from automatically publishing releases

---------

Co-authored-by: GabrieleManduchi <gabriele.manduchi@igi.cnr.it>
Co-authored-by: mwinkel-dev <122583770+mwinkel-dev@users.noreply.github.com>
Co-authored-by: Timo Schroeder <zack-vii@users.noreply.github.com>
Co-authored-by: mdsplus <mdsplus@roactive2.rfx.local>
Co-authored-by: Josh Stillerman <jas@psfc.mit.edu>
Co-authored-by: Fernando Santoro <44955673+santorofer@users.noreply.github.com>
Co-authored-by: Louwrensth <Louwrensth@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
branch/stable This is present on or relates to the stable branch bug An unexpected problem or unintended behavior core Relates to the core libraries and scripts os/linux This is present on or relates to Linux
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants