Segmentation Fault #42

Closed
JimKaidyNASA opened this issue Oct 8, 2019 · 16 comments

Comments

@JimKaidyNASA

I am getting a seg fault when running my app (CNT_app) in core-linux.bin. Any assistance in tracing this back to the origin would be appreciated.

image

@SpaceSteve121
Contributor

Jim, if you examine the traceback you'll see that the error is originating from the sb_init() function within ECI, specifically the CFE_PSP_MemSet() call here. That's initializing the command queue for your packets.

Looking at the arguments in the traceback, it's attempting to initialize 4300 bytes starting at memory address 0. The fact that it's attempting to set memory address zero is the part causing the segfault (because your program shouldn't/doesn't have permission to write there). Somehow the definition of the message pointer is invalid.

To try to track that down, can you post (at least):

  • the portion of eci_interface.h in which your command messages are defined
  • the header files containing the typedefs of the structures you're using for your messages

@JimKaidyNASA
Author

image

The image above has the sent and received messages in eci_interface.h. Can you tell me where the header files are that contain the typedefs for the messages?

@JimKaidyNASA
Author

Would this be it in platform_inc/cntECI_msgids.h:

image

@SpaceSteve121
Contributor

If you look at the input messages that are being defined in the image you posted (i.e., the elements on the MsgSend structure), you'll see the structure being instantiated is of type ECI_Msg_t. That structure type is defined here. Note that the 4th element is qptr, which is supposed to be a pointer to the command queue for commands and NULL for telemetry messages.

That qptr is the first argument to the CFE_PSP_MemSet that we noted above. In the image you posted the 4th element is NULL for all of your packets, which has a value of zero. So that explains why your app is attempting to set memory address zero.

The SIL will only define that 4th element to be NULL if you define the packet as a telemetry packet in Simulink. Can you confirm that's the case?

Note that the only case in which the ECI attempts to set up the queue is when the message is defined with a command message ID. Based on cntECI_msgids.h, it looks like all of your input packets are defined with command MIDs despite them being defined as telemetry messages in your model. That mismatch is what's causing your problem.

If your inputs really are command messages, then you need to update your model to reflect that. If they're really telemetry messages, then you need to define the MIDs appropriately to reflect that.

@SpaceSteve121
Contributor

@BaldBeacon Please review the issue above.

What do you think about adding a check to ECI here to catch this sort of misconfiguration? I don't think there's ever a case where that's a valid configuration, and I think you might've actually run into this same issue before (when working with ECI, not SIL specifically).

We'd just need to check that MsgRcv[idx].MsgStruct->qptr != NULL and, if it is NULL, issue an error event message warning the user about a misconfiguration, and either

  1. skip the configuration of the queue for that message (if we can establish that the app will still work without it)
  2. exit the app, if it's going to be too broken to function
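
A minimal sketch of what such a check might look like, using option 1 (skip the queue setup). The types and names below (`demo_msg_t`, `demo_rcv_entry_t`, `demo_init_cmd_queues`, `DEMO_CMD_TYPE_BIT`) are hypothetical stand-ins, not ECI's actual definitions; only the `MsgStruct->qptr` access mirrors the expression above. It assumes a command packet is identified by bit 12 of the message ID, and it uses `fprintf` where the real app would presumably send an error event message.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for ECI_Msg_t: only the role of qptr (fourth
 * member, NULL for telemetry) is taken from the discussion above. */
typedef struct {
    uint16_t mid;   /* message ID */
    void    *mptr;  /* message buffer */
    size_t   siz;   /* message size in bytes */
    void    *qptr;  /* command queue, or NULL for telemetry */
} demo_msg_t;

/* Hypothetical stand-in for one MsgRcv[] table entry. */
typedef struct {
    demo_msg_t *MsgStruct;
} demo_rcv_entry_t;

#define DEMO_CMD_TYPE_BIT 0x1000u /* assumption: bit 12 set means command */

/* Clears the queue of every command message in the table. Entries with a
 * command MID but a NULL queue pointer are reported and skipped instead of
 * being passed to memset(). Returns the number of misconfigured entries. */
static int demo_init_cmd_queues(demo_rcv_entry_t *tbl, size_t count,
                                size_t queue_bytes)
{
    int misconfigured = 0;

    for (size_t idx = 0; idx < count; idx++) {
        demo_msg_t *msg = tbl[idx].MsgStruct;

        if ((msg->mid & DEMO_CMD_TYPE_BIT) == 0) {
            continue; /* telemetry: no queue to set up */
        }
        if (msg->qptr == NULL) {
            /* The real app would send an error event here. */
            fprintf(stderr, "MID 0x%04X is a command but has no queue\n",
                    (unsigned)msg->mid);
            misconfigured++;
            continue; /* option 1: skip this message's queue */
        }
        memset(msg->qptr, 0, queue_bytes);
    }
    return misconfigured;
}
```

For option 2, the caller could treat a nonzero return value as fatal and exit the app instead of continuing.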

@JimKaidyNASA
Author

I just realized (this cntECI_msgids.h was created for me and I was not paying attention to the order) that some commands and telemetry were switched. It should be like this. But I don't know if the hex values make any difference here.
image

@JimKaidyNASA
Author

Would the renumbering like this help?
image

@SpaceSteve121
Contributor

SpaceSteve121 commented Oct 9, 2019

If you look at the definition of the packet header, you'll see that bit 12 of the StreamId field defines the packet type. You need to ensure that's correctly set based on the type of message you're expecting. Those 4 input message IDs are setting bit 12 to 1, which indicates the message is a command. If that's not the case for your system then you need to set it to 0.

The ordering in the file does not matter to ECI, but may be helpful for humans looking for a particular packet.
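
To make the bit-12 rule concrete, here is a small sketch that tests and changes the packet-type bit of a 16-bit message ID. The helper names and the example values (0x1801, 0x0801) are illustrative assumptions, not IDs taken from cntECI_msgids.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 12 of the 16-bit StreamId: 1 = command, 0 = telemetry. */
#define DEMO_PKT_TYPE_MASK 0x1000u

/* True if the message ID marks the packet as a command. */
static bool demo_mid_is_command(uint16_t mid)
{
    return (mid & DEMO_PKT_TYPE_MASK) != 0;
}

/* Returns the same ID with the type bit cleared (telemetry). */
static uint16_t demo_mid_as_telemetry(uint16_t mid)
{
    return (uint16_t)(mid & ~DEMO_PKT_TYPE_MASK);
}

/* Returns the same ID with the type bit set (command). */
static uint16_t demo_mid_as_command(uint16_t mid)
{
    return (uint16_t)(mid | DEMO_PKT_TYPE_MASK);
}
```

Under this rule an ID such as 0x1801 reads as a command and 0x0801 as telemetry, so changing only the hex value in the header is enough to switch the type that ECI sees.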

@JimKaidyNASA
Author

Bingo! Understood.

@BaldBeacon
Contributor

> @BaldBeacon Please review the issue above.
>
> What do you think about adding a check to ECI here to catch this sort of misconfiguration? I don't think there's ever a case where that's a valid configuration, and I think you might've actually run into this same issue before (when working with ECI, not SIL specifically).
>
> We'd just need to check that MsgRcv[idx].MsgStruct->qptr != NULL and issue an error event message warning the user about a misconfiguration, and either
>
>   1. skip the configuration of the queue for that message (if we can establish that the app will still work without it)
>   2. exit the app, if it's going to be too broken to function

Following up on this comment, I think any checks to ensure valid values are being entered by the user are valuable. I'll open a corresponding issue.

@JimKaidyNASA

> Bingo! Understood.

Can I interpret this as your issue being solved?

@JimKaidyNASA
Author

A check would be great. This was obviously a user error, but I was getting help from a more experienced user who was not sure which was a command and which was telem, and I did not know enough to catch the error. Once I made the corrections, the problem was resolved. There's the bingo! Thx.

@SpaceSteve121
Contributor

@BaldBeacon Once you've opened it, please comment here with a link to the new issue and then close this issue. Thanks!

@JimKaidyNASA
Author

I'm trying to get the syntax right for the validation command. I've been running gdb to look at the contents of the message. All I get is a cryptic invalid msg length error. With the following from the cFS-GroundSystem/Guid, instructions, I know I am not interpreting it right.

image

image

image

@SpaceSteve121
Contributor

@JimKaidyNASA I don't think this question is related to the segfault which is the topic of this issue. Please do not re-use issues, as it makes it hard to keep things organized. Each question/problem/suggestion needs to be made in its own issue so that it can be handled independently.

Unfortunately I don't know that we have the answer to your question. I've never used the python commanding interface and don't know how it formats its commands. Perhaps try opening an issue in the repo containing table services and/or the python ground system and/or the guide you're following?

@SpaceSteve121
Contributor

@BaldBeacon I've opened the new issue for the check and will now close this issue.

@JimKaidyNASA
Author

Steve I've opened up a new issue in cFS. Thx.
