Gateway Managed Storage support telemetry #51
Comments
I'm not following why you need FM to tell you what STRM should already know. If STRM is already tracking the files, wouldn't STRM already know the size? Based on the current contents of the command to FM, there's no need to traverse the entire file system. Even if STRM didn't know the size of the file, it can still just update the impacted directory statistics. Alternatively, if STRM is really what's being used to manage/track the file system, why use FM at all? If STRM "owned" those transactions there wouldn't be a disconnect. Analogous to typical file system operations... FM really is just the ops interface to rm, cp, mv, etc. If the intent is to really control/manage, maybe STRM should be explicitly in the loop to protect and manage directories or whatever, vs just getting downstream notification of what something else already did. If STRM really needs knowledge of every file manipulation call, I strongly recommend not modifying every single app that could possibly manipulate files. Instead get notifications from the OSAL layer for all open, create, delete (maybe write?) related calls. We've already got a pattern established for registering callbacks on actions (like task create), and adding callbacks on file system calls should be straightforward and have much less impact compared to changing/adding tlm messages from numerous apps.
Note we did add tlm for CF and DS creating files since that can happen autonomously. That said, there are files that get written all over cFS by command (write table to file, write app data to file, write SB routing to file, write syslog to file, write events to file, etc.)... modifying every single app to add a notification of file system actions would be a big impact across many elements. Getting a notification from OSAL on file API calls is trivial and has zero impact on apps.
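To make the callback idea above concrete, here is a minimal standalone sketch of a file-event registry. Everything in it is hypothetical: `FileEvent_t`, `FileEvent_Register`, `FileEvent_Notify`, and `DirWatch_t` are illustrative names, not OSAL or cFE APIs, and a real version would hook into the OSAL file calls rather than being notified by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical event types; OSAL does not define these names. */
typedef enum
{
    FILE_EVENT_CREATE,
    FILE_EVENT_DELETE,
    FILE_EVENT_RENAME
} FileEvent_t;

typedef void (*FileEventHandler_t)(FileEvent_t Event, const char *Path, void *Arg);

#define MAX_FILE_HANDLERS 4

static struct
{
    FileEventHandler_t Handler;
    void              *Arg;
} HandlerTable[MAX_FILE_HANDLERS];

/* Register a handler. Returns 0 on success, -1 if Handler is NULL or the table is full. */
int FileEvent_Register(FileEventHandler_t Handler, void *Arg)
{
    size_t i;

    if (Handler == NULL)
    {
        return -1;
    }
    for (i = 0; i < MAX_FILE_HANDLERS; ++i)
    {
        if (HandlerTable[i].Handler == NULL)
        {
            HandlerTable[i].Handler = Handler;
            HandlerTable[i].Arg     = Arg;
            return 0;
        }
    }
    return -1;
}

/* Would be called by the (hypothetical) file API layer after a successful operation. */
void FileEvent_Notify(FileEvent_t Event, const char *Path)
{
    size_t i;

    assert(Path != NULL);
    for (i = 0; i < MAX_FILE_HANDLERS; ++i)
    {
        if (HandlerTable[i].Handler != NULL)
        {
            HandlerTable[i].Handler(Event, Path, HandlerTable[i].Arg);
        }
    }
}

/* Example subscriber: counts deletions under one directory prefix. */
typedef struct
{
    const char *Prefix;
    uint32_t    DeleteCount;
} DirWatch_t;

void DirWatch_OnEvent(FileEvent_t Event, const char *Path, void *Arg)
{
    DirWatch_t *Watch = Arg;

    if (Event == FILE_EVENT_DELETE && strncmp(Path, Watch->Prefix, strlen(Watch->Prefix)) == 0)
    {
        Watch->DeleteCount++;
    }
}
```

Note that the subscriber still has to filter by path prefix itself, since this sketch delivers every event to every handler.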
Jake,
See Inline comments.
From: Jake Hageman ***@***.******@***.***>
Sent: Wednesday, September 21, 2022 7:46 AM
To: nasa/FM ***@***.******@***.***>
Cc: LYNCH, NATHANIEL L. (JSC-ER611)[CACI NSS, INC] ***@***.******@***.***>; Author ***@***.******@***.***>
Subject: [EXTERNAL] Re: [nasa/FM] Gateway Managed Storage support telemetry (Issue #51)
> I'm not following why you need FM to tell you what STRM should already know. If STRM is already tracking the files, wouldn't STRM already know the size and based on the current contents of the command to FM there's no need to traverse the entire file system.

We only track the directory size, not the individual file sizes. Tracking individual file sizes would take way too much RAM.

> Even if STRM didn't know the size of the file, it can still just update the impacted directory statistics.

OS_stat on a directory doesn't return the sum of the file sizes in it. The size field represents how much space is taken by the directory table.

> Alternatively, if STRM is really what's being used to manage/track the file system, why use FM at all?

Our app only cares about a few directories, not the entire file system. It does not use FM; however, FM is still needed for operations outside those directories. Additionally, ops probably needs to maintain a second means to access the data if our app is no longer functional.

> If STRM "owned" those transactions there wouldn't be a disconnect. Analogous to typical file system operations... FM really is just the ops interface to rm, cp, mv, etc. If the intent is to really control/manage, maybe STRM should be explicitly in the loop to protect and manage directories or whatever, vs just getting downstream notification of what something else already did.

Our app is not intended to "control" or "limit" file system operations, but it must be informed to make autonomous decisions.

> If STRM really needs knowledge of every file manipulation call, I strongly recommend not modifying every single app that could possibly manipulate files. Instead get notifications from the OSAL layer for all open, create, delete (maybe write?) related calls. We've already got a pattern established for registering callbacks on actions (like task create), and adding callbacks on file system calls should be straightforward and much less impact compared to changing/adding tlm messages from numerous apps.

1. The remote drives will be accessible from redundant systems. An OSAL callback on our system would not provide us with information about another system's file manipulations in our directories.
2. We only care about a few directories. We would have to filter out many actions we don't care about, including our own calls to OSAL, unless there are callbacks for specific directories.
Operational requirements may enforce which apps can manipulate files in our directories. However, FM must be allowed for the off-nominal case needing operator intervention. Therefore, it is one of the very few we care about.
FM providing a couple of telemetry responses for bytes freed on file deletions minimizes both overall network traffic and CPU utilization.
All it takes is to call OS_stat on the file before it is deleted and copy the path and size to a telemetry packet and send. If needed, add a compile switch around it and a define in fm_platform_cfg.h.
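The stat-then-delete step described above might look roughly like the following sketch. To keep it standalone, it uses POSIX `stat()` and C `remove()` as stand-ins for `OS_stat` and `OS_remove`, and `DeleteReport_t`, `DeleteWithReport`, and `SKETCH_MAX_PATH_LEN` are hypothetical names; the telemetry header, the actual send, and the compile switch are omitted.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

#define SKETCH_MAX_PATH_LEN 64 /* stand-in for OS_MAX_PATH_LEN */

/* Hypothetical payload of the "file deleted" report (no telemetry header here). */
typedef struct
{
    uint64_t FileSize;                      /* size of the file before deletion */
    char     Filename[SKETCH_MAX_PATH_LEN]; /* path of the deleted file */
} DeleteReport_t;

/*
 * Query the size first, then delete, then fill in the report.
 * Returns 0 on success, -1 if the file could not be queried or removed
 * (in which case Report is left untouched).
 */
int DeleteWithReport(const char *Path, DeleteReport_t *Report)
{
    struct stat Info;

    assert(Path != NULL && Report != NULL);

    if (stat(Path, &Info) != 0)
    {
        return -1;
    }
    if (remove(Path) != 0)
    {
        return -1;
    }

    memset(Report, 0, sizeof(*Report));
    Report->FileSize = (uint64_t)Info.st_size;
    snprintf(Report->Filename, sizeof(Report->Filename), "%s", Path);
    return 0;
}
```

The size has to be captured before the delete, because once the file is gone there is nothing left to query.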
If you are already writing a "handler" for the FM telemetry to do things STRM needs to do, why not just make that a command handler and do the delete or delete all action from within STRM instead of relying on FM? If you really plan to decompress from within that managed directory, just set it up as a library call? You don't need to use FM to delete a file... it's a single call to OSAL. STRM could completely own all the actions in its managed area, which would reduce the need to modify common apps, and operationally all other apps could just stay out of the managed area. My hesitancy is relative to implementing in open source on a common app what seems to me to be an inconsistent pattern that may only ever be needed by one stakeholder to handle off-nominal conditions in a very specific use case. What I'm suggesting is maybe this use case is sufficiently complex and unique that it would be better off handled via local/custom additions or modifications, giving you the freedom to do whatever is necessary in your context and truly optimize the behavior per your requirements.
Our app does NOT need or use FM to perform file functions. BUT it CANNOT prevent other apps like FM from tampering with our directories.
It might be possible if OSAL fully supported directory permissions but it DOESN'T. However, there is the off-nominal case where FM might need access.
Because FM is operator driven we have to assume that operator error is possible.
Perhaps most cFS targets have other options. However, in the cases where it is not needed, this request is relatively benign, especially if there is a platform config switch to not compile it in.
In the interest of keeping FM common open source, we could instead ask that FM provide a command to return a directory's disk usage in a telemetry packet. It's more work but I'm sure other FM users could benefit from that information.
I'm betting it's obvious on your side I really don't understand your use case 😄 Apologies for any frustrations it may have caused. Although I encourage project ownership of the design, implementation, and test of unique project requirements, I'll let those more informed (and responsible) negotiate the path forward.
FYI - if your last suggestion for FM to provide directory disk usage is acceptable it should be a trivial change (although would have performance impacts with having to walk through an entire directory querying file sizes) There's a command already to request the directory file list here: Lines 244 to 259 in adc716e
and the associated telemetry is here for the main packet as well as each element (that can already include size): Lines 317 to 337 in adc716e
Could just add a total size to the main packet, and it would probably just be a few lines of code changes to query every file in the directory instead of just the ones reported in the packet. Note you've got the potential for a race w/ those numbers (if something is writing/deleting/moving while you are calculating) but at least you could get a snapshot.
FM_GET_DIR_PKT_CC only gets the size if GetSizeTimeMode is set. It is also meant to be called multiple times to get the entire directory list. Would you really want to traverse the entire directory on every call? You might be better off making a new command that doesn’t create a file or the directory list and just gets the sizes. That way it does not interfere with the existing commands. I think it would be a simple copy/paste, removing what’s not needed (just don't remove the sleep). However, you would need a specific command and telemetry.
A new command and new telemetry is certainly within the trade space. DirListPktCmd already traverses the entire directory to get Or could just add a tlm packet with just a summary ( Lots of options.
Either way would suit our needs. Is it correct to assume that if it has not completed the previous call, you would not be able to call the function on a different directory?
You can queue up commands for the child task and it will process them in order, it only processes one at a time. See |
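As a rough illustration of the behavior described above (commands queued, then processed strictly in order, one at a time), here is a small FIFO sketch. It is not FM's actual child-task queue; `DirCmdQueue_t`, `DirCmdQueue_Push`, `DirCmdQueue_Pop`, and the depth and path length are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define QUEUE_DEPTH 3  /* assumed depth, for illustration only */
#define PATH_LEN    64 /* assumed maximum path length */

/* Fixed-size ring buffer of pending directory requests. */
typedef struct
{
    char   Paths[QUEUE_DEPTH][PATH_LEN];
    size_t Head;  /* index of the next request to process */
    size_t Count; /* number of pending requests */
} DirCmdQueue_t;

/* Queue a request. Returns 0 on success, -1 if the queue is full or the path is too long. */
int DirCmdQueue_Push(DirCmdQueue_t *Q, const char *Path)
{
    size_t Tail;

    assert(Q != NULL && Path != NULL);
    if (Q->Count == QUEUE_DEPTH || strlen(Path) >= PATH_LEN)
    {
        return -1;
    }
    Tail = (Q->Head + Q->Count) % QUEUE_DEPTH;
    strcpy(Q->Paths[Tail], Path);
    Q->Count++;
    return 0;
}

/* Take the oldest request. PathOut must hold PATH_LEN bytes. Returns -1 if the queue is empty. */
int DirCmdQueue_Pop(DirCmdQueue_t *Q, char *PathOut)
{
    assert(Q != NULL && PathOut != NULL);
    if (Q->Count == 0)
    {
        return -1;
    }
    strcpy(PathOut, Q->Paths[Q->Head]);
    Q->Head = (Q->Head + 1) % QUEUE_DEPTH;
    Q->Count--;
    return 0;
}
```

With a queue like this, a request for a second directory does not have to wait for the caller; it simply sits behind the first until the worker gets to it, or is rejected if the queue is full.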
Looks like I'll be pulled into the mix here to actually implement this thing.... My interpretation of the original request is that it is necessary to track the overall filesystem usage, hence why the request is to add a notification every time something adds, moves, or deletes a file. I see two major problems with that naive approach:
Item (2) above might seem odd but there are lots of ways that these two become disjointed:
Given all the issues/limitations of tracking file system use by single file ops, my inclination is (strongly) toward an FM command that allows one to directly determine the actual filesystem usage by querying the filesystem for that info, not by making assumptions about how file ops might affect the value. OSAL already has such a call (statfs) which reports the usage and doesn't require going through all the directory entries or anything.
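For reference, a volume-scope query of the kind described above can be sketched with POSIX `statvfs()`, used here only as a stand-in for the OSAL call; `VolumeUsage_t` and `GetVolumeUsage` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/statvfs.h>

/* Hypothetical summary of one volume. */
typedef struct
{
    uint64_t TotalBytes; /* total size of the volume */
    uint64_t FreeBytes;  /* free space on the volume */
} VolumeUsage_t;

/*
 * Ask the filesystem for its own accounting. Path may be any path on the
 * volume. Returns 0 on success, -1 if the query fails.
 */
int GetVolumeUsage(const char *Path, VolumeUsage_t *Usage)
{
    struct statvfs Vfs;

    assert(Path != NULL && Usage != NULL);

    if (statvfs(Path, &Vfs) != 0)
    {
        return -1;
    }

    /* f_blocks and f_bfree are both counted in units of f_frsize. */
    Usage->TotalBytes = (uint64_t)Vfs.f_blocks * (uint64_t)Vfs.f_frsize;
    Usage->FreeBytes  = (uint64_t)Vfs.f_bfree * (uint64_t)Vfs.f_frsize;
    return 0;
}
```

This is a single call regardless of how many files exist, but the numbers describe the whole volume that contains `Path`, not any one directory on it.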
Our App has specific directories that we want to track. There may be usage outside of those directories but we don't care about that usage.
Doesn't the OSAL call just give the total usage for the entire volume? In our case that doesn't help. If, however, it can be used on directories, perhaps both Size and Size on Disk would be useful.
OK, that's interesting. Not knowing the background, I don't fully understand how the notion of "directory usage" matters, because the underlying file systems simply don't work that way. Directories don't have the concept of "usage" at all - they are just logical constructs that determine how the file names are presented to the user, nothing more. In fact, files can be in more than one directory at the same time, and sparse files and what not could very much throw off any attempt to determine how much disk space a single directory is actually using. In short - fundamentally "disk space usage" is only a concept at the volume scope. That being said, we can certainly estimate such a thing by naively adding up all the file sizes. Of course that will not actually be accurate by any means (at least in terms of overall volume usage) but whether that matters really depends on what you are trying to accomplish. We could add some sort of command to FM that internally loops through all the files in a dir and adds up the sizes, thereby reducing the amount of CMD/TLM traffic that normally would be required to estimate such a thing.
Limiting the impact on the COM traffic to get this information is the main reason for this request so a command to do that would be ideal for us.
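The naive estimate described above (loop through a directory and add up the file sizes) can be sketched as follows. This uses POSIX `opendir`/`readdir`/`stat` rather than the OSAL equivalents so that it stands alone, and `EstimateDirUsage` is a hypothetical name. It counts only regular files directly inside the directory and does not recurse.

```c
#include <assert.h>
#include <dirent.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/stat.h>

/*
 * Sum the logical sizes of the regular files directly inside DirPath.
 * Returns 0 on success, -1 if the directory cannot be opened.
 * This is an estimate only: it ignores block rounding, hard links,
 * sparse files, and anything in subdirectories.
 */
int EstimateDirUsage(const char *DirPath, uint64_t *TotalBytes)
{
    DIR           *Dir;
    struct dirent *Entry;
    struct stat    Info;
    char           FullPath[512];
    uint64_t       Sum = 0;

    assert(DirPath != NULL && TotalBytes != NULL);

    Dir = opendir(DirPath);
    if (Dir == NULL)
    {
        return -1;
    }

    while ((Entry = readdir(Dir)) != NULL)
    {
        int Len = snprintf(FullPath, sizeof(FullPath), "%s/%s", DirPath, Entry->d_name);

        if (Len < 0 || (size_t)Len >= sizeof(FullPath))
        {
            continue; /* path too long for the buffer; skip this entry */
        }
        /* "." and ".." are directories, so the regular-file check skips them. */
        if (stat(FullPath, &Info) == 0 && S_ISREG(Info.st_mode))
        {
            Sum += (uint64_t)Info.st_size;
        }
    }

    closedir(Dir);
    *TotalBytes = Sum;
    return 0;
}
```

Doing the loop on board and reporting one total is what keeps the CMD/TLM traffic small; the cost is one size query per directory entry, and the result is only a snapshot if other tasks are writing at the same time.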
Digging deeper into this now. There is already a facility to monitor the free space at the volume scope, and it involves these items:
Two options for the path forward:
My preference is for (1) because it seems like a reasonable fit and avoids the clutter and complexity of introducing a whole additional table and TLM MSGID for this, but it will break compatibility with the existing table and TLM message definition as some additional fields will need to be added. If updating the existing Conversely, if there is a requirement not to change the existing TLM, then this would necessitate adding a separate TLM definition to report used space vs. the existing free space report. Please confirm, but otherwise I'm going to assume (1) is OK and pursue that route. |
Replace the "free space" table with a more general disk monitoring table, which can have entries for volume free space (which is what was there) as well as an estimate of directory usage. Table and TLM was renamed accordingly. This changes the definition of the TLM report to include the extra info, so it requires a ground system update.
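As an illustration only, a table that mixes volume free-space entries with directory-estimate entries might be shaped like the sketch below. The type names, field names, table size, and validation rules are assumptions for the example and are not taken from the FM source.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MONITOR_NAME_LEN   64 /* assumed name length */
#define MONITOR_TABLE_SIZE 4  /* assumed number of entries */

/* Hypothetical entry types: what to report for each table row. */
typedef enum
{
    MONITOR_UNUSED = 0,
    MONITOR_VOLUME_FREE_SPACE, /* query the volume for its free space */
    MONITOR_DIRECTORY_ESTIMATE /* sum the file sizes in one directory */
} MonitorType_t;

typedef struct
{
    MonitorType_t Type;
    char          Name[MONITOR_NAME_LEN]; /* volume mount point or directory path */
} MonitorEntry_t;

typedef struct
{
    MonitorEntry_t Entries[MONITOR_TABLE_SIZE];
} MonitorTable_t;

/*
 * Count the entries that fail basic checks: unknown type, a name that does
 * not start with '/', or a name that is not NUL-terminated.
 * Unused entries are skipped.
 */
size_t MonitorTable_CountInvalid(const MonitorTable_t *Table)
{
    size_t i;
    size_t Bad = 0;

    assert(Table != NULL);

    for (i = 0; i < MONITOR_TABLE_SIZE; ++i)
    {
        const MonitorEntry_t *E = &Table->Entries[i];

        if (E->Type == MONITOR_UNUSED)
        {
            continue;
        }
        if ((E->Type != MONITOR_VOLUME_FREE_SPACE && E->Type != MONITOR_DIRECTORY_ESTIMATE) ||
            E->Name[0] != '/' || memchr(E->Name, '\0', sizeof(E->Name)) == NULL)
        {
            Bad++;
        }
    }
    return Bad;
}
```

Keeping both kinds of entry in one table means one telemetry report can carry both, at the cost of changing the existing table and telemetry layout.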
Checklist (Please check before submitting)
Is your feature request related to a problem? Please describe.
We need to maintain current usage tracking on a directory-by-directory basis. If FM is used to modify the contents of our directories, our tracking would not be accurate, which could lead to a loss of data.
For most commands to FM, we can gain the needed information by subscribing to the FM commands and querying OSAL directly. However, the FM_DELETE_CC, FM_DELETE_ALL_CC, and FM_DECOMPRESS_CC commands are a problem because we would not have foreknowledge of the file's size before the command is executed by FM. This would force us to traverse an entire directory to correct the usage tracking.
We will have multiple remote drives. With potentially terabyte-sized drives, traversing the directories on remote drives is too CPU and network intensive.
Describe the solution you'd like
FM_DELETE_CC:
typedef struct
{
    CFE_MSG_TelemetryHeader_t TlmHeader; /**< \brief Telemetry Header */
    uint32 FileSize; /**< \brief Size of the file deleted */
    char Filename[OS_MAX_PATH_LEN]; /**< \brief Delete filename */
} FM_DeleteFileTel_t;
FM_DELETE_ALL_CC:
typedef struct
{
    CFE_MSG_TelemetryHeader_t TlmHeader; /**< \brief Telemetry Header */
    uint64 Freed; /**< \brief Bytes freed in directory */
    char Directory[OS_MAX_PATH_LEN]; /**< \brief Directory name */
} FM_DeleteAllTel_t;
FM_DECOMPRESS_CC:
typedef struct
{
    CFE_MSG_TelemetryHeader_t TlmHeader; /**< \brief Telemetry Header */
    uint32 SourceSize; /**< \brief Size of the compressed file */
    uint32 TargetSize; /**< \brief Size of the uncompressed file */
    char Source[OS_MAX_PATH_LEN]; /**< \brief Source filename */
    char Target[OS_MAX_PATH_LEN]; /**< \brief Target filename */
} FM_DecompressTel_t;
Describe alternatives you've considered
Similar feedback telemetry on the following would be nice but is not required:
FM_COPY_CC
FM_MOVE_CC
FM_CONCAT_CC
Maybe there is a way to have a common telemetry packet that includes the function code.
Additional context
Add any other context about the feature request here.
Requester Info
Nathan Lynch JSC-ER611