Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add library API #115

Merged
merged 21 commits into from
Aug 10, 2023
Merged

Add library API #115

merged 21 commits into from
Aug 10, 2023

Conversation

tasleson
Copy link
Contributor

@tasleson tasleson commented Dec 13, 2022

This is an initial pass at providing a library interface. This PR is based on ses_slot_support #109 , that PR should be merged first. The number of changes is fairly significant and I'm assuming it will take a bit to get this fully reviewed, tested, and integrated. The first few commits (starting with f555300) are fairly invasive as it wasn't easy to constrain the changes when introducing a library context and support logging and get things to compile. The early commits disable compiling of ledmon in the makefile until the library was more complete, before re-enabling and updating. I tried to break out the commits to finer granularity a couple of times and gave up. As things progressed, I made smaller more contained edits to address specific issues.

Ledctl has good separation now and uses the public library interface. Ledmon is still tightly coupled with the code base. Eventually it maybe be useful to add the needed functionality to the public library interface to support ledmon, but I don't think it's required.

I've only done limited testing with ledctl and ledmon on a system I have with SES. I'll need to check around to see if I happen to have access to other hardware which is supported for further testing. I also wrote a simple program which uses the library to ensure some basic functionality. As we expand testing and examples, I think it would be good to include them in the source repository.

I followed the guidelines in libabc when crafting the API and previous discussions we had on the issue ref. #97.

Resolves: #97

@mtkaczyk
Copy link
Contributor

mtkaczyk commented Dec 15, 2022

@tasleson, now I looked into that and I see that you wrote:
"that PR should be merged first."

Should I revert SES support then?

@tasleson
Copy link
Contributor Author

@mtkaczyk I'll fix this PR, no need to revert SES PR.

@mtkaczyk
Copy link
Contributor

Hi,
I will review that all commits at once. It will hard to review each patch separately. Feel free to push next patches on top of this- I will squash it all to one commit at the end. There is no need to push-force.
I hope that it will help you.

Thanks,
Mariusz

Copy link
Contributor

@mtkaczyk mtkaczyk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HI @tasleson
I looked into it briefly. Please let me know what you think about my first findings.
Please also note that it is my first experience with library implementation. I'm learning by analyzing your code.

Thanks,
Mariusz

configure.ac Outdated Show resolved Hide resolved
doc/ledmon.conf.pod Outdated Show resolved Hide resolved
src/ahci.h Outdated Show resolved Hide resolved
src/sysfs.h Outdated Show resolved Hide resolved
src/led/libled.h Outdated Show resolved Hide resolved
src/led/libled.h Outdated Show resolved Hide resolved
src/led/libled.h Outdated Show resolved Hide resolved
src/libled.c Outdated Show resolved Hide resolved
src/led/libled.h Outdated
*
* Note: Needs to be followed with led_flush()
*/
led_status_t LED_SYM_PUBLIC led_get(struct led_ctx *ctx, const char *path, enum led_ibpi_pattern *ibpi);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we can drop this led_get and led_set in favor of led_get_slot and led_set_slot.
The result should be same. If you want to avoid changing it across all code- please mark those functions as backward compatibility only.
We should also export function for getting default controller (it is hidden now in led_get and led_set)
What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not all LED supported hardware supports get/set slot functionality, that's why this was included.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

led_get() - I can see that it reads current device->ibpi for device. That won't work in all cases. This is rather an interal ledmon value that the field we can rely on. At the begining it represents expected state for the device, the real get functionality was added by ctrl_get_slot. For me we should not export that.

led_set() - The subset of options for led_set is smaller than for slot_set, so simply, if we don't have set_slot we can fallback to old _write method. If slot property is not supported then we can throw error. For slots we can always pass device, it is supported. IMO library should not limit that.

src/led/libled.h Outdated Show resolved Hide resolved
@mtkaczyk
Copy link
Contributor

mtkaczyk commented Jan 9, 2023

@tasleson I would like to present you my point of view of the library implementation.
First, I would like to illustrate you how I understand conception of controllers, slots and disks and how they are connected. I created an illustration to visualize led management complexity:
image

Legend:

TYPE DEFINITION
CTRL controller
DISK SSD/NVME/HDD device connected (called as block device)
LEDS led system per disk
SLOT the pointer/ID of led system inside controller.

As you can see, there is no direct reference between CTRL and DISK but you can use DISK to retrieve slot identifier for a controller to blink appropriative led.
There isn't any functional difference between set_slot and _write. Some implementation defined behaviors like:

  • error handling
  • defaulting non-supported states to normal

may be different but the concept is same. If you want to update led for device using particular controller you need to determine slot. You can send just block device (if any) and slot will be detected automatically for you.

I also determined two problematic scenarios:

  • CTRL1->slot2 and CTRL2->slot0. Points to same LEDS0_0.
    That is real issue between VMD and NPEM controllers. It is resolved by giving NPEM higher priority, see block_get_controller()
  • LEDS_2_1 and LEDS2_0 both are connected with DISK2 but they are managed by two independent controllers.
    I'm not aware of any real life scenario yet. This is a theoretical problem that I would like to be aware of.

That is why for library I want to force user to determine controller for device. We can provide something like get_default_controller() but I want to include controller in _set or _get command:

  • it makes library more configurable and powerful
  • I want to make library users aware of controller abstraction

For me, that is all what library should provide. We should allow user to just set LED for proper slot.
Other functionalities are ledmon and ledctl related. I've spent some time on diagram to illustrate my expectations somehow:
image

The library implementation goals I determined:

  • controller, pci_slot and enclosure lists should stay inside the library. Volumes, slaves and ledmon_block_list are made to provide monitoring functionality and are used by ledmon and ledctl only.
  • Block list management should stay inside the library because we need to provide list of supported block devices . We should remove send_fd reference from strcut block_device and put in in the struct controller (because the function is controller related).
  • "Exlude" and "allowed/only" lists should stay in the library and be honored during led_scan_ctrls(). We should give possibility to manipulate them (add and remove functions)
    • Led_blk_get_def_ctrl(blk) should return "default "controller for disk.
  • The _determine() logic should stay in ledmon/ledctl

I understand that the concept I provided will require more changes. We should separate library content and ledmon/ledctl content more. Let me know what do you think.

@tasleson
Copy link
Contributor Author

tasleson commented Jan 10, 2023

I understand that the concept I provided will require more changes. We should separate library content and ledmon/ledctl content more. Let me know what do you think.

In my opinion we need to make the library as simple as possible for someone to use. If there is complexity, it should be in the library implementation, not pushed to the application developer. When I first approached the project I mentioned a use case where you supply a device node and a pattern, discussed here: #97 (comment) . Without that functionality I cannot simply drop it into libStorageMgmt as a replacement. This is a use case which any developer should be able to utilize very quickly. In your latest discussion you outline removing the functionality of the get/set by device node. Is it your intention that we are then removing support for all the others? So only NPEM, SES, VMD would be supported?

The notion of a controller and slot is useful for changing the state of LEDs when the block device no longer has a device node. The API should be simple to understand and utilize for this use case too.

You mention:

CTRL1->slot2 and CTRL2->slot0. Points to same LEDS0_0.
That is real issue between VMD and NPEM controllers. It is resolved by giving NPEM higher priority, see block_get_controller()

The function block_get_controller appears to only be used by ledmon. This was part of my question asking if existing ledctl is flawed. I was simply asking if ledctl does what it states it does. Which is having the ability to control LEDs by device node and by the controller/slot functionality.

You also mention:

The _determine() logic should stay in ledmon/ledctl

I believe I actually removed this from ledctl. I'm thinking that maybe I've introduced a bug in doing so.

Maybe the API isn't the best fit for ledctl/ledmon, but I think we should be striving for something that doesn't require the API user to (If I'm understanding your diagram correctly)

  • Separately scan controllers and block devices
  • Iterate over every block device to find out which controller it should use (default)
  • Use the default controller and the block device to see if it has LED support

Ultimately what I was trying to achieve is something simple, with enough functionality that people would find it useful and have broad hardware support. I also wanted to minimize code churn in the code base, to hopefully reduce the introduction of bugs.

@tasleson
Copy link
Contributor Author

tasleson commented Jan 10, 2023

Re-based and moved the commit for removing exclusionary language to beginning of changes. When #118 gets integrated I'll remove that change from this PR. I've not addressed any of other other comments at this time.

@mtkaczyk
Copy link
Contributor

In my opinion we need to make the library as simple as possible for someone to use. If there is complexity, it should be in the library implementation, not pushed to the application developer. When I first approached the project I mentioned a use case where you supply a device node and a pattern, discussed here: #97 (comment) .

Yeah, the initial discussion took place before we developed "get/set-slots" and I deeply realized how it all is connected. The current approach I proposed is based on my current experience. But no offence, we need all to be on the same page to make library implementation possible. I don't want to repeat myself and discuss things many times because my current view is quite different. I did the schema to visualize my mind to myself too.

Without that functionality I cannot simply drop it into libStorageMgmt as a replacement. This is a use case which any developer should be able to utilize very quickly. In your latest discussion you outline removing the functionality of the get/set by device node.

Yup, in my approach I proposed to use opaqued block device. That makes your implementation complicated. You want to have it as simple as possible. As a maintainer, I'm thinking of making it the most flexible, not the simplest for developer.
See that my approach wants to totally isolate ledctl and ledmon code from library code.

Beside the my personal opinion about the controllers I can see that all current code bases on default one. It seems that I want to create dead api. So, maybe instead giving possibility to choose controller now, we can create set of functions prefixed by _def_ to highlight that they are using default controller and later, based on needs we can export more API around the controller management.
So, to meet my and your expectations, feel free to add endpoint like led_def_set_by_path(chat *path, enum ibpi val) and appropriative getter. Unfortunately, I still want to isolate ledctl and ledmon code more from the library so I'm wondering how we can agree on that.

Is it your intention that we are then removing support for all the others? So only NPEM, SES, VMD would be supported?

No, but they have _write only so we can support _set_by_block and _set_by_path only. On other endpoints we will be legible to returns error, until appropriative support will be added. Do you agree here?

You mention:

CTRL1->slot2 and CTRL2->slot0. Points to same LEDS0_0.
That is real issue between VMD and NPEM controllers. It is resolved by giving NPEM higher priority, see block_get_controller()

The function block_get_controller appears to only be used by ledmon. This was part of my question asking if existing ledctl is flawed. I was simply asking if ledctl does what it states it does. Which is having the ability to control LEDs by device node and by the controller/slot functionality.

sysfs_scan() is shared between ledmon and ledctl. There is call to block_scan and later to block_device_init and this function calls block_get_controller and sets it in device->send_fn

You also mention:

The _determine() logic should stay in ledmon/ledctl

I believe I actually removed this from ledctl. I'm thinking that maybe I've introduced a bug in doing so.

Similar to above, it is called by sysfs_scan and determine_slaves.

Maybe the API isn't the best fit for ledctl/ledmon, but I think we should be striving for something that doesn't require the API user to (If I'm understanding your diagram correctly)

  • Separately scan controllers and block devices
  • Iterate over every block device to find out which controller it should use (default)
  • Use the default controller and the block device to see if it has LED support

Agree now, as above- we shuld just put _def in method.

Ultimately what I was trying to achieve is something simple, with enough functionality that people would find it useful and have broad hardware support. I also wanted to minimize code churn in the code base, to hopefully reduce the introduction of bugs.

Sure and I will support you with that but I also know if I don't handle this and don't force some good requirements from the start then we will end with a huge and understandable library with set of not needed functionalities inside. I know this code well so I'm trying to describe you what is for applications and what is for library.

No worries about bugs, we don't have unit test they are unavoidable. We need to deal with that.
Perhaps, we should start with unit tests but it also huge input. There is no simple way now...

@tasleson
Copy link
Contributor Author

functions prefixed by def to highlight that they are using default controller

I'm trying to understand better why we need to expose the idea of default controller. From your perspective, it seems like we must expose it and not handle it internally. Is VMD & NPEM the only one that has the requirement of using NPEM? In this case NPEM is the default yes? Are you wanting the ability for the library user to have the choice to use VMD or NPEM to change LED states instead of allowing the library to select the best one for a given hardware configuration?

In SES with multi-path, there are multiple controllers and paths to change the LED state for a specified slot in a given enclosure, but I wouldn't say one is better than another, unless one of the paths is down.

I'm trying to wrap my head around why ledctl works with all supported hardware when you do something like:

# ledctl normal=/dev/sdr

and the user doesn't need to worry about a controller. I need more of an explanation to why we can't have the library do that too? Thanks

@mtkaczyk
Copy link
Contributor

functions prefixed by def to highlight that they are using default controller

I'm trying to understand better why we need to expose the idea of default controller. From your perspective, it seems like we must expose it and not handle it internally. Is VMD & NPEM the only one that has the requirement of using NPEM? In this case NPEM is the default yes? Are you wanting the ability for the library user to have the choice to use VMD or NPEM to change LED states instead of allowing the library to select the best one for a given hardware configuration?

I'm trying to wrap my head around why ledctl works with all supported hardware when you do something like:

# ledctl normal=/dev/sdr

and the user doesn't need to worry about a controller. I need more of an explanation to why we can't have the library do that too? Thanks

Initially I wanted to give possibility to choose the controller explicitly as you said. But during discussion I lowered this requirement because it is not needed now, let me link all my comment:

Beside the my personal opinion about the controllers I can see that all current code bases on default one. It seems that I want to create dead api. So, maybe instead giving possibility to choose controller now, we can create set of functions prefixed by _def_ to highlight that they are using default controller and later, based on needs we can export more API around the controller management.

Here explanation:
We are doing software here. I cannot predict how it all will be connected on hardware level. Someone may come with not trivial case where it will be needed. It comes from "flexibility" approach. It may be never needed but since I've realized that something like that is possible I wanted to make our library API flexible to meet more business goal with minimal input later.

We have allowed/ denied controllers list- that gives some flexibility, we are unable to decide how to blink for particular disk but we can exclude the controller- should be sufficient in standard cases.

When someone will come with this problem that will require to explicitly set the controller we will add new methods to library API.
I agree that we don't need to care about that right now. I will update the draft I created if we agree here to give us both reference for later.

@tasleson
Copy link
Contributor Author

When someone will come with this problem that will require to explicitly set the controller we will add new methods to library API.
I agree that we don't need to care about that right now. I will update the draft I created if we agree here to give us both reference for later.

Agreed. The API can always be added/extended, I fully support the idea of not adding something until it's needed.

@mtkaczyk
Copy link
Contributor

image

This is not a complete representation, I focused here on ledmon mostly but I also wanted to list all expected library functions. This is why some of them doesn't have references.
I omitted slot management functionality implementation in ledctl- I believe that it does not require detailed view. Let me know if something is not clear for you.

Thanks!

@tasleson
Copy link
Contributor Author

@mtkaczyk Please take a look at updated PR and see if this is starting to look closer to something you like.

Copy link
Contributor

@mtkaczyk mtkaczyk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tasleson
Yes, it is going well. Thanks!
I've put some new comments please take a look.

The one objection I have is around accessing structures by ledmon and ledctl defined in libled_internal.h. I'm fine with including it in both applications for backward compatibility but I think that we should avoid accessing structures defined there as much as possible.
For example:

	while(LED_STATUS_SUCCESS == (status = led_slot_list_next(ctx, &slot))) {
		if (slot_req->cntrl == slot->cntrl)

Perhaps, we should consider new method in libled.h slot_get_controller to avoid referencing slot content directly? I've put comment there too.

src/amd_sgpio.h Outdated Show resolved Hide resolved
src/config_file.c Outdated Show resolved Hide resolved
src/config_file.c Outdated Show resolved Hide resolved
src/config_file.h Outdated Show resolved Hide resolved
src/led/libled.h Outdated Show resolved Hide resolved
src/led/libled.h Outdated Show resolved Hide resolved
src/ledctl.c Outdated Show resolved Hide resolved
src/libled_internal.c Outdated Show resolved Hide resolved
src/ledmon.c Outdated Show resolved Hide resolved
src/libled_internal.h Outdated Show resolved Hide resolved
Copy link
Contributor

@mtkaczyk mtkaczyk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HI @tasleson,
I tried to review it deeply. Please see a comments.

The next step (after addressing current issues) should be removing ledmon and ledctl lists from library ctx (slaves, volumes ledmon/ledctl block_list).

As the last step, I would like to propose code dividing into subdirectories:

  • common (the files used by applications and library (utils, lists)
  • ledmon (ledmon files, ledmon.c, udev.c, udev.h)
  • ledctl (ledctl files)
  • lib (library content)

Let me know if you have any objection or advises.
As I said- it is my first experience with library.
Thanks,
Mariusz

src/block.c Outdated Show resolved Hide resolved
src/block.c Show resolved Hide resolved
src/config_file.c Outdated Show resolved Hide resolved
src/config_file.c Outdated Show resolved Hide resolved
src/ibpi.h Show resolved Hide resolved
src/ledctl.c Outdated Show resolved Hide resolved
src/ledctl.c Outdated Show resolved Hide resolved
src/ledmon.c Outdated Show resolved Hide resolved
src/list.c Show resolved Hide resolved
src/list.h Outdated Show resolved Hide resolved
@tasleson
Copy link
Contributor Author

tasleson commented Jan 26, 2023

The next step (after addressing current issues) should be removing ledmon and ledctl lists from library ctx (slaves, volumes ledmon/ledctl block_list).

This may very well be a good change, but from the perspective of the public API, it's an implementation detail. As such, I would like to defer it until we get this PR merged.

As the last step, I would like to propose code dividing into subdirectories:

I'm not an autoconf wizzard, I don't muck with it enough to be super proficient. I'll spend some time on it, but if it takes to long I would like to defer and place it a separate PR after this gets merged.

Update: I started working on the re-org a bit. What occurred to me is we will break packaging with this change. Also list.c is used by library code and ledmon/ledctl, so it should be in the lib dir too. I've added a commit to the end which does the re-org, but I'll likely remove it with future changes or simply squash the entire PR to one commit to address any needed changes.

@tasleson tasleson force-pushed the library_rfc branch 3 times, most recently from d595980 to 421f496 Compare February 1, 2023 21:04
@tasleson
Copy link
Contributor Author

tasleson commented Feb 1, 2023

@mtkaczyk I build a local SRPM and ran it through our code analysis tools. It found ~34 defects which I'm attempting to work through. Some look like false positives, but some look like genuine bugs (memory leaks, possibility of de-referencing null, failure to check return codes etc.)

I'll try to get that completed tomorrow and do testing as you suggested on the command line.

Copy link
Contributor

@mtkaczyk mtkaczyk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may very well be a good change, but from the perspective of the public API, it's an implementation detail. As such, I would like to defer it until we get this PR merged.

Got it. Looks reasonable.

@mtkaczyk I build a local SRPM and ran it through our code analysis tools. It found ~34 defects which I'm attempting to work through. Some look like false positives, but some look like genuine bugs (memory leaks, possibility of de-referencing null, failure to check return codes etc.)

Great! My infrastructure is going back so I will run my own analysis later. Generally, it is a finishing touch.

I'm not an autoconf wizzard, I don't muck with it enough to be super proficient. I'll spend some time on it, but if it takes to long I would like to defer and place it a separate PR after this gets merged.

OK, agree. We can handle it in separate pull eventually.

Also list.c is used by library code and ledmon/ledctl, so it should be in the lib dir too. I've added a commit to the end which does the re-org, but I'll likely remove it with future changes or simply squash the entire PR to one commit to address any needed changes.

Ok.

I need to do internal testing of slots management feature, then I will release new ledmon version. After that I will take you PR to baseline. I need around one month.
Led me know if that works for you.

src/lib/include/led/libled.h Outdated Show resolved Hide resolved
src/libled.c Outdated
led_status_t led_slot_list_next(struct led_ctx *ctx, struct led_slot_list_entry **slot_entry)
{
if (!ctx->sl.iter) {
led_status_t rc = led_slots_get(ctx);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I proposed to make it configurable via parameter passed to scan:

led_status_t LED_SYM_PUBLIC led_scan(struct led_ctx *ctx, bool scan_slots);

For reset and next we will return error if ctx->sl.slot_list.head is NULL.
My goal is to left all scans in led_scan() because I want highlight that it is the one place when we are iterating over the system (sysfs, ioctls and others).
The later we should impose led states only, assuming that we have all necessary information to do it.
Let we know how you see it.

src/block.c Outdated
if (!device)
goto error;

struct _host_type *hosts = cntrl ? cntrl->hosts : NULL;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cntrl cannot be NULL here:

	if (!cntrl)
		goto error;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And we are following kernel coding style, definitions should be placed at the beginning of function

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I proposed to make it configurable via parameter passed to scan:

led_status_t LED_SYM_PUBLIC led_scan(struct led_ctx *ctx, bool scan_slots);

For reset and next we will return error if ctx->sl.slot_list.head is NULL.
My goal is to left all scans in led_scan() because I want highlight that it is the one place when we are iterating over the system (sysfs, ioctls and others).
The later we should impose led states only, assuming that we have all necessary information to do it.
Let we know how you see it.

I actually prefer what I did the first time. Where the API caller called into the API and got an abstraction of a list and then iterated on it. This was done as we don't want to expose a linked list implementation.

Regardless, there is no sysfs scan performed in either the implementations I've provided. It simply walks over the in memory representation of what the scan already did and presents it with opaque data types.

Before I re-write this for the third time, please consider what the API end user experiences, how it can be used, the limitations of it etc. For example, in the first implementation a user could scan, get the list, close the library handle and continue to use the list abstraction indefinitely. The API user could also have more than one at a time too, representing change over time. Thus for example, the functionality where ledmon keeps track of state and, adds/updates/removes could be assisted by what the API provides.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before I re-write this for the third time, please consider what the API end user experiences, how it can be used, the limitations of it etc. For example, in the first implementation a user could scan, get the list, close the library handle and continue to use the list abstraction indefinitely. The API user could also have more than one at a time too, representing change over time. Thus for example, the functionality where ledmon keeps track of state and, adds/updates/removes could be assisted by what the API provides.

I believe that context is always must have. The main reason is:

That is the guidance I would like to follow to limit complexity for now but maybe I've taken in too serious. I understand that the approach you proposed is more flexible and could save some memory. To achieve that, we can handle two contexts and compare them. Nowadays we don't need to save every possible memory cell :)
The main purpose of the library is to allow led patterns manipulation. To achieve that context is always needed.

I see your first approach could be valuable for ledmon where we need to compare two system scans. I think that we can handle that in separate pull If we will determine that there is real use case for that because the first question we should ask ourselves is: Do we have user for the new API?

First, I would like to focus on removing ledmon code from lib and we agreed to handle that later. Then, we will know which list can be eventually coped for the user and if it is really needed (we have request or need to do it).

In this comment I wanted to discuss moving slots list generation to led_scan(), the discuss goes into other direction.
Let me know how you see this problem.

Thanks for giving you personal PoV- I really appreciate that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code was walking the systems/slots and allocates an internal structure for each item in the list. It only contains the information presented in the public API, so it's much smaller than what is present internally. Its not a clone of anything. One of the main reasons this is done is that would be very difficult to handle the case where a user gets the iterator, and during iteration they do a led_scan. That could potentially invalidate one or more internal structures if we simply provide iterating over the internal data structures if devices are added/removed.

You right, it is not exact clone, it is a new list made from original one. "Clone" was not perfect wording for that.

IMHO we should accept the risk you mentioned. I think that it is not a problem. User should know that after led_scan() pointers will be invalidated and if user need old list then he must make a list copy itself. The solution for the problem should stay in application, not in library.
The user should decide whose properties are needed, and should be copied to the internal temporary list. All we need is to provide API to access them. It also gives a direct control of allocated memory (application allocates, application frees).

I'm not sure what the problem is with making an iterator public. This is simply a place holder for their existing location in the internal list. The main point of exposing iteration is we then don't expose any kind of linked list structure(s). Iterators can be used across different languages bindings too.

Sorry wrong word. I meant "something what need to be resolved/ addressed", maybe concepts / ideas were better. Sorry for confusing you.

I'm confused by this. The public API as access to iterator, eg. next. We can easily add a previous and reset to what I initially did. If the user wanted to do nested for loops they would need to get two different list instances. At the moment I'm struggling to see the use case, but they could do it if they needed to.

And as you said- it is more intuitive. I have strong feeling that I don't remember correctly your first implementation and that is all based on my incorrect assumption that you returned copy of list to the caller. I cannot verify that after squashing. I'm sure that I saw it somewhere but I don't remember exactly where.

I still have an old branch. I can resurrect and implement previous and reset. I'll also include a client example showing its use. I have one already that I'm using for testing.

You don't need to implement functionalities that are not needed. The goal of my comment was to find examples where public iterator brings are real advantage, even if they are not needed now.

Conclusion: I agree that your first approach were better but I would like to not use "internal structure for each item" just present everything as is and document that pointers will be invalidated after led_scan().

Copy link
Contributor Author

@tasleson tasleson Feb 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Conclusion: I agree that your first approach were better but I would like to not use "internal structure for each item" just present everything as is and document that pointers will be invalidated after led_scan().

I've modified the PR. The controller list abstraction is using the internal data structures without making a copy. This was pretty straight forward and the code is actually smaller. However, the slot data is not so easy.

  1. Some of the information is generated, it doesn't exist in any current data structures like the slot id and the current LED status which needs to be generated/retrieved. Thus if we don't take the current approach of allocating a list of structures which store this information when we iterate through the code to retrieve it, we are then forced to require the API user to check a status code during list iteration to see if any errors occurred as the data would need to be retrieved during the next or previous iteration. From an API user perspective, it's much nicer to retrieve the list and check the status once that all is good and then simply walk the data without having to constantly check if it's ok.
  2. In my first implementation the API user would specify the controller to retrieve slots for. Later I changed this to simply retrieve all slots on the system as the returned slot information includes the enumerated controller type. From an API user perspective I believe this is better. Otherwise the API user needs to loop across the different supported controllers that support the slots API and retrieve them individually. To do this iteratively without allocating a new list structure will require some really interesting code as we will need to internally iterate over three different linked lists, each of which contains dissimilar data.

I'll continue to investigate ideas on how to implement this without it being a significant increase in complexity, but I'm not seeing it at the moment.

Update: I'm in the process of implementing this for slots. We will however need to have a return code for each iteration. I'm also thinking we should change the controller iterator to match even though it doesn't need a return code, to make them consistent.

Copy link
Contributor

@mtkaczyk mtkaczyk Feb 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do not fear to make a slots list inside the context to contain all necessary properties -that fits the design I want.
I don't want to return copied list, with no reference inside the context. If you need to create additional list inside context to combine all necessary properties for slots- that is fine.

Copy link
Contributor Author

@tasleson tasleson Feb 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do not fear to make a slots list inside the context to contain all necessary properties -that fits the design I want. I don't want to return copied list, with no reference inside the context. If you need to create additional list inside context to combine all necessary properties for slots- that is fine.

I'm not exactly sure what you're saying here. Regardless, I'm going to leave this PR the way it is as I don't think I can reduce the list elements for listing slots. I've completed a different implementation that generates the slot data during iteration and placed it in this branch: https://github.com/tasleson/ledmon/tree/library_rfc_less_alloc . This branch isn't complete, as I would want to change the signature of the controller iteration to report back status during each iteration too. This is implemented with a linked list of linked lists with forward/backwards iteration included.

Please take a look at both and let me know which direction you prefer.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not exactly sure what you're saying here.

Apologize for being inaccurate. Probably the root cause it that I'm only the reviewer. I will analyze what you prepared and eventually I will come with my code, to not bother you more with more comments. Then we will choose the way.

Please take a look at both and let me know which direction you prefer.

I need few days to do that.

src/block.c Outdated Show resolved Hide resolved
@tasleson
Copy link
Contributor Author

tasleson commented Feb 3, 2023

@mtkaczyk The directory re-work is in the PR

@tasleson
Copy link
Contributor Author

tasleson commented Feb 3, 2023

@mtkaczyk

I need around one month.
Led me know if that works for you.

That works, thank you for giving a time estimate.

@tasleson tasleson force-pushed the library_rfc branch 2 times, most recently from 3150ba0 to 877ec20 Compare February 13, 2023 23:06
@tasleson
Copy link
Contributor Author

Example of using library API

#include <stdlib.h>
#include <stdio.h>

#include <led/libled.h>

static inline void print_slot(struct led_slot_list_entry *s)
{
	printf("cntrl type: %d: slot: %-15s led state: %-15d device: %-15s\n",
		led_slot_cntrl(s), led_slot_id(s), led_slot_state(s), led_slot_device(s));
}

static inline void print_cntrl(struct led_cntrl_list_entry *s)
{
	printf("cntrl type: %d: path: %s\n", led_cntrl_type(s), led_cntrl_path(s));
}

int main() {
	struct led_slot_list_entry *slot = NULL;
	struct led_ctx *ctx = NULL;
	struct led_slot_list *slot_list = NULL;

	struct led_cntrl_list *cntrl_list = NULL;
	struct led_cntrl_list_entry *cntrl = NULL;

	led_status_t status;

	status = led_new(&ctx);
	if (LED_STATUS_SUCCESS != status){
		printf("Failed to initialize library\n");
		return 1;
	}

	status = led_scan(ctx);
	if (LED_STATUS_SUCCESS != status){
		printf("led_scan failed %d\n", status);
		return 1;
	}

	status = led_slots_get(ctx, &slot_list);
	if (LED_STATUS_SUCCESS != status) {
		printf("led_slots_get failed %d\n", status);
		return 1;
	}

	printf("Printing slots in order returned\n");
	while ((slot = led_slot_next(slot_list))) {
		print_slot(slot);
	}

	led_slot_list_reset(slot_list);
	printf("Printing slots in reverse order\n");
	while ((slot = led_slot_prev(slot_list))) {
		print_slot(slot);
	}

	led_slot_list_free(slot_list);

	printf("Printing controllers\n");
	status = led_cntrls_get(ctx, &cntrl_list);
	if (LED_STATUS_SUCCESS != status) {
		printf("led_cntrls_get failed %d\n", status);
		return 1;
	}

	printf("Printing controllers in order returned\n");
	while((cntrl = led_cntrl_next(cntrl_list))) {
		print_cntrl(cntrl);
	}

	led_cntrl_list_reset(cntrl_list);

	printf("Printing controllers in reverse order\n");
	while((cntrl = led_cntrl_prev(cntrl_list))) {
		print_cntrl(cntrl);
	}

	led_cntrl_list_free(cntrl_list);

	led_free(ctx);
	return 0;
}

Show the configure option and show a brief example on compiling/
building/running it by adding a new markdown file in the lib
directory.

Signed-off-by: Tony Asleson <tasleson@redhat.com>
Using the C testing library check, add some unit tests to test
the functionality of the library.

Signed-off-by: Tony Asleson <tasleson@redhat.com>
Library users will expect that when they change a value of an LED that if
retrieve the value after the set that is will reflect its new value.
Additionally a library user should be able to seamlessly use the slot or
device path API if the LED hardware supports it and have the ibpi values
match.  This change allows SES hardware to meet these expectations.

Signed-off-by: Tony Asleson <tasleson@redhat.com>
This allows us to run both the unit test and the command test
with `make check`

Signed-off-by: Tony Asleson <tasleson@redhat.com>
Function led_free was being called in _ledmon_fini() and main.
Remove duplicate call in main.

Signed-off-by: Tony Asleson <tasleson@redhat.com>
When the device is empty we get the following exception

Exception: Text line 'slot: sg3-0           led state: NORMAL
  device:' did not match regex '^slot: (.+)led state:(.+)device:(.+)$'

Signed-off-by: Tony Asleson <tasleson@redhat.com>
The library check is needed for the library unit test which was
added.

Signed-off-by: Tony Asleson <tasleson@redhat.com>
When compiling -g -O0 we get a linker error:

/usr/bin/ld: ledctl-ledctl.o: in function `_ibpi_state_init':
ledmon/src/ledctl/ledctl.c:318: undefined reference to `list_append'

Corrected with making this inline function static which is described:
http://gcc.gnu.org/onlinedocs/gcc/Inline.html

Signed-off-by: Tony Asleson <tasleson@redhat.com>
Issues found during valgrind testing the unit test on SES
hardware.

Signed-off-by: Tony Asleson <tasleson@redhat.com>
Copy link
Contributor

@bkucman bkucman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @tasleson

We finished the tests, I needed some time to analyze the list of defects to exclude false defects, and those requiring changes in test cases and unrelated to your changes. I have good news, we did not find any issues related to the main functionalities.

In this PR, there were problems with logging, which I described in the appropriate places.

during the analysis, I noticed slight differences in the style of function descriptions, I added comments where something needs to be tightened.

I haven't analyzed your last discussion with Mariusz, so give me a moment to familiarize and see if I have anything to add.

Thanks.

src/ledmon/udev.c Outdated Show resolved Hide resolved
src/lib/npem.c Outdated Show resolved Hide resolved
src/lib/npem.c Outdated Show resolved Hide resolved
src/lib/vmdssd.c Outdated Show resolved Hide resolved
src/ledmon/ledmon.c Outdated Show resolved Hide resolved
src/lib/list.h Outdated Show resolved Hide resolved
src/lib/pci_slot.h Outdated Show resolved Hide resolved
src/lib/list.h Outdated Show resolved Hide resolved
src/lib/utils.h Show resolved Hide resolved
src/lib/utils.h Show resolved Hide resolved
Log severity was incorrectly set.

Signed-off-by: Tony Asleson <tasleson@redhat.com>
Ensure we have handled retrieving all the options before we load the
library preferences.

Signed-off-by: Tony Asleson <tasleson@redhat.com>
Instead of returning a pointer to memory that contains no string
we will return NULL when this field is empty.

Signed-off-by: Tony Asleson <tasleson@redhat.com>
Missed setting the library log level from the configuration.

Signed-off-by: Tony Asleson <tasleson@redhat.com>
Make this clearer on why we are adding an additional byte to the
allocation.

Signed-off-by: Tony Asleson <tasleson@redhat.com>
Make this conform to established code standard.

Signed-off-by: Tony Asleson <tasleson@redhat.com>
Change from bool to enum led_cntrl_type  for led_is_management_supported
which allows us to return not only if it is supported, but what
controller provides the preferred support for it.

Remove led_set as this doesn't work for all the different supported
hardware.  You need slot support to have the ability to retrieve the LED
status.

Signed-off-by: Tony Asleson <tasleson@redhat.com>
Remove the calls to get_led as we removed that from the API.

Signed-off-by: Tony Asleson <tasleson@redhat.com>
@mtkaczyk
Copy link
Contributor

This moment finally comes. It has enough good quality to be merged. Thanks to all contributors (especially @tasleson for driving this great feature). It is small step for me to click merge but is is a big step for ledmon project!

I totally understand that it is not yet complete, there will be a bugs. There are still some topis that require clarification. There are still many questions. We will drive them all in separate pulls according to our needs and priorities.

I'm not going to make a release soon. There is a time for corrections in the library API.
The biggest step is done.

Thanks everyone for support!
@bkucman
@ktanska
@mgrzonka
@mku514k
@tasleson

Tony, please give me an ack, then I will merge it.

@tasleson
Copy link
Contributor Author

Tony, please give me an ack, then I will merge it.

My official 'ack', please do merge it.

@mtkaczyk mtkaczyk merged commit ca3178e into intel:master Aug 10, 2023
7 checks passed
@tasleson
Copy link
Contributor Author

tasleson commented Aug 10, 2023

@mtkaczyk
@bkucman
@ktanska
@mgrzonka
@mku514k

Thanks everyone!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Gauging interest in a C library API
4 participants