public API for SID #103
@prajnoha I've been making the response formatting happen in the sid daemon itself, as mentioned in #102 (comment), and I have an issue I'm not sure we need to solve. Currently, …
Sure. Right now, I think, we don't need to add the extra formatting option - maybe if we have more …
The issue with not adding a separate ENV formatting type is that the SID daemon is doing the formatting, so we can't use …
Ah yes, I see - I got misled by the fact that …
Ben, just wanted to ask - with your changes, we're still going to use the memfds, right? I mean:
Or do you want to follow a different scheme?
I was planning on only having the kv store dump use a memfd, like things currently are. The response header is already using a regular buffer, and it doesn't really make sense to push that to the memfd. For things like the stats and version commands, there's not really enough information being passed to make a difference. But passing the full ascii-formatted kv store dump could be pretty big. If we weren't using a memfd for that, then we probably shouldn't be passing that data as formatted ascii; we should pass it in as compact a format as possible. And if we pass that data in a binary format, I don't really see the point in passing the rest of the commands as ascii.

But there really isn't a reason why we couldn't pass all the command reply data in a memfd. It seems like overkill just to pass a couple of lines of text, but it would keep the interface common for all the commands.

Actually, I can think of one reason why we might want to treat the kv store dump differently from the other commands. It's possible that callers will want to query sid for more specific information than just a complete dump. The library callers likely don't want to grab the complete dump and process it themselves to answer their queries. SID understands how information is stored in the kv store, and can more easily do this processing itself. The question is where to process it: in the client library or in the daemon itself.

The downside to processing it in the daemon is that we need to extend the SID client/server API whenever we add new commands or options to process the data differently. If the client just works from a dump of the whole kv store (or, in the future, possibly a partial dump), the client/server API doesn't need to change when the client adds new ways to process the dump. However, having the server pass formatted ascii output just adds extra work for the client to do: formatting the output on the server side means that the client would have to parse the formatted output, then process it, and then reformat the processed output.

Of course, the question of whether we want to pass the dump command reply data as binary or formatted ascii is orthogonal to the question of whether we want to pass command reply data in memfds or not.
That makes sense, yes. That way, we can have it clearly separated - the metadata/header (which won't actually be used by the very end user of the API - it's more like flags/directions for the API code itself). Then the data part can be passed directly to the API user.
Sure, for commands like …
This is a must - I'm counting on this being implemented in the future. Right now, we use the full dump mostly for our own debugging purposes, but in the API for wider use we really need to support queries for both selecting the records and even properties within those records - that's the second most wanted feature from SID besides instantiating devices/stacks and collecting info. This is going to be a whole chapter for us to support. For example, one such query could be "give me a list of all devices in the stack above that are affected if I change device X" and similar. We need to come up with a sane query format. But this is certainly for a separate discussion; we just need to keep in mind right now that something like this will come.
...I'm more inclined to processing on the daemon side and supporting that query format in the request. If that query format is powerful and generic enough, I don't expect many changes in the API itself.
The best would be if the client can get the result in a form where there's minimal or even no extra processing to be done on the client side, so the clients are really thin. Sure, if the client wants to put the data into some report with rows and columns, that extra processing is unavoidable (just like if we wanted to present the records in some nice form in a GUI). So it's more a question of a good format for which the clients can use existing external libraries to parse the output easily (which JSON certainly is).
Yes. BTW, one of the supported formats could as well be a binary one, besides the ascii ones... If it was binary, we could possibly do it in a way that makes it easy and straightforward for clients to use a predefined structure to iterate over the data. It's more prone to changes in layout and ordering, but we have the protocol versioned, and that versioning can include formats too.
That said, right now we're considering static data - so working on some instantaneous state or db snapshot. But in the protocol and for the API, we should also plan for possible streaming support in the future - like monitoring incoming events, monitoring triggers being fired etc...
What are we thinking for the KV store moving forward? Are we going to consider moving to something like LMDB, or extend the code we've got to sync to disk?
Sure. I'll do that then.
As for syncing to disk, I think we should be able to provide that functionality even with the existing db backend without any major problems - in the manner of the …

A more effective alternative (if we wanted to avoid even the extra copying between the memfd and the file) would be for the buffer code to support writing to a file directly, as a new type of buffer backend which would use an mmap-ed file (instead of anonymous memory, as it is with memfd). The code should then be almost the same as the "dump" code, only changing the buffer backend to that mmap-ed file and skipping the part where the "dump" code sends the memfd to the client. Then adding the part which would read the file and reconstruct the db out of it.

However, we need to take care that certain records/properties are not persistent over reboots, like major:minor. So we need to use persistent identifiers here instead whenever we reference devices in the db, which we don't do right at the moment - we're using exactly that major:minor... That's actually something I'm working on right now, where I'm trying to generate a persistent unique identifier for a device ("unique" for the purpose of SID) which would allow us to:
(That also includes the logic to match the persistent identifier with the actual device with major:minor as seen on the system.)

What I'm more concerned about is the ability to do effective selection/querying on the database, where our current db backend falls short - it's simply a hash. We'd need to be able to select subsets of records, ranges. For that, we need a different backend than the hash (or at least to extend the existing hash heavily). Whether we do it on our own or whether we reuse an existing database - that's a question of whether we can find a database that will suit our needs and won't restrict us more than it brings good. It's true that LMDB comes very close to this, but I haven't tried it in depth enough to say it's the one we'd use for sure. It's still an open question...
Consider making the CLIs thin layers above a publicly usable API.
See:
#102 (comment)
#89 (comment)