Let compiler infer types and ports in fg.connect() #66

nils-werner · 2022-10-01T16:24:28Z

Hey,

I am new to Rust, so some of the things I am saying may be wrong, or technically impossible. But looking at the example in your README I was a bit surprised to see

fg.connect_message(src, "out", snk, "in")?;

because it struck me as a bit "outdated" and error-prone. I would have expected more something like

fg.connect(src.out, snk.in)?;

Not using strings as port-names, but instead struct fields, which allows us to catch more errors during compile time.
This idea is a bit out there, but ports could be generics for their signal type, e.g. StreamPort<Complex<f32>>. This allows the compiler to even catch connection type errors and maybe even infer all connection types just from the output type of the first block?
And because a field has a type (e.g. StreamPort vs MessagePort), we can infer what kind of connection we are building, so no need for connect_*() but one connect() for all port types.

Does that make sense, and would that work? As I said, I am new to Rust so my thinking may be wrong here...

The text was updated successfully, but these errors were encountered:

bastibl · 2022-10-01T19:47:22Z

Hi! Thanks for bringing this up. I completely agree that it would be nice to have a more type-safe API, in particular for such central, user-facing functions like connect. Until now, I was mainly focused on getting things up and running (and convincing myself that the general approach is reasonable) and didn't have the time to explore this in detail. My gut feeling is also that there is a lot to improve and that something like what you proposed could work. But it's not straightforward and one would have to take some time to dive into the topic. Just some exemplary questions for what you propose:

src in your example is actually just some type of block identifier (at the moment a usize), since the block was already added to the flowgraph. To allow something like src.out, it would have to be something typed and specific to your block. How should that look like?

While you can implement blocks with struct fields that return typed output ports, how could the flowgraph learn about them? You have to have some kind of function fg.add_block(&mut self, b: ???). But what should that b be. It has to implement some type of generic block trait and this trait cannot know about arbitrary struct fields like out that were added. (I.e., if the function is fg.add_block(&mut self, b: impl Block), you cannot access b.out like in your example.)

To overcome the previous issue, ports have to be in some sort of vector. This allows the flowgraph to access them through, for example, block.stream_output(0). But what type should that vector have? It couldn't be StreamPort<x> for different, arbitrary xes (i.e., vec![a as StreamPort<u8>, b as StreamPort<f32>] is not possible) . It has to be behind a dynamically dispatched, generic port trait.

It goes on and on and on like this. While things may seem trivial, implementing a typed API is not straightforward. I think there are good answers to all these issues, but one just has to give it a try.

If you come up with something, it will be very much appreciated. I'm happy to help as good as I can. It's also on my list, but I just didn't have the time yet :-/

JuliDi · 2022-10-02T09:31:46Z

src in your example is actually just some type of block identifier (at the moment a usize), since the block was already added to the flowgraph.

Could we make just all connections have some sort of "ID" similar to the aforementioned usize identifier?
It would supposedly have to bear some information about which block it belongs to, so it might need to be a bit more sophisticated ID.

One example, which I am not sure about whether it would really work:
If src.out were something like a tuple of (blockID: usize, connectorname: String) we would have the currently necessary information (src, "out") in one attribute that could be passed to the connect function.

loic-fejoz · 2022-10-02T10:04:52Z

Not so simple neither because some blocks may be template based (not only type, but number of inputs/outputs) and thus it would require extra works with macro at compile-time.
Nothing impossible I guess but not that obvious.

Also concerning the syntax, an alternative: I have started to experiment to provide a simpler syntax: https://github.com/loic-fejoz/FutureSDR/tree/feature/flowgraph-macro/futuresdr-macros

#[flowgraph(fg)]
{
    blk1.out2 > blk2.samples;
    blk2 > blk3.input;
    blk3.output > blk4;
};

It provides the simplicity you mentioned initially, but not the compile-time verification. And anyway, not all program will be known at compile-time (think graph editor/interpreter as other have started to build).

I would also add that what would be great is also to have the capability to check item size at compile-time in some case.

nils-werner · 2022-10-02T10:56:55Z

src in your example is actually just some type of block identifier (at the moment a usize), since the block was already added to the flowgraph.

That sounds like it's breaking Rust philosophy: If your flowgraph internally drops its src variable, your block identifier would be dangling. Wouldn't it be better if add_block() returned an immutable reference or a view or something similar? That way the compiler won't even allow those kinds of errors.

While you can implement blocks with struct fields that return typed output ports, how could the flowgraph learn about them? You have to have some kind of function fg.add_block(&mut self, b: ???)

Does the flowgraph need to know about these ports at that point? Isn't it enough if it learns about it in connect()?

It couldn't be StreamPort for different, arbitrary xes (i.e., vec![a as StreamPort, b as StreamPort] is not possible)

If you put all possible StreamPort types in an enum, or they all have the same common Trait, I believe you can

It goes on and on and on like this. While things may seem trivial, implementing a typed API is not straightforward. I think there are good answers to all these issues, but one just has to give it a try.

Maybe the help from somebody with more Rust experience is needed... (ping @bytesnake 😉)

Not so simple neither because some blocks may be template based (not only type, but number of inputs/outputs) and thus it would require extra works with macro at compile-time.

In that case can't we just make .out an array of StreamPorts and connect src.out[0]?

Also concerning the syntax, an alternative

Neat! I was also toying around with a similar thought yesterday

bastibl · 2022-10-04T12:35:45Z

@JuliDi I completely agree. Recently, I worked on something similar: There is now a FlowgraphDescription and BlockDescription that is basically exactly that. It can be attained through the flowgraph handle (allowing it to be updated when the flowgraph is running) and, since these structs are serializable, they are available through the REST API. See this blog post: https://www.futuresdr.org/blog/better-flowgraph-interaction/ I want to further extend this interface to allow more ways to interact with the flowgraph, while it is running. This will require adding, for example, more IDs to edges to allow referring to them. To my understanding, that's in line with your suggestion.

@loic-fejoz your comment motivated me pick up your work on macros. They are now merged. Thank you :-) https://www.futuresdr.org/blog/macros/

@nils-werner I know, it's Rust, so why not type everything? And I'd really love to. But in reality that's not always as easy as it may seem from a brief glance over the API.

That sounds like it's breaking Rust philosophy: If your flowgraph internally drops its src variable, your block identifier would be dangling. Wouldn't it be better if add_block() returned an immutable reference or a view or something similar? That way the compiler won't even allow those kinds of errors.

This is not against any Rust philosophy that I'd know of. It is the exact same thing as adding something to a vector and referring to it with an index in the vector. Also if the outside had a reference, the flowgraph could not use the block, since it couldn't borrow it mutably. Same, if you had an outside mutable reference. So this would go against fundamental Rust concepts. Only option would be to have a Arc<Mutex<Block>> reference. But then the outside world could lock the mutex and the flowgraph scheduler had no way to avoid it or schedule it at some reasonable point in time. Also the outside world could call arbitrary functions on the block without the flowgraph/runtime knowing it. This would be a complete disaster from a design perspective. And, in addition, the flowgraph would have to acquire a mutex every time it wants to do anything with the block, which would be a complete performance disaster. And apart from all that, we want remote nodes to allow interacting with the flowgraphs. These cannot have memory references and, therefore, will need some form of IDs in any way.

Regarding ports: as I mentioned in the previous reply, you cannot make a vector of typed StreamPorts.

It couldn't be StreamPort for different, arbitrary xes (i.e., vec![a as StreamPort<u8>, b as StreamPort<f32>] is not possible) . It has to be behind a dynamically dispatched, generic port trait.

I'm not really sure, I get your points. As I said, I'd love to have the API more typed but it's really not trivial. If you have concrete suggestions, your PRs are very welcome!

bytesnake · 2022-10-04T14:07:31Z

Thank you for this awesome project!

Maybe the help from somebody with more Rust experience is needed... (ping @bytesnake wink)

@nils-werner asked me to take a look (he knows gnuradio but not rust, for me it's the other way around), I read some hours this morning and wrote things on my mind to issues. I will need more time, and probably prototypes to fully understand. One thing that is not clear to me is what a SDR library makes fundamentally different to computational graphs (such as https://docs.rs/adapton/latest/adapton/#adapton-programming-model), is the edge state a ringbuffer and the computation happens on views instead of passed messages? For inspiration regarding the type description a look at graph librarys (such as petgraph) is sometimes helpful.

safety regarding scalar types: Checking type size is not enough #67
distinction of messages/data: Use enumeration to distinguish between samples and messages #68
combinators: Add combinators for blocks #69

Not using strings as port-names, but instead struct fields, which allows us to catch more errors during compile time.

you would typical use associated types for this (to reduce type overhead) and add type equality constraints when bounding parameters of chaining methods (like this https://stackoverflow.com/questions/55135637/how-to-require-that-the-associated-types-from-two-traits-are-the-same)

The whole story is often more complicated than on first sight but a middle-ground of compile/runtime checking is often achievable (similar to ndarray checking on dimension and storage but not dimension shapes; as this would require sophisticated const generics)

Now that GATs are stabilized, we can also bound the associated type's type and make sure that predecessors have the right scalar to operate on, but to understand the effects I need a prototype ...

This idea is a bit out there, but ports could be generics for their signal type, e.g. StreamPort<Complex>. This allows the compiler to even catch connection type errors and maybe even infer all connection types just from the output type of the first block?
And because a field has a type (e.g. StreamPort vs MessagePort), we can infer what kind of connection we are building, so no need for connect_*() but one connect() for all port types.

you can create type list like https://stackoverflow.com/a/40222903 and bound the inner (to have at least sanity checks in place) but rather should implement conversion traits for input/output types and bound them during chaining. An edge state with shared stream and message port could contain a ringbuffer and message queue and implement polling as variant on either one of them (and probably give priority to messages) You could then also require a state with queue for blocks implementing message behaviour.

Until now, I was mainly focused on getting things up and running (and convincing myself that the general approach is reasonable) and didn't have the time to explore this in detail.

thanks for doing this in Rust, necessary crab dance 🦀

src in your example is actually just some type of block identifier (at the moment a usize), since the block was already added to the flowgraph. To allow something like src.out, it would have to be something typed and specific to your block. How should that look like?

you could return a struct with block identifier + phantom markers for the input/output type and check the scalar type during connection

src in your example is actually just some type of block identifier (at the moment a usize), since the block was already added to the flowgraph. To allow something like src.out, it would have to be something typed and specific to your block. How should that look like?

no this is hard to do but you could do recursion on associated types (like https://stackoverflow.com/questions/40219725/constructing-hetereogenous-type-lists-in-rust/40222903#40222903) and add associated types for a Join composer

To overcome the previous issue, ports have to be in some sort of vector. This allows the flowgraph to access them through, for example, block.stream_output(0). But what type should that vector have? It couldn't be StreamPort for different, arbitrary xes (i.e., vec![a as StreamPort, b as StreamPort] is not possible) . It has to be behind a dynamically dispatched, generic port trait.

or enumeration over the inner type and conversion bounds to the input, but I need a prototype to check

It goes on and on and on like this. While things may seem trivial, implementing a typed API is not straightforward. I think there are good answers to all these issues, but one just has to give it a try.

haha I still have a headache when we implemented cross validation in linfa https://github.com/rust-ml/linfa/blob/master/src/dataset/impl_dataset.rs#L843 :D

Also concerning the syntax, an alternative: I have started to experiment to provide a simpler syntax: https://github.com/loic-fejoz/FutureSDR/tree/feature/flowgraph-macro/futuresdr-macros

the macro is great, can you also group them to construct DAGs in blocks?

That sounds like it's breaking Rust philosophy: If your flowgraph internally drops its src variable, your block identifier would be dangling. Wouldn't it be better if add_block() returned an immutable reference or a view or something similar? That way the compiler won't even allow those kinds of errors.

no because you can't dereference your identifier anymore when the graph drops (and add_node is takes ownership of the node)

I'm not really sure, I get your points. As I said, I'd love to have the API more typed but it's really not trivial. If you have concrete suggestions, your PRs are very welcome!

I will find spare time and experiment a bit, do you have a sample of previous attempts or a design document? I will also probably have questions regarding gnuradio and ask Nils

bastibl · 2022-10-04T15:05:28Z

I can just say, I looked into other code. And, for example, petgraph is great. But here, any block/node can have an arbitrary number of inputs and outputs and each input and output could have a different type. I don't see how this could be reflected in the type system... If you see a way to do it, please open a PR...

etschneider · 2022-10-08T17:47:03Z

I think this idea (pure compile time checking) is impossible if it is intended to be able to dynamically re/configure from an external source. For instance serializing a flowgraph, or an interactive flowgraph designer (like eSDR).

That said, I think a more structured block/port specification interface would be helpful, as opposed to arbitrary strings and integers.

bytesnake · 2022-11-06T10:30:06Z

I think this idea (pure compile time checking) is impossible if it is intended to be able to dynamically re/configure from an external source

you can specialize your implementation with a trait object when you want to configure during runtime

etschneider · 2022-11-06T19:11:29Z

I think this idea (pure compile time checking) is impossible if it is intended to be able to dynamically re/configure from an external source

you can specialize your implementation with a trait object when you want to configure during runtime

Yes, I suppose it would be possible to support both options somehow. I think I'd need to see an example to really grok it though.

bastibl closed this as completed Nov 5, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Let compiler infer types and ports in fg.connect() #66

Let compiler infer types and ports in fg.connect() #66

nils-werner commented Oct 1, 2022 •

edited

bastibl commented Oct 1, 2022 •

edited

JuliDi commented Oct 2, 2022 •

edited

loic-fejoz commented Oct 2, 2022

nils-werner commented Oct 2, 2022 •

edited

bastibl commented Oct 4, 2022

bytesnake commented Oct 4, 2022 •

edited

bastibl commented Oct 4, 2022

etschneider commented Oct 8, 2022 •

edited

bytesnake commented Nov 6, 2022

etschneider commented Nov 6, 2022

Let compiler infer types and ports in fg.connect() #66

Let compiler infer types and ports in fg.connect() #66

Comments

nils-werner commented Oct 1, 2022 • edited

bastibl commented Oct 1, 2022 • edited

JuliDi commented Oct 2, 2022 • edited

loic-fejoz commented Oct 2, 2022

nils-werner commented Oct 2, 2022 • edited

bastibl commented Oct 4, 2022

bytesnake commented Oct 4, 2022 • edited

bastibl commented Oct 4, 2022

etschneider commented Oct 8, 2022 • edited

bytesnake commented Nov 6, 2022

etschneider commented Nov 6, 2022

nils-werner commented Oct 1, 2022 •

edited

bastibl commented Oct 1, 2022 •

edited

JuliDi commented Oct 2, 2022 •

edited

nils-werner commented Oct 2, 2022 •

edited

bytesnake commented Oct 4, 2022 •

edited

etschneider commented Oct 8, 2022 •

edited