Global Network
#1840
Comments
I like this idea. I wonder if we should rename In parallel to this I'd like to zoom out a bit and think about what structures here are
It seems like there ought to be an API where the user could construct some basic data structure with the correct |
Doing this at compile-time would require dependent types (and require users produce proofs all over the place that e.g. two parsed-from-string addresses have the same network). But maybe there's a runtime solution:
Finally, [*] Ok, so the network encodings for things like |
Despite using the language of "constraints" and "unification" I actually think this could be implemented entirely in terms of bitmaps and bitwise AND. So initially the "network" field of each structure would be all-bits-1 "every network possible". Then:
Finally, we'd need some set of escape hatches -- a way to explicitly override the network, a way to say "ambiguous serialization is OK" (maybe we just want a bit in the map for this..), and maybe a way to directly set/clear network bits. I think, with all this in place, "most" users would have their network usage automatically checked for consistency, without them needing to explicitly think about it except maybe at startup time, and without them being encouraged to "fix" a network at the type level so that it's impossible to use their stuff on testnet. |
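A minimal sketch of the bitmap idea described above, assuming one bit per network. The `NetworkSet` type, its constants, and `unify` are hypothetical names made up for illustration and are not part of rust-bitcoin. Every structure starts as "every network possible", and combining two structures is a bitwise AND that fails when no network is left:

```rust
use std::fmt;

/// Hypothetical bitmask with one bit per network a value could belong to.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct NetworkSet(u8);

/// Returned when two constraints have no network in common.
#[derive(Debug, PartialEq, Eq)]
pub struct NetworkConflict {
    pub left: NetworkSet,
    pub right: NetworkSet,
}

impl fmt::Display for NetworkConflict {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "no common network: {:#06b} vs {:#06b}", self.left.0, self.right.0)
    }
}

impl std::error::Error for NetworkConflict {}

impl NetworkSet {
    pub const MAINNET: NetworkSet = NetworkSet(0b0001);
    pub const TESTNET: NetworkSet = NetworkSet(0b0010);
    pub const SIGNET: NetworkSet = NetworkSet(0b0100);
    pub const REGTEST: NetworkSet = NetworkSet(0b1000);
    /// All bits set: the initial "every network possible" state.
    pub const ANY: NetworkSet = NetworkSet(0b1111);

    /// Union of two sets, for encodings that several networks share.
    pub const fn union(self, other: NetworkSet) -> NetworkSet {
        NetworkSet(self.0 | other.0)
    }

    /// Combine two constraints with bitwise AND.
    /// An empty result means the two values cannot be on the same network.
    pub fn unify(self, other: NetworkSet) -> Result<NetworkSet, NetworkConflict> {
        match self.0 & other.0 {
            0 => Err(NetworkConflict { left: self, right: other }),
            bits => Ok(NetworkSet(bits)),
        }
    }

    /// True when exactly one network remains, i.e. serialization
    /// would not be ambiguous.
    pub fn is_unambiguous(self) -> bool {
        self.0.count_ones() == 1
    }
}
```

In this sketch an "explicit override" escape hatch would amount to replacing the set outright instead of calling `unify`, and "ambiguous serialization is OK" would amount to skipping the `is_unambiguous` check.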
IDK, bitcoind calls it chain but all other applications I know of call it network. I wouldn't change it. But maybe we should use Your ideas seem complicated so I will have to take more time to respond to them properly. My instinct is to not over-complicate but there may be good reasons. Also I think that if every input is gated on a valid network it should produce valid outputs. So we really only need this in Raw types like |
Right now from a user perspective,
I think that to achieve this we'll probably need something roughly as complicated as what I'm describing. Basically for the compiler to do it for us we'd need dependent types, and since we don't have them, we have to do our own run-time typing.
In your view, when converting a testnet address to a scriptpubkey, embedding this in a mainnet transaction, then broadcasting it to the mainchain, which step is the mistake and how should we determine that?
Hmmm, yeah, definitely agreed here. And it's even worse because every |
Hand wavey input: We should try really hard to make uses of |
Global variables break composability and lead to action at a distance. I understand the sentiment, as passing such a "context" seems like pure waste. I think generally contemporary programming languages have poor support for "context" passing. (Relevant: https://tmandry.gitlab.io/blog/posts/2021-12-21-context-capabilities/) But all in all global variables ... 🤮 ... There's already a long thread of complex ideas involved, and that is only internally. As a user now I'll have to pay attention to which APIs can be used with a flag, which can't, etc. And the poor developers who will have to debug issues due to someone somewhere abusing mutable global variables, etc. IMO, if application writers want a global, they can write it themselves. Even write a global-singleton wrapper around |
I was in the middle of a long response, but @dpc covered most of it. I strongly agree with dpc: I think this will bite us in the long run by breaking composability. It will increase the cognitive load on reviewers, who will have to think about "mutable global" things when looking at local changes. IMO the benefits are not that impressive either.
Yes, I think my proposal should mitigate much of the complication in the vast majority of cases - you just get
I admit this sucks but I don't really know how to fix it. We could add
Assuming the application is single-network (the vast majority of applications) there are two places where the mistake could happen: when parsing the address or when connecting to the network. We have the first covered now, so only the second remains. I think JSON RPC should have some methods to either check the network by calling Multi-network applications don't get protection but I don't remember ever seeing one, so it's safe to assume we don't need to bother.

@dpc, @sanket1729 firstly I need to say that I hate global mutable variables with a passion. But I think there is a case to be made for those which are loaded at the beginning of the program, set once and then never mutated again. Basically
Clearly marked in documentation and you can't use them by accident. Also you won't get bogus suggestions from RLS. Seems fine to me.
That can happen already. And banning setting the variable twice should make it less likely to happen. I would also be in favor of saving the
and
unfortunately this doesn't work well due to orphan rules (can't impl
The current
I don't understand what you mean, it looks like you meant "not adding"? |
If the goal is not to have to pass If the goal is to have
If someone really needs to have the parsing of
then also a normal newtype-wrapper:
maybe with For good or bad, this is how these sorts of things are handled in Rust all the time, and people are used to it. |
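A minimal sketch of the application-side approach this comment describes: the application owns its own set-once global and a newtype wrapper whose `FromStr` checks against it. Everything here is hypothetical and for illustration only: `UncheckedAddress` is a toy stand-in for a library address type, the `main:`/`test:` string format is invented, and `APP_NETWORK` and `AppAddress` are names an application might choose.

```rust
use std::str::FromStr;
use std::sync::OnceLock;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Network {
    Mainnet,
    Testnet,
}

#[derive(Debug, PartialEq, Eq)]
pub enum ParseError {
    BadFormat,
    NetworkNotConfigured,
    WrongNetwork { expected: Network, found: Network },
}

/// Toy stand-in for a library address whose network is not yet checked.
#[derive(Debug, PartialEq, Eq)]
pub struct UncheckedAddress {
    network: Network,
    payload: String,
}

impl FromStr for UncheckedAddress {
    type Err = ParseError;

    // Invented format: "main:<payload>" or "test:<payload>".
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        let (prefix, payload) = s.split_once(':').ok_or(ParseError::BadFormat)?;
        let network = match prefix {
            "main" => Network::Mainnet,
            "test" => Network::Testnet,
            _ => return Err(ParseError::BadFormat),
        };
        Ok(UncheckedAddress { network, payload: payload.to_owned() })
    }
}

/// Application-owned global; it can be set once and never changes afterwards.
pub static APP_NETWORK: OnceLock<Network> = OnceLock::new();

/// Newtype owned by the application, so orphan rules do not apply
/// when it implements foreign traits such as `FromStr`.
#[derive(Debug, PartialEq, Eq)]
pub struct AppAddress(UncheckedAddress);

impl AppAddress {
    pub fn payload(&self) -> &str {
        &self.0.payload
    }
}

impl FromStr for AppAddress {
    type Err = ParseError;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        let addr: UncheckedAddress = s.parse()?;
        let expected = *APP_NETWORK.get().ok_or(ParseError::NetworkNotConfigured)?;
        if addr.network != expected {
            return Err(ParseError::WrongNetwork { expected, found: addr.network });
        }
        Ok(AppAddress(addr))
    }
}
```

The global stays entirely inside the application, so library functions remain free of hidden state.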
I really don't think we do. What happens now is that you parse an address and then find that basic functionality is missing and then you have to carefully scan the docs to figure out what generics you need, and what functions to call to get those generics. IMO this generic solution is undiscoverable and confusing. If we want to make the user check the network at parse time we should get rid of If we want to make it so that the user can just check the network and have that check propagated with the address from there on out, we basically cannot do it with the Rust type system. At best we can propagate a binary "I checked the network" which is what we've tried to do, but the result is noisy and hard to use. |
Yeah, that one would be probably not enough.
I don't think so. Reading the docs is a fundamental part of development. I read them all the time. We have a whole section about it in the docs. If people come up with specific ways the section can be improved I'd be happy to.
The issue is people use
It will lead to something worse: moving validation further down the line which makes error handling and debugging harder.
They are literally in the docs. But it just occurred to me that we could add a note to each of them that they need checked address. |
I don't read documentation end-to-end and I don't think most anybody else does either. If I need a function then I scroll (or more likely search) the list of available functions. If search yields no results then I look for other functions that seem to need the missing functionality and read their source code. If searching finds something, but it's gated on some set of generics I'll get annoyed, then find source code to read. If this happens too many times I just drop the library, or revert to some old version before they started doing crazy architecture experiments. (I use Only for well-maintained and mature software like serde (and rust-bitcoin is definitely not mature!) would I consider actually reading the documentation. Otherwise that sounds like a good way to waste a bunch of time without actually learning how to use the library. Having said this, I checked the
I take your point about
It will lead to moving validation to somewhere where it's logically needed, instead of making users deal with it when they don't have to and won't understand why they have to. |
nACK on the idea of adding a global to the library. Like the OP says, applications do that, I don't think libraries should introduce sidechannels in tons of functions. Our functions should be deterministic. It's really not hard for an application to have a |
@apoelstra could you hint what in
I don't see any mention of type safety there. It's for user safety. When I upgraded rust-bitcoin after we forced the check I discovered several other instances missing it! Also I noticed a similar bug in LND - if they used a library forcing the check they wouldn't have it. Note that I personally had to help numerous people recover their coins sending to a wrong address (was BTC vs LTC but still same category of issue). Not only that, I once was the one who hastily forgot to manually check the address even when being aware of the problem (was an ATM and the manufacturer was slow fixing this). So forgetting a check like this is a serious problem that is unsolvable for 99% of users. In my eyes, this is totally worth a bit more annoying API - because it's the correct API. Programming is complex and rather than hiding this complexity under the carpet where it'll kick you in the ass later we should make tools to deal with it. Especially in security-critical software like Bitcoin.
That's enough. It's similar to Rust not auto-inserting
If this is actually logical in user code (I highly doubt so) they can just pass around @stevenroose we would still have the same deterministic functions, just in addition we would have simpler options. As I say single-network applications are extremely common. Actually, the only one I know that isn't is Ride The Lightning but that one doesn't directly work on Bitcoin stuff, just makes API calls into LN node. Every single wallet, Bitcoin node, LN node, ... everything else is like this. So why should everyone reinvent this when it can be in the library? Especially considering orphan rules? If not for orphan rules and annoyance of extension traits I'd be onboard with putting it into a separate library.
That's actually a terrible design since you have to rebuild the whole application if you want to test something. (Although it's a billion times better than having it both hard-coded and scattered around the code. Yes, sadly, there's one such popular wallet with mainnet hard-coded. No, I have no clue how they test it.) Having these helper functions could actually incentivize people to design the application properly. Even if they hard code |
Note: I only read the documentation when I'm investigating something specific. Modern life is just too complex and busy. And as a maintainer of a handful of open source projects I've noticed that just because something is documented does not prevent a large chunk of users from routinely being unaware of it, even if you put it in bold capital letters front and center. That's why I like the type system forcing the user's hand, or at least long informative names bringing attention to certain aspects.
Agreed. As a user I'm happy to deal with a few extra type steps if it ensures correctness. I'm only allergic to global values breaking composition, and introducing a conditionally available set of APIs. I've already mentioned 2 separate "If the goal is". This seems to be the 3rd one. If the goal is to force users (developers) to verify the network of an address, session types seem like the way to go:
etc. Then if the crucial places at which point the address should have been already verified, require A runtime flag is also an option, if the API noise of the above seems undesirable. |
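A minimal sketch of the session-type (type-state) pattern being discussed, assuming a toy address type. The names `Unchecked`, `Checked`, `check_network`, and `script_payload` are hypothetical and chosen for illustration; they are not meant to mirror the library's actual API. The point is that the operation that matters is only defined on the checked state, so skipping the check is a compile error:

```rust
use std::marker::PhantomData;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Network {
    Mainnet,
    Testnet,
}

/// Marker types for the two states; they carry no data.
#[derive(Debug)]
pub struct Unchecked;
#[derive(Debug)]
pub struct Checked;

#[derive(Debug, PartialEq, Eq)]
pub struct WrongNetwork {
    pub expected: Network,
    pub found: Network,
}

/// Toy address whose type parameter records whether the network was verified.
#[derive(Debug)]
pub struct Address<State> {
    network: Network,
    payload: String,
    _state: PhantomData<State>,
}

impl Address<Unchecked> {
    /// Stand-in for parsing or deserializing: the result is always unchecked.
    pub fn new_unchecked(network: Network, payload: &str) -> Self {
        Address { network, payload: payload.to_owned(), _state: PhantomData }
    }

    /// The only way to reach the checked state in this sketch.
    pub fn check_network(self, expected: Network) -> Result<Address<Checked>, WrongNetwork> {
        if self.network == expected {
            Ok(Address { network: self.network, payload: self.payload, _state: PhantomData })
        } else {
            Err(WrongNetwork { expected, found: self.network })
        }
    }
}

impl Address<Checked> {
    /// Only available after the check, so calling it on an
    /// `Address<Unchecked>` does not compile.
    pub fn script_payload(&self) -> &str {
        &self.payload
    }
}
```

As noted elsewhere in the thread, this only propagates a binary "the network was checked" fact; it does not stop two checked addresses from different networks being mixed.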
For sure.
But it's not correct, since it won't actually prevent you confusing two (checked) addresses that correspond to different networks, and because it won't let you do perfectly safe things like deserializing then re-serializing an address without adding extra Sure, in practice this is good enough for safety, but because it's ad-hoc and "good enough" there's no mental model the user can have that will let them predict what they're able to do and not do with |
Isn't this very reasonable? In a general case if you serialize you lose a lot of context, metadata, and sometimes even more core properties (ordering, types, etc.). If you deserialize, you can't really be sure if you got correct addresses. You can assume it if you trust the source or have some other pre-existing knowledge. As for mixing checked addresses from different networks: it's not unlike core tenets of Rust like e.g. So being a little bit "ad-hoc" does not seem like a big deal to me. It's just a pragmatic tradeoff. Forgetting to check the address is a common source of important bugs, so you guide the developer to do the right thing as much as practical.
You can lead a horse to water. Nothing prevents Rust developers from throwing random By having explicit guide rails and checks, it makes it easier for other people to verify it when reviewing the code. |
@dpc I'm talking about deserializing then serializing. There is no data loss here. @Kixunil I'd be fine with an API where you could do anything with an unchecked address except get a scriptpubkey out of it. And this should be documented. Right now the |
@dpc the goal is to make it easier to use the We already use session types, so I'm not sure what you mean by that example.
I can't even imagine hypothetical scenario where an application needs to check equality of networks of two addresses while at the same time not check it against some global network setting. As I said, there's at most one application I know of which can't be excluded straight away.
Rust can't prevent all bugs and yet people use Rust just fine. I think this is just guessing. To conclusively determine whether the API works we need to ask a bunch of people who actually use it. Notably, we should probably also document that
Yeah, this was my original idea as well. The cool thing is we can do this now without breaking anything. I'm pretty open to this, will think about it more.
😢 This is pretty strong evidence for the proposal above but note that a bunch of upgrade problems were caused by lack of |
Ok! Cool, I think we're (sufficiently) in agreement. Let's try to work toward a "
Oh, phew :). |
My bad, sorry. I'm out of context, got here because of "global variables", haven't even checked, and didn't use Looking ... Wouldn't you want some Also - some Seems like the documentation and API insist on using the generic methods, but that's not necessary. Generic methods are required when dealing with abstractions that need them. For concrete use:
seems handier than:
and autocompletion might suggest it due to common prefix.
Is there somewhere I can read about the biggest pain points? I don't understand "easier to use the Address API in contexts that are affected by orphan rules".
But there's a context-loss. |
Yeah, that could help. Will be happy to accept a PR after I decide if I like
Yeah, it's a bit shorter and if people already like to say
We don't implement I also realized there's an additional utility in having this behind a feature flag: if you decide to refactor the code to remove the global, just turn it off and fix all compile errors. |
But that's the whole point? When you parse/deserialize, you don't know if what you're parsing is actually valid. So you have to pause, scratch your head, maybe read some of the docs, and come up with a solution. You can use You can use some hypothetical You can use some hypothetical At least 3 decent options, that are all local, and fairly standard. The |
Not really following entirely, but from some snippet above: it should definitely be |
I just realized that many applications allow the user to configure them for a single network which is then used in the whole program. One would normally parse the network from configuration and then pass it around in arguments. This can be a bit annoying.
Maybe we could also provide a global (internally `AtomicU8`) that user could set in `main` and then use alternative functions that don't require `Network`. The functions would obviously need to panic if the network is not set. To avoid accidental use by libraries (and also `target_has_atomic` issues) this should be behind a feature flag. When the flag is activated we could also provide an implementation of `FromStr`/`Deserialize` for `Address<NetworkChecked>`. (`NetworkUnchecked` should keep its behavior regardless of the flag.)

Note that this is intended for applications only, libraries should continue to support arguments. Also our library should support both interfaces.