-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Oracle VM changes #1243
Oracle VM changes #1243
Conversation
Draft
@neo-project/core @neo-project/ngd-shanghai @realloc Please take a look to this changes |
Can you draw a diagram to describe how it works? |
@erikzhang I updated the description, it's based on our main proposal. The VM and the mempool is equal in all the proposal that we discussed, so we need to define the consensus mechanism (not included in this PR). |
|
Is this method asynchronous? The download may take a long time. |
The idea is that OracleNodes will take the Transaction if this tx contains the So, it's sync but it will be executed in other thread. |
If it is synchronous, the download progress will block the execution of the other transactions. |
The syscall in self is synchronous, how is called, could be asynchronous, but the VM script must be synchronous.
No, because the consensus only will take the OracleTX if they have the Oracle Result, so in this moment there are no download procedures, all is cached and agreed by Oracle Nodes. This is the flow:
|
switch (usage) | ||
{ | ||
case TransactionAttributeUsage.OracleExpectedResult: | ||
{ | ||
attrib = new OracleExpectedResult(); | ||
break; | ||
} | ||
default: throw new FormatException(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use Activator.CreateInstance()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I followed the same pattern as NodeCapability
/// <param name="filter">Filter</param> | ||
/// <param name="output">output</param> | ||
/// <returns>True if was filtered</returns> | ||
public static bool FilterJson(string input, string filter, out string output) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need filter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In order to reduce the size and the agreement.
Imagine this Response
{"name":"neo":"timestamp":1231239139123}
Every download you get different values, but the SC only want the "name" property, so the OracleNodes have an easy work
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SYSCALL should not do too much work, otherwise it will be difficult to price.
We should create an oracle standard that requires a timestamp or id on the Url, and then asks the same request to return the json must be exactly the same. The filtering work should be handed over to the contract.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But then all the oracles will be centralized by the projects, because they will need to put a web service for handle the request, with filters we can consume third party services also, like coinmarketcap for example
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In Oracle case, shouldn't the syscall actually do nothing but take the result from OracleResultCache for CN and regular nodes?
The actual work is done only by Oracle nodes, but they get paid for resulting Oracle Tx, so pricing should not be the problem here. Also, filtering on SC side would increase the OracleTx size and, as a result, increase Oracle usage price and blockchain disk consumption.
Shouldn't Oracle, serving as a way to talk to external world, do as much as possible on Oracle side to decrease the load on SC and blockchain, improve Developer's Experience and provide better support for external world connectivity?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It won't be centralized. We shall have a standard. In fact, your PR has asked all the returns to be JSON, which is actually part of the standard I am talking about. We also need some additional specifications. For example, it is required to return the exact same response for the same request.
As for filtering, we have the JSON API, so the contract itself can easily filter JSON.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why filtering specifically for oracles? Arent we doing that already as callbacks? I mean, this has general application, not only for oracles.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@igormcoelho Filtering for Oracles should be done to save space in blockchain by recording only the filtered result, reduce Oracle Tx size and, indirectly, increase TPS.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean, its not important just for oracles... its a general purpose approach for filtering data on smart contracts, and if this is the direction, we may need to extend other things as well. I responded on oracle filter issue with more details.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we move the discussion to the issue? #1259
public enum HTTPMethod : byte | ||
{ | ||
GET = 0, | ||
POST = 1, | ||
PUT = 2, | ||
DELETE = 3 | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can support GET
only.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For us is easy to allow multiple methods, why we should limit this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The built-in Oracle should be as simple as possible, don't do a lot of complicated things. PUT
and DELETE
are rarely used. The problem with POST
is that we may also need to submit additional data, we also need to encode the data, and then the encoding must be consistent. This will add too much complexity. In fact, GET
can already meet most of the needs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
GET
is good for me, but I think that NeoFS require more than this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
POST is used as base method for JSON RPC and most REST APIs. Oracles would be used for integration with external systems, sot supporting full set of HTTP methods would be beneficial. At least GET and POST, as most widely used, should be there for sure, aren't they?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As for NeoFS, it must be decoupled from Oracle.
Oracle is the interaction protocol between blockchain and external entities. NeoFS is an external entity. This is the reason why it is efficient to store data in distributed storage and not in blockchain itself. Consider getting data from the internet web page and from the NeoFS - both cases require interaction outside of the blockchain. Oracle designed to solve exactly this issue, so maybe there is no need to decouple NeoFS from the Oracle.
Various protocol support (http-oracle and neofs-oracle for example) will also confirm viability of Oracle protocol.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
HTTP POST semantically, its function is to submit data to the server. So theoretically the same data should only be submitted once. But our Oracle is a distributed system with multiple nodes doing the same job. In this case, I don't think POST is still following the original semantics. Oracle should only get the data, in which case GET is sufficient.
Not exactly. Even if we're to look at the old RFC 2616 that says this:
The POST method is used to request that the origin server accept the entity enclosed in the request as a new subordinate of the resource identified by the Request-URI in the Request-Line.
it at the same time says that
The actual function performed by the POST method is determined by the server and is usually dependent on the Request-URI.
And more modern versions of the standard (RFC 7231) are even more liberal in their interpretation of POST
(or just following the line of what is really done via POST
on the net):
The POST method requests that the target resource process the representation enclosed in the request according to the resource's own specific semantics.
So it basically moves these questions to the upper layer and we have quite a lot of upper layers on top of HTTP these days.
Like JSON RPC, let's say we're making a bet with @realloc on whether there will be a 0.70.0 release of neo-go before the end of November. For a release proof we need to make a JSON RPC getversion
call to our mainnet nodes, so we need to be able to do that via our new shiny Oracles and if they can't do a POST
we're out of luck.
Also note that while in our bet example we could probably arrange some GET
-able proof for the matter, many of the endpoints providing data may be outside of the smart contract developer control and may not have any other means to get data but making a POST
request.
So I think there is no point in limiting our HTTP Oracles to just GET
requests, support for other methods doesn't break any standard compliance and there are useful real-world scenarios that require us to have broader spectrum of supported methods.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Simple. Decoupled. These two points are my most basic requirements for both Oracle and NeoFS.
Simple, the simpler the better. No POST
, only GET
.
Decoupled. I need NeoFS to work without Oracle.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If NeoFS will work not through Oracle protocol but a simple syscall it may halt the execution of SC while waiting for NeoFS response. Maybe it would be more reliable to have all interactions with external systems go via a single Oracle protocol. It could be the simplest solution in the end.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@realloc Can you create a new issue and list the features that NeoFS will offer to smart contracts? (I think it may be a SYSCALL list.) In this way, we can easily discuss the specific implementation and whether it can be decoupled.
There is an attack method. Send a lot of |
Co-Authored-By: Erik Zhang <erik@neo.org>
The oracles nodes should agree that this is a fault (Server,Timeout...), and these TX must to enter in the next block. When the CN try to access to the DownloadCache they will see that the download returns a fault. |
|
||
// Try again, because it possible that the SC use the attributes of the TX | ||
|
||
return MakeTransaction(snapshot, script, attributes, cosigners, balances_gas, OracleExecutionType.WithoutOracles); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is for me the complicated work, the SC can change the required fee according to the downloaded result, and the attributes attached. So maybe we should enable to set the fee as a parameter.
If you submit a large number of transactions, visit a specially designed Url. Access from this Url will be blocked for 30 seconds and then an error will be returned. This is a possible way of attack. |
The Download process have a timeout, and if this happens, they will throw a Timeout error. And the attacker must pay the fee. Also the regular CN still working, Oracle must have his own consensus and only provide results to CN |
If the download process times out, won't the transaction be discarded from the memory pool? |
No, the idea is that the current memory pool will wait for the result of the oracle agreement, when they have the result the transaction (with content or error) the tx will be validated and be able to be chosen by the CN, without any download procedure. In this way the tx could wait more, but it doesn't affect the current consensus. |
@erikzhang, the line of reasoning we thought was this, if the transactions fail it will be published anyway because it is an agreement. The A possible attack can be made by the |
#TODO:
This proposal does not need Virtual Machine modifications. It only needs to add a the new download syscall. This Syscall gets from URL arguments and xpath filter.
Thus, we reduce stored content in
DownloadExecutionCache
to the minimum needed for Smart Contract. As an example, only the filtered content is agreed and stored, optimizing storage space, meaning 1 MB JSON will not be stored into the chain but just the values the smart contracts needs.XPATH filter also helps to achieve determinism in webs/APIs that have variations in each request and these variations are outside the data that Smart Contract needs.
As an example, the following API’s response
{“name”:”NEO”,”time”:213123123213}
where time varies in each request, but the Smart Contract only needs “name”, this non determinism problema won’t exist during Oracle negotiation.(* Note: Maximum content size and total TBD).
ApplicationEngine execution
As you can see in the following diagram, during a Smart Contract execution will check if the result exits in the
DownloadExecutionCache
every time Oracle’s syscall is used.If the node represents an authorized Oracle or a client building his/her transactions, it will download the content, apply the filter and generate
DownloadExecutionCache
.In case the client is building the transaction, it will include results hash in
OracleExpectedResult
before signing and broadcasting the transaction.During execution, the virtual machine (syscall) will return “Failure” if there is no total or partial
DownloadExecutionCache
, or if the final OracleAgreement is false.