Oracle VM changes #1243

shargon · 2019-11-14T21:13:24Z

This PR contains the classes and methods related to the execution of Oracle syscalls.
It does not contain any code related to Oracle consensus or the Oracle pool.

#TODO:

Review the Filters implementation
OracleService requires a policy (in a smart contract)
OracleService should cache results so that the same call made from different TXs is not repeated

This proposal does not require Virtual Machine modifications. It only needs the new download syscall, which takes two arguments: a URL and an XPath filter.

URL is the destination address where the Oracle will download the data.
XPATH filter is used to shorten downloaded content and ease access to information for developers. It can be used in JSON or HTML responses.

Thus, we reduce the content stored in DownloadExecutionCache to the minimum the Smart Contract needs. Only the filtered content is agreed on and stored, which saves storage space: a 1 MB JSON response is not stored on the chain, only the values the Smart Contract needs.
The XPATH filter also helps achieve determinism for websites/APIs whose responses vary between requests, as long as the variations fall outside the data the Smart Contract needs.
For example, given the API response {"name":"NEO","time":213123123213}, where time varies in each request but the Smart Contract only needs "name", the non-determinism problem does not arise during Oracle negotiation.
(* Note: Maximum content size and total TBD).
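
To make the determinism argument concrete, here is a minimal sketch in Python (not the PR's C# code). The helper names `filter_json` and `result_hash`, the single top-level-field filter, and the use of SHA-256 are all assumptions made for illustration; the PR does not specify them.

```python
import hashlib
import json


def filter_json(body: str, field: str) -> str:
    """Keep one top-level field of a JSON document (hypothetical stand-in for the PR's filter)."""
    value = json.loads(body)[field]
    # Serialize canonically so that every node produces identical bytes.
    return json.dumps(value, sort_keys=True, separators=(",", ":"))


def result_hash(content: str) -> str:
    """Hash the content that the oracle nodes would have to agree on (SHA-256 is an assumption)."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


# Two downloads of the same endpoint; only the "time" value differs.
first = '{"name":"NEO","time":213123123213}'
second = '{"name":"NEO","time":213123129999}'

# Without a filter the two nodes would disagree on the hash.
unfiltered_match = result_hash(first) == result_hash(second)

# With the filter both nodes hash the same bytes.
filtered_match = (
    result_hash(filter_json(first, "name")) == result_hash(filter_json(second, "name"))
)
```

Under these assumptions `unfiltered_match` is false and `filtered_match` is true, which is the property the filter is meant to provide.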

ApplicationEngine execution

As you can see in the following diagram, during Smart Contract execution the engine checks whether the result exists in the DownloadExecutionCache every time the Oracle syscall is used.
If the node is an authorized Oracle or a client building their transaction, it downloads the content, applies the filter and generates the DownloadExecutionCache.
If a client is building the transaction, it includes the hash of the results in OracleExpectedResult before signing and broadcasting the transaction.
During execution, the virtual machine (syscall) returns "Failure" if there is no total or partial DownloadExecutionCache, or if the final OracleAgreement is false.
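
The lookup described above can be sketched roughly as follows, in Python rather than the PR's C#. The class name `ExecutionCacheSketch`, its constructor parameters, and the `(url, filter)` cache key are hypothetical; the sketch only shows the "use cache, else download if allowed, else fail" behaviour.

```python
from typing import Callable, Dict, Optional, Tuple

CacheKey = Tuple[str, str]  # (url, filter); assumed key shape


class ExecutionCacheSketch:
    """Hypothetical stand-in for the execution cache; not the PR's class."""

    def __init__(
        self,
        downloader: Optional[Callable[[str], str]] = None,
        apply_filter: Callable[[str, str], str] = lambda body, flt: body,
    ) -> None:
        # downloader is None on nodes that must not download (regular nodes, CNs).
        self._downloader = downloader
        self._apply_filter = apply_filter
        self._results: Dict[CacheKey, str] = {}
        self.downloads = 0

    def get(self, url: str, flt: str) -> Optional[str]:
        """Return the filtered result, or None when the syscall should report failure."""
        key = (url, flt)
        if key in self._results:
            return self._results[key]      # repeated calls are served from the cache
        if self._downloader is None:
            return None                    # no cached result and no right to download
        self.downloads += 1
        filtered = self._apply_filter(self._downloader(url), flt)
        self._results[key] = filtered      # only the filtered content is kept
        return filtered
```

An Oracle node or a wallet building a transaction would construct the cache with a downloader; a node that only replays agreed results would construct it without one and get `None` for anything that was not supplied to it.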

Draft

shargon · 2019-11-16T11:59:03Z

@neo-project/core @neo-project/ngd-shanghai @realloc Please take a look at these changes.

erikzhang · 2019-11-17T13:40:32Z

Can you draw a diagram to describe how it works?

shargon · 2019-11-17T17:10:08Z

@erikzhang I updated the description; it's based on our main proposal. The VM and the mempool are the same in all the proposals we discussed, so we still need to define the consensus mechanism (not included in this PR).

shargon · 2019-11-17T17:13:50Z

OracleResult => We need to reach consensus on this
OracleExpectedResult => Transaction attribute; specifies the expected hash, or not
OracleExecutionCache => Contains the execution result

erikzhang · 2019-11-19T06:17:14Z

Is this method asynchronous? The download may take a long time.

shargon · 2019-11-19T08:45:01Z

> Is this method asynchronous? The download may take a long time.

The idea is that Oracle nodes will take the transaction if it contains the OracleExpectedHash attribute, and the oracles will have a pool (queue) to execute these TXs with a specific timeout. Once they have executed each TX, they have the class used for the agreement, OracleResult.

So it's sync, but it will be executed in another thread.

erikzhang · 2019-11-19T08:55:11Z

If it is synchronous, the download progress will block the execution of the other transactions.

shargon · 2019-11-19T09:46:40Z

The syscall itself is synchronous. How it is called could be asynchronous, but the VM script must be synchronous.

> If it is synchronous, the download progress will block the execution of the other transactions.

No, because consensus will only take an OracleTX once it has the Oracle result, so at that moment there are no download procedures; everything is cached and agreed on by the Oracle nodes. This is the flow:

Receive a TX.
Check whether it's an OracleTX (TransactionAttributes => OracleResult).
If not, regular behavior.
If it is an OracleTX, it won't be verified until the OracleResult has been agreed on and received by all CNs.
Now this TX can be taken and included in a block with its cache, because there are no downloads.
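
A rough sketch of that flow, in Python rather than the PR's C#. `TxSketch`, `MempoolSketch` and the `is_oracle` flag are hypothetical names; in the PR an oracle TX is recognised through a transaction attribute, which the flag stands in for here.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TxSketch:
    tx_hash: str
    is_oracle: bool  # stands in for "carries the oracle transaction attribute"


@dataclass
class MempoolSketch:
    verified: List[TxSketch] = field(default_factory=list)       # selectable by consensus
    waiting: Dict[str, TxSketch] = field(default_factory=dict)   # parked oracle TXs

    def on_transaction(self, tx: TxSketch) -> None:
        if not tx.is_oracle:
            self.verified.append(tx)        # regular behavior
        else:
            self.waiting[tx.tx_hash] = tx   # not verified until the result is agreed on

    def on_oracle_result(self, tx_hash: str) -> None:
        # Called once the agreed result (content or error) has been received.
        tx = self.waiting.pop(tx_hash, None)
        if tx is not None:
            self.verified.append(tx)        # no download is needed from here on
```

The point of the split is that consensus only ever reads from `verified`, so a slow download delays one oracle TX without blocking the others.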

neo/IO/Helper.cs

neo/Network/P2P/Payloads/Transaction.cs

erikzhang · 2019-11-19T13:01:22Z

neo/Network/P2P/Payloads/TransactionAttribute.cs

+            switch (usage)
+            {
+                case TransactionAttributeUsage.OracleExpectedResult:
+                    {
+                        attrib = new OracleExpectedResult();
+                        break;
+                    }
+                default: throw new FormatException();
+            }


Use Activator.CreateInstance()?

I followed the same pattern as NodeCapability

neo/Network/P2P/Payloads/TransactionAttributeUsage.cs

neo/Oracle/OracleExecutionCache.cs

erikzhang · 2019-11-19T13:12:38Z

neo/Oracle/OracleFilters.cs

+        /// <param name="filter">Filter</param>
+        /// <param name="output">output</param>
+        /// <returns>True if was filtered</returns>
+        public static bool FilterJson(string input, string filter, out string output)


Why do we need filter?

In order to reduce the size and simplify the agreement.
Imagine this response:

{"name":"neo","timestamp":1231239139123}

Every download returns a different value, but the SC only wants the "name" property, so the Oracle nodes have an easy job.

SYSCALL should not do too much work, otherwise it will be difficult to price.

We should create an oracle standard that requires a timestamp or id in the URL, and that requires the JSON returned for the same request to be exactly the same. The filtering work should be handed over to the contract.

But then all the oracles will be centralized by the projects, because they will need to set up a web service to handle the requests. With filters we can also consume third-party services, like coinmarketcap for example.

In Oracle case, shouldn't the syscall actually do nothing but take the result from OracleResultCache for CN and regular nodes?

The actual work is done only by Oracle nodes, but they get paid for resulting Oracle Tx, so pricing should not be the problem here. Also, filtering on SC side would increase the OracleTx size and, as a result, increase Oracle usage price and blockchain disk consumption.

Shouldn't Oracle, serving as a way to talk to external world, do as much as possible on Oracle side to decrease the load on SC and blockchain, improve Developer's Experience and provide better support for external world connectivity?

It won't be centralized. We shall have a standard. In fact, your PR has asked all the returns to be JSON, which is actually part of the standard I am talking about. We also need some additional specifications. For example, it is required to return the exact same response for the same request.

As for filtering, we have the JSON API, so the contract itself can easily filter JSON.

Why filtering specifically for oracles? Aren't we doing that already as callbacks? I mean, this has general application, not only for oracles.

@igormcoelho Filtering for Oracles should be done to save space in blockchain by recording only the filtered result, reduce Oracle Tx size and, indirectly, increase TPS.

I mean, it's not important just for oracles... it's a general-purpose approach for filtering data on smart contracts, and if this is the direction, we may need to extend other things as well. I responded on the oracle filter issue with more details.

Can we move the discussion to the issue? #1259

erikzhang · 2019-11-19T13:16:15Z

neo/Oracle/Protocols/HTTP/OracleHTTPRequest.cs

+        public enum HTTPMethod : byte
+        {
+            GET = 0,
+            POST = 1,
+            PUT = 2,
+            DELETE = 3
+        }


I think we can support GET only.

For us it is easy to allow multiple methods; why should we limit this?

The built-in Oracle should be as simple as possible, don't do a lot of complicated things. PUT and DELETE are rarely used. The problem with POST is that we may also need to submit additional data, we also need to encode the data, and then the encoding must be consistent. This will add too much complexity. In fact, GET can already meet most of the needs.

GET is good for me, but I think that NeoFS requires more than this.

POST is used as the base method for JSON RPC and most REST APIs. Oracles would be used for integration with external systems, so supporting the full set of HTTP methods would be beneficial. At least GET and POST, as the most widely used, should be there for sure, shouldn't they?

As for NeoFS, it must be decoupled from Oracle.

Oracle is the interaction protocol between the blockchain and external entities. NeoFS is an external entity. This is the reason why it is efficient to store data in distributed storage and not in the blockchain itself. Consider getting data from a web page on the internet and from NeoFS: both cases require interaction outside of the blockchain. Oracle is designed to solve exactly this issue, so maybe there is no need to decouple NeoFS from the Oracle.

Various protocol support (http-oracle and neofs-oracle for example) will also confirm viability of Oracle protocol.

Semantically, the function of HTTP POST is to submit data to the server, so in theory the same data should only be submitted once. But our Oracle is a distributed system with multiple nodes doing the same job. In this case, I don't think POST still follows the original semantics. Oracle should only get the data, in which case GET is sufficient.

Not exactly. Even if we're to look at the old RFC 2616 that says this:

The POST method is used to request that the origin server accept the entity enclosed in the request as a new subordinate of the resource identified by the Request-URI in the Request-Line.

it at the same time says that

The actual function performed by the POST method is determined by the server and is usually dependent on the Request-URI.

And more modern versions of the standard (RFC 7231) are even more liberal in their interpretation of POST (or just following the line of what is really done via POST on the net):

The POST method requests that the target resource process the representation enclosed in the request according to the resource's own specific semantics.

So it basically moves these questions to the upper layer and we have quite a lot of upper layers on top of HTTP these days.

Like JSON RPC, let's say we're making a bet with @realloc on whether there will be a 0.70.0 release of neo-go before the end of November. For a release proof we need to make a JSON RPC getversion call to our mainnet nodes, so we need to be able to do that via our new shiny Oracles and if they can't do a POST we're out of luck.

Also note that while in our bet example we could probably arrange some GET-able proof for the matter, many of the endpoints providing data may be outside of the smart contract developer control and may not have any other means to get data but making a POST request.

So I think there is no point in limiting our HTTP Oracles to just GET requests, support for other methods doesn't break any standard compliance and there are useful real-world scenarios that require us to have broader spectrum of supported methods.

Simple. Decoupled. These two points are my most basic requirements for both Oracle and NeoFS.

Simple, the simpler the better. No POST, only GET.

Decoupled. I need NeoFS to work without Oracle.

If NeoFS will work not through Oracle protocol but a simple syscall it may halt the execution of SC while waiting for NeoFS response. Maybe it would be more reliable to have all interactions with external systems go via a single Oracle protocol. It could be the simplest solution in the end.

@realloc Can you create a new issue and list the features that NeoFS will offer to smart contracts? (I think it may be a SYSCALL list.) In this way, we can easily discuss the specific implementation and whether it can be decoupled.

erikzhang · 2019-11-19T13:22:58Z

There is an attack method.

Send a lot of OracleTx, including some Urls that cannot be accessed. Since the Urls are inaccessible, the transactions will not enter the block and will always be in the memory pool until timeout. And these transactions do not actually need to pay any fees.


shargon · 2019-11-19T13:50:43Z

> Send a lot of OracleTx, including some Urls that cannot be accessed. Since the Urls are inaccessible, the transactions will not enter the block and will always be in the memory pool until timeout. And these transactions do not actually need to pay any fees.

The Oracle nodes should agree that this is a fault (Server, Timeout...), and these TXs must enter the next block. When the CNs try to access the DownloadCache, they will see that the download returned a fault.

shargon · 2019-11-19T13:57:15Z

neo/Wallets/Wallet.cs

+
+                        // Try again, because it possible that the SC use the attributes of the TX
+
+                        return MakeTransaction(snapshot, script, attributes, cosigners, balances_gas, OracleExecutionType.WithoutOracles);


For me this is the complicated part: the SC can change the required fee depending on the downloaded result and the attached attributes. So maybe we should allow the fee to be set as a parameter.

erikzhang · 2019-11-19T14:58:19Z

Suppose you submit a large number of transactions that visit a specially designed URL. Access to this URL blocks for 30 seconds and then returns an error. This is a possible way of attack.

shargon · 2019-11-19T19:00:35Z

> Suppose you submit a large number of transactions that visit a specially designed URL. Access to this URL blocks for 30 seconds and then returns an error. This is a possible way of attack.

The download process has a timeout; if it is reached, a Timeout error is thrown. And the attacker must pay the fee.

Also, the regular CNs keep working; Oracle must have its own consensus and only provide results to the CNs.

erikzhang · 2019-11-20T04:17:03Z

If the download process times out, won't the transaction be discarded from the memory pool?

shargon · 2019-11-20T08:49:03Z

> If the download process times out, won't the transaction be discarded from the memory pool?

No, the idea is that the current memory pool waits for the result of the oracle agreement. Once the result for the transaction is available (content or error), the TX is validated and can be chosen by the CNs, without any download procedure. This way the TX may wait longer, but it doesn't affect the current consensus.
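
The "content or error" idea can be sketched as below, in Python rather than the PR's C#. `OracleOutcome` and `run_download` are hypothetical names, and the error labels simply reuse the "Server" and "Timeout" categories mentioned earlier in this thread. The point is that a failed download still produces a result that can be agreed on, so the TX is not left without one.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class OracleOutcome:
    """Either content or an error label; both are results that can be agreed on."""
    content: Optional[str] = None
    error: Optional[str] = None


def run_download(
    download: Callable[[str, float], str],
    url: str,
    timeout_seconds: float,
) -> OracleOutcome:
    """Run a download and turn failures into an outcome instead of raising."""
    try:
        return OracleOutcome(content=download(url, timeout_seconds))
    except TimeoutError:
        # TimeoutError is a subclass of OSError, so it has to be handled first.
        return OracleOutcome(error="Timeout")
    except OSError:
        return OracleOutcome(error="Server")
```

Here `download` is any callable that raises `TimeoutError` when the limit is exceeded; which HTTP client is used, and how the fee is charged for an error outcome, is outside this sketch.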

vncoelho · 2019-11-21T14:36:32Z

@erikzhang, our line of reasoning was this: if the transaction fails, it will be published anyway, because the failure is itself an agreement.

The Speaker can propose the TXs and their expected hash; if the other M-1 Backups agree, the transaction will be published as successful. Otherwise, it will fail because of inconsistency during the Byzantine agreement.

The Speaker could mount an attack in order to waste the fees of oracle TXs. However, I think it is quite easily detectable, and NEO holders would vote against that bad actor.
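
As a small illustration of the agreement rule above (Python, not the PR's C#): the function name `oracle_agreement` is hypothetical, and the threshold `m` is taken as a parameter because how it is derived from the number of nodes is not defined in this PR.

```python
from typing import Iterable


def oracle_agreement(proposed_hash: str, backup_hashes: Iterable[str], m: int) -> bool:
    """True when at least m - 1 backups report the hash proposed by the Speaker."""
    agreeing_backups = sum(1 for h in backup_hashes if h == proposed_hash)
    return agreeing_backups >= m - 1


# Speaker proposes "aa"; three backups report their own hashes; m = 3.
published_as_success = oracle_agreement("aa", ["aa", "aa", "bb"], m=3)
published_as_failure = not oracle_agreement("aa", ["aa", "bb", "bb"], m=3)
```

In the first case two backups match the Speaker, which meets the M-1 requirement; in the second only one does, so the TX would be published as failed.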

shargon added 22 commits November 14, 2019 20:20

Oracle VM related

c1dd1e4

Draft

Remove tx hash from Request

7da35cf

Format

4e00ada

Clean OracleResult

78f0a56

Format

af402c8

Organization

3572c99

Transaction => OracleExpectedResult

b64d5ed

TODO: HTTP2

0c5e703

Clean

497bca7

Move interops

5b68397

Some UT

5b9646a

Join HTTP1 and HTTP2

71e4a40

Reorder

6ccaa8f

OracleExpectedResult UT

ba8e839

Clean

2871666

Oracle Service UT

6f9aded

POST

d1a4a20

DELETE and PUT

95dd1dd

Oracle Result UT

bd5c9a0

Remove not related to VM

52f743b

Clean OracleFilters

d1f1da3

Ensure multiple calls cached as one

08e9f0f

shargon marked this pull request as ready for review November 16, 2019 11:56

shargon added 3 commits November 16, 2019 13:12

Rename

dd19bed

Format

d285750

Remove dynamic call

9040704

shargon added 5 commits November 18, 2019 13:32

Reduce version

8e7c623

XPath filter

3dde9b8

Json filter

903dbcf

Clean code

34cb37e

Regex filter

4e93dc5

Optimize Syscall

c706770

Refactor TransactionAttributes

5fea844

erikzhang reviewed Nov 19, 2019


shargon and others added 2 commits November 19, 2019 14:42

Update neo/Network/P2P/Payloads/TransactionAttributeUsage.cs

d1cc8f0

Co-Authored-By: Erik Zhang <erik@neo.org>

Some fixes

f4890d1

Format

74d9808

shargon commented Nov 19, 2019


shargon added 2 commits November 20, 2019 09:52

Remove PUT and DELETE

e2071e7

OraclePolicy

64c30a8

This was referenced Nov 20, 2019

[Oracles] Filters #1259

Closed

[Oracles] Supported Protocols #1260

Closed

belane closed this Feb 17, 2020

shargon deleted the update-vm branch February 17, 2020 15:19

roman-khimov mentioned this pull request Oct 1, 2020

Support for Put/Post operations in Oracle contract #1980

Open

