Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add EIP: Web3 URL to EVM Call Message Translation #6860

Merged
merged 24 commits into from Oct 18, 2023

Conversation

qizhou
Copy link
Contributor

@qizhou qizhou commented Apr 10, 2023

When opening a pull request to submit a new EIP, please use the suggested template: https://github.com/ethereum/EIPs/blob/master/eip-template.md

We have a GitHub bot that automatically merges some PRs. It will merge yours immediately if certain criteria are met:

  • The PR edits only existing draft PRs.
  • The build passes.
  • Your GitHub username or email address is listed in the 'author' header of all affected PRs, inside .
  • If matching on email address, the email address is the one publicly listed on your GitHub profile.

@qizhou qizhou requested a review from eth-bot as a code owner April 10, 2023 22:20
@github-actions github-actions bot added c-update Modifies an existing proposal s-final This EIP is Final t-erc labels Apr 10, 2023
@eth-bot
Copy link
Collaborator

eth-bot commented Apr 10, 2023

✅ All reviewers have approved.

@eth-bot eth-bot changed the title Add more details for the examples for EIP-4804 Update EIP-4804: Add more details for the examples for EIP-4804 Apr 10, 2023
@eth-bot eth-bot added the e-consensus Waiting on editor consensus label Apr 10, 2023
@github-actions github-actions bot added the w-ci Waiting on CI to pass label Apr 10, 2023
@github-actions github-actions bot removed the w-ci Waiting on CI to pass label Apr 11, 2023
Copy link

@nand2 nand2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the more detailed examples, they are great!

EIPS/eip-4804.md Outdated Show resolved Hide resolved
EIPS/eip-4804.md Outdated Show resolved Hide resolved
EIPS/eip-4804.md Outdated Show resolved Hide resolved
EIPS/eip-4804.md Outdated Show resolved Hide resolved
EIPS/eip-4804.md Outdated Show resolved Hide resolved
qizhou and others added 2 commits April 19, 2023 10:27
Co-authored-by: Sam Wilson <57262657+SamWilsn@users.noreply.github.com>
@SamWilsn
Copy link
Contributor

@qizhou we have a bit of a difference of opinion among editors, so we'd like to hear from you.

@xinbenlv @g11tech and myself feel that this change introduces new restrictions that could make a theoretical implementation incompatible with the new version. If that is the case, it would be best to introduce a new EIP with the updated restrictions.

On the other hand, @gcolvin would be okay with updating the current EIP with an errata section.

Would you be okay with submitting the updated version as a new EIP?

@qizhou
Copy link
Contributor Author

qizhou commented May 18, 2023

@qizhou we have a bit of a difference of opinion among editors, so we'd like to hear from you.

@xinbenlv @g11tech and myself feel that this change introduces new restrictions that could make a theoretical implementation incompatible with the new version. If that is the case, it would be best to introduce a new EIP with the updated restrictions.

On the other hand, @gcolvin would be okay with updating the current EIP with an errata section.

Would you be okay with submitting the updated version as a new EIP?

Hi @SamWilsn, many thanks for your feedback. I agree that introducing new restrictions in a finalized EIP is not a good practice. In fact, the PR here does not mean to introduce any new restrictions or extensions - it aims to clarify some frequency asked questions when others try to implement or use the standard (e.g., https://github.com/nand2/evm-browser from @nand2 ) and fix the obvious errors (especially the default return bytes in "(bytes32)" is definitively wrong) without changing the semantics. For example, original EIP does not mention the return type of resolveMode(), which creates confusion for implementors to choose between bytes32 or bytes (surprisedly, solidity can return a string in both types).

I can definitively introduce a new EIP to clarify all these, but this creates fragility of the EIP and the users have to check multiple EIPs before fully understand how it works. This raises a general question regarding how to deal with a long-term evolving standard in our EIP process - taking URL standard as an example, it has been evolving for multiple versions over a decade (from RFC 1738 to RFC 2396 to RFC 3986), and the HTTP standard is still evolving after several decades.

One feature of RFCs may help solve our problem is that it supports Errata for an RFC (https://www.rfc-editor.org/errata_search.php?rfc=2396), and it has a couple of fields in header to maintain the evolution history such as

  • updates tells which RFCs the current RFC amend for (e.g., RFC 2396 updates RFC 1738, 1808)
  • updated by tells which following RFCs amend the current RFC (e.g., RFC 2396 is updated by RFC 2732)
  • 'obsoleted by` tells which RFC override the current RFC (e.g., RFC 2396 is obsoleted by RFC 3986)

Based on the practice of RFCs, I would like to propose to either

  1. Move the changes of the PR in the Errata section of EIP-4804 as suggested by @gcolvin since we have to clarify these and make them as a whole; or
  2. Create a new EIP for this with updates field to EIP-4804 and add updated-by field in EIP-4804 to point to the new EIP.

Note that my observation of updates and updated-by in RPC are that they are generally used for extensions, and for Errata in an RFC, it just incorporated into an RFC directly (see https://www.rfc-editor.org/errata_search.php?rfc=2396). So, I would personally suggest to follow the RFC practices and use method 1. What do you think?

@SamWilsn
Copy link
Contributor

Unfortunately forward-pointing fields (eg. superseded-by, updated by, obsoleted by) require modifying the original EIP. Who should have permission to do that? Once an EIP goes Final, it's "owned" by the community at large, so giving that power to the author seems unreasonable. Imagine if an author released a spec with a known vulnerability/weakness, got a bunch of people to implement it, then released a new standard with a better implementation day one. It would be very anti-competitive at best.

On the other hand, backward pointing header fields (eg. updates, supersedes) would be less problematic, but we've never really discussed adding them.

@Pandapip1
Copy link
Member

Pandapip1 commented Sep 14, 2023

+0. I would prefer seperating out the actual errata from the other changes--+1 to the errata.

@xinbenlv
Copy link
Contributor

xinbenlv commented Sep 14, 2023

Wearing the hat of EIP Editor, I vote -1 unless we update the EIP-1, based on the current EIP editing rule, specifications can't be change after Final, despite I personally feel very sad for this vote.

Being an author myself, this is unfortunate and I would really hope it could be corrected. I urge the ERC community discuss our strategy for updating an ERC for the best practice. One example solution would be introducing the superseded-by or "best current practice" as in IETF.

@nand2
Copy link

nand2 commented Sep 25, 2023

Hi! I have been asked by @qizhou to follow up on this ; so in this PR I will remove the changes of the ERC-4804 and do a new ERC-6860 file for the errata / clarifications.

Question : Should I only make a "diff" (show changes only) which would make it more clear what are the changes, but would make it a bit more difficult to read (have to go back and forth between 4804 and this one), or should I copy all from 4804 and highlight the changes done?
What is the best practice?

Thank you.

@qizhou
Copy link
Contributor Author

qizhou commented Sep 25, 2023

Hi! I have been asked by @qizhou to follow up on this ; so in this PR I will remove the changes of the ERC-4804 and do a new ERC-6860 file for the errata / clarifications.

Question : Should I only make a "diff" (show changes only) which would make it more clear what are the changes, but would make it a bit more difficult to read (have to go back and forth between 4804 and this one), or should I copy all from 4804 and highlight the changes done? What is the best practice?

Thank you.

I feel we should create a new one for ERC-6860 (reuse this PR) which copies the ERC-4804 from here. Further, we should add "⚠️ This proposal updates EIP-4804 ⚠️" in the Abstract.

@github-actions
Copy link

There has been no activity on this pull request for 2 weeks. It will be closed after 3 months of inactivity. If you would like to move this PR forward, please respond to any outstanding feedback or add a comment indicating that you have addressed all required feedback and are ready for a review.

@github-actions github-actions bot added the w-stale Waiting on activity label Oct 10, 2023
@github-actions github-actions bot added c-new Creates a brand new proposal and removed c-update Modifies an existing proposal labels Oct 10, 2023
@github-actions github-actions bot removed the w-stale Waiting on activity label Oct 11, 2023
Copy link
Contributor Author

@qizhou qizhou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great job! This makes a much better version compared to EIP-4804. I have compiled a few comments below.

EIPS/eip-6860.md Outdated
- **chainid** indicates which chain to resolve **contractName** and call the message. If not specified, the protocol will use the same chain as the name service provider, e.g., 1 for eth. If no name service provider is available, the default chainid is 1.
- **query** is an optional component containing a sequence of attribute-value pairs separated by "&".

All non-ASCII characters must be encoded with URI percent-encoding (RFC 3986).
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to https://en.wikipedia.org/wiki/Percent-encoding, only unreserved characters and reserved characters with special purpose need not be percent-encoded, so I feel the statement here can be more accurate.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, and even looking deeper into RFC 3986, I find out that they split the reserved chars into 2 groups:

reserved    = gen-delims / sub-delims
gen-delims  = ":" / "/" / "?" / "#" / "[" / "]" / "@"
sub-delims  = "!" / "$" / "&" / "'" / "(" / ")"
                  / "*" / "+" / "," / ";" / "="

And also that depending of the component, the list of characters that need to be encoded is different:

Path:

path          = path-abempty
...
path-abempty  = *( "/" segment )
segment       = *pchar
pchar         = unreserved / pct-encoded / sub-delims / ":" / "@"

Host (in case of hostname):

reg-name    = *( unreserved / pct-encoded / sub-delims )

So here in this example they allow not-encoded ":" and "@" in path but not in host, because they have a special meaning in host.
So they optimize to make encoding mandatory when really necessary. I'll update the text to do it likewise here.

Copy link

@nand2 nand2 Oct 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I have formalized this by adding :

The URL must be encoded with URI percent-encoding (RFC 3968) with rules very similar to the RFC. Per component, the rules are :

  • userinfo : Identical to userinfo as defined in RFC 3986 Appendix A.
  • contractName : Identical to reg-name as defined in RFC 3986 Appendix A.
  • chainid : Identical to port as defined in RFC 3986 Appendix A.
  • path : Identical to path-abempty as defined in RFC 3986 Appendix A, except the character "!" that needs to be percent-encoded when not used as type/value separator in an argument.
  • query : Identical to query as defined in RFC 3986 Appendix A.

Question : I quite like the full formalization of the structure of the URL in RFC 3986 Appendix A, I think it could be good to make a similar full formalization in an Appendix. What do you think?

Copy link

@nand2 nand2 Oct 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about the "!" character in path : If we assume that in the future we will not support domain name services that allows "!" as part of domain names (likely I guess), then we can skip this exception and be identical to the RFC ("!" are not required to be encoded in path).
I guess that's the way to go.

I have edited:

The URL must be encoded with URI percent-encoding (RFC 3968) with rules very similar to the RFC: Component userinfo follows userinfo encoding as defined in RFC 3986 Appendix A, component contractName follows reg-name, chainid follows port, path follows path-abempty, and query follows query.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I have formalized this by adding :

The URL must be encoded with URI percent-encoding (RFC 3968) with rules very similar to the RFC. Per component, the rules are :

  • userinfo : Identical to userinfo as defined in RFC 3986 Appendix A.
  • contractName : Identical to reg-name as defined in RFC 3986 Appendix A.
  • chainid : Identical to port as defined in RFC 3986 Appendix A.
  • path : Identical to path-abempty as defined in RFC 3986 Appendix A, except the character "!" that needs to be percent-encoded when not used as type/value separator in an argument.
  • query : Identical to query as defined in RFC 3986 Appendix A.

Question : I quite like the full formalization of the structure of the URL in RFC 3986 Appendix A, I think it could be good to make a similar full formalization in an Appendix. What do you think?

Yes, I think we can make a similar structure to formalize the details for each component.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about the "!" character in path : If we assume that in the future we will not support domain name services that allows "!" as part of domain names (likely I guess), then we can skip this exception and be identical to the RFC ("!" are not required to be encoded in path). I guess that's the way to go.

I have edited:

The URL must be encoded with URI percent-encoding (RFC 3968) with rules very similar to the RFC: Component userinfo follows userinfo encoding as defined in RFC 3986 Appendix A, component contractName follows reg-name, chainid follows port, path follows path-abempty, and query follows query.

I guess if the address contains ! we can still use address!abc%33.eth to support it?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think we can make a similar structure to formalize the details for each component.

Ok I will do that

I guess if the address contains ! we can still use address!abc%33.eth to support it?

Yes; to be more precise (I think I was a bit confusing) :

  • If we choose to be similar to the RFC, that means that the ! type/value delimiter can be either written as ! or %21. The problem is then : we cannot know if this is part of the domain name or not.. For example, if we have address%21abc%21.xyz, then we cannot know if this is the abc!.xyz address (detected with the <type>!<value> syntax), or the address!abc!.xyz address (detected with auto-detection part 5 as an address)
  • The "proper" way to handle that is to require the ! type/value delimiter to always be written as !, not %21. It then makes it a tiny bit of difference with the RFC.

I will do the Appendix with the formal definition, and let's continue from there :)

Copy link
Contributor Author

@qizhou qizhou Oct 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. If we put ! as type/value delimiter defined in ABNF, then we should never use %21%. I am not sure what is the specific difference with the RFC? (Looks like our domainName is stricter than reg-name in RFC 3986? or we may defer reg-name to NS lookup in future EIPs as RFC3986 does?)

That said, we could clarify the address part as below (I use reg-name for domainName, but feel free to comment)

  • address!vitalik.eth is a valid address to lookup vitalik from ENS;
  • address%21vitalik.eth is invalid as it cannot be recognized as any argument;
  • address!%21vitalik.eth is a valid address to lookup !vitalik from ENS;
  • address!vitalik%2Eeth is invalid as it cannot be recognized as a domain name with a TLD;
  • address!vitalik%2E.eth is a valid address to lookup vitalik. from ENS;
  • address!!vitalik.eth is a valid address to lookup !vitalik from ENS.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that domainName stricter than reg-name without reason (we could break with some upcoming domain name services), I have modified it to be identical to RFC 3986 ; and indeed already you have made domain name resolution deferred in upcoming EIPs (such as 6821) ; since 6821 is lower than 6860 I have linked 6821 for ENS and kept the mention "will be discussed in later EIPs for other name services.".

I agree on case 1/ and case 2/ : the ! type/value delimiter is in the ABNF so it's clear %21 is not authorized as a delimiter.
Case 2/ and 6/ : I agree
Case 4/ and 5/ : I would disagree on that; the general workflow would be : 1/ extract <value> from <type>!<value>, 2/ percent-decode <value> and then 3/ give <percent-decoded-value> to the domain name resolver code to do his work -- these would be 2 different layers. (Trying https://google%2Ecom on a browser, it does percent-decode it into google.com before doing resolution). So I would think :

  • address!vitalik%2Eeth is a valid address to lookup vitalik.eth from ENS
  • address!vitalik%2E.eth is invalid as it would lookup vitalik..eth from ENS

What do you think?

My modifications are at :
ethereum/ERCs@master...nand2:ERCs:patch-1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me. I just create the PR with the patch here ethereum/ERCs#84

EIPS/eip-6860.md Outdated Show resolved Hide resolved
EIPS/eip-6860.md Outdated Show resolved Hide resolved
EIPS/eip-6860.md Outdated Show resolved Hide resolved
EIPS/eip-6860.md Outdated Show resolved Hide resolved
EIPS/eip-6860.md Outdated Show resolved Hide resolved
EIPS/eip-6860.md Outdated

The protocol will find the address of **vitalik.eth** from ENS on chainid 1 (Mainnet) and then call the method "balanceOf(address)" of the address. The returned data from the call of the contract will be treated as raw bytes and will be encoded in JSON format like `["0x000000000000000000000000000000000000000000000000000009184e72a000"]` and returned to the frontend, with the information that the MIME type is ``application/json``.

### Changes versus [ERC-4804](./eip-4804.md)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should use EIP-4804 (also EIP over ERC) in the whole document.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, if I do so, I get errors from the EIP Walidator, which then prevents merging :

error[markdown-refs]: references to proposals with a `category` of `ERC` must use a prefix of `ERC`

nand2 and others added 6 commits October 12, 2023 08:40
Co-authored-by: Qi Zhou <qizhou@quarkchain.org>
Co-authored-by: Qi Zhou <qizhou@quarkchain.org>
Co-authored-by: Qi Zhou <qizhou@quarkchain.org>
Co-authored-by: Qi Zhou <qizhou@quarkchain.org>
Co-authored-by: Qi Zhou <qizhou@quarkchain.org>
@github-actions
Copy link

The commit cec1561 (as a parent of 9880cd6) contains errors.
Please inspect the Run Summary for details.

@github-actions github-actions bot added the w-ci Waiting on CI to pass label Oct 12, 2023
@github-actions github-actions bot removed the w-ci Waiting on CI to pass label Oct 12, 2023
@nand2
Copy link

nand2 commented Oct 18, 2023

Ok I have added a full "ABNF" definition to formalize web3:// URLs -- ABNF being a RFC2234 format used at least in RFC3968. Let me know if that looks ok for you!

I have a few remarks/questions while I made these changes :

In auto mode with ?returns= empty/undefined, we also use the end of the path to see if there is a file extension, and if yes, we determine the MIME type to send to the frontend. I have noticed that it only makes sense is the last argument is of type string, the only potential other argument type with a value ending with ".xxx" is address, and that would be the TLD of the domain name e.g. eth,w3q, and that makes no sense to get a MIME from "eth"/"w3q"/.... So I have explicity written that we look for file extension if the last argument type is string only. Tell me if this looks good to you.

At the beginning of the ERC, it is written This standard translates an RFC 2396 URI but RFC 2396 is obsoleted by RFC 3986 (in which we base ourselves for the ABNF def), should we replace this mention?

Lastly, in manual mode, there was a mention that "Path ? query" is sent as calldata. Path cannot be empty and must be at least '/'. I understood that as "Calldata sent cannot be null and must be at least 0x2f", but that the path itself is allowed to be empty -- especially knowing that the path can be empty in auto mode. I just wanted to confirm that yes, path can be empty in manual mode, but calldata must be at least 0x2f.

@eth-bot eth-bot enabled auto-merge (squash) October 18, 2023 14:48
Copy link
Collaborator

@eth-bot eth-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All Reviewers Have Approved; Performing Automatic Merge...

@eth-bot eth-bot merged commit 9115ba1 into ethereum:master Oct 18, 2023
10 checks passed
@qizhou
Copy link
Contributor Author

qizhou commented Oct 18, 2023

Ok I have added a full "ABNF" definition to formalize web3:// URLs -- ABNF being a RFC2234 format used at least in RFC3968. Let me know if that looks ok for you!

Great job!

In auto mode with ?returns= empty/undefined, we also use the end of the path to see if there is a file extension, and if yes, we determine the MIME type to send to the frontend. I have noticed that it only makes sense is the last argument is of type string, the only potential other argument type with a value ending with ".xxx" is address, and that would be the TLD of the domain name e.g. eth,w3q, and that makes no sense to get a MIME from "eth"/"w3q"/.... So I have explicity written that we look for file extension if the last argument type is string only. Tell me if this looks good to you.

Sounds good to me! Thanks for the clarification!

At the beginning of the ERC, it is written This standard translates an RFC 2396 URI but RFC 2396 is obsoleted by RFC 3986 (in which we base ourselves for the ABNF def), should we replace this mention?

Yes, we should follow the latest RFC here. Please update it to RFC 3986.

Lastly, in manual mode, there was a mention that "Path ? query" is sent as calldata. Path cannot be empty and must be at least '/'. I understood that as "Calldata sent cannot be null and must be at least 0x2f", but that the path itself is allowed to be empty -- especially knowing that the path can be empty in auto mode. I just wanted to confirm that yes, path can be empty in manual mode, but calldata must be at least 0x2f.

I think path can be empty (per HTTP URI standard) and in such a case, calldata should be empty also (although the browser or the contract may directly redirect to a URI with path "/" to make relative path work).

@nand2
Copy link

nand2 commented Oct 26, 2023

Lastly, in manual mode, there was a mention that "Path ? query" is sent as calldata. Path cannot be empty and must be at least '/'. I understood that as "Calldata sent cannot be null and must be at least 0x2f", but that the path itself is allowed to be empty -- especially knowing that the path can be empty in auto mode. I just wanted to confirm that yes, path can be empty in manual mode, but calldata must be at least 0x2f.

I think path can be empty (per HTTP URI standard) and in such a case, calldata should be empty also (although the browser or the contract may directly redirect to a URI with path "/" to make relative path work).

I agree, that sounds more logical indeed to me; I will update the ERC and the code. (Be aware that some of your existing manual mode websites could behave weirdly if we send them empty calldata)

Also later I will also add support for relative URLs in the ABNF (they have in RFC3968)

@nand2
Copy link

nand2 commented Oct 26, 2023

Ah now that I am looking at absolute paths... This makes it a bit weird, because you can only reference the homepage one way: we cannot both reference the homepage with / or without /. (<a href="/">)

Also, I read on the HTTP RFC :

If the abs_path is not present in the URL, it MUST be given as "/" when
used as a Request-URI for a resource

So for HTTP, both google.com and google.com/ are the same (it will send GET / for both). So if we aim to be close to HTTP, maybe it was a good idea to have a 0x2f calldata for both urls web3://xxx and web3://xxx/ (for manual mode)?

What do you think?

@qizhou
Copy link
Contributor Author

qizhou commented Nov 2, 2023

Ah now that I am looking at absolute paths... This makes it a bit weird, because you can only reference the homepage one way: we cannot both reference the homepage with / or without /. (<a href="/">)

Also, I read on the HTTP RFC :

If the abs_path is not present in the URL, it MUST be given as "/" when
used as a Request-URI for a resource

So for HTTP, both google.com and google.com/ are the same (it will send GET / for both). So if we aim to be close to HTTP, maybe it was a good idea to have a 0x2f calldata for both urls web3://xxx and web3://xxx/ (for manual mode)?

What do you think?

Sounds good to me. Many thanks for the check!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c-new Creates a brand new proposal e-consensus Waiting on editor consensus e-review Waiting on editor to review s-draft This EIP is a Draft t-erc
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants