Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add type hints to eth module #1398

Closed
cburgdorf opened this issue Oct 16, 2018 · 38 comments · Fixed by #1456
Closed

Add type hints to eth module #1398

cburgdorf opened this issue Oct 16, 2018 · 38 comments · Fixed by #1456

Comments

@cburgdorf
Copy link
Contributor

cburgdorf commented Oct 16, 2018

Background

Type hints allow us to perform static type checking, among other things. They raise the security bar by catching bugs at development time that, without type support, may turn into runtime bugs.

This stackoverflow answer does a great job at describing their main benefits.

What is wrong?

We currently enforce type hints for the entire p2p and trinity package. While the eth package does have some type hints already, the coverage isn't at 100 % and while it is agreed up on that new code needs to land with type hints, there is not technical enforcement of this rule in place yet.

This needs to be fixed by:

  1. Adding all missing type hints.
  2. Enforcing (stricter) type checking in CI

How

There does exist tooling (monkeytype) to the generation of type hints for existing code bases. From my personal experience monkeytype can be helpful but does still require manual fine tuning. Also, manually adding these type hints does serve as a great boost to the general understanding of the code base as it forces one to think about the code.

  1. Run mypy --follow-imports=silent --warn-unused-ignores --ignore-missing-imports --no-strict-optional --check-untyped-defs --disallow-incomplete-defs --disallow-untyped-defs --disallow-any-generics -p eth

  2. Eliminate every reported error by adding the right type hint

  3. This should probably be done incrementally in roughly the following steps.

  • eth.tools (eligible for payout of 150 DAI)
  • eth.vm (eligible for payout of 350 DAI)
  • everything that is not eth.vm (eligible for payout of 300 DAI)

The reason for this order is, that it makes it easy to incrementally update the tox.ini to enforce stricter rules for such each package without cluttering tox.ini too much.

  1. Send frequent pull requests for chunks of work

Definition of done

This issue is done when the following criteria are met:

  1. The following configuration in tox.ini is changed.

py-evm/tox.ini

Line 107 in 8f74e6b

mypy --follow-imports=silent --warn-unused-ignores --ignore-missing-imports --no-strict-optional --check-untyped-defs --disallow-incomplete-defs -p eth

It needs to be:

mypy --follow-imports=silent --warn-unused-ignores --ignore-missing-imports --no-strict-optional --check-untyped-defs --disallow-incomplete-defs --disallow-untyped-defs --disallow-any-generics -p eth
  1. Usage of type: ignore (silencing the type checker) is minimized and there's a reasonable explanation for its usage

Stretch goals

When this issue is done, stretch goals can be applied (and individually get funded) to tighten type support to qualify:

  1. mypy --strict --follow-imports=silent --ignore-missing-imports --no-strict-optional -p eth -p p2p -p trinity

  2. mypy --strict --follow-imports=silent --ignore-missing-imports -p eth -p p2p -p trinity

@Bhargavasomu
Copy link
Contributor

I would like to work on this. Could you please assign this to me?

@gitcoinbot
Copy link

Issue Status: 1. Open 2. Started 3. Submitted 4. Done


This issue now has a funding of 800.0 DAI (800.0 USD @ $1.0/DAI) attached to it.

1 similar comment
@gitcoinbot
Copy link

Issue Status: 1. Open 2. Started 3. Submitted 4. Done


This issue now has a funding of 800.0 DAI (800.0 USD @ $1.0/DAI) attached to it.

@gitcoinbot
Copy link

gitcoinbot commented Oct 16, 2018

Issue Status: 1. Open 2. Started 3. Submitted 4. Done


Work has been started.

These users each claimed they can complete the work by 2 months from now.
Please review their action plans below:

1) bhargavasomu has been approved to start work.

I would make PR for type hinting of 2 modules per day and will try my level best to complete this task as soon as possible. I estimate that this issue would be solved by me within 5 days. I have a pretty good knowledge of the codebase and can solve this issue without much time waste. Maybe this request could be extended to enable type hinting of the left over models also. Please assign this to me as I feel that I can complete this fastly as I have sufficient knowledge of the codebase and mypy

Learn more on the Gitcoin Issue Details page.

2) svenski123 has applied to start work (Funders only: approve worker | reject worker).

I note mypy currently reports 520 type related errors and notes in the eth.
Adding correct type hints, updating tox to enfore type checks and testing should take no more than a day.
Review of usage of type: ignore in eth, p2p and other modules and removal were appropriate would take a little longer.

I do see that there is other interest in addressing this issue, however as the issue is still currently unassigned in gitcoin I assume offers are still welcome - if this assumption is incorrect, please accept my apologies.

Learn more on the Gitcoin Issue Details page.

3) bhargavasomu has applied to start work (Funders only: approve worker | reject worker).

I am well versed with type hinting in python and have a good understanding of the codebase. If assigned to me, I will complete this task within at max a 3 days.

Learn more on the Gitcoin Issue Details page.

@cburgdorf
Copy link
Contributor Author

@gitcoinbot I hereby approve @Bhargavasomu working on this issue

@spm32
Copy link

spm32 commented Oct 17, 2018

@Bhargavasomu You're good to go!

@cburgdorf
Copy link
Contributor Author

@svenski123 @shalabhaggarwal the issue was given to @Bhargavasomu on a first come, first serve basis. For now it should be considered blocked (unless no sign of progress is being made within the next few days).

That said, we'll probably open up more bounties for similar tasks in other related projects soon.

@pipermerriam
Copy link
Member

@svenski123 and/or @shalabhaggarwal

I don't think bounties have been added yet, but I've applied for around 10+ new bounties to be added to the py-trie, eth-bloom, py_ecc and eth-keys repositories this morning for similar tasks. Keep an eye out.

@pipermerriam
Copy link
Member

Noting that

HeaderParams = Union[Optional[int], bytes, Address, Hash32]
should be updated to use a mypy_extensions.TypedDict

@Bhargavasomu
Copy link
Contributor

Bhargavasomu commented Oct 22, 2018

@cburgdorf , the __call__function in eth.vm.opcode which is an abstract function, uses the variable computation which should be of the type BaseComputation. But if we import this, it will cause cyclic dependencies, as computation.py will import opcode.py. So should I use the type Any or is there some way we cam use to overcome the cyclic dependencies?

Edit - Figured it out

@cburgdorf
Copy link
Contributor Author

you can put the import behind a If TYPE_CHECKING check to overcome the cyclic import issue. It's explained here: https://mypy.readthedocs.io/en/latest/common_issues.html#import-cycles

@Bhargavasomu
Copy link
Contributor

In eth.vm.stack the return type of the function pop is
Union[int, Hash32, Tuple[Union[int, Hash32], ...]].
This would lead to a bunch of assert statements, wherever this function is used, especially in the files belonging to eth.vm.logic.
Should I go ahead with putting the assert statements?

@cburgdorf
Copy link
Contributor Author

Should I go ahead with putting the assert statements?

I think the Stack is one of the places where we want to try hard to avoid adding assert statements because it may end up having a huge negative impact on the EVM performance.

I think my preference would be split this into two methods:

  • one that returns only a single item
  • one that returns multiple items

Some name suggestions:

Single Multiple
pop_single pop_multiple
pop pop_multiple
pop pop_num

/cc @pipermerriam @carver

@carver
Copy link
Contributor

carver commented Oct 23, 2018

Maybe I shouldn't be commenting at 6am but...

Another option is to go as far as enumerating all the possibilities, like:

  • pop1int()
  • pop2int()
  • pop3int()
  • pop1bytes()
  • pop2bytes()
  • pop3bytes()
  • Is that it? Not sure how the "any type" hint is used

@cburgdorf
Copy link
Contributor Author

@carver that API looks...extreme 😅 I'd prefer just to split between single / multiple which should be enough to overcome the type ambiguity.

@pipermerriam
Copy link
Member

I'd like to remove the support for multiple types in the stack. I think it was a bad design decision. Only type that should probably be allowed on the stack is int. And 👍 for something like pop1() pop2(), pop3() API.

@cburgdorf
Copy link
Contributor Author

cburgdorf commented Oct 23, 2018

I'd like to remove the support for multiple types in the stack. I think it was a bad design decision

👍

And 👍 for something like pop1() pop2(), pop3()

Just to be clear, are we talking about shorthand APIs that delegate to something like pop_num(2) or do you really mean something that is implemented as:

def pop3(self):
    yield self.pop()
    yield self.pop()
    yield self.pop()

Because, that just strikes me as a bit redundant compared to.

def pop_num(self, num):
    for _ in range(num):
        yield self.pop()

And then

def pop3(self):
    yield from self.pop_num(3)

@Bhargavasomu
Copy link
Contributor

@cburgdorf the above method looks promising and we could still have the same function signature as the present one, just in case we need the flexibility.

@pipermerriam
Copy link
Member

@Bhargavasomu my proposal to remove multi-type support from the stack should be done independently of this PR as it's going to touch a lot of code, but should be very straight-forward to change.

@pipermerriam
Copy link
Member

@cburgdorf I'm inclined to do something that's really dumb and easy to read.

def pop3(self):
    return (self.pop(), self.pop(), self.pop())

@carver
Copy link
Contributor

carver commented Oct 23, 2018

The primary benefit I see is the strict return type signature.

def pop(self) -> int:
  ...

def pop2(self) -> Tuple[int, int]:
  return (self.pop(), self.pop())

def pop3(self) -> Tuple[int, int, int]:
  return (self.pop(), self.pop(), self.pop())

@pipermerriam
Copy link
Member

Though, on second thought, this is a part of the code where raw performance matters so reducing the number of call frames would be ideal.

def pop3(self):
    try:
        return (self.values.pop(), self.values.pop(), self.values.pop())
    except IndexError:
        raise InsufficientStack("...")

3 fewer call frames at the expense of requiring inline exception handling for each method...

@pipermerriam
Copy link
Member

And deeper down the rabbit hole.

def pop3(self):
    popped = self.values[-3:]
    if len(popped) != 3:
        raise InsufficientStack("..."
    del self.values[-3:]
    return tuple(popped)

I'm in the middle of something else but it'd be nice to profile these different approaches and see if any are clearly faster.

@cburgdorf
Copy link
Contributor Author

@ceresstation We just merged #1420 as a first milestone to this. Can we please pay out 150 DAI (out of the entire 800 DAI) to @Bhargavasomu already?

@Bhargavasomu
Copy link
Contributor

Bhargavasomu commented Oct 28, 2018

I guess the type hinting of the eth.vm module has to wait till the pop function of the stack is modified. I am going ahead with modifying the stack pop function throughout the whole codebase, including the testcases. The modified code would be submitted as part of another PR.

There is some part of the codebase which pushes bytes(Hash32 and Address) into the stack. So I guess we should convert the bytes to int via big_endian_to_int while pushing it into the stack (Making the Stack values of only type int). Is that it @pipermerriam?

@frankchen07
Copy link

^ @ceresstation, when you get the chance!

@gitcoinbot
Copy link

gitcoinbot commented Nov 7, 2018

Issue Status: 1. Open 2. Started 3. Submitted 4. Done


Workers have applied to start work.

These users each claimed they can complete the work by 1 month, 3 weeks ago.
Please review their action plans below:

1) armaspablosebastian has applied to start work (Funders only: approve worker | reject worker).

There's not much mystery, I'm going to run mypy and I'm going to add all missing type hints in the required order.

Learn more on the Gitcoin Issue Details page.

@Bhargavasomu
Copy link
Contributor

@cburgdorf does #1451 complete the type hinting of the eth.vm module, or are there any more improvements I am supposed to do? Or can I type hint the rest of the sub-modules left in eth

@Bhargavasomu
Copy link
Contributor

@cburgdorf in eth.vm.spoof I see that SpoofTransaction doesn't inherit BaseTransaction which doesn't make sense logically. Is there anything we could do about this or was it designed in such a way on purpose? This is because we have an issue in eth.estimators.gas where the argument to state.execute_transaction() is SpoofTransaction, but the function expects the type BaseTransaction.

That said, I am using cast as of now. Please let me know if we could do better.

@Bhargavasomu
Copy link
Contributor

Bhargavasomu commented Nov 8, 2018

@cburgdorf in eth.rlp.headers there is an overloading, and mypy is throwing this error
error: Overloaded function implementation does not accept all possible arguments of signature 1.

But looking at the code, we have used the same signature(__init__) twice which I think is not needed. Can I remove one of the __init__?

@Bhargavasomu
Copy link
Contributor

I guess one of them is the implementation and cannot remove either of the two. Sorry for the misunderstanding and spam.

@cburgdorf
Copy link
Contributor Author

Or can I type hint the rest of the sub-modules left in eth

Yes, you can. I'll make sure to trigger the payout for the eth.vm part

Regarding SpoofTransaction. I can't tell what to do here from the top of my head. So for now, I'd say if a cast gets you unblocked, go for it. We can figure out what to do at a later stage.

@cburgdorf
Copy link
Contributor Author

@ceresstation Can we trigger another payout of 350 DAI to @Bhargavasomu for completing the second milestone.

@Bhargavasomu
Copy link
Contributor

@cburgdorf I am getting the following error when I am trying to type hint the overloaded function __init__ (the third instance) in eth.rlp.headers.
eth/rlp/headers.py:111: error: Overloaded function implementation does not accept all possible arguments of signature 1

On googling I got this, python/mypy#4619. But this doesn't give a substantial fix. Can I use type: ignore for this?

@cburgdorf
Copy link
Contributor Author

@Bhargavasomu It's hard for me to tell without seeing it in context with the applied type hint and then probably diving into it myself. Just put an type: ignore on it for now and I'll take a look when I review the PR.

@cburgdorf
Copy link
Contributor Author

@ceresstation the last remaining milestone can be paid out to @Bhargavasomu

@gitcoinbot
Copy link

⚡️ A tip worth 800.00000 DAI (800.0 USD @ $1.0/DAI) has been granted to @Bhargavasomu for this issue from @ceresstation. ⚡️

Nice work @Bhargavasomu! To redeem your tip, login to Gitcoin at https://gitcoin.co/explorer and select 'Claim Tip' from dropdown menu in the top right, or check your email for a link to the tip redemption page.

@spm32
Copy link

spm32 commented Nov 14, 2018

Sorry for waiting until the end to pay you @Bhargavasomu, thanks for bearing with me, we had a few small technical difficulties with the milestone based payments.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants