Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: added ape cache plugin #680

Merged
merged 133 commits into from Aug 18, 2022

Conversation

johnson2427
Copy link
Contributor

@johnson2427 johnson2427 commented Apr 29, 2022

What I did

Added CacheQueryProvider to ape cache and created initial database structure and methods to instantiate an sqlite database for caching provider data.

fixes: #632

How I did it

created cli and added init, purge and query functions

How to verify it

I used the infura plugin

ape plugins install infura

Then instantiate the caching database

ape cache init --network ethereum:mainnet:infura
ape console --network ethereum:mainnet:infura

Run this multiple times:

In [1]: chain.blocks.query("number", start_block=0, stop_block=100)

You should see that the query occurs quicker after the first call. This will take a few seconds to grab the data from the provider

Also run transaction calls multiple times:

In [2]: chain.blocks[-2].transactions

You should see that the query is quicker after the first call. That block number can change, so you may not see it after the first attempt. But you will have DynamicTransactionAPI datatypes in the return if it comes from the provider. You will just receive a dictionary if it comes from the database.

Then exit the IPython kernel
And run the below commands

ape cache query --network ethereum:mainnet:infura "SELECT *  FROM blocks"
ape cache query --network ethereum:mainnet:infura "SELECT * FROM transactions"

Both should return data from the database.

Also run pytests

Checklist

  • All changes are completed
  • New test cases have been added
  • Documentation has been updated

Copy link
Contributor

@antazoey antazoey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Coming along!

My main two comments:

  • Avoid string interpolation in SQL - use query parameters instead... It is better practice and has more security checks
  • What happened to DefaultQueryProvider? Are we required to have a DB now? This would be a pretty big breaking change, right?

src/ape_cache/_cli.py Outdated Show resolved Hide resolved
src/ape_cache/_cli.py Outdated Show resolved Hide resolved
src/ape_cache/_cli.py Outdated Show resolved Hide resolved
src/ape_cache/_cli.py Outdated Show resolved Hide resolved
src/ape_cache/_cli.py Outdated Show resolved Hide resolved
src/ape_cache/query.py Outdated Show resolved Hide resolved
src/ape_cache/query.py Show resolved Hide resolved
src/ape_cache/query.py Outdated Show resolved Hide resolved
src/ape_cache/query.py Outdated Show resolved Hide resolved
src/ape/managers/query.py Outdated Show resolved Hide resolved
antazoey
antazoey previously approved these changes Apr 29, 2022
Copy link
Contributor

@antazoey antazoey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the docs in ape_cache can just be deleted! We don't really generate anything from those.

Address @fubuloubu 's thing and make the tests happy and it is an approval from me :)

Bonus if you use the logger though! It is better because I sometimes like to disable output using the --verbosity flag and it's more consistent.

Thank you :)

src/ape_cache/__init__.py Outdated Show resolved Hide resolved
src/ape/managers/query.py Show resolved Hide resolved
src/ape/managers/query.py Show resolved Hide resolved
src/ape_cache/_cli.py Outdated Show resolved Hide resolved
src/ape/managers/query.py Outdated Show resolved Hide resolved
src/ape/managers/query.py Outdated Show resolved Hide resolved
src/ape/managers/query.py Outdated Show resolved Hide resolved
src/ape_cache/schemas.py Outdated Show resolved Hide resolved
setup.py Show resolved Hide resolved
src/ape_cache/_cli.py Show resolved Hide resolved
src/ape_cache/_cli.py Outdated Show resolved Hide resolved
src/ape_cache/_cli.py Outdated Show resolved Hide resolved
src/ape_cache/base.py Show resolved Hide resolved
src/ape_cache/query.py Show resolved Hide resolved
src/ape_cache/query.py Outdated Show resolved Hide resolved
src/ape_cache/query.py Outdated Show resolved Hide resolved
src/ape_cache/query.py Outdated Show resolved Hide resolved
src/ape_cache/schemas.py Outdated Show resolved Hide resolved
@johnson2427
Copy link
Contributor Author

Should we force the query to pull all data from the provider if it doesn't exist? That way we could populate the cache db. If you wanted to only see the number column, then pandas will return that column at the end, yet the rest of that data will be stored. It prevents you from having to hit the provider multiple times for data that you are already there for.

@fubuloubu
Copy link
Member

Should we force the query to pull all data from the provider if it doesn't exist? That way we could populate the cache db. If you wanted to only see the number column, then pandas will return that column at the end, yet the rest of that data will be stored. It prevents you from having to hit the provider multiple times for data that you are already there for.

By default, it already is pulling all the data, however it is then filtering out all the data that wasn't asked for. So pushing that data to the cache db prior to that filter being applied would be an acceptable way to proceed. Since it's an iterator, you would need itertools.tee to fork the iterator into two different iterators

src/ape_cache/query.py Outdated Show resolved Hide resolved
@johnson2427
Copy link
Contributor Author

johnson2427 commented May 2, 2022

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
ape-infura 0.2.0 requires eth-ape<0.3.0,>=0.2.1, but you have eth-ape 0.1.0b3.dev192+g9a790289.d20220502 which is incompatible.

Not sure how to handle this issue.

This this is a quick fix that I'm not seeing, let me know

@NotPeopling2day
Copy link
Contributor


ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.

ape-infura 0.2.0 requires eth-ape<0.3.0,>=0.2.1, but you have eth-ape 0.1.0b3.dev192+g9a790289.d20220502 which is incompatible.

Not sure how to handle this issue.

This this is a quick fix that I'm not seeing, let me know

Sounds like your environment is behind? Try fetching the latest version of ape core from upstream and see if that fixes it.

@johnson2427
Copy link
Contributor Author


ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.

ape-infura 0.2.0 requires eth-ape<0.3.0,>=0.2.1, but you have eth-ape 0.1.0b3.dev192+g9a790289.d20220502 which is incompatible.

Not sure how to handle this issue.
This this is a quick fix that I'm not seeing, let me know

Sounds like your environment is behind? Try fetching the latest version of ape core from upstream and see if that fixes it.

yeah, it is, I'm up to date with the main, rebased, yet when I install, it still installs 0.1... not sure why. I'll remake my venv

Copy link
Contributor

@antazoey antazoey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some small code optimizations.

I have an idea for some tests, but it might not be easy:

  1. Populate the .db files on your machine for various providers.
  2. Add those files to test fixtures and use them in tests

And so that leads to me another idea: Allow connecting to HistoricalProvider which only requires the .db files and no active connection. This will help with testing against historical data.

Edit: the testing / historical-provider design can happen later, but tests would be nice somehow.

src/ape_cache/query.py Outdated Show resolved Hide resolved
src/ape_cache/query.py Outdated Show resolved Hide resolved
src/ape_cache/query.py Outdated Show resolved Hide resolved
src/ape_cache/query.py Outdated Show resolved Hide resolved
src/ape_cache/query.py Outdated Show resolved Hide resolved
@johnson2427
Copy link
Contributor Author

Only issue left is I cannot store the data as is. We get an error for unsupported types.

Copy link
Contributor

@antazoey antazoey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of my other comments were resolved though I don't see the resolution (maybe not pushed yet?).

Looking fantastic though! I can't wait to use, especially for mainnet-fork.
I want to make a pre-populated mainnet-fork .db for testing.

src/ape_cache/_cli.py Outdated Show resolved Hide resolved
src/ape_cache/_cli.py Outdated Show resolved Hide resolved
fubuloubu
fubuloubu previously approved these changes May 11, 2022
Copy link
Member

@fubuloubu fubuloubu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A little bit more refactoring and we can merge this!

Then move optimization work to another PR (partial row support, mgirations, etc.)

src/ape_cache/query.py Show resolved Hide resolved
src/ape_cache/query.py Outdated Show resolved Hide resolved
src/ape_cache/query.py Outdated Show resolved Hide resolved
src/ape_cache/query.py Outdated Show resolved Hide resolved
src/ape_cache/query.py Outdated Show resolved Hide resolved
src/ape_cache/query.py Outdated Show resolved Hide resolved
src/ape_cache/query.py Outdated Show resolved Hide resolved
src/ape/managers/query.py Outdated Show resolved Hide resolved
src/ape/managers/query.py Outdated Show resolved Hide resolved
@antazoey antazoey self-requested a review May 12, 2022 17:54
antazoey
antazoey previously approved these changes May 12, 2022
Copy link
Contributor

@antazoey antazoey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉

@antazoey antazoey dismissed their stale review May 12, 2022 17:57

hold up, seeing something with a logger?

src/ape_cache/_cli.py Outdated Show resolved Hide resolved
@antazoey
Copy link
Contributor

Minimally, go to tests/integration/cli/test_misc.py and add cache to that list. I am concerned that the plugin currently does not work yet it has passing tests.

@johnson2427
Copy link
Contributor Author

@fubuloubu I respectfully argue against using decode_receipt on the caching side. There's really no need to do so at the moment. We just need the data to return a dataframe at the moment.

fubuloubu
fubuloubu previously approved these changes Aug 18, 2022
Copy link
Member

@fubuloubu fubuloubu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one major refactor and a bunch of minor comments, and we are good

src/ape_cache/query.py Show resolved Hide resolved
src/ape_cache/query.py Outdated Show resolved Hide resolved
src/ape_cache/query.py Outdated Show resolved Hide resolved
src/ape_cache/query.py Outdated Show resolved Hide resolved
)

@singledispatchmethod
def perform_query(self, query: QueryType) -> Optional[Iterator]: # type: ignore
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def perform_query(self, query: QueryType) -> Optional[Iterator]: # type: ignore
def perform_query(self, query: QueryType) -> Iterator:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

21465f7

Still need the type ignore. Without it I get:

src/ape_cache/query.py:314: error: Signature of "perform_query" incompatible with supertype "QueryAPI"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm

src/ape_cache/query.py Outdated Show resolved Hide resolved
src/ape_cache/query.py Outdated Show resolved Hide resolved
src/ape_cache/query.py Outdated Show resolved Hide resolved
src/ape_cache/query.py Show resolved Hide resolved
src/ape_cache/query.py Show resolved Hide resolved
src/ape_cache/query.py Outdated Show resolved Hide resolved
src/ape_cache/query.py Outdated Show resolved Hide resolved
fubuloubu
fubuloubu previously approved these changes Aug 18, 2022
@johnson2427
Copy link
Contributor Author

@fubuloubu I'm not touching that Squash and Merge button! Too much pressure!

fubuloubu
fubuloubu previously approved these changes Aug 18, 2022
Copy link
Member

@fubuloubu fubuloubu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved, with many issues coming out of this which @NotPeopling2day is tracking

fubuloubu
fubuloubu previously approved these changes Aug 18, 2022
@fubuloubu fubuloubu dismissed stale reviews from NotPeopling2day and themself via 6c79a10 August 18, 2022 17:48
@fubuloubu fubuloubu enabled auto-merge (squash) August 18, 2022 17:48
Copy link
Contributor

@antazoey antazoey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@fubuloubu fubuloubu merged commit 656b987 into ApeWorX:main Aug 18, 2022
@johnson2427 johnson2427 deleted the feat/cache_query_plugin branch September 2, 2022 22:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Query: Add Caching Database
5 participants