fix: warn large resource issue#429 #609

Merged: 2 commits, Feb 23, 2022


Vinay26k
Contributor

PR for issue #429.

Changes in this PR:

  • adds _check_file_size(path), which warns the user when a resource file is larger than 1MB
  • calls it before computing the hash of each resource file
pipeline.yaml

tasks:
  - source: tasks.raw.get
    product: products/raw/get.csv
    params:
      resources_:
        file: conf.yaml
    .....

conf.yaml: dummy file with ~10MB of data
script used to test:

def get(resources_, product):
    print(" -------------- Debug --------------")
    # print only the first 10 characters of the resource file
    with open(resources_['file']) as f:
        print(f.read(10))
    ....

output:
File too large. Resource <DIR-PATH>/conf.yaml[9.98MB] > 1MB

@edublancas
Contributor

edublancas commented Feb 21, 2022

looks good! can you add a test? please add it to the test_resources.py file

you can check for warnings using this: https://docs.pytest.org/en/6.2.x/warnings.html

and to simulate a big file, you can patch the os.stat call:

from ploomber.products import _resources

def test_process_resources_warns_on_large_file(monkeypatch):
    # this will cause the os.stat call in the _resources module to return 2E+6
    monkeypatch.setattr(_resources.os, 'stat', lambda _: 2E+6)
    # check warning is displayed

@edublancas edublancas merged commit 8da1f78 into ploomber:master Feb 23, 2022
@edublancas
Contributor

thanks a lot for your contribution!
