This repository has been archived by the owner on Nov 29, 2022. It is now read-only.

Develop high level script(s) for managing scraping/archiving #52

Closed
zschira opened this issue Sep 13, 2022 · 1 comment

@zschira
Member

zschira commented Sep 13, 2022

The FERC datasets will need a script to manage scraping both the DBF and XBRL data. It may also be useful to create a single high level script for scraping data from all sources.

@zschira zschira self-assigned this Sep 13, 2022
@zschira zschira transferred this issue from catalyst-cooperative/pudl Sep 13, 2022
@zaneselvans
Member

We already depend indirectly on the click and typer CLI frameworks, and I think they both provide hooks for tab completion and hierarchical scripts, which might be useful in this context. I've often imagined having a hierarchical script for PUDL with unified help messages and a unified interface, like:

$ pudl scrape ferc1 ferc2 ferc6 ferc60 ferc714
$ pudl archive ferc1 ferc2 ferc6 ferc60 ferc714
$ pudl datastore update-cache ferc1 ferc2 ferc6 ferc60 ferc714
$ pudl ferc2sqlite settings/ferc2sqlite.yml
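
For illustration only, here is a minimal sketch of what the first two commands in that hierarchy could look like. It swaps in the standard library's argparse subparsers rather than click or typer so that it runs without extra dependencies. The function names (`build_parser`, `main`) and the help strings are hypothetical, the dataset names are taken from the commands above, and the handler is a placeholder that only reports what it would do. The `datastore update-cache` and `ferc2sqlite` commands are left out.

```python
import argparse

# Dataset names copied from the example commands in this thread.
DATASETS = ["ferc1", "ferc2", "ferc6", "ferc60", "ferc714"]


def build_parser() -> argparse.ArgumentParser:
    """Build a hierarchical `pudl` parser with one subcommand per task."""
    parser = argparse.ArgumentParser(
        prog="pudl", description="Hypothetical unified PUDL command."
    )
    subcommands = parser.add_subparsers(dest="command", required=True)

    # Each subcommand takes one or more dataset names, validated against DATASETS.
    for name, help_text in [
        ("scrape", "Scrape raw data for the given datasets."),
        ("archive", "Archive scraped data for the given datasets."),
    ]:
        sub = subcommands.add_parser(name, help=help_text)
        sub.add_argument("datasets", nargs="+", choices=DATASETS, metavar="DATASET")
    return parser


def main(argv=None) -> str:
    """Parse arguments and report the requested action (placeholder logic)."""
    args = build_parser().parse_args(argv)
    # A real script would dispatch to scraping/archiving code here.
    plan = f"{args.command}: {', '.join(args.datasets)}"
    print(plan)
    return plan


# Equivalent to running: pudl scrape ferc1 ferc2
main(["scrape", "ferc1", "ferc2"])
```

Because every subcommand hangs off one parser, `pudl --help` and `pudl scrape --help` come from the same place, which is the unified help behavior described above. click groups and typer sub-apps offer the same kind of nesting, with tab completion on top.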

@zschira zschira changed the title Develop high level script(s) for managing scraping Develop high level script(s) for managing scraping/archiving Oct 11, 2022
@zschira zschira closed this as completed Nov 21, 2022