Technique example: data loader, Python to parquet #1422

Open
wants to merge 11 commits into base: main

Conversation

allisonhorst
Contributor

@allisonhorst allisonhorst requested review from Fil and mbostock June 3, 2024 13:48
@Fil
Contributor

Fil commented Jun 6, 2024

can we link to https://arrow.apache.org/docs/python/generated/pyarrow.parquet.write_table.html explicitly mentioning that there are many options (and recommend compression); and maybe show compression in action?

@allisonhorst
Contributor Author

@Fil I added the compression codec explicitly in the loader (`compression="snappy"`), and included a sentence pointing to the `write_table` docs and the different compression algorithms. Look okay?

Member

@mbostock mbostock left a comment

Mind copying the new virtual environment pattern from #1468?

<div class="note">

To run this data loader, you’ll need python3 and the geopandas, matplotlib, io, and sys modules installed and available on your `$PATH`.

</div>

<div class="tip">

We recommend using a [Python virtual environment](https://observablehq.com/framework/loaders#venv), such as with venv or uv, and managing required packages via `requirements.txt` rather than installing them globally.

</div>

@jaanli

jaanli commented Jun 15, 2024

Quick question - would dbt work here?

Comment on lines +66 to +69
Plot.barX(dams,
Plot.groupY({x: "count"}, {y: "Primary Purpose", fill: "Hazard Potential Classification", sort: {y: "x", reverse: true}
})
)
Member

Please prettier this. 🙏

Also, you can use `sort: {y: "-x"}` to shorten.

Contributor Author

Yep will do!

4 participants